Commit Graph

924 Commits

Author SHA1 Message Date
Christophe Monniez
75e56d944a [IMP] runbot: use a separate method for git fetch command
At the end of the _update_git method, the "git fetch" command is run.
That makes it difficult to override to change its behavior (for example
to avoid fetching pull requests).

With this commit, the command is separated in a new small method that
can be easily overridden.
2019-05-03 16:52:54 +02:00
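The refactoring described in 75e56d944a can be pictured with the minimal sketch below. The class names, the `_git_fetch_args` method and the refspecs are hypothetical illustrations, not the actual runbot code; the sketch only shows how isolating the command in a small method lets a subclass skip pull requests.

```python
class Repo:
    """Hypothetical stand-in for the runbot repo model."""

    def _git_fetch_args(self):
        # Assumed default: fetch branch heads and pull request refs.
        return [
            "fetch", "-p", "origin",
            "+refs/heads/*:refs/heads/*",
            "+refs/pull/*/head:refs/pull/*",
        ]

    def _update_git(self):
        # ... earlier update steps would go here ...
        # The fetch command now comes from a dedicated, overridable method.
        return ["git"] + self._git_fetch_args()


class RepoWithoutPulls(Repo):
    def _git_fetch_args(self):
        # Drop every refspec that mentions pull requests.
        return [arg for arg in super()._git_fetch_args() if "refs/pull/" not in arg]


print(RepoWithoutPulls()._update_git())
```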
Christophe Monniez
5dd889de3c [IMP] runbot: split branch creation and pending builds creation
When searching for new builds by parsing git refs, the new branches are
created as well as the pending builds in the same _find_new_commits
method.

With this commit, this behavior is split into two methods; that way,
it's now possible to create missing branches without creating new
builds. The closest_branch detection is enhanced because all the new
branches are created before the builds (separated loops).

The find_new_commits method uses an optimized way to search for
existing builds. Before this commit, a build search was performed for
each git reference, potentially a huge number.

With this commit, a raw sql query is performed to create a set of tuples
(branch_id, sha) which is enough to decide if a build already exists.

A test was added to verify that new refs lead to pending builds.

Also, a performance test was added but is skipped by default because it
needs an existing repo with 20000 branches in the database so it will
not work with an empty database. This test showed a gain of performance
from 8sec to 2sec when creating builds from new commits.

co-authored by @Xavier-Do
2019-05-03 10:34:49 +02:00
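The set-based lookup described in 5dd889de3c might look roughly like the sketch below. It uses an in-memory SQLite database instead of the PostgreSQL database runbot runs on, and the table layout, column names and sample values are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE build (id INTEGER PRIMARY KEY, branch_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO build (branch_id, name) VALUES (?, ?)",
    [(1, "aaa111"), (2, "bbb222")],
)

# One raw query builds the whole lookup set instead of one search per git ref.
existing = set(conn.execute("SELECT branch_id, name FROM build").fetchall())

# (branch_id, sha) pairs as they could be parsed from git refs (made-up values).
refs = [(1, "aaa111"), (1, "ccc333"), (3, "ddd444")]

# A build is only needed for pairs that are not already in the set.
to_create = [ref for ref in refs if ref not in existing]
print(to_create)
```

Membership tests on a set are constant time on average, which is why a single query plus a set can replace thousands of individual searches.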
xmo-odoo
4206d75256 [IMP] runbot_merge: wait for (and log) repo update / staging visibility
The race condition which prompted STAGING_SLEEP rears its ugly head
again: when pushing a base repo and its dependents, it's possible for
the update to the base repo's new head to take much longer to be visible
than the dependents (or so it seems?).

In this case, CI might pick up the correct dependent but pick an older /
incorrect revision of the base, leading to a staging failing for no good
reason.

This change uses info/refs to check for the updated staging head to be
visible at the repo level after it's been set / updated via the API. It
assumes repos are in topological order.
2019-04-29 12:42:54 +02:00
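A rough sketch of the visibility check described in 4206d75256 follows. It assumes the plain `<sha>\t<refname>` line format of a git `info/refs` file and takes the fetching function as a parameter; the function names, the retry count and the sample ref are hypothetical, and the real change may parse a different response format.

```python
import time


def head_is_visible(info_refs_text, ref, expected_sha):
    # Each line is assumed to look like "<sha>\t<refname>".
    for line in info_refs_text.splitlines():
        sha, _, name = line.partition("\t")
        if name == ref:
            return sha == expected_sha
    return False


def wait_for_head(fetch_info_refs, ref, expected_sha, attempts=5, delay=0.0):
    # Poll until the repo reports the expected head, or give up.
    for _ in range(attempts):
        if head_is_visible(fetch_info_refs(), ref, expected_sha):
            return True
        time.sleep(delay)
    return False


# Simulated responses: the first one still shows the old head.
responses = iter([
    "old\trefs/heads/staging.master\n",
    "new\trefs/heads/staging.master\n",
])
print(wait_for_head(lambda: next(responses), "refs/heads/staging.master", "new"))
```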
Christophe Monniez
e5420f7a3a [FIX] runbot: add missing repo parameter 2019-04-25 21:51:01 +02:00
XavierDo
e323aa888d [IMP] runbot: add dependencies to build
Before this commit, dependencies (i.e. community commit to use when testing enterprise)
were computed at checkout, when the build was going from pending to testing state and
were not stored.

Since the duplicate detection was done at create, the get_closest_branch_name was called
in a loop for each possible duplicate candidate, then a last time at checkout. The main idea of this
PR is to store the build dependencies on build at create, making the duplicate detection
faster (especially when the build name is matching many indirect builds).

The side effect of this change is that the build dependencies won't be affected if a new
commit is pushed between the build creation and the checkout. The build is fully
determined at creation. get_closest_branch is only called once per build.

The duplicate detection will also be more precise since we are matching on the commits groups
that were used to run the build, and not only the branch name.

Some work has also been done to rework the closest branch detection in order to manage new corner
cases. Hopefully, everything should work as before (or in a better way).

In the near future, it will also be possible to use this information to make an "exact rebuild"
or to find corresponding community build.

Pr: #117
2019-04-25 17:58:51 +02:00
XavierDo
5b22d57566 [IMP] runbot: move build closest_branch_name to branch
Closest_branch is more branch scope related; putting it in branch instead of build
will ease testing and refactoring.

PR: #117
2019-04-25 17:58:51 +02:00
Christophe Monniez
8aeabb01e3 [IMP] runbot: give priority to normal builds
When some special builds are scheduled during the night, free slots on
runbot instances are used. Depending on the number of scheduled builds,
all the slots can be used. That prevents people from using the runbot for
normal builds during this time.
To mitigate the problem, the scheduled builds were postponed to the
middle of the night ... the CET night. It means that it could be morning
in India.

With this commit, a build priority is given to normal builds. On the
other hand, scheduled builds are pushed at the end of the queue.

So even if there are plenty of builds during the Belgian night, if
someone pushes a commit in between, it will be built in priority before
the scheduled pending builds.
2019-04-23 13:24:32 +02:00
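The queue ordering described in 8aeabb01e3 can be illustrated with the small sketch below. The `Build` class, the `build_type` values and the sort key are assumptions for the example; the actual commit may implement the priority differently (for instance in the SQL ordering).

```python
from dataclasses import dataclass


@dataclass
class Build:
    id: int
    build_type: str  # "normal" or "scheduled" (hypothetical values)


def queue_order(builds):
    # Normal builds sort first, scheduled builds last;
    # within each group the creation order (id) is kept.
    return sorted(builds, key=lambda b: (b.build_type == "scheduled", b.id))


pending = [Build(1, "scheduled"), Build(2, "scheduled"), Build(3, "normal")]
print([b.id for b in queue_order(pending)])
```

Here the normal build 3 is picked before the scheduled builds 1 and 2 even though it was created later.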
Christophe Monniez
84ea0e7ef2 [FIX] runbot: use a repo short name to avoid bugs in qweb template
When using a local git repo, the git name does not have colon, making
the frontend crash.

With this commit, a non-stored computed field 'short_name' is added to
compute a shorter version of the name.
2019-04-19 15:52:28 +02:00
Christophe Monniez
82f881d9e6 [IMP] runbot: detach odoo command from docker_run
When the docker_run function is called, the odoo command is decorated
with a pip command to install required packages.
This pollutes the docker_run function if a runbot job_ method wants to
use docker for something other than starting an odoo instance (like
pg_dump, for example).

With this commit, command modification is made in an optional helper
function named build_odoo_cmd.

The docker_run function now needs the command to run as a string instead
of a list of odoo cmd and its parameters.
2019-04-16 16:41:50 +02:00
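The split described in 82f881d9e6 could be sketched as below. Only the names docker_run and build_odoo_cmd come from the commit message; the signatures, the pip step, the requirement file name and the image name are assumptions, and nothing is actually executed.

```python
import shlex


def build_odoo_cmd(odoo_cmd):
    """Optional helper: decorate an odoo command list with a pip install step."""
    pip_step = "pip3 install -r requirements.txt"  # assumed requirement file
    odoo_part = " ".join(shlex.quote(part) for part in odoo_cmd)
    return "%s && %s" % (pip_step, odoo_part)


def docker_run(run_cmd, image="example-image"):
    # Takes the command as a plain string, so it no longer needs to know
    # anything about odoo. Only the argument list is assembled here.
    return ["docker", "run", "--rm", image, "/bin/bash", "-c", run_cmd]


# Starting an odoo instance goes through the helper ...
print(docker_run(build_odoo_cmd(["odoo-bin", "-d", "test db"])))
# ... while other jobs can pass any command directly.
print(docker_run("pg_dump mydb"))
```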
Xavier Morel
aa614c6077 [IMP] runbot_merge: more reliable blocked attribute
Use the proper / actual "is there any stageable PR" query to check if
a PR is blocked as well, that way they shouldn't be diverging all the
time even if it might make PR.blocked a bit more expensive.

fixes #111
2019-04-05 08:23:56 +02:00
Christophe Monniez
6a8d34bb68 [IMP] runbot: add a small test for the _ask_kill method
The previous commit 574105b fixed the fact that killing a duplicate was
not possible.

This commit adds a small test to avoid regression.
2019-03-29 09:50:39 +01:00
Christophe Monniez
574105b66c [FIX] runbot: allow to kill a duplicate
Asking for the kill of a build which is the duplicate of another fails
because the state of this build is "duplicate", so the _ask_kill method
has no effect on it.

With this commit, the effect of _ask_kill is applied on the duplicate in
the above mentioned case.
2019-03-28 13:46:20 +01:00
Christophe Monniez
eb68de40f3 [FIX] runbot: speedup and limit search in frontend
When searching the builds for the frontend the resulting query can last
a very long time (up to 7sec).

With this commit, the search result is strictly limited to 100 builds,
the limit query parameter is removed and the search string length is limited to
60 chars.

The guess_result method is now optimized to guess results for testing
builds only. The others have the same value as the final result.
A few tests were added for this method.
Thanks @KangOl for the optimization code.
2019-03-19 16:55:36 +01:00
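The two limits introduced in eb68de40f3 amount to something like the sketch below. The function names and the in-memory list of build names are hypothetical; the real code applies the limits to a database search.

```python
MAX_BUILDS = 100      # hard cap on the number of results
MAX_SEARCH_LEN = 60   # maximum length of the search string


def sanitize_search(search):
    # Truncate over-long search strings before they reach the query.
    return (search or "")[:MAX_SEARCH_LEN]


def search_builds(build_names, search):
    needle = sanitize_search(search)
    matches = [name for name in build_names if needle in name]
    # The caller can no longer raise the limit through a query parameter.
    return matches[:MAX_BUILDS]


names = ["build-%d" % i for i in range(500)]
print(len(search_builds(names, "build")))
```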
Christophe Monniez
f50b13172d [FIX] runbot: invalidate cache to get valid hook_time
When github reaches the hook controller, the repo hook_time field is
updated. That way, a git fetch is done only when the hook_time is newer
than the last fetch. If the hook_time is updated during the long running
cron that runs the _cron_fetch_and_schedule method, the hook_time is cached
and only the old hook time is seen until the cron's end. The cursor
commit is not enough. As a result, the new builds are scheduled in the
next cron run.

With this commit, the cache is invalidated after the commit, that way,
the hook_time field contains the correct value.
2019-03-15 08:08:22 +01:00
Christophe Monniez
64694a6b0b [FIX] runbot: find duplicate when no PR in community
When a PR is created in odoo/enterprise but without a corresponding
PR in odoo/community BUT a corresponding branch in odoo-dev/community,
the closest_branch detection fails. Moreover, the duplicate detection
fails too.

As a consequence, the PR build will probably fail because it will be
built with the default target branch that may not be suited for it.
If the branch build succeeds, it leads to inconsistent results.

With this commit, a new case is added on the _get_closest_branch_name
to handle this case.
The server_match field also reflects the difference as this case will
be marked as 'no PR'.
When a PR also exists in odoo/community, the server_match field will be
'exact PR'. This change should not imply migration.

This commit also adds a bunch of tests to test the closest branch name
detection and the duplicates.

Co-authored by @Xavier-Do
2019-03-14 08:59:13 +01:00
Christophe Monniez
da6551c28c [FIX] runbot: properly convert update frequency into integer 2019-03-11 22:01:17 +01:00
Xavier Morel
e12e6db653 [IMP] runbot_merge: lock commit to update its status in hook
A status being updated on a commit is a read/modify/update, meaning
it's possible for somebody else (including a concurrent event?) to
concurrently update the commit and conflict leading to the webhook
blowing up, which is undesirable as it's a data loss (whereas if it
blows up on the other side e.g. in the cron's commit processor the
cron will just take it up next iteration).
2019-03-11 14:54:58 +01:00
Xavier Morel
f5d783eb4b [FIX] runbot_merge: error in template
8d011e03d2 contains poopy bits in the
template, fix them.
2019-03-11 14:51:15 +01:00
Xavier Morel
e0320664f9 [ADD] runbot_merge: sentry integration
Might eventually extract / generalise, but for now it's simpler to
just do it in runbot_merge's post_load, that way there's no setup
change (just a small bit of configuration), and it's only enabled on
the instances runbot_merge is installed on.

fixes #97, closes #103
2019-03-07 11:56:45 +01:00
Christophe Simonis
10b456deda [FIX] runbot: update git before logging last commit 2019-03-07 10:40:54 +01:00
xmo-odoo
4944d6a503 [FIX] runbot_merge: small typo in error message 2019-03-06 22:50:55 +01:00
Xavier Morel
8d011e03d2 [IMP] runbot_merge: styling of awaiting and blocked lists on dashboard 2019-03-06 09:46:57 +01:00
Xavier Morel
48e08b657b [IMP] runbot_merge: send feedback on CI failure following r+
Will comment any time a statuses update folds to a CI failure on a
reviewed pull request. Might be somewhat spammy, we'll see.

No notification if the PR is not reviewed yet.

fixes #87
2019-03-05 09:03:26 +01:00
Xavier Morel
5aa9f5a567 [IMP] runbot_merge: extract commit validation to cron
Before this, impacting a commit's statuses on the relevant PR or
staging would be performed immediately / inline with its
consumption. This, however, is problematic if we want to implement
additional processing like #87 (and possibly though probably not #52):
webhook handlers should be kept short and fast, feeding back into
github would not be acceptable.

- flag commits as needing processing instead of processing them
  immediately, this uses a partial index as it looks like the
  recommended / proper way to index a boolean column in which one of
  the values is searched much more than the other (todo: eventually
  check if that actually does anything)
- add a new cron for commits processing
- alter tests so they use this new cron (mostly by migrating them to
  `run_crons` though not solely as some still need more detailed
  management to properly check intermediate steps)

Fix an issue with closing a staged PR while at it (the "merging" tag
would potentially never be removed).
2019-03-05 08:07:19 +01:00
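The "flag now, process later" approach of 5aa9f5a567 can be sketched as follows. The example uses SQLite (which also supports partial indexes) instead of PostgreSQL, and the table, column, index and function names are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE commits (sha TEXT PRIMARY KEY, to_check BOOLEAN NOT NULL DEFAULT 0)"
)
# Partial index: only the (few) rows still needing processing are indexed.
conn.execute("CREATE INDEX commits_to_check_idx ON commits (to_check) WHERE to_check")


def webhook(sha):
    # Fast path: the handler only flags the commit and returns.
    conn.execute("INSERT OR REPLACE INTO commits (sha, to_check) VALUES (?, 1)", (sha,))


def cron():
    # Slow path: process every flagged commit, then clear the flags.
    rows = conn.execute("SELECT sha FROM commits WHERE to_check ORDER BY sha").fetchall()
    conn.execute("UPDATE commits SET to_check = 0 WHERE to_check")
    return [sha for (sha,) in rows]


webhook("abc")
webhook("def")
processed = cron()
print(processed)
```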
Xavier Morel
360d0e17ca [IMP] runbot_merge: don't quote signoff
Proper RFC5322 makes for much noisier messages, and seems completely
unnecessary as examples of sign-off on the internet don't quote spaces
/ names.

closes #102
2019-03-04 13:17:10 +01:00
Xavier Morel
1f30af4345 [IMP] runbot_merge: dashboard clarity
* split out truly awaiting PRs from those waiting on an event of some
  sort
* if a staging is active but doesn't have a state yet, it should be
  considered pending not cancelled

closes #74
2019-03-04 12:11:34 +01:00
Xavier Morel
b699ea7f47 [FIX] runbot_merge: validate PRs on head update
If a PR gets sync'd to a known-valid commit, it should be marked as
valid rather than get in this weird state where it's merely open but
github knows it passes CI.

Fixes #72
2019-03-04 10:34:40 +01:00
Xavier Morel
1d2c264728 [FIX] runbot_merge: properly update squash flag on PR retarget
closes #82
2019-03-04 09:52:21 +01:00
Xavier Morel
8ab72ce8d1 [FIX] runbot_merge: gap in PR names in awaiting list
Repeatedly fixed it live, but apparently still forgot to fix it in the
source.
2019-03-01 21:34:31 +01:00
Christophe Monniez
51938247d8 [FIX] runbot: send failure status to github when result is warn
When a runbot build ends without error but with one or more warnings,
statuses are not sent to github. As a result, the PR stays in pending
state.

With this commit, the github status is set to failure when a build ends
in a "warn" result.
2019-03-01 17:33:03 +01:00
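The behaviour introduced in 51938247d8 boils down to a mapping like the one sketched below. The function name and the result values other than "warn" are assumptions; the returned strings are the commit status states accepted by the GitHub API.

```python
def github_state(result):
    """Hypothetical mapping from a build result to a GitHub status state."""
    if result == "ok":
        return "success"
    if result in ("ko", "warn"):
        # A build that ends with warnings is now reported as a failure
        # instead of leaving the PR in the pending state.
        return "failure"
    # Assumed fallback for any other result (killed, skipped, ...).
    return "error"


print(github_state("warn"))
```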
Xavier Morel
c693a7f841 [ADD] runbot_merge: button to manually cancel stagings
This is somewhat less useful with runbot's fail-fast as a runbot
failure (false positive or not) will now very quickly trigger an end
to the current staging.

Still, could be of use.

closes #89
2019-03-01 17:29:37 +01:00
Xavier Morel
eea3211f2b [IMP] runbot_merge: add logging to PR sync and reset error PRs to open
The choice to keep sync'd PRs in error means it's possible to update
the code and re-run the PR directly without it going through review &
CI again, which is a bit odd. Remove the special case and always reset
a sync'd PR to opened for clarity and simplicity.

closes #71
closes #83
2019-03-01 16:46:09 +01:00
Xavier Morel
c34e8ca083 [FIX] runbot_merge: race condition between closes #x and merging/FF
Turns out skipping locks is not very useful when there are no locks
being held because we only touch the PRs *after* the merge has been
applied.

So finally do that, lock all of a staging's PRs before we try to
fast-forward the relevant repositories, so a close command coming back
from github (from having seen the closes #xxx annotation) doesn't
screw us over.
2019-03-01 16:46:09 +01:00
Xavier Morel
0cd587fce7 [FIX] runbot_merge: don't blow the fetch loop when a PR has no label
No test because I don't understand how / why it's triggered, it's just
that some PRs don't have a label. I assumed the issue occurred when
the source branch or even repo (cross-repo PR) was deleted, but it
doesn't seem to trigger the issue (or in any case not in as short a
time as a test, maybe GH eventually does some vacuuming which causes
the issue?)

Anyway we may eventually want to reclaim these PRs (allowing a lack of
label and treating them like the patch-\d labels: with no semantic
value) however the simplest thing to do for now is to just ignore the
corresponding PR.

closes #101
2019-03-01 16:42:58 +01:00
Xavier Morel
55ece42d8f [IMP] runbot_merge: delete repos being created in remote tests
In remote tests, if the deletion of a test repository fails (because
gh glitch) or the repo creation succeeded but reported a failure (for
some reason) the entire run is hosed because every test trying to
create a similarly named repository will explode.

Alter repomaker to just try to delete the repo, unless --no-delete
mode in which case just skip any further test trying to use the same
repository (not deleting the repo is the entire point of --no-delete,
as its purpose is the ability to do post-mortem debugging on
repository state).

closes #99
2019-03-01 16:42:57 +01:00
Xavier Morel
79b03a6995 [IMP] runbot_merge: retry FF on failure in case it's transient
Further improvements are possible, but that seems like a good
start (hopefully).

closes #94
2019-03-01 16:42:57 +01:00
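Retrying a fast-forward on transient failure, as in 79b03a6995, can be sketched with a generic helper like the one below. The helper name, the number of attempts, the delay and the use of RuntimeError as the "transient failure" signal are all assumptions, not the actual runbot_merge code.

```python
import time


def retry(operation, attempts=3, delay=0.0):
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except RuntimeError as error:
            last_error = error
            # Wait before the next try, but not after the final one.
            if attempt < attempts - 1:
                time.sleep(delay)
    raise last_error


calls = []


def flaky_fast_forward():
    # Simulated fast-forward that fails twice before succeeding.
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("transient failure")
    return "fast-forwarded"


print(retry(flaky_fast_forward))
```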
Xavier Morel
42046cb21c [IMP] runbot_merge: logging on github requests failures
Github is subject to a fair amount of transient failures, which are
currently ill-logged: an exception is raised and the caller /
responsible might eventually log something, but it's not really
formalised and centralised, and is thus inconvenient to try and
post-mortem issues with github's support.

Change this such that *almost* all github API calls get extensively
logged (status, reason, all headers, body) on failure.

Also automatically sets debug logging for odoo in local tests, and
alter the fake response constructor thing so it doesn't set a json
mimetype when the body is not valid json.

Closes #98
2019-03-01 16:42:57 +01:00
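The centralised failure logging described in 42046cb21c might look like the sketch below. `FakeResponse` is a made-up stand-in for an HTTP response object, and the helper name, logger name and the "status >= 400" criterion are assumptions.

```python
import logging
from dataclasses import dataclass, field

_logger = logging.getLogger("github_sketch")


@dataclass
class FakeResponse:
    status_code: int
    reason: str
    headers: dict = field(default_factory=dict)
    text: str = ""


def log_failure(response):
    """Log status, reason, headers and body of a failed call; return True on failure."""
    if response.status_code < 400:
        return False
    _logger.warning(
        "GitHub request failed: %s %s\nheaders: %s\nbody: %s",
        response.status_code, response.reason, dict(response.headers), response.text,
    )
    return True


print(log_failure(FakeResponse(502, "Bad Gateway", {"X-Request-Id": "made-up"}, "oops")))
```

Logging in one place means every failed call leaves the same kind of trace, which is what makes a later post-mortem practical.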
Xavier Morel
f579b28a93 [FIX] runbot_merge: logging.warn is deprecated, use logging.warning 2019-02-28 13:50:25 +01:00
Christophe Monniez
286b1a3d30 [FIX] runbot: update dependency repo before checkout
At checkout time when a build has no server (e.g. enterprise),
the dependency repo that contains the server needs to be extracted too.
It happens that this dependency repo is not up to date.

With this commit, the dependency repo is updated before its extraction.
2019-02-26 17:34:34 +01:00
Christophe Monniez
6e57b0954d [FIX] runbot: limit duplicate builds search to one
When searching for duplicate builds, a git ls-remote is used to verify
that the branch still exists. This command is time consuming (up to 2
seconds).

If the number of builds is significant, it can last a very long time.

When a user pushes one or more new branches without new commits, the
number of duplicate builds found may be very large (more than 92).
This loop blocks the cron worker in charge of creating new builds.

This quick fix will limit the number of duplicates to 1 but if the
closest name is not the same, it will not be considered as a duplicate.
2019-02-26 15:44:58 +01:00
Christophe Simonis
19ffcdd4a2 [ADD] runbot_merge: sign off commits by reviewer
closes #50
closes #54
2019-02-26 13:36:46 +01:00
Christophe Monniez
41ce858c93 [FIX] runbot: fix oversight of logging reason in _skip
When giving a reason for a build skip, the reason parameter was missing
in the logging string.

Test is also added to verify the build skip.
2019-02-26 10:19:54 +01:00
Christophe Monniez
1617a2e339 [FIX] runbot: force repo update on runbot builders
When a runbot executes the cron_fetch_and_build method, the repo is
updated only if the webhook time is newer than the last fetch
time.

As the cron is now split into long running crons, the hook_time field is
cached. The runbot instance that sees a new build pending uses this
cached value to estimate if the repo update is needed.

With this commit, the repo update is done right before exporting the
repo and only if the commit hash is not found.

As a bonus, the environment is reset in the long running cron of the
runbot builders to update the cached values.
2019-02-25 17:35:22 +01:00
Christophe Monniez
fe018aeefa [REF] runbot: split cron to speed up builds
The Runbot Cron is executed on each runbot instance. When the number of
instances scales, the time needed for an instance to obtain the cron
increases.

With this commit, the original runbot_cron is removed. Instead, a cron
has to be created to run the _cron_fetch_and_schedule method.
This method will fetch the repo and create pending builds. This cron is
intended to run only on one runbot instance. This method needs a host
parameter to specify which runbot instance will be in charge of this
task.

On the other hand, a dedicated cron has to be manually created for each
runbot instance that will have the build task.
Those crons only have to call the _cron_fetch_and_build method with the
runbot hostname as a parameter. This method will then self
assign pending builds if there are slots available.
All available build slots are reserved in a single LOCKED SQL query.

Both methods are intended to run for a long time, just a few
minutes below the cron timeout to maximize the cron productivity.
The timeout is randomized to avoid deadlocks if the runbot instances are
started at the same time.

So the --limit-time-real parameter has to be set to a minimum of 180
sec (600 or 1200 are probably better targets).
2019-02-25 11:25:41 +01:00
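The randomized timeout mentioned in fe018aeefa can be illustrated with the sketch below. The function name, the safety margin and the jitter range are assumptions; the commit message only says the crons stop shortly below the cron timeout and that the value is randomized.

```python
import random


def loop_budget(limit_time_real, margin=60, jitter=30):
    # Stay `margin` seconds below the hard cron limit, then subtract a
    # random jitter so that instances started at the same time do not
    # all end (and restart) at the same moment.
    return limit_time_real - margin - random.uniform(0, jitter)


budget = loop_budget(600)
print(510 <= budget <= 540)
```

With --limit-time-real set to 600, the long-running loop would stop somewhere between 510 and 540 seconds under these assumed values.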
Christophe Monniez
ffd27739a4 [FIX] runbot: limit number of log lines shown on build page
When displaying build logs, all the messages from ir_logging about this
particular build are fetched from the database.

From time to time, it happens that the number of logged messages is
really huge. Those message lines could also contain multiple lines,
multiplying the number of rows to generate in the html page.
When this happens, the process that generates the template lasts a long
time and ends with a MemoryError. If the end user, bored, hits the
refresh button multiple times, all the workers will be busy building
this template. In the end, all users get a Bad Gateway from nginx.

With this commit, the number of messages that will be taken into account
will be limited to 10000.
2019-02-21 15:57:14 +01:00
Christophe Monniez
602298330a [IMP] runbot: send status earlier when the build fails
When a user checks the runbot frontend, the guess_result field is used
to change the color of the build state. But github is not notified of
this guessed result.

As a consequence, the runbot_merge is not aware the build has failed and
will continue to wait.

With this commit, as soon as the guess_result detects a failure, the
status is sent to github, that way, runbot_merge will stop waiting
sooner.
2019-02-21 13:11:46 +01:00
Christophe Monniez
2c7feffd5e [IMP] runbot: avoid running test when installing base
When running the _job_10 method, a database is created with base module
alone. Tests are enabled during this job. Those tests are run again with
the _job_20 method. Moreover, even if the tests fail during _job_10,
they are not taken into account for the final result. The _job_10 method
duration is approximately 4 min.

With this commit, the tests are not enabled during _job_10.
2019-02-21 13:11:46 +01:00
Christophe Monniez
809f5639c2 [FIX] runbot: remove common ancestor case
Useless merge-base command is causing timeouts on runbots.
2019-02-20 10:09:36 +01:00
Christophe Monniez
4791c3a82e [IMP] runbot: add pyCrypto to Dockerfile
A new module in Odoo needs pyCrypto but this module alone is too limited
to justify an addition in the requirement file.

PR: https://github.com/odoo/odoo/pull/28816
2019-02-12 15:54:29 +01:00
Christophe Simonis
338166e474 [FIX] runbot: correct pg version detection
`local_cr` is directly a psycopg cursor, not an odoo one.
2019-01-31 16:29:21 +01:00