Might eventually extract / generalise, but for now it's simpler to
just do it in runbot_merge's post_load, that way there's no setup
change (just a small bit of configuration), and it's only enabled on
the instances runbot_merge is installed on.
fixes#97, closes#103
Will comment any time a statuses update folds to a CI failure on a
reviewed pull request. Might be somewhat spammy, we'll see.
No notification if the PR is not reviewed yet.
fixes#87
Before this, impacting a commit's statuses on the relevant PR or
staging would be performed immediatly / inline with its
consumption. This, however, is problematic if we want to implement
additional processing like #87 (and possibly though probably not #52):
webhook handlers should be kept short and fast, feeding back into
github would not be acceptable.
- flag commits as needing processing instead of processing them
immediately, this uses a partial index as it looks like the
recommended / proper way to index a boolean column in which one of
the values is searched much more than the other (todo: eventually
check if that actually does anythnig)
- add a new cron for commits processing
- alter tests so they use this new cron (mostly by migrating them to
`run_crons` though not solely as some still need more detailed
management to properly check intermediate steps)
Fix an issue with closing a staged PR while at it (the "merging" tag
would potentially never be removed).
Proper RFC5322 makes for much noisier messages, and seems completely
unnecessary as examples of sign-off on the internet don't quote spaces
/ names.
closes#102
* split out truly awaiting PRs from those waiting on an event of some
sort
* if a staging is active but doesn't have a state yet, it should be
considered pending not cancelled
closes#74
If a PR gets sync'd to a known-valid commit, it should be marked as
valid rather than get in this weird state where it's merely open but
github knows it passes CI.
Fixes#72
When a runbot build ends without error but with one or more warning,
status are not sent to github. As a result, the PR stays in pending
state.
With this commit, the github status is set to failure when a build ends
in a "warn" result.
This is somewhat less useful with runbot's fail-fast as a runbot
failure (false positive or not) will now very quickly trigger an end
to the current staging.
Still, could be of use.
closes#89
The choice to keep sync'd PRs in error means it's possible to update
the code and re-run the PR directly without it going through review &
CI again, which is a bit odd. Remove the special case and always reset
a sync'd PR to opened for clarity and simplicity.
closes#71closes#83
Turns out skipping locks is not very useful when there are no locks
being held because we only touch the PRs *after* the merge has been
applied.
So finally do that, lock all of a staging's PRs before we try to
fast-forward the relevant repositories, so a close command coming back
from github (from having seen the closes #xxx annotation) doesn't
screw us over.
No test because I don't understand how / why it's triggered, it's just
that some PRs don't have a label. I assumed the issue occurred when
the source branch or even repo (cross-repo PR) was deleted, but it
doesn't seem to trigger the issue (or in any case not in as short a
time as a test, maybe GH eventually does some vacuuming which causes
the issue?
Anyway we may eventually want to reclaim these PRs (allowing a lack of
label and treating them like the patch-\d labels: with no semantic
value) however the simplest thing to do for now is to just ignore the
corresponding PR.
closes#101
In remote tests, if the deletion of a test repository fails (because
gh glitch) or the repo creation succeeded but reported a failure (for
some reason) the entire run is hosed because every test trying to
create a similarly named repository will explode.
Alter repomaker to just try to delete the repo, unless --no-delete
mode in which case just skip any further test trying to use the same
repository (not deleting the repo is the entire point of --no-delete,
as its purpose is the ability to do post-mortem debugging on
repository state).
closes#99
Github is subject to a fair amount of transient failures, which are
currently ill-logged: an exception is raised and the caller /
responsible might eventually log something, but it's not really
formalised and centralised, and is thus inconvenient to try and
post-mortem issues with github's support.
Change this such that *almost* all github API calls get extensively
logged (status, reason, all headers, body) on failure.
Also automatically sets debug logging for odoo in local tests, and
alter the fake response constructor thing so it doesn't set a json
mimetype when the body is not valid json.
Closes#98
At checkout time when a build has no server (e.g. enterprise),
the dependency repo that contains the server needs to be extracted too.
It happens that this dependency repo is not up to date.
With this commit, the dependency repo is updated before its extracttion.
When searching for duplicate builds, a git ls-remote is used to verify
that the branch still exists. This command is time consuming (up to 2
seconds).
If the number of build is significant, it can last a very long time.
When a user push one ore more new branches without new commits, the
number of duplicate builds found may be very large (more than 92).
This loop blocks the cron wroker in charge of creating new builds.
This quick fix will limit the number of duplicate to 1 but if the
closest name is not the same, it will not be considered as a duplicate.
When a runbot execute the cron_fetch_and_build method, the repo is
updated only if the webhook time is newer than the last fetch
time.
As the cron is now split into long running crons, the hook_time field is
cached. The runbot instance that sees a new build pending use this
cached value to estimate if the repo update is needed.
With this commit, the repo update is done right before exporting the
repo and only if the commit hash is not found.
As a bonus, the environment is reset in the long running cron of the
runbot builders to update the cached values.
The Runbot Cron is executed on each runbot instance. When the number of
instances scales, the time needed for an instance to obtain the cron
increases.
With this commit, the original runbot_cron is removed. Instead, a cron
have to be created to run the _cron_fetch_and_schedule method.
This method will fetch the repo and create pending builds. This cron is
intended to run only on one runbot instance. This method needs a host
parameter to specify which runbot instance will be in charge of this
task.
On the other hand, a dedicated cron have to be manually created for each
runbot instance that will have the build task.
Those cron's only have to call the _cron_fetch_and_build method with the
runbot hostname as a parameter. This method will then self
assign pending builds if there are slots available.
All available build slots are reserved in a single LOCKED SQL query.
Both methods are intended to last a large amount of time, just a few
minutes below the cron timeout to maximize the cron productivity.
The timeout is randomized to avoid deadlocks if the runbot instances are
started at the same time.
So the --limit-time-real parameter have to be set to a minimum of 180
sec (600 or 1200 are probably better targets).
When displaying build logs, all the messages from ir_logging about this
particular build are fetched from the database.
From time to times, it happens that the number of logged messages is
really huge. Those messages lines could also contain multiple lines,
multiplying the number of row to generate in the html page.
When this happens, the process that generates the template last a long
time and ends with a MemoryError. If the end user, bored, hits the
refresh button multiple times, all the workers will be busy building
this template. In the end, all users get a Bad Gateway from nginx.
With this commit, the number of messages that will be taken into account
will be limited to 10000.
When a user checks the runbot frontend, the guess_result field is used
to change the color of the build state. But github is not notified of
this guessed result.
As a consequence, the runbot_merge is not aware the build is failed and
will continue to wait.
With this commit, as soon as the guess_result detects a failure, the
status is sent to github, that way, runbot_merge will stop waiting
sooner.
When running the _job_10 method, a database is created with base module
alone. Tests are enabled during this job. Those tests are run again with
the _job_20 method. Moreover, even if the tests fail during _job_10,
they are not taken into account for the final result. The _job_10 method
duration is approximately 4 min.
With this commit, the tests are not enabled during _job_10.
A new module in Odoo needs pyCrypto but this module alone is too limited
to justify an addition in the requirement file.
PR: https://github.com/odoo/odoo/pull/28816
The number is probably the most common search criteria for PRs (to
track their status / issues). Having to go through custom filters to
find one is a pain in the ass.
Already done live by editing the view, but means it's getting lost
every time the module gets updated.
closes#73
Adapt test for eb7f5de . The mentioned commit fixes an issue that occurs
when updating github status. A test already exists but was assuming that
the build is in 'done' state when reaching the job_29.
With this commit, the build used in test is set to 'testing' state like
in the real cases.
Also, a new test is added to test the job_00_init which also send
github status but with a minimal build.
Finally, the runtime that should appear in status description was
forgotten in the previous commit. Now the runtime is always sent with
the github status.
Since d7c7e54 the github status is send in job_29. At this moment, the
state of the build is still 'testing'. For that reason, the github
status is set to 'pending'.
With this commit, once a result is available, the github status is
updated with the right value even if the build state is 'testing'.
When a build reach the job30_run method, results from a previous testing
methods are computed.
With the previous commit 8c73e6a901 this
job can now be skipped. In that case, the results are not set.
With this commit, the results are computed in a separate method.
Since 8c73e6a it's possible to skip jobs from a build by using the
job_type field on the branch. If a branch job_type is set to 'none', the
builds are created but they stay in 'pending' state.
With this commit, the build is not even created if the 'job_type' is
'none'.
Since the runbot_merge module, some branches does not need to be built.
For example the tmp.* branches.
Some other branches does need to be tested but it could be useless to
keep them running. For example the staging branches.
Finally, some builds are generated by server actions during the night.
Those builds does not need to be kept running despite the branch configuration.
For example, the master branch can be configured to create builds with
testing and running but nightly multiple builds can be generated with
testing only.
For that purpose, this commit adds a job_type selection field on the
branch. That way, a branch can be configured by selecting the type of
jobs wanted.
A same kind of job_type was also added on the build that uses the
branch's value if nothing is specified at build creation.
A decorator is used on the job_ methods to specify their job types.
For example, a job method decorated by 'testing' will run if the
branch/build job_type is 'testing' or 'all'.
When a build is created, the --log-db command line argument is built
using the same db and credentials that the one used by the runbot.
With this commit, this argument is built based on a postgress connect
URI given as ir.config_parameter in the settings.
A dedicated role must be created beforehand on the runbot postgresql
server, accordingly to the given URI.
Also, care should be taken to give minimal privileges to this user only
granting "update" on the table ir_logging_id_seq and
"insert,select,udpate" on the table ir_logging.
To test the last resort branch matching when nothing in common can be
found, two PR were used leading to PR's target branch as the default
one.
Also, the test was never run beacause of a bad indentation.
With this commit, the indentation is fixed and the test uses regular
branches.
In frontend.py, the whole odoo.http module is imported but request is
imported separately. This make it difficult to mock the different things
comming from http in tests.
With this commit, only the needed parts are imported from odoo.http.
When a build is running, the stmp is the localhost.
Since Docker builds, the localhost is the container which does not catch
port 25 smtp. Mails are lost in the limbo.
With this commit, the default gateway of the Docker network is used as
smtp host for the builds. It's the responsability of the runbot host to
catch smtp traffic from the container.
This bridge interface exists by default on a system where Docker is
running. However, Docker is affected by this issue:
https://github.com/moby/moby/issues/26799
The first time the Docker daemon is installed, the Gateway is not
defined on the bridge interface. When the Docker daemon is restarted,
the gateway is correctly defined. Pay attention that restarting the
Docker daemon will kill all the running/testing builds.
When starting a container, the .odoorc|.openerp_serverrc file is not
used by the build.
With this commit, if a .odoorc or .openerp_serverrc file is found in the
home directory of the runbot user, this file is mounted read-only in the
container, allowing some customization.
When building Odoo, the instance is started on the same host as the
runbot. It means that all the required python packages have to be
installed on each runbot hosts with the same versions. Also there is no
real separation between builds. Finally, from a security point of view,
arbitrary code could be executed on the runbot host.
With this commit, the runbot uses Docker containers to build Odoo.
During the tests, Odoo http ports are not exposed to the outside,
meaning that nobody could interact with that instance.
The Docker image used for containers is valid for Odoo branches 10.0,
11.0, 12.0 and master.
When building, right before starting the Odoo tests, the tested branch's
requirements.txt is now taken into account to adapt the container.
On a runbot host, the "docker ps -a" command can be used to have the
list of the current builds. The containers are named using the build
dest field and the current running job. For example:
123456-12-0-123456_job_30_run
Prerequisites:
Docker have to be installed on the runbot hosts and the user that runs
the runbot should be able to use Docker. Typically, the runbot user have
to be added to the docker unix group.
On the first build, the Docker image will be built from scratch. It
can last several minutes locking the runbot cron during this time.
It means that on a multi-runbot configuration, this process will be
repeated for each runbot and during this time there will be no builds.
To avoid such a situation, the Docker image can be built from the
command line. The container.py file can be started like this:
python3 container.py build /tmp/build_dir
The /tmp/build_dir directory will be created to store the Dockerfile.
When the process is done, the "docker images" command should show an
image tagged runbot_tests in the odoo repository. At that time, the
runbot instance can be started, it will use this image for the builds.
Api change:
The 'job_*' methods signature has changed, the lock_path is not needed anymore.
Docker image informations:
Currently, the Docker image is built based on Ubuntu bionic to
benefit of the python 3.6 version.
Chrome and phantomjs are both installed.
The latest wkhtmltopdf (0.12.5) is installed as recommended on our wiki:
https://github.com/odoo/odoo/wiki/Wkhtmltopdf