When two steps in the same build need to exchange information, some
hacks have to be used, e.g. using the extra_params field to store
comma-separated values.
With this commit, a config_data field is added along with a
JsonDictField that automatically transforms the data into JSON.
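As a rough illustration of the idea, the sketch below shows the dict-to-JSON round trip such a field could perform. It is plain Python, not the actual Odoo field, and the helper names to_column and to_record are hypothetical.

```python
import json


def to_column(value):
    """Serialize a dict to a JSON string for storage (None stays None)."""
    if value is None:
        return None
    if not isinstance(value, dict):
        raise TypeError("config_data must be a dict, got %s" % type(value).__name__)
    return json.dumps(value, sort_keys=True)


def to_record(stored):
    """Deserialize the stored JSON string back into a dict (empty if unset)."""
    return {} if not stored else json.loads(stored)


# A first step stores structured data, a later step reads it back.
stored = to_column({"db_suffix": "all", "modules": ["base", "web"]})
config_data = to_record(stored)
```

Compared to comma-separated values in extra_params, nested structures survive the round trip without any ad hoc parsing.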
When changing the sticky value in a branch form, it triggers the
compute of the previous version and intermediate versions.
In the onchange situation, the branch.id is a NEW id and it fails with a
traceback.
With this commit, a verification is made to ensure that the id is there.
As the Odoo requirements were recently updated, the last RUN entry in
the Docker image was rebuilt on our runbot instances.
Since that moment the coverage builds are failing with an import error.
After investigation, it appears that the latest coverage version modifies the sys.path.
A bug report was written for coverage: https://github.com/nedbat/coveragepy/issues/919
For that reason, this commit freezes the coverage version until this bug
is fixed.
Also, the os import problem in container.py is fixed.
The ORM does not support a non_searchable.non_stored dependency.
Thus, the closest_sticky.previous_version dependency will log an error
when previous_version is written.
This dependency is useful to make the compute recursive, avoiding having
both record and record.closest_sticky in self, in that order, which would make record.previous_version
empty in all cases.
Writing self on sticky will mitigate the problem, but it is still possible to
have computation errors if defined_sticky is not sticky (which is not a normal use case).
This reverts commit 54f9b9b546.
The main reason is linked to inconsistencies in the state compute because
of errors in the nb_ computations.
This was meant to avoid concurrent updates, which is not such a big problem
now since workers are no longer using crons (retry on failure is faster).
When trying to remove test_tags on a build_error, the validation fails
because it tries to iterate on False. Also, the ValidationError
exception was not properly imported.
With this commit, the validation is fixed and a test is added.
When a repo is set to "no_build", there is no way to force a branch to
build from the backend.
With this commit, a button is added on a branch form to ask rebuild of the
branch, even when the repo is set to "no_build".
With migration tests coming to runbot, it will be useful to have a quick
way to obtain the branches related to the current build.
This commit adds a field for the closest sticky branch, previous version
and intermediate stickies.
Example when last sticky is saas-13.2:
branch_name: master-test-tri
closest_sticky: master
previous_version: 13.0
intermediate_stickies: saas-13.1, saas-13.2
This commit adds the possibility to add custom checks to python steps,
as well as to ignore the triggered result when logs of level error/warning
are not considered a problem.
As the --upgrade-paths option does not work as expected in Odoo, a
symbolic link has to be created in odoo/addons/base/maintenance pointing
to the migration scripts.
The runbot uses Docker read-only volumes to access the sources that are
shared between builds, preventing the creation of such a link.
With this commit, a symbolic link is created right after the export of a
commit only when the repo is a "server" repo.
This link is broken outside of the Docker volumes but uses the mount
points of the sources inside the container.
Two ir.config_parameter's are used to enable and configure this feature:
* runbot_migration_ln: the relative path where the link should be placed
(typically odoo/addons/base/maintenance)
* runbot_migration_repo_id: the id of the migration scripts repo, used
to determine the name of the mount point inside the Docker container
A change is also made in the "reverse dependency build" to avoid the
creation of a build in the migration repo for each push in its
dependency. Simply set the no_build field on this migration repo.
`getmtime` returns a float with 6 decimal digits while postgres only stores 5.
Depending on rounding, _get_refs has a 1/2 chance of making an update
when it shouldn't. Rounding below psql precision before comparing and
storing should fix this.
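A minimal sketch of the rounding idea, assuming 4 decimals as the "below psql precision" value. The function names are hypothetical and not taken from the runbot code.

```python
import os

PRECISION = 4  # assumption: one digit less than what the database keeps


def round_time(value, ndigits=PRECISION):
    """Round a timestamp below the storage precision."""
    return round(value, ndigits)


def needs_update(file_mtime, stored_time):
    """Compare rounded values so a storage round trip never looks like a change."""
    return round_time(file_mtime) != round_time(stored_time)


def current_ref_time(path):
    """Rounded mtime of a file, suitable for both comparing and storing."""
    return round_time(os.path.getmtime(path))
```

With this, a value that lost its sixth decimal in the database still compares equal to the value read from the filesystem.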
Sometimes, the scheduler may have a hard time creating builds.
The transaction can be very long when there are many repos and
a lot of new commits. Writing get_ref_time on the repo will fail
due to a concurrent update, rolling back the whole transaction.
This is supposedly because of hooks occurring during the transaction.
With this new model, a hook will only perform an insert, and shouldn't
interfere with ref_times.
docker_is_running is ambiguous since we don't know if it was started once.
This new feature tries to add tools to know if a docker was started or not.
The main reason for this is that docker_run may sometimes take more than 15 seconds,
creating unpredictable errors on builds when the second step is launched while
the previous one is still running. Hopefully this fix will help to solve this
issue and detect late docker runs.
When rebuilding a build/subbuild, the last commit of the branch
won't have the same sha of the branch last commit. find_new_commit
will kill the running builds in the branch and create a new one,
that may be a duplicate of one of the killed commits.
This fix only takes normal builds into account when checking for
existing builds.
This commit was initially there for tests, when no repo exists, but
get_param will also crash if the commit does not exist, which may be
a problem on user rebuild.
A build can sometimes fail and be stuck in a running state
without corresponding sources. In this case, the sources are no longer garbage collected.
This commit fixes that by always applying gc even when an inconsistency
is detected.
When creating an .odoorc file to store configuration that is specific to
the runbot, the command line options were used as keys.
The problem is that the Odoo config file uses underscores instead of dashes
for option keys.
Also, the Command object was not recreated with the config_tuples
parameter. Because of that, when adding a command in the list, the
config_tuples were lost.
As a consequence, on the Odoo runbot instance, the data dir was created
in the default dir and thus, not included in the zip file of the dump,
causing some runbot steps to fail.
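The key conversion described above can be sketched as follows. This is only an illustration: the helper names are hypothetical and the way runbot actually writes the file may differ.

```python
import configparser
import io


def option_to_config_key(option):
    """Turn a CLI option such as '--data-dir' into the config key 'data_dir'."""
    return option.lstrip("-").replace("-", "_")


def build_odoorc(config_tuples):
    """Render (option, value) pairs as the content of an .odoorc file."""
    # interpolation=None keeps '%' characters in values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser["options"] = {option_to_config_key(k): str(v) for k, v in config_tuples}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


content = build_odoorc([("--data-dir", "/data/build/datadir"), ("--http-port", 8069)])
```

Using the dashed option name as key would leave data_dir unset, which matches the symptom described above where the data dir ended up in the default location.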
A lot of things have to be mocked during runbot tests. As a consequence,
a lot of patch decorators accumulate in a big stack upon some test
methods.
Also, a lot of mocks are used multiple times among tests.
With this commit, a new RunbotClass is added that comes with patches
ready to be started. A start_patcher helper method is available to start
a patch and add the appropriate stop in a cleanup.
Also, when a build is created in the tests, the _get_params method is
always called, resulting in an annoying git warning.
With this commit, a create_build method is added on the test class, that
way the _get_params is always mocked when a build is created.
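A possible shape for such a helper, using only unittest.mock. The class below is a hypothetical stand-in, not the actual runbot test class, and os.getcwd is patched purely for demonstration.

```python
import os
import unittest
from unittest.mock import patch


class RunbotCaseSketch(unittest.TestCase):
    """Hypothetical base class showing the start_patcher idea."""

    def start_patcher(self, patcher_name, target, **kwargs):
        """Start a patch, register its stop as a cleanup and keep the mock."""
        patcher = patch(target, **kwargs)
        mock = patcher.start()
        self.addCleanup(patcher.stop)
        setattr(self, patcher_name, mock)
        return mock


class TestExample(RunbotCaseSketch):
    def setUp(self):
        super().setUp()
        self.start_patcher("getcwd_mock", "os.getcwd", return_value="/fake/build")

    def test_getcwd_is_mocked(self):
        self.assertEqual(os.getcwd(), "/fake/build")
        self.getcwd_mock.assert_called_once()


result = unittest.TextTestRunner().run(
    unittest.defaultTestLoader.loadTestsFromTestCase(TestExample)
)
```

Registering the stop with addCleanup means the patch is undone even when setUp or the test itself fails, which a stack of decorators does not make any easier to read.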
_find_new_commits will check if a build exists with the current branch HEAD
before creating a build. This is crucial to avoid creating a new build
at each loop turn. The problem is that in some rare cases, when
force-pushing an old head on a branch, the build won't appear and the only
way to update the branch is to find the corresponding build that may be
hidden in the history. This may be confusing for the user, who will rebuild the
created build with a commit that doesn't represent the head of the branch.
This commit only searches for the last build of each branch, in order to
only skip build creation if the last build has the same hash. The newly
created build should be marked as the duplicate of the first one.
When starting an odoo instance with Docker, a very long command line is
computed and appears in the logs.
With this commit, an .odoorc configuration file is written in the build
dir and mounted in the Docker container.
Previously, the runbot .odoorc/.openerprc file was mounted to share some
parameters. Now, if that file exists, its content is merged with the build
.odoorc.
When using auto_tags, most of the time, the enhanced versions are used.
For example, using "-:TestPoSStock" to disable the test class.
If the tested Odoo version does not support this kind of tag, they are
considered as simple tags, thus disabling all tests.
This is the case for Odoo saas-11.3.
With this commit, the auto_tags are only used on Odoo versions that
support the new test tags.
When runbot is installed to test custom addons, we don't
want to build every odoo commit, but we need to update branches
in order to make _get_closest_branch work.
This commit will allow a user to set odoo in poll mode
with no_build set to True, to create branches only.
(And a small fix for additionnal_env)
Since 81fefee, the container.py CLI does not work as expected.
With this commit, the CLI is working, a new arg was added to test
flamegraphs and the dump is adapted to mimic the runbot.
Also, a small issue is fixed in the zip file creation. Before the zip
creation, the directory is changed, if the directory change fails, the
zip is created from the current directory which is removed by zip at the
end of the process. That could lead to the deletion of the build dir.
A typical use case when an error is detected is to disable
the test by adding a negated test-tag on config
steps 'all' and 'split_all'. This commit will help
to do that by adding test_tags management on build errors.
The user defines a test_tag that will only execute the failing test.
If a config step has the flag enable_auto_tags, the test tag will
be negated and added to the config test-tags.
This commit also adds some information for monitoring.
The current dump version doesn't include the filestore. This new
version adds the filestore, trying to match the odoo backup format
in order to ease restore.
The manifest.json file is not created since it isn't useful,
but an info.json is added, with build info.
Creating multi builds configs can be tedious. One must create 2 build
configs and 2 build config steps in the right order.
With this commit, a simple wizard is added that creates those 4
configurations by simply filling 4 fields.
Also, a new field, group, is added in order to be able to gather
configs and config steps into groups. The group is a Many2one on a
config.
While at it, the runbot menu has been a bit rearranged with everything
about configs in a parent menu named Configs.
Config and config step tree views have been enhanced to show the
config group and add some filters in the search views.
With this commit, a new boolean field "flamegraph" is added on the
build_config to allow a flamegraph generation.
In order to be able to generate a flamegraph during a runbot build, the
flamegraph package is added to the Docker image as well as the
flamegraph.pl tool.
Dump a db at the end of a build, using a new 'finals' cmd part
added in order to execute dump even if build fails.
Add a link in last step log to download dump.
In different situations, a docker container may stay alive even if the
build global_state is done. This can lead to a build failure when a
build wants to go in running state and tries to expose the same ports as
the left over build.
This reverts commit 1207daded1.
The review was too quick: setting a default value is a good idea, but since the field is a float now,
the default value should be time.time
Currently, some Odoo modules are blacklisted from a set hardcoded in the
runbot code. In some cases, one needs to blacklist custom modules,
preferably in a config_step.
With this commit, the repo.modules, branch.modules,
config_step.install_modules fields are concatenated in a comma-separated
list of fnmatch patterns. The patterns can be prefixed with a dash to
exclude the matching module(s).
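The pattern handling can be sketched like this. It is a simplified illustration: it assumes patterns are applied in order starting from an empty selection, which may not match the real default, and filter_modules is a hypothetical name.

```python
from fnmatch import fnmatch


def filter_modules(available_modules, patterns):
    """Apply comma-separated fnmatch patterns; a leading dash excludes."""
    selected = set()
    for pattern in (p.strip() for p in patterns.split(",")):
        if not pattern:
            continue
        exclude = pattern.startswith("-")
        if exclude:
            pattern = pattern[1:]
        matching = {m for m in available_modules if fnmatch(m, pattern)}
        if exclude:
            selected -= matching
        else:
            selected |= matching
    return sorted(selected)


modules = ["account", "hw_posbox", "hw_scale", "l10n_be", "sale"]
# Concatenation of repo, branch and config step patterns.
to_install = filter_modules(modules, ",".join(["*", "-hw_*"]))
```

Because the three fields are simply concatenated, a later "-pattern" coming from the config step can remove modules that an earlier field selected.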
Co-authored by @Xavier-Do
When a build is running, a cron, an evil query or something else can
start to fill and bloat the runbot ir_logging table.
With this commit, a log_counter field is added on the build, starting at
100. The SQL trigger decrements this counter after a line is inserted.
When the counter drops to 0, the last log line contains a message
stating that the limit has been reached. Further log lines are dropped
for this build step.
The counter is reset to a default of 100 before each step.
This value is configurable through an optional ir.config_parameter
runbot_maxlogs.
The runbot itself is still able to add log lines through the build _log
method.
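The counter behaviour can be simulated in plain Python as below. This is an in-memory interpretation of the description, not the actual SQL trigger, and the class and the message text are hypothetical.

```python
class BuildLogSketch:
    """In-memory stand-in for the log_counter behaviour."""

    def __init__(self, max_logs=100):
        self.max_logs = max_logs
        self.log_counter = max_logs
        self.lines = []

    def reset_counter(self):
        """Done before each step."""
        self.log_counter = self.max_logs

    def insert(self, message):
        """Mimic the trigger: decrement, then cap or drop the line."""
        if self.log_counter <= 0:
            return False  # limit already reached: the line is dropped
        self.log_counter -= 1
        if self.log_counter == 0:
            # The last accepted line states that the limit was reached.
            message = "Log limit reached (%s lines)" % self.max_logs
        self.lines.append(message)
        return True


log = BuildLogSketch(max_logs=3)
accepted = [log.insert("line %d" % i) for i in range(5)]
```

A runaway cron inside the tested instance can then add at most max_logs lines per step, however many inserts it attempts.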
Thanks @Xavier-Do for the smart idea.
When a build only creates sub-builds, the build_time is very small (a few seconds),
and this information is not relevant. This commit propagates end_time to parent_build
if parent_build is done or running.
When a build_error active field is changed, the onchange leads to a
traceback. Anyway, the onchange was not a good idea as it only reflects
UI changes.
With this commit, the write method is overridden to change the
child_ids active fields too. Also, the active_test context is used to
correctly compute the childs_ids and children_build_ids.
A test is also added for all that.
The new feature in odoo/enterprise#4879 needs the firebase-admin
package. As it cannot be added to the requirements.txt, the package is
added in the Dockerfile to be able to test it on the runbot instances.
- Add a keep running flag on the build to allow a build to stay in
running state until the flag is switched off (or the build is killed)
- Do not update configs and config_steps data
- Add a first/last_seen_build and first/last_seen_date on build.error
- Children error builds now include the parent builds too
- Use a notebook on build.error form view to display builds and linked
errors
- Update result when a build triggers a change from 'warn' to 'ko' too
- Add the sticky flag on the error logs stored sql view
When a build error appears with the same fingerprint as an already known
one which was supposedly fixed, the build is simply added to the known
build error.
In order to keep an eye on such reappearing bugs and keep the fixing
history separated, this commit simply creates a new build_error.
Old build errors with the same hash (or child_ids' hashes) appear in
a computed field error_history_ids.
When using the runbot frontend, copying a branch name is sometimes very
frustrating, as some mouse gymnastics are necessary.
With this commit, a copy to clipboard button is added near the branch
name on the frontend.
The new feature in odoo/enterprise#4892 needs the dbfread module.
As this lib is not required for other Odoo modules, it cannot be added
to the requirements.txt file.
In order to run the tests and use this new feature on the runbot, this
commit installs the dbfread lib in the Docker image.
When finding new commits, if there are more pending builds on a repo than
the running_max parameter, the exceeding builds are skipped.
As a result, when nightly builds are created on the runbot, it happens
that some of them are skipped.
Also, since e51412d, only refs newer than max_age are built; thus the
logic is not needed to prevent the rebuild of old refs in case of a fresh
runbot install.
The Many2many related on a Many2many does not map the ids as expected.
With this commit, the records are mapped in a compute.
It also fixes an uppercase letter that was used in the children_build_ids field name.
When killed, a build could have its build end changed (problematic when
killing a running build since build_time must represent the testing time)
-> if a build already has a build end, don't overwrite it.
The port also needs to be reset on wake_up since another build could have
recycled the current one, as port unicity is limited to builds not in
done state. This was working most of the time before since port unicity
was determined across runbots, thus we only had one chance in 17
of having a conflict on wake up (this changed with the previous commit).
With the increasing number of runbot servers (17), the total number of docker
instances can reach more than 3570 for running builds alone. Starting at 2000,
this covers the port 5432 used by postgres and makes the build run step fail.
This commit simply limits the port unicity constraint by host.
With this commit, a new model is introduced to facilitate the tracking
of the build errors.
It's based on an SQL view (thanks @Xavier-Do); that way, there is no new
table in the DB and this view is also useful from the PSQL CLI.
In the UI, searching for errors is easier than manipulating the ir_logging
view because the build information can be used in searches and filters.
Since google chrome 74, a random bug makes it crash at startup,
making the odoo tests crash.
With this commit, an odoo custom deb repo is used on nightly with a
known working chrome version.
When a build is woken up and something goes wrong during the
_run_odoo_run method, the "fetch and build" cron is broken and the
concerned runbot host stops working.
With this commit, the exception is caught and the build goes back to
the "done" state with a log.
With this commit, a new RunbotBuildError model is added in order to
classify and manage errors that appear during runbot builds. This is
a helper to find nondeterministic bugs in Odoo builds. Build logs can
be parsed on demand. During the parsing, the logs are cleaned with some
regexes stored on the RunbotErrorRegex model. A hash is computed on the
cleaned log; if a build error already exists with the same fingerprint,
the build is appended to the build error.
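The cleaning and fingerprinting step could look like the sketch below. The regexes, the "%" placeholder and the choice of sha256 are assumptions made for illustration; the real patterns live on RunbotErrorRegex records.

```python
import hashlib
import re

# Hypothetical cleaning patterns: memory addresses first, then any number.
CLEANING_REGEXES = [r"0x[0-9a-f]+", r"\d+"]


def clean_log(content, regexes=CLEANING_REGEXES):
    """Replace the volatile parts of a log line with a placeholder."""
    for pattern in regexes:
        content = re.sub(pattern, "%", content)
    return content


def fingerprint(content):
    """Hash of the cleaned log, used to match already known errors."""
    return hashlib.sha256(clean_log(content).encode("utf-8")).hexdigest()


first = fingerprint("ERROR 12345-13-0-all: object at 0x7f3a1c failed after 12 s")
second = fingerprint("ERROR 67890-13-0-all: object at 0x55d2b0 failed after 7 s")
```

Both lines clean to the same text, so they share a fingerprint and the second build would be attached to the existing build error.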
Errors can also be manually linked together with a parent/children
relation in case of a related error log. e.g. the error message is
different in two different branches but the bug is the same.
Also, a new build_url field is added to the runbot_build in order to
access the build web page from the backend.
Add a new model runbot.host to keep info and configuration about
hosts (worker servers), like number of workers, reserved or not,
ping times (last start loop, successful iteration, end loop, ...)
and also last errors, number of testing per host, psql connection
count, ...
A new monitoring frontend page is created, similar to glances
but with additional information like host states and
last_monitored builds (for nightly).
Later this model will be used for the runbot_build host instead of a char.
Hosts are automatically created when running _scheduler.
In some cases _force can return an empty recordset;
the corresponding branch being in no_build mode in another
repo may be an explanation here.
This commit avoids getting the fetch and build loop stuck in this case.
A hook can represent label changes, a closed pr, ...
We only want to fetch if a push or synchronize event is sent.
TODO We also may want to catch retarget later in order to update the branch.
The indirect state was written on the parent, leading to inconsistent info.
Indirect was using the last build regardless of build_type. Now, indirect
will only use normal builds to avoid a red chain after a sticky rebuild.
A prototype of this feature was added some time ago.
It was not really tested; this commit improves the parameter format
and makes the dependency closest_branch_id not required
since a repo/sha is all we need.
Docker can take some time to be considered as running after docker_run. This
issue can appear when we speed up the scheduling loop. To avoid that, we can add a
time condition to consider whether a docker is running, but we want to avoid waiting
too much since some jobs are fast.
This solution checks if a job is a docker run before waiting, and will also
update job_start after a checkout since this can take some time if
a git fetch is performed.
When a build is killed, result will be set to manually killed,
removing the 'error' or 'warn' result.
This commit removes this behaviour in order to keep error result
in this case.
If children are killed, they will all look the same in the parent view
making it difficult to find the failed one in staging branches.
This commit displays result rather than status in priority if build
is in failure.
If a user really wants to keep a database up for a long time, he has the possibility
to wake it up multiple times.
Using last job end as reference will allow to keep a database alive longer.
The main motivation of this commit is to be able to notify github status only
when all children are done.
Until today, children were only used for dev branches and nightly. The need to
use this system for staging requires enforcing the github_status behaviour.
Before this commit, a parent wouldn't send a github status since it only
creates children. And children are not aware of other children's states,
so sending a success may be wrong if another one failed.
Asking the parent to send the github_status looks like the easiest solution:
-If the top parent config has update_github_state False, we also want to take that into account.
-If a child wants to contact github for failfast, the parent will be in global_state error too and will send
the message immediately.
-If a child wants to contact github for success, we actually want to wait for the last child; the parent will
be in waiting global_state and notify nothing (or pending).
Only the last child will be able to notify success since global_state will be running or done at this step.
Orphan builds won't have any impact on the result in this scenario.
Some data is logged at each loop turn even if nothing interesting was done:
- ... builds [] where allocated to runbot
- reload nginx
That kind of info was interesting for debugging, but now this noise makes
logs heavier and more difficult to read.
Reloading nginx will be done only if the file changed, and this will avoid
a log at each loop turn.
We also display the difference between existing sources and sources that should
be there instead of complete lists.
Sources can be easily exported if needed since they are in the bare repository
most of the time. To avoid using too much space, this commit will garbage collect
all sources at the beginning of a long_running cron.
The only real side effect of this is that it will be impossible to wake up
a build that was force pushed since the sources cannot be fetched anymore.
We could imagine keeping the sources of recent builds, maybe for
48 hours, but keeping build specific data (logs, database)
is more interesting.
This commit adds a wake up button in place of the connect button when the build is
not running and may be woken up.
The connect button will also be visible immediately when local_state is 'running'
since we don't need to wait for sub builds to finish.
When a build is completely dead, with directory and db deleted, the
wake-up system fails.
With this commit, a wake-up is not allowed on such dead builds.
In order to store things other than logs that could be accessible by
end users, for example screenshots and screencasts, a "tests" directory
is allowed through the nginx template in the builds directories.
Also, the "with" context manager is used to open the nginx configuration
to ensure that the file descriptor is released during long running crons.
When an Odoo instance is run in a Docker container, it listens on all
interfaces by default and a bridge interface is used to communicate with
the outside world.
This bridge interface is necessary to allow the instance to send logging
messages into the runbot postgresql database.
Even though Docker isolates the container from the outside world, it's
still possible to reach the Odoo instance from the runbot host and worse,
from other Docker containers using the same bridge interface.
To confirm Murphy's law, it finally happened with this commit
odoo/enterprise@0ba0ef99de that scanned the ip range of the interface,
disturbing other builds.
One solution would be to create a bridge interface for each instance to
isolate each Docker but that would imply a big change to garbage collect
forgotten bridges ...
An 'icc' (Inter Container Communication) option exists for the Docker
daemon which defaults to True. Setting it to False was tested and works
but it appears that this option can be messed-up by the firewall on the
runbot host.
Finally, if applied, this commit will prevent the Odoo instance from
listening on the bridge interface by only listening to 127.0.0.1 during
tests. A local_only parameter is added on the _cmd method which is
true by default and means that the Odoo instance will only listen on
127.0.0.1. This parameter is set to False for the running step to allow
the running to be contacted through the Docker exposed ports.
The requirements path and python version were defined from
server in cmd. Since in coverage we add a 'python' before server,
it is difficult to define which element of the cmd is the server.
A solution here is simply to define requirements install and
python version when building cmd since we have access to all
build/source information. We also add the python part in every
case, and coverage params are now a _cmd python_params.
The _cmd method now returns a Command object instead of a
list, which behaves as a list for the cmd part but also contains
a pres and a posts list.
pres are requirement install, preparation, ...
cmd is the original cmd list; elements can be appended or added, which
allows keeping existing python jobs without too many changes.
posts are post cmd commands, like coverage result making.
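A simplified sketch of such a list-like object follows. It is not the runbot implementation: the build method and the " && " joining are assumptions made for illustration.

```python
import shlex


class Command:
    """List-like command that also carries pre and post commands."""

    def __init__(self, pres, cmd, posts):
        self.pres = list(pres)
        self.cmd = list(cmd)
        self.posts = list(posts)

    def __len__(self):
        return len(self.cmd)

    def __iter__(self):
        return iter(self.cmd)

    def __getitem__(self, index):
        return self.cmd[index]

    def append(self, item):
        self.cmd.append(item)

    def __add__(self, other):
        # Return a new Command so that pres and posts are not lost.
        return Command(self.pres, self.cmd + list(other), self.posts)

    def build(self):
        """Join pres, cmd and posts into a single shell command line."""
        parts = self.pres + [self.cmd] + self.posts
        return " && ".join(" ".join(shlex.quote(a) for a in part) for part in parts)


cmd = Command(
    pres=[["pip3", "install", "-r", "odoo/requirements.txt"]],
    cmd=["python3", "odoo/odoo-bin"],
    posts=[["python3", "-m", "coverage", "html"]],
)
cmd += ["-d", "12345-13-0-all"]
cmd.append("--stop-after-init")
line = cmd.build()
```

Existing python jobs that only append or add elements keep working, since those operations act on the cmd part alone.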
This commit also fixes an issue with create_job dependencies.
Multibuild can generate a lot of checkouts, especially for small
and fast jobs, which can overload runbot disks since we are trying not
to clean builds immediately (to ease bug fixing and allow wake up).
This commit proposes to store sources in a single place, so that
docker can add them as ro volumes in the build directory.
The checkout is also moved to the install jobs, so that
builds containing only create builds steps won't checkout
the sources.
This change implies using --addons-path correctly, since odoo
and enterprise addons won't be merged in the same repo anymore.
This will allow testing addons the way a dev would, with a closer
command line.
This implies changing the code structure a little: some changes
were made to remove not-so-useful fields on build, and some
hard-coded logic (manifest_names and server_names) is now
stored on repo instead.
This change implies that a build CANNOT write in its sources.
It shouldn't be the case, but it means that runbot cannot be
tested on runbot until data is written elsewhere than in static.
Other solutions are possible, like bind mounting the sources
in the build directory instead of adding ro volumes in docker.
Unfortunately, this requires giving sudo access to mount to the
runbot user and changing the docker config to allow mounts
in volumes, which is not the case by default. A plus of this
solution would be the ability to make an overlay mount.
In some conditions, Docker can take a little time to start a container.
In that case, if the runbot checks that the container is running before
it starts, the runbot considers the job as finished. It then tries to grep the
logs and, as expected, it does not find the "Modules loaded".
With this commit, we consider young builds (less than 15 sec) as
running, giving more time to Docker for starting it.
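The grace period can be expressed as a small predicate. The function and parameter names below are hypothetical; only the 15 second threshold comes from the description.

```python
import time

STARTUP_GRACE_SECONDS = 15  # builds younger than this are treated as running


def is_considered_running(container_running, job_start, now=None):
    """Treat a job as running if its container is up or it started recently."""
    now = time.time() if now is None else now
    if container_running:
        return True
    return (now - job_start) < STARTUP_GRACE_SECONDS
```

A job started 5 seconds ago whose container is not visible yet is still reported as running, so the log check is postponed to a later loop turn.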
When creating multi builds config steps, the force_build option is often
forgotten. In that case, the multi builds are detected as duplicate of
the first one.
With this commit, when asking for more than one multi build, the
force_build is changed to True.
When updating repos, the order is odoo/odoo, odoo-dev/odoo, odoo/enterprise, odoo-dev/enterprise.
The problem with this is that if a community hash and an enterprise hash arrive exactly
between the odoo/odoo and odoo/enterprise updates, enterprise could have a newer version
than community when creating builds.
By reversing this order, we are less likely to hit this corner case (as unlikely as
it may be).
get_ref_time was cast from float to datetime and thus,
milliseconds were lost. Storing it as a float makes the code
easier to read and avoids this rounding that was breaking
this feature.
When getting new refs, a lot of them are really old and the
find_new_commits is called for each one and thus browsing branches.
With this commit, refs older than configured max_age are ignored.
Co-authored-by: Xavier Dollé (xdo@odoo.com)
Accessing children can create rollbacks, especially for builds with
a lot of them, since other runbots will concurrently access and update
states. This commit tries to improve this by incrementing and
decrementing counters instead of counting all of them each time.
A slow query was detected on the runbot, causing a latency when loading
any page. After some investigations (thanks jle), we found that an index
on local_state could improve speed.
When updating github statuses, it happens that we face a "Bad gateway"
from github. In that case, the error is logged in the runbot logs and
that's it. As a consequence, when the runbot_merge is waiting for a status for
the staging branch and this kind of error occurs, the runbot_merge
times out and the users search for the reason in vain.
With this commit, the runbot tries to update the status at least twice.
If it fails, an INFO message is logged on the build itself.
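The retry logic could be sketched as follows. The send_status and log_on_build callables are hypothetical placeholders for the actual github request and the build _log method.

```python
import logging

_logger = logging.getLogger(__name__)


def update_status_with_retry(send_status, log_on_build, attempts=2):
    """Call send_status up to `attempts` times; log on the build if all fail."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            send_status()
            return True
        except Exception as error:  # e.g. an HTTP 502 "Bad gateway"
            last_error = error
            _logger.warning("status update attempt %s failed: %s", attempt, error)
    log_on_build("INFO", "Github status update failed: %s" % last_error)
    return False
```

Logging on the build itself makes the failure visible to the user looking at the build page, not only to someone reading the server logs.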
Nightly builds have a low priority but once they have a slot, they keep it.
When pushing a branch or asking robodoo to nicely merge a branch for the
third time at 22:00, there may be no slot left since all the nightly builds are created.
This commit will only assign scheduled builds if there is no other build to create and
will always keep a free slot for other builds.
When using a "python job", it's sometimes useful to write a file or
create a directory.
Instead of giving wide open access to the os module in the
"_run_python" context, this commit adds write_file and make_dirs
methods on the build, which are usable in the _run_python eval context.
When using the read_method in a "python job", it's sometimes needed to
read a file in binary mode.
With this commit, the mode can be specified when calling the method.
Since the use of the "python jobs", we spotted various needs that were
not fulfilled. In order to add flexibility to "python jobs", this commit
adds some useful objects in the _run_python eval context.
Also, the glob.glob function is given instead of the whole glob module
to avoid giving access to the os module via glob.os.
When a runbot instance is scheduling builds, the number of builds
depends on a global ir.config_parameter. Even if one of the runbot
instances is running on a more powerful system, its number of workers is
limited by this global parameter.
With this commit, this parameter still exists but can be overridden by
a specific ir.config_parameter.
For example, if the host 'runbot24.odoo.com' has more cpu power, the
number of workers for this host can be specified in the
ir.config_parameter named 'runbot24.odoo.com.workers'.
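The lookup order can be sketched with a plain dict standing in for ir.config_parameter. The global key name and the fallback value are assumptions; only the '<host>.workers' naming comes from the description.

```python
def get_workers(params, fqdn, global_key="runbot.runbot_workers", fallback=6):
    """Return the worker count for a host, preferring the host specific key."""
    # Parameters are stored as strings, hence the int() conversion.
    value = params.get("%s.workers" % fqdn) or params.get(global_key) or fallback
    return int(value)


params = {
    "runbot.runbot_workers": "6",
    "runbot24.odoo.com.workers": "12",
}
```

Hosts without a specific parameter keep using the global value, so nothing changes for existing installations.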
When installing software with apt-get in a Dockerfile, it should be
preceded with an apt-get update in the same RUN. Otherwise, the step may
fail if the needed package has been updated.
In a create config, a parent result is computed based on children
results
In some situations, it could be handy to ignore the result of some
sub-builds.
Example: the nightly tests are just the children of one nightly build
with a create config. The external tests are failing randomly and as a
consequence, the nightly result is always red. On the other hand,
keeping the test running, just to have logs is a good idea.
With this commit, a config_step of type create can be marked as
orphan_result, that way, the result is not taken into account in the
parent build result.
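A sketch of how the parent result could be derived, assuming a simple ok < warn < ko severity order. The function and the tuple layout are hypothetical.

```python
RESULT_ORDER = ["ok", "warn", "ko"]  # assumed severity order, worst last


def parent_result(own_result, children):
    """Worst result among the parent and its non-orphan children.

    children is an iterable of (result, orphan_result) pairs.
    """
    results = [own_result] + [result for result, orphan in children if not orphan]
    return max(results, key=RESULT_ORDER.index)
```

A failing external test marked as orphan still runs and keeps its logs, but no longer turns the nightly parent red.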
When the quickconnect button is used, the last running build is
searched in the last 10 builds. If no running build is found, the last
one is rebuilt, even if it's a nightly build.
With this commit, the quickconnect build is chosen only among the ones
with the same config.
With recursive state computation, scheduling is
more likely to hit transactional errors.
This is particularly a problem when external
operations are done during the transaction,
like running a docker.
Adding some commits will help to reduce
transactional errors, and ensure that the db
is consistent with docker states.
As the public user needs to be in the runbot user group to display the
frontend, the public user is allowed to kill or rebuild a build.
With this commit, only logged-in users have access to the Rebuild/Kill
menu entry.
When searching for new refs to create builds, the git for-each-ref
command is run and each ref is searched in the database, which is a
somewhat heavy operation.
With this commit, the timestamp of the last database update with the
refs is stored in a field on the repo. This timestamp is checked each
time a for-each-ref is needed, running the operation only when
necessary.
This commit aims to replace static jobs by fully configurable build config.
Each build has a config (custom or inherited from repo or branch).
Each config has a list of steps.
For now, a step can test/run odoo or create a new child build. A python job is
also available.
To mimic the previous behaviour of runbot, a default config is available with
three steps: an install of base, an install+test of all modules, and a last step
for run.
Multibuilds are replaced by a config containing creation steps.
The created builds are not displayed in main views, but are available
on parent build log page. The result of a parent takes the result of
all children into account.
This new mechanism will help to create custom behaviours for specific
use cases, and later help to parallelise work.
The cpu limit used in job_20 uses the runbot_timeout config_parameter
since b539112a7e. When measuring coverage, this parameter is multiplied
and leads to an error because the type of ir.config_parameter.get_param
method is str.
With this commit, this value is converted to an integer before usage
in job_20.
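The failure mode and the fix can be reproduced with a small stand-in for
get_param. The parameter key, the stored value and the 1.5 coverage factor
below are assumptions made for the sketch.

```python
def get_param(key, default=None):
    """Stand-in for ir.config_parameter.get_param: values are strings."""
    stored = {'runbot.runbot_timeout': '1800'}  # hypothetical key and value
    return stored.get(key, default)


def cpu_limit(coverage=False, factor=1.5):
    # Convert first: multiplying the raw string by a float raises TypeError.
    timeout = int(get_param('runbot.runbot_timeout', default='1800'))
    return int(timeout * factor) if coverage else timeout


print(cpu_limit())
print(cpu_limit(coverage=True))
```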
When running Odoo in the Docker container, the username used to connect
to the database is the username defined in the docker container
(currently odoo).
A problem may arise if the user of the runbot process is not the same.
An authentication error is then raised by postgres because of the
username mismatch.
With this commit, the '-r' parameter of Odoo is added to the command
with the username used by the runbot process.
While at it, unused imports are removed.
When a build exceeds the cpu limit, it is simply killed by the kernel.
As a safeguard the "Initiating shutdown." sentence should be searched
in the log file, and the build marked as "ko" if not found.
Unfortunately, there is no period (.) at the end of the sentence in the
Odoo logs (see: https://github.com/odoo/odoo/blob/12.0/odoo/service/server.py#L444)
Thus, this condition is never fulfilled.
On top of that, this was masked by the first part of the condition,
checking that the 'test/common.py' has no "post_install" string.
The "test" directory does not exist in Odoo (but "tests" does), so
the condition was always falsy.
Finally, a build can be marked as "ok" when it is killed and no errors
are found until the kill.
With this commit:
* The legacy grep for post_install is removed as it now exists in
all Odoo supported versions.
* The period typo is fixed.
* A log is inserted when the final sentence is not found.
* The cpu_limit is set to the same value as the runbot_timeout parameter
for better consistency.
* The time exceeded log message is now logged in the build instead
of the runbot log.
Co-authored-by: @Xavier-Do
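The safeguard described in this message boils down to a substring check on
the build log. A minimal sketch, with a hypothetical function name and
made-up log content:

```python
# Searched without a trailing period, since the Odoo log has none.
SHUTDOWN_SENTENCE = 'Initiating shutdown'


def result_after_tests(log_text, error_found=False):
    """Return 'ko' when an error was seen or the log ends abruptly."""
    if error_found:
        return 'ko'
    if SHUTDOWN_SENTENCE not in log_text:
        # The process was probably killed (e.g. cpu limit exceeded).
        return 'ko'
    return 'ok'


print(result_after_tests('tests done\nInitiating shutdown\n'))
print(result_after_tests('tests still running'))
```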
At the end of the _update_git method, the "git fetch" command is run.
That makes it difficult to override to change its behavior (for example
to avoid fetching pull requests).
With this commit, the command is separated in a new small method that
can be easily overridden.
When searching for new builds by parsing git refs, the new branches are
created as well as the pending builds in the same _find_new_commits
method.
With this commit, this behavior is split into two methods. That way,
it's now possible to create missing branches without creating new
builds. The closest_branch detection is enhanced because all the new
branches are created before the builds (separated loops).
The find_new_commits method uses an optimized way to search for
existing builds. Before this commit, a build search was performed for
each git reference, potentially a huge number.
With this commit, a raw sql query is performed to create a set of tuples
(branch_id, sha) which is enough to decide if a build already exists.
A test was added to verify that new refs lead to pending builds.
Also, a performance test was added but is skipped by default because it
needs an existing repo with 20000 branches in the database so it will
not work with an empty database. This test showed a gain of performance
from 8sec to 2sec when creating builds from new commits.
co-authored by @Xavier-Do
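The set-of-tuples lookup can be illustrated with an in-memory SQLite
database. The table and column names ('build', 'branch_id', 'sha') are
hypothetical and chosen for the sketch only; the runbot itself runs on
PostgreSQL.

```python
import sqlite3


def existing_builds(cr, branch_ids):
    """Return a set of (branch_id, sha) tuples using one raw query."""
    if not branch_ids:
        return set()
    placeholders = ', '.join('?' for _ in branch_ids)
    cr.execute(
        'SELECT branch_id, sha FROM build WHERE branch_id IN (%s)' % placeholders,
        list(branch_ids),
    )
    return set(cr.fetchall())


def refs_without_build(refs, known):
    """Keep only the (branch_id, sha) refs that have no build yet."""
    return [ref for ref in refs if ref not in known]


conn = sqlite3.connect(':memory:')
cr = conn.cursor()
cr.execute('CREATE TABLE build (branch_id INTEGER, sha TEXT)')
cr.executemany('INSERT INTO build VALUES (?, ?)', [(1, 'aaa'), (2, 'bbb')])

known = existing_builds(cr, [1, 2, 3])
new_refs = refs_without_build([(1, 'aaa'), (1, 'ccc'), (3, 'ddd')], known)
print(new_refs)
```

Membership tests on the set are constant time, so the per-ref search is
replaced by a single query plus cheap lookups.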
Before this commit, dependencies (i.e. community commit to use when testing enterprise)
were computed at checkout, when the build was going from pending to testing state and
were not stored.
Since the duplicate detection was done at create, the get_closest_branch_name was called
in a loop for each possible duplicate candidate, then a last time at checkout. The main idea of this
PR is to store the build dependencies on the build at create, making the duplicate detection
faster (especially when the build name is matching many indirect builds).
The side effect of this change is that the build dependencies won't be affected if a new
commit is pushed between the build creation and the checkout. The build is fully
determined at creation. get_closest_branch is only called once per build.
The duplicate detection will also be more precise since we are matching on the commits groups
that were used to run the build, and not only the branch name.
Some work has also been done to rework the closest branch detection in order to manage new corner
cases. Hopefully, everything should work as before (or in a better way).
In the near future, it will also be possible to use this information to make an "exact rebuild"
or to find corresponding community build.
Pr: #117
When some special builds are scheduled during the night, free slots on
runbot instances are used. Depending on the number of scheduled builds,
all the slots can be used. That prevents people from using the runbot for
normal builds during this time.
To mitigate the problem, the scheduled builds were postponed to the
middle of the night ... the CET night. It means that it could be morning
in India.
With this commit, a build priority is given to normal builds. On the
other hand, scheduled builds are pushed at the end of the queue.
So even if there are plenty of builds during the Belgian night, if
someone pushes a commit in between, it will be built in priority before
the scheduled pending builds.
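One way to picture the resulting queue order, assuming each pending build
carries a boolean flag telling whether it was scheduled. The 'scheduled'
key and the use of the id as a tie-breaker are assumptions for this sketch.

```python
def scheduling_order(pending_builds):
    """Sort pending builds: normal builds first, scheduled ones last.

    False sorts before True, so the flag can be used directly as the
    first sort key; the id keeps the creation order inside each group.
    """
    return sorted(pending_builds, key=lambda b: (b['scheduled'], b['id']))


pending = [
    {'id': 1, 'scheduled': True},
    {'id': 2, 'scheduled': True},
    {'id': 3, 'scheduled': False},  # commit pushed during the night
]
print([b['id'] for b in scheduling_order(pending)])
```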
When using a local git repo, the git name does not contain a colon, making
the frontend crash.
With this commit, a non-stored computed field 'short_name' is added to
compute a shorter version of the name.
When the docker_run function is called, the odoo command is decorated
with a pip command to install required packages.
This pollutes the docker_run function if a runbot job_ method wants to
use docker for something other than starting an odoo instance (like
pg_dump, for example).
With this commit, command modification is made in an optional helper
function named build_odoo_cmd.
The docker_run function now needs the command to run as a string instead
of a list of odoo cmd and its parameters.
Asking for the kill of a build which is the duplicate of another fails
because the state of this build is "duplicate", so the _ask_kill method
has no effect on it.
With this commit, the effect of _ask_kill is applied on the duplicate in
the above mentioned case.
When searching the builds for the frontend the resulting query can last
a very long time (up to 7sec).
With this commit, the search result is strictly limited to 100 builds,
the limit query parameter is removed and the search string length is limited to
60 chars.
The guess_result method is now optimized to guess results for testing
builds only. The others have the same value as the final result.
A few tests were added for this method.
Thanks @KangOl for the optimization code.
When github reaches the hook controller, the repo hook_time field is
updated. That way, a git fetch is done only when the hook_time is newer
than the last fetch. If the hook_time is updated during the long running
cron that runs the _cron_fetch_and_schedule method, the hook_time is cached
and only the old hook time is seen until the cron's end. The cursor
commit is not enough. As a result, the new builds are scheduled in the
next cron run.
With this commit, the cache is invalidated after the commit, that way,
the hook_time field contains the correct value.
When a PR is created in odoo/enterprise but without a corresponding
PR in odoo/community BUT a corresponding branch in odoo-dev/community,
the closest_branch detection fails. Moreover, the duplicate detection
fails too.
As a consequence, the PR build will probably fail because it will be
built with the default target branch that may not be suited for it.
If the branch built succeeds, it leads to inconsistent results.
With this commit, a new case is added on the _get_closest_branch_name
to handle this case.
The server_match field also reflects the difference as this case will
be marked as 'no PR'.
When a PR also exists in odoo/community, the server_match field will be
'exact PR'. This change should not imply migration.
This commit also adds a bunch of tests to test the closest branch name
detection and the duplicates.
Co-authored by @Xavier-Do
When a runbot build ends without error but with one or more warnings,
no status is sent to github. As a result, the PR stays in pending
state.
With this commit, the github status is set to failure when a build ends
in a "warn" result.
At checkout time when a build has no server (e.g. enterprise),
the dependency repo that contains the server needs to be extracted too.
It happens that this dependency repo is not up to date.
With this commit, the dependency repo is updated before its extraction.
When searching for duplicate builds, a git ls-remote is used to verify
that the branch still exists. This command is time consuming (up to 2
seconds).
If the number of builds is significant, it can last a very long time.
When a user pushes one or more new branches without new commits, the
number of duplicate builds found may be very large (more than 92).
This loop blocks the cron worker in charge of creating new builds.
This quick fix will limit the number of duplicates to 1 but if the
closest name is not the same, it will not be considered as a duplicate.
When a runbot executes the cron_fetch_and_build method, the repo is
updated only if the webhook time is newer than the last fetch
time.
As the cron is now split into long running crons, the hook_time field is
cached. The runbot instance that sees a new build pending uses this
cached value to estimate if the repo update is needed.
With this commit, the repo update is done right before exporting the
repo and only if the commit hash is not found.
As a bonus, the environment is reset in the long running cron of the
runbot builders to update the cached values.
The Runbot Cron is executed on each runbot instance. When the number of
instances scales, the time needed for an instance to obtain the cron
increases.
With this commit, the original runbot_cron is removed. Instead, a cron
has to be created to run the _cron_fetch_and_schedule method.
This method will fetch the repo and create pending builds. This cron is
intended to run only on one runbot instance. This method needs a host
parameter to specify which runbot instance will be in charge of this
task.
On the other hand, a dedicated cron has to be manually created for each
runbot instance that will have the build task.
Those crons only have to call the _cron_fetch_and_build method with the
runbot hostname as a parameter. This method will then self
assign pending builds if there are slots available.
All available build slots are reserved in a single LOCKED SQL query.
Both methods are intended to last a large amount of time, just a few
minutes below the cron timeout to maximize the cron productivity.
The timeout is randomized to avoid deadlocks if the runbot instances are
started at the same time.
So the --limit-time-real parameter has to be set to a minimum of 180
sec (600 or 1200 are probably better targets).
When displaying build logs, all the messages from ir_logging about this
particular build are fetched from the database.
From time to time, it happens that the number of logged messages is
really huge. Those messages could also contain multiple lines,
multiplying the number of rows to generate in the html page.
When this happens, the process that generates the template lasts a long
time and ends with a MemoryError. If the end user, bored, hits the
refresh button multiple times, all the workers will be busy building
this template. In the end, all users get a Bad Gateway from nginx.
With this commit, the number of messages that will be taken into account
will be limited to 10000.
When a user checks the runbot frontend, the guess_result field is used
to change the color of the build state. But github is not notified of
this guessed result.
As a consequence, the runbot_merge is not aware the build is failed and
will continue to wait.
With this commit, as soon as the guess_result detects a failure, the
status is sent to github, that way, runbot_merge will stop waiting
sooner.
When running the _job_10 method, a database is created with base module
alone. Tests are enabled during this job. Those tests are run again with
the _job_20 method. Moreover, even if the tests fail during _job_10,
they are not taken into account for the final result. The _job_10 method
duration is approximately 4 min.
With this commit, the tests are not enabled during _job_10.
A new module in Odoo needs pyCrypto but this module alone is too limited
to justify an addition in the requirement file.
PR: https://github.com/odoo/odoo/pull/28816
Adapt test for eb7f5de . The mentioned commit fixes an issue that occurs
when updating github status. A test already exists but was assuming that
the build is in 'done' state when reaching the job_29.
With this commit, the build used in test is set to 'testing' state like
in the real cases.
Also, a new test is added to test the job_00_init which also sends
github status but with a minimal build.
Finally, the runtime that should appear in status description was
forgotten in the previous commit. Now the runtime is always sent with
the github status.
Since d7c7e54 the github status is sent in job_29. At this moment, the
state of the build is still 'testing'. For that reason, the github
status is set to 'pending'.
With this commit, once a result is available, the github status is
updated with the right value even if the build state is 'testing'.
When a build reaches the job30_run method, results from the previous
testing methods are computed.
With the previous commit 8c73e6a901 this
job can now be skipped. In that case, the results are not set.
With this commit, the results are computed in a separate method.
Since 8c73e6a it's possible to skip jobs from a build by using the
job_type field on the branch. If a branch job_type is set to 'none', the
builds are created but they stay in 'pending' state.
With this commit, the build is not even created if the 'job_type' is
'none'.
Since the runbot_merge module, some branches do not need to be built.
For example the tmp.* branches.
Some other branches do need to be tested but it could be useless to
keep them running. For example the staging branches.
Finally, some builds are generated by server actions during the night.
Those builds do not need to be kept running despite the branch configuration.
For example, the master branch can be configured to create builds with
testing and running but nightly multiple builds can be generated with
testing only.
For that purpose, this commit adds a job_type selection field on the
branch. That way, a branch can be configured by selecting the type of
jobs wanted.
The same kind of job_type was also added on the build that uses the
branch's value if nothing is specified at build creation.
A decorator is used on the job_ methods to specify their job types.
For example, a job method decorated by 'testing' will run if the
branch/build job_type is 'testing' or 'all'.
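The decorator mechanism can be sketched like this. The decorator name, the
attribute it sets, the job method names and the 'running' job type are
illustrative assumptions; only the 'testing', 'all' and 'none' values come
from these messages.

```python
def job(job_type):
    """Tag a job method with the job type it belongs to."""
    def decorator(method):
        method._job_type = job_type  # hypothetical attribute name
        return method
    return decorator


def should_run(method, build_job_type):
    """Decide whether `method` runs for a given branch/build job_type."""
    if build_job_type == 'none':
        return False
    return build_job_type in (method._job_type, 'all')


@job('testing')
def job_test_all():
    return 'tests'


@job('running')
def job_run():
    return 'run'


print(should_run(job_test_all, 'testing'), should_run(job_run, 'testing'))
```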
When a build is created, the --log-db command line argument is built
using the same db and credentials as the ones used by the runbot.
With this commit, this argument is built based on a postgres connection
URI given as ir.config_parameter in the settings.
A dedicated role must be created beforehand on the runbot postgresql
server, according to the given URI.
Also, care should be taken to give minimal privileges to this user, only
granting "update" on the sequence ir_logging_id_seq and
"insert,select,update" on the table ir_logging.
To test the last resort branch matching when nothing in common can be
found, two PRs were used, leading to the PR's target branch as the default
one.
Also, the test was never run because of a bad indentation.
With this commit, the indentation is fixed and the test uses regular
branches.
In frontend.py, the whole odoo.http module is imported but request is
imported separately. This makes it difficult to mock the different things
coming from http in tests.
With this commit, only the needed parts are imported from odoo.http.
When a build is running, the SMTP host is the localhost.
Since Docker builds, the localhost is the container, which does not
listen on SMTP port 25. Mails are lost in limbo.
With this commit, the default gateway of the Docker network is used as
smtp host for the builds. It's the responsibility of the runbot host to
catch smtp traffic from the container.
This bridge interface exists by default on a system where Docker is
running. However, Docker is affected by this issue:
https://github.com/moby/moby/issues/26799
The first time the Docker daemon is installed, the Gateway is not
defined on the bridge interface. When the Docker daemon is restarted,
the gateway is correctly defined. Pay attention that restarting the
Docker daemon will kill all the running/testing builds.
When starting a container, the .odoorc|.openerp_serverrc file is not
used by the build.
With this commit, if a .odoorc or .openerp_serverrc file is found in the
home directory of the runbot user, this file is mounted read-only in the
container, allowing some customization.
When building Odoo, the instance is started on the same host as the
runbot. It means that all the required python packages have to be
installed on each runbot hosts with the same versions. Also there is no
real separation between builds. Finally, from a security point of view,
arbitrary code could be executed on the runbot host.
With this commit, the runbot uses Docker containers to build Odoo.
During the tests, Odoo http ports are not exposed to the outside,
meaning that nobody could interact with that instance.
The Docker image used for containers is valid for Odoo branches 10.0,
11.0, 12.0 and master.
When building, right before starting the Odoo tests, the tested branch's
requirements.txt is now taken into account to adapt the container.
On a runbot host, the "docker ps -a" command can be used to have the
list of the current builds. The containers are named using the build
dest field and the current running job. For example:
123456-12-0-123456_job_30_run
Prerequisites:
Docker has to be installed on the runbot hosts and the user that runs
the runbot should be able to use Docker. Typically, the runbot user has
to be added to the docker unix group.
On the first build, the Docker image will be built from scratch. It
can last several minutes locking the runbot cron during this time.
It means that on a multi-runbot configuration, this process will be
repeated for each runbot and during this time there will be no builds.
To avoid such a situation, the Docker image can be built from the
command line. The container.py file can be started like this:
python3 container.py build /tmp/build_dir
The /tmp/build_dir directory will be created to store the Dockerfile.
When the process is done, the "docker images" command should show an
image tagged runbot_tests in the odoo repository. At that time, the
runbot instance can be started, it will use this image for the builds.
Api change:
The 'job_*' methods signature has changed, the lock_path is not needed anymore.
Docker image information:
Currently, the Docker image is built based on Ubuntu bionic to
benefit from the python 3.6 version.
Chrome and phantomjs are both installed.
The latest wkhtmltopdf (0.12.5) is installed as recommended on our wiki:
https://github.com/odoo/odoo/wiki/Wkhtmltopdf
When a PR is a duplicate of a branch, only the branch CLA status is
updated. The same issue for build status was fixed in commit 4f1a55da9.
With this commit, there is a new method that can be used in runbot_cla.
This method updates the status of a given commit in each repo.
When a commit is built several times, each time the status is updated, a
request is sent to github for each build.
As a side effect, if the first build is a failure and the last one a
success, github is wrongly updated to failure.
This bug is particularly annoying on PRs in conjunction with the
runbot_merge, preventing the PR from being merged even after several
rebuilds.
With this commit, only the latest build per repository is used to update
the github status. The amount of github status updates is also reduced.
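The selection rule can be sketched with plain dicts standing in for build
records. It assumes that a higher id means a more recent build; the key
names are hypothetical.

```python
def latest_build_per_repo(builds):
    """Return a dict mapping each repo to its most recent build."""
    latest = {}
    for build in builds:
        current = latest.get(build['repo'])
        # Assumption: ids grow with time, so the highest id is the latest.
        if current is None or build['id'] > current['id']:
            latest[build['repo']] = build
    return latest


builds = [
    {'id': 10, 'repo': 'community', 'result': 'ko'},
    {'id': 12, 'repo': 'community', 'result': 'ok'},
    {'id': 11, 'repo': 'enterprise', 'result': 'ok'},
]
statuses = {repo: b['result'] for repo, b in latest_build_per_repo(builds).items()}
print(statuses)
```

Only one status per repository is then sent, and the earlier failure no
longer overrides the later success.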
In the case of PR, the name contains 'refs/pull/3175', the branch_name
contains '3175' and because of the previous fix e095170f8c, the
pull_head_name contains something like 'blah:12.0-something'.
In that case, the _get_closest_branch_name reaches the fallback.
With this commit, the target_branch_name is stored in a new field and is
used as the fallback.
When searching for a matching PR in a target repo, the match was made on
the 'base ref' which is the base branch that the PR targets.
This means that the second case of the _get_closest_branch method was
never reached.
Example:
An enterprise PR is created that targets Odoo branch 12.0 but with a
matching community PR with a branch with the same name.
The corresponding PR is never found by the runbot because the first
rule matches: '12.0' is the base ref of the PR.
This could have been fixed by using the branch pull_head_name field to
build the domain but that leads to a problem with the second rule:
If a PR branch is named 'patch-1', the domain will match each PR with
the same pull_head_name.
With this commit, the pull_head_name field will store the pull head
label. It's not a problem with older PR as a newly created PR in
enterprise will not accidentally match with older ones.
Another issue appeared with github branch naming like 'patch-'.
If someone creates a PR with a branch auto-named 'patch-1',
targeting the community repo and later creates an unrelated PR,
targeting another repo depending on the community, they will be
matched.
To avoid that, the pull head names that end with 'patch-n' are not
stored. Those pull requests will be built against the target branch
head.
The current behavior, before this commit, is a blocking point for the
runbot_merge which is based on PRs only. It means that when a community
PR is wrongly matched with an enterprise PR, runbot_merge will not
merge the PRs.
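The 'patch-n' rule can be expressed with a small regular expression. This
is a sketch: the function name is hypothetical and the exact pattern used
by the runbot may differ.

```python
import re

# Matches labels ending with github's auto-generated 'patch-<number>'.
PATCH_BRANCH = re.compile(r'patch-\d+$')


def pull_head_name_to_store(label):
    """Return the pull head label, or None when it should not be stored."""
    if PATCH_BRANCH.search(label):
        return None
    return label


print(pull_head_name_to_store('someone:patch-1'))
print(pull_head_name_to_store('blah:12.0-something'))
```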
When examining a particular build with the build view, one can be
frustrated being forced to navigate the frontend page to ask for a
rebuild.
With this commit, the rebuild menu entry is also visible on the build
page when the build is the last one of the branch.
Also the build host is now visible on the build page for the same
usability reason.
This commit makes it possible to prioritize a branch when scheduling builds.
Its main purpose is for the runbot_merge module. It avoids having
staging branches as sticky and polluting the main repo view.
Also, that branch can benefit from the autokill feature when a newer build
is found, freeing resources for other builds.
Closes #43
The frontend view shows only the last four builds per branch, which is
too few to explore builds and search for failed builds.
With this commit, a new frontend template displays a paged list of
builds for a specified branch.
Closes #39
In the near future, Odoo will use Chrome Headless instead of phantomjs.
Chrome needs a port to listen to and it was decided that it will be
http_port + 2.
With this commit, we ensure that this port is not used by another build.
Closes #30
Avoids overly generic vhost declaration, and makes config easier to
understand.
We only want to allow such hosts:
<build_dest>.runbotX.odoo.com
<build_dest>-<db_name_extension>.runbotX.odoo.com
Where <db_name_extension> is usually "all" or "base".
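The two allowed host shapes can be checked with a single pattern, shown
here in Python for illustration only (the actual vhost declaration lives in
the web server config). It assumes that the X in runbotX is a number and
that build_dest only contains lowercase letters, digits and hyphens, as in
'123456-12-0-123456'.

```python
import re

# One DNS label (build_dest, optionally followed by -<db_name_extension>),
# then the runbot host. Dots are not allowed inside the label.
ALLOWED_HOST = re.compile(r'[0-9a-z][0-9a-z-]*\.runbot\d+\.odoo\.com')


def is_allowed_host(host):
    """Return True when `host` has one of the two accepted shapes."""
    return ALLOWED_HOST.fullmatch(host) is not None


print(is_allowed_host('123456-12-0-123456.runbot24.odoo.com'))
print(is_allowed_host('123456-12-0-123456-all.runbot24.odoo.com'))
print(is_allowed_host('anything.else.runbot24.odoo.com'))
```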
The unicode icon added in the build subject is not clear for the users.
In that state, it's not easy to add a title on the icon or the subject.
With this commit, a build type field is added to differentiate the
builds and add the appropriate icon and title.
Closes: #24
When a build with coverage is killed during the tests, the coverage
result is set to 100%.
With this commit, coverage result is verbosely skipped if the coverage
file does not exist.
Also, the coverage module is no longer used to retrieve the coverage
value as it was already calculated. It's faster to grep the html file
than to recompute the value.
As the coverage builds take longer than normal builds, the
timeout is increased for this kind of build (as it was already done for the
CPU limit).
Finally, the omitted patterns were wrong and are now fixed with this
commit.
Closes #25
When the Odoo instances are spawned for tests, they have a time limit
set on their CPU usage. This limit is hard-coded in the _spawn method
calls.
As the number of tests is increasing, their duration increases too.
As this limit is inherited by subprocesses, if a phantomjs test
lasts too long, the test is killed alone.
Finally, when the coverage is enabled, the tests duration is
multiplied by approximately 1.5.
With this commit, the cpu_limit of the two main test jobs is
increased. When the coverage is used, the cpu_limit is increased further.
Closes: #23
When a new commit is found and a new build is created, if a user
pushes more commits in the same branch a few minutes later, it causes
more builds in testing phase, resulting in a CPU waste.
With this commit, previous builds in testing phase in non sticky
branches are killed when a new build is created.
Closes: #20
When someone tries to log in to an old runbot build that is not running
anymore, they land on the runbot instance that was running the build.
Also, all the running builds are allowed on all runbot instances,
leading to the same behavior.
With this commit, only the builds that are running on the runbot
instance can be reached, others are defaulted to a 404.
closes #21
Currently, the "Other builds" button can increase the page size and
increase the page loading time.
With this commit, the number of other builds visible in the button is limited to 100.
When the coverage is activated on a branch, the coverage result is not
stored.
With this commit, the coverage result will be stored on a build.
The last result will be shown on the frontend for sticky branches.
Also, an extra_parameter field is added on the build model.
When a reverse dependency is in testing, pending or deathrow state, there is no
icon in the depending build box.
This commit adds icons for testing, pending and deathrow states.
Also, icons are now displayed in the repository name order.
closes: #19
When a build fails in a repo that depends on another repo, it's difficult
to figure out from which commit it comes in the depending repo.
If this commit is applied, when a new commit is found in a repo's sticky
branch, the latest build from the same branch name, in the depending
repo will be forced to rebuild.
The refresh method is deprecated and invalidate_cache should be used
instead. Also, since the new API, the cache is automatically
invalidated, hence this removal.
Currently, when a build is a duplicate of another build, its github status is not updated.
e.g. a pull request may not have a github status but its corresponding
branch has a github status.
With this commit, all builds will have their github status updated based
on the corresponding build status.
Thanks for your help @xmo-odoo
closes: #3
When a build was a duplicate, the link to the running build instance was
leading to the duplicate instance with the db filter of the current
build. Because of that, the user was then redirected to the database selector.
With this commit, the link button gives the proper database argument.
Thanks @RomainLibert for reporting this issue.
When code coverage was processed the 'coverage' utility was called the
same way regardless of the Odoo version. That was the cause of two
problems:
1) In some OS packages, the 'coverage' executable was renamed to
'python-coverage' and 'python3-coverage'
2) Since version 11.0, Odoo needs python3
With this commit, the coverage module is called from python '-m'
argument and the python version is chosen from the Odoo executable
shebang.
closes: #12
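Picking the interpreter from the shebang could look like the sketch below.
The helper names are hypothetical, only the first line of the executable is
inspected, and unusual shebangs (for example 'env -S ...') are not handled.

```python
def python_from_shebang(first_line, default='python'):
    """Extract the interpreter name from a shebang line."""
    if not first_line.startswith('#!'):
        return default
    parts = first_line[2:].split()
    if not parts:
        return default
    # '#!/usr/bin/env python3' names the interpreter as env's argument.
    if parts[0].endswith('/env') and len(parts) > 1:
        interpreter = parts[1]
    else:
        interpreter = parts[0]
    return interpreter.rsplit('/', 1)[-1]


def coverage_cmd(first_line):
    """Build the start of a command calling coverage through 'python -m'."""
    return [python_from_shebang(first_line), '-m', 'coverage']


print(coverage_cmd('#!/usr/bin/env python3'))
print(coverage_cmd('#!/usr/bin/python'))
```

Calling the module with '-m' sidesteps the renamed 'python-coverage' and
'python3-coverage' executables mentioned above.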
When a server is built based on dependency_ids, only the branch ref was
logged. In this situation, it's difficult to reproduce a build locally
in the exact same conditions.
With this commit, the latest commit hash and message of this
branch is also logged.
When searching for the closest branch name in a target repo, if the PR
head names are equal, a branch method is called on a dictionary, causing
a traceback.
With this commit, the method is called on a branch object instance.
closes #11
Before this commit, when a clean instance of the runbot is installed,
there was a KeyError on the frontend because there were no builds for
the sticky branches.
When installing a clean runbot instance a bad query is made because
there is not any build yet.
With this commit, the query is not made when there are no build
directories found on the filesystem.
Some HTML attributes were missing, and are required to ensure good
compatibility with screen readers.
The following attributes are included:
- `aria-expanded`: this should be present in all dropdown menus
- `aria-label`: this should be present in all links and buttons which
don't show text, but icons or images.