When runbot is installed to test custom addons, we don't
want to build every odoo commit, but we need to update branches
in order to make _get_closest_branch work.
This commit will allow a user to set odoo in poll mode
with no_build set to True, to create branches only.
(And a small fix for additionnal_env)
Since 81fefee, the container.py CLI does not work as expected.
With this commit, the CLI is working, a new arg was added to test
flamegraphs and the dump is adapted to mimic the runbot.
Also, a small issue is fixed in the zip file creation. Before the zip
creation, the directory is changed. If the directory change fails, the
zip is created from the current directory, which is removed by zip at the
end of the process. That could lead to the deletion of the build dir.
A typical use case when an error is detected is to disable
this test by adding a negated test-tags on config
step 'all' and 'split_all'. This commit will help
to do that by adding a test_tags management on build error.
The user defines a test_tag that will only execute the failing test.
If a config step has the flag enable_auto_tags, the test tag will
be negated and added to the config test-tags.
This commit also adds some information for monitoring.
Current dump version doesn't include filestore. This new
version adds the filestore trying to match odoo backup format
in order to ease restore.
The manifest.json file is not created since it isn't useful,
but an info.json is added, with build info.
Creating multi builds configs can be tedious. One must create 2 build
configs and 2 build config steps in the right order.
With this commit, a simple wizard is added that creates those 4
configurations by simply filling 4 fields.
Also, a new field, group, is added in order to be able to gather
configs and config steps into groups. The group is a Many2one on a
config.
While at it, the runbot menu has been a bit rearranged with everything
about configs in a parent menu named Configs.
The config and config step tree views have been enhanced to show the
config group, and some filters were added to the search views.
With this commit, a new boolean field "flamegraph" is added on the
build_config to allow a flamegraph generation.
In order to be able to generate a flamegraph during a runbot build, the
flamegraph package is added to the Docker image as well as the
flamegraph.pl tool.
Dump a db at the end of a build, using a new 'finals' cmd part
added in order to execute dump even if build fails.
Add a link in last step log to download dump.
In different situations, a docker container may stay alive even if the
build global_state is done. This can lead to a build failure when a
build wants to go in running state and tries to expose the same ports as
the left over build.
This reverts commit 1207daded1.
The review was too quick: setting a default value is a good idea, but since the field is now a float,
the default value should be time.time.
Currently, some Odoo modules are blacklisted from a set hardcoded in the
runbot code. In some cases, one needs to blacklist custom modules,
preferably in a config_step.
With this commit, the repo.modules, branch.modules and
config_step.install_modules fields are concatenated in a comma separated
list of fnmatch patterns. The patterns can be prefixed with a dash to
exclude the matching module(s).
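
The pattern handling described above can be sketched as follows. This is only
an illustration: the filter_modules helper and the module names are
hypothetical, and the assumption that patterns are applied from left to right
is ours, not taken from the runbot code.

```python
from fnmatch import fnmatch


def filter_modules(available, pattern_list):
    """Select modules from `available` using comma separated fnmatch patterns.

    A pattern prefixed with a dash removes the matching modules from the
    selection; any other pattern adds the matching modules to it.
    Patterns are applied from left to right (an assumption of this sketch).
    """
    selected = []
    for pattern in (p.strip() for p in pattern_list.split(',')):
        if not pattern:
            continue
        if pattern.startswith('-'):
            # Exclusion: drop every already selected module matching the pattern.
            selected = [m for m in selected if not fnmatch(m, pattern[1:])]
        else:
            # Inclusion: add matching modules, keeping the original order.
            selected += [m for m in available if fnmatch(m, pattern) and m not in selected]
    return selected


# Hypothetical module list, for illustration only.
modules = ['base', 'sale', 'sale_stock', 'l10n_be', 'l10n_fr', 'hw_drivers']
print(filter_modules(modules, '*,-l10n_*,-hw_*'))
```

With this sketch, '*,-l10n_*,-hw_*' keeps everything except the localization
and hardware modules.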
Co-authored by @Xavier-Do
When a build is running, a cron, an evil query or something else can
start to fill and bloat the runbot ir_logging table.
With this commit, a log_counter field is added on the build, starting at
100. The SQL trigger decrements this counter after a line is inserted.
When the counter drops to 0, the last log line contains a message
stating that the limit has been reached. Further log lines are dropped
for this build step.
The counter is reset to a default of 100 before each step.
This value is configurable through an optional ir.config_parameter
runbot_maxlogs.
The runbot itself is still able to add log lines through the build _log
method.
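
The counter behaviour can be simulated in plain Python. The real mechanism is
an SQL trigger; the BuildLogs class below is a hypothetical stand-in, and
replacing the text of the last accepted line with the limit message is an
assumption of this sketch.

```python
class BuildLogs:
    """In-memory stand-in for the log table of a single build."""

    def __init__(self, max_logs=100):
        self.max_logs = max_logs      # plays the role of runbot_maxlogs
        self.log_counter = max_logs
        self.lines = []

    def reset_counter(self):
        """Called before each step: the build gets a fresh budget."""
        self.log_counter = self.max_logs

    def insert(self, message):
        """Mimic the trigger: return False when the line is dropped."""
        if self.log_counter <= 0:
            return False
        self.log_counter -= 1
        if self.log_counter == 0:
            # Last accepted line: tell the reader that the limit was reached.
            message = 'Log limit reached (%s lines)' % self.max_logs
        self.lines.append(message)
        return True


logs = BuildLogs(max_logs=3)
accepted = [logs.insert('line %s' % i) for i in range(5)]
print(accepted, logs.lines[-1])
```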
Thanks @Xavier-Do for the smart idea.
When a build only creates sub-builds, the build_time is very small (a few seconds),
and this information is not relevant. This commit propagates end_time to parent_build
if parent_build is done or running.
When a build_error active field is changed, the onchange leads to a
traceback. Anyway, the onchange was not a good idea as it only reflects
UI changes.
With this commit, the write method is overridden to change the
child_ids active fields too. Also, the active_test context is used to
correctly compute the child_ids and children_build_ids.
A test is also added for all that.
The new feature in odoo/enterprise#4879 needs the firebase-admin
package. As it cannot be added to the requirements.txt, the package is
added in the Dockerfile to be able to test it on the runbot instances.
- Add a keep running flag on the build to allow a build to stay in
running state until the flag is switched off (or the build is killed)
- Do not update configs and config_steps data
- Add a first/last_seen_build and first/last_seen_date on build.error
- Children error builds now include the parent builds too
- Use a notebook on build.error form view to display builds and linked
errors
- Update result when a build triggers a change from 'warn' to 'ko' too
- Add the sticky flag on the error logs stored sql view
When a build error appears with the same fingerprint as an already known
one which was supposedly fixed, the build is simply added to the known
build error.
In order to keep an eye on such reappearing bugs and keep the fixing
history separated, this commit simply creates a new build_error.
Old build errors with the same hash (or child_ids' hashes) appear in
a computed field error_history_ids.
When using the runbot frontend, it's sometimes very frustrating to
copy a branch name, as some mouse gymnastics are necessary.
With this commit, a copy to clipboard button is added near the branch
name on the frontend.
The new feature in odoo/enterprise#4892 needs the dbfread module.
As this lib is not required for other Odoo modules, it cannot be added
to the requirements.txt file.
In order to run the tests and use this new feature on the runbot, this
commit installs the dbfread lib in the Docker image.
When finding new commits, if there are more pending builds on a repo than
the running_max parameter, the exceeding builds are skipped.
As a result, when nightly builds are created on the runbot, it happens
that some of them are skipped.
Also, since e51412d, only refs newer than max_age are built; thus the
logic is not needed to prevent rebuild of old refs in case of a fresh
runbot install.
The Many2many related on a Many2many does not map the ids as expected.
With this commit, the records are mapped in a compute.
It also fixes the children_build_ids field name, in which an uppercase letter was used.
When killed, a build could have its build end changed (problematic when
killing a running build, since build_time must represent the testing time)
-> if a build already has a build end, don't overwrite it.
The port also needs to be reset on wake_up, since another build could have
recycled the current one: port unicity is limited to builds that are not in
done state. This was working most of the time before since port unicity
was determined across runbots, thus we only had one chance in 17
to have a conflict on wake up. (This changed with the previous commit.)
With the increasing number of runbot servers (17), the total number of docker
instances can reach more than 3570 only for running builds. Starting at 2000,
this covers the port 5432 used by postgres and makes the build run step fail.
This commit simply limits the port unicity constraint to a single host.
With this commit, a new model is introduced to facilitate the tracking
of the build errors.
It's based on an SQL view (Thanks @Xavier-Do), that way, there is no new
table in DB and this view is also useful from the PSQL CLI.
In the UI, searching for errors is easier than manipulating the ir_logging
view because the build information can be used in searches and filters.
Since Google Chrome 74, a random bug makes it crash at startup,
making the odoo tests crash.
With this commit, an odoo custom deb repo is used on nightly with a
known working chrome version.
When a build is woken up and something goes wrong during the
_run_odoo_run method, the "fetch and build" cron is broken and the
concerned runbot host stops working.
With this commit, the exception is caught and the build goes back to
the "done" state with a log.
With this commit, a new RunbotBuildError model is added in order to
classify and manage errors that appear during runbot builds. This is
a helper to find non-deterministic bugs in Odoo builds. Build logs can
be parsed on demand. During the parsing, the logs are cleaned with some
regexes stored on the RunbotErrorRegex model. A hash is computed on the
cleaned log. If a build error already exists with the same fingerprint,
the build is appended to the build error.
Errors can also be manually linked together with a parent/children
relation in case of a related error log. e.g. the error message is
different in two different branches but the bug is the same.
Also, a new build_url field is added to the runbot_build in order to
access the build web page from the backend.
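
The cleaning and fingerprinting idea can be sketched as below. The two
regexes, the '%' placeholder, the choice of sha256 and the helper names are
assumptions made for this illustration; the actual regexes are records of the
RunbotErrorRegex model.

```python
import hashlib
import re

# Hypothetical cleaning regexes: they remove the parts of a log line that
# change from one build to another.
CLEANING_REGEXES = [
    r'0x[0-9a-f]+',   # memory addresses
    r'\d+',           # any remaining number (ids, ports, durations)
]


def clean(log):
    """Replace the volatile parts of a log by a placeholder."""
    for regex in CLEANING_REGEXES:
        log = re.sub(regex, '%', log)
    return log


def fingerprint(log):
    """Hash of the cleaned log, used to recognise an already known error."""
    return hashlib.sha256(clean(log).encode()).hexdigest()


def classify(known_errors, build_id, log):
    """Append the build to the error sharing its fingerprint, or create it."""
    known_errors.setdefault(fingerprint(log), []).append(build_id)
    return known_errors


log_a = 'ERROR 1234-12-all odoo.tests: record 42 at 0x7f3a not found'
log_b = 'ERROR 5678-12-all odoo.tests: record 7 at 0x1b2c not found'
errors = classify(classify({}, 1234, log_a), 5678, log_b)
print(errors)
```

Both logs clean to the same text, so the two builds end up on a single error.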
Add a new model runbot.host to keep info and configuration about
hosts (worker servers), like number of workers, reserved or not,
ping times (last start loop, successful iteration, end loop, ...)
and also last errors, number of testing per host, psql connection
count, ...
A new monitoring frontend page is created, similar to glances
but with additional information like host states and
last_monitored builds (for nightly).
Later this model will be used for runbot_build host instead of char.
Hosts are automatically created when running _scheduler.
In some cases, _force can return an empty recordset.
A possible explanation is that the corresponding branch is in
no_build mode in another repo.
This commit prevents the fetch and build loop from getting stuck in this case.
A hook can represent label changes, closed PRs, ...
We only want to fetch if a push or synchronize is sent.
TODO We may also want to catch retarget later in order to update the branch.
The indirect state was written on the parent, leading to inconsistent info.
Indirect was using the last build regardless of build_type. Now, indirect
will only use normal builds to avoid a red chain after a sticky rebuild.
A prototype of this feature was added some time ago.
It was not really tested; this commit improves the parameter format
and makes the dependency closest_branch_id not required
since a repo/sha is all we need.
Docker can take some time to be considered as running after docker_run. This
issue can appear when we speed up the scheduling loop. To avoid that, we can add a
time condition to consider whether a docker is running, but we want to avoid waiting
too much since some jobs are fast.
This solution checks if a job is a docker run before waiting, and will also
update job_start after a checkout since this can take some time if
a git fetch is performed.
When a build is killed, result will be set to manually killed,
removing the 'error' or 'warn' result.
This commit removes this behaviour in order to keep error result
in this case.
If children are killed, they will all look the same in the parent view,
making it difficult to find the failed one in staging branches.
This commit displays the result rather than the status in priority if the build
is in failure.
If a user really wants to keep a database up for a long time, he has the possibility
to wake it up multiple times.
Using last job end as reference will allow to keep a database alive longer.
The main motivation of this commit is to be able to notify github status only
when all children are done.
Until today, children were only used for dev branches and nightly. The need to
use this system for staging requires enforcing the github_status behaviour.
Before this commit, a parent would not send a github status since it only
creates children. And children are not aware of the other children's state,
so sending a success may be wrong if another one failed.
Asking the parent to send the github_status looks like the easiest solution:
-If the top parent config has update_github_state False, we also want to take that into account.
-If a child wants to contact github for failfast, the parent will be in global_state error too and will send
the message immediately.
-If a child wants to contact github for success, we actually want to wait for the last child; the parent will
be in waiting global_state and notify nothing (or pending).
Only the last child will be able to notify success since global_state will be running or done at this step.
Orphan builds won't have any impact on the result in this scenario.
Some data are logged at each loop turn even if nothing interesting was done:
- ... builds [] where allocated to runbot
- reload nginx
That kind of info was interesting for debugging but now this noise makes
logs heavier and more difficult to read.
Nginx will be reloaded only if the file changed, and this will avoid
a log at each loop turn.
We also display the difference between existing sources and sources that should
be there instead of complete lists.
Sources can be easily exported if needed since they are in the bare repository
most of the time. To avoid using too much space, this commit will garbage collect
all sources at the beginning of a long_running cron.
The only real side effect of this is that it will be impossible to wake up
a build that was force pushed since the source cannot be fetched anymore.
We could imagine keeping the sources of recent builds, maybe for
48 hours, but keeping build specific data (logs, database)
is more interesting.
This commit adds a wake up button in place of the connect button when the build is
not running and can be woken up.
The connect button will also be visible immediately when local_state is 'running'
since we don't need to wait for sub builds to finish.
When a build is completely dead, with directory and db deleted, the
wake-up system fails.
With this commit, a wake-up is not allowed on such dead builds.
In order to store things other than logs that could be accessible by
end users, for example screenshots and screencasts, a "tests" directory
is allowed through the nginx template in the builds directories.
Also, the "with" context manager is used to open the nginx configuration
to ensure that the file descriptor is released during long running crons.
When an Odoo instance is run in a Docker container, it listens on all
interfaces by default and a bridge interface is used to communicate with
the outside world.
This bridge interface is necessary to allow the instance to send logging
messages into the runbot postgresql database.
Even though Docker isolates the container from the outside world, it's
still possible to reach the Odoo instance from the runbot host and worse,
from other Docker containers using the same bridge interface.
To confirm Murphy's law, it finally happened with the commit
odoo/enterprise@0ba0ef99de that scanned the ip range of the interface,
disturbing other builds.
One solution would be to create a bridge interface for each instance to
isolate each Docker but that would imply a big change to garbage collect
forgotten bridges ...
An 'icc' (Inter Container Communication) option exists for the Docker
daemon which defaults to True. Setting it to False was tested and works
but it appears that this option can be messed-up by the firewall on the
runbot host.
Finally, if applied, this commit will prevent the Odoo instance from
listening on the bridge interface by only listening on 127.0.0.1 during
tests. A local_only parameter is added on the _cmd method, which is
true by default and means that the Odoo instance will only listen on
127.0.0.1. This parameter is set to False for the running step to allow
the running instance to be contacted through the Docker exposed ports.
The requirements path and python version were defined from
server in cmd. Since in coverage we add a 'python' before server,
it is difficult to define which element of the cmd is the server.
A solution here is simply to define requirements install and
python version when building cmd since we have access to all
build/source information. We also add the python part in every
case, and coverage params are now a _cmd python_params.
The _cmd method now returns a Command object instead of a
list, which behaves as a list for the cmd part but also contains
a pres and a posts list.
pres are requirement install, preparation, ...
cmd is the original cmd list; elements can be appended or added, which
will allow keeping existing python jobs without too many changes.
posts are post cmd commands, like coverage result making.
This commit also fixes an issue with create_job dependencies.
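
A minimal sketch of such a list-like object is given below. Only the names
Command, pres, cmd and posts come from the description above; the build
method, the ' && ' joining and the sample command lines are assumptions made
for the illustration.

```python
import shlex


class Command:
    """A main command line plus commands to run before and after it."""

    def __init__(self, pres, cmd, posts):
        self.pres = pres    # list of commands (each a list of args) run first
        self.cmd = cmd      # main command, a plain list of arguments
        self.posts = posts  # list of commands run after the main one

    # List-like behaviour is delegated to the main command.
    def __getitem__(self, index):
        return self.cmd[index]

    def __len__(self):
        return len(self.cmd)

    def __iter__(self):
        return iter(self.cmd)

    def __add__(self, args):
        return Command(self.pres, self.cmd + args, self.posts)

    def append(self, arg):
        self.cmd.append(arg)

    def build(self):
        """Join pres, cmd and posts into a single shell line (hypothetical)."""
        parts = self.pres + [self.cmd] + self.posts
        return ' && '.join(' '.join(shlex.quote(arg) for arg in part) for part in parts)


cmd = Command([['pip3', 'install', '-r', 'requirements.txt']],
              ['python3', 'odoo-bin'],
              [['coverage', 'html']])
cmd.append('--stop-after-init')   # existing list-style code keeps working
cmd = cmd + ['-d', 'build-db']
print(cmd.build())
```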
Multibuilds can generate a lot of checkouts, especially for small
and fast jobs, which can overload runbot disks since we are trying not
to clean builds immediately. (To ease bug fixing and allow wake up)
This commit proposes to store sources in a single place, so that
docker can add them as ro volumes in the build directory.
The checkout is also moved to the install jobs, so that
builds containing only create build steps won't checkout
the sources.
This change implies using --addons-path correctly, since odoo
and enterprise addons won't be merged in the same repo anymore.
This will allow testing addons the way a dev would, with a closer
command line.
This implies changing the code structure a little: some changes
were made to remove not-so-useful fields on build, and some
hard-coded logic (manifest_names and server_names) is now
stored on repo instead.
This change implies that a build CANNOT write in its sources.
It shouldn't be the case, but it means that runbot cannot be
tested on runbot until data are written elsewhere than in static.
Other solutions are possible, like bind mounting the sources
in the build directory instead of adding ro volumes in docker.
Unfortunately, this requires giving access to mount as sudo to the
runbot user and changing the docker config to allow mounts
in volumes, which is not the case by default. A plus of this
solution would be the ability to make an overlay mount.
In some conditions, Docker can take a little time to start a container.
In that case, if the runbot checks that the container is running before
it starts, runbot considers the job as finished. It then tries to grep the
logs and, as expected, it does not find the "Modules loaded".
With this commit, we consider young builds (less than 15 sec) as
running, giving more time to Docker for starting it.
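
The grace period can be sketched like this. The is_running helper and its
arguments are hypothetical; only the 15 seconds threshold comes from the
message above.

```python
import time


def is_running(container_is_up, job_start, now=None, grace=15):
    """Consider a young job as running even if Docker does not list it yet.

    job_start and now are timestamps in seconds, grace is the age under
    which a job is always considered as running.
    """
    now = time.time() if now is None else now
    return container_is_up or (now - job_start) < grace


# Container not seen yet, job started 5 seconds ago: still considered running.
print(is_running(False, job_start=100, now=105))
```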
When creating multi builds config steps, the force_build option is often
forgotten. In that case, the multi builds are detected as duplicate of
the first one.
With this commit, when asking for more than one multi build, the
force_build is changed to True.
When updating repos, the order is odoo/odoo, odoo-dev/odoo, odoo/enterprise, odoo-dev/enterprise.
The problem with this is that if a community hash and an enterprise hash arrive exactly
between the odoo/odoo and odoo/enterprise updates, enterprise could have a newer version
than community when creating builds.
By inverting this order, we have less chance of hitting this corner case (as unlikely as
it may be).
get_ref_time was cast from float to datetime and thus,
milliseconds were lost. Storing it as a float makes the code
easier to read and avoids this rounding that was breaking
this feature.
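
The rounding problem can be reproduced outside of Odoo. The sketch below
assumes that the datetime was stored with a second precision text format;
the roundtrip helper and the sample timestamp are only illustrative.

```python
from datetime import datetime, timezone

FORMAT = '%Y-%m-%d %H:%M:%S'  # second precision, the fraction is dropped


def roundtrip(timestamp):
    """Store a float timestamp as a second precision datetime and read it back."""
    as_text = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime(FORMAT)
    restored = datetime.strptime(as_text, FORMAT).replace(tzinfo=timezone.utc)
    return restored.timestamp()


ref_time = 1560000000.75
stored = roundtrip(ref_time)
# The stored value no longer equals the original one, so a comparison
# between both wrongly concludes that the refs changed.
print(stored, stored == ref_time)
```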
When getting new refs, a lot of them are really old and the
find_new_commits is called for each one and thus browsing branches.
With this commit, refs older than configured max_age are ignored.
Co-authored-by: Xavier Dollé (xdo@odoo.com)
Accessing children can create rollbacks, especially for builds with
a lot of them, since other runbots will concurrently access and update
states. This commit tries to improve this by incrementing and
decrementing counters instead of counting all of them each time.
A slow query was detected on the runbot, causing a latency when loading
any page. After some investigations (thanks jle), we found that an index
on local_state could improve speed.
When updating github statuses, it happens that we face a "Bad gateway"
from github. In that case, the error is logged in the runbot logs and
that's it. As a consequence, when the runbot_merge is waiting for the status of
the staging branch and this kind of error occurs, the runbot_merge
times out and the users search in vain for the reason.
With this commit, the runbot tries to update the status at least twice.
If it fails, an INFO message is logged on the build itself.
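
The retry logic can be sketched as follows. The update_status helper, the
send callable and the log callback are hypothetical names; the real code
talks to the github API and logs on the build.

```python
def update_status(send, attempts=2, log=print):
    """Call send() up to `attempts` times, log an INFO message on final failure."""
    last_error = None
    for _attempt in range(attempts):
        try:
            send()
            return True
        except IOError as error:
            # Keep the error, it is unbound once the except block is left.
            last_error = error
    log('INFO: github status update failed after %s attempts: %s' % (attempts, last_error))
    return False


calls = []


def flaky_send():
    """Fail once with a Bad gateway, then succeed."""
    calls.append(1)
    if len(calls) < 2:
        raise IOError('502 Bad gateway')


print(update_status(flaky_send))
```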
Nightly builds have a low priority but once they have a slot, they keep it.
When pushing a branch or asking robodoo to nicely merge a branch for the
third time at 22:00, there may be no slot left since all the nightly builds are created.
This commit will only assign scheduled builds if there is no other build to create and
will always keep a free slot for other builds.
When using a "python job", it's sometimes useful to write a file or
create a directory.
Instead of giving a wide open access to the os module in the
"_run_python" context, this commit adds write_file and make_dirs
methods on the build, which are usable in the _run_python eval context.
When using the read_method in a "python job", it's sometimes needed to
read a file in binary mode.
With this commit, the mode can be specified when calling the method.
Since the use of the "python jobs", we spotted various needs that were
not fulfilled. In order to add flexibility to "python jobs", this commit
adds some useful objects in the _run_python eval context.
Also, the glob.glob function is given instead of the whole glob module
to avoid giving access to the os module via glob.os.
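
The leak through the module can be shown with a plain eval, used here as a
stand-in for the evaluation done by _run_python. The two context
dictionaries are hypothetical.

```python
import glob

# Context exposing the whole module versus the function only.
module_context = {'glob': glob}
function_context = {'glob': glob.glob}

# With the module, the os module is reachable through an attribute.
leaked = eval('glob.os.getcwd', module_context)
print(leaked)

# With the function only, the same expression fails.
try:
    eval('glob.os.getcwd', function_context)
    blocked = False
except AttributeError:
    blocked = True
print(blocked)
```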
When a runbot instance is scheduling builds, the number of builds
depends on a global ir.config_parameter. Even if one of the runbot
instances is running on a more powerful system, its number of workers is
limited by this global parameter.
With this commit, this parameter still exists but can be overridden by
a specific ir.config_parameter.
For example, if the host 'runbot24.odoo.com' has more cpu power, the
number of workers for this host can be specified in the
ir.config_parameter named 'runbot24.odoo.com.workers'.
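
The lookup order can be sketched with a plain dictionary standing in for the
ir.config_parameter table. The workers_for helper, the default of 6 and the
name of the global key ('runbot.runbot_workers') are assumptions of this
sketch; only the '<host>.workers' key comes from the message above.

```python
def workers_for(params, fqdn, default=6):
    """Return the number of workers for a host.

    The host specific parameter wins over the global one.
    """
    host_value = params.get('%s.workers' % fqdn)
    if host_value:
        return int(host_value)
    return int(params.get('runbot.runbot_workers', default))


params = {
    'runbot.runbot_workers': '6',          # assumed name of the global key
    'runbot24.odoo.com.workers': '12',     # host specific override
}
print(workers_for(params, 'runbot24.odoo.com'), workers_for(params, 'runbot1.odoo.com'))
```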
When installing software with apt-get in a Dockerfile, it should be
preceded with an apt-get update in the same RUN. Otherwise, the step may
fail if the needed package has been updated.
In a create config, a parent result is computed based on children
results.
In some situations, it could be handy to ignore the result of some
sub-builds.
Example: the nightly tests are just the children of one nightly build
with a create config. The external tests are failing randomly and as a
consequence, the nightly result is always red. On the other hand,
keeping the test running, just to have logs is a good idea.
With this commit, a config_step of type create can be marked as
orphan_result, that way, the result is not taken into account in the
parent build result.
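
The effect of orphan_result on the parent can be sketched as below. The
parent_result helper, the (result, orphan_result) tuples and the restriction
to the ok/warn/ko results are simplifications made for the illustration.

```python
SEVERITY = ['ok', 'warn', 'ko']  # from best to worst


def parent_result(local_result, children):
    """Worst result among the parent and its non orphan children.

    children is a list of (result, orphan_result) tuples.
    """
    counted = [result for result, orphan_result in children if not orphan_result]
    return max([local_result] + counted, key=SEVERITY.index)


# The failing external tests are marked as orphan_result: the nightly stays green.
print(parent_result('ok', [('ok', False), ('ko', True)]))
```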
When the quickconnect button is used, the last running build is
searched in the last 10 builds. If no running build is found, the last
one is rebuilt, even if it's a nightly build.
With this commit, the quickconnect build is chosen only among the ones
with the same config.
With recursive states computation, schedule is
more likely to have transactional errors.
This is particularly a problem when external
operations are done during the transaction,
like running a docker.
Adding some commits will help to reduce
transactional errors, and ensure that the db
is consistent with docker states.
As the public user needs to be in runbot user group to display the
frontend, the public user is allowed to kill or rebuild a build.
With this commit, only the logged in users have access to the Rebuild/Kill
menu entry.
When searching for new refs to create builds, the for-each-ref git
command is run and each ref is searched in the database, which is a
somewhat heavy operation.
With this commit, the timestamp of the last database update with the
refs is stored in a field on the repo. This timestamp is checked each
time a for-each-ref is needed, running the operation only when
necessary.
This commit aims to replace static jobs by fully configurable build config.
Each build has a config (custom or inherited from repo or branch).
Each config has a list of steps.
For now, a step can test/run odoo or create a new child build. A python job is
also available.
To mimic the previous behaviour of runbot, a default config is available with
three steps, an install of base, an install+test of all modules, and a last step
for run.
Multibuilds are replaced by a config containing creation steps.
The created builds are not displayed in main views, but are available
on parent build log page. The result of a parent takes the result of
all children into account.
These new mechanics will help to create some custom behaviours for specific
use cases, and later help to parallelise work.
The cpu limit used in job_20 uses the runbot_timeout config_parameter
since b539112a7e. When measuring coverage, this parameter is multiplied
and leads to an error because the type of ir.config_parameter.get_param
method is str.
With this commit, this value is converted into an integer before usage
in job_20.
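
The type problem can be reproduced in a few lines. The cpu_limit helper and
the 1.5 coverage factor are hypothetical; the point is only that get_param
returns a str, which cannot be multiplied by a float.

```python
COVERAGE_FACTOR = 1.5  # hypothetical multiplier used when measuring coverage


def cpu_limit(timeout_param, coverage=False):
    """Convert the parameter (a str) to an integer before any arithmetic."""
    timeout = int(timeout_param)
    return int(timeout * COVERAGE_FACTOR) if coverage else timeout


raw_value = '1800'  # what get_param returns: a string
try:
    raw_value * COVERAGE_FACTOR
    failed = False
except TypeError:
    # A str cannot be multiplied by a float.
    failed = True

print(failed, cpu_limit(raw_value, coverage=True))
```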
When running Odoo in the Docker container, the username used to connect
to the database is the username defined in the docker container
(currently odoo).
A problem may arise if the user of the runbot process is not the same.
An authentication error is then raised by postgres because of the
username mismatch.
With this commit, the '-r' parameter of Odoo is added to the command
with the username used by the runbot process.
While at it, unused imports are removed.
When a build exceeds the cpu limit, it is simply killed by the kernel.
As a safeguard the "Initiating shutdown." sentence should be searched
in the log file, and the build marked as "ko" if not found.
Unfortunately, there is no period (.) at the end of the sentence in the
Odoo logs (see: https://github.com/odoo/odoo/blob/12.0/odoo/service/server.py#L444)
Thus, this condition is never fulfilled.
On top of that, this was masked by the first part of the condition,
checking that the 'test/common.py' has no "post_install" string.
The "test" directory does not exist in Odoo (but "tests" exists), so
the condition was always falsy.
Finally, a build can be marked as "ok" when it is killed and no errors
are found until the kill.
With this commit:
* The legacy grep for post_install is removed as it now exists in
all Odoo supported versions.
* The period typo is fixed.
* A log is inserted when the final sentence is not found.
* The cpu_limit is set to the same value as the runbot_timeout parameter
for better consistency.
* The time exceeded log message is now logged in the build instead
of the runbot log.
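
The fixed check can be sketched as follows. The final_result helper, its
return value and the log message text are hypothetical; only the searched
sentence comes from the description above.

```python
def final_result(log_text, result='ok'):
    """Mark the build as ko when the shutdown sentence is missing from the log.

    Returns the result and an optional message to log on the build.
    """
    if 'Initiating shutdown' not in log_text:
        return 'ko', 'No "Initiating shutdown" found in logs, the build was probably killed.'
    return result, None


clean_stop = 'INFO db odoo.service.server: Initiating shutdown\n'
killed = 'INFO db odoo.modules.loading: Modules loaded.\n'

# Searching with the trailing period never matches a clean stop.
print('Initiating shutdown.' in clean_stop)
print(final_result(clean_stop), final_result(killed)[0])
```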
Co-authored-by: @Xavier-Do