When a build is done, various numerical informations could be extracted
from log files. e.g.: global query count or tests query count ...
The extraction regular expression could be hard-coded in a custom step
but there is no place holder where to store the retrieved information.
In order to compare results, we need to store it.
With this commit, a new model `runbot.build.stat` is used to store
key/values pair linked to a build/config_step. That way, extracted
values can be stored.
Also, another `runbot.build.stat.regex` is used to store regular
expressions that can be used to grep log files and extract values.
The regular expression must contain a named group like this:
`(?P<value>.+)`
The text catched by this group MUST be castable into a float.
Optionally, another named group can be used in the regular expresion
like this:
`(?P<key>.+)`
This `key` group will then be used to augment the key name in the
database.
Example:
Consider a log line like this one:
`odoo.addons.website_blog.tests.test_ui tested in 10.35s`
A regular expression like this, named `test_duration`:
`odoo.addons.(?P<key>.+) tested in (?P<value>\d+\.\d+)s`
Should store the following key:value:
`{
'key': 'test_duration.website_blog.tests.test_ui',
'value': 10.35
}`
A `generic` boolean field is present on the build.stat.regex object,
meaning that when no regex are linked to a make_stats config step, then,
all the generic regex will be applied.
A wizard is added to help the creation the regular expressions, allowing
to test if they work against a user provided example.
A _make_stats method is added to the ConfigStep model which is called
during the _schedule of a build, just before calling the next step.
The regex search is only apllied in steps that have the `make_stats`
boolean field set to true. Also, the build branch have to be flagged
`make_stats` too or the build can have a key `make_stats` in its
config_data field.
The `make_stats` field on the branch is a compute stored field.
That way, sticky branches are automaticaly set `make_stats' true.
Finally, an SQL view is used to facilitate the stats visualisation.
With this commit, a kind of markdown is allowed in log messages and
build.description.
Following patterns are supported:
**strong**
~~striketrough~~
__underline__
`code`
[link](target)
@icon-font-awesome-class
On non sticky branches, when a new build is found while another is
already testing, the older build is killed. This happens during when the
main runbot instance is discovering new commits and create new builds.
As a result, concurrent updates may occur while the builders access the
concerned build.
With this commit, this garbage collecting procedure occurs during the
scheduler loop that runs on runbot builder hosts.
Also, the logic changed in a way that the kill is requested only if the
host needs room to handle pending builds.
When a build age reaches the gc_days parameter, its database is dropped
and its directory is removed.
With this commit, two fields are added in order to keep some builds
longer that the defined gc_days.
The gc_delay field on the build allows to add a delay (in number of
days) that is added to its gc_days to compute the gc_date.
The gc_date field is the date when the cleaning will occur.
Also, a test is added and the RunbotCase test class is improved to allow
the stop of a patcher.
Since cb05f2b9d8, when creating a database, the template1 is used, this
allows to customize the template and install some needed Postgress
extensions.
Unfortunately, it's also a source of build failures. For example, if the
pg_activity util is used on a runbot host, database creation may fail
with a message like this:
source database "template1" is being accessed by other users
It's because pg_activity needs a database and uses template1.
With this commit, template1 is still used by default but can be changed
with a system parameter. That way, a custom template can be created on
runbot hosts and used when creating DB in builds.
At this moment, the Docker image is built at the beginning of each
runbot build. This blocks the _scheduler while the image is built.
With this commit, the image is built before calling the _scheduler and
is not linked to a runbot build.
Also, the necessary dirs are created in the static path before starting
the loop.
Since 857821e4 a screenshot can be saved in the build directory under
the "tests" subdirectory.
Unfortunately, this directory is cleaned in case of local_cleanup.
With this commit, the 'tests' subdirectory is kept in place.
If a docker is killed from outside, the start file will still be there but
not the end file. (when restarting docker service for instance)
This commits add a docker_state specific to non running docker
with start a file, that should be handled like unknow state:
This state is acceptable for a while, but build should be killed
if this state remains for to long.
When two steps in the same build needs to exchange informations, some
hacks have to be used. E.g. using the extra_params fields to store comma
separated values.
With this commit, a config_data field is added alongside with a
JsonDictField that automatically transform the data into json.
This reverts commit 54f9b9b546.
The main reason is linked to inconsistency in state compute because
of error in nb_ computations.
This was to avoid concurrent update, witch is not a such a big problem
now since workers are no longer using crons. (retry on failure is faster).
This commit add the possibility to add custom checks to python steps,
as well as ignoring triggered result if log of level error/warning
is not considered as a problem.
docker_is_running is ambiguous since we dont know if it was started once.
This new feature tries to add tools to know if a docker was started or not.
The main reason of this is that sometimes docker_run may take more than 15 seconds
creating unpredictable errors on build when the second step is launched and
the previous one is still running. Hopefully this fix will help to solve this
issue and detect late docker run.
This commit was initially there for tests, when no repo exist, but
get_param will also crash if commit does not exist, wich may be
a problem on user rebuild.
When creating an .odoorc file to store configuration that are proper for
the runbot, the command line options were used as key.
The problem is that the Odoo config file use undescore instead of dash
for options keys.
Also, the Command object was not recreated with the config_tuples
parameter. Because of that, when adding a command in the list, the
config_tuples were losts.
As a consequence, on the Odoo runbot instance, the data dir was created
in the default dir and thus, not included in the zip file of the dump,
causing some runbot steps to fail.
A lot of things have to be mocked during runbot tests, as a consequence,
a lot of patch decorators accumulate in a big stack uppon some tests
methods.
Also, a lot of mocks are used multiple times among tests.
With this commit, a new RunbotClass is added that comes with patches
ready to be started. A start_patcher helper method is available to start
a patch and add the appropriate stop in a cleanup.
Also, when a build is created in the tests, the _get_params method is
always called, resulting in an annoying git warning.
With this commit, a create_build method is added on the test class, that
way the _get_params is always mocked when a build is created.
When starting an odoo instance with Docker, a very long command line is
computed and appears in the logs.
With this commit, an .odoorc configuration file is written ind the build
dir and mounted in the Docker container.
Previously, the runbot .odoorc/.openerprc file was mounted to share some
parameters. Now, if that file exsists, its content is merged with build
.odoorc.
Dump a db at the end of a build, using a new 'finals' cmd part
added in order to execute dump even if build fails.
Add a link in last step log to download dump.
In different situations, a docker container may stay alive even if the
build global_state is done. This can lead to a build failure when a
build wants to go in running state and tries to expose the same ports as
the left over build.
Actually some Odoo modules are black_listed from a set hardcoded in the
runbot code. In some cases, one needs to blacklist custom modules,
preferably in a config_step.
With this commit, the repo.modules, branch.modules,
config_step.install_modules fields are concatained in a comma separated
list of fnmatch patterns. The patterns can be prefixed with a dash to
exclude the matching module(s).
Co-authored by @Xavier-Do
When a build is running, a cron, an evil query or something else can
start to fill and bloat the runbot ir_logging table.
With this commit, a log_counter field is added on the build, starting at
100. The SQL trigger decrement this counter after a line is inserted.
When the counter drops to 0, a the last log line contains a message
stating that the limit has been reached. Further log lines are dropped
for this build step.
The counter is reset to a default of 100 before each step.
This value is configurable through an optional ir.config_parameter
runbot_maxlogs.
The runbot itself is still able to add logs lines through the build _log
method.
Thanks @Xavier-Do for the smart idea.
When a build only create sub-builds, the build_time is verry small (a few seconds),
and this information is not relevant. This commit propagates end_time to parent_build
if parent_build is done or running.
- Add a keep running flag on the build to allow a build to stay in
running state until the flag is switched off ( or the build killed)
- Do not update configs and config_steps data
- Add a first/last_seen_build and first/last_seen_date on build.error
- Children error builds now include the parent builds too
- Use a notebook on build.error form view to display builds and linked
errors
- Update result when a build triggers a change from 'warn' to 'ko' too
- Add the sticky flag on the error logs stored sql view
When killed a build could have his build end changed (problematic when
killing a running since build_time must represent the testing time)
-> if a build already has a build end, don't overwrite it.
Port also needs to be reset on wake_up since another build would have
recycle the current one since port unicity is limited to build not in
done state. This was working most of the time before since port unicity
was determined cross runbots, thus we only had one chance over 17
to have a conflict on wake up. (this changed with prevous commit)
With the increasing number of runbot servers (17), the total number of docker
instances can reach more than 3570 only for running build. Starting at 2000,
this covers the posrt 5432 used by postgress and make the build run step fail.
This commit simply limit the port unicity constraint by host.
With this commit, a new model is introduced to facilitate the tracking
of the build errors.
Its based on an SQL view (Thanks @Xavier-Do), that way, there is no new
table in DB and this view is also useful from the PSQL CLI.
In the UI, the search for errors easier than manipulate the ir_logging
view because the builds informations can be used in search and filters.
When a build is wake-up and something goes wrong during the
_run_odoo_run method, the "fetch and build" cron is broken and the
concerned runbot host stops working.
With this commit, the exception is catched and the build goes back to
the "done" state whith a log.
With this commit, a new RunbotBuilError model is added in order to
classify and manage errors that appears during runbot builds. This is
an helper to find undeterministic bugs in Odoo builds. Build logs can
be parsed on demand, during the parsing, the logs are cleaned with some
regexes stored on the RunbotErrorRegex model. A hash is computed on the
cleaned log, if a build error already exists with the same fingerprint,
the build is appended on the build error.
Errors can also be manually linked together with a parent/children
relation in case of a related error log. e.g. the error message is
different in two different branches but the bug is the same.
Also, a new build_url field is added to the runbot_build in order to
access the build web page from the backend.
A prototype of feature was added some times a go.
No really tested, this commit improves parmater format
and makes dependency closest_branch_id not required
since a repo/sha is all we need.
Docker can take some time to be considered as running after docker_run. This
issue can appear when we speedup sheduling loop. To avoid that when can add an
time condition to consider if a docker is running, but we want to avoid to wait
to much since some jobs are fast.
This solution check if a job is a docker run before waiting, and will also
update job_start after a checkout since this can take some time if
a git fetch is performed.
When a build is killed, result will be set to manually killed,
removing the 'error' or 'warn' result.
This commit removes this behaviour in order to keep error result
in this case.
If a user really wants to keep a database up for a long time, he has the possibility
to wake it up multiple times.
Using last job end as reference will allow to keep a database alive longer.
The main motivation of this commit is to be able to notify github status only
when all children are done.
Until today, children where only used for dev branches and nightly. The needs to
use this system for staging need to enforce github_status behaviour.
Before this commit, a parent won't send github status since he will only
create childrens. And childrens are not awared of other children state,
so sending a succes may be wrong if another one failed.
Asking the parent to make the github_status looks the easiest solution:
-If top parent config will have update_github_state False, we also want to take that into account.
-If a child wants to contact github for failfast, parent will be in global_state error too and will send
message immediatly.
-If a child want to contact github for succsess, we actually want to wait for last child, parent will
be in waiting global_state and notify nothing (or pending).
Only last child will be able to notiffy success since global_state will be running or done at this step.
Orphan builds wont have any impact on result in with this scenario.
Sources can be easily exported if needed since they are in the bare repository
most of the time. To avoid using to much space, this commit will garbage collect
all sources at the beginning of a long_running cron.
Only real side effect of this is that it will be impossible to wake up
a build that was force pushed since source cannot be fetched anymore.
We may imagine that we could keep sources of recent build, maybe for
48 hours, but keeping build specific data (logs, database)
is more interresting.
When a build is completely dead, with directory and db deleted, the
wake-up system fails.
With this commit, a wake-up is not allowed on such dead builds.
When an Odoo instance is run in a Docker container, it listen on all
interfaces by default and a bridge interface is used to communicate with
the outside world.
This bridge interface is necessary to allow the instance to send logging
messages into the runbot postgresql database.
Even though Docker isolate the container from the oustide world, it's
still possible to reach the Odoo instace from the runbot host and worse,
from other Docker containers using the same bridge interface.
To confirm the Murphy's law, it finally happened with this commit
odoo/enterprise@0ba0ef99de that scanned the ip range of the interface,
disturbing other builds.
One solution would be to create a bridge interface for each instance to
isolate each Docker but that would imply a big change to garbage collect
forgotten bridges ...
An 'icc' (Inter Container Communication) option exists for the Docker
daemon which defaults to True. Setting it to False was tested and works
but it appears that this option can be messed-up by the firewall on the
runbot host.
Finally, if applied, this commit will prevent the Odoo instance to
listen on the bridge interface by only listening to 127.0.0.1 during
tests. A local_only parameter is a added on the _cmd method which is
true by default and means that the Odoo instance will only listen on
127.0.0.1. This parameter is set to False for the running step to allow
the running to be contacted through the Docker exposed ports.
The requirements path and python version where defined from
server in cmd. Since in coverage we add a 'python' before server,
it is difficult to define which element of the cmd is the server.
A solution here is simply to define requirements install and
python version when building cmd since we have access to all
build/source informations. We also add python part in every
cases, and coverage params are now a _cmd python_params.
The _cmd method now returns a Command object instead of a
list, which behave has a list for the cmd part but also contains
a pres and posts list.
pres are requirement install, preparation, ...
cmd is the original cmd list, element can be append or added, this
will allow to keep existing python job without to much changes.
posts are post cmd commands, like coverage result making.
This commit also fix issue with create_job dependencies.
Multibuild can create generate a lots of checkout, especially for small
and fast jobs, which can overload runbot discs since we are trying not
to clean build immediatly. (To ease bug fix and allow wake up)
This commit proposes to store source on a single place, so that
docker can add them as ro volume in the build directory.
The checkout is also moved to the installs jobs, so that
builds containing only create builds steps won't checkout
the sources.
This change implies to use --addons-path correctly, since odoo
and enterprise addons wont be merged in the same repo anymore.
This will allow to test addons a dev will do, with a closer
command line.
This implies to change the code structure a litle, some changes
where made to remove no-so-usefull fields on build, and some
hard-coded logic (manifest_names and server_names) are now
stored on repo instead.
This changes implies that a build CANNOT write in his sources.
It shouldn't be the case, but it means that runbot cannot be
tested on runbot untill datas are written elsewhere than in static.
Other possibilities are possible, like bind mounting the sources
in the build directory instead of adding ro volumes in docker.
Unfortunately, this needs to give access to mount as sudo for
runbot user and changes docjker config to allow mounts
in volumes which is not the case by default. A plus of this
solution would be to be able to make an overlay mount.
In some conditions, Docker can take a little time to start a container.
In that case, if the runbot checks that the container is running before
it starts, runbot consider the job as finished. It the tries to grep the
logs and, as expected, it does not find the "Modules loaded".
With this commit, we consider young builds (less than 15 sec) as
running, giving more time to Docker for starting it.
Accessing childrens can create rollback, especially for builds with
a lot of them, since other runbot will concurently access and update
states. This commit tries to improve this by incrementing and
decrementing counters instead of counting all of them each time.
A slow query was detected on the runbot, causing a latency when loading
any page. After some investigations (thnks jle), we found that an index
on local_state could improve speed.
When updating github statuses, it happens that we face a "Bad gateway"
from github. In that case, the error is logged in the runbot logs and
that's it. As a consequence, when the runbot_merge is waiting status for
the staging branch and this kind of error occurs, the runbot_merge
timeouts and the users vainly search the reason.
With this commit, the runbot tries to update the status at least twice.
If it fails, an INFO message is logged on the build itself.