When a docker image fails to build, the output is logged in the chatter,
leading to a lot of noise and barely readable output. Moreover, the
chatter tries to render the output as markdown and doesn't handle line
breaks correctly.
This commit proposes to introduce a model to store this output, as well
as some other info like the image identifier, build time, ...
This will help to compare image versions between hosts and should be
useful later to have multiple versions of the same image with variants
once the docker registry is introduced.
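A minimal sketch of the kind of model this could introduce; the model
and field names below are illustrative assumptions, not the actual
implementation:

    from odoo import fields, models

    class DockerBuildOutput(models.Model):
        _name = 'runbot.docker_build_output'  # hypothetical name
        _description = 'Stored result of a docker image build'

        host_id = fields.Many2one('runbot.host')
        identifier = fields.Char(help="docker image identifier")
        build_time = fields.Float(help="duration of the build, in seconds")
        output = fields.Text(help="full build output, kept out of the chatter")
        success = fields.Boolean()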
Breakages:
- the entire http.py API was updated, requiring fixes to the uses of
`request.jsonrequest` and to the patches to `WebRequest` used to hook in
sentry
- `fontawesome` was moved
- `*[@groups]` elements are now completely removed from the view if the
groups don't match, so any field inside them which needs to be used
outside (e.g. in attrs) has to be added as invisible outside the element
- discuss removed the mail tracking value helpers from RPC in
odoo/odoo#88547, so reimplement locally (and better)
The batch prepare fill missing won't work if the repo is single-version
and in a foreign bundle.
This commit simply adds the branches depending on their own version
instead of the bundle's version.
Because the first operation the notification task performs is updating
the commit, this has to be flushed in order to read back the commit
hash (probably fixed in later versions).
This means that if updating the flag triggers a serialization
failure (which happens and is normal), we transition to the `exception`
handler, which *tries to retrieve the sha again* and we get an "in
failed transaction" error on top of the current "serialization
failure", leading to a giant traceback. It is also an unhelpful one,
since a serialization failure is expected in this cron.
- Move the reads with no write dependency out of the `try`, there's no
reason for them to fail, and notably memoize the `sha` (see the sketch
after this list).
- Split the handling of serialization failures out of the normal one,
and just log an `info` (we could actually log nothing, probably).
- Also set the precommit data just before the commit, in case staging
tracking is ever a thing which could use it.
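A rough sketch of the resulting shape, assuming a psycopg2-backed
cursor; the model and field names here are illustrative, not the actual
code:

    import logging
    import psycopg2.errors

    _logger = logging.getLogger(__name__)

    def _notify(self):
        # reads with no write dependency: memoize the sha before any write
        sha = self.commit_id.sha
        try:
            self.to_notify = False  # the write which may conflict
            # set the precommit data only just before committing
            self.env.cr.precommit.data['notified'] = sha
            self.env.cr.commit()
        except psycopg2.errors.SerializationFailure:
            # expected when racing another transaction, no traceback needed
            _logger.info("serialization failure notifying %s, will retry", sha)
            self.env.cr.rollback()
        except Exception:
            _logger.exception("failed to notify %s", sha)
            self.env.cr.rollback()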
Fixes #909
Detailed statuses are useful in the actual PR dashboard as that allows
direct access to the builds; however in the PR, where it's only a
picture, it's useless, so fold that information. Also fold it when a PR
is staged.
And while at it, add a note / sub-title that the PR is staged.
Fixes #919
The weekly maintenance would not prune refs. This is not an issue on
odoo/odoo because development branches are in a separate repository,
thus never fetched (we push to them but only using local commits and
remote refs).
However on repos like odoo/documentation the reference and development
branches are collocated, the lack of pruning thus keeps every
development branch alive locally, even years after the branch has been
deleted in the repository.
By pruning remote-tracking refs before GC-ing, we should have cleaner
local clones, and better packing.
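Roughly, the maintenance now does something along these lines (a
sketch; the actual helpers and git options used by runbot may differ):

    import subprocess

    def weekly_gc(repo_path):
        # drop remote-tracking refs for branches deleted upstream...
        subprocess.run(['git', '-C', repo_path, 'remote', 'prune', 'origin'], check=True)
        # ...so gc can actually discard their objects and repack what remains
        subprocess.run(['git', '-C', repo_path, 'gc', '--prune=now'], check=True)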
`test_inconsistent_target` was, appropriately, inconsistent, but would
only fail a very small fraction of the time: the issue is that a PR
would switch target between `other` and `master` assuming neither was
an intrinsic blocker *but* the branches are created independently,
just with the same content.
This means if a second ticked over between the creation of the
`master` branch's commit and that of `other`, they would get different
commit hashes (because different timestamp), thus the PR would get 2
commits (or complete nonsense) when targeted to `other`, and the PR
itself would be blocked for lack of a merge method.
The solution is to be slightly less lazy, and create `other` from
`master` instead of copy/pasting the `make_commits` directive. This
means the PR has the exact same number of commits whether targeted to
`master` or `other`, and we now test what we want to test 60 seconds
out of every minute.
A few crons (e.g. database maintenance) remain on timers, but most of
the work crons should be migrated to triggers.
The purpose of this change is mostly to reduce log spam, as crons get
triggered very frequently (most of them every minute) but with little
to do. Secondary purposes are getting experience with cron triggers,
and better sussing out dependencies between actions and crons /
queues.
Fixes #797
Since every cron runs on a fresh database, on the first `run_crons`
every single cron in the db will run even though almost none of them
is relevant.
Aside from the slight inefficiency, this creates unnecessary extra
garbage in the test logs.
By setting the `nextcall` of all crons to infinity in the template we
avoid this issue: only triggered crons (or the crons whose nextcall we
set ourselves) will run during `run_crons` calls.
This requires adjusting the branch cleanup cron slightly: it didn't
correctly handle the initial run (`lastcall` being false).
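In practice this amounts to something like the following in the
template setup (a sketch; the exact date and where it is done are
illustrative):

    # push every cron far enough into the future that it never fires on
    # its own; tests then rely on triggers, or set `nextcall` themselves
    env['ir.cron'].search([]).write({'nextcall': '4242-01-01 00:00:00'})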
With the trigger-ification pretty much complete, the only cron that's
still routinely triggered explicitly is the cross-pr check, and it's
the same one in all modules, so there's no reason to keep an overridable
fixture.
The staging cron turns out to be pretty reasonable to trigger, as we
already have a handler on the transition of a batch to `not blocked`,
which is exactly when we want to create a staging (that and the
completion of the previous staging).
The batch transition is in a compute which is not awesome, but on the
flip side we also cancel active stagings in that exact scenario (if it
applies), so that matches.
The only real finesse is that one of the tests wants to observe the
instant between the end of a staging (and creation of splits) and the
start of the next one; because the staging cron is triggered by the
failure of the previous staging, that sequence is now "atomic", and the
test has to disable the staging cron, which means the trigger is skipped
entirely. So this requires triggering the staging cron by hand.
The merge cron is the one in charge of checking staging state and
either integrating the staging into the reference branch (if
successful) or cancelling the staging (if failed).
The most obvious trigger for the merge cron is a change in staging
state from the states computation (transition from pending to either
success or failure). Explicitly cancelling / failing a staging marks
it as inactive so the merge cron isn't actually needed.
However another major trigger is *timeout*, which doesn't have a
trivial signal. Instead, it needs to be hooked on the `timeout_limit`,
and has to be re-triggered at every update to the `timeout_limit`,
which in normal operations mostly comes from "pending" statuses bumping
the timeout limit. In that case, `_trigger` at the `timeout_limit`, as
that's where / when we expect a status change.
The worst case scenario with this is that we get parasitic wakeups of
this cron, but having half a dozen unnecessary wakeups in an hour is
still probably better than having a wakeup every minute.
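A sketch of that hook; the cron XML id and the model are placeholders,
only the `_trigger(at=...)` call is the point:

    from odoo import fields, models

    class Staging(models.Model):
        _inherit = 'runbot_merge.stagings'  # hypothetical, for the sketch

        def write(self, vals):
            res = super().write(vals)
            if 'timeout_limit' in vals:
                # wake the merge cron at the timeout instead of polling
                self.env.ref('runbot_merge.merge_cron')._trigger(
                    at=fields.Datetime.to_datetime(vals['timeout_limit']))
            return res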
These are pretty simple to convert as they are straightforward: an
item is added to a work queue (table), then a cron regularly scans
through the table executing the items and deleting them.
That means the cron trigger can just be added on `create` and things
should work out fine.
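For instance (a sketch, with placeholder model and cron ids):

    from odoo import api, models

    class QueueItem(models.Model):
        _inherit = 'forwardport.batches'  # placeholder queue model

        @api.model_create_multi
        def create(self, vals_list):
            records = super().create(vals_list)
            # wake the processing cron as soon as there is work to do
            self.env.ref('forwardport.port_forward')._trigger()
            return records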
There are just two wrinkles in the port_forward cron:
- It can be requeued in the future, so needs a conditional trigger-ing
in `write`.
- It is disabled during freeze (maybe something to change), as a
result triggers don't enqueue at all, so the cron needs to be triggered
explicitly right after the freeze re-enables it.
When cleaning build errors before fingerprinting, it's only possible to
replace the matching regex with something else, but not with an empty
string.
Since Python 3.11 may add lines to error messages in order to visually
improve them, the fingerprints of those errors no longer match between
different versions.
With this commit, when the replacement string is two consecutive single
quotes, the matching element is replaced by an empty string, allowing
unwanted characters to be removed.
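A minimal sketch of the convention, assuming the cleaning step boils
down to a `re.sub` (names are illustrative):

    import re

    def clean_content(regex, replacement, content):
        # two consecutive single quotes mean "replace the match with nothing"
        if replacement == "''":
            replacement = ''
        return re.sub(regex, replacement, content)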
Mergebot / forwardport crons need to run in a specific ordering in
order to flow into one another correctly. The default ordering being
unspecified, it was not possible to use the normal cron
runner (instead of the external driver running crons in sequence one
at a time). This can be fixed by setting *sequences* on crons, as the
cron runner (`_process_jobs`) will use that order to acquire and run
crons.
Also override `_process_jobs` however: the built-in cron runner
fetches a static list of ready crons, then runs that.
This is fine for the normal situation where the cron runner runs in a
loop anyway, but it's an issue for the tests, as we expect that cron A
can trigger cron B, and we want cron B to run *right now* even if it
hadn't been triggered before cron A ran.
We can replace `_process_job` with a cut down version which does
that (cut down because we don't need most of the error handling /
resilience, there's no concurrent workers, there's no module being
installed, versions must match, ...). This allows e.g. the cron
propagating commit statuses to trigger the staging cron, and both will
run within the same `run_crons` session.
Something I didn't touch is that `_process_jobs` internally creates
completely new environments, so there is no way to pass context into
the cron jobs anymore (whereas it works for `method_direct_trigger`);
this means the context values have to be shunted elsewhere for that
purpose, which is gross. But even though I'm replacing `_process_jobs`,
this seems a bit too much of a change in cron execution semantics, so I
left it out.
While at it tho, silence the spammy `py.warnings` stuff I can't do
much about.
The order of requirements may have an impact on the final outcome.
In documentation builds, we have two requirements with conflicting
needs. Let's make the doc requirements install after the odoo ones
(sequence dependent).
- add context, in case that allows tracking overlap between tests
eventually
- use `-m` to run odoo, requires backporting `odoo/__main__.py`
but I'm not quite sure how I was running it previously, possibly
via odoo-bin? It certainly doesn't seem to work out with a global
`odoo` helper
Merge errors are logical failures, not technical ones; it doesn't make
sense to log them out because there's nothing to be done technically,
a PR having consistency issues or a conflict is "normal". As such
those messages are completely useless and just take unnecessary space
in the logs, making them more difficult to use.
Instead of sending them to logging, log staging attempts to the PR
itself, and only do normal logging of the operation as an indicative
item. And remove a bunch of `expect_log_errors` which don't stand
anymore.
While at it, fix a missed issue in forward porting: if the `root.head`
doesn't exist in the repo its `fetch` will immediately fail before
`cat-file` can even run, so the second one is redundant and the first
one needs to be handled properly. Do that. And leave checking
for *that* specific condition as a logged-out error as it should mean
there's something very odd with the repository (how can a pull request
have a head we can't fetch?)
Apparently I'd already fixed that in
286c1fdaee but it has yet to be
deployed.
While at it, add a feedback message to clarify that, unlike `r+`, `r-`
on forward ports does *not* propagate.
Fixes #912
The runbot colors can lack contrast in some cases, especially for
colorblind people.
This commit introduces a theme that should help a little.
Note that this was done without much analysis and should be tweaked in
the future.
The PR dashboard picture provides a great overview of the batch state
both horizontally and vertically *but* apparently people can't for the
life of them go check the actual dashboard when things don't line
up. So expand the "current batch" to a view more similar to the
dashboard *page*, which gives details of the sub-checks being
performed and whether they are or are not fulfilled.
Fixes #908
The previous version worked but was extremely plodding and
procedural. Initially I wanted to compose the table in a single pass
but that turns out not to really be possible as the goal for #908 is
to have a "drawer" for extended information about the current batch:
this means different cells of the same row can have different heights,
so we can't one-pass the image either vertically (later cells of the
current column might be wider) or horizontally (later cells of the
current row might be taller).
However what can be done is give the entire thing *structure*,
essentially defining a very cut down and ad-hoc layout system before
committing the layout to raster.
This also deduplicates formatting and labelling information which was
previously in both the computation first step and the rasterisation
second step.
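A very cut down sketch of the idea (assuming PIL-style fonts; the real
cells also carry the formatting and labelling information mentioned
above):

    from dataclasses import dataclass

    @dataclass
    class Cell:
        text: str

        def size(self, font):
            # measure without drawing anything
            l, t, r, b = font.getbbox(self.text)
            return r - l, b - t

    def layout(rows, font):
        # first pass: structure only, nothing is rasterised yet
        widths = [max(c.size(font)[0] for c in col) for col in zip(*rows)]
        heights = [max(c.size(font)[1] for c in row) for row in rows]
        # committing to raster happens in a second pass, using these
        return widths, heights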
See 66a1018918 (r144596695)
Looks like 1.21.5 is not available on some distributions;
1.22 should be available on most systems.
Note that the preferred solution would be to install matplotlib using
`apt install python3-matplotlib` as stated in the readme.
Extract the current table generation into a separate function, and add
another function to render an alert / list of PR targets if the batch
is not consistent.
This means an extra pass on the table contents to precompute the image
size, but we can delay loading fonts until after etag computation
which might be a bigger gain all things considered: there aren't many
cells in most PR tables, but fonts are rather expensive to
load (I should probably load them at import and cache them in the module...)
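For the parenthetical, module-level caching could look something like
this (a sketch assuming PIL-style font loading; the font name and size
are placeholders):

    import functools
    from PIL import ImageFont

    @functools.lru_cache(maxsize=None)
    def _font(size=16):
        # loaded on first use only, then reused for every cell of every image
        return ImageFont.truetype('DejaVuSans.ttf', size)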
A new version of numpy (2.0.0) was released in June 2024.
Since matplotlib requires a version of numpy >= 1.17, this new version
is installed instead of the 1.x one.
Pinning the numpy version to the one expected in Ubuntu Jammy should
solve the issue.
Failed to initialize database `65188118-17-0-runbot`.
Traceback (most recent call last):
File "/data/build/odoo/odoo/service/server.py", line 1313, in preload_registries
registry = Registry.new(dbname, update_module=update_module)
File "<decorator-gen-16>", line 2, in new
File "/data/build/odoo/odoo/tools/func.py", line 87, in locked
return func(inst, *args, **kwargs)
File "/data/build/odoo/odoo/modules/registry.py", line 113, in new
odoo.modules.load_modules(registry, force_demo, status, update_module)
File "/data/build/odoo/odoo/modules/loading.py", line 480, in load_modules
processed_modules += load_marked_modules(env, graph,
File "/data/build/odoo/odoo/modules/loading.py", line 364, in load_marked_modules
loaded, processed = load_module_graph(
File "/data/build/odoo/odoo/modules/loading.py", line 185, in load_module_graph
load_openerp_module(package.name)
File "/data/build/odoo/odoo/modules/module.py", line 395, in load_openerp_module
__import__(qualname)
File "/data/build/runbot/runbot/__init__.py", line 3, in <module>
from . import controllers
File "/data/build/runbot/runbot/controllers/__init__.py", line 5, in <module>
from . import badge
File "/data/build/runbot/runbot/controllers/badge.py", line 5, in <module>
from matplotlib.font_manager import FontProperties
File "/home/odoo/.local/lib/python3.10/site-packages/matplotlib/__init__.py", line 109, in <module>
from . import _api, _version, cbook, docstring, rcsetup
File "/home/odoo/.local/lib/python3.10/site-packages/matplotlib/rcsetup.py", line 27, in <module>
from matplotlib.colors import Colormap, is_color_like
File "/home/odoo/.local/lib/python3.10/site-packages/matplotlib/colors.py", line 56, in <module>
from matplotlib import _api, cbook, scale
File "/home/odoo/.local/lib/python3.10/site-packages/matplotlib/scale.py", line 23, in <module>
from matplotlib.ticker import (
File "/home/odoo/.local/lib/python3.10/site-packages/matplotlib/ticker.py", line 136, in <module>
from matplotlib import transforms as mtransforms
File "/home/odoo/.local/lib/python3.10/site-packages/matplotlib/transforms.py", line 46, in <module>
from matplotlib._path import (
ImportError: numpy.core.multiarray failed to import
- make sure inconsistent batches prevent merging
- don't take closed PRs into account for the consistency check (unless
all PRs are closed)
- add a wizard to split PRs out of a batch when the inconsistency is
legitimate
- notify that the batch is inconsistent on the PR dashboard
Fixes #911
- Update branch name to prefix with project as it can be hard to
differentiate when filtering by or trying to set targets, given some
targets are extremely common (e.g. `master`/`main`) and not all
fields are filtered by project (or even can be).
- Add a proper menu item and list view for batches, maybe it'll be of
use one day.
- Upgrade label in PR search, it's more likely to be needed than
author or target.
- Put PRs first in the mergebot menu, as it's *by far* the most likely
item to look for, unless it's staging in order to cancel one.
Finally went to fix the issue that web.base.url is not correctly set
up by the test suite and thus things like `PullRequest.url` don't
return the right URL when running tests.
The answer I re-discovered yet again is that one only has to log in as
an admin (`base.group_system`) while providing a `base_location` env
key.
This is done automatically when logging in via the frontend, under the
assumption that this is the "correct" url, but when logging in via RPC
it has to be done manually as that's more internal, and we might use
e.g. scripts on the same server to manipulate the instance.
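Concretely, over XML-RPC that looks something like this (the URL and
credentials are placeholders):

    import xmlrpc.client

    url = 'http://localhost:8069'
    common = xmlrpc.client.ServerProxy(f'{url}/xmlrpc/2/common')
    # authenticating as a member of base.group_system with a `base_location`
    # key makes the server store that value as the `web.base.url` parameter
    common.authenticate('some-test-db', 'admin', 'admin', {'base_location': url})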
Previously we would copy *only* the database itself, which is mostly
not an issue, but when trying to debug *pages* using a browser, that
browser tries to load various assets, which are looked up in the
database, then in the filestore, where they are missing.
This is not an issue in and of itself, but it triggers ugly tracebacks
in the logs which is not convenient, and since
2ab06ca96b it fails the test even if the
test would otherwise work.
From the previous version of `_compute_target` this was clearly
intended, otherwise the fallback makes no sense, but just as clearly I
completely missed / forgot about it halfway through (and the lack of a
test didn't help).
The compute is also way overcomplicated; it's not clear why (the only
explanation I can think of is that an intermediate version had a
string target, but if that ever happened it was squashed away).
This color was altered in 232aa271b0: it
was moved from a cyan-ish green to a yellow quite close to the warning
color.
There is no explanation why; the commit concerns itself with *PR*
dashboards but this class / color is only used on the main
dashboard. It may have been a victim of the color refactoring in that
commit and I fucked up.
This is very disagreeable as it shows up as basically a warning
between the end of staging and the PR actually getting merged. Roll
this change back to a green-cyan.
Previously PR descriptions were displayed as raw text in the PR
dashboard. While not wrong per se, this was pretty ugly and not always
convenient as e.g. links had to be copied by hand.
Push descriptions through pymarkdown to render them, with a few
customisations (see the sketch after the list):
- Enabled footnotes & tables & fenced code blocks because GFM has
those; this doesn't quite put pymarkdown's base behaviour on par with
gfm (and py-gfm ultimately gave up on that effort, moving to just
wrap github's own markdown renderer instead).
- Don't allow raw html because too much of a hassle to do it
correctly, and very few people ever do it (mostly me I think).
- Added a bespoke handler / renderer for github-style references.
Note: uses positional captures because it started that way and named
captures are not removed from that sequence so mixing and matching
is not very useful, plus python does not support identically named
groups (even exclusive) so all 4 repo captures and all 3 issue
number captures would need different names...
- And added a second bespoke handler for our own opw/issue references
leading to odoo.com, that's something we can't do via github[^1] so
it's a genuine value-add.
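The core of it is roughly the following (a sketch using
python-markdown; the bespoke reference extensions and the raw-html
stripping are elided):

    import markdown

    def render_description(text):
        return markdown.markdown(
            text,
            extensions=[
                'footnotes',   # GFM-ish features not enabled by default
                'tables',
                'fenced_code',
                # + the github-reference and opw/issue-reference extensions
            ],
        )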
Fixes #889
[^1]: github can do it (though possibly not with the arbitrary
unspecified nonsense I got when I tried to list some of the
reference styles, some folks need therapy), but it's not available
on our plan
I thought I'd removed the error message when approving an already
approved PR but apparently not?
However we can improve the message in that specific case, to make the
expected operation clearer.
Fixes #906
There is an issue in unidiff 0.7.3, fixed in 0.7.4
(a3faffc54e):
this version is not able to parse a diff with removed files.
Since the unidiff version packaged in noble is 0.7.3, patching it looks
like the easiest solution.
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... It also removes the need to
clean up "scratch" working copies, though that doesn't look to have
been an issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge];
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict markers directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in either. That
is a huge part of the gain over faffing about with the working copy.
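The heart of it, in git plumbing terms, is roughly this (a sketch; it
assumes a git recent enough for `merge-tree --write-tree` with
`--merge-base`, and the wrapper is illustrative):

    import subprocess

    def git(*args):
        return subprocess.run(
            ['git', *args], check=True, capture_output=True, text=True,
        ).stdout.strip()

    def pick(commit, onto):
        # 3-way merge of `commit` onto `onto`, using the commit's parent as
        # the merge base (merge-tree exits non-zero on conflicts, which
        # `check=True` surfaces as an exception)
        tree = git('merge-tree', '--write-tree',
                   f'--merge-base={commit}^', onto, commit)
        message = git('log', '-1', '--format=%B', commit)
        # create the commit object directly: no ref, no working copy
        return git('commit-tree', tree, '-p', onto, '-m', message)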
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
Automating via parameters is riskier as we can hit the CLI
limitations (cf 0a839a4857). Going
through stdin is a lot safer and cleaner when automating, and it's not
much of an imposition here.
After seeing it be used, I foresee confusion around the current
behaviour (where it sets the limit), as one would expect the `fw=`
flags to affect one another when it looks like that would make sense
e.g. no/default/skipci/skipmerge all specify how to forward port, so
`fw=default` not doing anything after you've said `fw=no` (possibly by
mistake) would be fucking weird.
Also since the author can set limits, allow them to reset the fw
policy to default (keep skipci for reviewers), and for @d-fence add a
`fw=disabled` alias.
Fixes #902