There is no check or unique index, so nothing actually prevents having
multiple fw tasks for the same batch.
In that case, if we try to update the limit, the handler will crash
because the code assumes a single forwardport task per batch.
Triggered by nle on odoo/odoo#201394 probably while trying to recover
from the incident of March 13/14 (which was not going to work as the
forwardport queue itself was hosed, see two commits above for the
reason).
Starting in 3.12 `TarFile.extract(all)` can take a `filter` parameter
which alters the behaviour of the extractor as it relates to the
application of tar metadata. Not passing this parameter is deprecated
unless the also new `TarFile.extraction_filter` attribute is set.
Now there are a few wrinkles here:
1. Both the parameter and the attribute were added in older patch
releases as well, e.g. 3.9.17, 3.10.12, and 3.11.4.
2. The attribute form will *not* resolve string filters, it needs the
callables.
As such, just to be on the safe side, set the attribute using a
`getattr`: in releases with the feature it gets set to a proper value
(the future default, which ignores most or all metadata from the
archive), and in releases without it the attribute is just set to
`None`, which should do no harm.
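A minimal sketch of the approach, assuming the module-level
`data_filter` callable is the value we want as the default (it exists
in the same releases that grew the feature):

```python
import tarfile

# `data_filter` only exists in releases which have extraction filters;
# in older releases getattr returns None and the attribute is unused.
_default_filter = getattr(tarfile, 'data_filter', None)
if _default_filter is not None:
    # staticmethod() keeps the module-level function from being bound
    # as a method when looked up on TarFile instances
    tarfile.TarFile.extraction_filter = staticmethod(_default_filter)
else:
    tarfile.TarFile.extraction_filter = None
```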
This was the root cause of the incident of Feb 13/14: because the
patcher pushed to the local branch before pushing to the remote,
failing to push to the remote would leave the local ref broken, as
`fetch("refs/heads/*:refs/heads/*")` apparently does not do non-ff
updates (which does make some sense, I guess).
So in this case a staging finished, was pushed to the remote, then git
delayed the read side just enough that when the patcher looked up the
target it got the old commit. It applied a patch on top of that, tried
to push, and got a failure (non-ff update), which left the local and
remote branches divergent, and caused any further update of the local
reference branches to fail, thus blocking every forward port.
Using symbolic branches during patching was completely dumb (and
updating the local branch unnecessary), so switch the entire thing to
using just commits, and update a bunch of error reporting while at it.
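A minimal sketch of the commit-only flow (helper names are
hypothetical): resolve the remote-tracking head, build the new commit
on top of it without any checkout, and push that commit straight to
the remote branch, so a rejected non-ff push leaves no local ref in a
broken state.

```python
import subprocess

def _git(repo_dir, *args):
    return subprocess.run(
        ['git', '-C', repo_dir, *args],
        capture_output=True, text=True, check=True,
    ).stdout.strip()

def push_patched_commit(repo_dir, remote, branch, build_commit):
    # resolve the remote-tracking ref rather than a local symbolic branch
    parent = _git(repo_dir, 'rev-parse', f'{remote}/{branch}')
    # build_commit is assumed to take a parent hash and return the new
    # commit hash (e.g. via mktree / commit-tree), no checkout involved
    new_commit = build_commit(parent)
    # push the commit itself; failure here leaves nothing broken locally
    _git(repo_dir, 'push', remote, f'{new_commit}:refs/heads/{branch}')
```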
This is only a temporary / partial fix (see #1100 for further issues)
but it solves the most obvious problem: a batch with a single PR gets
staged, the staging fails, the PR gets closed, and the batch
disappears from the staging.
The issue remains that stagings don't actually reflect the state as of
the staging, since they link to the updated version of batches rather
than the version as of staging. That probably can't be fixed if we
want to keep track of everything *but* for documentary and display
purposes the staging should store the state of batches and
PRs (e.g. as JSON) as of its creation.
Fixes #1079
The primary variables are also used in the backend, and they get
manipulated via SCSS, so setting the SCSS link-color to a `var()`
expression blows up the backend.
Much like the alerts and tables, bootstrap's dropdowns don't bother
using CSS variables, so they don't react correctly to the color
theme. Fix them up by updating their CSS rules to use the relevant
variables. They can't be manipulated via their SCSS variables because
there's a bunch of other pieces of code (e.g. odoo theming shit) which
wants to manipulate the dropdown colors in SCSS, so using CSS vars
would break them.
- Make the body color slightly darker for better contrast (makes it
lighter when inverted).
- Make backgrounds lighter in dark mode.
- Hard-code success as well as danger and update the latter to make it
lighter.
Ideally this should be computed in oklch but sadly libsass doesn't
support oklch computations, and there is no way to extract
individual channels in CSS so there's no way to generate the `-rgb`
variables.
- Hard-code the dark mode background to a dark gray instead of a black
(using light-gray in light mode looks weird so keep white).
- Fix text and link color in alerts, I clearly still don't understand
how CSS variables work and I don't care anymore, fuck'em.
Part of #1088
The bootstrap-review-frontend resets text colors in an undesirable
manner, override the mixin to remove that crap.
Also add a few comments in case this needs to be further tuned in the
future.
Part of #1088
Uses localStorage rather than the odoo backend cookie because
Odoo *requires* the cookie, it can't fall back on the system theme, and
we might want a different color theme on the frontend and backend
since the frontend dark theme is pretty half-assed.
Also load the file by hand, as adding it to `assets_frontend` causes a
flash of system color theme. It would probably be possible to create
an asset which is not deferred or lazy (but wouldn't be async either
which is a shame), but I can't be arsed.
Part of #1088
Also reduce the grace period for merged PR branches to 1 week (from
2), and go through the on-disk repository instead of the github API.
Technically it might even be possible to do this in bulk, though that
seems annoying in case of failure (e.g. because somebody else deleted
the branch previously).
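As a rough illustration of the git-side deletion (names and command
shape assumed, not the actual implementation): one push per branch,
so a branch somebody already deleted only fails its own push.

```python
import subprocess

def delete_remote_branch(repo_dir, remote, branch):
    # returns False instead of raising if e.g. the branch is already gone
    return subprocess.run(
        ['git', '-C', repo_dir, 'push', remote, '--delete', branch],
        capture_output=True, text=True,
    ).returncode == 0
```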
Fixes #1082
Requested by @Williambraecky to make checking over the entire batch
easier when checking if an upgrade exception can be removed.
Also add the info to the batch genealogy table, because why not.
Partial mitigation of #1065, not complete by any means. Avoiding
updating a local ref during staging is probably the most important bit
here, we're staging on commits (into a new ref) anyway so it should
not be needed, and that avoids conflicts between the staging cron and
e.g. the forwardport cron.
Since 94cf3e9647, FW is not limited to
reviewers. And it's been possible to set the fw policy on any forward
port pretty much since it was created (during commands rework /
formalisation).
However the help for the fw subcommands would only be shown on the
source PR, which was unnecessarily restrictive.
Express (re-express) a bunch of SCSS rules in terms of CSS variables,
and create a darkified version.
Based entirely on prefers-color-scheme, as apparently there's no
native in-document way to force that out and I can't be arsed to add
an entire override on a stick. If you want to toggle the page, use
toggley.
Not claiming the scheme looks any good tho, it's very dark indeed. But
it should limit the levels of flashbanging... until you open the
backend and die anyway.
Fix #1021
Because the only link between PRs and statuses is sharing a
repository (and kinda not even that for stagings), adding or removing
a status on a repository would try to recompute the statuses/state of
essentially every staging in the history of the project, and a
significant fraction of the PRs, leading to tens of thousands of
queries, minutes of computation, and even OOMs of the HTTP workers as
we'd load the PRs, their batches, and the stagings into memory to
update them, then load more things because of what depends on PR
statuses, etc...
But there is no reason to touch any of the closed or merged PRs, or
the completed (deactivated) stagings. And in fact there is every
reason *not* to.
Implementing a search-only m2m on each object in order to restrict the
set of PRs/stagings relevant to a change reduces the number of queries
to a few hundreds, the run time to seconds, and the memory increase to
unnoticeable. The change still takes some time because the RD project
currently has about 7000 open PRs, 4500 of which target
odoo/odoo (which is our test case here), but that is nothing compared
to the 164000 total PRs to odoo/odoo out of some 250000 PRs for the RD
project.
And though they're likely less of an issue as they don't recurse quite
as much, the >120000 stagings during the project's history are not to
be ignored, when the number of *active* stagings at any one time is
at most the number of branches, which is half a dozen to a dozen.
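In sketch form the restriction looks something like the following;
model and field names are made up, the point is just a non-stored m2m
whose `search` method only matches open PRs (the same idea applies to
active versus deactivated stagings):

```python
from odoo import fields, models

class Repository(models.Model):
    _inherit = 'runbot_merge.repository'

    open_pr_ids = fields.Many2many(
        'runbot_merge.pull_requests',
        compute='_compute_open_prs',
        search='_search_open_prs',
    )

    def _compute_open_prs(self):
        PR = self.env['runbot_merge.pull_requests']
        for repo in self:
            repo.open_pr_ids = PR.search([
                ('repository', '=', repo.id),
                ('state', 'not in', ('merged', 'closed')),
            ])

    def _search_open_prs(self, operator, value):
        # only open PRs can make a repository match through this field
        prs = self.env['runbot_merge.pull_requests'].search([
            ('id', operator, value),
            ('state', 'not in', ('merged', 'closed')),
        ])
        return [('id', 'in', prs.mapped('repository').ids)]
```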
For very basic numbers, as of committing this change, creating a
status in odoo/odoo (RD), optional on PRs and ignored on stagings,
on my current machine (7530U with 32GB RAM):
- without this change, 4835 queries, 37s of SQL, 65s of non-SQL, RSS
climbs to 2258128 (2.15GiB)
- with this change, 758 queries, 1.46s SQL, 2.25s non-SQL, RSS climbs
to 187088 (182MiB)
Fixes #1067
... by default. Optional statuses we haven't at least received a
`pending` for are irrelevant to the PR's state, so should remain
hidden.
This subtlety was missed when fixing #1062.

Fixes #1066
- Adjust background colors via CSS variables instead of manual CSS
rules.
- Reset table background rules to use "normal" background colors
instead of bootstrap's bespoke thing.
- Change bg-unmerged to interpolate between "info" and "success"
background colors (in LAB it seems to yield the colors we originally
had hardcoded).
- Partially soft-code the alter-primary bg color adjustment. The
saturation and lightness changes are kinda arbitrary but it's now
based on the primary at least...
Commits can take some time to propagate through the network (I guess),
or human error can lead to the wrong commit being set.
Either way, because the entire thing was done using a single fetch
with `check=True`, the cron job would fail entirely if any of the
patch commits was not yet available.
Update the updater to:
- fall back on individual fetches
- remove the patch from the set of applicable patches if we (still)
can't find its commit
I'd have hoped `fetch` could retrieve whatever it found, but
apparently the server just crashes out when it doesn't find the commit
we ask for, and `fetch` doesn't update anything.
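A sketch of the fallback, with assumed names: try fetching all the
patch commits at once, and if the server errors out fetch them one by
one, dropping the patches whose commits still can't be retrieved
(they will be picked up again on a later run).

```python
import subprocess

def fetch_patch_commits(repo_dir, remote, patches):
    def fetch(*commits):
        return subprocess.run(
            ['git', '-C', repo_dir, 'fetch', remote, *commits],
            capture_output=True, text=True,
        ).returncode == 0

    # happy path: a single fetch for every patch commit
    if fetch(*[p.commit for p in patches]):
        return patches
    # otherwise keep only the patches whose commit we can actually get
    return [p for p in patches if fetch(p.commit)]
```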
No linked issue because I apparently forgot to jot it down (and only
remembered about this issue with the #1063 patching issue) but this
was reported by mat last week (2025-02-21) when they were wondering
why one of their patches was taking a while:
- At 0832 patch was created by automated script.
- At 0947, an attempt to apply was made, the commit was not found.
- At 1126, a second attempt was made but another patch had been
created whose commit was not found, failing both.
- At 1255, there was a concurrency error ("cannot lock ref" on the
target branch).
- Finally at 1427 the patch was applied.
All in all it took 6 hours to apply the patch, which is 3-4 staging
cycles.
It's shorter, it's simpler (kinda), and it's 70% faster (although
that's unlikely to be any sort of bottleneck given applying patches
involves multiple network roundtrips).
Verify that the tree is different before and after applying the
patch, otherwise if there's a mistake made (e.g. a script does not
check that its patches have content and requests a patch applying an
existing commit, which is how
odoo/enterprise#612a9cf3cadba64e4b18d535ca0ac7e3f4429a08 occurred) we
end up with a completely empty commit and a duplicated commit message.
Fixes #1063
Note: using `rev-parse` to retrieve the commit's tree would be 50%
faster, but we're talking 3.2 versus 2.4ms and it requires string
formatting instead of nice clean arguments so it's a bit meh.
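A hedged sketch of the check (helper and variable names assumed):
retrieve the commit's tree before and after applying the patch, and
refuse to go ahead if they are identical.

```python
import subprocess

def commit_tree(repo_dir, rev):
    # `show --pretty=%T` prints the commit's tree hash; `rev-parse`
    # on f"{rev}^{{tree}}" would be marginally faster but needs the
    # string formatting mentioned above
    return subprocess.run(
        ['git', '-C', repo_dir, 'show', '--no-patch', '--pretty=%T', rev],
        capture_output=True, text=True, check=True,
    ).stdout.strip()

before = commit_tree('.', 'HEAD')
# ... apply the patch here, producing a new commit at HEAD ...
after = commit_tree('.', 'HEAD')
if before == after:
    raise ValueError("patch has no effect, refusing to create an empty commit")
```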
If a status is defined as `optional`, then the PR is considered valid
if the status is never sent, but *if* the status is sent then it
becomes required.
Note that as ever this is a per-commit requirement, so it's mostly
useful for conditional statuses.
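Purely as an illustration of the rule (the real per-commit handling
differs), for a single commit:

```python
def commit_ok(required, optional, received):
    """`received` maps status context -> last state seen for the commit."""
    # required contexts must be green no matter what
    if any(received.get(c) != 'success' for c in required):
        return False
    # optional contexts only count once something has been received for them
    return all(received[c] == 'success' for c in optional if c in received)
```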
Fixes #1062
Mostly for tests: it can be really difficult to correlate issues as
there are 3 different processes involved (the test runner, the odoo
being tested, and dummy-central (as github)) and the intermixing of
logs between the test driver and the odoo being tested is
not *amazing*.
- `pytest-opentelemetry`'s `--export-traces` is the signal for running
tests with tracing enabled, that way just having
`pytest-opentelemetry` installed does not do anything untoward.
- Until chrisguidry/pytest-opentelemetry#34 is merged, should probably
use the linked branch as the default / base mode of having a single
trace for the entire test suite is not great when there are 460
tests, especially as local clients (opentelemetry-desktop-viewer,
venator) mostly work on traces and aren't very good at zooming on
*spans* at least currently.
- Additionally, the conftest plugin adds hooks for propagating through
the xmlrpc client (communications with odoo) and enables the
requests instrumentor from the opentelemetry python contribs.
- The dummy `saas_worker` was moved into the filesystem, that makes it
easier to review and update.
- A second such module was added for the otel instrumentation *of the
odoo under test*, that instruments psycopg2, requests, wsgi, and the
server side of xmlrpc.
- The git ops were instrumented directly in runbot_merge, as I've
tried to keep `saas_tracing` relatively generic, in case it could be
moved to community or used by internal eventually.
Some typing was added to the conftest hooks and fixtures, and they
were migrated from tmpdir (py.path) to tmp_path (pathlib) for
consistency, even though technically the `mkdir` of pathlib is an
annoying downgrade.
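As a rough sketch of the conftest-side wiring (assuming
pytest-opentelemetry is installed, which is what provides
`--export-traces`; the rest is illustrative):

```python
from opentelemetry.instrumentation.requests import RequestsInstrumentor

def pytest_configure(config):
    # only instrument when tracing was explicitly requested, so merely
    # having the packages installed doesn't change anything
    if config.getoption('--export-traces', default=False):
        RequestsInstrumentor().instrument()
```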
Fixes #835
"Run manually" is a bit meh, as it runs the cron synchronously (so you
have to wait for it, and hope it doesn't run for longer than the
request timeout which may be smaller than the cron timeout) and it can
run in a subtly different environment than normal, which can lead to
different behaviour.
Instead add a button to enqueue a cron trigger, now that they exist
that's much closer to what we actually want, and it does run the cron
in a normal / expected environment.
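A hedged sketch of the button (method name assumed), relying on the
standard `_trigger` helper which creates an `ir.cron.trigger` for the
scheduler to pick up:

```python
from odoo import models

class IrCron(models.Model):
    _inherit = 'ir.cron'

    def action_enqueue_trigger(self):
        # unlike method_direct_trigger, this does not run the cron in
        # the current request: the cron worker runs it as usual
        self._trigger()
```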
That's a bit of a weird one: apparently the boolean_toggle widget has
an `autosave` option which should be `true` by default, saving the
row as soon as the toggle is toggled[^1]. But in 15.0 and 18.0 it
seems to have no effect, the `boolean_toggle` always just stores the
change in the parent form and that gets committed on save.
In 16.0 and 17.0 however it does have an effect, toggling the control
will immediately save its value *without going through the parent
form*, resulting in the override to `Project.write` managing
new/existing branches to not be called, thus not calling
`Project_followup_prs`, and ultimately not creating the followup
forward ports.
After contacting AAB to get more info (and grepping a bit):
- autosave was added (enabled by default) in 16.0 after the owl
rewrite (odoo/odoo@28e6b7eb83)
- toggle was added in 17.0
(odoo/odoo@a449b05221)
- feature was removed in 18.0
(odoo/odoo@6bd2c1fdfb)
Which explains why the issue occurs in 16.0 and 17.0, and not in 15.0
or 18.0.
Fixes #1051
[^1]: but apparently not going through the parent form...
This is a bit of an odd case which was only noticed because of
persistent forwardport.batches, which ended up having a ton of related
traceback in the logs (the mergebot kept trying to create forward
ports from Jan 27th to Feb 10th, thankfully the errors happened in git
so did not seem to eat through our API rate limiting).
The issue was triggered by the addition of odoo/enterprise#77876 to
odoo/odoo#194818. This triggered a completion job which led to the
creation of odoo/enterprise#77877 to odoo/enterprise#77880, so far so
good.
Except the bit of code responsible for creating completion jobs only
checked if the PR was being added to a batch with a descendant. That
is the case of odoo/enterprise#77877 to odoo/enterprise#77879 (not
odoo/enterprise#77880 because that's the end of the line). As a
result, those triggered 3 more completion jobs, which kept failing in
a loop because they tried pushing different commits to their
next-siblings (without forcing, leading git to reject the non-ff push,
hurray).
A completion request should only be triggered by the addition of a
new *source* (a PR without a source) to an existing batch with
descendants, so add that to the check. This requires updating
`_from_json` to create PRs in a single step (rather than one step to
create based on github's data, and another one for the hierarchical
tracking) as we need the source to be set during `create` not as a
post-action.
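In sketch form the added guard looks something like this (field
names assumed):

```python
# only a new *source* PR (no source of its own) added to a batch which
# already has descendants should enqueue a completion job
if not pr.source_id and batch.descendant_ids:
    self.env['forwardport.batches'].create({
        'batch_id': batch.id,
        'source': 'complete',  # illustrative value
    })
```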
Although there was a test which could have triggered this issue, the
test only had 3 branches so was not long enough to trigger the issue:
- Initial PR 1 on branch A merged then forward-ported to B and C.
- Sibling PR 2 added to the batch in B.
- Completed to C.
- Ending there as C(1) has no descendant batch, leading to no further
completion request.
Adding a 4th branch did surface / show the issue by providing space
for a new completion request from the creation of C(2). Interestingly
even though the test harness attempts to run all triggered crons to
completion there can be misses, so the test can fail in two different
ways (being now checked for):
- there's a leftover forwardport.batch after we've created all our
forwardports
- there's an extra PR targeting D, descending from C(2)
- in almost every case there's also a traceback in the logs, which
does fail the build thanks to the `env` fixture's check
For historical reasons pretty much all tests used to use the contexts
legal/cla and ci/runbot. While there are a few tests where we need the
interactions of multiple contexts and that makes sense, on the vast
majority of tests that's just extra traffic and noise in the
test (from needing to send multiple statuses unnecessarily).
In fact on the average PR where everything passes by default we could
even remove the required statuses entirely...
Should limit the risk of cases where the fork contains outdated
versions of the reference branches and we end up with odd outdated /
not up to date basis for branches & updates, which can lead to
confusing situations.
Skipmerge creates forward-ports before the source PR is even merged.
- In a break from the norm, skipmerge will create forwardports even in
the face of conflicts.
- It will also not *detach* pull requests in case of conflicts, this
is so the source PR can be updated and the update correctly cascades
through the stack (likewise for any intermediate PR though *that*
will detach as usual).
Note that this doesn't really look at outstandings, especially as they
were recently updated, so it might need to be fixed up in case of
freakout, but I feel like that should not be too much of an issue, the
authors will just get their FW reminders earlier than usual. If that's
a hassle we can always update the reminder job to ignore forward ports
whose source is not merged I guess.
Fixes #418
Not entirely sure about just allowing any PR author to set the merge
method as it gives them a lot more control over commits (via the PR
message), and I'm uncertain about the ramifications of doing that.
However if the author of the PR is classified as an
employee (via a provisioned user linked to the github account) then
allow it. Should not be prone to abuse at least.
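A hedged sketch of the check (field names assumed): the author counts
as an employee when the partner behind the github login is linked to
a provisioned internal user.

```python
author = pr.author  # res.partner resolved from the github login
is_employee = any(u.has_group('base.group_user') for u in author.user_ids)
if is_employee:
    pr.merge_method = requested_method
```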
Fixes #1036
All `pull_request` events seem to provide the `commits` count
property. As such we can use them all to check the `squash` state even
if we don't otherwise care for the event.
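Minimal illustration (payload access assumed): since the count is in
every payload, the flag can be refreshed whatever the event's action
is.

```python
pr.squash = event['pull_request']['commits'] == 1
```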
Also split out notifying an approved pull request about its missing
merge method into a separate cron from the one notifying a PR that its
siblings are ready.
Fixes #1036
The forward port process adds a uniquifier to the branch name as it's
possible to reuse branch names for different sets of PRs (and some
individuals do that systematically) which can cause collisions in the
fw branches.
Originally this uniquifier was random by necessity, however now that
batches are reified and forward porting is performed batch-wise we
don't need the randomness: we can just use the source batch's id, as
it's unique per sequence of forward ports. This means it'll eventually be
possible for an external system to retrieve the source(s) of a forward
port by reading the batch object, and it's also possible to correlate
forward ports through the batch id (although not possible to find the
source set without access to the mergebot's information).
Do the same thing for backports, because why not.
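Purely illustrative of the naming (the exact format isn't spelled out
here): the random token is simply replaced by the source batch's id
as the stable uniquifier.

```python
def fw_branch_name(target_branch: str, original_ref: str, source_batch_id: int) -> str:
    return f'{target_branch}-{original_ref}-{source_batch_id}-fw'
```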
- Update `get_reviewers` to return all active users, as that makes
more sense for the server side operations; this is now redundant
with `list_users`, so eventually internals should be fixed and the
former removed.
- Update `remove_users` to deactivate the user as well as move it to
portal: since the mergebot doesn't use portal, keeping enabled portal
users doesn't make much sense.
- Reorganise `provision_user` to make the update path more obvious.
Fixes #968
pycharm is a bit dumb so it considers `[ ]` to be unnecessary even in
`re.VERBOSE` regexes. But it has a bit of a point in that it's not
super clear, so inject the space characters by (unicode) name, the
behaviour should be the exact same but the regexes should be slightly
less opaque.
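For illustration (the pattern itself is made up): in a `re.VERBOSE`
pattern unescaped whitespace is ignored, so a literal space has to be
written some other way, and `\N{SPACE}` spells out the intent while
matching the exact same thing.

```python
import re

pattern = re.compile(r"""
    close \N{SPACE} \#? (?P<number>\d+)
""", re.VERBOSE)
assert pattern.match("close #1234")
```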
Link stagings to the staging commits instead of the commits directly
in order to have the repository name and not need to try the commit on
every repository.
Also make the trees editable; in combination with `readonly="1"` this
does not actually make them editable, but it does prevent clicking on
a row from opening a dialog with the linked objects, which is
convenient if one wants to copy a commit hash.
Having the staged and merged heads via an easy to reach endpoint is
nice, but having *just* the hashes is not as it requires basically
exhaustively checking commits against repos to see which is
where. That's annoying.
Make the head serializations a map of repository to commit
hash. That's a lot more convenient.
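Hypothetical shapes, purely to illustrate the change in the
serialization:

```python
# before: bare hashes, no indication of which repository they belong to
heads_before = ["0123abc", "4567def"]
# after: repository name -> commit hash, no cross-checking needed
heads_after = {
    "odoo/odoo": "0123abc",
    "odoo/enterprise": "4567def",
}
```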