Adds a very limited ability to try and look for false positive /
non-determinstic staging errors. It tries to err on the side of
limiting false false positives, so it's likely to miss many.
Currently has no automation / reporting, just sets a flag on the
stagings which are strongly believed to have failed due to false
positives.
While at it, add link between a "root" staging and its splits. It's
necessary to clear the "false positive" flag, and surfacing it in the
UI could be useful one day.
Fixes#660
Requires parsing the commit messages as github plain doesn't collate
the info from there, but also the descriptions: apparently github only
adds those to the references if the PR targets the default
branch. That's not a limitation we want.
Meaning at the end of the day, the only thing we're calling graphql
for is explicit manual linking, which can't be tested without
extending the github API (and then would only be testable on DC), and
which I'm not sure anyone bothers with...
Limitations
-----------
- Links are retrieved on staging (so if one is added later it won't be
taken in account).
- Only repository-local links are handled, not cross-repository.
Fixes#777
In that case, ignore the reopen, close the PR, and tell the idiot to
fuck off.
Also the case where a PR is reopened while its batch is staged was
already handled, but probably not tested: it was implicit in
forcefully updating the HEAD of the PR, which triggers an unstage
since c8a06601a7.
Now that scenario is tested, which should lower the odds of breaking
it in the future.
Fixes#965
The `to_pr` helper was added a *while* ago to replace the pretty
verbose process of looking a PR with the right number in the right
repository after a `make_pr`. However while I did some ad-hoc
replacement of existing `search` calls as I had to work with existing
tests I never did a full search and replace to try and excise searches
from the test suite.
This is now done. I've undoubtedly missed a few, but a hundred-odd
lines of codes have been simplified.
- don't *fail* in `_compute_identity`, it causes issues when the token
is valid but doesn't have `user:email` access as the request is
aborted and saving doesn't work
- make `github_name` and `github_email` required rather than ad-hoc
requiring them in `_compute_identity` (which doesn't work correctly)
- force copy of `github_name` and `github_email`, with o2ms being
!copy this means duplicating projects now works out of the box (or
should...)
Currently errors in `_compute_identity` are reported via logging which
is not great as it's not UI visible, should probably get moved to
chatter eventually but that's not currently enabled on projects.
Fixes#990
- Apparently if a user is on windows the ACE editor can swap out their
line end from unix to windows. The patch parsers were predicated
upon all patches being in unix mode (because git, and diff).
Fixup both parsers to convert windows-style line end to unix before
trying to parse the patch data. Also add a few fallbacks to limit
the odds of an unhelpful `StopIteration` (though that might hide
errors more than reveal them...)
- Make sure we support `format-patch --no-signature`, just requires
using the correct partition direction: I assume I used `rpartition`
as a form of micro-optimisation *but*
- If the separator is not found the "patch body" ends up in the
third parameter rather than the first, which makes the fallback
difficult.
- There doesn't seem to be anything preventing *multiple* signature
separators in a message, and logically the first one should hold
and the rest is all part of the signature.
As a result, for both reasons we need to look *forwards* for the
signature separator, not backwards. Hence `str.partition`.
Fixes#992
`get_content` round-trips the text part through `ascii` with
`error=replace`, so if the input is not ascii it screws up
tremendously, which leads to either failing to apply patches (the more
likely situation) or corrupting the patches.
`get_payload`, if called without `decode`, pretty much just returns
the payload unless it needs a decoding pass (e.g. because it contains
raw surrogates, but that should not be an issue for us). So this is
really what we want.
While at it, increase `patch`'s verbosity in case it can give us more
info.
If a comment causes an unknown PR to be fetched, it's a bit odd to
ping the author (and possibly reviewer) anyway as they're not super
concerned (and technically we could be ignoring the purported /
attempted reviewer).
So if a fetch job was created because of a comment, remember the
comment author and ping *them* instead of using the default ping
policy.
Fixes#981
Turns out you can configure format-patch with `--no-prefix` and some
people (*cough cough* mat) have that in their standard setup, so the
assumption of needing to strip 1 level of prefix does not necessarily
hold.
Also fix a few more issues:
- some people (*cough cough* still mat) also use `-n` by default,
which adds the series sequence (`n/m`) even for a single patch,
handle that correctly
- logging patch application errors is pretty useful when patching
fails and I'm trying to get the information via logs, do that
- especially when I decide to add error messages to tracking *but
forgot to show the chatter by default*, fix that as well
The commit-based patcher worked first try, and patch-based would have
worked too if not for those meddling kids. In the future it might be a
good idea to reify the stripping level (`-p`) on the patch object
though, and maybe provide a computed preview of the list of files to
patch, so issues are easier for the operator to diagnose.
Missed it during the previous pass, probably because it's in the
middle of `pull_requests.py`. It's a classic template for triggered
crons since the model is just a queue of actions for the cron.
Because of the false negatives due to github's reordering of events on
retargeting, blocking merge methods can be rather frustrating or the
user as what's happening and how to solve it isn't clear in that case.
Keep the warnings and all, but remove the blocking factor: if a PR
doesn't have a merge method and is not single-commit, just skip it on
staging. This way, PRs which are actually single-commit will stage
fine even if the mergebot thinks they shouldn't be.
Fixes#957
Unstaged changes can be useful or necessary for some tasks
e.g. absolute emergency (where even faking the state of a staging is
not really desirable, if that's even possible anymore), or changes
which are so broad they're difficult to stage (e.g. t10s updates).
Add a new object which serves as a queue for patch to direct-apply,
with support for either text patches (udiff style out of git show or
format-patch) or commits to cherry-pick. In the former case, the part
of the show / format-patch before the diff itself is used for the
commit metadata (author, committer, dates, message) whereas for the
commit version the commit itself is reused as-is.
Applied patches are simply disabled for traceability.
Fixes#926
Add warnings when trying to send comments / commands to PRs targeting
inactive branches.
This was missing leading to confusion, as one warning is clearly not
enough.
Fixes#941
By updating the staging timeout every time we run `_compute_state` and
still have a `pending` status, we'd actually bump the timeout *on
every success CI* except for the last one. Which was never the
intention and can add an hour or two to the mergebot-side timeout.
Instead, add an `updated_at` attribute to statuses (name taken from
the webhook payload even though we don't use that), read it when we
see `pending` required statuses, and update the staging timeout based
on that if necessary.
That way as long as we don't get *new* pending statuses, the timeout
doesn't get updated.
Fixes#952
- Switch to just `default` ci.
- Autouse fixture to init the master branch.
- Switch to `make_commits`.
- Merge `test_reopen_update` and `test_update_closed_revalidate` into
`test_update_closed`: the former did basically nothing useful and
the latter could easily be folded into `test_update_closed` by just
validating the new commit.
And unconditionally unstage when the HEAD of a PR is synchronised.
While a rebuild on a PR which was previously staged can be a false
positive (e.g. because it hit a non-derministic test the second time
around) it can also be legitimate e.g. auto-rebase of an old PR to
check it out. In that case the PR should be unstaged.
Furthermore, as soon as the PR gets rebuilt it goes back into
`approved` (because the status goes to pending and currently there's
no great way to suppress that in the rebuild case without also fucking
it up for the sync case). This would cause a sync on the PR to be
missed (in that it would not unstage the PR), which is broken. Fix
that by not checking the state of the PR beforehand, it seems to be an
unnecessary remnant of older code, and not really an optimisation (or
at least one likely not worth bothering with, especially as we then
proceed to perform a full PR validation anyway).
Fixes#950
The UX around the split of limit and forward port policy (and
especially moving "don't forward port" to the policy) was not really
considered and caused confusion for both me and devs: after having
disabled forward porting, updating the limit would not restore it, but
there would be no indication of such an issue.
This caused odoo/enterprise#68916 to not be forward ported at merge
(despite looking for all the world like it should be), and while
updating the limit post-merge did force a forward-port that
inconsistency was just as jarring (also not helped by being unable to
create an fw batch via the backend UI because reasons, see
10ca096d86).
Fix this along most axis:
- Notify the user and reset the fw policy if the limit is updated
while `fw=no`.
- Trigger a forward port if the fw policy is updated (from `no`) on a
merged PR, currently only sources.
- Add check & warning to limit update process so it does *not* force a
port (though maybe it should under the assumption that we're
updating the limit anyway? We'll see).
Fixes#953
- rather than enumerate states, forward-porting should just check if
the statuses are successful on a PR
- for the same consistency reasons explained in
f97502e503, `skipchecks` should force
the status of a PR to `success`: it very odd that a PR would be
ready without being validated...
Broken (can't run odoo at all):
- In Odoo 17.0, the `pre_init_hook` takes an env, not a cursor, update
`_check_citext`.
- Odoo 17.0 rejects `@attrs` and doesn't say where they are or how to
update them, fun, hunt down `attrs={'invisible': ...` and try to fix
them.
- Odoo 17.0 warns on non-multi creates, update them, most were very
reasonable, one very wasn't.
Test failures:
- Odoo 17.0 deprecates `name_get` and doesn't use it as a *source*
anymore, replace overrides by overrides to `_compute_display_name`.
- Multiple tracking changes:
- `_track_set_author` takes a `Partner` not an id.
- `_message_compute_author` still requires overriding in order to
handle record creation, which in standard doesn't support author
overriding.
- `mail.tracking.value.field_type` has been removed, the field type
now needs to be retrieved from the `field_id`.
- Some tracking ordering have changed and require adjusting a few
tests.
Also added a few flushes before SQL queries which are not (obviously
at least) at the start of a cron or controller, no test failure
observed but better safe than sorry (probably).
Breakages:
- the entire http.py API was updated requiring fixing the uses of
`request.jsonrequest` and the patches to `WebRequest` to hook in
sentry
- `fontawesome` was moved
- `*[@groups]` are now completely removed from the view if not
matching, so any field inside of them which needs to be used outside
(e.g. attrs) has to be added as invisible outside the element
- discuss removed the mail tracking value helpers from RPC in
odoo/odoo#88547, so reimplement locally (and better)
`test_inconsistent_target` was, appropriately, inconsistent, but would
only fail a very small fraction of the time: the issue is that a PR
would switch target between `other` and `master` assuming neither was
an intrinsic blocker *but* the branches are created independently,
just with the same content.
This means if a second ticked over between the creation of the
`master` branch's commit and that of `other`, they would get different
commit hashes (because different timestamp), thus the PR would get 2
commits (or complete nonsense) when targeted to `other`, and the PR
itself would be blocked for lack of a merge method.
The solution is to be slightly less lazy, and create `other` from
`master` instead of copy/pasting the `make_commits` directive. This
means the PR has the exact same number of commits whether targeted to
`master` or `other`, and we now test what we want to test 60 seconds
out of every minute.
With the trigger-ification pretty much complete the only cron that's
still routinely triggered explicitly is the cross-pr check, and it's
that in all modules, so there's no cause to keep an overridable
fixture.
The staging cron turns out to be pretty reasonable to trigger, as we
already have a handler on the transition of a batch to `not blocked`,
which is exactly when we want to create a staging (that and the
completion of the previous staging).
The batch transition is in a compute which is not awesome, but on the
flip side we also cancel active stagings in that exact scenario (if it
applies), so that matches.
The only real finesse is that one of the tests wants to observe the
instant between the end of a staging (and creation of splits) and the
start of the next one, which because the staging cron is triggered by
the failure of the previous staging is now "atomic", requiring
disabling the staging cron, which means the trigger is skipped
entirely. So this requires triggering the staging cron by hand.
The merge cron is the one in charge of checking staging state and
either integrating the staging into the reference branch (if
successful) or cancelling the staging (if failed).
The most obvious trigger for the merge cron is a change in staging
state from the states computation (transition from pending to either
success or failure). Explicitly cancelling / failing a staging marks
it as inactive so the merge cron isn't actually needed.
However an other major trigger is *timeout*, which doesn't have a
trivial signal. Instead, it needs to be hooked on the `timeout_limit`,
and has to be re-triggered at every update to the `timeout_limit`,
which in normal operations is mostly from "pending" statuses bumping
the timeout limit. In that case, `_trigger` to the `timeout_limit` as
that's where / when we expect a status change.
Worst case scenario with this is we have parasitic wakeups of this
cron, but having half a dozen wakeups unnecessary wakeups in an hour
is still probably better than having a wakeup every minute.
These are pretty simple to convert as they are straightforward: an
item is added to a work queue (table), then a cron regularly scans
through the table executing the items and deleting them.
That means the cron trigger can just be added on `create` and things
should work out fine.
There's just two wrinkles in the port_forward cron:
- It can be requeued in the future, so needs a conditional trigger-ing
in `write`.
- It is disabled during freeze (maybe something to change), as a
result triggers don't enqueue at all, so we need to immediately
trigger after freeze to force the cron re-enabling it.
Mergebot / forwardport crons need to run in a specific ordering in
order to flow into one another correctly. The default ordering being
unspecified, it was not possible to use the normal cron
runner (instead of the external driver running crons in sequence one
at a time). This can be fixed by setting *sequences* on crons, as the
cron runner (`_process_jobs`) will use that order to acquire and run
crons.
Also override `_process_jobs` however: the built-in cron runner
fetches a static list of ready crons, then runs that.
This is fine for normal situation where the cron runner runs in a loop
anyway but it's any issue for the tests, as we expect that cron A can
trigger cron B, and we want cron B to run *right now* even if it
hadn't been triggered before cron A ran.
We can replace `_process_job` with a cut down version which does
that (cut down because we don't need most of the error handling /
resilience, there's no concurrent workers, there's no module being
installed, versions must match, ...). This allows e.g. the cron
propagating commit statuses to trigger the staging cron, and both will
run within the same `run_crons` session.
Something I didn't touch is that `_process_jobs` internally creates
completely new environments so there is no way to pass context into
the cron jobs anymore (whereas it works for `method_direct_trigger`),
this means the context values have to be shunted elsewhere for that
purpose which is gross. But even though I'm replacing `_process_jobs`,
this seems a bit too much of a change in cron execution semantics. So
left it out.
While at it tho, silence the spammy `py.warnings` stuff I can't do
much about.
Merge errors are logical failures, not technical, it doesn't make
sense to log them out because there's nothing to be done technically,
a PR having consistency issues or a conflict is "normal". As such
those messages are completely useless and just take unnecessary space
in the logs, making their use more difficult.
Instead of sending them to logging, log staging attempts to the PR
itself, and only do normal logging of the operation as an indicative
item. And remove a bunch of `expect_log_errors` which don't stand
anymore.
While at it, fix a missed issue in forward porting: if the `root.head`
doesn't exist in the repo its `fetch` will immediately fail before
`cat-file` can even run, so the second one is redundant and the first
one needs to be handled properly. Do that. And leave checking
for *that* specific condition as a logged-out error as it should mean
there's something very odd with the repository (how can a pull request
have a head we can't fetch?)
From the previous version of `_compute_target` this was clearly
intended otherwise the fallback makes no sense, but just as clearly I
completely missed / forgot about it halfway through (and the lack of
test didn't help).
The compute is also way overcomplicated, it's not clear why (the only
explanation I can think of is that an intermediate version had a
string target but if that ever happened it was squashed away).
- Update branch name to prefix with project as it can be hard to
differentiate when filtering by or trying to set targets, given some
targets are extremely common (e.g. `master`/`main`) and not all
fields are filtered by project (or even can be).
- Add a proper menu item and list view for batches, maybe it'll be of
use one day.
- Upgrade label in PR search, it's more likely to be needed than
author or target.
- Put PRs first in the mergebot menu, as it's *by far* the most likely
item to look for, unless it's staging in order to cancel one.
Previously PR descriptions were displayed as raw text in the PR
dashboard. While not wrong per se, this was pretty ugly and not always
convenient as e.g. links had to be copied by hand.
Push descriptions through pymarkdown for rendering them, with a few
customisations:
- Enabled footnotes & tables & fenced code blocks because GFM has
that, this doesn't quite put pymarkdown's base behaviour on par with
gfm (and py-gfm ultimately gave up on that effort moving to just
wrap github's own markdown renderer instead).
- Don't allow raw html because too much of a hassle to do it
correctly, and very few people ever do it (mostly me I think).
- Added a bespoke handler / renderer for github-style references.
Note: uses positional captures because it started that way and named
captures are not removed from that sequence so mixing and matching
is not very useful, plus python does not support identically named
groups (even exclusive) so all 4 repo captures and all 3 issue
number captures would need different names...
- And added a second bespoke handler for our own opw/issue references
leading to odoo.com, that's something we can't do via github[^1] so
it's a genuine value-add.
Fixes#889
[^1]: github can do it (though possibly not with the arbitrary
unspecified nonsense I got when I tried to list some of the
reference styles, some folks need therapy), but it's not available
on our plan
I thought I'd removed the error message when approving an already
approved PR but apparently not?
However we can improve the message in that specific case, to make the
expected operation clearer.
Fixes#906
d4fa1fd353 added tracking to changes
from *comments* (as well as a few hacks around authorship transfer),
however it missed two things:
First, it set the `change-author` during comments handling only, so
changes from the `PullRequest` hook e.g. open, synchronise, close,
edit, don't get attributed to their actual source, and instead just
fall back to uid(1). This is easy enough to fix as the `sender` is
always provided, that can be resolved to a partner which is then set
as the author of whatever changes happen.
Second, I actually missed one of the message hooks: there's both
`_message_log` and `_message_log_batch` and they don't call one
another, so both have to be overridden in order for tracking to be
consistent. In this case, specifically, the *creation* of a tracked
object goes through `_message_log_batch` (since that's a very generic
message and so works on every tracked object created during the
transaction... even though batch has a message per record anyway...)
while *updates* go through `_message_log`.
Fixes#895
- When a redundant approval is sent to a PR, notify but don't ignore
the entire command set, there's no actual risk.
- Indicate that the entire comment was ignored when finding something
which does not parse.
Fixes#892, fixes#893
If a PR gets approved *then* fails CI, there should be a notification
warning the author & reviewer since
48e08b657b, it even has a test, which
passes (in fact it has *two*, one of which is redundant, so merge
`test_ci_failure_after_review` into the later `test_ci_approved`).
*However* this is in runbot_merge, turns out in
fafa7ef437 some refactoring was done in
order to override the notification and customise it for *forward
ports* with a failed status... except that override never called its
`super()`, so as soon as forwardport is installed the base
notification stops working, and that's been that since October
2019 (had been added in March that year, ignoring deployment lag).
This can be revealed by adding the corresponding check in the
*forwardport* tests, revealing the failure.
This was a pain to track down, thankfully it reproduced relatively
easily locally.
While this could be resolved in the override, might as well fold it
into the base method in furtherance of #789: the mergebot is only
used by odoo, and only with both modules combined, so splitting them
is not useful. And furthermore it things should work fine with the
forwardport installed but unused.
Fixes#894
- Instead of warning about the merge method on ready PRs, also warn on
*approved* (but exclude staged just cuz), as that's really when the
user wants to know that they forgot to set the merge method
- The cron only triggers hourly, but *if* a user approves a PR *and*
the merge method is not set yet, chances are good they'll need a
reminder (if they `r+ rebase-merge` or w/e the cron will just ignore
the PR and it's no skin off our back), so `_trigger` the cron for
validation.
- Also do the same when skipchecks is set as it's very similar.
In reality we might want to hook off of the state transitioning to
reviewed but I'm not sure there's good ways to do that (triggering a
cron inside a compute doesn't seem like a good idea).
Update a pair of tests which would approve a multi-commit PR without
setting a merge method, just because the helper they use to build the
PR happens to create multiple commits.
Fix#891
Seems like a good idea to better keep track of the log of an Odoo used
to testing, and avoid silently ignoring logged errors.
- intercept odoo's stderr via a pipe, that way we can still write it
back out and pytest is able to read & buffer it, pytest's capfd
would not work correctly: it breaks output capturing (and printing
on failure); and because of the way it hooks in it's unable to
capture from subprocesses inheriting the standard stream, cf
pytest-dev/pytest#4428
- update the env fixture to check that the odoo log doesn't have any
exception on failure
- make that check conditional on the `expect_log_errors` marker, this
way we can mark tests for which we expect errors to be logged, and
assert that that does happen
Setting the PR state directly really doesn't work as it doesn't
correctly save (and can get overwritten by any dependency of which
there are many).
This caused setting odoo/odoo#165777 in error to fail, leading to it
being re-staged (and failing) repeatedly, and the PR being spammed
with comments.
- create a more formal helper for preventing directly setting computed
functions (without an actual inverse)
- replace direct state setting by setting the corresponding dependency
e.g. `error` for error and `skipchecks` to force a PR to ready
- add a `skipchecks` inverse to the PR so it can also set itself as
reviewed, and is convenient, might be worth also adding stuff to
`Batch.write`
Currently webhook secrets are configured per *project* which is an
issue both because different repositories may have different
administrators and thus creates safety concerns, and because multiple
repositories can feed into different projects (e.g. on mergebot,
odoo-dev/odoo is both an ancillary repository to the main RD project,
and the main repository to the minor / legacy master-wowl
project). This means it can be necessary to have multiple projects
share the same secret as well, this then mandates the secret for more
repositories per (1).
This is a pain in the ass, so just detach secrets from projects and
link them *only* to repositories, it's cleaner and easier to manage
and set up progressively.
This requires a lot of changes to the tests, as they all need to
correctly configure the signaling.
For `runbot_merge` there was *some* setup sharing already via the
module-level `repo` fixtures`, those were merged into a conftest-level
fixture which could handle the signaling setup. A few tests which
unnecessarily set up repositories ad-hoc were also moved to the
fixture. But for most of the ad-hoc setup in `runbot_merge`, as well
as `forwardport` where it's all ad-hoc, events sources setup was just
appended as is. This should probably be cleaned up at one point, with
the various requirements collected and organised into a small set of
fixtures doing the job more uniformly.
Fixes#887