Commit Graph

1796 Commits

Author SHA1 Message Date
Xavier Morel
f97502e503 [IMP] runbot_merge: make skipchecks impact PR state
It's a bit weird and inconsistent to have a PR being staged while
unreviewed or unapproved or w/e. If we compute the state based on
skipchecks then everything is consistent.

Also remove the implicit override of all statuses when explicitly
marking the pr as `ready`, it risks creating difficult to understand
states, and it's unnecessary since `skipchecks` gets set.

Also as with setting skipchecks, sets the current user as reviewer on
all PRs of the batch without a reviewer.
2024-05-24 09:08:56 +02:00
Xavier Morel
fa2bba3cb9 [CHG] runbot_merge: don't reset cancel_staging on r-
Also send skipchecks removal to the PR being r-'d, as sending it to a
random PR of the batch doesn't really make sense?
2024-05-24 09:08:56 +02:00
Xavier Morel
c66451a8c7 [IMP] runbot_merge: cleanup/modernize test_multirepo.py
- remove the `legal/cla` and `ci/runbot` context names, which I use a
  lot for historical reasons but fundamentally they're not useful to
  the tests, the `default` context is generally simpler.
- remove `make_branch` helper as we don't actually use branch
  protection and at the end of the day it doesn't do much else
- convert a few explicit PR lookups to the project-wide `to_pr` helper
2024-05-23 07:58:58 +02:00
Xavier Morel
a6a37f8896 [FIX] runbot_merge: handling of staging cancellation
Move staging cancellation to the batch, remove its (complicated)
handling from the PRs.

This loses some precision in the cancellation message, but that could
likely be recovered (in part) by adding more precise checks &
diagnostic extractions in the compute.
2024-05-23 07:58:58 +02:00
Xavier Morel
ad1d590d9c [IMP] runbot_merge: fix dual merge of split prioritised PRs
Because `alone` (formerly p != 2) is selected before split PRs, if a
prioritised PR gets split (or a split PR gets prioritised) it will be
staged once as prioritised, and again because split.

Improve the selection of ready batches to exclude split batches
upstream, such that they don't have to be rechecked over and over, and
their priorities don't cause us issues.
2024-05-23 07:58:58 +02:00
Xavier Morel
83511f45e2 [CHG] runbot_merge: move priority field from PR to batch
Simplifies the `ready_prs` query a bit and allows it to be converted
to an ORM search, by moving the priority check outside. This also
allows the caller to not need to post-process the records list
anywhere near the previous state of affairs.

`ready_prs` now returns *either* the "alone" batches, or the non-alone
batches, rather than mixing both into a single sequence. This requires
correctly applying the search filters to not retrieve priority of
batches in error or targeting other branches.
2024-05-23 07:58:58 +02:00
Xavier Morel
ef6a002ea7 [CHG] runbot_merge: move staging readiness to batch
Staging readiness is a batch-level concerns, and many of the markers
are already there though a few need to be aggregated from the PRs. As
such, staging has no reason to be performed in terms of PRs anymore,
it should be performed via batches directly.

There is a bit of a mess in order not to completely fuck up when
retargeting PRs (implicitly via freeze wizard, or explicitely) as for
now we're moving PRs between batches in order to keep the
batches *mostly* target-bound.

Some of the side-effects in managing the coherence of the targeting
and moving PRs between batches is... not great. This might need to be
revisited and cleaned up with those scenarios better considered.
2024-05-23 07:58:58 +02:00
Xavier Morel
9ddf017768 [CHG] *: move fw_policy from PR to batch 2024-05-23 07:58:58 +02:00
Xavier Morel
21b5dd439b [CHG] runbot_merge: move merge_date to batch, remove active
- `merge_date` should be common to an entire batch, so move it there
- remove `Batch.active` which should probably have been removed when
  batches were made persistent (can eventually re-add as a proxy for
  `merge_date` being set maybe, but for now removing it seems a better
  way to catch mistakes)
- update various sites to use `Batch.merge_date` instead of
  `Batch.active`
2024-05-23 07:58:58 +02:00
Xavier Morel
e910b8e857 [IMP] runbot_merge: move cross-pr properties to batch 2024-05-23 07:58:58 +02:00
Xavier Morel
473f89f87d [CHG] *: persistent batches
This probably has latent bugs, and is only the start of the road to v2
(#789): PR batches are now created up-front (alongside the PR), with
PRs attached and detached as needed, hopefully such that things are
not broken (tests pass but...), this required a fair number of
ajustments to code not taking batches into account, or creating
batches on the fly.

`PullRequests.blocked` has also been updated to rely on the batch to
get its batch-mates, such that it can now be a stored field with the
right dependencies.

The next step is to better leverage this change:

- move cross-PR state up to the batch (e.g. skipchecks, priority, ...)
- add fw info to the batch, perform forward-ports batchwise in order
  to avoid redundant batch-selection work, and allow altering batches
  during fw (e.g. adding or removing PRs)
- use batches to select stagings
- maybe expose staging history of a batch?
2024-05-23 07:58:58 +02:00
Xavier Morel
c140701975 [ADD] runbot_merge: support staging ready PRs over splits
Not sure it's going to be useful but it's hard to know if we can't
test it. The intent is mostly the ability to prioritize throughput (or
attempt to) during high-load events, if we can favour staging N
new batches over a split's N/2 we might be able to merge more crap.

But maybe not, we'll see, either way now it's here and seems to more
or less work.

Fixes #798
2024-05-23 07:58:58 +02:00
Xavier Morel
9f54e6f209 [ADD] runbot_merge: option to disable staging without cron
Because the mergebot crons are on such a tight scheduling, and just
them finding out they have nothing to do can take a while, disabling
them can be a chore. Disabling staging via the project is much less
likely to cause issues as the projects don't normally (or ever?) get
exclusively locked, so they can generally be written to at any moment.

Furthermore, if we ever get in a situation where we have multiple
active projects (not really the case currently, we have multiple
projects but only one is really active) it's less disruptive to
disable stagings on a single specific project.

Fixes #860
2024-05-23 07:58:58 +02:00
Xavier Morel
d4fa1fd353 [CHG] *: rewrite commands set, rework status management
This commit revisits the commands set in order to make it more
regular, and limit inconsistent command-sets, although it includes
pseudo-command aliases for common tasks now removed from the core set.

Hard Errors
===========

The previous iteration of the commands set would ignore any
non-command term in a command line. This has been changed to hard
error (and ignoring the entire thing) if any command is unknown or
invalid.

This fixes inconsistent / unexpected interpretations where a user
sends a command, then writes a novel on the same line some words of
which happen to *also* be commands, leading to merge states they did
not expect. They should now be told to fuck off.

Priority Restructuring
----------------------

The numerical priority system was pretty messy in that it confused
"staging priority" (in ways which were not entirely straightforward)
with overrides to other concerns.

This has now being split along all the axis, with separate command
subsets for:

- staging prioritisation, now separated between `default`, `priority`,
  and `alone`,

  - `default` means PRs are picked by an unspecified order when
    creating a staging, if nothing better is available
  - `priority` means PRs are picked first when staging, however if
    `priority` PRs don't fill the staging the rest will be filled with
    `default`, this mode did not previously exist
  - `alone` means the PRs are picked first, before splits, and only
    `alone` PRs can be part of the staging (which usually matches the
    modename)
- `skipchecks` overrides both statuses and approval checks, for the
  batch, something previously implied in `p=0`, but now
  independent. Setting `skipchecks` basically makes the entire batch
  `ready`.

  For consistency this also sets the reviewer implicitly: since
  skipchecks overrides both statuses *and approval*, whoever enables
  this mode is essentially the reviewer.
- `cancel` cancels any ongoing staging when the marked PR becomes
  ready again, previously this was also implied (in a more restricted
  form) by setting `p=0`

FWBot removal
=============

While the "forwardport bot" still exists as an API level (to segregate
access rights between tokens) it has been removed as an interaction
point, as part of the modules merge plan. As a result,

fwbot stops responding
----------------------

Feedback messages are now always sent by the mergebot, the
forward-porting bot should not send any message or notification
anymore.

commands moved to the merge bot
-------------------------------

- `ignore`/`up to` simply changes bot
- `close` as well
- `skipci` is now a choice / flag of an `fw` command, which denotes
  the forward-port policy,

  - `fw=default` is the old `ci` and resets the policy to default,
    that is wait for the PR to be merged to create forward ports, and
    for the required statuses on each forward port to be received
    before creating the next
  - `fw=skipci` is the old `skipci`, it waits for the merge of the
    base PR but then creates all the forward ports immediately (unless
    it gets a conflict)
  - `fw=skipmerge` immediately creates all the forward ports, without
    even waiting for the PR to be merged

    This is a completely new mode, and may be rather broken as until
    now the 'bot has always assumed the source PR had been merged.

approval rework
---------------

Because of the previous section, there is no distinguishing feature
between `mergebot r+` = "merge this PR" and `forwardbot r+` = "merge
this PR and all its parent with different access rights".

As a result, the two have been merged under a single `mergebot r+`
with heuristics attempting to provide the best experience:

- if approving a non-forward port, the behavior does not change
- else, with review rights on the source, all ancestors are approved
- else, as author of the original, approves all ancestors which descend
  from a merged PR
- else, approves all ancestors up to and including the oldest ancestor
  to which we have review rights

Most notably, the source's author is not delegated on the source or
any of its descendants anymore. This might need to be revisited if it
provides too restrictive.

For the very specialized need of approving a forward-port *and none of
its ancestors*, `review=` can now take a comma (`,`) separated list of
pull request numbers (github numbers, not mergebot ids).

Computed State
==============

The `state` field of pull requests is now computed. Hopefully this
makes the status more consistent and predictable in the long run, and
importantly makes status management more reliable (because reference
datum get updated naturally flowing to the state).

For now however it makes things more complicated as some of the states
have to be separately signaled or updated:

- `closed` and `error` are now separate flags
- `merge_date` is pulled down from forwardport and becomes the
  transition signal for ready -> merged
- `reviewed_by` becomes the transition signal for approval (might be a
  good idea to rename it...)
- `status` is computed from the head's statuses and overrides, and
  *that* becomes the validation state

Ideally, batch-level flags like `skipchecks` should be on, well, the
batch, and `state` should have a dependency on the batch. However
currently the batch is not a durable / permanent member of the system,
so it's a PR-level flag and a messy pile.

On notable change is that *forcing* the state to `ready` now does that
but also sets the reviewer, `skipchecks`, and overrides to ensure the
API-mediated readying does not get rolled back by e.g. the runbot
sending a status.

This is useful for a few types of automated / programmatic PRs
e.g. translation exports, where we set the state programmatically to
limit noise.

recursive dependency hack
-------------------------

Given a sequence of PRs with an override of the source, if one of the
PRs is updated its descendants should not have the override
anymore. However if the updated PR gets overridden, its descendants
should have *that* override.

This requires some unholy manipulations via an override of `modified`,
as the ORM supports recursive fields but not recursive
dependencies (on a different field).

unconditional followup scheduling
---------------------------------

Previously scheduling forward-port followup was contigent on the FW
policy, but it's not actually correct if the new PR is *immediately*
validated (which can happen now that the field is computed, if there
are no required statuses *or* all of the required statuses are
overridden by an ancestor) as nothing will trigger the state change
and thus scheduling of the fp followup.

The followup function checks all the properties of the batch to port,
so this should not result on incorrect ports. Although it's a bit more
expensive, and will lead to more spam.

Previously this would not happen because on creation of a PR the
validation task (commit -> PR) would still have to execute.

Misc Changes
============

- If a PR is marked as overriding / canceling stagings, it now does
  so on retry not just when setting initially.

  This was not handled at all previously, so a PR in P0 going into
  error due to e.g. a non-deterministic bug would be retried and still
  p=0, but a current staging would not get cancelled. Same when a PR
  in p=0 goes into error because something was failed, then is updated
  with a fix.
- Add tracking to a bunch of relevant PR fields.

  Post-mortem analysis currently generally requires going through the
  text logs to see what happened, which is annoying.

  There is a nondeterminism / inconsistency in the tracking which
  sometimes leads the admin user to trigger tracking before the bot
  does, leading to the staging tracking being attributed to them
  during tests, shove under the carpet by ignoring the user to whom
  that tracking is attributed.

  When multiple users update tracked fields in the same transaction
  all the changes are attributed to the first one having triggered
  tracking (?), I couldn't find why the admin sometimes takes over.
- added and leveraged support for enum-backed selection fields
- moved variuous fields from forwardport to runbot_merge
- fix a migration which had never worked and which never run (because
  I forgot to bump the version on the module)
- remove some unnecessary intermediate de/serialisation

fixes #673, fixes #309, fixes #792, fixes #846 (probably)
2024-05-23 07:58:46 +02:00
Xavier Morel
75f29f9315 [IMP] conftest: avoid parsing json it if may not be
When asserting a status fails, it's possible that this is a 400 or
500-type error which does not yield JSON data at all (e.g. forgot to
start the proxy or dummy), in which case trying to dump the json data
is actively harmful as it triggers cascading failures. And it's not
like the output message is formatted, so a re-structured assertion
message is not helpful.
2024-05-16 10:37:50 +02:00
Xavier Morel
955a61a1e8 [CHG] runbot_merge, forwardbot: merge commands parser
- move all commands parsing to runbot_merge as part of the long-term
  unification effort (#789)
- set up an actual parser-ish structure to parse the commands to
  something approaching a sum type (fixes #507)
- this is mostly prep for reworking the commands set (#673), although
  *strict command parsing* has been implemented (cf update to
  `test_unknown_commands`)
2024-05-16 10:37:50 +02:00
Xavier Morel
0e71f85802 [ADD] forwardport: some typing to conftest
It's not amazing, because cross-typing between the different conftests
and utility modules doesn't really work smoothly. So not entirely
convinced it's great.

While at it, inline a few `x.json()` local aliases when the jsons are
used in exclusive locations e.g. usually an assertion message on one
side and a result / process on the other.
2024-05-16 10:37:50 +02:00
Xavier Morel
f4889ec8cf [ADD] runbot_merge: ad-hoc ACL tracking to res.partner
Sadly m2ms don't support tracking, so add a bunch of ad-hoc tracking
to the override rights in order to know who, what, when at least.

Do the same for the review rights although maybe tracking works for
those.
2024-05-16 09:32:03 +02:00
Xavier Morel
b177361f20 [IMP] forwardport: locking during creation of fw batch
Not sure why I didn't think about it previously (in #357), but it
would make sense for the entire batch creation to be atomic and thus
under the same lock.

The commit during forward porting also makes a lot less sense: it was
a failed early attempt at resolving the problem by hoping we'd win the
race with the webhook (commit before the webhook hit). By locking the
PRs table to update, we actually resolved it.

But since all that happens then is a few updates and then a commit by
the cron itself (it commits per batch), it's probably good enough to
leave the entire thing under the same lock. This means we lock out
other interactions a bit longer, but since the span is still just the
forward port of a single batch it should not be too much of an issue
outside of post-freeze recovery thingie.
2024-05-16 09:32:03 +02:00
Xavier Morel
a8e4d6dfee [IMP] runbot_merge: don't select content when locking rows
It might not be a huge amount of extra work since we're never actually
retrieving the rows, but it still seems completely unnecessary.

Sadly we can't do something cleaner like an aggregation, because
aggregating requires moving the locking query to a subquery, and
experimentally that seems slower than just ignoring / discarding the
result set.
2024-05-16 09:32:03 +02:00
Xavier Morel
9f22305903 [IMP] runbot_merge: view warnings around ACLs
Eventually we might want to add a proper "sensitive" flag on overrides
and compute the flag based on that. For now just check for
`ci/security`.
2024-03-19 12:54:20 +01:00
Xavier Morel
5024c2e27b [FIX] forwardport: suppress warning when closing unmanaged PR
Continuation of 327500bc83 for an other
edge case of closing a PR to a detached branch with a merged
descendant. The mergebot would:

- warn on the parent about it being detached due to being closed
- then warn on the child about it being detached due to the parent
  being closed (despite it being merged already)
- then warn the parent *again* due to the child being detached

At least some of those messages were still produced by the test case,
stop them.

Issue was noticed on odoo/odoo#145969 and odoo/odoo#145984 due to 16.2
being deactivated.
2024-03-19 11:46:36 +01:00
Xavier Morel
953bf86044 [FIX] forwardport: don't notify of detached child on merged parents
The notification is both noise and confusing: we're telling the
author (and reviewer, and anyone else subscribed) that they need to
merge a merged PR.

Fixes #855
2024-03-12 14:57:50 +01:00
Xavier Morel
327500bc83 [FIX] runbot_merge: don't notify on closing unknown PRs
If an untracked PR is closed, especially on an inactive or untracked
branch, the closer (or author) almost certainly don't care to receive
3 different notifications on the subject.

The fix requires a schema change in order to track that we're fetching
the PR due to a `closed` event, as in other cases we may still want to
notify the user that we received the request (and it just happened to
resolve to a closed PR).

Fixes #857
2024-03-12 12:17:30 +01:00
Xavier Morel
721b769039 [IMP] runbot_merge: handling of signatures
- correctly handle projects without a secret set, we don't want the
  requests to blow up by trying to `strip()` a `False` or `None`, that
  is dumb, who would do that?
- provide better reporting on signature mismatch: which repo we tried
  to access, and the full list of headers
- log when there was no signature matching, either because there was
  no signature in the request and no secret on the project, or because
  the request is signed but no secret is configured on the repo
2024-02-26 10:11:53 +01:00
Xavier Morel
bcf6074153 [FIX] runbot_merge: maintenance gc command
`gc --prune` can not take a *separate* parameter, it has to be part of
the same arg (the `=` is not optional), otherwise the `gc` call blows
up.

So use the positional form of the git command to generate the correct
invocation, Python-level `foo=bar` generates a split-style option in
two args which does not please git.
2024-02-26 09:58:22 +01:00
Xavier Morel
de32b54090 [FIX] runbot_merge: error in maintenance, and tracking
Before this, we would check if a repository had a name and run
maintenance on it, leading to repeated (but unnoticed until now
because I didn't monitor it) tracebacks as the maintenance cron would
fail to find the local repo then run maintenance on nowhere anyway.

Also augment the repo-finding process to try and get better
information about what it's doing when it fails, rather than failing
completely silently.
2024-02-23 13:58:31 +01:00
Xavier Morel
5d615bd733 [IMP] runbot_merge: logging around webhook body & signature
The signature validation code seems correct, but there are validation
failure in production, increase logging around webhook requests to
try and diagnose things better:

- dump the *entire* body to the github_requests logfile
- add the received & computed signatures to the log error
2024-02-12 10:19:53 +01:00
Xavier Morel
65c303a750 [FIX] runbot_merge: bot info fetch
`r0` is still used afterwards as the response object, so don't
overwrite it when parsing the JSON body.
2024-02-12 10:18:59 +01:00
Xavier Morel
3a4fa494f8 [FIX] runbot_merge: incorrectly named endpoint
Method has the same name as its preceding sibling, so it overwrites
it and one of the endpoints is not accessible.
2024-02-12 10:18:25 +01:00
Christophe Monniez
e9fc57816b [FIX] runbot: properly move builds to the merged error
When build errors are merged together, the builds of the merged errors
should be moved to the only error that will be kept.
It 's not the case because the merge method is assigning a compute field
and moreover it was hidden in the tests because the compute was not
triggered.

With this commit, the build_error_link is updated to point to the new
error. The test is modified to properly check the case and also to add a
case when the link already exists.

The access rights are updated to allow admin to unlink the
build_error_link records. Otherwise the action could fail when the link
already exists.
2024-01-30 08:50:12 -01:00
Christophe Monniez
3ff4787788 [FIX] runbot: allow build parsing with same error
When a build that contains the same error that appears two times is
parsed, it crashes because of the unique constraint on build error link.

With this commit, only one link with the same error is created.
Two tests are added for the two cases:
- a new error appearing two times in a same build
- an existing error appearing two times in a same build
2024-01-25 14:31:04 -01:00
Xavier-Do
0b20a834c7 [FIX] runbot: fix typos 2024-01-25 08:44:48 +01:00
Christophe Monniez
b949ea1817 [FIX] runbot: adapt build error search for link model
Since c6f9d1f0c a new model was added to link build errors and builds.
The _search_version and _search_trigger_ids were not adapted to work
with this new model.
2024-01-25 06:41:01 -01:00
Xavier-Do
5406d29f8f [FIX] runbot: fix prepare 2024-01-25 08:40:42 +01:00
Christophe Monniez
c6f9d1f0c5 [IMP] runbot: use real log date on build errors
In the build error view, a list of build is displayed with a confusing
create date. The create date in the list is the creation date of the
build, leading to a confusion with the creation of the build log
creation.

With this commit, the real log creation is used in this view.

To achieve that, the many2many relation is extended with a
log_date which is filled when a build log entry is parsed.
2024-01-24 13:02:22 -01:00
Christophe Monniez
d2bb42264e [IMP] runbot: give more information about known errors 2024-01-24 13:02:22 -01:00
Christophe Monniez
7c0c03753d [IMP] runbot: add warning ribbon when an error is a link
It happens that a user edits or annotates a build error form without
noticing that the error is linked to another one. In that case, the
modification or the note is probably useless. So a warning ribbon should
grab his attention.

While at it, we change the warning ribbon when an error has a test-tag
to not be confused with the link ribbon and also because removing a
test-tag can lead to a real danger (for the mergebot stagings).
2024-01-24 13:02:22 -01:00
Christophe Monniez
ef045c8540 [IMP] runbot: move builds on host page into a tab 2024-01-24 13:02:22 -01:00
Christophe Monniez
05d072d6eb [IMP] runbot: add a frontend button to view batch in backend 2024-01-24 13:02:22 -01:00
Christophe Monniez
c322458c5b [IMP] runbot: add a frontend btn to parse a single log 2024-01-24 13:02:22 -01:00
Christophe Monniez
68892398cd [FIX] runbot: fix error_log parse_log method 2024-01-24 13:02:22 -01:00
Christophe Monniez
c837e7330c [IMP] runbot: remove useless prefix on build errors 2024-01-24 08:54:39 -01:00
Christophe Monniez
6030988560 [IMP] runbot: allow to filter team errors by trigger 2024-01-24 08:54:39 -01:00
Christophe Monniez
bef9dc0e49 [FIX] runbot: properly compute error versions and triggers 2024-01-24 08:54:39 -01:00
Christophe Monniez
f47beb31ba [FIX] runbot: fix the favorites filter bug 2024-01-24 08:54:39 -01:00
Christophe Monniez
2999f92cdb [IMP] runbot: allow to search errors in a single version
When filtering the build error tree view based on the versions equality,
the results may not be what you expect.

e.g.: searching for `versions is equal to 16.0` gives the errors that
appeared in `16.0` (hopefully) but also those which appeared in other
versions too.

With this commit, this search will give the errors that appeared in the
specified version only. When the user wants to list errors that appeared
in `16.0` and other versions too, he has to use the `contains 16.0`
criteria.
2024-01-24 08:54:39 -01:00
Christophe Monniez
08c614a86e [IMP] runbot: add optional fingerprint in error list view 2024-01-24 07:43:33 -01:00
Christophe Monniez
bddac0a645 [IMP] runbot: allow to choose the replacement string
When defining cleaning regex, the replacement character is always the
percent sign as it's hard coded in various methods.

With this commit, a replacement string can be defined by cleaning regex
and fallback to the percent sign by default.
2024-01-24 07:43:33 -01:00
Christophe Monniez
7ce79b434e [IMP] runbot: improve build error cleaning action
With this commit, when build errors are re-cleaned, they are also merged
if the fingerprints when fingerprints are matching.

Also, this fixes the ir_logging compute that associate a build error
so that the active build error is preferred over an inactive one.
2024-01-24 07:43:33 -01:00