1cea247e6c tried to improve staging
checks to avoid staging PRs in the wrong state, however it had two
issues:
PR state
--------
The process would reset the PR's state to open, but unless the head
was being resync'd it wouldn't re-apply the statuses on the state,
leading to a PR with all-valid statuses, but a missing CI.
Message
-------
The message check didn't compose the PR message the same way PR
creation / update did (it did not trim the title and description
individually, only after concatenation), resulting in a
not-actually-existing divergence getting signaled in the case where
the PR title ends or the description starts with whitespace.
Expand relevant test, add a utility function to compose a PR message
and use it everywhere for coherence.
Also update the logging and reporting to show a diff of all the
updated items (hidden behind a `details` element).
If there are bump PRs anyway: the bump commits will cause the
forward-port of the staging to fail, so might as well clearly notify
everybody of the issue if there is a pending staging, and not waste
too much time waiting for a staging which can not succeed.
We could also cancel stagings when there's no bump PR, but it's not
clear that there's any reason to do so: if we didn't touch any master
branch, there's no reason for the staging to fail, or to otherwise
cancel it.
And obviously we can't have staged anything on the new branch so
there's nothing to cancel.
Part-Of: #718
I DECLARE BANKRUPTCY!!!
The previous implementation of labels lookup was really not
intuitive (it was just a char field, and matched labels by equality
including the owner tag), and was also full of broken edge
cases (e.g. traceback if a label matched multiple PRs in the same repo
because people reuse branch names).
Tried messing about with contextual `display_name` and `name_search`
on PRs but the client goes wonky in that case, and there is no clean
autocomplete for non-relational fields.
So created a view which reifies labels, and that can be used as the
basis for our search. It doesn't have to be maintained by hand, can be
searched somewhat flexibly, we can add new view fields in the future
if desirable, and it seems to work fine providing a nice
understandable UX, with the reliability of using a normal Odoo model
the normal way.
Also fixed the handling of bump PRs, clearly clearing the entire field
before trying to update existing records (even with a link_to
inbetween) is not the web client's fancy, re-selecting the current
label would just empty the thing entirely.
So use a two-step process slightly closer to the release PRs instead:
- first update or delete the existing bump PRs
- then add the new ones
The second part is because bump PRs are somewhat less critical than
release, so it can be a bit more DWIM compared to the more deliberate
process of release PRs where first the list of repositories involved
has to be set up just so, then the PRs can be filled in each of them.
Fixes#697
In order to support partial freezing, we need the ability to remove
some of the release lines for the repos we don't want to
freeze (e.g. because they don't use per-version branches).
This subsequently means we need the ability to *create* new lines if
we fucked up and removed one we should not have. Alternatively the
freeze meat-bot could cancel the entire thing and redo the wizard but
that seems harsh and mean, so don't do that.
Fixes 0f3647b7c7 which specifically
mentioned partial freeze then proceeded to make them entirely
impossible anyway.
Part of #718
Previously the mergebot would only sync the head commit, but synching
more is useful.
Also update the final sanity check on staging:
- as with check, update the message & target branch
- reset PR state and post a message when updating message instead of
doing so silently
Note: maybe only fail the staging if the message is updated *and*
relevant to staging (aka there's a merge method and it's not
`rebase`)?
Fixes#680
After review, there doesn't seem to be a single integer field created
by the mergebot or fortwardbot modules for which a `group_operator`
makes sense, let alone the default of `sum`.
So just disable them all.
Fixes#674
If commits have different authors (/ committers), the mergebot would
ask github to create a commit with an author (/ committer) of `None` /
`null`.
Apparently github really does not like that, and complains that
> nil is not an object
So remove the key entirely. Also fix the collision between `author`
and the `Co-Authored-By` list, which could lead to trying to set an
`author` of `[name, email]` instead of an object, which is also not
accepted by github.
Fixes to the new bits which didn't really work:
- Fix borked view layout
- Add some help to the label fields
- Improve the resolution of label -> pr, and fix
- Also make the feature actually work for bump PRs
- Also make pr -> label work more reliably, now allows setting one PR
and getting the other PRs of the same batch (with the same label)
even without setting the label by hand
An autocomplete for the label has been considered but there is no
autocomplete field for char/selection fields, and it seems way too
much work for the utility:
- either create a brand new widget for 15.0 which will have to be
entirely rewritten in 16
- or create a transient model composed entirely of fake records to
provide an m2o to records which don't actually exist as label
bearers, which is also a lot of unnecessary work
NOTE: we want to support partial freezing (aka not freeze all the
branches because some of them have different release models
than others), so some project repos *not* having a release
PR is fine and normal, such a validation should not be added.
Fixes#664
In case where the last branch (before the branch being frozen) is
disabled, the forwardport inserter screws up, and fails to correctly
create the intermediate forwardports from the new branch.
Also when disabling a branch, if there are FW PRs which target that
branch and have not been forward-ported further, automatically
forward-port them as if the branch had been disabled when they were
created, this should limit data loss and confusion.
Also change the message set on PRs when disabling a branch: because of
user conflicts in test setup, the message about a branch being
disabled would close the PRs, which would then orphan the followup,
leading to unexpected / inconsistent behaviour.
Fixes#665
The `statuses` field of a staging is always "live" because it's a
computed non-stored field. This is an issue when a staging finishes in
whatever state, then someone gets new statuses sent on one of the head
commits, either by rebuilding (part of) the staging or by just using
the same commit for one of their branches.
This makes the reporting of the main dashboard confusing, as one might
look at a failed staging and see all the required statuses
successful. It also makes post-mortem analysis more complicated as the
logs have to be trawled for what the statuses used to be (and they
don't always tell).
Solve this by storing a snapshot of the statuses the first time a
staging moves away from `pending`, whether it's to success or failure.
Fixes#667
Partially revert 0c882fc0df
This turns out to be more bane than boon, as it breaks forward-port
chains and confuses people (despite the message). Update notification
message and don't close the PR anymore.
While at it, disable any pending staging on the branch being deactivated.
Fixes#654
Turns out I was running "15.0" except just on the runbot, enterprise
and community were still the 14.0 repos, so some of the changes were
missing.
While at it, bundle fixes for 3.10, as that's what Jammy needs, and
the mergebot/15.0 will be running on that.
- override the staging's name_get to provide a slightly more useful
display_name (though still not great as the staging object remains
quite technical and inimical to human interaction)
- show individual PRs in a batch (as m2m tags) for readability
- update PR views to show the author and reviewer, except in the list
of delegations of users where it's a lot less useful
/cc #632
Currently deactivating a branch kinda leaves users in the dark, with
little way to know what has happened aside from inferring it from the
branch having disappeared from the main dashboard.
- surface the state of the branch in the PR dashboard (also surface
the target branch at all so users can see if their PR is targeted
as they expect as far as the mergebot is concerned)
- close & notify every PR to a branch being deactivated
- cancel any current staging to the branch (as a consequence of the
above)
Closes#632
- if stderr has been rerouted or explicitely rerouted to STDOUT,
`e.stderr` is `None` and the error reporting blows up (which is
inconvenient). Handle this case.
- handle the case where `fp_github_name` has not been configured (it's
not a super useful handling but meh, apparently git/hub doesn't
really care if there's a username when using API tokens)
- minor improvement to the refline parser: per-spec, the trailing
newline is optional, so don't fail if it's missing
Stop *staging* release PRs: they are normally fairly simple and should
not fail their staging outside of unreliable tests (or possibly a few
edge cases e.g. forgot one version change thing), however staging them
creates the possibility of a "version hole" on the release branch
which is undesirable.
Instead, immediately and unconditionally push the release commits onto
the newly created branches, if there are things which don't work they
can be fixed afterwards (and the process refined, maybe).
Also add the same feature for *bump* PRs, with the difference that the
bump PRs are not created / requested by default (they have to be opted
in individually).
For convenience, add a feature which automatically finds the PRs via
inputting the label (not really tested yet).
Closes#603
- add a logging entry for PR updates
- change the generic log entry to log the sender (of the event) rather
than the PR author
- fix the post-facto PR loader to more systematically and reliably set
a `sender`
The log message only indicated whether the PR was squashed or not, but
that's not actually useful.
Improve the message to log the actual merge method, for
information. The old "squash" (aka squash flag set and no merge
method, since an actual squash merge method was reintroduced a while
ago) has been renamed to "single" for the purpose of this display.
Old messages were quite inconsistent in their pinging of the PR author
and reviewer.
Reviewed messages (probably missed some but...) and try to more
consistently ping when the feedback requires some sort of action in
order to proceed.
Fixes#592
Stagings would be cancelled automatically if the PR's commits were
updated, but not if the target (base) was changed, even though that
has a drastic impact on staging.
Add hooks to unstage PRs if their base is updated, or if their message
is updated and relevant to staging (merge or rebase-merge methods).
Fixes#604
- code in the various menus added over time through the UI (queues,
configuration, ...)
- update / improve PR layout a tick
- fix "outstanding forward ports" count on the dashboard
- improve hover title / help on dashboard
- add date of last modification (usually date of success / failure)
- make casing more coherent (everything lowercase)
- add explicit note that UTC date on staged at label is staged at datetime
- rediscover yet again that the staging information is when hovering
on the staging *except the staged at label*
- improve `PullRequest.unstage` to always insert the PR at the start of the
reason when cancelling the staging, for clarity / traceability
Closes#560, closes#609
New accounts endpoint such that the SSO can push new pre-configured
users / employees directly. This lowers maintenance burden.
Also remove one of the source partners from the merge test, as
ordering seems wonky for unclear reasons leading to random failures of
that test.
On #509, the rebasing process was changed to forcefully update the
commit date of the commits, in order to force trigger builds.
However when squashing was re-enabled for #539 for some fool reason it
implemented its own bespoke rebasing (despite that not actually saving
any API call that I can see), meaning it did *not* update the commit
date. As such, an old commit being squashed would not get picked up by
the runbot, which is what happened to odoo/documentation#1226 (which
ultimately had to be hand-rebased after some confusion as to why it
did not work).
Update `_stage_squash` to go through `rebase` the normal way, also
update `rebase` to pop the commit date entirely instead of setting it
manually, and update the squashing test to check that the commit date
gets properly updated.
Fixes#579, closes#582
Multiple fixes to various issues of the freeze wizard.
Some of the less fix-oriented sub-tasks were moved to #586 for discussion instead.
Closes#559, #587
Currently limited to release/freeze PRs: it can be difficult to be
sure the right PR was selected then, and a mistake there seems more
impactful than in the PRs being waited for?
Note: adds a test to make sure I don't break the check that all
release PRs must have the same label (be linked). This was
already safe, and in a way this PR adds convenience but not
really safety, but better sure than sorry.
- add flag to not select repos for freezing
- allow removing more repositories from the wizard
- when performing the freeze, only create branches for the selected
repos
The freeze wizard was implemented using a single action to open and
validate the dialog.
This was a mistake, as it means if there are no errors left (e.g. all
the PRs being waited for are now validated) trying to view the freeze
wizard will immediately validate it and commit the freeze, which is
unexpected, surprising, and unsafe e.g.
- open wizard
- add freeze prs
- add a required pr or two
- close and go do something else
- be told that more PRs need to be waited for
- reopen wizard
- oops freeze is done
So split the "open action" part of `action_freeze` into opening the
action and performing the freeze. The "freeze" / "view freeze" button
on the project only activates the latter, and the actual freeze
operation is only triggered from the wizard's "Freeze" button.
Part of #559.
Provides a less manual interface for creating the freeze:
* takes the name of the branch to create
* takes any number of PRs which must be part of the freeze
* takes PRs representing the HEADs of the new branches
Then essentially takes care of the test.
Implementation of the actual wizard is not trivial but fairly
straightforward and linear, biggest issue is not being able to
`project_id.branch_ids[1]` to get the new branch, not sure why but it
seems to ignore the ordering, clearing the cache doens't fix it.
When creating the branches, add a sleep after each one for secondary
rate limiting purposes. Same when deleting branches.
Also the forwardbot has been updated to disable the forwardport cron
while a freeze is ongoing, this simplifies the freezing process.
Note: after recommendation of @aab-odoo, tried using `_applyChanges`
in `_checkState` but it simply did not work: the two relational fields
got completely frozen and were impossible to update, which is less
than ideal. Oh well, hopefully it works well enough like this for now.
Could not reproduce it in a shell, but in the original version
`self.env.cr.rowcount` would always be 0 after the `modified`.
Turns out the check is really completely dumb, because if we got any
match in `select for update` we're going to find the same on in the
update, and thus the conditional is unnecessary. I've no idea why I
did that.
Anyway remove the conditional and just always try to unstage the PR.
To prep for the addition of the freeze wizard:
* move projects out of `pull_requests.py`
* then realize half the methods there have no relation to projects and
move them to more relevant places in `pull_requests.py`
* update corresponding crons (and tests using those crons) as the
methods have changed model, and the cron definitions thus need to be
updated
* split update to labels out of sending feedback comments while at it:
labels are not used much during tests so their manipulation can be
avoided; and labels are not as urgent as feedback so the crons can
be quite a bit slower
* move the project view out of `mergebot.xml` as well
When a commit is lacking the purpose (?) tag e.g. `[FIX]`, `[IMP]`,
..., a normal commit message of the form `<module>: <info>` marches
the looks of a git pseudo-header.
This results in a commit rewrite rejiggering the entire thing and
breaking the message by moving the title to the pseudo-headers and
mis-promoting either the `closes` line of body content to "title",
resulting in a really crappy commit message
e.g. odoo/odoo@d4aa9ad031.
Update the commit rewriting procedure to specifically skip the title
line, and re-inject it without processing in the output.
Fixes#540
Re-introduce a "squash" mode solely for the purpose of fixing up
commit messages without having to go and edit them: for now "squash"
only works for single-commit PRs, acts as a normal
integration (`rebase-ff`) *but* replaces the message of the commit
itself by that of the PR, similar to the `merge` modes.
This means maintainers can update commit messages to standards by
editing the PR description (though this is obviously sensible to
edition races with the original author).
Fixes#539
Because only the first staging failure is considered "hard" and will
put the PR in error, when looking at staging logs it's possible to see
the same PR get staged over and over and over again, which is quite
confusing.
To make the logs less weird, always log a staging failure even when it
doesn't put the PR in error. Sadly this can't be tested as `capsys` is
not able to intercept an stderr inherited by a child process (capfd
doesn't work either).
Fixes#527
If a reviewer doesn't have an email set, the Signed-Off-By is an
`@users.noreply.github.com` address which just looks weird in the
final result.
Initially the thinking was that emails would be required for users to
*be* reviewers or self-reviewers, but since those are now o2ms / m2ms
it's a bit of a pain in the ass.
Instead, provide an action to easily try and fetch the public email of
a user from github.
Fixes#531
After internal discussions it was concluded that this didn't extend
much more trust than allowing authors to accept their single-PR
commits without additional supervisions, and it would avoid some
inconveniences and PR-blocking.
Fixes#69 (nice)
Because sometimes github updates are missed (usually because github
never triggers it), it's possible for the mergebot's view of a PR
description to be incorrect. In that case, the PR may get merged with
the wrong merge message entirely, through no fault of the user.
Since we already fetch the PR info when staging it, there's very
little overhead to checking that the PR message we store is correct
then, and update it if it's not. This means the forward-port's
description should also be correct.
While at it, clean the forward port PR's creation a bit:
- there should always be a message since the title is required on
PRs (only the body can be missing), therefore no need to check that
- as we're adding a bunch of pseudo-headers, there always is a body,
no need for the condition
- inline the `pr_data` and `URL`: they were extracted for the support
of draft PRs, since that's been removed it's now unnecessary
Fixes#530
On staging failure, the 'bot would point to the first error or failure
status it found on the commit. This turns out not to be correct as
we (now) have various statuses which are optional, and may fail
without blocking stagings (either because they're solely informational
or because they're blocking & overridable on PRs).
Fix this so the 'bot points to the first *required* failure.
Fixes#517
* Remove the forwardport creating PRs in draft, that was mostly to
avoid codeowners triggering but we've removed the github one and
hand-rolled it, so not a concern anymore.
* Prevent merging `draft` PRs, the mergebot rejects approval on draft
PRs and insults people.
TBD (maybe): try to create *conflicting* forward-port PRs in draft so
it's clearer they need to be *fixed*? Issue of not being able to do
that on all private repositories remains so~~
Fixes#500
"Uniquifier" commits were introduced to ensure branches of a staging
on which nothing had been staged would still be rebuilt properly.
This means technically the branches on which something had been
staged never *needed* a uniquifier, strictly speaking. And those lead
to extra building, because once the actually staged PRs get pushed
from staging to their final destination it's an unknown commit to the
runbot, which needs to rebuild it instead of being able to just use
the staging it already has.
Thus only add the uniquifier where it *might* be necessary:
technically the runbot should not manage this use case much better,
however there are still issues like an ancillary build working with
the same branch tip (e.g. the "current master") and sending a failure
result which would fail the entire staging. The uniquifier guards
against this issue.
Also update rebase semantics to always update the *commit date* of the
rebased commits: this ensures the tip commit is always "recent" in the
case of a rebase-ff (which is common as that's what single-commit PRs
do), as the runbot may skip commits it considers "old".
Also update some of the utility methods around repos / commits to be
simpler, and avoid assuming the result is JSON-decodable (sometimes it
is not).
Also update the handling of commit statuses using postgres' ON
CONFLICT and jsonb support, hopefully this improves (or even fixes)
the serialization errors. Should be compatible with 9.5 onwards which
is *ancient* at this point.
Fixes#509
Although it's possible to find what PR a commit was part of with a bit
of `git log` magic (e.g. `--ancestry-path COMMIT.. --reverse`) it's
not the most convenient, and many people don't know about it, leading
them to various debatable decisions to try and mitigate the issue,
such as tagging every commit in a PR with the PR's identity, which
then leads github to spam the PR itself with pingbacks from its own
commits. Which is great.
Add this information to the commits when rebasing them (and *only*
when rebasing them), using a `Part-of:` pseudo-header.
Fixes#482
Proper attribution is important in general, but especially for
external contributors. Before this change, and the previous change
fixing authorship deduplication, it was rather easy for a "squashed"
conflict commit (attributed to the 'bot for lack of a really clean
option) to get merged by mistake.
This commit changes two things:
* The mergebot now refuses to stage commits without an email set, the
github API rejects those commits anyway so any integration mode
other than `merge` would fail, just with a very unclear error
* The forwardbot now creates commits with an empty author / committer
email when the pull request as a whole has multiple authors /
committers. This leverages the mergebot update.
Also clean up the staging process to provide richer error reporting
using bespoke exceptions instead of simple assertions. I'm not sure
we've ever encountered most of these errors so they're really sanity
checks but the old reporting would be... less than great.
Fixes#505
If a PR is closed on github and unknown by the mergebot, when fetched
it should be properly sync'd as "closed" in the backend, otherwise the
PR can get in a weird state and cause issues.
Also move the "I fetched the thing" comment before the actual creation
of the PR for workflow clarity, otherwise the reader has the
impression that the 'bot knew about the PR then fetched it anyway.
And improve savepoint management around the fetching: savepoints
should be released in all cases.
Closes#488.
If two PRs have the same label *in different projects entirely*, the
mergebot should not consider them to be linked, but it did as shown by
the warning message on odoo-dev/odoo#905 (two PRs created from the
same branch in different projects were seen as linked by the status
checker).
3b417b16a1 fixed the actual staging
selection, it's only the warning which did not properly segregate PRs.
Only group PRs which target the same branch (therefore are within the
same project).
Fixes#487
Previously, a PR's status page would only show the linked / related
PRs when `open`.
Since the relations between PRs remains useful, also make this
information available during staging and after merging.
Fixes#463
If a PR got merged to master (or whatever the current development
branch is), there's no easy way to know what maintenance branch it
ended up landing in, except by asking git which branches contain the
commit (which can be rather slow).
Add a special case on merge which labels the PR with a pseudo-branch
patterned after the second-to-last branch of the project:
* if the branch ends with a number, increment the number by one
e.g. 2.0 -> 2.1, 5 -> 5.1
* otherwise, just prefix with `post-` e.g. "maint" ->
"post-maint" (that one doesn't sound very helpful, but I guess it's
nice for the weirdoes who call their branches "natty narwhal" and
shit)
Fixes#450
5cf3617eef intended to create merge
messages with only the content of PR descriptions before the first
thematic break. However this processing was incorrectly applied
to all messages being processed (meaning rebased / squashed commit
messages as well).
Properly apply thematic break processing to only commit messages
created from PR descriptions.
Before this, we would "roughly" select stagings by looking at stagings
whose heads matched a specific sha then validating them all. This
could perform extra validations on stagings once in a while but this
was assumed not to be much an issue, at least originally.
However two changes later on have contributed to this likely being the
cause of #429 (stagings never timing out):
* heads of the staging branches are uniquifier commits stored in the
heads map, but the actual heads of the stagings are also stored
there, some of which are no-ops (hence the uniquifiers) so assuming
repos A and B, if a staging contains PRs touching A then the head of
B actual will also be a head of B
* when a staging is validated, if it *contains* any pending result the
timeout limit gets bumped back
The issue here is that if a success / failure status is lost (which
would be the most common reason for timeouts) *and* someone has forked
and is regularly rebuilding a branch-head used as-is by a staging,
each of those rebuilds will trigger a validation of the staging, which
will find that one of the statuses is still pending (because we missed
the success / failure), which will bump up the timeout limit,
continuing until the branch stops getting rebuilt.
This is probably one of the reasons why some stagings last for *way*
more than 2h, though it is far from explaining all of them: 90% of the
stagings lasting more than *3*h end up succeeding. Tho it's always
possible that this is because someone notices and resends a success
for the missing status it seems somewhat doubtful. Oh well.
Also fix the incorrect log call on `update_timeout_limit` triggering.
I'd forgotten that in order to better handle cases where the CI is
highly backed up (and / or slow for some reason), we actually update
the CI timeout to really take the last "pending" status as the "true
start" of the CI. This might explain why lots of stagings needed extra
time: as of right now, out of 28835 stagings
- 20086 had their timeout bumped by more than 15mn
- 6967 had their timeout bumped by more than 30mn
- 264 had their timeout bumped by more than 1h
- 30 had their timeout bumped by more than 2h
Add some logging every time the CI is bumped this way, so we have
better visibility into that event.
Closes#429
The mergebot has a feature to ping users when an approved PR or
forward-port suffers from a CI failure, as those PRs might be somewhat
unattended (so the author needs to be warned explicitly).
Because the runbot can send the same failure information multiple
times, the mergebot also has a *deduplication* feature, however this
deduplication feature was too weak to handle the case where the PR has
2+ failures e.g. ci and linting as it only stores the last-seen
failure, and there would be two different failures here.
Worse, because the validation step looks at all required statuses, in
that case it would send a failure ping message for each failed
status *on each inbound status*: first it'd notify about the ci
failure and store that, then it'd see the linting failure, check
against the previous (ci), consider it a new failure, notify, and
store that. Rinse and repeat every time runbot sends a ci *or* lint
failure, leading to a lot of dumb and useless spam.
Fix by storing the entire current failure state (a map of context:
status) instead of just the last-seen status data.
Note: includes a backwards-compatibility shim where we just convert a
stored status into a full `{context: status}` map. This uses the
"current context" because we don't have the original, but if it was a
different context it's not going to match anyway (the target_url
should be different) and if it was the same context then there's a
chance we skip sending a redundant notification.
Fixes#435
Before this change, a CI override would have to be replicated on most
/ all forward-ports of the base PR. This was intentional to see how it
would shake out, the answer being that it's rather annoying.
Also add a `statuses_full` computed field on PRs for the aggregate
status: the existing `statuses` field is just a copy of the commit
statuses which I didn't remember I kept free of the overrides so the
commit statuses could be displayed "as-is" in the backend (the
overrides are displayed separately). And while at it fix the PR
dashboard to use that new field: that was basically the intention but
then I went on to use the "wrong" field hence #433.
Mebbe the UI part should be displayed using a computed M2M (?)
as a table or as tags instead? This m2m could indicate whether the
status is an override or an "intrinsic" status.
Also removed some dead code:
* leftover from the removed tagging feature (removed the tag
manipulation but forgot some of the setup / computations)
* unused local variables
* an empty skipped test case
Fixes#439.
Fixes#433.
Currently when creating *merges* the commit message is the
concatenation of the PR title and the PR body.
However it can be convenient to include description test which would
not be included in the commit message e.g. PR template
stuff. "Thematic breaks" seem like a good way to do this: the commit
message will only include the content preceding the first thematic
break, everything else is ignored (except headings which are not
ignored, double negations FTW).
Might be that that's a bit excessive and we should only ignore what
follows the *last* thematic break. Or that there needs to be a more
specific marker. We'll see.
Fixes#443.
Because the commands were collected in a `dict[Command, Params]` a
command could only be present once *in the mergebot* (the forwardbot
already collected commands in a list).
* Update mergebot commands parser to collect commands in a list.
* Improve override to allow a comma-separated list of CIs to override.
* Simplify the parsing of delegate to generate one delegate command
per target (slightly less efficient if multiple logins are provided
but that is likely extremely rare).
Fixes#445
Because github materialises every labels change in the
timeline (interspersed with comments), the increasing labels churn
contributes to PRs being difficult to read and review.
This change removes the update of labels on PRs, instead the mergebot
will automatically send a comment to created PRs serving as a
notification that the PR was noticed & providing a link to the
mergebot's dashboard for that PR where users should be able to see the
PR state in detail in case they wonder what's what.
Lots of tests had to be edited to:
- remove any check on the labels of the PR
- add checks on the PR dashboard (to ensure that they're at least on
the correct "view")
- add a helper to handle the comment now added to every PR by the 'bot
- since that helper is needed by both mergebot and forwardbot, the
utils modules were unified and moved out of the odoo modules
Probably relevant note: no test was added for the dashboard
ACL, though since I had to explicitly unset the group on the repo used
for tests for things to work it looks to me like it at least excludes
people just fine.
Fixes#419
Convert overridable CI to an m2m from partners, it's significantly
more convenient to manipulate as multiple users can (and likely will)
have access to the same overrides, add a name_search so the override
is easy to find from a partner, and provide a view for the
overrides (with partners as tags).
Also make the repository optional on CI overrides.
Fixes#420
When retrieving unknown PRs, the process would apply all comments,
thereby applying eventual r+ without taking in account their
relationship to a force push. This means it was possible for a
mergebot-unknown PR to be r+'d, updated, retargeted, and the mergetbot
would consider it good to go.
The possible damage would be somewhat limited but still, not great.
Sadly Github simply doesn't provide access to the entire event stream
of the PR, so there is no way to even know whether the PR was updated,
let alone when in relation to comments. Therefore just resync the PR
after having applied comments: we still want to apply the merge method
& al, we just want to reset back to un-approved.
An other minor fix (for something we never actually hit but could):
reviews are treated more or less as comments, but separate at github's
level. The job would apply all comments then all reviews, so the
relative order of comments and reviews would be wrong.
Combine and order comments and reviews so they are applied
in (hopefully) the correct order of their creation / submission.
Closes#416
Historically PRs to disabled branches were treated like PRs to
un-managed branches: ignored.
However because they cay *already exist* when the branch is disabled,
the effects can be subtly different, and problematically so
e.g. ignoring all PR events on PRs targeting disabled branches means
we can't close them anymore, which is less than great.
So don't ignore events on PRs to disabled branches (creation, sync,
closing, and reopening) but also send feedback on PRs to disabled or
un-managed branches to indicate that they're not merge-able.
Fixes#410
If we can't stage a PR, rather than immediately put them in error wait
until they were the first we tried staging, otherwise they might have
been conflicting with the previous batch which ultimately won't be
merged for other reason and they would have worked as part of the next
batch.
Note: this commit will lead to false negatives because it's
batch-based rather than repo-based, so if the first batch only affects
repo A and the second batch only affects repo B, if the second batch
triggers a merge error it should be rejected immediately (as it's
applied on a "pristine" repo B) but this change will just bump it to
the next staging.
fixes#209
On per-repo status configurations, convert the "branch_ids" filter to
a domain on branches. Since the selection is generally
binary (statuses either apply to the master branch or apply to
non-master branch) this avoids error-prone missed updates where we
forget to enable statuses pretty much every time we fork off a new
branch.
Fixes#404
Normally opening a PR against a disabled branch is like opening a PR
against a branch which is not configured at all: the PR id ignored
entirely.
However if the PR already exists then the state of the branch isn't
normally checked when interacting with the branch, and it is possible
to trigger its staging, at which point the staging itself will crash:
on a project the branches are `active_test=False` so they're all
visible in the form, but when repos go search()-ing for the branch
they won't find it and will blow up.
Solution: only try staging on branches which are active. Fixes
odoo/runbot#408. Also do the same for checking stagings.
And while at it, fix#409 by wrapping each checking or staging into a
try/except and a savepoint. This way if a staging blows up it should
move on to the next branch instead of getting stuck.
The "blocked" computation would not take branch targets in account, so
PRs with the same label targeting *different branches* (possible if
somewhat rare due to our naming conventions) could block one another,
despite really being unrelated.
Also fix up some messages:
* if a PR is blocked due to having no merge method, it should say
that, not "has no merge" (no merge what?)
* format un-managed branches as `$repo:$branch` in logging messages,
`$repo#$thing` is for issues / PRs and `$branch` alone can be very
unhelpful
Closes#407
When using @fw-bot close, a feedback would be created without a
message (rather than e.g. with an empty one). As a result, the
feedback-sending cron would crash, but not before having closed the
corresponding PR.
This would lead to closing the PR in a loop & spamming the logs with
tracebacks.
Seems to be a pretty long-standing issue but I'd not noticed it before
as it's rather rarely taken & our sentry remains rather blown to hell,
I only happened to stumble upon the issue in the logs.
There's no ``number`` attribute on the repository object (to which
``_load_pr`` belongs). We obviously want to use the number of the PR
we're currently loading.
This is a regression due to the implementation details of
odoo/runbot#376: previously _parse_command would only yield the
commands it had specifically recognised (from a whitelist).
22e18e752b simplified the implementation
and (for convenience when adding new commands) now passes through any
command to the executor instead of skipping the unknown one.
But I forgot to update the executor to ignore unknown commands, so it
treats them as *failed* (since the success flag doesn't get set) and
assumes it's an ACL issue, so notifies the user that they can't do the
thing they never really asked for.
Add an end-case which skips the feedback bit for unrecognized
commands, which restores the old behavior.
Fixes#390
Currently it can be difficult to know why the mergebot refuses to
merge a PR (not that people care, they generally just keep sending new
commands without checking what the 'bot is telling them, oh well...).
Anyway knowing the CI state is the most complicated bit because the CI
tag only provides a global pass/fail for statuses but not a view of
specific statuses, and sometimes either the runbot or github fails to
notify the mergebot, leading to inconsistent internal states & shit.
By adding a tag per status context per PR, we can more clearly
indicate what's what.
Fixes#389
Adds an `override` mergebot command. The ability to override is set on
an individual per-context per-repository basis, similar to but
independent from review rights. That is, a given individual may be
able to override the status X on repository A and unable to do so on
repository B.
Overrides are stored in the same format as regular statuses, but
independent from them in order to persist them across builds.
Only PR statuses can be overridden, statuses which are overridable on
PRs would simply not be required on stagings.
An alternative to implementing this feature in the mergebot would be
to add it to individual status-generating tools on a per-need
basis.
Pros of that alternative:
* display the correct status on PRs, currently the PR will be failing
status-wise (on github) but correct as far as the mergebot is
concerned
* remove complexity from the mergebot
Cons of that alternative:
* each status-generating tool would have to implement some sort of ACL
system
* each status-generating tool would have to receive & parse PR
comments
* each status-generating tool would have to maintain per-pr state in
order to track overrides
Some sort of helper library / framework ought make that rather easy
though. It could also be linked into the central provisioning system
thing.
Closes#376
Requirement for odoo/runbot#376: one can't expect there being someone
to override CI checks on stagings, so it only makes sense for checks
on PRs, which in turns requires that there could be checks only
required on PRs.
Could also be useful for features like incremental linting /
formatting, we may want to apply checks on PRs which filter on the
lines modified, but not require the entire software be reformatted at
once.
Having the required statuses be a mere list of contexts has become a
bit too limiting for our needs as it doesn't allow e.g. adding new
required statuses on only some branches of a repository (e.g. only
master), nor does it allow putting checks on only branches, or only
stagings, which would be useful for overridable checks and the like,
or for checks which only make sense linked to a specific revision
range (e.g. "incremental" linting which would only check whatever's
been modified in a PR).
Split the required statuses into a separate set of objects, any of
which can be separately marked as applying only to some branches (no
branch = all branches).
Fixes#382
When it updated tagging e82de3136b also
incorrectly replaced a `pr` by `pr.display_name`, probably leftover
from an attempt to update a callsite from `str(pr)` to
`pr.display_name` which I missed when reverting that.
Anyway at that section, `pr` is an integer (as it comes from an SQL
query) not an object.
The logic of the partner merge wizard is to collect all relevant data
from source partners, write them to a destination partner, then remove
the sources.
This... doesn't work when the field in question has a UNIQUE
constraint (like github_login), because it's going to copy the value
from a source onto a dest which will blow the constraint, and so the
copy fails. In that case the user first has to *move over* the unique
field's value then they can use the wizard.
Just fix for the github login: take all sources, remove (and store)
their github logins, then write the login onto the dst.
An alternative would have been to *defer* the constraint, however:
* it only works on unique constraints, not unique indexes
* it requires the constraint to be declared DEFERRABLE
Closes#301
Up till now if an FF failed with an exception having neither cause nor
context the cancel reason would be an empty string.
Fallback on stringifying the exception itself as a last resort.
Genericise runbot_merge's tagging (move states to the "UI" but only
store / manage actual tags), and remove forwardport.tagging as it's
now redundant.
Closes#232
approving a PR which failed CI should trigger a feedback message since
6cb58a322d (#158), the code has not been
removed and the tests still pass.
However fwbot r+ would go through its own process for r+ which would
explain why that feedback is sometimes gone / lost (cf #327 and #336).
* make fwbot r+ delegate to mergebot r+
* add dedicated logging for this operation to better analyze
post-mortem
* automatically ping the reviewer to specifically tell them they're idiots
* move the feedback item out of the state change bit, send it even if
it's a useless r+ (because it's already r+'d)
* add a test for forward-ports
Closes#327, closes#336
Remove original-signed-off-by, doesn't actually seem useful given the
semantics of signed-off-by according to the kernel doc'. Plus it
didn't actually work as the intent was to keep the signoff of the
original PR in the forward-port, but that signoff is not part of the
commit we're cherrypicking (it gets added on the fly when the commit
is merged).
Therefore explicitly get the ack-chain into the PR: when merging an FP
PR, try to integrate the signoff of the original PR, that of the final
FP pr, and while at it that of the last explicit update in the commit
chain (e.g. in case there's been a conflict or something).
Fixes#284
* only provide fields which make sense for the mergebot
* provide formatting & searchability for review rights records so
they're visible from the list directly
This is more of a sanity check as it normally should not be a factor:
labels generally contain the target name, and staging checks are
performed per-target so we're not mixing multiple targets anyway.
But let's say a third-party creates a fix-foo branch for A and a
fix-foo branch for B, we want to ensure they're not considered batched
together.
Rather than try to fix up various bits where we search & all and
wonder what index we should be using, make the column a CIText.
For mergebot the main use case would be properly handling
delegate=XXX: currently if XXX is not a case-sensitive match we're
going to create a new partner with the new github login and
give *them* delegation, and the intended target of the delegation
isn't going to work correctly.
Also try to install the citext extension if it's not in the database,
and run the database-creation process with `check=True` so if that
fails we properly bubble up the error and don't try to run tests on a
corrupted / broken DB.
Fixes#318
As the odds of having more projects or more repos with different
requirements in the same project, the need to have different sets of
reviewers for different repositories increases.
As a result, rather than be trivial boolean flags the review info
should probably depend on the user / partner and the repo. Turns out
the permission checks had already been extracted into their own
function so most of the mess comes from testing utilities which went
and configured their review rights as needed.
Incidentally it might be that the test suite could just use something
like a sequence of commoditized accounts which get configured as
needed and not even looked at unless they're used.
Before this change, `r-` on a pr[p=0] does essentially nothing. At
most it will unstage if the PR had been (somewhat unnecessarily) r+'d
in the past but then the PR will get re-staged immediately.
To avoid this odd behaviour, if r- is sent to a p=0 PR not only is
the PR unreviewed (if it was reviewed) it always gets unstaged, and
its priority gets reset to 1 (high priority but doesn't bypass CI and
review). Also send a comment on that subject so followers of the pr
are notified.
Fixes#313
The staging cron was already essentially split between "check if one
of the stagings is successful (and merge it)" and "check if we should
create a staging" as these were two separate loops in the cron.
But it might be useful to disable these two operations separately
e.g. we might want to stop the creation of new staging but let the
existing stagings complete.
The actual splitting is easy but it turns out a bunch of tests were
"optimised" to only run the merge cron. Most of them didn't blow up
but it seems more prudent to fix them all.
fixesodoo/runbot#310
The PR creation had been fixed to always validate even without a
commit found (in case there was no need for a commit), but the update
of a PR in such a situation was not tested, and thus naturally did not
work because why would it work if it wasn't tested?
Also remove the conditional skip on updating a PR to a new head.
The test was checking things would work properly with
required_statuses being an empty string, because I'd also forgotten an
empty field becomes stored as `False` in the database, so trying
things out live neither the PRs nor the staging would work as their
assumption that they could straight split the required_statuses would
always fail.
Update the test to better match expectations, and hopefully this is
the end of that saga.
PRs transitioning to 'ready' had been checked and tested but turns out
I had completely forgotten to test that stagings would validate
properly therefore of course they didn't.
The issue here was I'd forgotten `''.split(',')` returns `['']` rather
than `[]`, so on an empty required_statuses the staging validator
would keep looking for a status matching the context `''` and would
never find it, keeping the staging pending until timeout. So most
likely the problem could have been resolved by just adding a condition
to
[r.strip() for r in repomap[c.sha].required_statuses.split(',')]
but I'd already done all the rest of the reorganisation by that point,
test pass and I think it's a somewhat better logic. Therefore I'll go
with that for now.
* properly handle empty required_statuses during staging validation
* remove the final postcondition, if we're missing commits which don't
require any statuse we should not care
* expand test to include up to merging PRs
* automatically create dummy commits when creating stagings, that way
the relevant commits are in the database (can't hurt)
PS: an other alternative would have been to filter out or skip ahead
on commits which don't require any statuses aka cmap &
required_statuse / cmap would not even have that entry
Refactor the selection thingie, hopefully in a way which doesn't
absolutely crater performances, so that it's possible to explain the
reason why a PR is considered blocked.
Despite the existing dedup' sometimes the "xxx failed on this
forward-port PR" would still get multiplicated due to split builds
e.g. in odoo/odoo#43935 4 such messages appear within ~5 minutes, then
one more 10mn later.
This is despite all of them having the same "build" (target_url) and
status (failure). Since the description is the only thing that's not
logged I assume that's the field which varies and makes the dedup'
fail. Therefore:
* add the description to the logging (when getting a status ping)
* exclude the description when checking if a new status should be
taken in account or ignored: the build (and thus url) should change
on rebuild
Hopefully fixes#281
A while back I implemented name_get/display_name to print PRs using
the canonical github format (owner/repo#number), however looks like
some of the logging calls were still using bespoke formatting.
Interaction of CacheMiss and BaseModel is fucked, leading to an
infinite loop when trying to provide useful __str__ on a model (by
accessing model fields).
Using `modified` seems safer than just blowing the cache with respect
to stored computed fields depending on PR state (not sure there are
any but it's likely).
Previous version incorrectly browsed the PR *number* (rather than ID)
so at best it would do nothing and at worst it might go and notify the
wrong PR entirely.
Discussing #238 with @odony, the main concern was the difficulty of
understanding if things merged in one repo were related to things
merged in an other repo: currently, knowing this requires going to the
merged PR, getting its label, and checking the PRs with the same HEAD
in the other repository to see if there's a correlation (e.g. PRs
merged around the same time).
The current structure of the mergebot makes it reasonably easy to add
the other PRs of the batch in the pseudo-headers, such that we get
links to all "related" PRs in the head commit (and links back from the
commits which is probably less useful but...)
Fixes#238
1. if we try to stage a PR and realize we'd stored / checked the wrong
head, cancel the staging and notify the PR
2. provide a command to forcefully update pr heads (or at least check
that a PR's head is up to date)
Closes#241
When closing a PR, github completely separates the events "close the
PR" and "comment on the PR" (even when using "comment and close" in
the UI, a feature which isn't even available in the API). It doesn't
aggregate the notifications either, so users following the PR for
one reason or another get 2 notifications / mails every time a PR
gets merged, which is a lot of traffic, even more so with
forward-ported PRs multiplying the amount of PRs users are involved
in.
The comment on top of the closure itself is useful though: it allows
tracking exactly where and how the PR was merged from the PR, this
information should not be lost.
While more involved than a simple comment, *deployments* seem like
a suitable solution: they allow providing links as permanent
information / metadata on the PRs, and apparently don't trigger
notifications to users.
Therefore, modify the "close" method so it doesn't do
"comment-and-close", and provide a way to close PRs with non-comment
feedback: when the feedback's message is structured (parsable as
json) assume it's intended as deployment-bound notifications.
TODO: maybe add more keys to the feedback event payload, though in my
tests (odoo/runbot#222) none of the deployment metadata
outside of "environment" and "target_url" is listed on the PR
UI
Fixes#224
It's a waste to lose the entire staging if it's only a short blip /
delay thing, so retry multiple times. Add utility function to make
backoff functions easier (though the UI is not great ATM).
Also log the "left" parent of a merge commit (which should be the
"base") when creating it, for additional post-mortem information.
Turns out not only can that operation fail, that operation can succeed
but have its effect delayed. To try and guard against that,
immediately check that we get the correct ref' after having reset it.
This is the cause of the November 6 mess: when preparing a staging,
the mergebot does the following,
1. get the head of <branch>
2. hard-reset tmp.<branch> to that
3. start merging PRs, which requires getting the current state of
tmp.<branch> back
On the 6ths, these steps looked like this
```text
2019-11-06 10:03:21,588 head(odoo/odoo, master) -> ab6d0c38512e4944458b0b6f80f38d6c26b6b597
2019-11-06 10:03:22,375 set_ref(update, odoo/odoo, tmp.master, ab6d0c38512e4944458b0b6f80f38d6c26b6b597 -> 200 (OK)
2019-11-06 10:03:28,674 head(odoo/odoo, tmp.master) -> de2a852e7cc1f390e50190cfc497bc253687fba8
2019-11-06 10:03:30,292 head(odoo/odoo, tmp.master) -> de2a852e7cc1f390e50190cfc497bc253687fba8
```
So the 'bot fetched the commit at the head of master (ab6d0c), reset
tmp.master to that... and then got a different commit when it fetched
the tmp head to stage a PR on it.
That different head being of course a previous rejected staging. When
the new staging succeeded, it brought the entire thing in and made a
mess.
This was compounded by an issue I still have to investigate: the
staging of the new PR took the wrong base commit *but the right base
tree*, as a result the first thing it did was *reverse the entire
previous commit* (without that we could probably have left it as-is
rather than need to force-push master -- twice).
When posting a reminder that there are open / waiting forward ports on
a source PR, also post *which* PRs those are.
While at it, move the cron code in a proper python file (so we can use
stuff from odoo.tools), and fix display_name so we can straight use
display_name as a github ref' ({owner}/{repo}#{number}). This impacts
log-grepping but it seems like an improvement nonetheless.
Closesodoo/runbot#228
* shorten the postfix, forwardbot is now a bigram!
* shorten the uniquifier: go from 5 to 3 bytes, and use urlsafe base64
that way we only have a 4-char uniquifier instead of 8
* while at it, fix deprecated calls to logging.warn (should be
logging.warning)
Fixes#226
Attempt to avoid some of the comment spam by dedup-ing input (only
signaling when the status actually changes and ignoring identity
transformations) and in case of failing CI keeping the last failed
status and not signaling on the next update if it's the same failure.
Closes#225
The staging validation routine would ignore stagings which were
cancelled or ff_failed, but it should also have ignored failed and
successful aka all terminal state.
Simplify the condition for that: just ignore a staging's validation if
the staging is not pending.
Closes#211
Turns out we don't want to close the cursor on success, we just want to
commit, but that's not what the default context manager does.
So don't use said context manager.
If a _validate call blows up, the entire Commit._notify cron gets
stuck, which is an issue because not only does it stop creating
forward ports, it also stops "progressing" stagings.
If the CI is greatly backed up (either insufficient capacity or jobs
spike) a timeout which is normally perfectly fine might be
insufficient e.g. given a 2h timeout, if a job normally takes 80mn but
the staging's job starts 40mn after the staging was actually created
we're sunk. And cancelling the staging once the job has finally gotten
started is not going to improve load on the CI, it just wastes a CI
slot.
Therefore assume a `pending` event denotes the actual start of the job
on the CI, and reset the timeout to start from that moment so
ci_timeout is the timeout of the CI job itself, not of the staging
having been created.
Closes#202
Having all the feedback be sent by the mergebot user (github_token) is
confusing. Add a way to specify which field of project should be used to
source the token used when sending feedback.
Fixes#190
* Cherrypicking is handrolled because there seems to be no easy way to
programmatically edit commit messages during the cherrypicking
sequence: `-n` basically squashes all commits and `-e` invokes a
subprocess. `-e` with `VISUAL=false` kinda sorta works (in that it
interrupts the process before each commit), however there doesn't
seem to be clean status codes so it's difficult to know if the
cherrypick failed or if it's just waiting for a commit of this step.
Instead, cherrypick commits individually then edit / rewrite their
commit messages:
* add a reference to the original commit
* convert signed-off-by to something else as the original commit was
signed off but not necessarily this one
* Can't assign users when creating PRs: only repository collaborators
or people who commented on the issue / PR (which we're in the
process of creating) can be assigned.
PR authors are as likely to be collaborators as not, and we can have
non-collaborator reviewers. So pinging via a regular comment seems
less fraught as a way to notify users.
Prepares for more complex edition operations on the forwardbot side
* split out the pseudo-headers from the message body
* don't separate the co-authored-by headers from the others, seems
unnecessary, we just need to ensure they're at the end so github
doesn't miss them (/it)
* split action_cancel (UI button) from cancel (internal): since the
xhr mapping is weird, if there are available args the mapper thinks
it should pass the call context as reason which is unexpected
* make cancel a no-op when called on already inactive stagings
* make cancel work when called on multiple statgings
* make computing the active staging work properly in an
active_test=False context (e.g. when it's interacted with from the
form view because that comes from the list view which is
active_test=False, probably so we can see not just the stagings but
recursively see deactivated batches in deactivated stagings)
* don't show the cancel button on inactive stagings
Stagings have a "statuses" field which was shown but useless (as it's
a binary), they also have a "heads" field which only provides a
mapping of repository names to commits.
This change provides the staging heads as a commits m2m.
Fixes#178
* extract method to create a PR object from a github result (from the
PR endpoint)
* move some of the remote's fixtures to a global conftest (so they can
be reused in the forwardbot)
In case of error while fast-forwarding a staging to its source, we'd
log the target to which we couldn't FF. Sadly this relied on a
`repo_name` variable which (likely since the introduction of the
"safety dance" fast forwarding) can not actually be set in case of
failure.
So stash the relevant bit (the repo name) inside the FF error exception
and use that to compose our logging message instead of a variable which
can only be None.
Github constrains a single issue (/PR) number per repository, having
different targets does not allow two PRs to share a number.
Doesn't fix but should mitigate #169 slightly.
Before this change mergebot assumes github's tags are in sync with its
"previous" state, but because tags update was highly non-atomic (one
call per removal plus one for additions) and state can further change
between a failure and an update retry (especially as the labels endpoint
fails *a lot*), it's possible for set tags (in github) to be completely
desync'd from the mergebot state, leading to very misleading on-pr
indications.
This first fetches the current tagstate from github (to not lose non-
mergebot tags) then (hopefully atomically) resets all tags tags based on
the current mergebot state. This should avoid desyncs, and eventually
resync PRs (if they change state).
Fixes#170
On a PR being updated, closed or unreviewed, if it is part of an
active staging that staging would get cancelled (yay). However, if the
PR was part of a pending *split*, then the split would *not* get
cancelled / updated (to remove the PR from it), and the PR could go on
to get staged as if everything were right in the world which is an
issue.
It doesn't look like it actually happened (at least I got no echo of
it), but it almost did at least once.
fixes#160
Also add test for it & feedback of an approved PR failing CI, and fix
corner case with it (might not send a warning immediately on CI failure
depending on status requirement ordering).
Fixes#158
* when rebasing, store a map of rebased to source, that way it'll be
possible to link cherry-picked forward ports to the originally
integrated commit rather than just the one from the PR (which was
likely not itself integrated as the straight merge mode is somewhat
rare: as of 5600 PRs merged so far only 100 were straight merged)
* while at it, store the "merge head" of the PR (whether squashed,
merged or rebased) and put *that* in the commit message
fixes#161
Sometimes people add co-authored-by lines in the middle of their
message, where github ignores them.
Since we previously added properly handling existing (correct) C-A-B
lines in the case where we're adding fixes and signed-off-by, we might
as well fix-up existing but mispalced co-authored-by lines.
Fixes#107
The race condition which prompted STAGING_SLEEP rears its ugly head
again: when pushing a base repo and its dependents, it's possible for
the update to the base repo's new head to take much longer to be visible
than the dependents (or so it seems?).
In this case, CI might pick up the correct dependent but pick an older /
incorrect revision of the base, leading to a staging failing for no good
reason.
This change uses info/refs to check for the updated staging head to be
visible at the repo level after it's been set / updated via the API. It
assumes repos are in topological order.
Use the proper / actual "is there any stageable PR" query to check if
a PR is blocked as well, that way they shoudn't be diverging all the
time even if it might make PR.blocked a bit more expensive.
fixes#111
Will comment any time a statuses update folds to a CI failure on a
reviewed pull request. Might be somewhat spammy, we'll see.
No notification if the PR is not reviewed yet.
fixes#87
Before this, impacting a commit's statuses on the relevant PR or
staging would be performed immediatly / inline with its
consumption. This, however, is problematic if we want to implement
additional processing like #87 (and possibly though probably not #52):
webhook handlers should be kept short and fast, feeding back into
github would not be acceptable.
- flag commits as needing processing instead of processing them
immediately, this uses a partial index as it looks like the
recommended / proper way to index a boolean column in which one of
the values is searched much more than the other (todo: eventually
check if that actually does anythnig)
- add a new cron for commits processing
- alter tests so they use this new cron (mostly by migrating them to
`run_crons` though not solely as some still need more detailed
management to properly check intermediate steps)
Fix an issue with closing a staged PR while at it (the "merging" tag
would potentially never be removed).
Proper RFC5322 makes for much noisier messages, and seems completely
unnecessary as examples of sign-off on the internet don't quote spaces
/ names.
closes#102
* split out truly awaiting PRs from those waiting on an event of some
sort
* if a staging is active but doesn't have a state yet, it should be
considered pending not cancelled
closes#74
If a PR gets sync'd to a known-valid commit, it should be marked as
valid rather than get in this weird state where it's merely open but
github knows it passes CI.
Fixes#72
This is somewhat less useful with runbot's fail-fast as a runbot
failure (false positive or not) will now very quickly trigger an end
to the current staging.
Still, could be of use.
closes#89
Turns out skipping locks is not very useful when there are no locks
being held because we only touch the PRs *after* the merge has been
applied.
So finally do that, lock all of a staging's PRs before we try to
fast-forward the relevant repositories, so a close command coming back
from github (from having seen the closes #xxx annotation) doesn't
screw us over.
No test because I don't understand how / why it's triggered, it's just
that some PRs don't have a label. I assumed the issue occurred when
the source branch or even repo (cross-repo PR) was deleted, but it
doesn't seem to trigger the issue (or in any case not in as short a
time as a test, maybe GH eventually does some vacuuming which causes
the issue?
Anyway we may eventually want to reclaim these PRs (allowing a lack of
label and treating them like the patch-\d labels: with no semantic
value) however the simplest thing to do for now is to just ignore the
corresponding PR.
closes#101
Github is subject to a fair amount of transient failures, which are
currently ill-logged: an exception is raised and the caller /
responsible might eventually log something, but it's not really
formalised and centralised, and is thus inconvenient to try and
post-mortem issues with github's support.
Change this such that *almost* all github API calls get extensively
logged (status, reason, all headers, body) on failure.
Also automatically sets debug logging for odoo in local tests, and
alter the fake response constructor thing so it doesn't set a json
mimetype when the body is not valid json.
Closes#98
When a user changes the merge method via github messages, no feedback is
sent. This could lead to strange behavior, for example when a user try
to joke with the mergebot like this:
> robodoo are you goin ti merge my PR rogntudju !
This sets the merge method to "merge" and the user is not aware of it.
Before this change, the order of PRs to list in an "unready" feedback
message was whatever the DB returned which could vary. This change
fixes the order by applying model order.
If a transient github failure makes the integration fail but also
makes the following reset fail the entire staging process would be
cancelled (and operations so far rollbacked) except for the failure
comment which would be effected, as in odoo/odoo#26380.
By pushing the comment to the feedback queue, if the reset fails the
comment is rollbacked and "unqueued".
Hopefully this finally fixes the double commenting
issue (e.g. odoo/odoo#28160). This seems (according to reading the
logs and also logic) like a failure (concurrency of some sort) leading
to a transaction rollback *after* the GH API call, so the cron's DB
actions are rollbacked (then performed again on the next run) *but*
the GH API interactions are not rollbacked, and are thus performed
twice.
Since the entire purpose of the feedback table is to send comments,
send both "staging failed" and "Linked pull requests" comments via
that.
Once more unto the breach, with the issue of pushing stagings (with
"closes" annotations) to the target branch making GH close the PR &
send the hook, which makes runbot_merge consider the PR closed and the
staging cancelled.
This probably still doesn't fix the issue, but it reduces the
problematic window: before this, the process first updates the
branches, then marks the PRs, then comments & closes the PRs, and
finally commits the PR update.
This means as runbot_merge is sending a comment & a status update to
each PR in a staging, GH has some time to send the "closed" webhook
behind its back, making the controller immediately cancel the current
staging, especially if the v3 endpoint is a bit slow.
By moving the commenting & closing out of the critical path (to the
feedback queue), this window should be significantly shortened.
Normally, two PRs from the same owner with the same branch
name (source) are batched together, and that's represented by batching
them by label e.g. two PRs labelled
`odoo-dev:12.0-snailmail-address-format-anp` means they probably
should be merged together somehow.
*However* this is an issue when editing via the web interface: if the
editor doesn't have write access to the repository, github
automatically createa a branch called `patch-<n>` under their
ownership, where `<n>` is a sequence (of sorts?) *within the user's
fork*.
This means it's possible (and indeed easy) to create <foo>:patch-1 on
different (non-forks) but related repositories without them actually
being co-dependent, at which point they get blocked on one
another, which can lead to them being blocked (period) due to runbot
not currently handling co-dependencies between PRs.
Currently, if a staging is ongoing or failed one has to hunt for the
staging branches on the runbot dashboard in order to find out what
happens.
This adds a dropdown to the staging box/block providing direct status
and access to all the CI information whether the CI is ongoing or done,
successful or not.
Apparently the split and move arounds caused the _tagstate to get
computed/updated earlier (/differently), and thus the tagging update
relying on it to... not work anymore.
At least restore adding a "merging" tag when a PR is staged.
rebase() can fail after merge(), during set_ref(), having already
updated the target.
Under the pre-rebase model, stage() assumed on a staging failure on a
given repo it only had to rollback stagings having succeeded. This
assumption fails in a post-rebase model as even a failed staging can
have modified the target, leading to the next staging (if multiple
batches are ready) containing the failed one.
Things can get really strange if the set_ref failure was (as it
probably is) some sort of transient failure, as the following staging
will likely succeed (under the assumption that most PRs/batches pass
staging) as PR1's content gets merged as part of PR2, and PR2 is
merged empty of content later on.
Original issue (staging would get cancelled just as it was being
merged) was not really fixed but traded for a new one: serialization
errors which can lock up the mergebot for a long time, stopping
handling of all incoming signals (possibly/probably because all of
them try to write on the PR which is locked?).
Splitting the tagging cron out should already way improve things as
the status update cron should be way shorter (and thus hold its locks
for a smaller amount of time). This should also avoid the "close"
handler waiting on the extant transaction, and make the "pr update"
transaction be much shorter as each staging gets its own tnx.
Send reponse comments when users mis-interact with robodoo e.g.
send comments they don't have the right to, or commands which don't make
sense in the PR's state, or tentative interactions with robodoo from
unmanaged PRs.
People get surprised/worried that their ready PR never gets picked up,
but it's because there is a non-ready (either unreviewed or failing CI)
pull request linked to it. This aims to at least warn them of the issue.
As noted in the old comment, the provided commit was the head of the
staging branch. This confused many users as they'd click the link
expecting to see their commit and potentially got something completely
unrelated.
Since we already get backlinks from the commit through the "closes
<pr_number>" added either by the committer or by the bot itself, the
information is already available.
They're completely independent (or should be), and there's no reason
for the tagging cron to extend the "lifetime" of the main cron's
transaction (and thus extend the odds of racey behaviour).
If a staging covers multiple repositories and there's a fast-forward
issue on any but the first repo/target, runbot_merge attempted to
revert the commits it had fast-forwarded on the previous repos.
This doesn't work when branch-protection is active, unless runbot_merge
is a repository administrator (and branch protection is not configured
to apply to those): reverting is done by push-forcing the original head
back onto the ref, which branch-protection unconditionally precludes.
This commit does not entirely fix the race condition (it does not look
like github provides any way to do that), but it should significantly
reduce the race-condition window as it performs a semi-wet run of the
fast-forward process on the tmp branches before actually updating the
targets. That way the only remaining breakage should be when somebody
pushes on repositories 1.. between the test-FF on tmp branches and the
actual fast forward.
While at it, updated the github API thing to *always* dump the JSON body
on an error response, if the content-type is json.
* [ADD] runbot_merge: more informative states to stagings on error
Currently, when a staging fails for other reasons than a CI failure:
* the staging having been cancelled is known implicitly, because the
staging will be deactivated but will never get a status beyond
pending (because it's not found when looking for it since it's not
`active`)
* the fast-forward having failed is completely silent (logging aside),
it looks for all the world like the staging succeeded
Timeout fails the PR already, but split-on-timeout was not so fix that
one bit.
* [FIX] odoo/odoo#cb2862ad2a60ff4ce66c14e7af2548fdf6fc5961
Closes#41
I just spent 10mn trying to find out why staging 28 was cancelled
(a p=0 comment). Add a common prefix to all staging cancels to make
them easier to find.
staging delay was mistakenly commented in
bb664455ec
Also modified testing fixtures so the staging delay is not enabled when
running tests locally: on my box it increases the local runtime from
~70s to ~1500s (20s/staging, ~1 staging/test, 73 tests)
Before this, the bot would only acknowledge commands of the form
<botname>: <commands>
but since the bot is an actual user, people regularly use `@<botname>`
as it seems like it should work *and* provides for autocompletion.
Support that, as well as the octothorpe in case users want to pound
robodoo.
Related to odoo/runbot#38
Continuation of fa94b269de which is
apparently not sufficient:
1. log the staging event so we can check that we're staging in the
correct order
2. add a delay after each staging in case there's some sort of race
in the updating of codependent repositories
When creating staging branches from tmp, use the iteration order of
the repos in the project (that way it's easy to see and eventually
configure if we add sequences or whatever, in the short term it's the
order in which the repos were added which is the one we want).
This ensures we stage odoo/odoo before we stage odoo/enterprise
without relying on dict order of iteration, or needing meta to be an
OrderedDict.
The issue is if stagings are created/updated the other way around, the
runbot may pick up staging on odoo/enterprise before odoo/odoo has
been updated, and thus build odoo/enterprise with the wrong odoo/odoo
commit, and defeat the entire point of it.
Example: http://runbot.odoo.com/runbot/build/376112 was triggered by
the same staging as http://runbot.odoo.com/runbot/build/376113, but
used the previous staging head.
The creation order of tmp branches should not matter so ignore it.
A limitation to 50 commits PRs was put in place to avoid rebasing
huge PRs (as a rebase means 1 merge + 1 commit *per source commit*),
however the way it was done would also limit regular merges, and the
way the limitation was implemented was not clear.
* explicitly check that limit in the rebase case
* make error message on PR sizes (rebase 50 or merge 250) clearer
* remove limit from commits-fetching (check it beforehand)
* add a test to merge >50 commits PRs
* fix the local implementation of pulls/:number/commits to properly
paginate
a0063f9df0 slightly improved the error
message on non-PR ci failure (e.g. a community PR makes enterprise
break) by adding the failed commit, but that's still not exactly clear,
even for technical users (plus it requires having access to all the
repos which is not the case for everyone).
This commit further improves the situation by storing the target_url
and description fields of the commit statuses, and printing out the
target_url on failure if it's present.
That way the PR comment denoting build failure should now have a link to
the relevant failed build on runbot, as that's the target_url it
provides.
The change is nontrivial as it tries to be compatible with both old and
new statuses storage format, such that there is no migration to perform.
e98a8caffb added dummy commits to the
heads of stagings and fixed most places to make a difference between
the staging head (including dummy commit) and the actual merge head,
but the difference was missed in the comment closing a PR, which was
still using the staging head and thus pointing to the dummy commit
e.g. (https://github.com/odoo/odoo/pull/26821#issuecomment-420244592)