Previously the mergebot would only sync the head commit, but synching
more is useful.
Also update the final sanity check on staging:
- as with check, update the message & target branch
- reset PR state and post a message when updating message instead of
doing so silently
Note: maybe only fail the staging if the message is updated *and*
relevant to staging (aka there's a merge method and it's not
`rebase`)?
Fixes#680
Currently, if a PR forward-port PR gets detached the reason for it is
not always obvious, and may have to be hunted in the logs or in
"sibling" PRs.
By writing a forward port reason (hopefully) ever time we detach a PR,
and displaying that reason in the form and dashboard, the
justification should be a lot more obvious.
Fixes#679
When inserting a new branch, if there are extant forward ports
sequences spanning over the insertion, new forward ports must be
created for the new branch.
This was the case, *however* one of the cases was handled incorrectly:
PR batches (PRs to different repositories staged together) must be
forward-ported as batches. However when creating the "insert" batches,
these PRs would be split apart, leading to either inconsistent states
or an inability to merge the new forward ports (if they are
co-dependent which is a common case). This is because the handler
would look for all the relevant PRs to forward ports, but it would
fail to group them by label thereafter.
Fix that, create the `insert` batches correctly, and augment the
relevant test with an additional repository to test this case.
No gh issue was created, but this problem was reported on
odoo/odoo#110029 and odoo/enterprise#35838 (they were re-linked by
hand in the runbot and mergebot, but on github the labels remain
visibly different, one having the slug ZCwE while the other
hasO7Pr).
It's possible that the problem had occurred previously and not been
noticed (or reported), though it's also possible this issue had never
occurred before: for the latest freeze (16.1) there were 18 forward
ports spanning the insertion, but of them only 2 were the same batch.
It's almost certainly not useful, save as a minor convenience for
tests: decorrelating the branch sequence and the fp sequence seems
like it would only be extremely confusing, and on the mergebot all the
fp_sequence values are set to the default while the sequence values
are set to something useful and sensible (kinda).
Fixes#584
After review, there doesn't seem to be a single integer field created
by the mergebot or fortwardbot modules for which a `group_operator`
makes sense, let alone the default of `sum`.
So just disable them all.
Fixes#674
In case where the last branch (before the branch being frozen) is
disabled, the forwardport inserter screws up, and fails to correctly
create the intermediate forwardports from the new branch.
Also when disabling a branch, if there are FW PRs which target that
branch and have not been forward-ported further, automatically
forward-port them as if the branch had been disabled when they were
created, this should limit data loss and confusion.
Also change the message set on PRs when disabling a branch: because of
user conflicts in test setup, the message about a branch being
disabled would close the PRs, which would then orphan the followup,
leading to unexpected / inconsistent behaviour.
Fixes#665
Since the forwardport bot works off of PRs, when it was created
leveraging the magic refs of github (refs/pull/*/head) seemed
sensible: when updating the cache repo, the magic refs would be
updated alongside and the forward-porting would have all the commits
available.
Turns out there are a few issues with this:
- the magic refs have become a bit unreliable, so we need a fallback
(b1c16fce8768080d30430f4e6d3788b60ce13de7)
- the forwardport bot only needs the commits on a transient basis, but
the magic refs live forever and diverge from all other content
(since we rarely `merge` PRs), for a large repo with lots of PRs
this leads to a significant inflation in size of repo (both on-disk
and objects count) e.g. odoo/odoo has about 25% more objects
with the pull refs included (3486550 to 4395319) and takes nearly
50% more space on disk (1.21G to 1.77)
As a result, we can just stop configuring the pull refs entirely, and
instead fetch the heads (~refs) we need as we need them. This can be a
touch more expensive at times as it requires two `fetch` calls, and
we'll have a few redundant calls as every forward port of a chain will
require a fetch of the root's head, *however* it will avoid retrieving
PR data for e.g. prs to master, as they don't get forward-ported, they
also won't get parasite updates from PRs being ignored or eventually
closed.
Eventually this could be optimised further by
- performing an update of the cache repo at the start of the cron iff
there are batches to port
- creating a temp clone for the batch
- fetching the heads of all PRs to forward port in the temp clone in a
single `fetch`
- performing all the ports by cloning the temp clone (and not
`fetch`-ing into those)
- then cleaning up the temp clone after having cleaned up individual
forward port clones
Small followup for #489
The current system makes / lets GC run during fetching. This has a few
issues:
- the autogc consumes resources during the forward-porting
process (not that it's hugely urgent but it seems unnecessary)
- the autogc commonly fails due to the combination of large repository
(odoo/odoo) and low memory limits (hardmem for odoo, which get
translated into soft ulimits)
As a result, the garbage collection of the repository sometimes stops
entirely, leading to an increase in repository size and a decrease in
performances.
To mitigate this issue, disable the automagic gc and maintenance
during normal operation, and instead add a weekly cron which runs an
aggressive GC with memory limits disabled (as far as they can get, if
the limits are imposed externally there's nothing to be done).
The maintenance is implemented using a full lockout of the
forward-port cron and an in-place GC rather than a copy/gc/swap, as
doing this maintenance at the small hours of the week-end (sat-sun
night) seems like a non-issue: currently an aggressive GC of odoo/odoo
(using the default aggressive options) takes a total of 2:30 wallclock
(5h user) on a fairly elderly machine (it's closer to 20mn wallclock
and 2h user on my local machine, also turns out the cache repos are
kinda badly configured leading to ~30% more objects than necessary
which doesn't help).
For the record, a fresh checkout of odoo/odoo right now yields:
| Overall repository size | |
| * Commits | |
| * Count | 199 k |
| * Total size | 102 MiB |
| * Trees | |
| * Count | 1.60 M |
| * Total size | 2.67 GiB |
| * Total tree entries | 74.1 M |
| * Blobs | |
| * Count | 1.69 M |
| * Total size | 72.4 GiB |
If this still proves insufficient, a further option would be to deploy
a "generational repacking" strategy:
https://gitlab.com/gitlab-org/gitaly/-/issues/2861 (though apparently
it's not yet been implemented / deployed on gitlab so...).
But for now we'll see how it shakes out.
Close#489
- avoid pinging the author of the fw PR (which is the forward-bot
itself)
- instead ping the author and reviewer of the source, and possibly the
reviewer of the PR if any
- might also be a good idea to ping reviewers of intermediate PRs?
Github can fail to create the magic refs on PRs
(`pull/refs/?/head`). Since forwardport relies on these refs to fetch
PR content this is an issue when it occurs, as the forward ports fail
in a loop.
After discussion with Github support, it turns out Github enabled
`allowReachableSHA1InWant` a while back, meaning it's possible to
fetch content by commit (rather than ref) as long as the content is
"in network".
Use this property as fallback when checking if we can see the PR head
before forward porting.
Also:
- remove explicit configuration of GC during fetch, it doesn't disable
the autogc (yet?) but that's likely going to happen anyway
- update logging and logger hierarchy during forward port to make
things clearer and easier to extract, although based on PR id rather
than number
- rate limit failing forward ports to avoid running them on every cron
(~ every minute), run them every ~30mn instead, this provides higher
odds of recovery with less log garbage in case of transient github
failure, and if the PR is stuck it limits the log pollution
Fixes#658
If no stream data was captured (no stderr and no stdout), would just
log
git call error:
as error, with no further information.
Don't do that if we have neither stderr nor stdout data, since we're
re-raising the exception anyway, it's just confusing.
- if stderr was empty or had been redirected to stdout, no useful
information would be show, making debugging more complicated
- the fallback is the error itself, but since it's reraised odds are
pretty high the caller will eventually log the error itself, so
it's redundant
=> fallback to stdout if stderr is empty, and only log if either is
non-empty
Get mergebot updates from since the runbot was upgraded.
NOTE: updates forwardport.models.forwardport.Queue with slots for
compatibility with commit
odoo/odoo@ea3e39506a "use slots for
BaseModel", otherwise we get
TypeError: __bases__ assignment: 'ForwardPortTasks' object layout differs from 'BaseModel'
- if stderr has been rerouted or explicitely rerouted to STDOUT,
`e.stderr` is `None` and the error reporting blows up (which is
inconvenient). Handle this case.
- handle the case where `fp_github_name` has not been configured (it's
not a super useful handling but meh, apparently git/hub doesn't
really care if there's a username when using API tokens)
- minor improvement to the refline parser: per-spec, the trailing
newline is optional, so don't fail if it's missing
Stop *staging* release PRs: they are normally fairly simple and should
not fail their staging outside of unreliable tests (or possibly a few
edge cases e.g. forgot one version change thing), however staging them
creates the possibility of a "version hole" on the release branch
which is undesirable.
Instead, immediately and unconditionally push the release commits onto
the newly created branches, if there are things which don't work they
can be fixed afterwards (and the process refined, maybe).
Also add the same feature for *bump* PRs, with the difference that the
bump PRs are not created / requested by default (they have to be opted
in individually).
For convenience, add a feature which automatically finds the PRs via
inputting the label (not really tested yet).
Closes#603
Check remains a tad dodgy, but since we're actually porting from
`root` (the earliest ancestor of `self`) it makes more sense to
sanity-check that *its* commits remain visible.
Also log that as it makes a tad more sense, hopefully.
Closes#600
- trying to r+ a detached PR *via the forwardbot* should warn, same as
a non-forwardport PR
- the following sibling of a closed PR should be detached from
it (probably)
- when a closed forward-port PR is reopened, there should be a
notification that it is detached and merged via mergebot
Fixes#617
Existing conflict style is the local default ("merge", most
likely). `diff3` is a lot more informative as it provides the common
ancestor's code for the hunk, which helps see how the two branches
diverged and thus resolve the conflict.
Even better would be zdiff3 but that's a bit too recent...
Fixes#619
Old messages were quite inconsistent in their pinging of the PR author
and reviewer.
Reviewed messages (probably missed some but...) and try to more
consistently ping when the feedback requires some sort of action in
order to proceed.
Fixes#592
On two of the freezes, thereafter the logs showed serial crashes in
the forwardport cron when trying to find the insertion point for a new
forward-port.
The first time was not really diagnosed, the second time the cause was
found to be a retargeted PR which led to a failure of the "insertion"
forward port, which did not take that possibility in account (it
assumed -- sensibly I believed -- that an intermediate FP following a
branch insertion would always succeed, sadly the malevolent universe
had other plans).
So only insert the new forward port inside its sequence (if necessary)
if the forward port actually succeeded, otherwise ignore it.
Fixes#551
The freeze wizard has support for merging freeze / release PRs on each
of the newly created branches. But since this would be done by, well,
merging, those PRs would get forward-ported to master, and would have
to be closed there.
This creates additional work for the freeze master, and noise /
parasitic PRs.
Obviously it's possible for the freeze master to set some nonsense `up
to` (nonsense because the "real" limit doesn't exist yet at that
point), but really it never makes any sense to forward port release
PRs, so the wizard should do it.
Provides a less manual interface for creating the freeze:
* takes the name of the branch to create
* takes any number of PRs which must be part of the freeze
* takes PRs representing the HEADs of the new branches
Then essentially takes care of the test.
Implementation of the actual wizard is not trivial but fairly
straightforward and linear, biggest issue is not being able to
`project_id.branch_ids[1]` to get the new branch, not sure why but it
seems to ignore the ordering, clearing the cache doens't fix it.
When creating the branches, add a sleep after each one for secondary
rate limiting purposes. Same when deleting branches.
Also the forwardbot has been updated to disable the forwardport cron
while a freeze is ongoing, this simplifies the freezing process.
Note: after recommendation of @aab-odoo, tried using `_applyChanges`
in `_checkState` but it simply did not work: the two relational fields
got completely frozen and were impossible to update, which is less
than ideal. Oh well, hopefully it works well enough like this for now.
The list of outstanding forwardports was pretty messy as the ordering
was unclear and there was little way to really drill into the thing.
* Shows outstanding forward ports sorted by merged date ascending, the
oldest-merged PRs are the ones most in need of fixing while PRs
which were only just merged can safely be ignored.
* List reviewers with outstanding forward-ports, allow filtering by
clicking on their name, allow deseleting through the subtitle of the
page.
* Don't display reviewer in list when page is already filtered by
reviewer.
Also improve PR page a bit:
* Add reviewer.
* Add direct link to backend (closes#524).
Closes#529
Normally when a forwardport is updated the forward-bot cascades the
update onto its followups (if it has any), but takes care to keep the
followups attached as they were not updated "by hand".
In the case of odoo/odoo#77677 however that did not work and the
followup PRs got detached. Looking at the logs, it becomes flagrant
that there was a race condition: either Git took a long time to
respond to the push, or there was an IO slowdown which led to the
"local update" taking a very long time. Either way this allowed the
"synchronize" webhook from github to arrive before the current
transaction was committed, rolling back said transaction and making
the forwardbot assume this was a "real" sync and detach the followup
from its parent.
Locking the PR row up-front ought fix the issue, and also move the
local update to before having pushed: the "extra" commits in the local
cache don't matter too much even if pushing to github fails, they'll
be cleaned up by a GC eventually.
Also migrate the `-f` on push to `--force-if-includes` in order to
avoid possible race conditions on the push (since we're not fetching
the current branch, use the full-syntax explicit CAS form, that's
exactly what we're looking for anyway).
Fixes#541 (hopefully)
The forward-port process currently automatically adds delegates of a
PR as delegates on its forward ports, but that only works for
the *source* pull request.
If a delegate is added to a forward-port, they were not able to
approve the followups to that initial port, which makes little sense.
Fixes#548
When using the forwardport's shortcut, the bot would not skip
already-approved PRs leading to a warning from the mergebot that the
PR was already approved (out of nowhere which was weird).
During the walk to the ancestors, skip any PR which is not
approvable (either already approved or in a state where that makes no
sense e.g. closed).
Fixes#543
Because sometimes github updates are missed (usually because github
never triggers it), it's possible for the mergebot's view of a PR
description to be incorrect. In that case, the PR may get merged with
the wrong merge message entirely, through no fault of the user.
Since we already fetch the PR info when staging it, there's very
little overhead to checking that the PR message we store is correct
then, and update it if it's not. This means the forward-port's
description should also be correct.
While at it, clean the forward port PR's creation a bit:
- there should always be a message since the title is required on
PRs (only the body can be missing), therefore no need to check that
- as we're adding a bunch of pseudo-headers, there always is a body,
no need for the condition
- inline the `pr_data` and `URL`: they were extracted for the support
of draft PRs, since that's been removed it's now unnecessary
Fixes#530
On conflicts in multi-commit PRs developers sometimes get confused as
to what happened why.
If a conflict occurs and the source pull request had multiple
commits, list all the source commits and show which one broke.
Related to #505
* Remove the forwardport creating PRs in draft, that was mostly to
avoid codeowners triggering but we've removed the github one and
hand-rolled it, so not a concern anymore.
* Prevent merging `draft` PRs, the mergebot rejects approval on draft
PRs and insults people.
TBD (maybe): try to create *conflicting* forward-port PRs in draft so
it's clearer they need to be *fixed*? Issue of not being able to do
that on all private repositories remains so~~
Fixes#500
Proper attribution is important in general, but especially for
external contributors. Before this change, and the previous change
fixing authorship deduplication, it was rather easy for a "squashed"
conflict commit (attributed to the 'bot for lack of a really clean
option) to get merged by mistake.
This commit changes two things:
* The mergebot now refuses to stage commits without an email set, the
github API rejects those commits anyway so any integration mode
other than `merge` would fail, just with a very unclear error
* The forwardbot now creates commits with an empty author / committer
email when the pull request as a whole has multiple authors /
committers. This leverages the mergebot update.
Also clean up the staging process to provide richer error reporting
using bespoke exceptions instead of simple assertions. I'm not sure
we've ever encountered most of these errors so they're really sanity
checks but the old reporting would be... less than great.
Fixes#505
a45f7260fa had intended to use the
original authorship information for conflict commit even if there were
multiple commits, as long as there was only one author (/ committer)
for the entire sequence.
Sadly the deduplication was buggy as it took the *authorship date* in
account, which basically ensured commits would never be considered as
having the same authorship outside of tests (where it was possible for
commits to be created at the same second).
Related to #505
Though the forwardport posts regular reminders that an fw is outdated,
it can be easy to miss for the non-subject (and apparently the
subjects often just ignore the information entirely).
Add a few relevant links there:
* on PR pages, add a link to either the source or the
forward-ports (if applicable), as well as the merge date
* add a new page which lists all the PRs with outstanding
forwardports, as well as the forwardports in question
Fixes#474
Don't try to parse the response as JSON in the error case(s): if the
errors are bad enough github can return complete non-parseable
garbage.
Only access the "text" property (response body, decoded, but unparsed)
in error cases, only parse in the success case.
Also avoid reusing variables for completely different values, even if
they're of the same type, especially if they can overlap.
fixes#470
Initial thinking was to remove the check entirely and leave it to the
mergebot, but the lack of error reporting / forwarding means while
technically correct it would probably be somewhat difficult to grok.
Instead, improve the error reporting:
* add a dedicated message when trying to r+ via fwbot on a non-fw
PR (note: maybe the fwbot should not care? and just send it as-is
to the mergebot in that case?)
* clarify the ACL error
* post both message as the forwardbot rather than the mergebot
Also add a missing token note for the feedback from the forwardport
limit.
fix#469
If a PR is updated and has extent forward-ports, those forwardports
get updated automatically ("followup").
However there is an issue if the udpate causes a conflict in the
followup: the conflict gets silently pushed, and may fairly easily get
merged if it occurs in an area which the CI doesn't cover.
It's unclear what the policy really should be for this issue, and
there is no real way to *block* a pull request at the moment (save by
putting it in error at the mergebot level I guess?), so for now
clearly notify the user on both the modified PR and the followup,
with a comment on both.
We may want to revisit this eventually.
Fixes#467
There was already a check, but the way the check behaved
means *detached* PRs would not be prevented from setting their
forward-port, despite that not doing anything.
Fix it by checking if the current PR has a source, not a parent.
Fixes#465
Fix#457 hopefully: I didn't manage to repro / create a test for.
It looks like in some cases during the update process the PR ref lags
behind the branch itself. This means `forwardport.updates` creates a
new commit, pushes it, then on the next iteration updates the local
cache, tries to find the commit we just pushed... and that fails.
I can only assume this is because when there's enough load on the
github side the update to the `info/refs` pseudo-file can fall
behind (it's now 4MB and holding nearly 65k refs).
So cheat: take the commit we just pushed to the dev remote
and... immediately push it to the local cache under a dummy branch,
which we delete. Since we only gc "1 day ago" this should not vacuum.
On edition of an intermediate PR in a chain, merging the PR would lead
to *it* being forward-ported, duplicating the PRs already created
from *its* source.
Add a check for PRs in the target branch with the same source,
suppresses the forward-porting of the newly merged PR.
Fixes#451 (hopefully)
append `git status` data to stderr, should be somewhat more
informative especially when a conflict is a DU (where the file has
been deleted on one side, so there is no conflict marker anywhere).
Fixes#461
- When updating the local repo cache, always capture both stdout and
stderr and log them out rather than having them in journalctl hard
to relate to the main log.
- In the git layer, capture stderr by default and log it automatically
on command failure.
The process did properly update the state, but not the squash state.
It's somewhat unclear whether the state should be fully reset and
require reapproval though. Maybe only the validation should be reset?
The CI will eventually run and either succeed (re-validating) or
fail (devalidating, hopefully) but I'm not entirely sure this is
correct.
* fix small error which generated an extra commit in case of conflict
probably (would create a `temp` commit, then put the conflict
information on an empty commit on top of it). Avoids committing the
cherrypick though a probable alternative would be to commit the
message with `amend`.
* improve the cherrypick header: instead of one log line per commit,
put all commits within the sole header log line (with a newline),
should make things less noisy
* put *all* log records below the correct logger (two of them were on
the toplevel logger)
* log a purported inflateInit error as warning, should be sent to
Sentry instead of having to wait for people to wonder why the thing
is completely broken