Currently, if a PR forward-port PR gets detached the reason for it is
not always obvious, and may have to be hunted in the logs or in
"sibling" PRs.
By writing a forward port reason (hopefully) ever time we detach a PR,
and displaying that reason in the form and dashboard, the
justification should be a lot more obvious.
Fixes#679
When inserting a new branch, if there are extant forward ports
sequences spanning over the insertion, new forward ports must be
created for the new branch.
This was the case, *however* one of the cases was handled incorrectly:
PR batches (PRs to different repositories staged together) must be
forward-ported as batches. However when creating the "insert" batches,
these PRs would be split apart, leading to either inconsistent states
or an inability to merge the new forward ports (if they are
co-dependent which is a common case). This is because the handler
would look for all the relevant PRs to forward ports, but it would
fail to group them by label thereafter.
Fix that, create the `insert` batches correctly, and augment the
relevant test with an additional repository to test this case.
No gh issue was created, but this problem was reported on
odoo/odoo#110029 and odoo/enterprise#35838 (they were re-linked by
hand in the runbot and mergebot, but on github the labels remain
visibly different, one having the slug ZCwE while the other
hasO7Pr).
It's possible that the problem had occurred previously and not been
noticed (or reported), though it's also possible this issue had never
occurred before: for the latest freeze (16.1) there were 18 forward
ports spanning the insertion, but of them only 2 were the same batch.
It's almost certainly not useful, save as a minor convenience for
tests: decorrelating the branch sequence and the fp sequence seems
like it would only be extremely confusing, and on the mergebot all the
fp_sequence values are set to the default while the sequence values
are set to something useful and sensible (kinda).
Fixes#584
After review, there doesn't seem to be a single integer field created
by the mergebot or fortwardbot modules for which a `group_operator`
makes sense, let alone the default of `sum`.
So just disable them all.
Fixes#674
In case where the last branch (before the branch being frozen) is
disabled, the forwardport inserter screws up, and fails to correctly
create the intermediate forwardports from the new branch.
Also when disabling a branch, if there are FW PRs which target that
branch and have not been forward-ported further, automatically
forward-port them as if the branch had been disabled when they were
created, this should limit data loss and confusion.
Also change the message set on PRs when disabling a branch: because of
user conflicts in test setup, the message about a branch being
disabled would close the PRs, which would then orphan the followup,
leading to unexpected / inconsistent behaviour.
Fixes#665
Since the forwardport bot works off of PRs, when it was created
leveraging the magic refs of github (refs/pull/*/head) seemed
sensible: when updating the cache repo, the magic refs would be
updated alongside and the forward-porting would have all the commits
available.
Turns out there are a few issues with this:
- the magic refs have become a bit unreliable, so we need a fallback
(b1c16fce8768080d30430f4e6d3788b60ce13de7)
- the forwardport bot only needs the commits on a transient basis, but
the magic refs live forever and diverge from all other content
(since we rarely `merge` PRs), for a large repo with lots of PRs
this leads to a significant inflation in size of repo (both on-disk
and objects count) e.g. odoo/odoo has about 25% more objects
with the pull refs included (3486550 to 4395319) and takes nearly
50% more space on disk (1.21G to 1.77)
As a result, we can just stop configuring the pull refs entirely, and
instead fetch the heads (~refs) we need as we need them. This can be a
touch more expensive at times as it requires two `fetch` calls, and
we'll have a few redundant calls as every forward port of a chain will
require a fetch of the root's head, *however* it will avoid retrieving
PR data for e.g. prs to master, as they don't get forward-ported, they
also won't get parasite updates from PRs being ignored or eventually
closed.
Eventually this could be optimised further by
- performing an update of the cache repo at the start of the cron iff
there are batches to port
- creating a temp clone for the batch
- fetching the heads of all PRs to forward port in the temp clone in a
single `fetch`
- performing all the ports by cloning the temp clone (and not
`fetch`-ing into those)
- then cleaning up the temp clone after having cleaned up individual
forward port clones
Small followup for #489
The current system makes / lets GC run during fetching. This has a few
issues:
- the autogc consumes resources during the forward-porting
process (not that it's hugely urgent but it seems unnecessary)
- the autogc commonly fails due to the combination of large repository
(odoo/odoo) and low memory limits (hardmem for odoo, which get
translated into soft ulimits)
As a result, the garbage collection of the repository sometimes stops
entirely, leading to an increase in repository size and a decrease in
performances.
To mitigate this issue, disable the automagic gc and maintenance
during normal operation, and instead add a weekly cron which runs an
aggressive GC with memory limits disabled (as far as they can get, if
the limits are imposed externally there's nothing to be done).
The maintenance is implemented using a full lockout of the
forward-port cron and an in-place GC rather than a copy/gc/swap, as
doing this maintenance at the small hours of the week-end (sat-sun
night) seems like a non-issue: currently an aggressive GC of odoo/odoo
(using the default aggressive options) takes a total of 2:30 wallclock
(5h user) on a fairly elderly machine (it's closer to 20mn wallclock
and 2h user on my local machine, also turns out the cache repos are
kinda badly configured leading to ~30% more objects than necessary
which doesn't help).
For the record, a fresh checkout of odoo/odoo right now yields:
| Overall repository size | |
| * Commits | |
| * Count | 199 k |
| * Total size | 102 MiB |
| * Trees | |
| * Count | 1.60 M |
| * Total size | 2.67 GiB |
| * Total tree entries | 74.1 M |
| * Blobs | |
| * Count | 1.69 M |
| * Total size | 72.4 GiB |
If this still proves insufficient, a further option would be to deploy
a "generational repacking" strategy:
https://gitlab.com/gitlab-org/gitaly/-/issues/2861 (though apparently
it's not yet been implemented / deployed on gitlab so...).
But for now we'll see how it shakes out.
Close#489
- avoid pinging the author of the fw PR (which is the forward-bot
itself)
- instead ping the author and reviewer of the source, and possibly the
reviewer of the PR if any
- might also be a good idea to ping reviewers of intermediate PRs?
Github can fail to create the magic refs on PRs
(`pull/refs/?/head`). Since forwardport relies on these refs to fetch
PR content this is an issue when it occurs, as the forward ports fail
in a loop.
After discussion with Github support, it turns out Github enabled
`allowReachableSHA1InWant` a while back, meaning it's possible to
fetch content by commit (rather than ref) as long as the content is
"in network".
Use this property as fallback when checking if we can see the PR head
before forward porting.
Also:
- remove explicit configuration of GC during fetch, it doesn't disable
the autogc (yet?) but that's likely going to happen anyway
- update logging and logger hierarchy during forward port to make
things clearer and easier to extract, although based on PR id rather
than number
- rate limit failing forward ports to avoid running them on every cron
(~ every minute), run them every ~30mn instead, this provides higher
odds of recovery with less log garbage in case of transient github
failure, and if the PR is stuck it limits the log pollution
Fixes#658
If no stream data was captured (no stderr and no stdout), would just
log
git call error:
as error, with no further information.
Don't do that if we have neither stderr nor stdout data, since we're
re-raising the exception anyway, it's just confusing.
- if stderr was empty or had been redirected to stdout, no useful
information would be show, making debugging more complicated
- the fallback is the error itself, but since it's reraised odds are
pretty high the caller will eventually log the error itself, so
it's redundant
=> fallback to stdout if stderr is empty, and only log if either is
non-empty
- if stderr has been rerouted or explicitely rerouted to STDOUT,
`e.stderr` is `None` and the error reporting blows up (which is
inconvenient). Handle this case.
- handle the case where `fp_github_name` has not been configured (it's
not a super useful handling but meh, apparently git/hub doesn't
really care if there's a username when using API tokens)
- minor improvement to the refline parser: per-spec, the trailing
newline is optional, so don't fail if it's missing
Check remains a tad dodgy, but since we're actually porting from
`root` (the earliest ancestor of `self`) it makes more sense to
sanity-check that *its* commits remain visible.
Also log that as it makes a tad more sense, hopefully.
Closes#600
- trying to r+ a detached PR *via the forwardbot* should warn, same as
a non-forwardport PR
- the following sibling of a closed PR should be detached from
it (probably)
- when a closed forward-port PR is reopened, there should be a
notification that it is detached and merged via mergebot
Fixes#617
Existing conflict style is the local default ("merge", most
likely). `diff3` is a lot more informative as it provides the common
ancestor's code for the hunk, which helps see how the two branches
diverged and thus resolve the conflict.
Even better would be zdiff3 but that's a bit too recent...
Fixes#619
Old messages were quite inconsistent in their pinging of the PR author
and reviewer.
Reviewed messages (probably missed some but...) and try to more
consistently ping when the feedback requires some sort of action in
order to proceed.
Fixes#592
The list of outstanding forwardports was pretty messy as the ordering
was unclear and there was little way to really drill into the thing.
* Shows outstanding forward ports sorted by merged date ascending, the
oldest-merged PRs are the ones most in need of fixing while PRs
which were only just merged can safely be ignored.
* List reviewers with outstanding forward-ports, allow filtering by
clicking on their name, allow deseleting through the subtitle of the
page.
* Don't display reviewer in list when page is already filtered by
reviewer.
Also improve PR page a bit:
* Add reviewer.
* Add direct link to backend (closes#524).
Closes#529
The forward-port process currently automatically adds delegates of a
PR as delegates on its forward ports, but that only works for
the *source* pull request.
If a delegate is added to a forward-port, they were not able to
approve the followups to that initial port, which makes little sense.
Fixes#548
When using the forwardport's shortcut, the bot would not skip
already-approved PRs leading to a warning from the mergebot that the
PR was already approved (out of nowhere which was weird).
During the walk to the ancestors, skip any PR which is not
approvable (either already approved or in a state where that makes no
sense e.g. closed).
Fixes#543
Because sometimes github updates are missed (usually because github
never triggers it), it's possible for the mergebot's view of a PR
description to be incorrect. In that case, the PR may get merged with
the wrong merge message entirely, through no fault of the user.
Since we already fetch the PR info when staging it, there's very
little overhead to checking that the PR message we store is correct
then, and update it if it's not. This means the forward-port's
description should also be correct.
While at it, clean the forward port PR's creation a bit:
- there should always be a message since the title is required on
PRs (only the body can be missing), therefore no need to check that
- as we're adding a bunch of pseudo-headers, there always is a body,
no need for the condition
- inline the `pr_data` and `URL`: they were extracted for the support
of draft PRs, since that's been removed it's now unnecessary
Fixes#530
On conflicts in multi-commit PRs developers sometimes get confused as
to what happened why.
If a conflict occurs and the source pull request had multiple
commits, list all the source commits and show which one broke.
Related to #505
* Remove the forwardport creating PRs in draft, that was mostly to
avoid codeowners triggering but we've removed the github one and
hand-rolled it, so not a concern anymore.
* Prevent merging `draft` PRs, the mergebot rejects approval on draft
PRs and insults people.
TBD (maybe): try to create *conflicting* forward-port PRs in draft so
it's clearer they need to be *fixed*? Issue of not being able to do
that on all private repositories remains so~~
Fixes#500
Proper attribution is important in general, but especially for
external contributors. Before this change, and the previous change
fixing authorship deduplication, it was rather easy for a "squashed"
conflict commit (attributed to the 'bot for lack of a really clean
option) to get merged by mistake.
This commit changes two things:
* The mergebot now refuses to stage commits without an email set, the
github API rejects those commits anyway so any integration mode
other than `merge` would fail, just with a very unclear error
* The forwardbot now creates commits with an empty author / committer
email when the pull request as a whole has multiple authors /
committers. This leverages the mergebot update.
Also clean up the staging process to provide richer error reporting
using bespoke exceptions instead of simple assertions. I'm not sure
we've ever encountered most of these errors so they're really sanity
checks but the old reporting would be... less than great.
Fixes#505
a45f7260fa had intended to use the
original authorship information for conflict commit even if there were
multiple commits, as long as there was only one author (/ committer)
for the entire sequence.
Sadly the deduplication was buggy as it took the *authorship date* in
account, which basically ensured commits would never be considered as
having the same authorship outside of tests (where it was possible for
commits to be created at the same second).
Related to #505
Though the forwardport posts regular reminders that an fw is outdated,
it can be easy to miss for the non-subject (and apparently the
subjects often just ignore the information entirely).
Add a few relevant links there:
* on PR pages, add a link to either the source or the
forward-ports (if applicable), as well as the merge date
* add a new page which lists all the PRs with outstanding
forwardports, as well as the forwardports in question
Fixes#474
Don't try to parse the response as JSON in the error case(s): if the
errors are bad enough github can return complete non-parseable
garbage.
Only access the "text" property (response body, decoded, but unparsed)
in error cases, only parse in the success case.
Also avoid reusing variables for completely different values, even if
they're of the same type, especially if they can overlap.
fixes#470
Initial thinking was to remove the check entirely and leave it to the
mergebot, but the lack of error reporting / forwarding means while
technically correct it would probably be somewhat difficult to grok.
Instead, improve the error reporting:
* add a dedicated message when trying to r+ via fwbot on a non-fw
PR (note: maybe the fwbot should not care? and just send it as-is
to the mergebot in that case?)
* clarify the ACL error
* post both message as the forwardbot rather than the mergebot
Also add a missing token note for the feedback from the forwardport
limit.
fix#469
There was already a check, but the way the check behaved
means *detached* PRs would not be prevented from setting their
forward-port, despite that not doing anything.
Fix it by checking if the current PR has a source, not a parent.
Fixes#465
On edition of an intermediate PR in a chain, merging the PR would lead
to *it* being forward-ported, duplicating the PRs already created
from *its* source.
Add a check for PRs in the target branch with the same source,
suppresses the forward-porting of the newly merged PR.
Fixes#451 (hopefully)
append `git status` data to stderr, should be somewhat more
informative especially when a conflict is a DU (where the file has
been deleted on one side, so there is no conflict marker anywhere).
Fixes#461
- When updating the local repo cache, always capture both stdout and
stderr and log them out rather than having them in journalctl hard
to relate to the main log.
- In the git layer, capture stderr by default and log it automatically
on command failure.
* fix small error which generated an extra commit in case of conflict
probably (would create a `temp` commit, then put the conflict
information on an empty commit on top of it). Avoids committing the
cherrypick though a probable alternative would be to commit the
message with `amend`.
* improve the cherrypick header: instead of one log line per commit,
put all commits within the sole header log line (with a newline),
should make things less noisy
* put *all* log records below the correct logger (two of them were on
the toplevel logger)
* log a purported inflateInit error as warning, should be sent to
Sentry instead of having to wait for people to wonder why the thing
is completely broken
Before this change, a CI override would have to be replicated on most
/ all forward-ports of the base PR. This was intentional to see how it
would shake out, the answer being that it's rather annoying.
Also add a `statuses_full` computed field on PRs for the aggregate
status: the existing `statuses` field is just a copy of the commit
statuses which I didn't remember I kept free of the overrides so the
commit statuses could be displayed "as-is" in the backend (the
overrides are displayed separately). And while at it fix the PR
dashboard to use that new field: that was basically the intention but
then I went on to use the "wrong" field hence #433.
Mebbe the UI part should be displayed using a computed M2M (?)
as a table or as tags instead? This m2m could indicate whether the
status is an override or an "intrinsic" status.
Also removed some dead code:
* leftover from the removed tagging feature (removed the tag
manipulation but forgot some of the setup / computations)
* unused local variables
* an empty skipped test case
Fixes#439.
Fixes#433.
When a new branch is created between other branches, the process to
try and look for forward-port PRs to create to "fill in" currently has
no logging so it's very difficult to figure out why it decides not to
do something.
Add some logging to the process to try and better understand what
happens.
Fixes#441
On a cherrypick failing due to renamelimit issues, the cherrypick
would be retried after resetting the target to its *original* commit.
This only works correctly if the first commit works after a retry, if
a latter commit has to be retried then it gets re-cherry-picked onto
the target branch rather than its own parent.
Fix by remembering the previously successful cherry-picked commit and
resetting to *that*. However I can't really test it because it's
rather hard to get into a situation where the rename detection fails
using synthetic tests.
While at it, clean the logs by stripping the "performing inexact
rename detection" stuff from all stderr (both the CherrypickError from
which it was already stripped and the debug messages).
Fixes#444
Adds an `override` mergebot command. The ability to override is set on
an individual per-context per-repository basis, similar to but
independent from review rights. That is, a given individual may be
able to override the status X on repository A and unable to do so on
repository B.
Overrides are stored in the same format as regular statuses, but
independent from them in order to persist them across builds.
Only PR statuses can be overridden, statuses which are overridable on
PRs would simply not be required on stagings.
An alternative to implementing this feature in the mergebot would be
to add it to individual status-generating tools on a per-need
basis.
Pros of that alternative:
* display the correct status on PRs, currently the PR will be failing
status-wise (on github) but correct as far as the mergebot is
concerned
* remove complexity from the mergebot
Cons of that alternative:
* each status-generating tool would have to implement some sort of ACL
system
* each status-generating tool would have to receive & parse PR
comments
* each status-generating tool would have to maintain per-pr state in
order to track overrides
Some sort of helper library / framework ought make that rather easy
though. It could also be linked into the central provisioning system
thing.
Closes#376
The exponential backoff offsets from the write_date of the children
PRs, however it doesn't reset, so the offsetting gets bumped up way
more than originally expected or designed if the child PRs are under
active development for some reason.
Fix this by adding a field to specifically record the date of merge of
a PR, and check that feature against the backoff offset. This should
provide more regular and reliable backoff.
Fixes#369
Given a PR batch getting forward-ported together, if one of the PRs
has a conflict the others should be considered "in conflict" as well,
and should have a note pointing in that direction and indicating that
the PR should be approved the normal way eventually. Which they do.
However, the message is confusing as it gets bolted on the normal
non-conflicting message, either noting that it's part of a chain
or (worse because it gives conflicting indication) the "terminal"
message recommending using the forwardbot to approve of the entire
chain.
I've no idea why I did it that way instead of just adding a case to
the conditional, and the commit message provides no indication. But
perform that change, it seems innocuous, hopefully there weren't good
reasons I forgot about for doing it the other way around.
Fixes#367
Provides a `skipci` command to PR reviewers. This makes it so the
followup PRs (after the first one) get created immediately, without
waiting for CI to succeed on a given forward-port PR.
This can be useful if for some reason a change *must* be merged in
branch N+1 before it can be merged in branch N.
Fixes#363