Implemented rebase (-ish?) by hand, following the model of the one
built on the github API, so we have all the information we need *and*
still don't need a local working copy.
Had to update *several* tests due to formatting changes from using the
CLI git (maybe that could be avoided by feeding the commit message
through stdin but w/e).
Also added a dupe of an existing helper for the formatting changes
bit, will have to remove it once all's done.
If the primary email is made public, it is returned directly as part
of the /users endpoint, in which case we don't need to fetch it via
/user/emails.
Also improve error messages, and fix the incorrect checks on the
existence of the github name and email. And allow manually updating
both via the project form.
Move *almost* all the staging code to free functions, in a separate
module, and extensively typed.
The only bits which didn't move are:
- the entry point (the cron hook), because it has to be a model method
in order to be called
- the `_build_merge_message` method, because it needs to be
overridable
There's also a bit of an import mess, because the cron &
`_build_merge_message` need to call into the new module, but the new
module wants the types they belong to, so it's a bit circular.
If the stagings are going to be created locally (via a git working
copy rather than the github API), the mergebot part needs to have
access to the cache, so move the cache over. Also move the maintenance
cron.
In an extermely minor way, this prefigures the (hopeful) eventual
merging of the ~~planes~~ modules.
In 81ce4ea02b the delta for PRs being
listed in the `/forwardport/outstanding` page was increased from 3
days to 7 (1 week), however the warning box in the home page still
used the old cutoffs leading to
An inconsistency between the two and an effective severe overcounting,
as the reason why the cutoff was increased to a week is forward ports
can take a while or the author / reviewer can be a touch busy at end
of week, so 3~4 days are routine when a PR is merged on thursday or
friday (and even worse if there's bank holidays in the mix).
Fix outstanding query to make a positive `state` filtering, instead of
negative, matching 3b52b1aace8674259812a76b1566260937dbcacb.
Also manually create a map of stagings (grouped by branch) sharing a
single prefetch set.
For odoo the mergebot home page has 12 branches in the odoo project
and 8 in spreadsheet, 6 stagings each. This means 120 queries to
retrieve all the heads (Odoo stagings have 5 heads and spreadsheet
have 1, but that seems immaterial).
By fixing `_compute_statuses` and creating a single prefetch set for
all stagings of all branches we can fetch all the commits in a single
query instead of 120.
- add support for authorship (not just approval)
- make display counts directly
- fix `state` filter: postgres can't do negative index lookups
- add indexes for author and reviewed_by as we look them up
- ensure we handle the entire source filtering via a single subquery
Closes#778
Currently sentry is only hooked from the outside, which doesn't
necessarily provide sufficiently actionable information.
Add some a few hooks to (try and) report odoo / mergebot metadata:
- add the user to WSGI transactions
- add a transaction (with users) around crons
- add the webhook event info to webhook requests
- add a few spans to the long-running crons, when they cover multiple
units per iteration (e.g. a span per branch being staged)
Closes#544
Current system makes it hard to iterate feedback messages and make
them clearer, this should improve things a touch.
Use a bespoke model to avoid concerns with qweb rendering
complexity (we just want GFM output and should not need logic).
Also update fwbot test setup to always configure an fwbot name, in
order to avoid ping messages closing the PRs they're talking
about, that took a while to debug, and given the old message I assume
I'd already hit it and just been too lazy to fix. This requires
updating a bunch of tests as fwbot ping are sent *to*
`fp_github_name`, but sent *from* the reference user (because that's
the key we set).
Note: noupdate on CSV files doesn't seem to work anymore, which isn't
great. But instead set tracking on the template's templates, it's not
quite as good but should be sufficient.
Fixes#769
I'd been convinced this was an ORM error because the field is not
recursive... in runbot_merge, in forwardbot it is and thus does indeed
need to be flagged to avoid the warning.
- currently disabling staging only works globally, allow disabling on
a single branch
- use a toggle
- remove a pair of tests which work specifically with `fp_target`,
can't work with `active` (probably)
- cleanup search of possible and active stagings, add relevant
indexes and use direct search of relevant branches instead of
looking up from the project
- also use toggle button for `active` on branches
- shitty workaround for upgrading DB: apparently mail really wants to
have a `user_id` to do some weird thing, so need to re-add it after
resetting everything
Fixes#727
I DECLARE BANKRUPTCY!!!
The previous implementation of labels lookup was really not
intuitive (it was just a char field, and matched labels by equality
including the owner tag), and was also full of broken edge
cases (e.g. traceback if a label matched multiple PRs in the same repo
because people reuse branch names).
Tried messing about with contextual `display_name` and `name_search`
on PRs but the client goes wonky in that case, and there is no clean
autocomplete for non-relational fields.
So created a view which reifies labels, and that can be used as the
basis for our search. It doesn't have to be maintained by hand, can be
searched somewhat flexibly, we can add new view fields in the future
if desirable, and it seems to work fine providing a nice
understandable UX, with the reliability of using a normal Odoo model
the normal way.
Also fixed the handling of bump PRs, clearly clearing the entire field
before trying to update existing records (even with a link_to
inbetween) is not the web client's fancy, re-selecting the current
label would just empty the thing entirely.
So use a two-step process slightly closer to the release PRs instead:
- first update or delete the existing bump PRs
- then add the new ones
The second part is because bump PRs are somewhat less critical than
release, so it can be a bit more DWIM compared to the more deliberate
process of release PRs where first the list of repositories involved
has to be set up just so, then the PRs can be filled in each of them.
Fixes#697
Previously the mergebot would only sync the head commit, but synching
more is useful.
Also update the final sanity check on staging:
- as with check, update the message & target branch
- reset PR state and post a message when updating message instead of
doing so silently
Note: maybe only fail the staging if the message is updated *and*
relevant to staging (aka there's a merge method and it's not
`rebase`)?
Fixes#680
Currently, if a PR forward-port PR gets detached the reason for it is
not always obvious, and may have to be hunted in the logs or in
"sibling" PRs.
By writing a forward port reason (hopefully) ever time we detach a PR,
and displaying that reason in the form and dashboard, the
justification should be a lot more obvious.
Fixes#679
When inserting a new branch, if there are extant forward ports
sequences spanning over the insertion, new forward ports must be
created for the new branch.
This was the case, *however* one of the cases was handled incorrectly:
PR batches (PRs to different repositories staged together) must be
forward-ported as batches. However when creating the "insert" batches,
these PRs would be split apart, leading to either inconsistent states
or an inability to merge the new forward ports (if they are
co-dependent which is a common case). This is because the handler
would look for all the relevant PRs to forward ports, but it would
fail to group them by label thereafter.
Fix that, create the `insert` batches correctly, and augment the
relevant test with an additional repository to test this case.
No gh issue was created, but this problem was reported on
odoo/odoo#110029 and odoo/enterprise#35838 (they were re-linked by
hand in the runbot and mergebot, but on github the labels remain
visibly different, one having the slug ZCwE while the other
hasO7Pr).
It's possible that the problem had occurred previously and not been
noticed (or reported), though it's also possible this issue had never
occurred before: for the latest freeze (16.1) there were 18 forward
ports spanning the insertion, but of them only 2 were the same batch.
It's almost certainly not useful, save as a minor convenience for
tests: decorrelating the branch sequence and the fp sequence seems
like it would only be extremely confusing, and on the mergebot all the
fp_sequence values are set to the default while the sequence values
are set to something useful and sensible (kinda).
Fixes#584
After review, there doesn't seem to be a single integer field created
by the mergebot or fortwardbot modules for which a `group_operator`
makes sense, let alone the default of `sum`.
So just disable them all.
Fixes#674
In case where the last branch (before the branch being frozen) is
disabled, the forwardport inserter screws up, and fails to correctly
create the intermediate forwardports from the new branch.
Also when disabling a branch, if there are FW PRs which target that
branch and have not been forward-ported further, automatically
forward-port them as if the branch had been disabled when they were
created, this should limit data loss and confusion.
Also change the message set on PRs when disabling a branch: because of
user conflicts in test setup, the message about a branch being
disabled would close the PRs, which would then orphan the followup,
leading to unexpected / inconsistent behaviour.
Fixes#665
Since the forwardport bot works off of PRs, when it was created
leveraging the magic refs of github (refs/pull/*/head) seemed
sensible: when updating the cache repo, the magic refs would be
updated alongside and the forward-porting would have all the commits
available.
Turns out there are a few issues with this:
- the magic refs have become a bit unreliable, so we need a fallback
(b1c16fce8768080d30430f4e6d3788b60ce13de7)
- the forwardport bot only needs the commits on a transient basis, but
the magic refs live forever and diverge from all other content
(since we rarely `merge` PRs), for a large repo with lots of PRs
this leads to a significant inflation in size of repo (both on-disk
and objects count) e.g. odoo/odoo has about 25% more objects
with the pull refs included (3486550 to 4395319) and takes nearly
50% more space on disk (1.21G to 1.77)
As a result, we can just stop configuring the pull refs entirely, and
instead fetch the heads (~refs) we need as we need them. This can be a
touch more expensive at times as it requires two `fetch` calls, and
we'll have a few redundant calls as every forward port of a chain will
require a fetch of the root's head, *however* it will avoid retrieving
PR data for e.g. prs to master, as they don't get forward-ported, they
also won't get parasite updates from PRs being ignored or eventually
closed.
Eventually this could be optimised further by
- performing an update of the cache repo at the start of the cron iff
there are batches to port
- creating a temp clone for the batch
- fetching the heads of all PRs to forward port in the temp clone in a
single `fetch`
- performing all the ports by cloning the temp clone (and not
`fetch`-ing into those)
- then cleaning up the temp clone after having cleaned up individual
forward port clones
Small followup for #489
The current system makes / lets GC run during fetching. This has a few
issues:
- the autogc consumes resources during the forward-porting
process (not that it's hugely urgent but it seems unnecessary)
- the autogc commonly fails due to the combination of large repository
(odoo/odoo) and low memory limits (hardmem for odoo, which get
translated into soft ulimits)
As a result, the garbage collection of the repository sometimes stops
entirely, leading to an increase in repository size and a decrease in
performances.
To mitigate this issue, disable the automagic gc and maintenance
during normal operation, and instead add a weekly cron which runs an
aggressive GC with memory limits disabled (as far as they can get, if
the limits are imposed externally there's nothing to be done).
The maintenance is implemented using a full lockout of the
forward-port cron and an in-place GC rather than a copy/gc/swap, as
doing this maintenance at the small hours of the week-end (sat-sun
night) seems like a non-issue: currently an aggressive GC of odoo/odoo
(using the default aggressive options) takes a total of 2:30 wallclock
(5h user) on a fairly elderly machine (it's closer to 20mn wallclock
and 2h user on my local machine, also turns out the cache repos are
kinda badly configured leading to ~30% more objects than necessary
which doesn't help).
For the record, a fresh checkout of odoo/odoo right now yields:
| Overall repository size | |
| * Commits | |
| * Count | 199 k |
| * Total size | 102 MiB |
| * Trees | |
| * Count | 1.60 M |
| * Total size | 2.67 GiB |
| * Total tree entries | 74.1 M |
| * Blobs | |
| * Count | 1.69 M |
| * Total size | 72.4 GiB |
If this still proves insufficient, a further option would be to deploy
a "generational repacking" strategy:
https://gitlab.com/gitlab-org/gitaly/-/issues/2861 (though apparently
it's not yet been implemented / deployed on gitlab so...).
But for now we'll see how it shakes out.
Close#489
- avoid pinging the author of the fw PR (which is the forward-bot
itself)
- instead ping the author and reviewer of the source, and possibly the
reviewer of the PR if any
- might also be a good idea to ping reviewers of intermediate PRs?
Github can fail to create the magic refs on PRs
(`pull/refs/?/head`). Since forwardport relies on these refs to fetch
PR content this is an issue when it occurs, as the forward ports fail
in a loop.
After discussion with Github support, it turns out Github enabled
`allowReachableSHA1InWant` a while back, meaning it's possible to
fetch content by commit (rather than ref) as long as the content is
"in network".
Use this property as fallback when checking if we can see the PR head
before forward porting.
Also:
- remove explicit configuration of GC during fetch, it doesn't disable
the autogc (yet?) but that's likely going to happen anyway
- update logging and logger hierarchy during forward port to make
things clearer and easier to extract, although based on PR id rather
than number
- rate limit failing forward ports to avoid running them on every cron
(~ every minute), run them every ~30mn instead, this provides higher
odds of recovery with less log garbage in case of transient github
failure, and if the PR is stuck it limits the log pollution
Fixes#658
If no stream data was captured (no stderr and no stdout), would just
log
git call error:
as error, with no further information.
Don't do that if we have neither stderr nor stdout data, since we're
re-raising the exception anyway, it's just confusing.
- if stderr was empty or had been redirected to stdout, no useful
information would be show, making debugging more complicated
- the fallback is the error itself, but since it's reraised odds are
pretty high the caller will eventually log the error itself, so
it's redundant
=> fallback to stdout if stderr is empty, and only log if either is
non-empty
Get mergebot updates from since the runbot was upgraded.
NOTE: updates forwardport.models.forwardport.Queue with slots for
compatibility with commit
odoo/odoo@ea3e39506a "use slots for
BaseModel", otherwise we get
TypeError: __bases__ assignment: 'ForwardPortTasks' object layout differs from 'BaseModel'
- if stderr has been rerouted or explicitely rerouted to STDOUT,
`e.stderr` is `None` and the error reporting blows up (which is
inconvenient). Handle this case.
- handle the case where `fp_github_name` has not been configured (it's
not a super useful handling but meh, apparently git/hub doesn't
really care if there's a username when using API tokens)
- minor improvement to the refline parser: per-spec, the trailing
newline is optional, so don't fail if it's missing
I'm surprised this ever worked, I guess concurrent tests stopped
working long before that? Or I misunderstood some of the historical
failures as transient?
During the cleanup of the forwardport test, I'd empty out the
`user_cache_dir('forwardport') / owner`, except the owner is always
the same (more or less) so all the tests check out their repos (and
working copies) in the same directory. If one test is cleaning up
while an other is performing a forward port, the second will blow up.
Also move the filestore to a tempdir, especially during creation of
the template db: it gets leaked so over time that generates gigabytes
of data which doesn't get cleaned up. But the template db filestore is
only "necessary" during the creation of the template, once the
template's been created it's of no use and won't be copied to create
the test dbs (though it could be, I guess).
Stop *staging* release PRs: they are normally fairly simple and should
not fail their staging outside of unreliable tests (or possibly a few
edge cases e.g. forgot one version change thing), however staging them
creates the possibility of a "version hole" on the release branch
which is undesirable.
Instead, immediately and unconditionally push the release commits onto
the newly created branches, if there are things which don't work they
can be fixed afterwards (and the process refined, maybe).
Also add the same feature for *bump* PRs, with the difference that the
bump PRs are not created / requested by default (they have to be opted
in individually).
For convenience, add a feature which automatically finds the PRs via
inputting the label (not really tested yet).
Closes#603
It's likely that the PR pages are seen more commonly than the
dashboard by most users, so add alerts there in case users wonder
what's happening.
Fixes#580
Check remains a tad dodgy, but since we're actually porting from
`root` (the earliest ancestor of `self`) it makes more sense to
sanity-check that *its* commits remain visible.
Also log that as it makes a tad more sense, hopefully.
Closes#600
This is an important bit of information but it was not visible without
going into the backend.
`user-select-none` doesn't work in BS3 but that way it'll be ready for
an eventual update. Currently when hovering the badge the cursor
switches to text selection, and the text is selectable, which is
useless.
Complements 4e235a2 and finishes the fixes for #617
- trying to r+ a detached PR *via the forwardbot* should warn, same as
a non-forwardport PR
- the following sibling of a closed PR should be detached from
it (probably)
- when a closed forward-port PR is reopened, there should be a
notification that it is detached and merged via mergebot
Fixes#617
Existing conflict style is the local default ("merge", most
likely). `diff3` is a lot more informative as it provides the common
ancestor's code for the hunk, which helps see how the two branches
diverged and thus resolve the conflict.
Even better would be zdiff3 but that's a bit too recent...
Fixes#619
Old messages were quite inconsistent in their pinging of the PR author
and reviewer.
Reviewed messages (probably missed some but...) and try to more
consistently ping when the feedback requires some sort of action in
order to proceed.
Fixes#592
- code in the various menus added over time through the UI (queues,
configuration, ...)
- update / improve PR layout a tick
- fix "outstanding forward ports" count on the dashboard
- improve hover title / help on dashboard
- add date of last modification (usually date of success / failure)
- make casing more coherent (everything lowercase)
- add explicit note that UTC date on staged at label is staged at datetime
- rediscover yet again that the staging information is when hovering
on the staging *except the staged at label*
- improve `PullRequest.unstage` to always insert the PR at the start of the
reason when cancelling the staging, for clarity / traceability
Closes#560, closes#609
On two of the freezes, thereafter the logs showed serial crashes in
the forwardport cron when trying to find the insertion point for a new
forward-port.
The first time was not really diagnosed, the second time the cause was
found to be a retargeted PR which led to a failure of the "insertion"
forward port, which did not take that possibility in account (it
assumed -- sensibly I believed -- that an intermediate FP following a
branch insertion would always succeed, sadly the malevolent universe
had other plans).
So only insert the new forward port inside its sequence (if necessary)
if the forward port actually succeeded, otherwise ignore it.
Fixes#551
The freeze wizard has support for merging freeze / release PRs on each
of the newly created branches. But since this would be done by, well,
merging, those PRs would get forward-ported to master, and would have
to be closed there.
This creates additional work for the freeze master, and noise /
parasitic PRs.
Obviously it's possible for the freeze master to set some nonsense `up
to` (nonsense because the "real" limit doesn't exist yet at that
point), but really it never makes any sense to forward port release
PRs, so the wizard should do it.
Apparently 2.34:
* flipped around the "auto-merging" and "CONFLICT" messages on stdout,
so just match the second one with wildcards around to ignore the
location of the first
* changed the casing and content of everything after the `error` line
on stderr, so just ignore it all (none of it's actually useful
anyway)
Provides a less manual interface for creating the freeze:
* takes the name of the branch to create
* takes any number of PRs which must be part of the freeze
* takes PRs representing the HEADs of the new branches
Then essentially takes care of the test.
Implementation of the actual wizard is not trivial but fairly
straightforward and linear, biggest issue is not being able to
`project_id.branch_ids[1]` to get the new branch, not sure why but it
seems to ignore the ordering, clearing the cache doens't fix it.
When creating the branches, add a sleep after each one for secondary
rate limiting purposes. Same when deleting branches.
Also the forwardbot has been updated to disable the forwardport cron
while a freeze is ongoing, this simplifies the freezing process.
Note: after recommendation of @aab-odoo, tried using `_applyChanges`
in `_checkState` but it simply did not work: the two relational fields
got completely frozen and were impossible to update, which is less
than ideal. Oh well, hopefully it works well enough like this for now.
* Adds a changelog page, linked from the main, with content
automatically loaded from the source. To avoid conflicts, each entry
is its own file and entries are grouped by the month during which
the update will (probably) be deployed
* The last group (most likely "last update") doesn't have a title, the
rest do.
* Add changelog entries from the last update so it's not too empty.
* Also update the layout for the alerts a bit: remove bottom margin to
reduce loss of whitespace.