Requires parsing the commit messages as github plain doesn't collate
the info from there, but also the descriptions: apparently github only
adds those to the references if the PR targets the default
branch. That's not a limitation we want.
Meaning at the end of the day, the only thing we're calling graphql
for is explicit manual linking, which can't be tested without
extending the github API (and then would only be testable on DC), and
which I'm not sure anyone bothers with...
Limitations
-----------
- Links are retrieved on staging (so if one is added later it won't be
taken in account).
- Only repository-local links are handled, not cross-repository.
Fixes#777
The weekly maintenance would not prune refs. This is not an issue on
odoo/odoo because development branches are in a separate repository,
thus never fetched (we push to them but only using local commits and
remote refs).
However on repos like odoo/documentation the reference and development
branches are collocated, the lack of pruning thus keeps every
development branch alive locally, even years after the branch has been
deleted in the repository.
By pruning remote-tracking refs before GC-ing, we should have cleaner
local clones, and better packing.
Since every cron runs on a fresh database, on the first `run_crons`
every single cron in the db will run even though almost none of them
is relevant.
Aside from the slight inefficiency, this creates unnecessary extra
garbage in the test logs.
By setting the `nextcall` of all crons to infinity in the template we
avoid this issue, only triggered crons (or the crons whose nextcall we
set ourselves) will trigger during calls.
This requires adjusting the branch cleanup cron slightly: it didn't
correctly handle the initial run (`lastcall` being false).
Mergebot / forwardport crons need to run in a specific ordering in
order to flow into one another correctly. The default ordering being
unspecified, it was not possible to use the normal cron
runner (instead of the external driver running crons in sequence one
at a time). This can be fixed by setting *sequences* on crons, as the
cron runner (`_process_jobs`) will use that order to acquire and run
crons.
Also override `_process_jobs` however: the built-in cron runner
fetches a static list of ready crons, then runs that.
This is fine for normal situation where the cron runner runs in a loop
anyway but it's any issue for the tests, as we expect that cron A can
trigger cron B, and we want cron B to run *right now* even if it
hadn't been triggered before cron A ran.
We can replace `_process_job` with a cut down version which does
that (cut down because we don't need most of the error handling /
resilience, there's no concurrent workers, there's no module being
installed, versions must match, ...). This allows e.g. the cron
propagating commit statuses to trigger the staging cron, and both will
run within the same `run_crons` session.
Something I didn't touch is that `_process_jobs` internally creates
completely new environments so there is no way to pass context into
the cron jobs anymore (whereas it works for `method_direct_trigger`),
this means the context values have to be shunted elsewhere for that
purpose which is gross. But even though I'm replacing `_process_jobs`,
this seems a bit too much of a change in cron execution semantics. So
left it out.
While at it tho, silence the spammy `py.warnings` stuff I can't do
much about.
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes#847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
`gc --prune` can not take a *separate* parameter, it has to be part of
the same arg (the `=` is not optional), otherwise the `gc` call blows
up.
So use the positional form of the git command to generate the correct
invocation, Python-level `foo=bar` generates a split-style option in
two args which does not please git.
Before this, we would check if a repository had a name and run
maintenance on it, leading to repeated (but unnoticed until now
because I didn't monitor it) tracebacks as the maintenance cron would
fail to find the local repo then run maintenance on nowhere anyway.
Also augment the repo-finding process to try and get better
information about what it's doing when it fails, rather than failing
completely silently.
If a branch `foo` is disabled, then `tmp.foo` and `staging.foo` become
unnecessary (with #247 fixed the tmp refs are not used for creating
stagings anymore, but for now they're still used for the "safety
dance" of merging a successful staging into the corresponding
mainline).
Fixes#605
If the stagings are going to be created locally (via a git working
copy rather than the github API), the mergebot part needs to have
access to the cache, so move the cache over. Also move the maintenance
cron.
In an extermely minor way, this prefigures the (hopeful) eventual
merging of the ~~planes~~ modules.