Starting in 3.12 `TarFile.extact(all)` can take a `filter` parameter
which alters the behaviour of the extractor as it relates to the
application of tar metadata. Not passing this parameter is deprecated
unless the also new `TarFile.extraction_filter` attribute is set.
Now there are a few wrinkles here:
1. Both parameter and attributes were added in older patch releases
e.g. 3.9.17 and 3.10.12 and 3.11.4.
2. The attribute form will *not* resolve string filters, it needs the
callables.
As such just to be on the safe side of things set the attribute using
a `getattr`, in releases with the feature it will be set to a proper
value (the future default which ignores most or all metadata from the
archive), and in releases without it just sets the attribute to `None`
which should do no harm.
This was the root cause of the incident of Feb 13/14: because the
patcher pushed to the local branch before pushing to the remote
failing to push to the remote would leave the local ref broken, as
`fetch("refs/heads/*:refs/heads/*")` apparently does not do non-ff
updates (which does make some sense I guess).
So in this case a staging finished, was pushed to the remote, then git
delayed the read side just enough that when the patcher looked up the
target it got the old commit. It applied a patch on top of that, tried
to push, and got a failure (non-ff update), which led the local and
remote branches divergent, and caused any further update of the local
reference branches to fail, thus every forward port to be blocked.
Using symbolic branches during patching was completely dumb (and
updating the local branch unnecessary), so switch the entire thing to
using just commits, and update a bunch of error reporting while at it.
Partial mitigation of #1065, not complete by any means. Avoiding
updating a local ref during staging is probably the most important bit
here, we're staging on commits (into a new ref) anyway so it should
not be needed, and that avoids conflicts between the staging cron and
e.g. the forwardport cron.
Commits can take some time to propagate through the network (I guess),
or human error can lead to the wrong commit being set.
Either way, because the entire thing was done using a single fetch in
`check=True` the cron job would fail entirely if any of the patch
commits was yet unavailable.
Update the updater to:
- fallback on individual fetches
- remove the patch from the set of applicable patch if we (still)
can't find its commit
I'd have hoped `fetch` could retrieve whatever it found, but
apparently the server just crashes out when it doesn't find the commit
we ask for, and `fetch` doesn't update anything.
No linked issue because I apparently forgot to jot it down (and only
remembered about this issue with the #1063 patching issue) but this
was reported by mat last week (2025-02-21) when they were wondering
why one of their patches was taking a while:
- At 0832 patch was created by automated script.
- At 0947, an attempt to apply was made, the commit was not found.
- At 1126, a second attempt was made but an other patch had been
created whose commit was not found, failing both.
- At 1255, there was a concurrency error ("cannot lock ref" on the
target branch).
- Finally at 1427 the patch was applied.
All in all it took 6 hours to apply the patch, which is 3-4 staging
cycles.
It's shorter, it's simpler (kinda), and it's 70% faster (although
that's unlikely to be any sort of bottleneck given applying patches
involves multiple network roundtrips).
Verify that the tree is different before and after applying the patch,
otherwise if there's a mistake made (e.g. a script does not check that
they have content and request a patch applying an existing commit
which is how odoo/enterprise#612a9cf3cadba64e4b18d535ca0ac7e3f4429a08
occurred) we end up with a completely empty commit and a duplicated
commit message.
Fixes#1063
Note: using `rev-parse` to retrieve the commit's tree would be 50%
faster, but we're talking 3.2 versus 2.4ms and it requires string
formatting instead of nice clean arguments so it's a bit meh.
Hopefully this is the last fix to the patcher. From the start of the
implementation I relied on the idea that `git show` was adding a line
composed of a single space (and a newline) before and after the patch
message, as that is what I observed in my terminal, and it's
consistent with RFC 3676 signatures (two dashes, a space, and a
newline).
Turns out that single space, while present in my terminal indeed, was
completely made up by `less(1)`. `git show` itself doesn't generate
that, neither does it appear when using most pagers, or even when
piping the output of `less` into something (a file, an other pager,
...). It's pretty much just something `less(1)` sends to a terminal
during interactive sessions to fuck with you.
Fixes#1037
Show patch metadata on the patch screen, so it's easier to understand
what the parser sees in case of issues.
Behaviour is not *entirely* consisten, `file_ids` is correctly set but
it looks like during the initial `web_read` it gets stripped out in at
least some cases and the files list is empty even though files have
been found in the patch. nm.
Fixes#987
- Apparently if a user is on windows the ACE editor can swap out their
line end from unix to windows. The patch parsers were predicated
upon all patches being in unix mode (because git, and diff).
Fixup both parsers to convert windows-style line end to unix before
trying to parse the patch data. Also add a few fallbacks to limit
the odds of an unhelpful `StopIteration` (though that might hide
errors more than reveal them...)
- Make sure we support `format-patch --no-signature`, just requires
using the correct partition direction: I assume I used `rpartition`
as a form of micro-optimisation *but*
- If the separator is not found the "patch body" ends up in the
third parameter rather than the first, which makes the fallback
difficult.
- There doesn't seem to be anything preventing *multiple* signature
separators in a message, and logically the first one should hold
and the rest is all part of the signature.
As a result, for both reasons we need to look *forwards* for the
signature separator, not backwards. Hence `str.partition`.
Fixes#992
`get_content` round-trips the text part through `ascii` with
`error=replace`, so if the input is not ascii it screws up
tremendously, which leads to either failing to apply patches (the more
likely situation) or corrupting the patches.
`get_payload`, if called without `decode`, pretty much just returns
the payload unless it needs a decoding pass (e.g. because it contains
raw surrogates, but that should not be an issue for us). So this is
really what we want.
While at it, increase `patch`'s verbosity in case it can give us more
info.
Turns out you can configure format-patch with `--no-prefix` and some
people (*cough cough* mat) have that in their standard setup, so the
assumption of needing to strip 1 level of prefix does not necessarily
hold.
Also fix a few more issues:
- some people (*cough cough* still mat) also use `-n` by default,
which adds the series sequence (`n/m`) even for a single patch,
handle that correctly
- logging patch application errors is pretty useful when patching
fails and I'm trying to get the information via logs, do that
- especially when I decide to add error messages to tracking *but
forgot to show the chatter by default*, fix that as well
The commit-based patcher worked first try, and patch-based would have
worked too if not for those meddling kids. In the future it might be a
good idea to reify the stripping level (`-p`) on the patch object
though, and maybe provide a computed preview of the list of files to
patch, so issues are easier for the operator to diagnose.
Unstaged changes can be useful or necessary for some tasks
e.g. absolute emergency (where even faking the state of a staging is
not really desirable, if that's even possible anymore), or changes
which are so broad they're difficult to stage (e.g. t10s updates).
Add a new object which serves as a queue for patch to direct-apply,
with support for either text patches (udiff style out of git show or
format-patch) or commits to cherry-pick. In the former case, the part
of the show / format-patch before the diff itself is used for the
commit metadata (author, committer, dates, message) whereas for the
commit version the commit itself is reused as-is.
Applied patches are simply disabled for traceability.
Fixes#926