2019-08-23 21:16:30 +07:00
|
|
|
# -*- coding: utf-8 -*-
|
|
|
|
"""
|
|
|
|
Technically could be independent from mergebot but would require a lot of
|
|
|
|
duplicate work e.g. keeping track of statuses (including on commits which
|
|
|
|
might not be in PRs yet), handling co-dependent PRs, ...
|
|
|
|
|
|
|
|
However extending the mergebot also leads to messiness: fpbot should have
|
|
|
|
its own user / feedback / API keys, mergebot and fpbot both have branch
|
|
|
|
ordering but for mergebot it's completely cosmetics, being slaved to mergebot
|
|
|
|
means PR creation is trickier (as mergebot assumes opened event will always
|
|
|
|
lead to PR creation but fpbot wants to attach meaning to the PR when setting
|
|
|
|
it up), ...
|
|
|
|
"""
|
2024-06-10 23:47:49 +07:00
|
|
|
from __future__ import annotations
|
|
|
|
|
[IMP] *: crons tests running for better triggered compatibility
Mergebot / forwardport crons need to run in a specific ordering in
order to flow into one another correctly. The default ordering being
unspecified, it was not possible to use the normal cron
runner (instead of the external driver running crons in sequence one
at a time). This can be fixed by setting *sequences* on crons, as the
cron runner (`_process_jobs`) will use that order to acquire and run
crons.
Also override `_process_jobs` however: the built-in cron runner
fetches a static list of ready crons, then runs that.
This is fine for normal situation where the cron runner runs in a loop
anyway but it's any issue for the tests, as we expect that cron A can
trigger cron B, and we want cron B to run *right now* even if it
hadn't been triggered before cron A ran.
We can replace `_process_job` with a cut down version which does
that (cut down because we don't need most of the error handling /
resilience, there's no concurrent workers, there's no module being
installed, versions must match, ...). This allows e.g. the cron
propagating commit statuses to trigger the staging cron, and both will
run within the same `run_crons` session.
Something I didn't touch is that `_process_jobs` internally creates
completely new environments so there is no way to pass context into
the cron jobs anymore (whereas it works for `method_direct_trigger`),
this means the context values have to be shunted elsewhere for that
purpose which is gross. But even though I'm replacing `_process_jobs`,
this seems a bit too much of a change in cron execution semantics. So
left it out.
While at it tho, silence the spammy `py.warnings` stuff I can't do
much about.
2024-07-30 14:21:05 +07:00
|
|
|
import builtins
|
2019-10-10 17:07:57 +07:00
|
|
|
import datetime
|
2019-08-23 21:16:30 +07:00
|
|
|
import itertools
|
|
|
|
import json
|
|
|
|
import logging
|
2019-11-20 13:43:56 +07:00
|
|
|
import operator
|
2019-08-23 21:16:30 +07:00
|
|
|
import subprocess
|
2021-10-19 19:39:19 +07:00
|
|
|
import typing
|
2019-08-23 21:16:30 +07:00
|
|
|
|
2020-05-20 17:42:45 +07:00
|
|
|
import dateutil.relativedelta
|
2019-08-23 21:16:30 +07:00
|
|
|
import requests
|
|
|
|
|
2023-08-29 20:59:05 +07:00
|
|
|
from odoo import models, fields, api
|
2019-08-23 21:16:30 +07:00
|
|
|
from odoo.exceptions import UserError
|
2024-02-23 19:46:37 +07:00
|
|
|
from odoo.osv import expression
|
[FIX] forwardport: correct batching of intermediate PRs
When inserting a new branch, if there are extant forward ports
sequences spanning over the insertion, new forward ports must be
created for the new branch.
This was the case, *however* one of the cases was handled incorrectly:
PR batches (PRs to different repositories staged together) must be
forward-ported as batches. However when creating the "insert" batches,
these PRs would be split apart, leading to either inconsistent states
or an inability to merge the new forward ports (if they are
co-dependent which is a common case). This is because the handler
would look for all the relevant PRs to forward ports, but it would
fail to group them by label thereafter.
Fix that, create the `insert` batches correctly, and augment the
relevant test with an additional repository to test this case.
No gh issue was created, but this problem was reported on
odoo/odoo#110029 and odoo/enterprise#35838 (they were re-linked by
hand in the runbot and mergebot, but on github the labels remain
visibly different, one having the slug ZCwE while the other
hasO7Pr).
It's possible that the problem had occurred previously and not been
noticed (or reported), though it's also possible this issue had never
occurred before: for the latest freeze (16.1) there were 18 forward
ports spanning the insertion, but of them only 2 were the same batch.
2023-01-19 16:49:16 +07:00
|
|
|
from odoo.tools.misc import topological_sort, groupby
|
2023-10-06 18:19:01 +07:00
|
|
|
from odoo.addons.base.models.res_partner import Partner
|
[FIX] forwardport: don't break forward porting on huge conflicts
On forward-porting, odoo/odoo#170183 generates a conflict on pretty
much every one of the 1111 files it touches, because they are
modify/delete conflicts that generates a conflict message over 200
bytes per file, which is over 200kB of output.
For this specific scenario, the commit message was being passed
through arguments to the `git` command, resulting in a command line
exceeding `MAX_ARG_STRLEN`[^1]. The obvious way to fix this is to pass
the commit message via stdin as is done literally in the line above
where we just copy a non-generated commit message.
However I don't think hundreds of kbytes worth of stdout[^2] is of any
use, so shorten that a bit, and stderr while at it.
Don't touch the commit message size for now, possibly forever, but
note that some log diving reveals a commit with a legit 18kB message
(odoo/odoo@42a3b704f7c7889a74338253138d0201d3b6e7a3) so if we want to
restrict that the limit should be at least 32k, and possibly 64. But
it might be a good idea to make that limit part of the ready / merge
checks too, rather than cut things off or trigger errors during
staging.
Fixes #900
[^1]: Most resources on "Argument list too long" reference `ARG_MAX`,
but on both my machine and the server it is 2097152 (25% of the
default stack), which is ~10x larger than the commit message we
tried to generate. The actual limit is `MAX_ARG_STRLEN` which
can't be queried directly but is essentially hard-coded to
PAGE_SIZE * 32 = 128kiB, which tracks.
[^2]: Somewhat unexpectedly, that's where `git-cherry-pick` sends the
conflict info.
2024-06-25 18:11:32 +07:00
|
|
|
from odoo.addons.runbot_merge import git, utils
|
2024-05-21 20:44:47 +07:00
|
|
|
from odoo.addons.runbot_merge.models.pull_requests import Branch
|
2023-08-16 20:31:42 +07:00
|
|
|
from odoo.addons.runbot_merge.models.stagings_create import Message
|
2020-03-12 14:33:15 +07:00
|
|
|
|
2024-10-22 15:53:45 +07:00
|
|
|
Conflict = tuple[str, str, str, list[str]]
|
|
|
|
|
2019-10-10 17:07:57 +07:00
|
|
|
DEFAULT_DELTA = dateutil.relativedelta.relativedelta(days=3)
|
2019-08-23 21:16:30 +07:00
|
|
|
|
|
|
|
_logger = logging.getLogger('odoo.addons.forwardport')
|
|
|
|
|
|
|
|
class Project(models.Model):
|
|
|
|
_inherit = 'runbot_merge.project'
|
|
|
|
|
2023-10-06 18:19:01 +07:00
|
|
|
id: int
|
|
|
|
github_prefix: str
|
2019-08-23 21:16:30 +07:00
|
|
|
|
2020-01-27 21:39:25 +07:00
|
|
|
def write(self, vals):
|
|
|
|
# check on branches both active and inactive so disabling branches doesn't
|
|
|
|
# make it look like the sequence changed.
|
|
|
|
self_ = self.with_context(active_test=False)
|
2022-11-10 21:55:21 +07:00
|
|
|
previously_active_branches = {project: project.branch_ids.filtered('active') for project in self_}
|
2020-01-27 21:39:25 +07:00
|
|
|
branches_before = {project: project._forward_port_ordered() for project in self_}
|
|
|
|
|
|
|
|
r = super().write(vals)
|
2022-11-10 21:55:21 +07:00
|
|
|
self_._followup_prs(previously_active_branches)
|
|
|
|
self_._insert_intermediate_prs(branches_before)
|
|
|
|
return r
|
|
|
|
|
|
|
|
def _followup_prs(self, previously_active_branches):
|
|
|
|
"""If a branch has been disabled and had PRs without a followup (e.g.
|
|
|
|
because no CI or CI failed), create followup, as if the branch had been
|
|
|
|
originally disabled (and thus skipped over)
|
|
|
|
"""
|
[CHG] *: move forward-porting over to batches
Thank god I have a bunch of tests because once again I forgot / missed
a bunch of edge cases in doing the conversion, which the tests
caught (sadly that means I almost certainly broke a few untested edge
cases).
Important notes:
Handling of parent links
------------------------
Unlike PRs, batches don't lose their parent info ever, the link is
permanent, which is convenient to trawl through a forward port
(currently implemented very inefficiently, maybe we'll optimise that
in the future).
However this means the batch having a parent and the batch's PRs
having parents are slightly different informations, one of the edge
cases I missed is that of conflicting PRs, which are deparented and
have to be merged by hand before being forward ported further, I had
originally replaced the checks on a pr and its sibling having parents
by just the batch.
Batches & targets
-----------------
Batches were originally concepted as being fixed to a target and PRs
having that target, a PR being retargeted would move it from one batch
to an other.
As it turns out this does not work in the case where people retarget
forward-port PRs, which I know they do because #551
(2337bd85186de000d074e5a7178cfeb110d15de7). I could not think of a
good way to handle this issue as is, so scrapped the moving PRs thing,
instead one of the coherence checks of a batch being ready is that all
its PRs have the same target, and a batch only has a target if all its
PRs have the same target.
It's possible for somewhat odd effects to arise, notably if a PR is
closed (removed from batch), the other PRs are retargeted, and the new
PR is reopened, it will now be on a separate batch even if it also
gets retargeted. This is weird. I don't quite know how I should handle
it, maybe batches could merge if they have the same target and label?
however batches don't currently have a label so...
Improve limits
--------------
Keep limits on the PRs rather than lift them on the batchL if we can
add/remove PRs of batches having different limits on different PRs of
the same batch is reasonable.
Also leave limit unset by default: previously, the limit was eagerly
set to the tip (accessible) branch. That doesn't really seem
necessary, so stop doing that.
Also remove completely unnecessary `max` when trying to find a PR's
next target: `root` is either `self` or `self.source_id`, so it should
not be possible for that to have a later target.
And for now ensure the limits are consistent per batch: a PR defaults
to the limit of their batch-mate if they don't have one, and if a
limit is set via command it's set on all PRs of a batch.
This commit does not allow differential limits via commands, they are
allowed via the backend but not really tested. The issue is mostly
that it's not clear what the UX should look like to have clear and not
super error prone interactions. So punt on it for now, and hopefully
there's no hole I missed which will create inconsistent batches.
2024-05-21 20:45:53 +07:00
|
|
|
Batch = self.env['runbot_merge.batch']
|
2024-02-23 19:46:37 +07:00
|
|
|
ported = self.env['runbot_merge.pull_requests']
|
2022-11-10 21:55:21 +07:00
|
|
|
for p in self:
|
|
|
|
actives = previously_active_branches[p]
|
|
|
|
for deactivated in p.branch_ids.filtered(lambda b: not b.active) & actives:
|
2024-02-23 19:46:37 +07:00
|
|
|
# if a non-merged batch targets a deactivated branch which is
|
|
|
|
# not its limit
|
[CHG] *: move forward-porting over to batches
Thank god I have a bunch of tests because once again I forgot / missed
a bunch of edge cases in doing the conversion, which the tests
caught (sadly that means I almost certainly broke a few untested edge
cases).
Important notes:
Handling of parent links
------------------------
Unlike PRs, batches don't lose their parent info ever, the link is
permanent, which is convenient to trawl through a forward port
(currently implemented very inefficiently, maybe we'll optimise that
in the future).
However this means the batch having a parent and the batch's PRs
having parents are slightly different informations, one of the edge
cases I missed is that of conflicting PRs, which are deparented and
have to be merged by hand before being forward ported further, I had
originally replaced the checks on a pr and its sibling having parents
by just the batch.
Batches & targets
-----------------
Batches were originally concepted as being fixed to a target and PRs
having that target, a PR being retargeted would move it from one batch
to an other.
As it turns out this does not work in the case where people retarget
forward-port PRs, which I know they do because #551
(2337bd85186de000d074e5a7178cfeb110d15de7). I could not think of a
good way to handle this issue as is, so scrapped the moving PRs thing,
instead one of the coherence checks of a batch being ready is that all
its PRs have the same target, and a batch only has a target if all its
PRs have the same target.
It's possible for somewhat odd effects to arise, notably if a PR is
closed (removed from batch), the other PRs are retargeted, and the new
PR is reopened, it will now be on a separate batch even if it also
gets retargeted. This is weird. I don't quite know how I should handle
it, maybe batches could merge if they have the same target and label?
however batches don't currently have a label so...
Improve limits
--------------
Keep limits on the PRs rather than lift them on the batchL if we can
add/remove PRs of batches having different limits on different PRs of
the same batch is reasonable.
Also leave limit unset by default: previously, the limit was eagerly
set to the tip (accessible) branch. That doesn't really seem
necessary, so stop doing that.
Also remove completely unnecessary `max` when trying to find a PR's
next target: `root` is either `self` or `self.source_id`, so it should
not be possible for that to have a later target.
And for now ensure the limits are consistent per batch: a PR defaults
to the limit of their batch-mate if they don't have one, and if a
limit is set via command it's set on all PRs of a batch.
This commit does not allow differential limits via commands, they are
allowed via the backend but not really tested. The issue is mostly
that it's not clear what the UX should look like to have clear and not
super error prone interactions. So punt on it for now, and hopefully
there's no hole I missed which will create inconsistent batches.
2024-05-21 20:45:53 +07:00
|
|
|
extant = Batch.search([
|
2024-02-23 19:46:37 +07:00
|
|
|
('parent_id', '!=', False),
|
2022-11-10 21:55:21 +07:00
|
|
|
('target', '=', deactivated.id),
|
2024-02-23 19:46:37 +07:00
|
|
|
# if at least one of the PRs has a different limit
|
[CHG] *: move forward-porting over to batches
Thank god I have a bunch of tests because once again I forgot / missed
a bunch of edge cases in doing the conversion, which the tests
caught (sadly that means I almost certainly broke a few untested edge
cases).
Important notes:
Handling of parent links
------------------------
Unlike PRs, batches don't lose their parent info ever, the link is
permanent, which is convenient to trawl through a forward port
(currently implemented very inefficiently, maybe we'll optimise that
in the future).
However this means the batch having a parent and the batch's PRs
having parents are slightly different informations, one of the edge
cases I missed is that of conflicting PRs, which are deparented and
have to be merged by hand before being forward ported further, I had
originally replaced the checks on a pr and its sibling having parents
by just the batch.
Batches & targets
-----------------
Batches were originally concepted as being fixed to a target and PRs
having that target, a PR being retargeted would move it from one batch
to an other.
As it turns out this does not work in the case where people retarget
forward-port PRs, which I know they do because #551
(2337bd85186de000d074e5a7178cfeb110d15de7). I could not think of a
good way to handle this issue as is, so scrapped the moving PRs thing,
instead one of the coherence checks of a batch being ready is that all
its PRs have the same target, and a batch only has a target if all its
PRs have the same target.
It's possible for somewhat odd effects to arise, notably if a PR is
closed (removed from batch), the other PRs are retargeted, and the new
PR is reopened, it will now be on a separate batch even if it also
gets retargeted. This is weird. I don't quite know how I should handle
it, maybe batches could merge if they have the same target and label?
however batches don't currently have a label so...
Improve limits
--------------
Keep limits on the PRs rather than lift them on the batchL if we can
add/remove PRs of batches having different limits on different PRs of
the same batch is reasonable.
Also leave limit unset by default: previously, the limit was eagerly
set to the tip (accessible) branch. That doesn't really seem
necessary, so stop doing that.
Also remove completely unnecessary `max` when trying to find a PR's
next target: `root` is either `self` or `self.source_id`, so it should
not be possible for that to have a later target.
And for now ensure the limits are consistent per batch: a PR defaults
to the limit of their batch-mate if they don't have one, and if a
limit is set via command it's set on all PRs of a batch.
This commit does not allow differential limits via commands, they are
allowed via the backend but not really tested. The issue is mostly
that it's not clear what the UX should look like to have clear and not
super error prone interactions. So punt on it for now, and hopefully
there's no hole I missed which will create inconsistent batches.
2024-05-21 20:45:53 +07:00
|
|
|
('prs.limit_id', '!=', deactivated.id),
|
|
|
|
('merge_date', '=', False),
|
2024-02-23 19:46:37 +07:00
|
|
|
]).filtered(lambda b:\
|
|
|
|
# and has a next target (should already be a function of
|
|
|
|
# the search but doesn't hurt)
|
|
|
|
b._find_next_target() \
|
|
|
|
# and has not already been forward ported
|
|
|
|
and Batch.search_count([('parent_id', '=', b.id)]) == 0
|
|
|
|
)
|
2024-06-28 17:22:23 +07:00
|
|
|
|
|
|
|
# PRs may have different limits in the same batch so only notify
|
|
|
|
# those which actually needed porting
|
|
|
|
ported |= extant._schedule_fp_followup(force_fw=True)\
|
|
|
|
.prs.filtered(lambda p: p._find_next_target())
|
2022-11-10 21:55:21 +07:00
|
|
|
|
2024-02-23 19:46:37 +07:00
|
|
|
if not ported:
|
|
|
|
return
|
|
|
|
|
|
|
|
for feedback in self.env['runbot_merge.pull_requests.feedback'].search(expression.OR(
|
|
|
|
[('repository', '=', p.repository.id), ('pull_request', '=', p.number)]
|
|
|
|
for p in ported
|
|
|
|
)):
|
|
|
|
# FIXME: better signal
|
|
|
|
if 'disabled' in feedback.message:
|
|
|
|
feedback.message += '\n\nAs this was not its limit, it will automatically be forward ported to the next active branch.'
|
2022-11-10 21:55:21 +07:00
|
|
|
|
|
|
|
def _insert_intermediate_prs(self, branches_before):
|
|
|
|
"""If new branches have been added to the sequence inbetween existing
|
|
|
|
branches (mostly a freeze inserted before the main branch), fill in
|
|
|
|
forward-ports for existing sequences
|
|
|
|
"""
|
|
|
|
Branches = self.env['runbot_merge.branch']
|
|
|
|
for p in self:
|
2020-01-27 21:39:25 +07:00
|
|
|
# check if the branches sequence has been modified
|
|
|
|
bbefore = branches_before[p]
|
|
|
|
bafter = p._forward_port_ordered()
|
|
|
|
if bafter.ids == bbefore.ids:
|
|
|
|
continue
|
2021-01-12 18:54:35 +07:00
|
|
|
|
|
|
|
logger = _logger.getChild('project').getChild(p.name)
|
|
|
|
logger.debug("branches updated %s -> %s", bbefore, bafter)
|
2020-01-27 21:39:25 +07:00
|
|
|
# if it's just that a branch was inserted at the end forwardport
|
|
|
|
# should keep on keeping normally
|
|
|
|
if bafter.ids[:-1] == bbefore.ids:
|
|
|
|
continue
|
|
|
|
|
|
|
|
if bafter <= bbefore:
|
|
|
|
raise UserError("Branches can not be reordered or removed after saving.")
|
|
|
|
|
|
|
|
# Last possibility: branch was inserted but not at end, get all
|
|
|
|
# branches before and all branches after
|
|
|
|
before = new = after = Branches
|
|
|
|
for b in bafter:
|
|
|
|
if b in bbefore:
|
|
|
|
if new:
|
|
|
|
after += b
|
|
|
|
else:
|
|
|
|
before += b
|
|
|
|
else:
|
|
|
|
if new:
|
|
|
|
raise UserError("Inserting multiple branches at the same time is not supported")
|
|
|
|
new = b
|
2021-01-12 18:54:35 +07:00
|
|
|
logger.debug('before: %s new: %s after: %s', before.ids, new.ids, after.ids)
|
2020-01-27 21:39:25 +07:00
|
|
|
# find all FPs whose ancestry spans the insertion
|
|
|
|
leaves = self.env['runbot_merge.pull_requests'].search([
|
|
|
|
('state', 'not in', ['closed', 'merged']),
|
|
|
|
('target', 'in', after.ids),
|
|
|
|
('source_id.target', 'in', before.ids),
|
|
|
|
])
|
|
|
|
# get all PRs just preceding the insertion point which either are
|
|
|
|
# sources of the above or have the same source
|
|
|
|
candidates = self.env['runbot_merge.pull_requests'].search([
|
|
|
|
('target', '=', before[-1].id),
|
|
|
|
'|', ('id', 'in', leaves.mapped('source_id').ids),
|
|
|
|
('source_id', 'in', leaves.mapped('source_id').ids),
|
|
|
|
])
|
2021-01-12 18:54:35 +07:00
|
|
|
logger.debug("\nPRs spanning new: %s\nto port: %s", leaves, candidates)
|
2020-01-27 21:39:25 +07:00
|
|
|
# enqueue the creation of a new forward-port based on our candidates
|
[FIX] forwardport: correct batching of intermediate PRs
When inserting a new branch, if there are extant forward ports
sequences spanning over the insertion, new forward ports must be
created for the new branch.
This was the case, *however* one of the cases was handled incorrectly:
PR batches (PRs to different repositories staged together) must be
forward-ported as batches. However when creating the "insert" batches,
these PRs would be split apart, leading to either inconsistent states
or an inability to merge the new forward ports (if they are
co-dependent which is a common case). This is because the handler
would look for all the relevant PRs to forward ports, but it would
fail to group them by label thereafter.
Fix that, create the `insert` batches correctly, and augment the
relevant test with an additional repository to test this case.
No gh issue was created, but this problem was reported on
odoo/odoo#110029 and odoo/enterprise#35838 (they were re-linked by
hand in the runbot and mergebot, but on github the labels remain
visibly different, one having the slug ZCwE while the other
hasO7Pr).
It's possible that the problem had occurred previously and not been
noticed (or reported), though it's also possible this issue had never
occurred before: for the latest freeze (16.1) there were 18 forward
ports spanning the insertion, but of them only 2 were the same batch.
2023-01-19 16:49:16 +07:00
|
|
|
# but it should only create a single step and needs to stitch back
|
2020-01-27 21:39:25 +07:00
|
|
|
# the parents linked list, so it has a special type
|
[FIX] forwardport: correct batching of intermediate PRs
When inserting a new branch, if there are extant forward ports
sequences spanning over the insertion, new forward ports must be
created for the new branch.
This was the case, *however* one of the cases was handled incorrectly:
PR batches (PRs to different repositories staged together) must be
forward-ported as batches. However when creating the "insert" batches,
these PRs would be split apart, leading to either inconsistent states
or an inability to merge the new forward ports (if they are
co-dependent which is a common case). This is because the handler
would look for all the relevant PRs to forward ports, but it would
fail to group them by label thereafter.
Fix that, create the `insert` batches correctly, and augment the
relevant test with an additional repository to test this case.
No gh issue was created, but this problem was reported on
odoo/odoo#110029 and odoo/enterprise#35838 (they were re-linked by
hand in the runbot and mergebot, but on github the labels remain
visibly different, one having the slug ZCwE while the other
hasO7Pr).
It's possible that the problem had occurred previously and not been
noticed (or reported), though it's also possible this issue had never
occurred before: for the latest freeze (16.1) there were 18 forward
ports spanning the insertion, but of them only 2 were the same batch.
2023-01-19 16:49:16 +07:00
|
|
|
for _, cs in groupby(candidates, key=lambda p: p.label):
|
2020-01-27 21:39:25 +07:00
|
|
|
self.env['forwardport.batches'].create({
|
[CHG] *: persistent batches
This probably has latent bugs, and is only the start of the road to v2
(#789): PR batches are now created up-front (alongside the PR), with
PRs attached and detached as needed, hopefully such that things are
not broken (tests pass but...), this required a fair number of
ajustments to code not taking batches into account, or creating
batches on the fly.
`PullRequests.blocked` has also been updated to rely on the batch to
get its batch-mates, such that it can now be a stored field with the
right dependencies.
The next step is to better leverage this change:
- move cross-PR state up to the batch (e.g. skipchecks, priority, ...)
- add fw info to the batch, perform forward-ports batchwise in order
to avoid redundant batch-selection work, and allow altering batches
during fw (e.g. adding or removing PRs)
- use batches to select stagings
- maybe expose staging history of a batch?
2023-12-19 17:10:11 +07:00
|
|
|
'batch_id': cs[0].batch_id.id,
|
2020-01-27 21:39:25 +07:00
|
|
|
'source': 'insert',
|
|
|
|
})
|
|
|
|
|
2019-08-23 21:16:30 +07:00
|
|
|
class Repository(models.Model):
|
|
|
|
_inherit = 'runbot_merge.repository'
|
|
|
|
|
2023-10-06 18:19:01 +07:00
|
|
|
id: int
|
|
|
|
project_id: Project
|
|
|
|
name: str
|
|
|
|
branch_filter: str
|
2019-08-23 21:16:30 +07:00
|
|
|
fp_remote_target = fields.Char(help="where FP branches get pushed")
|
|
|
|
|
|
|
|
class PullRequests(models.Model):
|
|
|
|
_inherit = 'runbot_merge.pull_requests'
|
|
|
|
|
2023-10-06 18:19:01 +07:00
|
|
|
id: int
|
|
|
|
display_name: str
|
|
|
|
number: int
|
|
|
|
repository: Repository
|
|
|
|
target: Branch
|
|
|
|
reviewed_by: Partner
|
|
|
|
head: str
|
|
|
|
state: str
|
2024-06-10 23:47:49 +07:00
|
|
|
merge_date: datetime.datetime
|
|
|
|
parent_id: PullRequests
|
2019-08-23 21:16:30 +07:00
|
|
|
|
2022-12-07 21:13:55 +07:00
|
|
|
reminder_backoff_factor = fields.Integer(default=-4, group_operator=None)
|
2023-06-12 19:41:42 +07:00
|
|
|
|
[MERGE] bot from 16.0 to 17.0
Broken (can't run odoo at all):
- In Odoo 17.0, the `pre_init_hook` takes an env, not a cursor, update
`_check_citext`.
- Odoo 17.0 rejects `@attrs` and doesn't say where they are or how to
update them, fun, hunt down `attrs={'invisible': ...` and try to fix
them.
- Odoo 17.0 warns on non-multi creates, update them, most were very
reasonable, one very wasn't.
Test failures:
- Odoo 17.0 deprecates `name_get` and doesn't use it as a *source*
anymore, replace overrides by overrides to `_compute_display_name`.
- Multiple tracking changes:
- `_track_set_author` takes a `Partner` not an id.
- `_message_compute_author` still requires overriding in order to
handle record creation, which in standard doesn't support author
overriding.
- `mail.tracking.value.field_type` has been removed, the field type
now needs to be retrieved from the `field_id`.
- Some tracking ordering have changed and require adjusting a few
tests.
Also added a few flushes before SQL queries which are not (obviously
at least) at the start of a cron or controller, no test failure
observed but better safe than sorry (probably).
2024-08-12 18:13:03 +07:00
|
|
|
@api.model_create_multi
|
|
|
|
def create(self, vals_list):
|
|
|
|
created = []
|
|
|
|
to_create = []
|
|
|
|
old = self.browse(())
|
|
|
|
for vals in vals_list:
|
|
|
|
# PR opened event always creates a new PR, override so we can precreate PRs
|
|
|
|
existing = self.search([
|
|
|
|
('repository', '=', vals['repository']),
|
|
|
|
('number', '=', vals['number']),
|
|
|
|
])
|
|
|
|
created.append(not existing)
|
|
|
|
if existing:
|
|
|
|
old |= existing
|
|
|
|
continue
|
|
|
|
to_create.append(vals)
|
|
|
|
if vals.get('parent_id') and 'source_id' not in vals:
|
|
|
|
vals['source_id'] = self.browse(vals['parent_id']).root_id.id
|
2024-11-19 20:58:41 +07:00
|
|
|
if not to_create:
|
2024-11-20 14:05:41 +07:00
|
|
|
return old
|
[MERGE] bot from 16.0 to 17.0
Broken (can't run odoo at all):
- In Odoo 17.0, the `pre_init_hook` takes an env, not a cursor, update
`_check_citext`.
- Odoo 17.0 rejects `@attrs` and doesn't say where they are or how to
update them, fun, hunt down `attrs={'invisible': ...` and try to fix
them.
- Odoo 17.0 warns on non-multi creates, update them, most were very
reasonable, one very wasn't.
Test failures:
- Odoo 17.0 deprecates `name_get` and doesn't use it as a *source*
anymore, replace overrides by overrides to `_compute_display_name`.
- Multiple tracking changes:
- `_track_set_author` takes a `Partner` not an id.
- `_message_compute_author` still requires overriding in order to
handle record creation, which in standard doesn't support author
overriding.
- `mail.tracking.value.field_type` has been removed, the field type
now needs to be retrieved from the `field_id`.
- Some tracking ordering have changed and require adjusting a few
tests.
Also added a few flushes before SQL queries which are not (obviously
at least) at the start of a cron or controller, no test failure
observed but better safe than sorry (probably).
2024-08-12 18:13:03 +07:00
|
|
|
|
2024-11-20 14:05:41 +07:00
|
|
|
new = super().create(to_create)
|
[MERGE] bot from 16.0 to 17.0
Broken (can't run odoo at all):
- In Odoo 17.0, the `pre_init_hook` takes an env, not a cursor, update
`_check_citext`.
- Odoo 17.0 rejects `@attrs` and doesn't say where they are or how to
update them, fun, hunt down `attrs={'invisible': ...` and try to fix
them.
- Odoo 17.0 warns on non-multi creates, update them, most were very
reasonable, one very wasn't.
Test failures:
- Odoo 17.0 deprecates `name_get` and doesn't use it as a *source*
anymore, replace overrides by overrides to `_compute_display_name`.
- Multiple tracking changes:
- `_track_set_author` takes a `Partner` not an id.
- `_message_compute_author` still requires overriding in order to
handle record creation, which in standard doesn't support author
overriding.
- `mail.tracking.value.field_type` has been removed, the field type
now needs to be retrieved from the `field_id`.
- Some tracking ordering have changed and require adjusting a few
tests.
Also added a few flushes before SQL queries which are not (obviously
at least) at the start of a cron or controller, no test failure
observed but better safe than sorry (probably).
2024-08-12 18:13:03 +07:00
|
|
|
for pr in new:
|
|
|
|
# added a new PR to an already forward-ported batch: port the PR
|
|
|
|
if self.env['runbot_merge.batch'].search_count([
|
|
|
|
('parent_id', '=', pr.batch_id.id),
|
|
|
|
]):
|
|
|
|
self.env['forwardport.batches'].create({
|
|
|
|
'batch_id': pr.batch_id.id,
|
|
|
|
'source': 'complete',
|
|
|
|
'pr_id': pr.id,
|
|
|
|
})
|
2024-11-20 14:05:41 +07:00
|
|
|
if not old:
|
|
|
|
return new
|
[MERGE] bot from 16.0 to 17.0
Broken (can't run odoo at all):
- In Odoo 17.0, the `pre_init_hook` takes an env, not a cursor, update
`_check_citext`.
- Odoo 17.0 rejects `@attrs` and doesn't say where they are or how to
update them, fun, hunt down `attrs={'invisible': ...` and try to fix
them.
- Odoo 17.0 warns on non-multi creates, update them, most were very
reasonable, one very wasn't.
Test failures:
- Odoo 17.0 deprecates `name_get` and doesn't use it as a *source*
anymore, replace overrides by overrides to `_compute_display_name`.
- Multiple tracking changes:
- `_track_set_author` takes a `Partner` not an id.
- `_message_compute_author` still requires overriding in order to
handle record creation, which in standard doesn't support author
overriding.
- `mail.tracking.value.field_type` has been removed, the field type
now needs to be retrieved from the `field_id`.
- Some tracking ordering have changed and require adjusting a few
tests.
Also added a few flushes before SQL queries which are not (obviously
at least) at the start of a cron or controller, no test failure
observed but better safe than sorry (probably).
2024-08-12 18:13:03 +07:00
|
|
|
|
|
|
|
new = iter(new)
|
|
|
|
old = iter(old)
|
|
|
|
return self.browse(next(new).id if c else next(old).id for c in created)
|
2019-08-23 21:16:30 +07:00
|
|
|
|
|
|
|
def write(self, vals):
|
|
|
|
# if the PR's head is updated, detach (should split off the FP lines as this is not the original code)
|
|
|
|
# TODO: better way to do this? Especially because we don't want to
|
|
|
|
# recursively create updates
|
|
|
|
# also a bit odd to only handle updating 1 head at a time, but then
|
|
|
|
# again 2 PRs with same head is weird so...
|
|
|
|
newhead = vals.get('head')
|
2023-08-29 20:59:05 +07:00
|
|
|
with_parents = {
|
|
|
|
p: p.parent_id
|
|
|
|
for p in self
|
2024-03-19 17:46:36 +07:00
|
|
|
if p.state not in ('merged', 'closed')
|
2023-08-29 20:59:05 +07:00
|
|
|
if p.parent_id
|
|
|
|
}
|
2022-06-29 15:58:13 +07:00
|
|
|
closed_fp = self.filtered(lambda p: p.state == 'closed' and p.source_id)
|
2019-08-23 21:16:30 +07:00
|
|
|
if newhead and not self.env.context.get('ignore_head_update') and newhead != self.head:
|
|
|
|
vals.setdefault('parent_id', False)
|
2023-01-19 20:23:41 +07:00
|
|
|
if with_parents and vals['parent_id'] is False:
|
|
|
|
vals['detach_reason'] = f"Head updated from {self.head} to {newhead}"
|
2019-08-23 21:16:30 +07:00
|
|
|
# if any children, this is an FP PR being updated, enqueue
|
|
|
|
# updating children
|
|
|
|
if self.search_count([('parent_id', '=', self.id)]):
|
|
|
|
self.env['forwardport.updates'].create({
|
2023-10-06 18:19:01 +07:00
|
|
|
'original_root': self.root_id.id,
|
2019-08-23 21:16:30 +07:00
|
|
|
'new_root': self.id
|
|
|
|
})
|
|
|
|
|
|
|
|
if vals.get('parent_id') and 'source_id' not in vals:
|
2023-10-31 13:36:10 +07:00
|
|
|
parent = self.browse(vals['parent_id'])
|
|
|
|
vals['source_id'] = (parent.source_id or parent).id
|
2019-09-23 16:50:21 +07:00
|
|
|
r = super().write(vals)
|
|
|
|
if self.env.context.get('forwardport_detach_warn', True):
|
2023-08-29 20:59:05 +07:00
|
|
|
for p, parent in with_parents.items():
|
|
|
|
if p.parent_id:
|
|
|
|
continue
|
|
|
|
self.env.ref('runbot_merge.forwardport.update.detached')._send(
|
|
|
|
repository=p.repository,
|
|
|
|
pull_request=p.number,
|
|
|
|
token_field='fp_github_token',
|
|
|
|
format_args={'pr': p},
|
|
|
|
)
|
2024-03-12 20:57:50 +07:00
|
|
|
if parent.state not in ('closed', 'merged'):
|
|
|
|
self.env.ref('runbot_merge.forwardport.update.parent')._send(
|
|
|
|
repository=parent.repository,
|
|
|
|
pull_request=parent.number,
|
|
|
|
token_field='fp_github_token',
|
|
|
|
format_args={'pr': parent, 'child': p},
|
2022-06-29 15:58:13 +07:00
|
|
|
)
|
2019-09-23 16:50:21 +07:00
|
|
|
return r
|
2019-08-23 21:16:30 +07:00
|
|
|
|
2019-09-12 16:58:10 +07:00
|
|
|
def _commits_lazy(self):
|
2019-09-12 19:35:56 +07:00
|
|
|
s = requests.Session()
|
|
|
|
s.headers['Authorization'] = 'token %s' % self.repository.project_id.fp_github_token
|
2019-09-12 16:58:10 +07:00
|
|
|
for page in itertools.count(1):
|
2019-09-12 19:35:56 +07:00
|
|
|
r = s.get('https://api.github.com/repos/{}/pulls/{}/commits'.format(
|
2019-09-12 16:58:10 +07:00
|
|
|
self.repository.name,
|
|
|
|
self.number
|
|
|
|
), params={'page': page})
|
|
|
|
r.raise_for_status()
|
|
|
|
yield from r.json()
|
|
|
|
if not r.links.get('next'):
|
|
|
|
return
|
|
|
|
|
|
|
|
def commits(self):
|
|
|
|
""" Returns a PR's commits oldest first (that's what GH does &
|
|
|
|
is what we want)
|
|
|
|
"""
|
|
|
|
commits = list(self._commits_lazy())
|
|
|
|
# map shas to the position the commit *should* have
|
|
|
|
idx = {
|
|
|
|
c: i
|
|
|
|
for i, c in enumerate(topological_sort({
|
|
|
|
c['sha']: [p['sha'] for p in c['parents']]
|
|
|
|
for c in commits
|
|
|
|
}))
|
|
|
|
}
|
|
|
|
return sorted(commits, key=lambda c: idx[c['sha']])
|
|
|
|
|
2021-03-01 16:35:09 +07:00
|
|
|
|
2024-10-22 15:53:45 +07:00
|
|
|
def _create_port_branch(
|
|
|
|
self,
|
|
|
|
source: git.Repo,
|
|
|
|
target_branch: Branch,
|
|
|
|
*,
|
|
|
|
forward: bool,
|
|
|
|
) -> tuple[typing.Optional[Conflict], str]:
|
2019-08-23 21:16:30 +07:00
|
|
|
""" Creates a forward-port for the current PR to ``target_branch`` under
|
|
|
|
``fp_branch_name``.
|
|
|
|
|
2024-10-22 15:53:45 +07:00
|
|
|
:param source: the git repository to work with / in
|
|
|
|
:param target_branch: the branch to port ``self`` to
|
|
|
|
:param forward: whether this is a forward (``True``) or a back
|
|
|
|
(``False``) port
|
|
|
|
:returns: a conflict if one happened, and the head of the port branch
|
|
|
|
(either a succcessful port of the entire `self`, or a conflict
|
|
|
|
commit)
|
2019-08-23 21:16:30 +07:00
|
|
|
"""
|
2022-10-28 13:03:12 +07:00
|
|
|
logger = _logger.getChild(str(self.id))
|
2023-10-06 18:19:01 +07:00
|
|
|
root = self.root_id
|
2022-10-28 13:03:12 +07:00
|
|
|
logger.info(
|
2024-10-22 15:53:45 +07:00
|
|
|
"%s %s (%s) to %s",
|
|
|
|
"Forward-porting" if forward else "Back-porting",
|
2022-10-28 13:03:12 +07:00
|
|
|
self.display_name, root.display_name, target_branch.name
|
|
|
|
)
|
2024-07-26 19:48:59 +07:00
|
|
|
fetch = source.with_config(stdout=subprocess.PIPE, stderr=subprocess.STDOUT).fetch()
|
|
|
|
logger.info("Updated cache repo %s:\n%s", source._directory, fetch.stdout.decode())
|
[FIX] forwardport: storage of old garbage, repo cache sizes
Since the forwardport bot works off of PRs, when it was created
leveraging the magic refs of github (refs/pull/*/head) seemed
sensible: when updating the cache repo, the magic refs would be
updated alongside and the forward-porting would have all the commits
available.
Turns out there are a few issues with this:
- the magic refs have become a bit unreliable, so we need a fallback
(b1c16fce8768080d30430f4e6d3788b60ce13de7)
- the forwardport bot only needs the commits on a transient basis, but
the magic refs live forever and diverge from all other content
(since we rarely `merge` PRs), for a large repo with lots of PRs
this leads to a significant inflation in size of repo (both on-disk
and objects count) e.g. odoo/odoo has about 25% more objects
with the pull refs included (3486550 to 4395319) and takes nearly
50% more space on disk (1.21G to 1.77)
As a result, we can just stop configuring the pull refs entirely, and
instead fetch the heads (~refs) we need as we need them. This can be a
touch more expensive at times as it requires two `fetch` calls, and
we'll have a few redundant calls as every forward port of a chain will
require a fetch of the root's head, *however* it will avoid retrieving
PR data for e.g. prs to master, as they don't get forward-ported, they
also won't get parasite updates from PRs being ignored or eventually
closed.
Eventually this could be optimised further by
- performing an update of the cache repo at the start of the cron iff
there are batches to port
- creating a temp clone for the batch
- fetching the heads of all PRs to forward port in the temp clone in a
single `fetch`
- performing all the ports by cloning the temp clone (and not
`fetch`-ing into those)
- then cleaning up the temp clone after having cleaned up individual
forward port clones
Small followup for #489
2022-11-08 18:02:45 +07:00
|
|
|
|
2024-07-26 19:48:59 +07:00
|
|
|
head_fetch = source.with_config(stdout=subprocess.PIPE, stderr=subprocess.STDOUT, check=False) \
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
.fetch(git.source_url(self.repository), root.head)
|
2024-07-26 19:48:59 +07:00
|
|
|
if head_fetch.returncode:
|
[FIX] forwardport: storage of old garbage, repo cache sizes
Since the forwardport bot works off of PRs, when it was created
leveraging the magic refs of github (refs/pull/*/head) seemed
sensible: when updating the cache repo, the magic refs would be
updated alongside and the forward-porting would have all the commits
available.
Turns out there are a few issues with this:
- the magic refs have become a bit unreliable, so we need a fallback
(b1c16fce8768080d30430f4e6d3788b60ce13de7)
- the forwardport bot only needs the commits on a transient basis, but
the magic refs live forever and diverge from all other content
(since we rarely `merge` PRs), for a large repo with lots of PRs
this leads to a significant inflation in size of repo (both on-disk
and objects count) e.g. odoo/odoo has about 25% more objects
with the pull refs included (3486550 to 4395319) and takes nearly
50% more space on disk (1.21G to 1.77)
As a result, we can just stop configuring the pull refs entirely, and
instead fetch the heads (~refs) we need as we need them. This can be a
touch more expensive at times as it requires two `fetch` calls, and
we'll have a few redundant calls as every forward port of a chain will
require a fetch of the root's head, *however* it will avoid retrieving
PR data for e.g. prs to master, as they don't get forward-ported, they
also won't get parasite updates from PRs being ignored or eventually
closed.
Eventually this could be optimised further by
- performing an update of the cache repo at the start of the cron iff
there are batches to port
- creating a temp clone for the batch
- fetching the heads of all PRs to forward port in the temp clone in a
single `fetch`
- performing all the ports by cloning the temp clone (and not
`fetch`-ing into those)
- then cleaning up the temp clone after having cleaned up individual
forward port clones
Small followup for #489
2022-11-08 18:02:45 +07:00
|
|
|
raise ForwardPortError(
|
|
|
|
f"During forward port of {self.display_name}, unable to find "
|
|
|
|
f"expected head of {root.display_name} ({root.head})"
|
|
|
|
)
|
|
|
|
|
|
|
|
logger.info(
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
"Fetched head of %s (%s):\n%s",
|
[FIX] forwardport: storage of old garbage, repo cache sizes
Since the forwardport bot works off of PRs, when it was created
leveraging the magic refs of github (refs/pull/*/head) seemed
sensible: when updating the cache repo, the magic refs would be
updated alongside and the forward-porting would have all the commits
available.
Turns out there are a few issues with this:
- the magic refs have become a bit unreliable, so we need a fallback
(b1c16fce8768080d30430f4e6d3788b60ce13de7)
- the forwardport bot only needs the commits on a transient basis, but
the magic refs live forever and diverge from all other content
(since we rarely `merge` PRs), for a large repo with lots of PRs
this leads to a significant inflation in size of repo (both on-disk
and objects count) e.g. odoo/odoo has about 25% more objects
with the pull refs included (3486550 to 4395319) and takes nearly
50% more space on disk (1.21G to 1.77)
As a result, we can just stop configuring the pull refs entirely, and
instead fetch the heads (~refs) we need as we need them. This can be a
touch more expensive at times as it requires two `fetch` calls, and
we'll have a few redundant calls as every forward port of a chain will
require a fetch of the root's head, *however* it will avoid retrieving
PR data for e.g. prs to master, as they don't get forward-ported, they
also won't get parasite updates from PRs being ignored or eventually
closed.
Eventually this could be optimised further by
- performing an update of the cache repo at the start of the cron iff
there are batches to port
- creating a temp clone for the batch
- fetching the heads of all PRs to forward port in the temp clone in a
single `fetch`
- performing all the ports by cloning the temp clone (and not
`fetch`-ing into those)
- then cleaning up the temp clone after having cleaned up individual
forward port clones
Small followup for #489
2022-11-08 18:02:45 +07:00
|
|
|
root.display_name,
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
root.head,
|
2024-07-26 19:48:59 +07:00
|
|
|
head_fetch.stdout.decode()
|
2019-08-23 21:16:30 +07:00
|
|
|
)
|
|
|
|
|
|
|
|
try:
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
return None, root._cherry_pick(source, target_branch.name)
|
2019-08-23 21:16:30 +07:00
|
|
|
except CherrypickError as e:
|
[FIX] forwardport: storage of old garbage, repo cache sizes
Since the forwardport bot works off of PRs, when it was created
leveraging the magic refs of github (refs/pull/*/head) seemed
sensible: when updating the cache repo, the magic refs would be
updated alongside and the forward-porting would have all the commits
available.
Turns out there are a few issues with this:
- the magic refs have become a bit unreliable, so we need a fallback
(b1c16fce8768080d30430f4e6d3788b60ce13de7)
- the forwardport bot only needs the commits on a transient basis, but
the magic refs live forever and diverge from all other content
(since we rarely `merge` PRs), for a large repo with lots of PRs
this leads to a significant inflation in size of repo (both on-disk
and objects count) e.g. odoo/odoo has about 25% more objects
with the pull refs included (3486550 to 4395319) and takes nearly
50% more space on disk (1.21G to 1.77)
As a result, we can just stop configuring the pull refs entirely, and
instead fetch the heads (~refs) we need as we need them. This can be a
touch more expensive at times as it requires two `fetch` calls, and
we'll have a few redundant calls as every forward port of a chain will
require a fetch of the root's head, *however* it will avoid retrieving
PR data for e.g. prs to master, as they don't get forward-ported, they
also won't get parasite updates from PRs being ignored or eventually
closed.
Eventually this could be optimised further by
- performing an update of the cache repo at the start of the cron iff
there are batches to port
- creating a temp clone for the batch
- fetching the heads of all PRs to forward port in the temp clone in a
single `fetch`
- performing all the ports by cloning the temp clone (and not
`fetch`-ing into those)
- then cleaning up the temp clone after having cleaned up individual
forward port clones
Small followup for #489
2022-11-08 18:02:45 +07:00
|
|
|
h, out, err, commits = e.args
|
|
|
|
|
2021-08-02 14:18:30 +07:00
|
|
|
# commits returns oldest first, so youngest (head) last
|
[FIX] forwardport: storage of old garbage, repo cache sizes
Since the forwardport bot works off of PRs, when it was created
leveraging the magic refs of github (refs/pull/*/head) seemed
sensible: when updating the cache repo, the magic refs would be
updated alongside and the forward-porting would have all the commits
available.
Turns out there are a few issues with this:
- the magic refs have become a bit unreliable, so we need a fallback
(b1c16fce8768080d30430f4e6d3788b60ce13de7)
- the forwardport bot only needs the commits on a transient basis, but
the magic refs live forever and diverge from all other content
(since we rarely `merge` PRs), for a large repo with lots of PRs
this leads to a significant inflation in size of repo (both on-disk
and objects count) e.g. odoo/odoo has about 25% more objects
with the pull refs included (3486550 to 4395319) and takes nearly
50% more space on disk (1.21G to 1.77)
As a result, we can just stop configuring the pull refs entirely, and
instead fetch the heads (~refs) we need as we need them. This can be a
touch more expensive at times as it requires two `fetch` calls, and
we'll have a few redundant calls as every forward port of a chain will
require a fetch of the root's head, *however* it will avoid retrieving
PR data for e.g. prs to master, as they don't get forward-ported, they
also won't get parasite updates from PRs being ignored or eventually
closed.
Eventually this could be optimised further by
- performing an update of the cache repo at the start of the cron iff
there are batches to port
- creating a temp clone for the batch
- fetching the heads of all PRs to forward port in the temp clone in a
single `fetch`
- performing all the ports by cloning the temp clone (and not
`fetch`-ing into those)
- then cleaning up the temp clone after having cleaned up individual
forward port clones
Small followup for #489
2022-11-08 18:02:45 +07:00
|
|
|
head_commit = commits[-1]['commit']
|
2019-09-12 16:58:10 +07:00
|
|
|
|
2021-08-02 14:18:30 +07:00
|
|
|
to_tuple = operator.itemgetter('name', 'email')
|
2019-11-20 13:43:56 +07:00
|
|
|
authors, committers = set(), set()
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
for commit in (c['commit'] for c in commits):
|
|
|
|
authors.add(to_tuple(commit['author']))
|
|
|
|
committers.add(to_tuple(commit['committer']))
|
|
|
|
fp_authorship = (self.repository.project_id.fp_github_name, '', '')
|
[FIX] forwardport: don't break forward porting on huge conflicts
On forward-porting, odoo/odoo#170183 generates a conflict on pretty
much every one of the 1111 files it touches, because they are
modify/delete conflicts that generates a conflict message over 200
bytes per file, which is over 200kB of output.
For this specific scenario, the commit message was being passed
through arguments to the `git` command, resulting in a command line
exceeding `MAX_ARG_STRLEN`[^1]. The obvious way to fix this is to pass
the commit message via stdin as is done literally in the line above
where we just copy a non-generated commit message.
However I don't think hundreds of kbytes worth of stdout[^2] is of any
use, so shorten that a bit, and stderr while at it.
Don't touch the commit message size for now, possibly forever, but
note that some log diving reveals a commit with a legit 18kB message
(odoo/odoo@42a3b704f7c7889a74338253138d0201d3b6e7a3) so if we want to
restrict that the limit should be at least 32k, and possibly 64. But
it might be a good idea to make that limit part of the ready / merge
checks too, rather than cut things off or trigger errors during
staging.
Fixes #900
[^1]: Most resources on "Argument list too long" reference `ARG_MAX`,
but on both my machine and the server it is 2097152 (25% of the
default stack), which is ~10x larger than the commit message we
tried to generate. The actual limit is `MAX_ARG_STRLEN` which
can't be queried directly but is essentially hard-coded to
PAGE_SIZE * 32 = 128kiB, which tracks.
[^2]: Somewhat unexpectedly, that's where `git-cherry-pick` sends the
conflict info.
2024-06-25 18:11:32 +07:00
|
|
|
author = fp_authorship if len(authors) != 1 \
|
2021-08-02 14:18:30 +07:00
|
|
|
else authors.pop() + (head_commit['author']['date'],)
|
|
|
|
committer = fp_authorship if len(committers) != 1 \
|
|
|
|
else committers.pop() + (head_commit['committer']['date'],)
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
conf = source.with_params(
|
2024-06-04 19:18:36 +07:00
|
|
|
'merge.renamelimit=0',
|
|
|
|
'merge.renames=copies',
|
|
|
|
'merge.conflictstyle=zdiff3'
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
).with_config(stdout=subprocess.PIPE, stderr=subprocess.PIPE)
|
[FIX] forwardport: don't break forward porting on huge conflicts
On forward-porting, odoo/odoo#170183 generates a conflict on pretty
much every one of the 1111 files it touches, because they are
modify/delete conflicts that generates a conflict message over 200
bytes per file, which is over 200kB of output.
For this specific scenario, the commit message was being passed
through arguments to the `git` command, resulting in a command line
exceeding `MAX_ARG_STRLEN`[^1]. The obvious way to fix this is to pass
the commit message via stdin as is done literally in the line above
where we just copy a non-generated commit message.
However I don't think hundreds of kbytes worth of stdout[^2] is of any
use, so shorten that a bit, and stderr while at it.
Don't touch the commit message size for now, possibly forever, but
note that some log diving reveals a commit with a legit 18kB message
(odoo/odoo@42a3b704f7c7889a74338253138d0201d3b6e7a3) so if we want to
restrict that the limit should be at least 32k, and possibly 64. But
it might be a good idea to make that limit part of the ready / merge
checks too, rather than cut things off or trigger errors during
staging.
Fixes #900
[^1]: Most resources on "Argument list too long" reference `ARG_MAX`,
but on both my machine and the server it is 2097152 (25% of the
default stack), which is ~10x larger than the commit message we
tried to generate. The actual limit is `MAX_ARG_STRLEN` which
can't be queried directly but is essentially hard-coded to
PAGE_SIZE * 32 = 128kiB, which tracks.
[^2]: Somewhat unexpectedly, that's where `git-cherry-pick` sends the
conflict info.
2024-06-25 18:11:32 +07:00
|
|
|
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
tree = conf.with_config(check=False).merge_tree(
|
|
|
|
'--merge-base', commits[0]['parents'][0]['sha'],
|
|
|
|
target_branch.name,
|
|
|
|
root.head,
|
[IMP] forwardport: surfacing of modify/delete conflicts
Given branch A, and branch B forked from it. If B removes a file which
a PR to A later modifies, on forward port Git generates a
modify/delete conflict (as in one side modifies a file which the
other deleted).
So far so good, except while it does notify the caller of the issue
the modified file is just dumped as-is into the working copy (or
commit), which essentially resurrects it.
This is an issue, *especially* as the file is already part of a
commit (rather tan just a U local file), if the developer fixes the
conflict "in place" rather than re-doing the forward-port from scratch
it's easy to miss the reincarnated file (and possibly the changes that
were part of it too), which at best leaves parasitic dead code in the
working copy. There is also no easy way for the runbot to check it as
adding unimported standalone files while rare is not unknown
e.g. utility scripts (to say nothing of JS where we can't really track
the usages / imports at all).
To resolve this issue, during conflict generation post-process
modify/delete to insert artificial conflict markers, the file should
be syntactically invalid so linters / checkers should complain, and
the minimal check has a step looking for conflict markers which should
fire and prevent merging the PR.
Fixes #896
2024-09-27 17:37:49 +07:00
|
|
|
).stdout.decode().splitlines(keepends=False)[0]
|
2019-09-20 21:07:22 +07:00
|
|
|
# if there was a single commit, reuse its message when committing
|
|
|
|
# the conflict
|
[FIX] forwardport: storage of old garbage, repo cache sizes
Since the forwardport bot works off of PRs, when it was created
leveraging the magic refs of github (refs/pull/*/head) seemed
sensible: when updating the cache repo, the magic refs would be
updated alongside and the forward-porting would have all the commits
available.
Turns out there are a few issues with this:
- the magic refs have become a bit unreliable, so we need a fallback
(b1c16fce8768080d30430f4e6d3788b60ce13de7)
- the forwardport bot only needs the commits on a transient basis, but
the magic refs live forever and diverge from all other content
(since we rarely `merge` PRs), for a large repo with lots of PRs
this leads to a significant inflation in size of repo (both on-disk
and objects count) e.g. odoo/odoo has about 25% more objects
with the pull refs included (3486550 to 4395319) and takes nearly
50% more space on disk (1.21G to 1.77)
As a result, we can just stop configuring the pull refs entirely, and
instead fetch the heads (~refs) we need as we need them. This can be a
touch more expensive at times as it requires two `fetch` calls, and
we'll have a few redundant calls as every forward port of a chain will
require a fetch of the root's head, *however* it will avoid retrieving
PR data for e.g. prs to master, as they don't get forward-ported, they
also won't get parasite updates from PRs being ignored or eventually
closed.
Eventually this could be optimised further by
- performing an update of the cache repo at the start of the cron iff
there are batches to port
- creating a temp clone for the batch
- fetching the heads of all PRs to forward port in the temp clone in a
single `fetch`
- performing all the ports by cloning the temp clone (and not
`fetch`-ing into those)
- then cleaning up the temp clone after having cleaned up individual
forward port clones
Small followup for #489
2022-11-08 18:02:45 +07:00
|
|
|
if len(commits) == 1:
|
|
|
|
msg = root._make_fp_message(commits[0])
|
2019-09-20 21:07:22 +07:00
|
|
|
else:
|
[FIX] forwardport: don't break forward porting on huge conflicts
On forward-porting, odoo/odoo#170183 generates a conflict on pretty
much every one of the 1111 files it touches, because they are
modify/delete conflicts that generates a conflict message over 200
bytes per file, which is over 200kB of output.
For this specific scenario, the commit message was being passed
through arguments to the `git` command, resulting in a command line
exceeding `MAX_ARG_STRLEN`[^1]. The obvious way to fix this is to pass
the commit message via stdin as is done literally in the line above
where we just copy a non-generated commit message.
However I don't think hundreds of kbytes worth of stdout[^2] is of any
use, so shorten that a bit, and stderr while at it.
Don't touch the commit message size for now, possibly forever, but
note that some log diving reveals a commit with a legit 18kB message
(odoo/odoo@42a3b704f7c7889a74338253138d0201d3b6e7a3) so if we want to
restrict that the limit should be at least 32k, and possibly 64. But
it might be a good idea to make that limit part of the ready / merge
checks too, rather than cut things off or trigger errors during
staging.
Fixes #900
[^1]: Most resources on "Argument list too long" reference `ARG_MAX`,
but on both my machine and the server it is 2097152 (25% of the
default stack), which is ~10x larger than the commit message we
tried to generate. The actual limit is `MAX_ARG_STRLEN` which
can't be queried directly but is essentially hard-coded to
PAGE_SIZE * 32 = 128kiB, which tracks.
[^2]: Somewhat unexpectedly, that's where `git-cherry-pick` sends the
conflict info.
2024-06-25 18:11:32 +07:00
|
|
|
out = utils.shorten(out, 8*1024, '[...]')
|
|
|
|
err = utils.shorten(err, 8*1024, '[...]')
|
|
|
|
msg = f"""Cherry pick of {h} failed
|
2019-08-23 21:16:30 +07:00
|
|
|
|
|
|
|
stdout:
|
[FIX] forwardport: don't break forward porting on huge conflicts
On forward-porting, odoo/odoo#170183 generates a conflict on pretty
much every one of the 1111 files it touches, because they are
modify/delete conflicts that generates a conflict message over 200
bytes per file, which is over 200kB of output.
For this specific scenario, the commit message was being passed
through arguments to the `git` command, resulting in a command line
exceeding `MAX_ARG_STRLEN`[^1]. The obvious way to fix this is to pass
the commit message via stdin as is done literally in the line above
where we just copy a non-generated commit message.
However I don't think hundreds of kbytes worth of stdout[^2] is of any
use, so shorten that a bit, and stderr while at it.
Don't touch the commit message size for now, possibly forever, but
note that some log diving reveals a commit with a legit 18kB message
(odoo/odoo@42a3b704f7c7889a74338253138d0201d3b6e7a3) so if we want to
restrict that the limit should be at least 32k, and possibly 64. But
it might be a good idea to make that limit part of the ready / merge
checks too, rather than cut things off or trigger errors during
staging.
Fixes #900
[^1]: Most resources on "Argument list too long" reference `ARG_MAX`,
but on both my machine and the server it is 2097152 (25% of the
default stack), which is ~10x larger than the commit message we
tried to generate. The actual limit is `MAX_ARG_STRLEN` which
can't be queried directly but is essentially hard-coded to
PAGE_SIZE * 32 = 128kiB, which tracks.
[^2]: Somewhat unexpectedly, that's where `git-cherry-pick` sends the
conflict info.
2024-06-25 18:11:32 +07:00
|
|
|
{out}
|
2019-08-23 21:16:30 +07:00
|
|
|
stderr:
|
[FIX] forwardport: don't break forward porting on huge conflicts
On forward-porting, odoo/odoo#170183 generates a conflict on pretty
much every one of the 1111 files it touches, because they are
modify/delete conflicts that generates a conflict message over 200
bytes per file, which is over 200kB of output.
For this specific scenario, the commit message was being passed
through arguments to the `git` command, resulting in a command line
exceeding `MAX_ARG_STRLEN`[^1]. The obvious way to fix this is to pass
the commit message via stdin as is done literally in the line above
where we just copy a non-generated commit message.
However I don't think hundreds of kbytes worth of stdout[^2] is of any
use, so shorten that a bit, and stderr while at it.
Don't touch the commit message size for now, possibly forever, but
note that some log diving reveals a commit with a legit 18kB message
(odoo/odoo@42a3b704f7c7889a74338253138d0201d3b6e7a3) so if we want to
restrict that the limit should be at least 32k, and possibly 64. But
it might be a good idea to make that limit part of the ready / merge
checks too, rather than cut things off or trigger errors during
staging.
Fixes #900
[^1]: Most resources on "Argument list too long" reference `ARG_MAX`,
but on both my machine and the server it is 2097152 (25% of the
default stack), which is ~10x larger than the commit message we
tried to generate. The actual limit is `MAX_ARG_STRLEN` which
can't be queried directly but is essentially hard-coded to
PAGE_SIZE * 32 = 128kiB, which tracks.
[^2]: Somewhat unexpectedly, that's where `git-cherry-pick` sends the
conflict info.
2024-06-25 18:11:32 +07:00
|
|
|
{err}
|
|
|
|
"""
|
|
|
|
|
[IMP] forwardport: surfacing of modify/delete conflicts
Given branch A, and branch B forked from it. If B removes a file which
a PR to A later modifies, on forward port Git generates a
modify/delete conflict (as in one side modifies a file which the
other deleted).
So far so good, except while it does notify the caller of the issue
the modified file is just dumped as-is into the working copy (or
commit), which essentially resurrects it.
This is an issue, *especially* as the file is already part of a
commit (rather tan just a U local file), if the developer fixes the
conflict "in place" rather than re-doing the forward-port from scratch
it's easy to miss the reincarnated file (and possibly the changes that
were part of it too), which at best leaves parasitic dead code in the
working copy. There is also no easy way for the runbot to check it as
adding unimported standalone files while rare is not unknown
e.g. utility scripts (to say nothing of JS where we can't really track
the usages / imports at all).
To resolve this issue, during conflict generation post-process
modify/delete to insert artificial conflict markers, the file should
be syntactically invalid so linters / checkers should complain, and
the minimal check has a step looking for conflict markers which should
fire and prevent merging the PR.
Fixes #896
2024-09-27 17:37:49 +07:00
|
|
|
# if a file is modified by the original PR and added by the forward
|
|
|
|
# port / conflict it's a modify/delete conflict: the file was
|
|
|
|
# deleted in the target branch, and the update (modify) in the
|
|
|
|
# original PR causes it to be added back
|
|
|
|
original_modified = set(conf.diff(
|
|
|
|
"--diff-filter=M", "--name-only",
|
|
|
|
"--merge-base", root.target.name,
|
|
|
|
root.head,
|
|
|
|
).stdout.decode().splitlines(keepends=False))
|
|
|
|
conflict_added = set(conf.diff(
|
|
|
|
"--diff-filter=A", "--name-only",
|
|
|
|
target_branch.name,
|
|
|
|
tree,
|
|
|
|
).stdout.decode().splitlines(keepends=False))
|
|
|
|
if modify_delete := (conflict_added & original_modified):
|
|
|
|
# rewrite the tree with conflict markers added to modify/deleted blobs
|
|
|
|
tree = conf.modify_delete(tree, modify_delete)
|
|
|
|
|
[FIX] *: ensure I don't get bollocked up again by tags
Today (or really a month ago) I learned: when giving git a symbolic
ref (e.g. a ref name), if it's ambiguous then
1. If `$GIT_DIR/$name` exists, that is what you mean (this is usually
useful only for `HEAD`, `FETCH_HEAD`, `ORIG_HEAD`, `MERGE_HEAD`,
`REBASE_HEAD`, `REVERT_HEAD`, `CHERRY_PICK_HEAD`, `BISECT_HEAD` and
`AUTO_MERGE`)
2. otherwise, `refs/$name` if it exists
3. otherwise, `refs/tags/$name` if it exists
4. otherwise, `refs/heads/$name` if it exists
5. otherwise, `refs/remotes/$name` if it exists
6. otherwise, `refs/remotes/$name/HEAD` if it exists
This means if a tag and a branch have the same name and only the name
is provided (not the full ref), git will select the tag, which gets
very confusing for the mergebot as it now tries to rebase onto the tag
(which because that's not fun otherwise was not even on the branch of
the same name).
Fix by providing full refs to `rev-parse` when trying to retrieve the
head of the target branches. And as a defense in depth opportunity,
also exclude tags when fetching refs by spec: apparently fetching a
specific commit does not trigger the retrieval of tags, but any sort
of spec will see the tags come along for the ride even if the tags are
not in any way under the fetched ref e.g. `refs/heads/*` will very
much retrieve the tags despite them being located at `refs/tags/*`.
Fixes #922
2024-09-06 20:09:08 +07:00
|
|
|
target_head = source.stdout().rev_parse(f"refs/heads/{target_branch.name}")\
|
|
|
|
.stdout.decode().strip()
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
commit = conf.commit_tree(
|
[IMP] forwardport: surfacing of modify/delete conflicts
Given branch A, and branch B forked from it. If B removes a file which
a PR to A later modifies, on forward port Git generates a
modify/delete conflict (as in one side modifies a file which the
other deleted).
So far so good, except while it does notify the caller of the issue
the modified file is just dumped as-is into the working copy (or
commit), which essentially resurrects it.
This is an issue, *especially* as the file is already part of a
commit (rather tan just a U local file), if the developer fixes the
conflict "in place" rather than re-doing the forward-port from scratch
it's easy to miss the reincarnated file (and possibly the changes that
were part of it too), which at best leaves parasitic dead code in the
working copy. There is also no easy way for the runbot to check it as
adding unimported standalone files while rare is not unknown
e.g. utility scripts (to say nothing of JS where we can't really track
the usages / imports at all).
To resolve this issue, during conflict generation post-process
modify/delete to insert artificial conflict markers, the file should
be syntactically invalid so linters / checkers should complain, and
the minimal check has a step looking for conflict markers which should
fire and prevent merging the PR.
Fixes #896
2024-09-27 17:37:49 +07:00
|
|
|
tree=tree,
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
parents=[target_head],
|
|
|
|
message=str(msg),
|
|
|
|
author=author,
|
|
|
|
committer=committer[:2],
|
|
|
|
)
|
|
|
|
assert commit.returncode == 0,\
|
|
|
|
f"commit failed\n\n{commit.stdout.decode()}\n\n{commit.stderr.decode}"
|
|
|
|
hh = commit.stdout.strip()
|
[FIX] forwardport: don't break forward porting on huge conflicts
On forward-porting, odoo/odoo#170183 generates a conflict on pretty
much every one of the 1111 files it touches, because they are
modify/delete conflicts that generates a conflict message over 200
bytes per file, which is over 200kB of output.
For this specific scenario, the commit message was being passed
through arguments to the `git` command, resulting in a command line
exceeding `MAX_ARG_STRLEN`[^1]. The obvious way to fix this is to pass
the commit message via stdin as is done literally in the line above
where we just copy a non-generated commit message.
However I don't think hundreds of kbytes worth of stdout[^2] is of any
use, so shorten that a bit, and stderr while at it.
Don't touch the commit message size for now, possibly forever, but
note that some log diving reveals a commit with a legit 18kB message
(odoo/odoo@42a3b704f7c7889a74338253138d0201d3b6e7a3) so if we want to
restrict that the limit should be at least 32k, and possibly 64. But
it might be a good idea to make that limit part of the ready / merge
checks too, rather than cut things off or trigger errors during
staging.
Fixes #900
[^1]: Most resources on "Argument list too long" reference `ARG_MAX`,
but on both my machine and the server it is 2097152 (25% of the
default stack), which is ~10x larger than the commit message we
tried to generate. The actual limit is `MAX_ARG_STRLEN` which
can't be queried directly but is essentially hard-coded to
PAGE_SIZE * 32 = 128kiB, which tracks.
[^2]: Somewhat unexpectedly, that's where `git-cherry-pick` sends the
conflict info.
2024-06-25 18:11:32 +07:00
|
|
|
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
return (h, out, err, [c['sha'] for c in commits]), hh
|
2019-08-23 21:16:30 +07:00
|
|
|
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
def _cherry_pick(self, repo: git.Repo, branch: Branch) -> str:
|
|
|
|
""" Cherrypicks ``self`` into ``branch``
|
2019-08-23 21:16:30 +07:00
|
|
|
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
:return: the HEAD of the forward-port is successful
|
|
|
|
:raises CherrypickError: in case of conflict
|
2019-08-23 21:16:30 +07:00
|
|
|
"""
|
|
|
|
# <xxx>.cherrypick.<number>
|
2022-10-28 13:03:12 +07:00
|
|
|
logger = _logger.getChild(str(self.id)).getChild('cherrypick')
|
2019-08-23 21:16:30 +07:00
|
|
|
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
# target's head
|
[FIX] *: ensure I don't get bollocked up again by tags
Today (or really a month ago) I learned: when giving git a symbolic
ref (e.g. a ref name), if it's ambiguous then
1. If `$GIT_DIR/$name` exists, that is what you mean (this is usually
useful only for `HEAD`, `FETCH_HEAD`, `ORIG_HEAD`, `MERGE_HEAD`,
`REBASE_HEAD`, `REVERT_HEAD`, `CHERRY_PICK_HEAD`, `BISECT_HEAD` and
`AUTO_MERGE`)
2. otherwise, `refs/$name` if it exists
3. otherwise, `refs/tags/$name` if it exists
4. otherwise, `refs/heads/$name` if it exists
5. otherwise, `refs/remotes/$name` if it exists
6. otherwise, `refs/remotes/$name/HEAD` if it exists
This means if a tag and a branch have the same name and only the name
is provided (not the full ref), git will select the tag, which gets
very confusing for the mergebot as it now tries to rebase onto the tag
(which because that's not fun otherwise was not even on the branch of
the same name).
Fix by providing full refs to `rev-parse` when trying to retrieve the
head of the target branches. And as a defense in depth opportunity,
also exclude tags when fetching refs by spec: apparently fetching a
specific commit does not trigger the retrieval of tags, but any sort
of spec will see the tags come along for the ride even if the tags are
not in any way under the fetched ref e.g. `refs/heads/*` will very
much retrieve the tags despite them being located at `refs/tags/*`.
Fixes #922
2024-09-06 20:09:08 +07:00
|
|
|
head = repo.stdout().rev_parse(f"refs/heads/{branch}").stdout.decode().strip()
|
2019-08-23 21:16:30 +07:00
|
|
|
|
2019-09-12 16:58:10 +07:00
|
|
|
commits = self.commits()
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
logger.info(
|
|
|
|
"%s: copy %s commits to %s (%s)%s",
|
|
|
|
self, len(commits), branch, head, ''.join(
|
|
|
|
'\n- %s: %s' % (c['sha'], c['commit']['message'].splitlines()[0])
|
|
|
|
for c in commits
|
|
|
|
)
|
|
|
|
)
|
2019-09-12 16:58:10 +07:00
|
|
|
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
conf = repo.with_params(
|
2024-06-04 19:18:36 +07:00
|
|
|
'merge.renamelimit=0',
|
|
|
|
'merge.renames=copies',
|
|
|
|
).with_config(
|
|
|
|
stdout=subprocess.PIPE, stderr=subprocess.PIPE,
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
check=False,
|
2024-06-04 19:18:36 +07:00
|
|
|
)
|
2019-08-23 21:16:30 +07:00
|
|
|
for commit in commits:
|
2019-09-12 16:58:10 +07:00
|
|
|
commit_sha = commit['sha']
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
# merge-tree is a bit stupid and gets confused when the options
|
|
|
|
# follow the parameters
|
|
|
|
r = conf.merge_tree('--merge-base', commit['parents'][0]['sha'], head, commit_sha)
|
|
|
|
new_tree = r.stdout.decode().splitlines(keepends=False)[0]
|
2019-09-13 18:13:58 +07:00
|
|
|
if r.returncode:
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
# For merge-tree the stdout on conflict is of the form
|
|
|
|
#
|
|
|
|
# oid of toplevel tree
|
|
|
|
# conflicted file info+
|
|
|
|
#
|
|
|
|
# informational messages+
|
|
|
|
#
|
|
|
|
# to match cherrypick we only want the informational messages,
|
|
|
|
# so strip everything else
|
|
|
|
r.stdout = r.stdout.split(b'\n\n')[-1]
|
|
|
|
else:
|
|
|
|
# By default cherry-pick fails if a non-empty commit becomes
|
|
|
|
# empty (--empty=stop), also it fails when cherrypicking already
|
|
|
|
# empty commits which I didn't think we prevented but clearly we
|
|
|
|
# do...?
|
|
|
|
parent_tree = conf.rev_parse(f'{head}^{{tree}}').stdout.decode().strip()
|
|
|
|
if parent_tree == new_tree:
|
|
|
|
r.returncode = 1
|
|
|
|
r.stdout = f"You are currently cherry-picking commit {commit_sha}.".encode()
|
|
|
|
r.stderr = b"The previous cherry-pick is now empty, possibly due to conflict resolution."
|
2019-08-23 21:16:30 +07:00
|
|
|
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
logger.debug("Cherry-picked %s: %s\n%s\n%s", commit_sha, r.returncode, r.stdout.decode(), _clean_rename(r.stderr.decode()))
|
|
|
|
if r.returncode: # pick failed, bail
|
2021-01-15 20:19:15 +07:00
|
|
|
# try to log inflateInit: out of memory errors as warning, they
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
# seem to return the status code 128 (nb: may not work anymore
|
|
|
|
# with merge-tree, idk)
|
2021-01-15 20:19:15 +07:00
|
|
|
logger.log(
|
|
|
|
logging.WARNING if r.returncode == 128 else logging.INFO,
|
|
|
|
"forward-port of %s (%s) failed at %s",
|
|
|
|
self, self.display_name, commit_sha)
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
|
2019-09-13 18:13:58 +07:00
|
|
|
raise CherrypickError(
|
|
|
|
commit_sha,
|
|
|
|
r.stdout.decode(),
|
2021-08-11 19:08:18 +07:00
|
|
|
_clean_rename(r.stderr.decode()),
|
[FIX] forwardport: storage of old garbage, repo cache sizes
Since the forwardport bot works off of PRs, when it was created
leveraging the magic refs of github (refs/pull/*/head) seemed
sensible: when updating the cache repo, the magic refs would be
updated alongside and the forward-porting would have all the commits
available.
Turns out there are a few issues with this:
- the magic refs have become a bit unreliable, so we need a fallback
(b1c16fce8768080d30430f4e6d3788b60ce13de7)
- the forwardport bot only needs the commits on a transient basis, but
the magic refs live forever and diverge from all other content
(since we rarely `merge` PRs), for a large repo with lots of PRs
this leads to a significant inflation in size of repo (both on-disk
and objects count) e.g. odoo/odoo has about 25% more objects
with the pull refs included (3486550 to 4395319) and takes nearly
50% more space on disk (1.21G to 1.77)
As a result, we can just stop configuring the pull refs entirely, and
instead fetch the heads (~refs) we need as we need them. This can be a
touch more expensive at times as it requires two `fetch` calls, and
we'll have a few redundant calls as every forward port of a chain will
require a fetch of the root's head, *however* it will avoid retrieving
PR data for e.g. prs to master, as they don't get forward-ported, they
also won't get parasite updates from PRs being ignored or eventually
closed.
Eventually this could be optimised further by
- performing an update of the cache repo at the start of the cron iff
there are batches to port
- creating a temp clone for the batch
- fetching the heads of all PRs to forward port in the temp clone in a
single `fetch`
- performing all the ports by cloning the temp clone (and not
`fetch`-ing into those)
- then cleaning up the temp clone after having cleaned up individual
forward port clones
Small followup for #489
2022-11-08 18:02:45 +07:00
|
|
|
commits
|
2019-09-13 18:13:58 +07:00
|
|
|
)
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
# get the "git" commit object rather than the "github" commit resource
|
|
|
|
cc = conf.commit_tree(
|
|
|
|
tree=new_tree,
|
|
|
|
parents=[head],
|
|
|
|
message=str(self._make_fp_message(commit)),
|
|
|
|
author=map_author(commit['commit']['author']),
|
|
|
|
committer=map_committer(commit['commit']['committer']),
|
|
|
|
)
|
|
|
|
if cc.returncode:
|
|
|
|
raise CherrypickError(commit_sha, cc.stdout.decode(), cc.stderr.decode(), commits)
|
2019-08-23 21:16:30 +07:00
|
|
|
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
head = cc.stdout.strip()
|
|
|
|
logger.info('%s -> %s', commit_sha, head)
|
2020-03-02 14:54:58 +07:00
|
|
|
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
return head
|
2020-03-02 14:54:58 +07:00
|
|
|
|
2019-11-20 13:43:56 +07:00
|
|
|
def _make_fp_message(self, commit):
|
|
|
|
cmap = json.loads(self.commits_map)
|
2023-08-16 20:31:42 +07:00
|
|
|
msg = Message.from_message(commit['commit']['message'])
|
2019-11-20 13:43:56 +07:00
|
|
|
# write the *merged* commit as "original", not the PR's
|
|
|
|
msg.headers['x-original-commit'] = cmap.get(commit['sha'], commit['sha'])
|
|
|
|
# don't stringify so caller can still perform alterations
|
|
|
|
return msg
|
|
|
|
|
2024-06-10 23:47:49 +07:00
|
|
|
def _outstanding(self, cutoff: str) -> typing.ItemsView[PullRequests, list[PullRequests]]:
|
2021-07-28 18:57:58 +07:00
|
|
|
""" Returns "outstanding" (unmerged and unclosed) forward-ports whose
|
|
|
|
source was merged before ``cutoff`` (all of them if not provided).
|
2019-10-10 17:07:57 +07:00
|
|
|
|
2024-06-10 23:47:49 +07:00
|
|
|
:param cutoff: a datetime (ISO-8601 formatted)
|
2021-07-28 18:57:58 +07:00
|
|
|
:returns: an iterator of (source, forward_ports)
|
|
|
|
"""
|
|
|
|
return groupby(self.env['runbot_merge.pull_requests'].search([
|
2019-10-10 17:07:57 +07:00
|
|
|
# only FP PRs
|
|
|
|
('source_id', '!=', False),
|
|
|
|
# active
|
|
|
|
('state', 'not in', ['merged', 'closed']),
|
2024-09-06 17:33:51 +07:00
|
|
|
# not ready
|
|
|
|
('blocked', '!=', False),
|
2021-10-19 19:39:19 +07:00
|
|
|
('source_id.merge_date', '<', cutoff),
|
2021-07-28 18:57:58 +07:00
|
|
|
], order='source_id, id'), lambda p: p.source_id)
|
|
|
|
|
|
|
|
def _reminder(self):
|
[IMP] *: crons tests running for better triggered compatibility
Mergebot / forwardport crons need to run in a specific ordering in
order to flow into one another correctly. The default ordering being
unspecified, it was not possible to use the normal cron
runner (instead of the external driver running crons in sequence one
at a time). This can be fixed by setting *sequences* on crons, as the
cron runner (`_process_jobs`) will use that order to acquire and run
crons.
Also override `_process_jobs` however: the built-in cron runner
fetches a static list of ready crons, then runs that.
This is fine for normal situation where the cron runner runs in a loop
anyway but it's any issue for the tests, as we expect that cron A can
trigger cron B, and we want cron B to run *right now* even if it
hadn't been triggered before cron A ran.
We can replace `_process_job` with a cut down version which does
that (cut down because we don't need most of the error handling /
resilience, there's no concurrent workers, there's no module being
installed, versions must match, ...). This allows e.g. the cron
propagating commit statuses to trigger the staging cron, and both will
run within the same `run_crons` session.
Something I didn't touch is that `_process_jobs` internally creates
completely new environments so there is no way to pass context into
the cron jobs anymore (whereas it works for `method_direct_trigger`),
this means the context values have to be shunted elsewhere for that
purpose which is gross. But even though I'm replacing `_process_jobs`,
this seems a bit too much of a change in cron execution semantics. So
left it out.
While at it tho, silence the spammy `py.warnings` stuff I can't do
much about.
2024-07-30 14:21:05 +07:00
|
|
|
cutoff = getattr(builtins, 'forwardport_updated_before', None) \
|
2021-07-28 18:57:58 +07:00
|
|
|
or fields.Datetime.to_string(datetime.datetime.now() - DEFAULT_DELTA)
|
|
|
|
cutoff_dt = fields.Datetime.from_string(cutoff)
|
|
|
|
|
|
|
|
for source, prs in self._outstanding(cutoff):
|
2020-03-16 18:19:06 +07:00
|
|
|
backoff = dateutil.relativedelta.relativedelta(days=2**source.reminder_backoff_factor)
|
2020-05-20 17:42:45 +07:00
|
|
|
if source.merge_date > (cutoff_dt - backoff):
|
2020-03-16 18:19:06 +07:00
|
|
|
continue
|
|
|
|
source.reminder_backoff_factor += 1
|
2024-06-10 23:47:49 +07:00
|
|
|
|
2024-09-06 17:33:51 +07:00
|
|
|
# only keep the PRs which don't have an attached descendant
|
|
|
|
for pr in set(prs).difference(p.parent_id for p in prs):
|
2024-06-03 20:29:47 +07:00
|
|
|
self.env.ref('runbot_merge.forwardport.reminder')._send(
|
|
|
|
repository=pr.repository,
|
|
|
|
pull_request=pr.number,
|
|
|
|
token_field='fp_github_token',
|
|
|
|
format_args={'pr': pr, 'source': source},
|
|
|
|
)
|
2022-11-03 17:46:31 +07:00
|
|
|
|
[CHG] forwardport: perform forward porting without working copies
The goal is to reduce maintenance and odd disk interactions &
concurrency issues, by not creating concurrent clones, not having to
push forks back in the repository, etc... it also removes the need to
cleanup "scratch" working copies though that looks not to have been an
issue in a while.
The work is done on isolated objects without using or mutating refs,
so even concurrent work should not be a problem.
This turns out to not be any more verbose (less so if anything) than
using `cherry-pick`, as that is not really designed for scripted /
non-interactive use, or for squashing commits thereafter. Working
directly with trees and commits is quite a bit cleaner even without a
ton of helpers.
Much of the credit goes to Julia Evans for [their investigation of
3-way merges as the underpinnings of cherry-picking][3-way merge],
this would have been a lot more difficult if I'd had to rediscover the
merge-base trick independently.
A few things have been changed by this:
- The old trace/stderr from cherrypick has disappeared as it's
generated by cherrypick, but for a non-interactive use it's kinda
useless anyway so I probably should have looked into removing it
earlier (I think the main use was investigation of the inflateinit
issue).
- Error on emptied commits has to be hand-rolled as `merge-tree`
couldn't care less, this is not hard but is a bit annoying.
- `merge-tree`'s conflict information only references raw commits,
which makes sense, but requires updating a bunch of tests. Then
again so does the fact that it *usually* doesn't send anything to
stderr, so that's usually disappearing.
Conveniently `merge-tree` merges the conflict marker directly in the
files / tree so we don't have to mess about moving them back out of
the repository and into the working copy as I assume cherry-pick does,
which means we don't have to try and commit them back in ether. That
is a huge part of the gain over faffing about with the working copy.
Fixes #847
[3-way merge]: https://jvns.ca/blog/2023/11/10/how-cherry-pick-and-revert-work/
2024-07-05 18:32:02 +07:00
|
|
|
|
|
|
|
map_author = operator.itemgetter('name', 'email', 'date')
|
|
|
|
map_committer = operator.itemgetter('name', 'email')
|
|
|
|
|
2022-11-03 17:46:31 +07:00
|
|
|
|
2019-08-23 21:16:30 +07:00
|
|
|
class Stagings(models.Model):
|
|
|
|
_inherit = 'runbot_merge.stagings'
|
|
|
|
|
|
|
|
def write(self, vals):
|
|
|
|
r = super().write(vals)
|
2019-10-15 13:54:25 +07:00
|
|
|
# we've just deactivated a successful staging (so it got ~merged)
|
|
|
|
if vals.get('active') is False and self.state == 'success':
|
[CHG] *: move forward-porting over to batches
Thank god I have a bunch of tests because once again I forgot / missed
a bunch of edge cases in doing the conversion, which the tests
caught (sadly that means I almost certainly broke a few untested edge
cases).
Important notes:
Handling of parent links
------------------------
Unlike PRs, batches don't lose their parent info ever, the link is
permanent, which is convenient to trawl through a forward port
(currently implemented very inefficiently, maybe we'll optimise that
in the future).
However this means the batch having a parent and the batch's PRs
having parents are slightly different informations, one of the edge
cases I missed is that of conflicting PRs, which are deparented and
have to be merged by hand before being forward ported further, I had
originally replaced the checks on a pr and its sibling having parents
by just the batch.
Batches & targets
-----------------
Batches were originally concepted as being fixed to a target and PRs
having that target, a PR being retargeted would move it from one batch
to an other.
As it turns out this does not work in the case where people retarget
forward-port PRs, which I know they do because #551
(2337bd85186de000d074e5a7178cfeb110d15de7). I could not think of a
good way to handle this issue as is, so scrapped the moving PRs thing,
instead one of the coherence checks of a batch being ready is that all
its PRs have the same target, and a batch only has a target if all its
PRs have the same target.
It's possible for somewhat odd effects to arise, notably if a PR is
closed (removed from batch), the other PRs are retargeted, and the new
PR is reopened, it will now be on a separate batch even if it also
gets retargeted. This is weird. I don't quite know how I should handle
it, maybe batches could merge if they have the same target and label?
however batches don't currently have a label so...
Improve limits
--------------
Keep limits on the PRs rather than lift them on the batchL if we can
add/remove PRs of batches having different limits on different PRs of
the same batch is reasonable.
Also leave limit unset by default: previously, the limit was eagerly
set to the tip (accessible) branch. That doesn't really seem
necessary, so stop doing that.
Also remove completely unnecessary `max` when trying to find a PR's
next target: `root` is either `self` or `self.source_id`, so it should
not be possible for that to have a later target.
And for now ensure the limits are consistent per batch: a PR defaults
to the limit of their batch-mate if they don't have one, and if a
limit is set via command it's set on all PRs of a batch.
This commit does not allow differential limits via commands, they are
allowed via the backend but not really tested. The issue is mostly
that it's not clear what the UX should look like to have clear and not
super error prone interactions. So punt on it for now, and hopefully
there's no hole I missed which will create inconsistent batches.
2024-05-21 20:45:53 +07:00
|
|
|
# check all batches to see if they should be forward ported
|
2019-08-23 21:16:30 +07:00
|
|
|
for b in self.with_context(active_test=False).batch_ids:
|
2024-06-28 21:03:03 +07:00
|
|
|
if b.fw_policy == 'no':
|
|
|
|
continue
|
|
|
|
|
2019-10-15 13:54:25 +07:00
|
|
|
# if all PRs of a batch have parents they're part of an FP
|
|
|
|
# sequence and thus handled separately, otherwise they're
|
|
|
|
# considered regular merges
|
|
|
|
if not all(p.parent_id for p in b.prs):
|
2019-08-23 21:16:30 +07:00
|
|
|
self.env['forwardport.batches'].create({
|
|
|
|
'batch_id': b.id,
|
|
|
|
'source': 'merge',
|
|
|
|
})
|
|
|
|
return r
|
|
|
|
|
2019-09-18 20:37:14 +07:00
|
|
|
class Feedback(models.Model):
|
|
|
|
_inherit = 'runbot_merge.pull_requests.feedback'
|
|
|
|
|
|
|
|
token_field = fields.Selection(selection_add=[('fp_github_token', 'Forwardport Bot')])
|
2019-08-23 21:16:30 +07:00
|
|
|
|
|
|
|
|
|
|
|
class CherrypickError(Exception):
|
|
|
|
...
|
2021-01-11 20:43:36 +07:00
|
|
|
|
2022-10-28 13:03:12 +07:00
|
|
|
class ForwardPortError(Exception):
|
|
|
|
pass
|
|
|
|
|
2021-01-11 20:43:36 +07:00
|
|
|
def _clean_rename(s):
|
|
|
|
""" Filters out the "inexact rename detection" spam of cherry-pick: it's
|
|
|
|
useless but there seems to be no good way to silence these messages.
|
|
|
|
"""
|
|
|
|
return '\n'.join(
|
|
|
|
l for l in s.splitlines()
|
|
|
|
if not l.startswith('Performing inexact rename detection')
|
|
|
|
)
|
2021-10-19 19:39:19 +07:00
|
|
|
|
|
|
|
class HallOfShame(typing.NamedTuple):
|
|
|
|
reviewers: list
|
|
|
|
outstanding: list
|
|
|
|
|
|
|
|
class Outstanding(typing.NamedTuple):
|
|
|
|
source: object
|
|
|
|
prs: object
|