runbot/forwardport/models/project.py
Xavier Morel d4fa1fd353 [CHG] *: rewrite commands set, rework status management
This commit revisits the commands set in order to make it more
regular, and limit inconsistent command-sets, although it includes
pseudo-command aliases for common tasks now removed from the core set.

Hard Errors
===========

The previous iteration of the commands set would ignore any
non-command term in a command line. This has been changed to a hard
error (the entire command line is ignored) if any command is unknown
or invalid.

This fixes inconsistent / unexpected interpretations where a user
sends a command, then writes a novel on the same line, some words of
which happen to *also* be commands, leading to merge states they did
not expect. They should now be told to fuck off.

Priority Restructuring
----------------------

The numerical priority system was pretty messy in that it confused
"staging priority" (in ways which were not entirely straightforward)
with overrides to other concerns.

This has now been split along all the axes (a field-level sketch
follows the list), with separate command subsets for:

- staging prioritisation, now separated between `default`, `priority`,
  and `alone`,

  - `default` means PRs are picked in an unspecified order when
    creating a staging, if nothing better is available
  - `priority` means PRs are picked first when staging; however if
    `priority` PRs don't fill the staging, the rest is filled with
    `default` PRs (this mode did not previously exist)
  - `alone` means the PRs are picked first, before splits, and only
    `alone` PRs can be part of the staging (which usually matches the
    mode name)
- `skipchecks` overrides both statuses and approval checks, for the
  batch, something previously implied in `p=0`, but now
  independent. Setting `skipchecks` basically makes the entire batch
  `ready`.

  For consistency this also sets the reviewer implicitly: since
  skipchecks overrides both statuses *and approval*, whoever enables
  this mode is essentially the reviewer.
- `cancel` cancels any ongoing staging when the marked PR becomes
  ready again; previously this was also implied (in a more restricted
  form) by setting `p=0`
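
Put together, the axes could be modelled as independent fields roughly
like this (a sketch only; field names are illustrative, the real
definitions live in runbot_merge):

    from odoo import fields, models

    class PullRequests(models.Model):
        _inherit = 'runbot_merge.pull_requests'

        # staging prioritisation is one axis...
        priority = fields.Selection([
            ('default', "Default"),
            ('priority', "Priority"),
            ('alone', "Alone"),
        ], default='default')
        # ...independent from overriding statuses / approval...
        skipchecks = fields.Boolean(default=False)
        # ...and from cancelling ongoing stagings on readiness
        cancel_staging = fields.Boolean(default=False)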

FWBot removal
=============

While the "forwardport bot" still exists as an API level (to segregate
access rights between tokens) it has been removed as an interaction
point, as part of the modules merge plan. As a result,

fwbot stops responding
----------------------

Feedback messages are now always sent by the mergebot; the
forward-porting bot should not send any message or notification
anymore.

commands moved to the merge bot
-------------------------------

- `ignore`/`up to` simply change bots
- `close` as well
- `skipci` is now a choice / flag of an `fw` command, which denotes
  the forward-port policy (sketched after the list),

  - `fw=default` is the old `ci` and resets the policy to default,
    that is, wait for the PR to be merged to create forward ports, and
    for the required statuses on each forward port to be received
    before creating the next
  - `fw=skipci` is the old `skipci`, it waits for the merge of the
    base PR but then creates all the forward ports immediately (unless
    it gets a conflict)
  - `fw=skipmerge` immediately creates all the forward ports, without
    even waiting for the PR to be merged

    This is a completely new mode, and may be rather broken as until
    now the 'bot has always assumed the source PR had been merged.
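
In sketch form, the policy becomes a single selection field rather
than a separate `skipci` toggle (illustrative definition; the actual
field is the mergebot's `fw_policy`):

    fw_policy = fields.Selection([
        ('default', "Default"),       # port on merge, wait for CI between ports
        ('skipci', "Skip CI"),        # port on merge, create all ports at once
        ('skipmerge', "Skip merge"),  # create all ports immediately
    ], default='default')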

approval rework
---------------

Because of the previous section, there is no distinguishing feature
between `mergebot r+` = "merge this PR" and `forwardbot r+` = "merge
this PR and all its parents with different access rights".

As a result, the two have been merged under a single `mergebot r+`
with heuristics attempting to provide the best experience:

- if approving a non-forward port, the behavior does not change
- else, if the approver has review rights on the source, all ancestors
  are approved
- else, if the approver is the author of the original, all ancestors
  which descend from a merged PR are approved
- else, all ancestors up to and including the oldest ancestor on which
  the approver has review rights are approved

Most notably, the source's author is not delegated on the source or
any of its descendants anymore. This might need to be revisited if it
proves too restrictive.

For the very specialized need of approving a forward-port *and none of
its ancestors*, `review=` can now take a comma (`,`) separated list of
pull request numbers (github numbers, not mergebot ids).
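
A minimal parsing sketch for that form (hypothetical helper; the
numbers are GitHub PR numbers, not mergebot ids):

    def parse_review_prs(arg: str) -> list[int]:
        # "review=1234,5678" -> [1234, 5678]
        return [int(n) for n in arg.split(',') if n.strip()]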

Computed State
==============

The `state` field of pull requests is now computed. Hopefully this
makes the status more consistent and predictable in the long run, and
importantly makes status management more reliable (because the
reference data gets updated and changes flow naturally into the
state).

For now however it makes things more complicated as some of the states
have to be separately signaled or updated:

- `closed` and `error` are now separate flags
- `merge_date` is pulled down from forwardport and becomes the
  transition signal for ready -> merged
- `reviewed_by` becomes the transition signal for approval (might be a
  good idea to rename it...)
- `status` is computed from the head's statuses and overrides, and
  *that* becomes the validation state (a sketch of the computation
  follows)
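
Put together, the computation looks roughly like this (a condensed
sketch, not the exact implementation):

    state = fields.Selection([...], compute='_compute_state', store=True)

    @api.depends('closed', 'error', 'merge_date', 'reviewed_by', 'status')
    def _compute_state(self):
        for pr in self:
            if pr.merge_date:
                pr.state = 'merged'
            elif pr.closed:
                pr.state = 'closed'
            elif pr.error:
                pr.state = 'error'
            elif pr.reviewed_by and pr.status == 'success':
                pr.state = 'ready'
            elif pr.status == 'success':
                pr.state = 'validated'
            elif pr.reviewed_by:
                pr.state = 'approved'
            else:
                pr.state = 'opened'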

Ideally, batch-level flags like `skipchecks` should be on, well, the
batch, and `state` should have a dependency on the batch. However
currently the batch is not a durable / permanent member of the system,
so it's a PR-level flag and a messy pile.

One notable change is that *forcing* the state to `ready` now does
that but also sets the reviewer, `skipchecks`, and overrides to ensure
the API-mediated readying does not get rolled back by e.g. the runbot
sending a status.

This is useful for a few types of automated / programmatic PRs
e.g. translation exports, where we set the state programmatically to
limit noise.
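
In sketch form, the forced transition amounts to something like this
(hypothetical body; the real logic lives in the computed field's
inverse and also fills in status overrides):

    def _inverse_state(self):
        for pr in self:
            if pr.state == 'ready':
                # lock readiness in so a late status delivery
                # can't roll the transition back
                pr.reviewed_by = self.env.user.partner_id
                pr.skipchecks = True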

recursive dependency hack
-------------------------

Given a sequence of PRs with an override of the source, if one of the
PRs is updated, its descendants should not have the override anymore.
However if the updated PR gets overridden, its descendants should have
*that* override.

This requires some unholy manipulations via an override of `modified`,
as the ORM supports recursive fields but not recursive
dependencies (on a different field).
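
A sketch of the trick (illustrative; it assumes the child's computed
field depends on its parent's `overrides`, which the ORM cannot
express as a regular dependency):

    def modified(self, fnames, create=False, before=False):
        super().modified(fnames, create=create, before=before)
        if 'overrides' in fnames:
            # cascade the invalidation by hand, since "my field depends
            # on the same field of my parent" is not a valid dependency
            children = self.search([('parent_id', 'in', self.ids)])
            if children:
                children.modified(['overrides'])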

unconditional followup scheduling
---------------------------------

Previously, scheduling the forward-port followup was contingent on the
FW policy, but that's not actually correct if the new PR is
*immediately* validated (which can happen now that the field is
computed, if there are no required statuses *or* all of the required
statuses are overridden by an ancestor), as nothing will trigger the
state change and thus the scheduling of the fp followup.

The followup function checks all the properties of the batch to port,
so this should not result in incorrect ports, although it is a bit
more expensive and will lead to more spam.

Previously this would not happen because on creation of a PR the
validation task (commit -> PR) would still have to execute.
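
The unconditional hook itself is small; it is the `_validate` override
visible in the source below:

    def _validate(self, statuses):
        failed = super()._validate(statuses)
        self._schedule_fp_followup()
        return failed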

Misc Changes
============

- If a PR is marked as overriding / canceling stagings, it now does
  so on retry, not just when set initially.

  This was not handled at all previously, so a PR in P0 going into
  error due to e.g. a non-deterministic bug would be retried while
  still at p=0, but the current staging would not get cancelled. Same
  when a PR at p=0 went into error because something failed, then was
  updated with a fix.
- Add tracking to a bunch of relevant PR fields.

  Post-mortem analysis currently generally requires going through the
  text logs to see what happened, which is annoying.

  There is a nondeterminism / inconsistency in the tracking which
  sometimes leads the admin user to trigger tracking before the bot
  does, leading to the staging tracking being attributed to them
  during tests; this is shoved under the carpet by ignoring the user
  to whom that tracking is attributed.

  When multiple users update tracked fields in the same transaction,
  all the changes are attributed to the first one having triggered
  tracking (?); I couldn't find out why the admin sometimes takes
  over.
- added and leveraged support for enum-backed selection fields
  (sketched below)
- moved various fields from forwardport to runbot_merge
- fix a migration which had never worked and which never ran (because
  I forgot to bump the version on the module)
- remove some unnecessary intermediate de/serialisation
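
For the enum-backed selections, a sketch of the pattern (illustrative,
not the exact helper used):

    import enum

    class FWPolicy(enum.Enum):
        DEFAULT = 'default'
        SKIPCI = 'skipci'
        SKIPMERGE = 'skipmerge'

    fw_policy = fields.Selection(
        # selection values derived from the enum rather than free-form strings
        [(p.value, p.name.capitalize()) for p in FWPolicy],
        default=FWPolicy.DEFAULT.value,
    )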

fixes #673, fixes #309, fixes #792, fixes #846 (probably)
2024-05-23 07:58:46 +02:00


# -*- coding: utf-8 -*-
"""
Technically could be independent from mergebot but would require a lot of
duplicate work e.g. keeping track of statuses (including on commits which
might not be in PRs yet), handling co-dependent PRs, ...
However extending the mergebot also leads to messiness: fpbot should have
its own user / feedback / API keys, mergebot and fpbot both have branch
ordering but for mergebot it's completely cosmetics, being slaved to mergebot
means PR creation is trickier (as mergebot assumes opened event will always
lead to PR creation but fpbot wants to attach meaning to the PR when setting
it up), ...
"""
import ast
import base64
import contextlib
import datetime
import itertools
import json
import logging
import operator
import os
import re
import subprocess
import tempfile
import typing
from operator import itemgetter
from pathlib import Path
import dateutil.relativedelta
import psycopg2.errors
import requests
from odoo import models, fields, api
from odoo.osv import expression
from odoo.exceptions import UserError
from odoo.tools.misc import topological_sort, groupby, Reverse
from odoo.tools.sql import reverse_order
from odoo.tools.appdirs import user_cache_dir
from odoo.addons.base.models.res_partner import Partner
from odoo.addons.runbot_merge import git, utils
from odoo.addons.runbot_merge.models.pull_requests import RPLUS, Branch
from odoo.addons.runbot_merge.models.stagings_create import Message
footer = '\nMore info at https://github.com/odoo/odoo/wiki/Mergebot#forward-port\n'
DEFAULT_DELTA = dateutil.relativedelta.relativedelta(days=3)
_logger = logging.getLogger('odoo.addons.forwardport')


class Project(models.Model):
    _inherit = 'runbot_merge.project'

    id: int
    github_prefix: str

    def write(self, vals):
        # check on branches both active and inactive so disabling branches
        # doesn't make it look like the sequence changed.
        self_ = self.with_context(active_test=False)
        previously_active_branches = {project: project.branch_ids.filtered('active') for project in self_}
        branches_before = {project: project._forward_port_ordered() for project in self_}

        r = super().write(vals)
        self_._followup_prs(previously_active_branches)
        self_._insert_intermediate_prs(branches_before)
        return r

    def _followup_prs(self, previously_active_branches):
        """If a branch has been disabled and had PRs without a followup (e.g.
        because no CI or CI failed), create followup, as if the branch had been
        originally disabled (and thus skipped over)
        """
        PRs = self.env['runbot_merge.pull_requests']
        for p in self:
            actives = previously_active_branches[p]
            for deactivated in p.branch_ids.filtered(lambda b: not b.active) & actives:
                # if a PR targets a deactivated branch, and that's not its limit,
                # and it doesn't have a child (e.g. CI failed), enqueue a forward
                # port as if the now deactivated branch had been skipped over
                # (which is the normal fw behaviour)
                extant = PRs.search([
                    ('target', '=', deactivated.id),
                    ('source_id.limit_id', '!=', deactivated.id),
                    ('state', 'not in', ('closed', 'merged')),
                ])
                for p in extant.with_context(force_fw=True):
                    next_target = p.source_id._find_next_target(p)
                    # should not happen since we already filtered out limits
                    if not next_target:
                        continue

                    # check if it has a descendant in the next branch, if so skip
                    if PRs.search_count([
                        ('source_id', '=', p.source_id.id),
                        ('target', '=', next_target.id)
                    ]):
                        continue

                    # otherwise enqueue a followup
                    p._schedule_fp_followup()

    def _insert_intermediate_prs(self, branches_before):
        """If new branches have been added to the sequence in between existing
        branches (mostly a freeze inserted before the main branch), fill in
        forward-ports for existing sequences
        """
        Branches = self.env['runbot_merge.branch']
        for p in self:
            # check if the branches sequence has been modified
            bbefore = branches_before[p]
            bafter = p._forward_port_ordered()
            if bafter.ids == bbefore.ids:
                continue

            logger = _logger.getChild('project').getChild(p.name)
            logger.debug("branches updated %s -> %s", bbefore, bafter)
            # if it's just that a branch was inserted at the end, forwardport
            # should keep on keeping normally
            if bafter.ids[:-1] == bbefore.ids:
                continue

            if bafter <= bbefore:
                raise UserError("Branches can not be reordered or removed after saving.")

            # Last possibility: branch was inserted but not at end, get all
            # branches before and all branches after
            before = new = after = Branches
            for b in bafter:
                if b in bbefore:
                    if new:
                        after += b
                    else:
                        before += b
                else:
                    if new:
                        raise UserError("Inserting multiple branches at the same time is not supported")
                    new = b
            logger.debug('before: %s new: %s after: %s', before.ids, new.ids, after.ids)
            # find all FPs whose ancestry spans the insertion
            leaves = self.env['runbot_merge.pull_requests'].search([
                ('state', 'not in', ['closed', 'merged']),
                ('target', 'in', after.ids),
                ('source_id.target', 'in', before.ids),
            ])
            # get all PRs just preceding the insertion point which either are
            # sources of the above or have the same source
            candidates = self.env['runbot_merge.pull_requests'].search([
                ('target', '=', before[-1].id),
                '|', ('id', 'in', leaves.mapped('source_id').ids),
                     ('source_id', 'in', leaves.mapped('source_id').ids),
            ])
            logger.debug("\nPRs spanning new: %s\nto port: %s", leaves, candidates)
            # enqueue the creation of a new forward-port based on our candidates
            # but it should only create a single step and needs to stitch back
            # the parents linked list, so it has a special type
            for _, cs in groupby(candidates, key=lambda p: p.label):
                self.env['forwardport.batches'].create({
                    'batch_id': self.env['runbot_merge.batch'].create({
                        'target': before[-1].id,
                        'prs': [(4, c.id, 0) for c in cs],
                        'active': False,
                    }).id,
                    'source': 'insert',
                })

    def _forward_port_ordered(self, domain=()):
        Branches = self.env['runbot_merge.branch']
        return Branches.search(expression.AND([
            [('project_id', '=', self.id)],
            domain or [],
        ]), order=reverse_order(Branches._order))


class Repository(models.Model):
    _inherit = 'runbot_merge.repository'

    id: int
    project_id: Project
    name: str
    branch_filter: str

    fp_remote_target = fields.Char(help="where FP branches get pushed")


class PullRequests(models.Model):
    _inherit = 'runbot_merge.pull_requests'

    id: int
    display_name: str
    number: int
    repository: Repository
    target: Branch
    reviewed_by: Partner
    head: str
    state: str

    reminder_backoff_factor = fields.Integer(default=-4, group_operator=None)

    @api.model_create_single
    def create(self, vals):
        # PR opened event always creates a new PR, override so we can precreate PRs
        existing = self.search([
            ('repository', '=', vals['repository']),
            ('number', '=', vals['number']),
        ])
        if existing:
            return existing

        if 'limit_id' not in vals:
            branch = self.env['runbot_merge.branch'].browse(vals['target'])
            repo = self.env['runbot_merge.repository'].browse(vals['repository'])
            vals['limit_id'] = branch.project_id._forward_port_ordered(
                ast.literal_eval(repo.branch_filter or '[]')
            )[-1].id
        if vals.get('parent_id') and 'source_id' not in vals:
            vals['source_id'] = self.browse(vals['parent_id']).root_id.id
        return super().create(vals)

    def write(self, vals):
        # if the PR's head is updated, detach (should split off the FP lines
        # as this is not the original code)
        # TODO: better way to do this? Especially because we don't want to
        #       recursively create updates
        # also a bit odd to only handle updating 1 head at a time, but then
        # again 2 PRs with same head is weird so...
        newhead = vals.get('head')
        with_parents = {
            p: p.parent_id
            for p in self
            if p.state not in ('merged', 'closed')
            if p.parent_id
        }
        closed_fp = self.filtered(lambda p: p.state == 'closed' and p.source_id)
        if newhead and not self.env.context.get('ignore_head_update') and newhead != self.head:
            vals.setdefault('parent_id', False)
            if with_parents and vals['parent_id'] is False:
                vals['detach_reason'] = f"Head updated from {self.head} to {newhead}"
            # if any children, this is an FP PR being updated, enqueue
            # updating children
            if self.search_count([('parent_id', '=', self.id)]):
                self.env['forwardport.updates'].create({
                    'original_root': self.root_id.id,
                    'new_root': self.id
                })

        if vals.get('parent_id') and 'source_id' not in vals:
            parent = self.browse(vals['parent_id'])
            vals['source_id'] = (parent.source_id or parent).id
        r = super().write(vals)
        if self.env.context.get('forwardport_detach_warn', True):
            for p, parent in with_parents.items():
                if p.parent_id:
                    continue
                self.env.ref('runbot_merge.forwardport.update.detached')._send(
                    repository=p.repository,
                    pull_request=p.number,
                    token_field='fp_github_token',
                    format_args={'pr': p},
                )
                if parent.state not in ('closed', 'merged'):
                    self.env.ref('runbot_merge.forwardport.update.parent')._send(
                        repository=parent.repository,
                        pull_request=parent.number,
                        token_field='fp_github_token',
                        format_args={'pr': parent, 'child': p},
                    )
        if vals.get('merge_date'):
            self.env['forwardport.branch_remover'].create([
                {'pr_id': p.id}
                for p in self
            ])
        # if we change the policy to skip CI, schedule followups on existing FPs
        if vals.get('fw_policy') == 'skipci' and self.state == 'merged':
            self.env['runbot_merge.pull_requests'].search([
                ('source_id', '=', self.id),
                ('state', 'not in', ('closed', 'merged')),
            ])._schedule_fp_followup()
        return r

    def _try_closing(self, by):
        r = super()._try_closing(by)
        if r:
            self.with_context(forwardport_detach_warn=False).write({
                'parent_id': False,
                'detach_reason': f"Closed by {by}",
            })
            self.search([('parent_id', '=', self.id)]).write({
                'parent_id': False,
                'detach_reason': f"{by} closed parent PR {self.display_name}",
            })
        return r

    def _maybe_update_limit(self, limit: str) -> typing.Tuple[bool, str]:
        limit_id = self.env['runbot_merge.branch'].with_context(active_test=False).search([
            ('project_id', '=', self.repository.project_id.id),
            ('name', '=', limit),
        ])
        if not limit_id:
            return True, f"there is no branch {limit!r}, it can't be used as a forward port target."

        if limit_id != self.target and not limit_id.active:
            return True, f"branch {limit_id.name!r} is disabled, it can't be used as a forward port target."

        # not forward ported yet, just acknowledge the request
        if not self.source_id and self.state != 'merged':
            self.limit_id = limit_id
            if branch_key(limit_id) <= branch_key(self.target):
                return False, "Forward-port disabled."
            else:
                return False, f"Forward-porting to {limit_id.name!r}."

        # if the PR has been forwardported
        prs = (self | self.forwardport_ids | self.source_id | self.source_id.forwardport_ids)
        tip = max(prs, key=pr_key)
        # if the fp tip was closed it's fine
        if tip.state == 'closed':
            return True, f"{tip.display_name} is closed, no forward porting is going on"

        prs.limit_id = limit_id

        real_limit = max(limit_id, tip.target, key=branch_key)

        addendum = ''
        # check if tip was queued for forward porting, try to cancel if we're
        # supposed to stop here
        if real_limit == tip.target and (task := self.env['forwardport.batches'].search([('batch_id', 'in', tip.batch_ids.ids)])):
            try:
                with self.env.cr.savepoint():
                    self.env.cr.execute(
                        "SELECT FROM forwardport_batches "
                        "WHERE id = %s FOR UPDATE NOWAIT",
                        [task.id])
            except psycopg2.errors.LockNotAvailable:
                # row locked = port occurring and probably going to succeed,
                # so next(real_limit) likely a done deal already
                return True, (
                    f"Forward port of {tip.display_name} likely already "
                    f"ongoing, unable to cancel, close next forward port "
                    f"when it completes.")
            else:
                self.env.cr.execute("DELETE FROM forwardport_batches WHERE id = %s", [task.id])

        if real_limit != tip.target:
            # forward porting was previously stopped at tip, and we want it to
            # resume
            if tip.state == 'merged':
                self.env['forwardport.batches'].create({
                    'batch_id': tip.batch_ids.sorted('id')[-1].id,
                    'source': 'fp' if tip.parent_id else 'merge',
                })
                resumed = tip
            else:
                # reactivate batch
                tip.batch_ids.sorted('id')[-1].active = True
                resumed = tip._schedule_fp_followup()
            if resumed:
                addendum += f', resuming forward-port stopped at {tip.display_name}'

        if real_limit != limit_id:
            addendum += f' (instead of the requested {limit_id.name!r} because {tip.display_name} already exists)'

        # get a "stable" root rather than self's to avoid divergences between
        # PRs across a root divide (where one post-root would point to the root,
        # and one pre-root would point to the source, or a previous root)
        root = tip.root_id
        # reference the root being forward ported unless we are the root
        root_ref = '' if root == self else f' {root.display_name}'
        msg = f"Forward-porting{root_ref} to {real_limit.name!r}{addendum}."
        # send a message to the source & root except for self, if they exist
        root_msg = f'Forward-porting to {real_limit.name!r} (from {self.display_name}).'
        self.env['runbot_merge.pull_requests.feedback'].create([
            {
                'repository': p.repository.id,
                'pull_request': p.number,
                'message': root_msg,
                'token_field': 'fp_github_token',
            }
            # send messages to source and root unless root is self (as it
            # already gets the normal message)
            for p in (self.source_id | root) - self
        ])

        return False, msg

    def _notify_ci_failed(self, ci):
        # only care about FP PRs which are not staged / merged yet
        # NB: probably ignore approved PRs as normal message will handle them?
        if not (self.state == 'opened' and self.parent_id):
            return

        self.env.ref('runbot_merge.forwardport.ci.failed')._send(
            repository=self.repository,
            pull_request=self.number,
            token_field='fp_github_token',
            format_args={'pr': self, 'ci': ci},
        )

    def _validate(self, statuses):
        failed = super()._validate(statuses)
        self._schedule_fp_followup()
        return failed

    def _schedule_fp_followup(self):
        _logger = logging.getLogger(__name__).getChild('forwardport.next')
        # if the PR has a parent and is CI-validated, enqueue the next PR
        scheduled = self.browse(())
        for pr in self:
            _logger.info('Checking if forward-port %s (%s)', pr.display_name, pr)
            if not pr.parent_id:
                _logger.info('-> no parent %s (%s)', pr.display_name, pr.parent_id)
                continue
            if not self.env.context.get('force_fw') and self.source_id.fw_policy != 'skipci' \
                    and pr.state not in ['validated', 'ready']:
                _logger.info('-> wrong state %s (%s)', pr.display_name, pr.state)
                continue

            # check if we've already forward-ported this branch:
            # it has a batch without a staging
            batch = self.env['runbot_merge.batch'].with_context(active_test=False).search([
                ('staging_id', '=', False),
                ('prs', 'in', pr.id),
            ], limit=1)
            # if the batch is inactive, the forward-port has been done *or*
            # the PR's own forward port is in error, so bail
            if not batch.active:
                _logger.info('-> forward port done or in error (%s.active=%s)', batch, batch.active)
                continue

            # otherwise check if we already have a pending forward port
            _logger.info("%s %s %s", pr.display_name, batch, ', '.join(batch.mapped('prs.display_name')))
            if self.env['forwardport.batches'].search_count([('batch_id', '=', batch.id)]):
                _logger.warning('-> already recorded')
                continue

            # check if batch-mates are all valid
            mates = batch.prs
            # wait until all of them are validated or ready
            if not self.env.context.get('force_fw') and any(
                pr.source_id.fw_policy != 'skipci' and pr.state not in ('validated', 'ready')
                for pr in mates
            ):
                _logger.info("-> not ready (%s)", [(pr.display_name, pr.state) for pr in mates])
                continue

            # check that there's no weird-ass state
            if not all(pr.parent_id for pr in mates):
                _logger.warning("Found a batch (%s) with only some PRs having parents, ignoring", mates)
                continue
            if self.search_count([('parent_id', 'in', mates.ids)]):
                _logger.warning("Found a batch (%s) with only some of the PRs having children", mates)
                continue

            _logger.info('-> ok')
            self.env['forwardport.batches'].create({
                'batch_id': batch.id,
                'source': 'fp',
            })
            scheduled |= pr
        return scheduled

    def _find_next_target(self, reference):
        """ Finds the branch between target and limit_id which follows
        reference
        """
        if reference.target == self.limit_id:
            return

        # NOTE: assumes even disabled branches are properly sequenced, would
        #       probably be a good idea to have the FP view show all branches
        branches = list(self.target.project_id
            .with_context(active_test=False)
            ._forward_port_ordered(ast.literal_eval(self.repository.branch_filter or '[]')))

        # get all branches between max(root.target, ref.target) (excluded) and limit (included)
        from_ = max(branches.index(self.target), branches.index(reference.target))
        to_ = branches.index(self.limit_id)

        # return the first active branch in the set
        return next((
            branch
            for branch in branches[from_+1:to_+1]
            if branch.active
        ), None)

    def _commits_lazy(self):
        s = requests.Session()
        s.headers['Authorization'] = 'token %s' % self.repository.project_id.fp_github_token
        for page in itertools.count(1):
            r = s.get('https://api.github.com/repos/{}/pulls/{}/commits'.format(
                self.repository.name,
                self.number
            ), params={'page': page})
            r.raise_for_status()
            yield from r.json()
            if not r.links.get('next'):
                return

    def commits(self):
        """ Returns a PR's commits oldest first (that's what GH does &
        is what we want)
        """
        commits = list(self._commits_lazy())
        # map shas to the position the commit *should* have
        idx = {
            c: i
            for i, c in enumerate(topological_sort({
                c['sha']: [p['sha'] for p in c['parents']]
                for c in commits
            }))
        }
        return sorted(commits, key=lambda c: idx[c['sha']])

    def _port_forward(self):
        if not self:
            return

        all_sources = [(p.source_id or p) for p in self]
        all_targets = [s._find_next_target(p) for s, p in zip(all_sources, self)]

        ref = self[0]
        base = all_sources[0]
        target = all_targets[0]
        if target is None:
            _logger.info(
                "Will not forward-port %s: no next target",
                ref.display_name,
            )
            return  # QUESTION: do the prs need to be updated?

        # check if the PRs have already been forward-ported: is there a PR
        # with the same source targeting the next branch in the series
        for source in all_sources:
            if self.search_count([('source_id', '=', source.id), ('target', '=', target.id)]):
                _logger.info("Will not forward-port %s: already ported", ref.display_name)
                return

        # check if all PRs in the batch have the same "next target", bail if
        # that's not the case as it doesn't make sense to forward-port one PR
        # from a to b and a linked PR from a to c
        different_target = next((t for t in all_targets if t != target), None)
        if different_target:
            different_pr = next(p for p, t in zip(self, all_targets) if t == different_target)
            for pr, t in zip(self, all_targets):
                linked, other = different_pr, different_target
                if t != target:
                    linked, other = ref, target
                self.env.ref('runbot_merge.forwardport.failure.discrepancy')._send(
                    repository=pr.repository,
                    pull_request=pr.number,
                    token_field='fp_github_token',
                    format_args={'pr': pr, 'linked': linked, 'next': t.name, 'other': other.name},
                )
            _logger.warning(
                "Cancelling forward-port of %s: found different next branches (%s)",
                self, all_targets
            )
            return

        proj = self.mapped('target.project_id')
        if not proj.fp_github_token:
            _logger.warning(
                "Can not forward-port %s: no token on project %s",
                ref.display_name,
                proj.name
            )
            return

        notarget = [p.repository.name for p in self if not p.repository.fp_remote_target]
        if notarget:
            _logger.error(
                "Can not forward-port %s: repos %s don't have a remote configured",
                self, ', '.join(notarget)
            )
            return

        # take only the branch bit
        new_branch = '%s-%s-%s-fw' % (
            target.name,
            base.refname,
            # avoid collisions between fp branches (labels can be reused
            # or conflict especially as we're chopping off the owner)
            base64.urlsafe_b64encode(os.urandom(3)).decode()
        )
        # TODO: send outputs to logging?
        conflicts = {}
        with contextlib.ExitStack() as s:
            for pr in self:
                conflicts[pr], working_copy = pr._create_fp_branch(
                    target, new_branch, s)

                working_copy.push('target', new_branch)

        gh = requests.Session()
        gh.headers['Authorization'] = 'token %s' % proj.fp_github_token
        has_conflicts = any(conflicts.values())
        # problemo: this should forward-port a batch at a time, if porting
        # one of the PRs in the batch fails it's a huge problem, though this
        # loop only concerns itself with the creation of the followup
        # objects so...
        new_batch = self.browse(())
        self.env.cr.execute('LOCK runbot_merge_pull_requests IN SHARE MODE')
        for pr in self:
            owner, _ = pr.repository.fp_remote_target.split('/', 1)
            source = pr.source_id or pr
            root = pr.root_id

            message = source.message + '\n\n' + '\n'.join(
                "Forward-Port-Of: %s" % p.display_name
                for p in root | source
            )

            title, body = re.match(r'(?P<title>[^\n]+)\n*(?P<body>.*)', message, flags=re.DOTALL).groups()
            r = gh.post(f'https://api.github.com/repos/{pr.repository.name}/pulls', json={
                'base': target.name,
                'head': f'{owner}:{new_branch}',
                'title': '[FW]' + (' ' if title[0] != '[' else '') + title,
                'body': body
            })
            if not r.ok:
                _logger.warning("Failed to create forward-port PR for %s, deleting branches", pr.display_name)
                # delete all the branches, this should automatically close the
                # PRs if we've created any. Using the API here is probably
                # simpler than going through the working copies
                for repo in self.mapped('repository'):
                    d = gh.delete(f'https://api.github.com/repos/{repo.fp_remote_target}/git/refs/heads/{new_branch}')
                    if d.ok:
                        _logger.info("Deleting %s:%s=success", repo.fp_remote_target, new_branch)
                    else:
                        _logger.warning("Deleting %s:%s=%s", repo.fp_remote_target, new_branch, d.text)
                raise RuntimeError("Forwardport failure: %s (%s)" % (pr.display_name, r.text))

            new_pr = self._from_gh(r.json())
            _logger.info("Created forward-port PR %s", new_pr)
            new_batch |= new_pr

            # allows PR author to close or skipci
            new_pr.write({
                'merge_method': pr.merge_method,
                'source_id': source.id,
                # only link to previous PR of sequence if cherrypick passed
                'parent_id': pr.id if not has_conflicts else False,
                'detach_reason': "conflicts: {}".format(
                    f'\n{conflicts[pr]}\n{conflicts[pr]}'.strip()
                ) if has_conflicts else None,
            })
            if has_conflicts and pr.parent_id and pr.state not in ('merged', 'closed'):
                self.env.ref('runbot_merge.forwardport.failure.conflict')._send(
                    repository=pr.repository,
                    pull_request=pr.number,
                    token_field='fp_github_token',
                    format_args={'source': source, 'pr': pr, 'new': new_pr, 'footer': footer},
                )

        for pr, new_pr in zip(self, new_batch):
            (h, out, err, hh) = conflicts.get(pr) or (None, None, None, None)

            if h:
                sout = serr = ''
                if out.strip():
                    sout = f"\nstdout:\n```\n{out}\n```\n"
                if err.strip():
                    serr = f"\nstderr:\n```\n{err}\n```\n"

                lines = ''
                if len(hh) > 1:
                    lines = '\n' + ''.join(
                        '* %s%s\n' % (sha, ' <- on this commit' if sha == h else '')
                        for sha in hh
                    )

                template = 'runbot_merge.forwardport.failure'
                format_args = {
                    'pr': new_pr,
                    'commits': lines,
                    'stdout': sout,
                    'stderr': serr,
                    'footer': footer,
                }
            elif has_conflicts:
                template = 'runbot_merge.forwardport.linked'
                format_args = {
                    'pr': new_pr,
                    'siblings': ', '.join(p.display_name for p in (new_batch - new_pr)),
                    'footer': footer,
                }
            elif base._find_next_target(new_pr) is None:
                ancestors = "".join(
                    "* %s\n" % p.display_name
                    for p in pr._iter_ancestors()
                    if p.parent_id
                )
                template = 'runbot_merge.forwardport.final'
                format_args = {
                    'pr': new_pr,
                    'containing': ' containing:' if ancestors else '.',
                    'ancestors': ancestors,
                    'footer': footer,
                }
            else:
                template = 'runbot_merge.forwardport.intermediate'
                format_args = {
                    'pr': new_pr,
                    'footer': footer,
                }
            self.env.ref(template)._send(
                repository=new_pr.repository,
                pull_request=new_pr.number,
                token_field='fp_github_token',
                format_args=format_args,
            )

            labels = ['forwardport']
            if has_conflicts:
                labels.append('conflict')
            self.env['runbot_merge.pull_requests.tagging'].create({
                'repository': new_pr.repository.id,
                'pull_request': new_pr.number,
                'tags_add': labels,
            })

        # batch the PRs so _validate can perform the followup FP properly
        # (with the entire batch). If there are conflicts then create a
        # deactivated batch so the interface is coherent but we don't pick
        # up an active batch we're never going to deactivate.
        b = self.env['runbot_merge.batch'].create({
            'target': target.id,
            'prs': [(6, 0, new_batch.ids)],
            'active': not has_conflicts,
        })
        # try to schedule followup
        new_batch[0]._schedule_fp_followup()
        return b

    def _create_fp_branch(self, target_branch, fp_branch_name, cleanup):
        """ Creates a forward-port for the current PR to ``target_branch`` under
        ``fp_branch_name``.

        :param target_branch: the branch to port forward to
        :param fp_branch_name: the name of the branch to create the FP under
        :param ExitStack cleanup: so the working directories can be cleaned up
        :return: A pair of an optional conflict information and a repository. If
                 present the conflict information is composed of the hash of the
                 conflicting commit, the stderr and stdout of the failed
                 cherrypick and a list of all PR commit hashes
        :rtype: (None | (str, str, str, list[commit]), Repo)
        """
        logger = _logger.getChild(str(self.id))
        root = self.root_id
        logger.info(
            "Forward-porting %s (%s) to %s",
            self.display_name, root.display_name, target_branch.name
        )
        source = git.get_local(self.repository, 'fp_github')
        r = source.with_config(stdout=subprocess.PIPE, stderr=subprocess.STDOUT).fetch()
        logger.info("Updated cache repo %s:\n%s", source._directory, r.stdout.decode())

        logger.info("Create working copy...")
        cache_dir = user_cache_dir('forwardport')
        # PullRequest.display_name is `owner/repo#number`, so `owner` becomes a
        # directory, `TemporaryDirectory` only creates the leaf, so we need to
        # make sure `owner` exists in `cache_dir`.
        Path(cache_dir, root.repository.name).parent.mkdir(parents=True, exist_ok=True)
        working_copy = source.clone(
            cleanup.enter_context(
                tempfile.TemporaryDirectory(
                    prefix=f'{root.display_name}-to-{target_branch.name}',
                    dir=cache_dir)),
            branch=target_branch.name
        )

        r = working_copy.with_config(stdout=subprocess.PIPE, stderr=subprocess.STDOUT) \
            .fetch(git.source_url(self.repository, 'fp_github'), root.head)
        logger.info(
            "Fetched head of %s into %s:\n%s",
            root.display_name,
            working_copy._directory,
            r.stdout.decode()
        )
        if working_copy.check(False).cat_file(e=root.head).returncode:
            raise ForwardPortError(
                f"During forward port of {self.display_name}, unable to find "
                f"expected head of {root.display_name} ({root.head})"
            )

        project_id = self.repository.project_id
        # add target remote
        working_copy.remote(
            'add', 'target',
            'https://{p.fp_github_token}@github.com/{r.fp_remote_target}'.format(
                r=self.repository,
                p=project_id
            )
        )
        logger.info("Create FP branch %s in %s", fp_branch_name, working_copy._directory)
        working_copy.checkout(b=fp_branch_name)

        try:
            root._cherry_pick(working_copy)
            return None, working_copy
        except CherrypickError as e:
            h, out, err, commits = e.args

            # using git diff | git apply -3 to get the entire conflict set
            # turns out to not work correctly: in case files have been moved
            # / removed (which turns out to be a common source of conflicts
            # when forward-porting) it'll just do nothing to the working copy
            # so the "conflict commit" will be empty
            # switch to a squashed-pr branch
            working_copy.check(True).checkout('-bsquashed', root.head)
            # commits returns oldest first, so youngest (head) last
            head_commit = commits[-1]['commit']

            to_tuple = operator.itemgetter('name', 'email')

            def to_dict(term, vals):
                return {
                    'GIT_%s_NAME' % term: vals[0],
                    'GIT_%s_EMAIL' % term: vals[1],
                    'GIT_%s_DATE' % term: vals[2],
                }

            authors, committers = set(), set()
            for c in (c['commit'] for c in commits):
                authors.add(to_tuple(c['author']))
                committers.add(to_tuple(c['committer']))
            fp_authorship = (project_id.fp_github_name, '', '')
            author = fp_authorship if len(authors) != 1 \
                else authors.pop() + (head_commit['author']['date'],)
            committer = fp_authorship if len(committers) != 1 \
                else committers.pop() + (head_commit['committer']['date'],)
            conf = working_copy.with_config(env={
                **to_dict('AUTHOR', author),
                **to_dict('COMMITTER', committer),
                'GIT_COMMITTER_DATE': '',
            })
            # squash to a single commit
            conf.reset('--soft', commits[0]['parents'][0]['sha'])
            conf.commit(a=True, message="temp")
            squashed = conf.stdout().rev_parse('HEAD').stdout.strip().decode()

            # switch back to the PR branch
            conf.checkout(fp_branch_name)
            # cherry-pick the squashed commit to generate the conflict
            conf.with_params('merge.renamelimit=0', 'merge.conflictstyle=diff3') \
                .with_config(check=False) \
                .cherry_pick(squashed, no_commit=True)
            status = conf.stdout().status(short=True, untracked_files='no').stdout.decode()
            if err.strip():
                err = err.rstrip() + '\n----------\nstatus:\n' + status
            else:
                err = 'status:\n' + status
            # if there was a single commit, reuse its message when committing
            # the conflict
            # TODO: still add conflict information to this?
            if len(commits) == 1:
                msg = root._make_fp_message(commits[0])
                conf.with_config(input=str(msg).encode()) \
                    .commit(all=True, allow_empty=True, file='-')
            else:
                conf.commit(
                    all=True, allow_empty=True,
                    message="""Cherry pick of %s failed
stdout:
%s
stderr:
%s
""" % (h, out, err))
            return (h, out, err, [c['sha'] for c in commits]), working_copy

    def _cherry_pick(self, working_copy):
        """ Cherrypicks ``self`` into the working copy

        :raises CherrypickError: if a commit fails to apply
        """
        # <xxx>.cherrypick.<number>
        logger = _logger.getChild(str(self.id)).getChild('cherrypick')

        # original head so we can reset
        prev = original_head = working_copy.stdout().rev_parse('HEAD').stdout.decode().strip()

        commits = self.commits()
        logger.info("%s: copy %s commits to %s\n%s", self, len(commits), original_head, '\n'.join(
            '- %s (%s)' % (c['sha'], c['commit']['message'].splitlines()[0])
            for c in commits
        ))

        for commit in commits:
            commit_sha = commit['sha']
            # config (global -c) or commit options don't really give access to
            # setting dates
            cm = commit['commit']  # get the "git" commit object rather than the "github" commit resource
            env = {
                'GIT_AUTHOR_NAME': cm['author']['name'],
                'GIT_AUTHOR_EMAIL': cm['author']['email'],
                'GIT_AUTHOR_DATE': cm['author']['date'],
                'GIT_COMMITTER_NAME': cm['committer']['name'],
                'GIT_COMMITTER_EMAIL': cm['committer']['email'],
            }
            configured = working_copy.with_config(env=env)

            conf = working_copy.with_config(
                env={**env, 'GIT_TRACE': 'true'},
                stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                check=False
            )
            # first try with default / low renamelimit
            r = conf.cherry_pick(commit_sha)
            logger.debug("Cherry-picked %s: %s\n%s\n%s", commit_sha, r.returncode, r.stdout.decode(), _clean_rename(r.stderr.decode()))
            if r.returncode:
                # if it failed, retry with high renamelimit
                configured.reset('--hard', prev)
                r = conf.with_params('merge.renamelimit=0').cherry_pick(commit_sha)
                logger.debug("Cherry-picked %s (renamelimit=0): %s\n%s\n%s", commit_sha, r.returncode, r.stdout.decode(), _clean_rename(r.stderr.decode()))

            if r.returncode:  # pick failed, reset and bail
                # try to log inflateInit: out of memory errors as warning, they
                # seem to return the status code 128
                logger.log(
                    logging.WARNING if r.returncode == 128 else logging.INFO,
                    "forward-port of %s (%s) failed at %s",
                    self, self.display_name, commit_sha)
                configured.reset('--hard', original_head)
                raise CherrypickError(
                    commit_sha,
                    r.stdout.decode(),
                    _clean_rename(r.stderr.decode()),
                    commits
                )

            msg = self._make_fp_message(commit)

            # replace existing commit message with massaged one
            configured \
                .with_config(input=str(msg).encode()) \
                .commit(amend=True, file='-')
            prev = configured.stdout().rev_parse('HEAD').stdout.decode()
            logger.info('%s: success -> %s', commit_sha, prev)

    def _build_merge_message(self, message, related_prs=()):
        msg = super()._build_merge_message(message, related_prs=related_prs)

        # ensures all reviewers in the review path are on the PR in order:
        # original reviewer, then last conflict reviewer, then current PR
        reviewers = (self | self.root_id | self.source_id) \
            .mapped('reviewed_by.formatted_email')

        sobs = msg.headers.getlist('signed-off-by')
        msg.headers.remove('signed-off-by')
        msg.headers.extend(
            ('signed-off-by', signer)
            for signer in sobs
            if signer not in reviewers
        )
        msg.headers.extend(
            ('signed-off-by', reviewer)
            for reviewer in reversed(reviewers)
        )
        return msg

    def _make_fp_message(self, commit):
        cmap = json.loads(self.commits_map)
        msg = Message.from_message(commit['commit']['message'])
        # write the *merged* commit as "original", not the PR's
        msg.headers['x-original-commit'] = cmap.get(commit['sha'], commit['sha'])
        # don't stringify so caller can still perform alterations
        return msg

    def _outstanding(self, cutoff):
        """ Returns "outstanding" (unmerged and unclosed) forward-ports whose
        source was merged before ``cutoff`` (all of them if not provided).

        :param str cutoff: a datetime (ISO-8601 formatted)
        :returns: an iterator of (source, forward_ports)
        """
        return groupby(self.env['runbot_merge.pull_requests'].search([
            # only FP PRs
            ('source_id', '!=', False),
            # active
            ('state', 'not in', ['merged', 'closed']),
            ('source_id.merge_date', '<', cutoff),
        ], order='source_id, id'), lambda p: p.source_id)

    def _reminder(self):
        cutoff = self.env.context.get('forwardport_updated_before') \
              or fields.Datetime.to_string(datetime.datetime.now() - DEFAULT_DELTA)
        cutoff_dt = fields.Datetime.from_string(cutoff)

        for source, prs in self._outstanding(cutoff):
            backoff = dateutil.relativedelta.relativedelta(days=2**source.reminder_backoff_factor)
            prs = list(prs)
            if source.merge_date > (cutoff_dt - backoff):
                continue
            source.reminder_backoff_factor += 1
            self.env.ref('runbot_merge.forwardport.reminder')._send(
                repository=source.repository,
                pull_request=source.number,
                token_field='fp_github_token',
                format_args={
                    'pr': source,
                    'outstanding': ''.join(
                        f'\n- {pr.display_name}'
                        for pr in sorted(prs, key=lambda p: p.number)
                    ),
                }
            )


# ordering is a bit unintuitive because the lowest sequence (and name)
# is the last link of the fp chain, reasoning is a bit more natural the
# other way around (highest object is the last), especially with Python
# not really having lazy sorts in the stdlib
def branch_key(b: Branch, /, _key=itemgetter('sequence', 'name')):
    return Reverse(_key(b))


def pr_key(p: PullRequests, /):
    return branch_key(p.target)


class Stagings(models.Model):
    _inherit = 'runbot_merge.stagings'

    def write(self, vals):
        r = super().write(vals)
        # we've just deactivated a successful staging (so it got ~merged)
        if vals.get('active') is False and self.state == 'success':
            # check all batches to see if they should be forward ported
            for b in self.with_context(active_test=False).batch_ids:
                # if all PRs of a batch have parents they're part of an FP
                # sequence and thus handled separately, otherwise they're
                # considered regular merges
                if not all(p.parent_id for p in b.prs):
                    self.env['forwardport.batches'].create({
                        'batch_id': b.id,
                        'source': 'merge',
                    })
        return r


class Feedback(models.Model):
    _inherit = 'runbot_merge.pull_requests.feedback'

    token_field = fields.Selection(selection_add=[('fp_github_token', 'Forwardport Bot')])


class CherrypickError(Exception):
    ...


class ForwardPortError(Exception):
    pass


def _clean_rename(s):
    """ Filters out the "inexact rename detection" spam of cherry-pick: it's
    useless but there seems to be no good way to silence these messages.
    """
    return '\n'.join(
        l for l in s.splitlines()
        if not l.startswith('Performing inexact rename detection')
    )


class HallOfShame(typing.NamedTuple):
    reviewers: list
    outstanding: list


class Outstanding(typing.NamedTuple):
    source: object
    prs: object