When using @fw-bot close, a feedback record would be created without a
message (rather than e.g. with an empty one). As a result, the
feedback-sending cron would crash, but not before having closed the
corresponding PR.
This would lead to closing the PR in a loop & spamming the logs with
tracebacks.
Runbot's initial architecture was designed for a single odoo repo, and was
later adapted to build enterprise. The addition of the upgrade repo and its
tests began to make results less intuitive, revealing more weaknesses in the
system.
On top of the oddities of duplicate detection and branch matching, there was
some room for improvement in the runbot models.
This (small) commit introduces runbot v5.0, designed to more closely match
odoo's development flows, hopefully improving the dev experience and making
runbot configuration more flexible.
**Remotes:** introducing remotes helps detect duplicates between the odoo and
odoo-dev repos: a commit now belongs to a repo, and a repo can have multiple
remotes. If a hash exists in odoo-dev, we consider it to be the same in odoo.
Note: github seems to manage commits in roughly the same way. It is possible
to send a status on a commit in odoo when the commit only exists in
odoo-dev.
This change also makes it possible to drop duplicated configuration
(modules, server files, manifests, ...) between a repo and its corresponding
dev repo.
**Trigger:** before v5.0, only one build per repo was created, making it
difficult to tweak which tests to execute in which case. The motivating use
case was upgrades: we want to test upgrading to master when pushing on odoo,
but we also want to test the upgrade the same way when pushing on upgrade,
i.e. a build that should run when pushing on either repo, while each repo
already has its own specific tests.
A trigger specifies a build to create with a specific config. The trigger is
executed when any of the trigger's repos is pushed. A trigger can also
define dependencies: only build enterprise when pushing enterprise, but
enterprise needs odoo; test upgrading to master when pushing either odoo or
upgrade.
Triggers also allow extracting builds like cla, which were executed on both
enterprise and odoo and hidden in a subbuild.
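A rough sketch of what a trigger expresses; field names here are made up for
illustration and may differ from the actual v5.0 models:

```python
# Illustrative only: pushing to any repo in repo_ids fires the trigger;
# dependency_ids are checked out for the build but do not fire it themselves.
upgrade_trigger = {
    'name': 'upgrade to master',
    'repo_ids': ['odoo', 'upgrade'],    # pushing either repo fires the trigger
    'dependency_ids': ['enterprise'],   # needed for the build, does not fire it
    'config_id': 'upgrade_config',      # config of the build to create
}
```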
**Bundle:** cross-repo branch/PR matching was hidden in build creation and
could be confusing. A build could be detected as a duplicate of a PR, but
not always, e.g. if the naming was wrong or the target was invalid or
changed. This was mainly due to how a community ref was found, which made CI
on PRs non-deterministic when duplicate matching failed. It also created two
builds, one pointing to the other when duplicate detection worked, which was
visually confusing.
Associating remotes and bundles fixes this by adding all PRs and related
branches from all repos to a bundle. First of all this helps to visualise
what the runbot considers as branch matching and what should be considered
part of the same task, giving a place to warn devs of possible
inconsistencies. Associated with repos/remotes, branches of the same repo
within a bundle can be expected to have the same head. Only one build is
created since triggers consider repos, not remotes.
**Batch:** a batch is a group of builds; a batch on a bundle can be
compared to a build on a branch in the previous version. When a branch
is pushed, the corresponding bundle creates a new batch and waits for
new commits. Once no new update has been detected in the batch for 60
seconds, all eligible triggers are executed. The created builds are added
to the batch in a batch_slot. It is also possible that a corresponding
build already exists (a duplicate), in which case it is added to the slot
instead of creating a new build.
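A very rough sketch of that batching behaviour; names and structure are
illustrative, not the actual runbot v5.0 code:

```python
import time

# Hedged sketch of the batch lifecycle: a preparing batch waits until no new
# commit has been seen for 60 seconds, then eligible triggers create (or
# reuse) builds which are attached to the batch through slots.
def process_batch(bundle, now=None):
    now = now if now is not None else time.time()
    batch = bundle.last_batch
    if batch is None or batch.state != 'preparing':
        return
    if now - batch.last_update < 60:
        return  # still waiting for potential new commits
    for trigger in bundle.project.triggers:
        if not trigger.is_eligible(batch):
            continue
        # reuse an existing matching build (duplicate) if there is one
        build = trigger.find_existing_build(batch) or trigger.create_build(batch)
        batch.add_slot(trigger, build)
    batch.state = 'ready'
```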
Co-authored-by: d-fence <moc@odoo.com>
In Odoo 13, the cache middleware was modified to hit `http.root`
directly, assuming it's the Odoo root object. When `http.root` is
replaced by a wrapping middleware, the entire thing blows up and shits
the bed.
Patch up by automatically delegating attribute accesses to the wrapped
application (which is probably Root), although why this is not just
folded into Root is getting less and less clear.
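A minimal sketch of that delegation pattern (illustrative names, not the
actual middleware code):

```python
class DelegatingMiddleware:
    """Wraps a WSGI application and forwards unknown attribute lookups to it,
    so code reaching into `http.root` still finds what it expects (probably
    attributes of Root)."""
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # middleware-specific behaviour would go here
        return self.app(environ, start_response)

    def __getattr__(self, name):
        # only called when the attribute is not found on the wrapper itself
        return getattr(self.app, name)
```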
Seems to be a pretty long-standing issue but I'd not noticed it before
as that code path is rather rarely taken & our sentry remains rather
blown to hell; I only happened to stumble upon the issue in the logs.
There's no ``number`` attribute on the repository object (to which
``_load_pr`` belongs). We obviously want to use the number of the PR
we're currently loading.
Mistake in the statuses handling: the context is not sufficient to
uniquely identify a staging status as different repositories can get
the same status context (e.g. ci/runbot is present on all our
repositories).
This is only a visual problem, but the status dropdown on
stagings (both the dashboard and the branchwise listing) would reuse
one of the statuses with that context for all of them, leading to
incorrect links and misleading displays.
Fix by keying on (repo, context) instead; that's exactly why the
repository name was part of the status in the first place.
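Illustratively, the grouping changes roughly like this (a simplified sketch,
not the actual template/model code; `statuses` is a hypothetical list of
status dicts):

```python
# Before: one entry per context, so 'ci/runbot' from different repos collide.
by_context = {s['context']: s for s in statuses}

# After: key on (repository, context) so each repo keeps its own status.
by_repo_context = {(s['repository'], s['context']): s for s in statuses}
```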
This is a regression due to the implementation details of
odoo/runbot#376: previously _parse_command would only yield the
commands it had specifically recognised (from a whitelist).
22e18e752b simplified the implementation
and (for convenience when adding new commands) now passes any command
through to the executor instead of skipping unknown ones.
But I forgot to update the executor to ignore unknown commands, so it
treats them as *failed* (since the success flag doesn't get set), assumes
it's an ACL issue, and notifies the user that they can't do the thing
they never actually asked for.
Add an end case which skips the feedback bit for unrecognized
commands, restoring the old behavior.
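Roughly, the executor needs an explicit skip for anything it doesn't
recognise; this is a sketch of the idea with made-up handler names, not the
actual mergebot code:

```python
# Without the final `else: continue`, an unknown command falls through with
# ok=False and the user gets told they lack the rights for it.
for command, param in commands:
    ok = False
    if command == 'retry':
        ok = handle_retry(param)
    elif command == 'review':
        ok = handle_review(param)
    # ... other known commands ...
    else:
        continue  # unknown command: ignore it, no ACL feedback

    if not ok:
        feedback.append("I'm sorry, you're not allowed to %s." % command)
```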
Fixes #390
Currently it can be difficult to know why the mergebot refuses to
merge a PR (not that people care, they generally just keep sending new
commands without checking what the 'bot is telling them, oh well...).
Anyway, knowing the CI state is the most complicated bit because the CI
tag only provides a global pass/fail for statuses but no view of
specific statuses, and sometimes either the runbot or github fails to
notify the mergebot, leading to inconsistent internal states & shit.
By adding a tag per status context per PR, we can more clearly
indicate what's what.
Fixes #389
Apparently a long-running issue but not really a concern before the
new mergebot started sending a lot more statuses: stagings would show
a list of all statuses they received, including optional / irrelevant
statuses.
Get the list of required statuses and only show those in the staging
dropdowns.
Closes #387
Adds an `override` mergebot command. The ability to override is set on
an individual per-context, per-repository basis, similar to but
independent from review rights. That is, a given individual may be
able to override the status X on repository A yet unable to do so on
repository B.
Overrides are stored in the same format as regular statuses, but
independently from them in order to persist them across builds.
Only PR statuses can be overridden; statuses which are overridable on
PRs would simply not be required on stagings.
An alternative to implementing this feature in the mergebot would be
to add it to individual status-generating tools on a per-need
basis.
Pros of that alternative:
* display the correct status on PRs: currently the PR will be failing
status-wise (on github) but correct as far as the mergebot is
concerned
* remove complexity from the mergebot
Cons of that alternative:
* each status-generating tool would have to implement some sort of ACL
system
* each status-generating tool would have to receive & parse PR
comments
* each status-generating tool would have to maintain per-pr state in
order to track overrides
Some sort of helper library / framework ought to make that rather easy
though. It could also be linked into the central provisioning system
thing.
Closes #376
Requirement for odoo/runbot#376: one can't expect there to be someone
around to override CI checks on stagings, so overriding only makes sense
for checks on PRs, which in turn requires that there can be checks only
required on PRs.
Could also be useful for features like incremental linting /
formatting: we may want to apply checks on PRs which filter on the
lines modified, without requiring the entire codebase be reformatted at
once.
Having the required statuses be a mere list of contexts has become a
bit too limiting for our needs: it doesn't allow e.g. adding new
required statuses on only some branches of a repository (e.g. only
master), nor does it allow requiring checks only on PRs, or only on
stagings, which would be useful for overridable checks and the like,
or for checks which only make sense linked to a specific revision
range (e.g. "incremental" linting which would only check whatever's
been modified in a PR).
Split the required statuses into a separate set of objects, any of
which can be separately marked as applying only to some branches (no
branch = all branches).
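Something along these lines, as a sketch with made-up model and field names
rather than the actual implementation:

```python
from odoo import fields, models

class RequiredStatus(models.Model):
    _name = 'runbot_merge.required_status'  # hypothetical model name

    repo_id = fields.Many2one('runbot_merge.repository', required=True)
    context = fields.Char(required=True, help="status context, e.g. ci/runbot")
    branch_ids = fields.Many2many(
        'runbot_merge.branch',
        help="branches this status is required on; empty means all branches",
    )
```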
Fixes #382
With a concurrency of 3 or more, it ends up being pretty easy to hit
github's rate limit (5000 requests/h), at which point the run hits a
cascading failure where every test from then on blows up due to the
rate limiting.
Add handling for that case to some of the early github-querying
fixtures, so they can wait for the rate limit to reset.
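The handling is roughly of this shape; a sketch assuming the fixtures talk to
api.github.com with `requests`, not the actual test code:

```python
import time
import requests

def gh_request(session, method, url, **kw):
    # On a 403 with an exhausted rate limit, sleep until the window resets
    # (X-RateLimit-Reset is a unix timestamp), then retry the request.
    # usage: gh_request(requests.Session(), 'GET', 'https://api.github.com/rate_limit')
    while True:
        r = session.request(method, url, **kw)
        if (r.status_code == 403
                and r.headers.get('X-RateLimit-Remaining') == '0'):
            reset = int(r.headers.get('X-RateLimit-Reset', time.time() + 60))
            time.sleep(max(reset - time.time(), 1))
            continue
        return r
```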
This increases the need for a proper fake github thingie I could run
on a per-test basis.
73f720a55c refactored the runbot tests,
and amongst other things created a single patch point for the "mock
root" as a testcase attribute.
One of the tests was missed during that refactoring, likely because
it's skipped by default.
The ocrmypdf suite of tools is needed by task #2238654.
Although this package will be optional for Odoo users, it has to be
usable by devs in order to test and/or fix the feature.
Docker container names are derived from the dest and step name. The dest
is itself derived from the branch name.
In some rare cases, a character not allowed by Docker ends up in the
container name computed by the runbot.
With this commit, a sanitize_container_name function is used to remove
disallowed characters at the container utility level.
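Assuming Docker's container-name rule (`[a-zA-Z0-9][a-zA-Z0-9_.-]*`), the
helper could look something like this sketch, not necessarily the exact
runbot implementation:

```python
import re

def sanitize_container_name(name):
    """Drop characters Docker refuses in container names."""
    name = re.sub(r'[^a-zA-Z0-9_.-]', '', name)
    # the first character must be alphanumeric
    return re.sub(r'^[^a-zA-Z0-9]+', '', name)
```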
The postgresql-client in the Dockerfile is the one provided by the
Debian package. When the postgresql server on the host has a higher
version than the client, some builds may fail (for example, dumping a
database with pg_dump).
With this commit, the postgresql-client 12 from the postgresql repo is
used in the Dockerfile.
While the head got updated (properly), the squash flag did not, which
could lead to odd results. Since a PR can only be reopened if it was
regular-pushed to (not after a force push), there are two scenarios:
* the PR is updated to have 0 commits, closed, pushed to with one commit,
then reopened; after reopening the PR would be marked as !squash and
would ask for a merge method (that's what happened with
odoo/odoo#51763)
* the PR has a single commit, is closed, pushed to, then reopened;
after reopening the PR would still be marked as squash and potentially
straight rebased without asking for a merge method
Nothing would break per se, but both scenarios are undesirable.
Closes #373
The exponential backoff offsets from the write_date of the child
PRs; however, it doesn't reset, so the offset gets bumped up way
more than originally expected or designed if the child PRs are under
active development for some reason.
Fix this by adding a field to specifically record the date of merge of
a PR, and check that field against the backoff offset. This should
provide a more regular and reliable backoff.
Fixes #369
Given a PR batch getting forward-ported together, if one of the PRs
has a conflict the others should be considered "in conflict" as well,
and should have a note pointing in that direction and indicating that
the PR should be approved the normal way eventually. Which they do.
However, the message is confusing as it gets bolted onto the normal
non-conflicting message, either the note that it's part of a chain
or (worse, because it gives a conflicting indication) the "terminal"
message recommending using the forwardbot to approve of the entire
chain.
I've no idea why I did it that way instead of just adding a case to
the conditional, and the commit message provides no indication. But
perform that change anyway; it seems innocuous, and hopefully there
weren't good reasons I forgot about for doing it the other way around.
Fixes #367
Runbot can send a status multiple times for the same hash:
- if a transaction fails in the scheduler and is retried
- if multiple subbuilds are failing
This leads to multiple issues:
- when github receives more than one failure status, the mergebot is
notified multiple times and sends multiple mails (for forward ports mainly)
- github answers `422 Unprocessable Entity for url...` after
1000 statuses.
This fix proposes to limit the number of statuses:
- by not sending statuses for orphan builds (the parent status will never change)
- by storing the last sent status to avoid notifying multiple times
- by sending statuses post-commit to avoid contacting github in case of
transaction failure. This will also slightly reduce transaction time by
removing an http request.
Sometimes a `git fetch` fails, with error code 128 for example. When
this happens, the runbot host is immediately disabled.
During investigations of such cases, we found that simply retrying the
fetch command works.
With this commit, the fetch command is tried 5 times with an increasing
delay before deciding to disable the runbot host.
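In spirit, the retry loop looks like this illustrative sketch (the real code
lives in the runbot's host/repo management layer and may differ):

```python
import subprocess
import time

def fetch_with_retry(repo_dir, remote='origin', attempts=5):
    # Retry a failing `git fetch` a few times with an increasing delay
    # before concluding the host should be disabled.
    for attempt in range(1, attempts + 1):
        res = subprocess.run(['git', '-C', repo_dir, 'fetch', remote])
        if res.returncode == 0:
            return True
        time.sleep(attempt * 5)  # increasing delay between retries
    return False
```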
When it updated tagging, e82de3136b also
incorrectly replaced a `pr` by `pr.display_name`, probably a leftover
from an attempt to update a callsite from `str(pr)` to
`pr.display_name` which I missed when reverting that change.
Anyway, in that section `pr` is an integer (as it comes from an SQL
query), not an object.
Provides a `skipci` command to PR reviewers. This makes it so the
followup PRs (after the first one) get created immediately, without
waiting for CI to succeed on a given forward-port PR.
This can be useful if for some reason a change *must* be merged in
branch N+1 before it can be merged in branch N.
Fixes #363
e9e08fec3c attempted to fix the issue
but obviously failed as it still occurs: when creating a PR through
the API, it's possible that the webhook gets triggered fast enough that
the transaction creating the PR from the webhook commits before we get
around to creating our own PR from the API call, in which case the
forward-port process aborts.
The process is re-run later on and generally succeeds fully, but we're
left with a dangling PR we created but couldn't do anything with, as
the run using it broke.
This issue seems to be getting more frequent so it's becoming quite
important to fix it. Therefore we give our Raging Bull a Big Gun and
now he has 20 attack *cough cough* we lock the bloody table down
tight (only allow concurrent `SELECT`) until we've got the PR back and
we've done the updates we need to it and nobody can mess with it...
probably.
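In essence the fix boils down to something like this, run at the start of the
forward-port transaction (table name illustrative; in postgres, `EXCLUSIVE`
mode lets plain `SELECT`s through but blocks writes and other locks):

```python
# Hedged sketch: hold an EXCLUSIVE table lock for the rest of the transaction
# so nothing else can insert or update PRs while we reconcile the one we just
# created through the API.
cr.execute("LOCK runbot_merge_pull_requests IN EXCLUSIVE MODE")
# ... fetch-or-create the PR record, apply our updates, then commit,
# which releases the lock.
```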
This is not ideal as it's going to block updates to completely
unrelated PRs, but it doesn't seem like postgres really allows for
locking out creations without locking out the rest, short of using
advisory locks maybe? E.g. in the `create` override get a
`pg_advisory_xact_lock_shared`, then get a `pg_advisory_xact_lock` in
the forward-port process; that way we're just blocking the concurrent
creation of PRs during forward port, but creations don't block one
another and we don't block updates.
Application-level locks wouldn't really work as the 'bot could be
deployed using multi-worker scenarios so we'd need cross-process locks
or something.
Hopefully fixes #352
Since we store the target_branch_name, filtering out pull head names
that contain `patch-` is not necessary anymore.
This commit is a first step towards a clean refactoring.
When a build is done, various numerical information can be extracted
from its log files, e.g. the global query count or the tests query count.
The extraction regular expression could be hard-coded in a custom step,
but there is no placeholder where the retrieved information could be
stored. In order to compare results, we need to store these values.
With this commit, a new model `runbot.build.stat` is used to store
key/value pairs linked to a build/config_step. That way, extracted
values can be stored.
Also, another model, `runbot.build.stat.regex`, stores the regular
expressions that can be used to grep log files and extract values.
The regular expression must contain a named group like this:
`(?P<value>.+)`
The text caught by this group MUST be castable into a float.
Optionally, another named group can be used in the regular expression
like this:
`(?P<key>.+)`
This `key` group will then be used to augment the key name in the
database.
Example:
Consider a log line like this one:
`odoo.addons.website_blog.tests.test_ui tested in 10.35s`
A regular expression like this, named `test_duration`:
`odoo.addons.(?P<key>.+) tested in (?P<value>\d+\.\d+)s`
should store the following key/value pair:
`{
'key': 'test_duration.website_blog.tests.test_ui',
'value': 10.35
}`
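For illustration, here is how that example plays out with plain `re`, using
the regex above verbatim (this is not the runbot code itself):

```python
import re

line = "odoo.addons.website_blog.tests.test_ui tested in 10.35s"
pattern = r"odoo.addons.(?P<key>.+) tested in (?P<value>\d+\.\d+)s"

match = re.search(pattern, line)
stat = {
    # the regex name prefixes the captured key group
    'key': 'test_duration.%s' % match.group('key'),
    # the value group must be castable to float
    'value': float(match.group('value')),
}
# stat == {'key': 'test_duration.website_blog.tests.test_ui', 'value': 10.35}
```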
A `generic` boolean field is present on the build.stat.regex object:
when no regex is linked to a make_stats config step, all the generic
regexes are applied.
A wizard is added to help create the regular expressions, allowing to
test whether they work against a user-provided example.
A _make_stats method is added to the ConfigStep model; it is called
during the _schedule of a build, just before calling the next step.
The regex search is only applied in steps that have the `make_stats`
boolean field set to true. Also, the build's branch has to be flagged
`make_stats` too, or the build can have a `make_stats` key in its
config_data field.
The `make_stats` field on the branch is a stored computed field; that
way, sticky branches automatically get `make_stats` set to true.
Finally, an SQL view is used to facilitate the stats visualisation.
The logic of the partner merge wizard is to collect all relevant data
from source partners, write them to a destination partner, then remove
the sources.
This... doesn't work when the field in question has a UNIQUE
constraint (like github_login): the wizard copies the value from a
source onto the dest, which blows the constraint, and so the copy
fails. In that case the user first has to *move over* the unique
field's value before they can use the wizard.
Just fix it for the github login: take all the sources, remove (and
store) their github logins, then write the login onto the dst.
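Roughly, the shuffle looks like this (a sketch on Odoo recordsets with
illustrative variable names, not the exact wizard override):

```python
# Hedged sketch: clear the UNIQUE github_login from the source partners
# before anything is written to the destination, then restore one of the
# collected logins on the destination partner.
logins = [p.github_login for p in src_partners if p.github_login]
src_partners.write({'github_login': False})
if logins:
    dst_partner.github_login = logins[0]
```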
An alternative would have been to *defer* the constraint, however:
* it only works on unique constraints, not unique indexes
* it requires the constraint to be declared DEFERRABLE
Closes #301
The reminder feature is a bit brutal when people go on holidays or
whatever, as it keeps commenting every day.
With this change, it should comment every day for a few days, then
quickly taper off.
Closes #285
Up till now, if an FF failed with an exception having neither cause nor
context, the cancel reason would be an empty string.
Fall back on stringifying the exception itself as a last resort.
Because the reminder cron uses groupby to "merge" open PRs related to the
same source and send a single message for all of them (e.g. PR 6548
forward-ported to 6587 and 6591 should get a single reminder message per
day, not one per descendant), the PRs with the same source need to be
consecutive in the search results.
However, no order was specified, so the search would yield PRs in id
order or something, and if another forward-port PR happened to sit in
between, the descendants of the original would not get coalesced and would
therefore trigger a message per descendant per day (doubling or tripling
the intended spam rate).
Ordering by source_id should fix the issue as it ought to make all PRs
forward-ported from the same source contiguous, and therefore grouped
together before sending reminder messages.
An alternative solution would be to use `groupby` instead of `search`, but
it would require more modifications as we'd need to re-browse the sources
and descendants, etc...
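A toy illustration of why consecutive ordering matters for that kind of
grouping, using `itertools.groupby` (which only merges adjacent items); the
data is made up to mirror the scenario above:

```python
from itertools import groupby

prs = [
    {'id': 6587, 'source': 6548},
    {'id': 7368, 'source': 1111},  # unrelated forward-port sitting in between
    {'id': 6591, 'source': 6548},
]

def src(p):
    return p['source']

# Unordered: source 6548 appears in two separate groups -> two reminders.
print([(k, len(list(g))) for k, g in groupby(prs, key=src)])
# [(6548, 1), (1111, 1), (6548, 1)]

# Ordered by source (what order='source_id' achieves): one group per source.
print([(k, len(list(g))) for k, g in groupby(sorted(prs, key=src), key=src)])
# [(1111, 1), (6548, 2)]
```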
First part of fixing #285, as this is likely why odoo/enterprise#7204 got
spammed so much: its descendants were odoo/enterprise#7367 and
odoo/enterprise#7369, and it just so happens that odoo/enterprise#7368 was
*also* a forward-port PR, causing the issue explained above.