runbot

mirror of https://github.com/odoo/runbot.git synced 2025-03-15 23:45:44 +07:00

Author	SHA1	Message	Date
Xavier Morel	dff7f102ea	[CHG] runbot_merge: convert freeze wizard to local repo Probably less necessary than for the regular staging stuff, but might as well while at it. Requires updating one of the test to generate a non-ff push, as O_CREAT doesn't exist at the git level, and the client (and it is client-side) only protects against force pushes. So there is no way to trigger an issue with just the creation of the new branch, it needs to exist and point to a non-ancestor commit. Also remove a sleep in the ref update loop as there are no ref updates anymore, until the very final sync via git. NB: maybe it'd be possible to push both bump and release PRs together for each repo, but getting which update failed in case of failure seems difficult.	2023-08-23 14:09:32 +02:00
Xavier Morel	1f8d534c02	[REM] runbot_merge: use of tmp branches during staging The entire staging process is now done locally with no interaction with the repo until the staging branches get pushed to. Removed the backoff / visibility check, as we expect the git repo to be a "better" reference than the API, and we're now pushing there directly.	2023-08-23 09:34:28 +02:00
Xavier Morel	879a5d0068	[CHG] runbot_merge: convert rebase-merge and merge to working locally	2023-08-22 15:52:48 +02:00
Xavier Morel	87948ef575	[IMP] runbot_merge: inline stage_into, remove working copy Turns out we can (most likely) do all the work in the local main repo directly, so there is no need for the separate working copy, or the exist stack, and thus no need for a separate function to be context-managed. Should have yagni'd and waited to setup the working copy stuff... oh well.	2023-08-22 14:23:51 +02:00
Xavier Morel	a9b4b373c2	[CHG] runbot_merge: convert rebase-ff staging to working locally Implemented rebase (-ish?) by hand, following the model of the one built on the github API, so we have all the information we need and still don't need a local working copy. Had to update several tests due to formatting changes from using the CLI git (maybe that could be avoided by feeding the commit message through stdin but w/e). Also added a dupe of an existing helper for the formatting changes bit, will have to remove it once all's done.	2023-08-22 14:17:22 +02:00
Xavier Morel	6b72585ef1	[CHG] runbot_merge: convert squash staging to working locally Update a test because of minor formatting changes which don't really seem fixable (but don't really matter either).	2023-08-22 08:05:06 +02:00
Xavier Morel	7a13b129bc	[ADD] runbot_merge: github identity for the mergebot Necessary to create commits as the mergebot without going through the github API. Copy of the improved version from forwardport. Not an override, to avoid unnecessarily triggering one or the other which is confusing and weird.	2023-08-22 08:04:46 +02:00
Xavier Morel	fac220c234	[IMP] forwardport: when fetching identity avoid emails if possible If the primary email is made public, it is returned directly as part of the /users endpoint, in which case we don't need to fetch it via /user/emails. Also improve error messages, and fix the incorrect checks on the existence of the github name and email. And allow manually updating both via the project form.	2023-08-22 08:01:38 +02:00
Xavier Morel	e88e6fed34	[FIX] convert source_url to only include the token not having to deal with the username is much easier	2023-08-22 08:00:48 +02:00
Xavier Morel	747162d6e1	[CHG] runbot_merge: switch staging from github API to local It has been a consideration for a while, but the pain of subtly interacting with git via the ignominous CLI kept it back. Then ~~the fire nation attacked~~ github got more and more tight-fisted (and in some ways less reliable) with their API. Staging pretty much just interacts with the git database, so it's both a facultative github operator (it can just interact with git directly) and a big consumer of API requests (because the git database endpoints are very low level so it takes quite a bit of work to do anything especially when high-level operations like rebase have to be replicated by hand). Nota: it might be a good idea to use pygit2 instead of the CLI, the library seems to be typed and might allow doing the processing we need on a bare repository directly. Which we can do via the plumbing commands anyway, but still. Fixes #247	2023-08-18 13:51:18 +02:00
Xavier Morel	5703840b3b	[MOV] runbot_merge: all the staging creation code to a separate module Move almost all the staging code to free functions, in a separate module, and extensively typed. The only bits which didn't move are: - the entry point (the cron hook), because it has to be a model method in order to be called - the `_build_merge_message` method, because it needs to be overridable There's also a bit of an import mess, because the cron & `_build_merge_message` need to call into the new module, but the new module wants the types they belong to, so it's a bit circular.	2023-08-18 13:07:51 +02:00
Xavier Morel	b1f63e52f3	[CHG] *: move repo cache from forwardbot to mergebot If the stagings are going to be created locally (via a git working copy rather than the github API), the mergebot part needs to have access to the cache, so move the cache over. Also move the maintenance cron. In an extermely minor way, this prefigures the (hopeful) eventual merging of the ~~planes~~ modules.	2023-08-16 14:37:28 +02:00
Xavier Morel	90961b99c9	[ADD] : changelog entries I forgot Can't hurt to have* them.	2023-08-14 09:28:19 +02:00
Xavier Morel	62fb880a45	[FIX] forwardport: sync outstandings notification with page In `81ce4ea02b` the delta for PRs being listed in the `/forwardport/outstanding` page was increased from 3 days to 7 (1 week), however the warning box in the home page still used the old cutoffs leading to An inconsistency between the two and an effective severe overcounting, as the reason why the cutoff was increased to a week is forward ports can take a while or the author / reviewer can be a touch busy at end of week, so 3~4 days are routine when a PR is merged on thursday or friday (and even worse if there's bank holidays in the mix).	2023-08-14 08:00:56 +02:00
Xavier Morel	7348e4d7a4	[IMP] runbot_merge: ensure at least 1s between mutating GH calls Mostly a temporary safety feature after the events of 07-31: it's still not clear whether that was a one-off issue or a change in policy (I was not able to reproduce locally even doing several set_refs a second) and the gh support is not super talkative, but it probably doesn't hurt to commit the workaround until #247 gets implemented. On 2023-07-31, around 08:30 UTC, `set_ref` started failing, a lot (although oddly enough not continuously), with the unhelpful message that > 422: Reference cannot be updated This basically broke all stagings, until a workaround was implemented by adding a 1s sleep before `set_ref` to ensure no more than 1 `set_ref` per second, which kinda sorta has been the github recommendation forever but had never been an issue before. Contributing to this suspicion is that in late 2022, the documentation of error 422 on `PATCH git/refs/{ref}` was updated to: > Validation failed, or the endpoint has been spammed. Still would be nice if GH was clear about it and sent a 429 instead. Technically the recommendation is: > If you're making a large number of POST, PATCH, PUT, or DELETE > requests for a single user or client ID, wait at least one second > between each request. So... actually implement that. On a per-worker basis as for the most part these are serial processes (e.g. crons), we can still get above the rate limit due to concurrent crons but it should be less likely. Also take `Retry-After` in account, can't hurt, though we're supposed to retry just the request rather than abort the entire thing. Maybe a future update can improve this handling. Would also be nice to take `X-RateLimit` in account, although that's supposed to apply to all requests so we'd need a second separate timestamp to track it. Technically that's probably also the case for `Retry-After`. And fixing #247 should cut down drastically on the API calls traffic as staging is a very API-intensive process, especially with the sanity checks we had to add, these days we might be at 4 calls per commit per PR, and up to 80 PRs/staging (5 repositories and 16 batches per staging), with 13 live branches (though realistically only 6-7 have significant traffic, and only 1~2 get close to filling their staging slots).	2023-08-11 12:32:21 +02:00
Xavier Morel	85a74a9e32	[ADD] runbot_merge: staging query endpoints `/runbot_merge/stagings` ======================== This endpoint is a reverse lookup from any number of commits to a (number of) staging(s): - it takes a list of commit hashes as either the `commits` or the `heads` keyword parameter - it then returns the stagings which have all these commits as respectively commits or heads, if providing all commits for a project the result should always be unique (if any) - `commits` are the merged commits, aka the stuff which ends up in the actual branches - `heads` are the staging heads, aka the commits at the tip of the `staging.$name` branches, those may be the same as the corresponding commit, or might be deduplicator commits which get discarded on success `/runbot_merge/stagings/:id` ============================ Returns a list of all PRs in the staging, grouped by batch (aka PRs which have the same label and must be merged together). For each PR, the `repository` name, `number`, and `name` in the form `$repository#$number` get returned. `/runbot_merge/stagings/:id1/:id2` ================================== Returns a list of all the successfully merged stagings between `id1` and `id2`, from oldest to most recent. Individual records have the form: - `staging` is the id of the staging - `prs` is the contents of the previous endpoint (a list of PRs grouped by batch) `id1` must be lower than `id2`. By default, this endpoint is inclusive on both ends, the `include_from` and / or `include_to` parameters can be passed with the `False` value to exclude the corresponding bound from the result. Related to #768	2023-08-11 11:13:34 +02:00
Xavier Morel	4eefc980bb	[IMP] runbot_merge: logger messages	2023-08-10 16:14:33 +02:00
Xavier Morel	e9f7252ed1	[FIX] runbot_merge: sentry issue via monkeypatch `auto_session_tracking` causes issues when not specified on the super old version of the client which is available on ubuntu. Also disable tracing as it seems less useful than hoped for, and I've not been using what's been collected so far.	2023-08-10 15:27:20 +02:00
Xavier Morel	b1af2e573a	[IMP] runbot_merge: split staging heads out to join tables Currently the heads of a staging (both staging heads and merged heads) are just JSON data on the staging itself. Historically this was convenient as the heads were mostly of use to the staging process, and thus accessed directly through the staging essentially exclusively. However this makes finding stagings from merged commits e.g. for forensic research almost impossible, because querying based on the values of a JSON map is expensive, and indexing it is difficult. To make this use case more feasible, split the `heads` field into two join tables, one for the staging heads and one for the merged heads, this makes looking for stagings by commits much more efficient (although the queries may not be trivial). Also add two utility RPC methods, so that it's possible to query stagings reasonably easily and efficiently based on a set of commits (branch heads). related to #768	2023-08-10 14:04:59 +02:00
Xavier Morel	cdffa83191	[IMP] runbot_merge, forwardport: minor cleanups Remove unused imports, unnecessary f-strings, dead code, fix less-than-ideal operators.	2023-08-10 13:33:16 +02:00
Xavier Morel	a692163f6e	[IMP] runbot_merge: add quick jump from stagings to PRs In the backend, the intermediate jump through batches is really not convenient (even if we kinda have to jump through batches anyway). Fixes #751	2023-07-10 15:23:31 +02:00
Xavier Morel	780e20bfd6	[IMP] runbot_merge: filtering options and UX on stagings list Allow filtering stagings by state (success or failure), and provide a control to explicitly update the staging date limit. Should make it easier to drill through stagings when looking for specific information. Related to #751	2023-07-10 15:23:31 +02:00
Xavier Morel	5bce73c97d	[IMP] *: optimise loading of home page Fix outstanding query to make a positive `state` filtering, instead of negative, matching 3b52b1aace8674259812a76b1566260937dbcacb. Also manually create a map of stagings (grouped by branch) sharing a single prefetch set. For odoo the mergebot home page has 12 branches in the odoo project and 8 in spreadsheet, 6 stagings each. This means 120 queries to retrieve all the heads (Odoo stagings have 5 heads and spreadsheet have 1, but that seems immaterial). By fixing `_compute_statuses` and creating a single prefetch set for all stagings of all branches we can fetch all the commits in a single query instead of 120.	2023-07-10 15:23:31 +02:00
Xavier Morel	81ce4ea02b	[IMP] rewrite /forwardport/outstanding - add support for authorship (not just approval) - make display counts directly - fix `state` filter: postgres can't do negative index lookups - add indexes for author and reviewed_by as we look them up - ensure we handle the entire source filtering via a single subquery Closes #778	2023-07-10 15:23:31 +02:00
Xavier Morel	aefbdaf974	[IMP] *: client side sorted implementation Allow sorting by a callback. Sort remains client-side.	2023-07-10 15:23:31 +02:00
Xavier Morel	d748d4b215	[IMP] : create a single template db per module to test Before this, when testing in parallel (using xdist) each worker would create its own template database (per module, so 2) then would copy the database for each test. This is pretty inefficient as the init of a db is quite expensive in CPU, and when increasing the number of workers (as the test suite is rather IO bound) this would trigger a stampede as every worker would try to create a template at the start of the test suite, leading to extremely high loads and degraded host performances (e.g. 16 workers would cause a load of 20 on a 4 cores 8 thread machine, which makes its use difficult). Instead we can have a lockfile at a known location of the filesystem, the first worker to need a template for a module install locks it, creates the templates, then writes the template's name to the lockfile. Every other worker can then lock the lockfile and read the name out, using the db for duplication. Note: needs to use `os.open` because the modes of `open` apparently can't express "open at offset 0 for reading or create for writing", `r+` refuses to create the file, `w+` still truncates, and `a+` is undocumented and might not allow seeking back to the start on all systems so better avoid it. The implementation would be simplified by using `lockfile` but that's an additional dependency plus it's deprecated. It recommends `fasteners` but that seems to suck (not clear if storing stuff in the lockfile is supported, it opens the lockfile in append mode). Here the lockfiles are sufficient to do the entire thing. Conveniently, this turns out to improve both* walltime CPU time compared to the original version, likely because while workers now have to wait on whoever is creating the template they're not competing for resources with it.	2023-07-10 15:23:31 +02:00
Xavier-Do	f22ae357e3	[DEL] runbot: 15.0 eol	2023-06-22 15:59:50 +02:00
Xavier-Do	b72fdff01d	[DEL] runbot: 15.0 eol	2023-06-22 15:59:36 +02:00
Xavier Morel	72281b0c63	[IMP] runbot_merge: allow running test suite without an explicit addons path	2023-06-22 14:38:10 +02:00
Xavier-Do	fc41e09f44	[IMP] runbot: add useful indexes remaining from upgrade	2023-06-22 11:23:43 +02:00
Xavier Morel	765281a665	[FIX] runbot_merge: make provisioning more resilient A few cases of conflict were missing from the provisioning handler. They can't really be auto-fixed, so just output a warning and ignore the entry, that way the rest of the provisioning succeeds.	2023-06-21 14:26:19 +02:00
Xavier Morel	9260384284	[FIX] runbot_merge: concurrency error in freeze wizard (hopefully) During the 16.3 freeze an issue was noticed with the concurrency safety of the freeze wizard (because it blew up, which caused a few issues): it is possible for the cancelling of an active staging to the master branch to fail, which causes the mergebot side of the freeze to fail, but the github state is completed, which puts the entire thing in a less than ideal state. Especially with the additional issue that the branch inserter has its own concurrency issue (which maybe I should fix): if there are branches being forward-ported across the new branch, it's unable to see them, and thus can not create the now-missing PRs. Try to make the freeze wizard more resilient: 1. Take a lock on the master staging (if any) early on, this means if we can acquire it we should be able to cancel it, and it won't suffer a concurrency error. 2. Add the `process_updated_commits` cron to the set of locked crons, trying to read the log timeline it looks like the issue was commits being impacted on that staging while the action had started: REPEATABLE READ meant the freeze's transaction was unable to see the update from the commit statuses, therefore creating a diverging update when it cancelled the staging, which postgres then reported as a serialization error. I'd like to relax the locking of the cron (to just FOR SHARE), but I think it would work, per postgres: > SELECT FOR UPDATE, and SELECT FOR SHARE commands behave the same as > SELECT in terms of searching for target rows: they will only find > target rows that were committed as of the transaction start > time. However, such a target row might have already been updated (or > deleted or locked) by another concurrent transaction by the time it > is found. In this case, the repeatable read transaction will wait > for the first updating transaction to commit or roll back (if it is > still in progress). If the first updater rolls back, then its > effects are negated and the repeatable read transaction can proceed > with updating the originally found row. But if the first updater > commits (and actually updated or deleted the row, not just locked > it) then the repeatable read transaction will be rolled back with > the message This means it would be possible to lock the cron, and then get a transaction error because the cron modified one of the records we're going to hit while it was running: as far as the above is concerned the cron's worker had "just locked" the row so it's fine to continue. However this makes it more and more likely an error will be hit when trying to freeze (to no issue, but still). We'll have to see how that ends up. Fixes #766 maybe	2023-06-21 14:26:19 +02:00
Xavier Morel	ed0fd88854	[ADD] runbot_merge: sentry instrumentation Currently sentry is only hooked from the outside, which doesn't necessarily provide sufficiently actionable information. Add some a few hooks to (try and) report odoo / mergebot metadata: - add the user to WSGI transactions - add a transaction (with users) around crons - add the webhook event info to webhook requests - add a few spans to the long-running crons, when they cover multiple units per iteration (e.g. a span per branch being staged) Closes #544	2023-06-21 14:26:19 +02:00
Xavier Morel	06a3a1bab5	[IMP] runbot_merge: add sentry filtering, rework some error messages - move sentry configuration and add exception-based filtering - clarify and reclassify (e.g. from warning to info) a few messages - convert assertions in rebase to MergeError so they can be correctly logged & reported, and ignored by sentry, also clarify them (especially the consistency one) Related to #544	2023-06-15 08:21:20 +02:00
Xavier Morel	cd4ded899b	[IMP] runbot_merge: error reporting Largely informed by sentry, - Fix an inconsistency in staging ref verification, `set_ref` internally waits for the observed head to match the requested head, but then `try_staging` would re-check that and immediately fail if it didn't. Because github is eventually consistent (hopefully) this second check can fail (and is also an extra API call), breaking staging unnecessarily, especially as we're still going to wait for the update to be visible to git. Remove this redundant check entirely, as github provides no way to ensure we have a consistent view of anything, it doesn't have much value and can do much harm. - Add github request id to one of the sanity check warnings as that could be a useful thing to send upstream, missing github request ids in the future should be noted and added. - Reworked the GH object's calls to be clearer and more coherent: consistently log the same thing on all GH errors (if `check`), rather than just on the one without a `check` entry. Also remove `raise_for_status` and raise `HTTPError` by hand every time we hit a status >= 400, so we always forward the response body no matter what its type is. - Try again to log the request body (in full as it should be pretty small), also remove stripping since we specifically wanted to add a newline at the start, I've no idea what I was thinking. Fixes #735, #764, #544	2023-06-14 16:01:45 +02:00
Xavier Morel	270dfdd495	[REF] : move most feedback messages to pseudo-templates Current system makes it hard to iterate feedback messages and make them clearer, this should improve things a touch. Use a bespoke model to avoid concerns with qweb rendering complexity (we just want GFM output and should not need logic). Also update fwbot test setup to always configure an fwbot name, in order to avoid ping messages closing the PRs they're talking about, that took a while to debug, and given the old message I assume I'd already hit it and just been too lazy to fix. This requires updating a bunch of tests as fwbot ping are sent to* `fp_github_name`, but sent from the reference user (because that's the key we set). Note: noupdate on CSV files doesn't seem to work anymore, which isn't great. But instead set tracking on the template's templates, it's not quite as good but should be sufficient. Fixes #769	2023-06-14 16:01:45 +02:00
Xavier Morel	e14616b2fb	[IMP] runbot_merge: add support for draft check `1cea247e6c` missed the update of the `draft` flag, add support for it. Fixes #753	2023-06-14 16:01:45 +02:00
Xavier Morel	048ae0c5ff	[FIX] forwardport: flag statuses as `recursive` I'd been convinced this was an ORM error because the field is not recursive... in runbot_merge, in forwardbot it is and thus does indeed need to be flagged to avoid the warning.	2023-06-14 16:01:45 +02:00
Xavier Morel	2009177ada	[IMP] *: allow disabling staging on branch, remove fp target flag - currently disabling staging only works globally, allow disabling on a single branch - use a toggle - remove a pair of tests which work specifically with `fp_target`, can't work with `active` (probably) - cleanup search of possible and active stagings, add relevant indexes and use direct search of relevant branches instead of looking up from the project - also use toggle button for `active` on branches - shitty workaround for upgrading DB: apparently mail really wants to have a `user_id` to do some weird thing, so need to re-add it after resetting everything Fixes #727	2023-06-14 16:01:42 +02:00
Xavier Morel	4a4252b4b9	[FIX] runbot_merge: holes in provisioning - github logins are case-insensitive while the db field is CI the dict in which partners are stored for matching is not, And the caller may not preserve casing. Thus it's necessary to check the casefolded database values against casefolded parameters, rather than exactly. - users may get disabled by mistake or when one leaves the project, they may also get switched from internal to portal, therefore it is necessary to re-enable and re-enroll them if they come back. - while at it remove the user's email when they depart, as they likely use an organisational email which they don't have access to anymore Side-note, but remove the limit on the number of users / partners being created at once: because there are almost no modules in the mergebot's instance, creating partner goes quite fast (compared to a full instance), thus the limitation is almost certainly unnecessary (creating ~300 users seems to take ~450ms). Fixes ##776	2023-06-14 16:01:42 +02:00
Xavier Morel	611f9150ff	[IMP] runbot_merge: add signed kw support to from_role, use it Closes #774	2023-06-14 16:01:42 +02:00
Xavier Morel	485d2d7b55	[IMP] runbot_merge: add sitemap params to http controllers When it's missing, website complains because it's dumb. Fixes #763	2023-06-14 16:01:42 +02:00
Xavier Morel	4f237d15b0	[FIX] runbot_merge: correctly check request in handle_pr `652b1ff9ae` wanted to check if a request was available, however it deref'd the `request` object without checking it which is not correct: a `request` normally has an `httprequest`, but the `request` itself might be missing if the handler is called from e.g. a cron. Fixes #739	2023-06-14 16:01:42 +02:00
Xavier-Do	a80dc25699	[FIX] runbot: only top parent is marked killable	2023-06-14 10:30:02 +02:00
Xavier-Do	9cf750119d	[IMP] runbot: make search case insensitive	2023-06-02 17:11:54 +02:00
Xavier-Do	03667b703c	[IMP] runbot: fallback on python step for upgrade step	2023-06-02 14:44:48 +02:00
Xavier-Do	ad4131789c	[FIX] runbot: fix false positive message	2023-06-02 11:10:51 +02:00
Xavier-Do	7845a718b7	[IMP] runbot: add use_ssl to settings	2023-06-02 10:49:05 +02:00
Martin Trigaux	8153fdc4b2	[FIX] runbot: download over https Firefox blocks downloads from http link if you are on an https page Allow to deactivate via an ICP in case the runbot is configured over HTTP (you shouldn't really)	2023-06-02 10:49:05 +02:00
Christophe Monniez	236554b588	[FIX] runbot: catch exceptions during db drop When the runbot tries to drop a local database, if the that raises an exception, it goes in a loop failure. It mays happen for example if someone forgot to close a psql during an investigation :-) With this commit, the exceptions are catched and at least the database name is logged.	2023-06-01 16:16:06 +02:00

1 2 3 4 5 ...

1517 Commits