When PRs are not ready, don't just say they are waiting for CI when they
might in fact be unreviewed, and distinguish between *waiting* for CI
(pending) and having failed CI.
It's a bit weird and inconsistent to have a PR staged while unreviewed
or unapproved or whatever. If we compute the state based on
skipchecks then everything is consistent.
Also remove the implicit override of all statuses when explicitly
marking the PR as `ready`: it risks creating difficult-to-understand
states, and it's unnecessary since `skipchecks` gets set.
Also, as when setting skipchecks, set the current user as reviewer on
all PRs of the batch which don't have one.
- remove the `legal/cla` and `ci/runbot` context names, which I use a
  lot for historical reasons but which are fundamentally not useful to
  the tests; the `default` context is generally simpler.
- remove `make_branch` helper as we don't actually use branch
protection and at the end of the day it doesn't do much else
- convert a few explicit PR lookups to the project-wide `to_pr` helper
Move staging cancellation to the batch, remove its (complicated)
handling from the PRs.
This loses some precision in the cancellation message, but that could
likely be recovered (in part) by adding more precise checks &
diagnostic extractions in the compute.
Because `alone` (formerly p != 2) is selected before split PRs, if a
prioritised PR gets split (or a split PR gets prioritised) it will be
staged once as prioritised, and again as part of the split.
Improve the selection of ready batches to exclude split batches
upstream, such that they don't have to be rechecked over and over, and
their priorities don't cause us issues.
Simplifies the `ready_prs` query a bit and allows it to be converted to
an ORM search by moving the priority check outside. This also means the
caller needs far less post-processing of the records list than before.
`ready_prs` now returns *either* the "alone" batches, or the non-alone
batches, rather than mixing both into a single sequence. This requires
applying the search filters correctly, so as not to pick up the priority
of batches in error or targeting other branches.
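Roughly, the selection now works like this (a sketch only; model and field
names are assumptions, not the actual implementation):

    # batches ready for staging on this branch, excluding splits
    Batch = env['runbot_merge.batch']
    domain = [
        ('target', '=', branch.id),
        ('blocked', '=', False),
        ('split_ids', '=', False),
    ]
    # either the "alone" batches, or the non-alone ones, never a mix of both
    ready = Batch.search(domain + [('priority', '=', 'alone')]) \
         or Batch.search(domain + [('priority', '!=', 'alone')])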
Staging readiness is a batch-level concern, and many of the markers are
already there, though a few need to be aggregated from the PRs. As such,
staging has no reason to be performed in terms of PRs anymore; it should
be performed via batches directly.
There is a bit of a mess in order not to completely fuck up when
retargeting PRs (implicitly via the freeze wizard, or explicitly), as for
now we're moving PRs between batches in order to keep the
batches *mostly* target-bound.
Some of the side-effects of managing the coherence of the targeting
and moving PRs between batches are... not great. This might need to be
revisited and cleaned up with those scenarios better considered.
- `merge_date` should be common to an entire batch, so move it there
- remove `Batch.active` which should probably have been removed when
batches were made persistent (can eventually re-add as a proxy for
`merge_date` being set maybe, but for now removing it seems a better
way to catch mistakes)
- update various sites to use `Batch.merge_date` instead of
`Batch.active`
This probably has latent bugs, and is only the start of the road to v2
(#789): PR batches are now created up-front (alongside the PR), with
PRs attached and detached as needed, hopefully such that things are
not broken (tests pass but...). This required a fair number of
adjustments to code not taking batches into account, or creating
batches on the fly.
`PullRequests.blocked` has also been updated to rely on the batch to
get its batch-mates, such that it can now be a stored field with the
right dependencies.
The next step is to better leverage this change:
- move cross-PR state up to the batch (e.g. skipchecks, priority, ...)
- add fw info to the batch, perform forward-ports batchwise in order
to avoid redundant batch-selection work, and allow altering batches
during fw (e.g. adding or removing PRs)
- use batches to select stagings
- maybe expose staging history of a batch?
Not sure it's going to be useful, but it's hard to know if we can't
test it. The intent is mostly the ability to prioritize throughput (or
attempt to) during high-load events: if we can favour staging N new
batches over a split's N/2, we might be able to merge more crap. But
maybe not, we'll see; either way it's here now and seems to more or
less work.
Fixes #798
Because the mergebot crons are on such a tight schedule, and even just
finding out they have nothing to do can take them a while, disabling them
can be a chore. Disabling staging via the project is much less likely to
cause issues, as projects don't normally (or ever?) get exclusively
locked, so they can generally be written to at any moment.
Furthermore, if we ever get in a situation where we have multiple
active projects (not really the case currently, we have multiple
projects but only one is really active) it's less disruptive to
disable stagings on a single specific project.
Fixes #860
This commit revisits the commands set in order to make it more
regular, and limit inconsistent command-sets, although it includes
pseudo-command aliases for common tasks now removed from the core set.
Hard Errors
===========
The previous iteration of the commands set would ignore any non-command
term in a command line. This has been changed to a hard error (ignoring
the entire line) if any command is unknown or invalid.
This fixes inconsistent / unexpected interpretations where a user sends
a command, then writes a novel on the same line, some words of
which happen to *also* be commands, leading to merge states they did
not expect. They should now be told to fuck off.
Priority Restructuring
----------------------
The numerical priority system was pretty messy in that it confused
"staging priority" (in ways which were not entirely straightforward)
with overrides to other concerns.
This has now been split along all the axes (a rough field-level sketch
follows the list), with separate command subsets for:
- staging prioritisation, now separated between `default`, `priority`,
and `alone`,
- `default` means PRs are picked in an unspecified order when
creating a staging, if nothing better is available
- `priority` means PRs are picked first when staging; however, if
`priority` PRs don't fill the staging, the rest will be filled with
`default`. This mode did not previously exist
- `alone` means the PRs are picked first, before splits, and only
`alone` PRs can be part of the staging (which usually matches the
mode name)
- `skipchecks` overrides both statuses and approval checks, for the
batch, something previously implied in `p=0`, but now
independent. Setting `skipchecks` basically makes the entire batch
`ready`.
For consistency this also sets the reviewer implicitly: since
skipchecks overrides both statuses *and approval*, whoever enables
this mode is essentially the reviewer.
- `cancel` cancels any ongoing staging when the marked PR becomes
ready again; previously this was also implied (in a more restricted
form) by setting `p=0`
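As an illustration of the resulting flags, a minimal sketch (model and field
names other than `skipchecks` are assumptions, not the actual definitions):

    from odoo import fields, models

    class PullRequests(models.Model):
        _inherit = 'runbot_merge.pull_requests'  # model name assumed

        # staging prioritisation: default < priority < alone
        priority = fields.Selection([
            ('default', "Default"),    # picked in unspecified order, if nothing better
            ('priority', "Priority"),  # picked first, staging topped up with defaults
            ('alone', "Alone"),        # picked before splits, staged with nothing else
        ], default='default')
        # skips both statuses and approval checks for the whole batch
        skipchecks = fields.Boolean(default=False)
        # cancel any ongoing staging when the PR becomes ready again
        cancel_staging = fields.Boolean(default=False)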
FWBot removal
=============
While the "forwardport bot" still exists as an API level (to segregate
access rights between tokens) it has been removed as an interaction
point, as part of the modules merge plan. As a result,
fwbot stops responding
----------------------
Feedback messages are now always sent by the mergebot; the
forward-porting bot should not send any messages or notifications
anymore.
commands moved to the merge bot
-------------------------------
- `ignore`/`up to` simply change bots
- `close` as well
- `skipci` is now a choice / flag of an `fw` command, which denotes
the forward-port policy (examples after this list),
- `fw=default` is the old `ci` and resets the policy to default,
that is wait for the PR to be merged to create forward ports, and
for the required statuses on each forward port to be received
before creating the next
- `fw=skipci` is the old `skipci`, it waits for the merge of the
base PR but then creates all the forward ports immediately (unless
it gets a conflict)
- `fw=skipmerge` immediately creates all the forward ports, without
even waiting for the PR to be merged
This is a completely new mode, and may be rather broken as until
now the 'bot has always assumed the source PR had been merged.
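For illustration, assuming the mergebot is addressed as `@robodoo` (the
handle, combinations and comments below are only examples):

    @robodoo fw=default       (old `ci`: wait for merge, then for CI on each port)
    @robodoo r+ fw=skipci     (approve; once merged, create all forward ports at once)
    @robodoo fw=skipmerge     (create all forward ports immediately, merged or not)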
approval rework
---------------
Because of the previous section, there is no distinguishing feature
between `mergebot r+` = "merge this PR" and `forwardbot r+` = "merge
this PR and all its parents with different access rights".
As a result, the two have been merged under a single `mergebot r+`
with heuristics attempting to provide the best experience (sketched
after the list):
- if approving a non-forward port, the behavior does not change
- else, with review rights on the source, all ancestors are approved
- else, as author of the original, approves all ancestors which descend
from a merged PR
- else, approves all ancestors up to and including the oldest ancestor
to which we have review rights
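A rough sketch of these heuristics (one possible reading; `ancestors` and
`can_review` are hypothetical helpers, not the actual mergebot code):

    def prs_to_approve(pr, user):
        # not a forward port: approve just this PR, as before
        if not pr.source_id:
            return [pr]
        chain = ancestors(pr)  # oldest first, including `pr` itself
        # review rights on the source: approve the whole chain
        if can_review(user, pr.source_id):
            return chain
        # author of the original: approve ancestors descending from a merged PR
        if user == pr.source_id.author:
            return [p for p in chain if p.parent_id and p.parent_id.state == 'merged']
        # otherwise: from the oldest ancestor we can review, down to `pr`
        for i, ancestor in enumerate(chain):
            if can_review(user, ancestor):
                return chain[i:]
        return []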
Most notably, the source's author is not delegated on the source or
any of its descendants anymore. This might need to be revisited if it
proves too restrictive.
For the very specialized need of approving a forward-port *and none of
its ancestors*, `review=` can now take a comma (`,`) separated list of
pull request numbers (github numbers, not mergebot ids).
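For example, to approve two specific forward ports and nothing else (handle
and numbers illustrative, as above):

    @robodoo review=66205,66207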
Computed State
==============
The `state` field of pull requests is now computed. Hopefully this
makes the status more consistent and predictable in the long run, and
importantly makes status management more reliable (because the reference
data gets updated and naturally flows into the state).
For now however it makes things more complicated, as some of the states
have to be separately signaled or updated (a rough sketch follows the
list):
- `closed` and `error` are now separate flags
- `merge_date` is pulled down from forwardport and becomes the
transition signal for ready -> merged
- `reviewed_by` becomes the transition signal for approval (might be a
good idea to rename it...)
- `status` is computed from the head's statuses and overrides, and
*that* becomes the validation state
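A rough sketch of the resulting shape (simplified; the exact field and state
names may differ from the actual implementation):

    from odoo import api, fields, models

    class PullRequests(models.Model):
        _inherit = 'runbot_merge.pull_requests'  # model name assumed

        closed = fields.Boolean()
        error = fields.Boolean()
        merge_date = fields.Datetime()
        reviewed_by = fields.Many2one('res.users')
        skipchecks = fields.Boolean()
        # in reality computed from the head's statuses and the overrides
        status = fields.Selection([
            ('pending', "Pending"), ('failure', "Failure"), ('success', "Success"),
        ], default='pending')

        state = fields.Selection([
            ('opened', "Opened"), ('approved', "Approved"), ('validated', "Validated"),
            ('ready', "Ready"), ('merged', "Merged"), ('error', "Error"), ('closed', "Closed"),
        ], compute='_compute_state', store=True)

        @api.depends('closed', 'error', 'merge_date', 'reviewed_by', 'status', 'skipchecks')
        def _compute_state(self):
            for pr in self:
                if pr.closed:
                    pr.state = 'closed'
                elif pr.merge_date:
                    pr.state = 'merged'
                elif pr.error:
                    pr.state = 'error'
                elif pr.skipchecks or (pr.reviewed_by and pr.status == 'success'):
                    pr.state = 'ready'
                elif pr.reviewed_by:
                    pr.state = 'approved'
                elif pr.status == 'success':
                    pr.state = 'validated'
                else:
                    pr.state = 'opened'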
Ideally, batch-level flags like `skipchecks` should be on, well, the
batch, and `state` should have a dependency on the batch. However
currently the batch is not a durable / permanent member of the system,
so it's a PR-level flag and a messy pile.
One notable change is that *forcing* the state to `ready` now does that
but also sets the reviewer, `skipchecks`, and overrides to ensure the
API-mediated readying does not get rolled back by e.g. the runbot
sending a status.
This is useful for a few types of automated / programmatic PRs
e.g. translation exports, where we set the state programmatically to
limit noise.
recursive dependency hack
-------------------------
Given a sequence of PRs with an override of the source, if one of the
PRs is updated its descendants should not have the override
anymore. However if the updated PR gets overridden, its descendants
should have *that* override.
This requires some unholy manipulations via an override of `modified`,
as the ORM supports recursive fields but not recursive
dependencies (on a different field).
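The trick looks roughly like this (a sketch; field names and the exact
propagation are assumptions, not the actual code):

    from odoo import models

    class PullRequests(models.Model):
        _inherit = 'runbot_merge.pull_requests'  # model name assumed

        def modified(self, fnames, create=False, before=False):
            super().modified(fnames, create=create, before=before)
            # the ORM can express "recursive field" but not "my status depends
            # on my ancestor's overrides", so when overrides change, manually
            # propagate the invalidation down the forward-port chain
            if 'overrides' in fnames:
                descendants = self.search([('id', 'child_of', self.ids)]) - self
                if descendants:
                    descendants.modified(['overrides'])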
unconditional followup scheduling
---------------------------------
Previously, scheduling the forward-port followup was contingent on the
FW policy, but that's not actually correct if the new PR is *immediately*
validated (which can happen now that the field is computed, if there
are no required statuses *or* all of the required statuses are
overridden by an ancestor) as nothing will trigger the state change
and thus scheduling of the fp followup.
The followup function checks all the properties of the batch to port,
so this should not result in incorrect ports, although it is a bit more
expensive and will lead to more spam.
Previously this would not happen because on creation of a PR the
validation task (commit -> PR) would still have to execute.
Misc Changes
============
- If a PR is marked as overriding / canceling stagings, it now does
so on retry, not just when initially set.
This was not handled at all previously, so a PR in P0 going into
error due to e.g. a non-deterministic bug would be retried and still be
p=0, but the current staging would not get cancelled. Same when a PR
in p=0 goes into error because something failed, then is updated
with a fix.
- Add tracking to a bunch of relevant PR fields.
Post-mortem analysis currently generally requires going through the
text logs to see what happened, which is annoying.
There is a nondeterminism / inconsistency in the tracking which
sometimes leads the admin user to trigger tracking before the bot
does, leading to the staging tracking being attributed to them
during tests; this is shoved under the carpet by ignoring the user to
whom that tracking is attributed.
When multiple users update tracked fields in the same transaction
all the changes are attributed to the first one having triggered
tracking (?); I couldn't find why the admin sometimes takes over.
- added and leveraged support for enum-backed selection fields
- moved various fields from forwardport to runbot_merge
- fix a migration which had never worked and which never ran (because
I forgot to bump the version on the module)
- remove some unnecessary intermediate de/serialisation
fixes #673, fixes #309, fixes #792, fixes #846 (probably)
When an assertion on a status fails, it's possible that this is a 400-
or 500-type error which does not yield JSON data at all (e.g. forgot to
start the proxy or dummy), in which case trying to dump the JSON data
is actively harmful as it triggers cascading failures. And it's not
like the output message is formatted, so a re-structured assertion
message is not helpful.
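For example, an assertion helper along these lines avoids the problem (an
illustrative sketch, not the helper actually used):

    def describe(response):
        """Build an assertion message without assuming the body is JSON."""
        try:
            return response.json()
        except ValueError:  # e.g. proxy / dummy not running, body is HTML or empty
            return response.text

    # usage: assert r.ok, describe(r)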
- move all command parsing to runbot_merge as part of the long-term
unification effort (#789)
- set up an actual parser-ish structure to parse the commands to
something approaching a sum type (fixes #507)
- this is mostly prep for reworking the commands set (#673), although
*strict command parsing* has been implemented (cf update to
`test_unknown_commands`)
It's not amazing, because cross-typing between the different conftests
and utility modules doesn't really work smoothly. So not entirely
convinced it's great.
While at it, inline a few `x.json()` local aliases when the JSON is
used in exclusive locations, e.g. usually an assertion message on one
side and a result / process on the other.
Sadly m2ms don't support tracking, so add a bunch of ad-hoc tracking
to the override rights in order to know who, what, when at least.
Do the same for the review rights although maybe tracking works for
those.
Not sure why I didn't think about it previously (in #357), but it
would make sense for the entire batch creation to be atomic and thus
under the same lock.
The commit during forward porting also makes a lot less sense: it was
a failed early attempt at resolving the problem by hoping we'd win the
race with the webhook (commit before the webhook hit). By locking the
PRs table for update, we actually resolved it.
But since all that happens then is a few updates and then a commit by
the cron itself (it commits per batch), it's probably good enough to
leave the entire thing under the same lock. This means we lock out
other interactions a bit longer, but since the span is still just the
forward port of a single batch it should not be too much of an issue
outside of the post-freeze recovery thingie.
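The shape of the idea, as a sketch (table and field names assumed, not the
actual code):

    # take the row locks for the whole batch up front, do all the forward-port
    # work, then commit once, instead of committing early and hoping to beat
    # the webhook
    env.cr.execute("""
        SELECT id
          FROM runbot_merge_pull_requests
         WHERE id = ANY(%s)
           FOR UPDATE
    """, [batch.prs.ids])
    port_batch(batch)  # hypothetical: create the forward ports under the lock
    env.cr.commit()    # one commit per batch, which releases the locks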
It might not be a huge amount of extra work since we're never actually
retrieving the rows, but it still seems completely unnecessary.
Sadly we can't do something cleaner like an aggregation, because
aggregating requires moving the locking query to a subquery, and
experimentally that seems slower than just ignoring / discarding the
result set.
The "Error or traceback found in logs" message is sometimes confusing
since we don't know what is the cause of the issue.
The first idea is to split the two concept, error and traceback to have
a better idea of the cause of the issue
The second one will also log the content of the line in the error
message. For traceback, tries to get the complete traceback, getting all
indentend lines and one last non idented one.
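A minimal sketch of that extraction (illustrative, not the actual
implementation):

    def extract_traceback(lines, index):
        """From the matched line (e.g. "Traceback (most recent call last):"),
        collect every following indented line plus the final non-indented one
        (the exception itself)."""
        block = [lines[index]]
        for line in lines[index + 1:]:
            block.append(line)
            if line and not line.startswith((' ', '\t')):
                break  # first non-indented line closes the traceback
        return '\n'.join(block)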
While working on it, clean up slightly to partially get rid of
returning a dict, an artefact from odoo < 13.0.
Some triggers may have a significant depth, and nightly results can
take a long time to check.
A custom view was already done for the upgrade nightly, but it is hidden,
and the same logic could be applied to the distro builds, ...
This commit adds a custom view on the trigger, and a related controller
to display that custom view for a trigger.
The current logic to determine whether a build has logs or not is based
on is_docker_step. This logic is not always valid: since it is based on
step type and step content, there are many ways to end up with something
that is effectively a docker start, one of them being to call another
step. The main issue is that we don't always know before runtime whether
that other step is a docker step or not. Moreover, the logic became more
and more complex as more conditions were added, like commands, _run_,
docker_params, ...
Moreover, the log list is filled in before the build starts, meaning that
if the build is killed before completion, some logs may be listed but
missing. This is also an issue when trying to access logs before the
corresponding step has run.
This commit changes the logic by adding the log file to the list (and
to the start log) when starting a docker, which should give a more
consistent result.
The bootstrap version was frozen during a previous migration to avoid
losing too much time adapting the style again to fit the previous look
and feel.
Anyway, the latest version of bootstrap offers more flexibility around
theming, and it is a good opportunity to modernise the runbot interface
a little and answer some long-standing requests.
The main part of the adaptation is to tweak colors to match the previous
style, and to adapt some classes in xml views.
Some css rules are also tweaked to keep the same look without the need
to rewrite the xml views too much; this could be done in a future commit.
The current strategy to check whether a build can be killed is to ensure
that no unskipped slot points to the same build. Since the check is only
done after a prepare, it should ensure that the new slot has the build
linked before checking whether the previous batch can kill its build.
Since the slots are now filled later on (after the minimal check), it is
possible that a build is considered killable and killed before being
linked again.
To fix this, we just need to consider params instead of builds to
determine whether a build is killable or not. If the params of a build
are linked elsewhere, don't kill it.
The previous commit introduced the commit tree hash, but only when
calling get_commit_infos. This fixes get_ref and find_new_commit to
provide the same info.
Right now the commit is considered part of the build uniquifier. While
this makes sense for a check on the commit metadata, it does not for
execution tests where only the content matters.
This will allow marking a trigger as depending on the tree hash rather
than the commit, and avoid rebuilding when only a commit message was
fixed.
In order to keep things coherent, the cpu_limit computation can be done
in one place instead of being defined in all _run* methods.
In python steps, it can still be overridden when returning
docker_params.
When a build eats up all the CPU resources, it may interfere with other
builds and cause collateral damage.
With this commit, a new settings parameter `Containers CPUs` is added in
order to limit the usage of the available CPUs on runbot instances.
If left at 0, no limit is applied.
Otherwise, the cpu_quota docker parameter is computed as Containers
CPUs * (logical CPU count / nb parallel builds) * cpu_period, which
defaults to 100000.
e.g.:
- on a host with 16 logical CPU's
- with 8 parallel builds allowed
- with Containers CPUs set to 1.5
- with the default cpu_period
cpu_quota will be:
(16/8) * 1.5 * 100000 = 300000
This system parameter can be overridden by the `container_cpus` field on
steps.
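As a sketch of the computation (function and parameter names are
illustrative, not the actual runbot code):

    import multiprocessing

    def compute_cpu_quota(containers_cpus, parallel_builds, cpu_period=100000):
        """Apply the formula above; returns None when no limit should be set."""
        if not containers_cpus:
            return None  # 0 / unset means no limit
        logical_cpus = multiprocessing.cpu_count()
        return int(containers_cpus * (logical_cpus / parallel_builds) * cpu_period)

    # e.g. compute_cpu_quota(1.5, 8) on a 16-core host -> 300000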