1508 Commits

Author SHA1 Message Date
Xavier-Do
2a82f3c1f7 [FIX] runbot: fw-bot codeowner fix 2022-12-01 11:29:25 +01:00
Xavier Morel
7948e59c51 [FIX] *: fix forward port inserter if last branch is disabled
In the case where the last branch (before the branch being frozen) is
disabled, the forwardport inserter screws up, and fails to correctly
create the intermediate forwardports from the new branch.

Also when disabling a branch, if there are FW PRs which target that
branch and have not been forward-ported further, automatically
forward-port them as if the branch had been disabled when they were
created. This should limit data loss and confusion.

Also change the message set on PRs when disabling a branch: because of
user conflicts in test setup, the message about a branch being
disabled would close the PRs, which would then orphan the followup,
leading to unexpected / inconsistent behaviour.

Fixes #665
2022-12-01 10:57:32 +01:00
Xavier Morel
ac4047ec2d [IMP] conftest: support for model methods 2022-12-01 10:57:32 +01:00
Xavier Morel
e20277c6ad [FIX] forwardport: storage of old garbage, repo cache sizes
Since the forwardport bot works off of PRs, when it was created
leveraging the magic refs of github (refs/pull/*/head) seemed
sensible: when updating the cache repo, the magic refs would be
updated alongside and the forward-porting would have all the commits
available.

Turns out there are a few issues with this:

- the magic refs have become a bit unreliable, so we need a fallback
  (b1c16fce8768080d30430f4e6d3788b60ce13de7)
- the forwardport bot only needs the commits on a transient basis, but
  the magic refs live forever and diverge from all other content
  (since we rarely `merge` PRs). For a large repo with lots of PRs
  this leads to a significant inflation in the size of the repo (both
  on-disk and in object count), e.g. odoo/odoo has about 25% more
  objects with the pull refs included (3486550 to 4395319) and takes
  nearly 50% more space on disk (1.21G to 1.77G)

As a result, we can just stop configuring the pull refs entirely, and
instead fetch the heads (~refs) we need as we need them. This can be a
touch more expensive at times as it requires two `fetch` calls, and
we'll have a few redundant calls as every forward port of a chain will
require a fetch of the root's head. *However* it will avoid retrieving
PR data for e.g. PRs to master, as they don't get forward-ported, and
the cache also won't get parasite updates from PRs being ignored or
eventually closed.

Eventually this could be optimised further by

- performing an update of the cache repo at the start of the cron iff
  there are batches to port
- creating a temp clone for the batch
- fetching the heads of all PRs to forward port in the temp clone in a
  single `fetch`
- performing all the ports by cloning the temp clone (and not
  `fetch`-ing into those)
- then cleaning up the temp clone after having cleaned up individual
  forward port clones
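
The single-`fetch` step in the list above could look roughly like the sketch below. It only builds the `git` command line and runs nothing. The helper name `batch_fetch_command`, the remote name and the PR numbers are hypothetical, and it assumes the remote still serves the `refs/pull/<n>/head` refs on request even though they are no longer configured in the cache repo.

```python
from typing import Iterable, List


def batch_fetch_command(remote: str, pr_numbers: Iterable[int]) -> List[str]:
    """Build one `git fetch` retrieving the head of every PR to port.

    Hypothetical helper: the refs are named explicitly for this one
    fetch instead of being configured as a permanent refspec.
    """
    refs = [f"refs/pull/{number}/head" for number in sorted(set(pr_numbers))]
    if not refs:
        raise ValueError("nothing to fetch")
    return ["git", "fetch", "--quiet", remote, *refs]
```

Deduplicating the PR numbers means a chain whose ports share a root only asks for that root's head once per batch.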

Small followup for #489
2022-12-01 10:57:32 +01:00
Xavier Morel
c35b721f0e [IMP] forwardport: gc/maintenance of local repo caches
The current system makes / lets GC run during fetching. This has a few
issues:

- the autogc consumes resources during the forward-porting
  process (not that it's hugely urgent but it seems unnecessary)
- the autogc commonly fails due to the combination of large repository
  (odoo/odoo) and low memory limits (hardmem for odoo, which get
  translated into soft ulimits)

As a result, the garbage collection of the repository sometimes stops
entirely, leading to an increase in repository size and a decrease in
performance.

To mitigate this issue, disable the automagic gc and maintenance
during normal operation, and instead add a weekly cron which runs an
aggressive GC with memory limits disabled (as far as they can get, if
the limits are imposed externally there's nothing to be done).
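
A minimal sketch of that mitigation, assuming a POSIX host. `gc.auto`, `maintenance.auto` and `git gc --aggressive` are standard git; the function names are hypothetical, and treating the memory cap as an address-space ulimit (`RLIMIT_AS`) is an assumption, not something the commit states.

```python
import resource
import subprocess
from typing import List, Tuple


def disable_auto_maintenance(repo: str) -> List[List[str]]:
    """Commands switching off automatic gc/maintenance for a cache repo."""
    return [
        ["git", "-C", repo, "config", "gc.auto", "0"],
        ["git", "-C", repo, "config", "maintenance.auto", "false"],
    ]


def lifted(limits: Tuple[int, int]) -> Tuple[int, int]:
    """Raise the soft limit to the hard limit.

    This is the most a process can do for itself: a hard limit imposed
    externally stays in place.
    """
    _soft, hard = limits
    return (hard, hard)


def _lift_memory_limit() -> None:
    # Assumption: the memory cap is an address-space soft ulimit.
    current = resource.getrlimit(resource.RLIMIT_AS)
    resource.setrlimit(resource.RLIMIT_AS, lifted(current))


def weekly_gc(repo: str) -> None:
    """Aggressive in-place GC, meant to be called from a weekly cron."""
    subprocess.run(
        ["git", "-C", repo, "gc", "--aggressive"],
        check=True,
        preexec_fn=_lift_memory_limit,  # runs in the child, before exec
    )
```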

The maintenance is implemented using a full lockout of the
forward-port cron and an in-place GC rather than a copy/gc/swap, as
doing this maintenance at the small hours of the week-end (sat-sun
night) seems like a non-issue: currently an aggressive GC of odoo/odoo
(using the default aggressive options) takes a total of 2h30 wallclock
(5h user) on a fairly elderly machine (it's closer to 20mn wallclock
and 2h user on my local machine, also turns out the cache repos are
kinda badly configured leading to ~30% more objects than necessary
which doesn't help).

For the record, a fresh checkout of odoo/odoo right now yields:

    | Overall repository size      |           |
    | * Commits                    |           |
    |   * Count                    |   199 k   |
    |   * Total size               |   102 MiB |
    | * Trees                      |           |
    |   * Count                    |  1.60 M   |
    |   * Total size               |  2.67 GiB |
    |   * Total tree entries       |  74.1 M   |
    | * Blobs                      |           |
    |   * Count                    |  1.69 M   |
    |   * Total size               |  72.4 GiB |

If this still proves insufficient, a further option would be to deploy
a "generational repacking" strategy:
https://gitlab.com/gitlab-org/gitaly/-/issues/2861 (though apparently
it's not yet been implemented / deployed on gitlab so...).

But for now we'll see how it shakes out.

Close #489
2022-12-01 10:57:32 +01:00
Xavier Morel
985aaa5798 [FIX] runbot_merge: lock-in statuses after a staging has finished
The `statuses` field of a staging is always "live" because it's a
computed non-stored field. This is an issue when a staging finishes in
whatever state, then someone gets new statuses sent on one of the head
commits, either by rebuilding (part of) the staging or by just using
the same commit for one of their branches.

This makes the reporting of the main dashboard confusing, as one might
look at a failed staging and see all the required statuses
successful. It also makes post-mortem analysis more complicated as the
logs have to be trawled for what the statuses used to be (and they
don't always tell).

Solve this by storing a snapshot of the statuses the first time a
staging moves away from `pending`, whether it's to success or failure.
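
The snapshot idea can be illustrated with a toy class. This is not the actual Odoo model: `Staging`, `fetch_live` and `set_state` are hypothetical names standing in for the computed field and the state transition.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Statuses = Dict[str, str]


@dataclass
class Staging:
    """Toy stand-in for a staging record."""
    fetch_live: Callable[[], Statuses]  # reads current statuses of the heads
    state: str = "pending"
    snapshot: Optional[Statuses] = None

    @property
    def statuses(self) -> Statuses:
        # Once a snapshot exists, later status updates are ignored.
        if self.snapshot is not None:
            return self.snapshot
        return self.fetch_live()

    def set_state(self, new_state: str) -> None:
        # Freeze the statuses the first time the staging leaves `pending`.
        if self.state == "pending" and new_state != "pending" and self.snapshot is None:
            self.snapshot = dict(self.fetch_live())
        self.state = new_state
```

With this shape, a rebuild that later turns a status green no longer changes what a failed staging reports.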

Fixes #667
2022-12-01 10:57:32 +01:00
Xavier Morel
57a176ac87 [ADD] runbot_merge: multi-commit squash mode
Fixes #672
2022-12-01 10:57:32 +01:00
Xavier Morel
1a5c143a00 [FIX] runbot_merge: make timestamps and batch labels selectable
In the branch lists of stagings, the timestamps in the left column and
the labels in the data cells can not be selected, because they're
buttons and anyway bootstrap explicitly sets

    .btn {
        ...
        user-select: none;
    }

This can be frustrating, as timestamps and labels are useful
information to cross-reference, the ability to copy them is
convenient.

Custom-set the reverse via our own CSS.

Fixes #668
2022-12-01 10:57:32 +01:00
Xavier Morel
afe4d13eeb [FIX] forwardport: fix pinging on forwardport PRs
- avoid pinging the author of the fw PR (which is the forward-bot
  itself)
- instead ping the author and reviewer of the source, and possibly the
  reviewer of the PR if any
- might also be a good idea to ping reviewers of intermediate PRs?
2022-12-01 10:57:32 +01:00
Xavier Morel
5f08100f3a [REV] runbot_merge: don't close PRs when deactivating branches
Partially revert 0c882fc0df

This turns out to be more bane than boon, as it breaks forward-port
chains and confuses people (despite the message). Update notification
message and don't close the PR anymore.

While at it, disable any pending staging on the branch being deactivated.

Fixes #654
2022-12-01 10:57:32 +01:00
Xavier Morel
b45ecf08f9 [IMP] forwardport: handling of missing magic refs
Github can fail to create the magic refs on PRs
(`refs/pull/?/head`). Since forwardport relies on these refs to fetch
PR content this is an issue when it occurs, as the forward ports fail
in a loop.

After discussion with Github support, it turns out Github enabled
`allowReachableSHA1InWant` a while back, meaning it's possible to
fetch content by commit (rather than ref) as long as the content is
"in network".

Use this property as fallback when checking if we can see the PR head
before forward porting.
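
A sketch of that fallback, under the assumption that the check amounts to two fetch attempts. `fetch_pr_head` and the injectable `run` callback are hypothetical; the second attempt only works because the server accepts reachable SHA-1s in `want` lines, as described above.

```python
import subprocess
from typing import Callable, List

Runner = Callable[[List[str]], int]  # returns the command's exit code


def _run(argv: List[str]) -> int:
    return subprocess.run(argv, capture_output=True).returncode


def fetch_pr_head(remote: str, number: int, sha: str, run: Runner = _run) -> bool:
    """Fetch a PR head by magic ref, falling back to the commit itself."""
    if run(["git", "fetch", remote, f"refs/pull/{number}/head"]) == 0:
        return True
    # The magic ref is missing or broken: ask for the full commit id.
    return run(["git", "fetch", remote, sha]) == 0
```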

Also:

- remove explicit configuration of GC during fetch, it doesn't disable
  the autogc (yet?) but that's likely going to happen anyway
- update logging and logger hierarchy during forward port to make
  things clearer and easier to extract, although based on PR id rather
  than number
- rate limit failing forward ports to avoid running them on every cron
  (~ every minute), run them every ~30mn instead, this provides higher
  odds of recovery with less log garbage in case of transient github
  failure, and if the PR is stuck it limits the log pollution
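
The rate limiting in the last item could be as simple as the check below. This is a sketch: `should_attempt` and `RETRY_DELAY` are hypothetical names, and the real implementation may track failures differently.

```python
from datetime import datetime, timedelta
from typing import Optional

RETRY_DELAY = timedelta(minutes=30)  # the "~30mn" mentioned above


def should_attempt(last_failure: Optional[datetime], now: datetime) -> bool:
    """Whether a forward port should run on this cron tick."""
    if last_failure is None:
        return True  # never failed: run on every tick
    return now - last_failure >= RETRY_DELAY
```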

Fixes #658
2022-12-01 10:57:32 +01:00
Xavier Morel
fb8f44dd01 [FIX] runbot_merge: 15.0 compatibility (t-raw deprecation)
MERGEBOT-H9
2022-12-01 10:57:32 +01:00
Xavier-Do
ee9b3b7570 [FIX] runbot: avoid excessive log_counter updates 2022-11-30 14:07:55 +01:00
Xavier Morel
3e2db48786 [FIX] runbot_merge: more frontend templates
af016f4239 did a half-assed job and didn't fix the one test which
actually checks the dashboard.

TBF I was in a bit of a hurry trying to make the mergebot work and be
presentable again, but still...
2022-11-30 12:45:11 +01:00
Xavier-Do
7cdf77ce18 [IMP] runbot: hide some buttons
The "force new batch" buttons can sometimes be confusing for users.
Creating a group to show these buttons to advanced users only will help
avoid useless new batches when they are not needed.

A new batch is only needed:
- to create a new slot when a new trigger is added/modified through a
custom trigger
- to take the last databases into account for upgrades, mainly when
backporting a new field or doing strange, usually forbidden operations
- to avoid needing to push again to rebase when an r+ was added on one
pr but one of them needs to be rebased or adapted.

Those cases are unusual, but the button is mostly used on the
assumption that it is a kind of rebuild, or that it will rebase and
push the branch on the pr. Only users with basic knowledge of when it
is needed should have access to these buttons.
2022-11-29 16:07:34 +01:00
Xavier Morel
af016f4239 [FIX] runbot_merge: frontend templates & styles for 15.0
15.0 (or 14.0) dropped some of the BS3 (?) compatibility stuff, which
the mergebot was (apparently) relying on. This led to a visual
degradation as well as the frontend dropdown looking absolutely awful.

Fix that, on both style and templates.

15.0 (or 14.0) also dropped the bespoke responsive utility classes,
switch to bootstrap's.
2022-11-29 10:41:50 +01:00
Xavier-Do
3664eabd90 [FIX] runbot: manage empty dbname 2022-11-28 14:38:05 +01:00
Christophe Monniez
2cad0542f4 [IMP] runbot: queue build logs in a local database
Before this commit, the build ir_logging records were sent from the build
instance to the main runbot ir.logging table. As the number of runbot
hosts increases, this introduces a lot of concurrency.
e.g.: 80 hosts with 8 builds means 640 instances trying to insert
records in the ir.logging table.

With this commit, a special database is used on the builder host in
order to receive ir.logging's from the build instances.

Regularly, the table is emptied by the builder and the logs are inserted
in the runbot leader ir.logging table.
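
The queue-then-flush pattern can be sketched as follows. SQLite in-memory databases stand in for the builder's local database and the leader's database purely for illustration; the table layout and the `flush_logs` helper are assumptions, not the actual runbot schema or code.

```python
import sqlite3


def make_log_db() -> sqlite3.Connection:
    """Create a stand-in database holding a minimal ir_logging table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE ir_logging ("
        "id INTEGER PRIMARY KEY, build_id INTEGER, message TEXT)"
    )
    return db


def flush_logs(local: sqlite3.Connection, leader: sqlite3.Connection) -> int:
    """Move queued log rows from the builder's table to the leader."""
    rows = local.execute(
        "SELECT id, build_id, message FROM ir_logging ORDER BY id"
    ).fetchall()
    if not rows:
        return 0
    leader.executemany(
        "INSERT INTO ir_logging (build_id, message) VALUES (?, ?)",
        [(build_id, message) for _id, build_id, message in rows],
    )
    leader.commit()
    # Delete only what was copied, so rows queued meanwhile survive.
    local.execute("DELETE FROM ir_logging WHERE id <= ?", (rows[-1][0],))
    local.commit()
    return len(rows)
```

The leader then sees one batched writer per builder host instead of one writer per build instance.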
2022-11-28 06:46:49 +01:00
Xavier-Do
66e37b9323 [FIX] runbot: re-enable blacklist 2022-11-25 10:41:52 +01:00
Xavier-Do
891d2d71e8 [IMP] runbot: add draft detection form titl 2022-11-24 16:24:42 +01:00
Christophe Monniez
3a9832d747 [IMP] runbot: add a close error wizard
When marking multiple build errors as fixed, it's sometimes necessary to
explain why it was decided to close the error. When working with a few
errors, this can be done manually ... But most of the time we want to
close a lot of false negatives in batch.

With this commit, a simple wizard is made available that will post a
reason in the chatter of the build_errors.
2022-11-24 15:18:14 +01:00
Xavier-Do
1e8e059734 [FIX] fix codeowner 2022-11-24 15:17:53 +01:00
Xavier-Do
ff41311cb5 [FIX] runbot: add coverage access 2022-11-22 13:19:38 +01:00
Xavier-Do
c5e42b5529 [IMP] runbot: add missing fields 2022-11-22 13:19:38 +01:00
Xavier-Do
2b53455a9c [IMP] runbot: make ownership multi_edit 2022-11-22 10:34:24 +01:00
Xavier-Do
cd1360d716 [IMP] runbot: add search view 2022-11-22 10:12:48 +01:00
Xavier-Do
0ca706c56c [IMP] runbot: add team manager group 2022-11-22 09:34:29 +01:00
Xavier-Do
22abf95bca [FIX] runbot: ownership improvements
Tweaking the view to make it easier to use
2022-11-22 09:34:29 +01:00
Xavier-Do
f2d71a0b79 [FIX] runbot: fix pull info 2022-11-21 16:48:54 +01:00
Xavier-Do
2e77a55ddb [IMP] runbot: add codeowner management 2022-11-21 16:32:25 +01:00
Xavier-Do
410a01d13b [REF] runbot: move teams stuff 2022-11-21 16:32:25 +01:00
Xavier Morel
57162547e0 [FIX] runbot_merge: Odoo 15.0 + Py3.10 compat
Turns out I was running "15.0" except just on the runbot, enterprise
and community were still the 14.0 repos, so some of the changes were
missing.

While at it, bundle fixes for 3.10, as that's what Jammy needs, and
the mergebot/15.0 will be running on that.
2022-11-17 10:30:04 +01:00
Xavier-Do
2ca7a3de6e [FIX] runbot: trigger with no config fallback 2022-11-09 12:29:00 +01:00
Xavier-Do
f72c4a8baa [FIX] runbot: fix cleanup 2022-11-08 15:19:33 +01:00
Xavier-Do
e0856b2245 [IMP] runbot: improve build_error management
The build error view was unstructured and contained too much information.

This commit organizes fields in groups and also validates some
modifications on records in order to prevent build error managers from
disabling test-tags by mistake.

An error cannot be deactivated if it appeared less than 24 hours ago, to
avoid disabling a non-forward-ported pr that will fail the next nightly,
generating another build error.

Test tags can only be added or removed by administrators.

Also adds a menu for easier user management.

Also fixed the dname search and display.
2022-11-08 14:43:43 +01:00
Christophe Monniez
ac010405dc [FIX] runbot: use gethostname instead of getfqdn
At boot time, it appears that when the runbot tries to get its hostname
it may get a wrong fqdn.
2022-11-07 10:12:15 +01:00
Christophe Monniez
964a88cb36 [FIX] runbot: take number of builds from config data
Oversight from previous improvement in b79c4a5a52.
2022-11-07 10:12:15 +01:00
Christophe Monniez
04459cdda7 [FIX] runbot: proper line feed in default odoorc 2022-11-07 10:12:15 +01:00
Xavier Morel
1449937e00 [ADD] mergebot: support for coverage during tests
Runs the test instances of Odoo using `coverage` in parallel mode.

- useful for finding out under-tested parts of the code
- because it only instruments mergebot/forwardport, and the tests do
  so much IO, the wallclock performance impacts are minimal (~2%
  increase with branch coverage analysis, for an increase in CPU
  of ~20%, for the entire testsuite)
- for reporting, the scattered coverage reports need to be aggregated
  using `coverage combine`, followed by rendering with `coverage
  html`, these work out of the box, no parameterization is necessary
- coverage does not run on the test suite, only the modules under test
2022-10-27 11:25:25 +02:00
Xavier Morel
13f239826e [FIX] forwardport: avoid logging git error if there's no stream data
If no stream data was captured (no stderr and no stdout), would just
log

    git call error:

as error, with no further information.

Don't do that if we have neither stderr nor stdout data, since we're
re-raising the exception anyway, it's just confusing.
2022-10-26 14:47:00 +02:00
Xavier Morel
22c3406659 [FIX] forwardport: error reporting when git command fails
- if stderr was empty or had been redirected to stdout, no useful
  information would be shown, making debugging more complicated
- the fallback is the error itself, but since it's reraised odds are
  pretty high the caller will eventually log the error itself, so
  it's redundant

=> fallback to stdout if stderr is empty, and only log if either is
non-empty
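
A minimal sketch of that fallback, assuming the git call goes through `subprocess` with captured output. `error_output` and `run_git` are hypothetical names, not the forwardport module's actual helpers.

```python
import logging
import subprocess

_logger = logging.getLogger(__name__)


def error_output(error: subprocess.CalledProcessError) -> str:
    """Best available diagnostic: stderr, else stdout, else nothing."""
    for stream in (error.stderr, error.stdout):
        if not stream:
            continue
        if isinstance(stream, bytes):
            stream = stream.decode(errors="replace")
        if stream.strip():
            return stream
    return ""


def run_git(argv):
    try:
        return subprocess.run(argv, check=True, capture_output=True)
    except subprocess.CalledProcessError as e:
        output = error_output(e)
        if output:  # stay silent when there is nothing useful to report
            _logger.error("git call error: %s", output)
        raise  # the caller is expected to handle or log the exception
```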
2022-10-26 14:47:00 +02:00
Xavier Morel
6281c86d5e [FIX] conftest: local webhook support
Not sure how I missed this but apparently pytest fixtures can't both
return and yield, it's one or the other.

Which makes a lot of sense but means the tunnel fixture was broken
when using local webhooks (= no tunnel) as it returned the local url
rather than yield it.
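
The underlying behaviour is plain Python and can be shown without pytest: a function containing `yield` is a generator, and a `return value` inside it ends the generator without ever producing that value. The function names and URLs below are made up for illustration.

```python
def tunnel_broken(use_tunnel: bool):
    """Mixes `return value` and `yield`, like the broken fixture."""
    if not use_tunnel:
        # Ends the generator; the URL is never handed to the consumer.
        return "http://localhost:8069"
    yield "https://tunnel.example"


def tunnel_fixed(use_tunnel: bool):
    """Yields on every path, so the consumer always receives a value."""
    if not use_tunnel:
        yield "http://localhost:8069"
        return
    yield "https://tunnel.example"
```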
2022-10-26 14:47:00 +02:00
Xavier ALT
fffc27d2fa [FIX] runbot: fix creation of new runbot.version from the backend
Traceback (most recent call last):
  File "/home/odoo/src/odoo/15.0/odoo/addons/base/models/ir_http.py", line 237, in _dispatch
    result = request.dispatch()
  File "/home/odoo/src/odoo/15.0/odoo/http.py", line 687, in dispatch
    result = self._call_function(**self.params)
  File "/home/odoo/src/odoo/15.0/odoo/http.py", line 359, in _call_function
    return checked_call(self.db, *args, **kwargs)
  File "/home/odoo/src/odoo/15.0/odoo/service/model.py", line 94, in wrapper
    return f(dbname, *args, **kwargs)
  File "/home/odoo/src/odoo/15.0/odoo/http.py", line 348, in checked_call
    result = self.endpoint(*a, **kw)
  File "/home/odoo/src/odoo/15.0/odoo/http.py", line 916, in __call__
    return self.method(*args, **kw)
  File "/home/odoo/src/odoo/15.0/odoo/http.py", line 535, in response_wrap
    response = f(*args, **kw)
  File "/home/odoo/src/odoo/15.0/addons/web/controllers/main.py", line 1347, in call_kw
    return self._call_kw(model, method, args, kwargs)
  File "/home/odoo/src/odoo/15.0/addons/web/controllers/main.py", line 1339, in _call_kw
    return call_kw(request.env[model], method, args, kwargs)
  File "/home/odoo/src/odoo/15.0/odoo/api.py", line 464, in call_kw
    result = _call_kw_multi(method, model, args, kwargs)
  File "/home/odoo/src/odoo/15.0/odoo/api.py", line 451, in _call_kw_multi
    result = method(recs, *args, **kwargs)
  File "/home/odoo/src/odoo/15.0/odoo/models.py", line 6489, in onchange
    snapshot1 = Snapshot(record, nametree)
  File "/home/odoo/src/odoo/15.0/odoo/models.py", line 6271, in __init__
    self.fetch(name)
  File "/home/odoo/src/odoo/15.0/odoo/models.py", line 6281, in fetch
    self[name] = record[name]
  File "/home/odoo/src/odoo/15.0/odoo/models.py", line 5888, in __getitem__
    return self._fields[key].__get__(self, type(self))
  File "/home/odoo/src/odoo/15.0/odoo/fields.py", line 1054, in __get__
    self.recompute(record)
  File "/home/odoo/src/odoo/15.0/odoo/fields.py", line 1243, in recompute
    self.compute_value(recs)
  File "/home/odoo/src/odoo/15.0/odoo/fields.py", line 1265, in compute_value
    records._compute_field_value(self)
  File "/home/odoo/src/odoo/15.0/odoo/models.py", line 4255, in _compute_field_value
    getattr(self, field.compute)()
  File "/home/odoo/runbot/extra/runbot/models/version.py", line 36, in _compute_version_number
    version.number = '.'.join([elem.zfill(2) for elem in re.sub(r'[^0-9\.]', '', version.name).split('.')])
  File "/usr/lib/python3.8/re.py", line 210, in sub
    return _compile(pattern, flags).sub(repl, string, count)
Exception

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/odoo/src/odoo/15.0/odoo/http.py", line 643, in _handle_exception
    return super(JsonRequest, self)._handle_exception(exception)
  File "/home/odoo/src/odoo/15.0/odoo/http.py", line 301, in _handle_exception
    raise exception.with_traceback(None) from new_cause
TypeError: expected string or bytes-like object
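
The `TypeError` is raised because `re.sub` receives a non-string `version.name`, presumably a falsy value on a record that has no name yet. A guarded variant of the expression from the traceback might look like the sketch below; it is an illustration, not necessarily the fix applied in this commit, and returning an empty string for an unset name is an assumption.

```python
import re


def version_number(name) -> str:
    """Zero-padded version number, tolerant of an unset name."""
    if not name:  # e.g. a falsy name on a new, unsaved record
        return ""
    digits = re.sub(r"[^0-9\.]", "", name)
    return ".".join(elem.zfill(2) for elem in digits.split("."))
```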
2022-10-21 11:59:55 +02:00
Xavier ALT
1095b25270 [FIX] runbot: fix help msg, 'additionnal_env' is split on semicolon 2022-10-21 11:59:55 +02:00
Christophe Monniez
309aeaa32e [IMP] runbot: speedup build garbage collecting
When the builds directory is filled with a lot of build directories
(around 100000) the garbage collection process may take up to 2 minutes.
The root cause is that each build directory is scanned to clean it up
even if it was already cleaned.

With this commit, a stamp file is used to mark directories that were
already garbage collected.
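
The stamp-file approach can be sketched like this. The stamp file name, the `gc_builds` helper and the `clean` callback are hypothetical; the point is only that an already-stamped directory is skipped without being scanned.

```python
from pathlib import Path
from typing import Callable, List

STAMP = ".gc_done"  # hypothetical stamp file name


def gc_builds(builds_dir: Path, clean: Callable[[Path], None]) -> List[Path]:
    """Clean each build directory once, skipping already-stamped ones."""
    cleaned = []
    for build in sorted(p for p in builds_dir.iterdir() if p.is_dir()):
        stamp = build / STAMP
        if stamp.exists():
            continue  # already garbage collected: no need to scan it again
        clean(build)
        stamp.touch()
        cleaned.append(build)
    return cleaned
```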
2022-10-21 11:32:46 +02:00
Christophe Monniez
7642bffda3 [FIX] runbot: download phantomjs from nightly server
In order to avoid bitbucket rate limiting, we prefer to download this
old pal from our server.
2022-10-21 11:17:05 +02:00
Xavier-Do
903ee7d983 [FIX] runbot: manage falsy value for frontend_url 2022-10-21 10:39:20 +02:00
Xavier-Do
e27a1b8f71 [IMP] runbot: cleanup settings view 2022-10-21 10:32:40 +02:00
Xavier-Do
0287dcaab7 [IMP] runbot: update install documentation 2022-10-21 10:32:40 +02:00
Xavier-Do
303638e507 [FIX] runbot: don't hardcode user odoo 2022-10-21 10:32:40 +02:00