Skip to content

fix: live pre-merge build-status revalidation (BERTE-602)#268

Open
charlesprost wants to merge 2 commits into
mainfrom
fix/BERTE-602-live-build-status-revalidation
Open

fix: live pre-merge build-status revalidation (BERTE-602)#268
charlesprost wants to merge 2 commits into
mainfrom
fix/BERTE-602-live-build-status-revalidation

Conversation

@charlesprost
Copy link
Copy Markdown
Contributor

Summary

  • Adds revalidate_build_status(), called immediately before the merge/queue decision in _handle_pull_request. It bypasses BUILD_STATUS_CACHE and fetches live state from the GitHub Actions API, filtering workflow runs strictly to the current w-branch before aggregating.
  • Adds AggregatedWorkflowRuns.state_for_branch(name): filters and deduplicates within a single branch, preventing a sibling branch sharing the same SHA from producing a false SUCCESSFUL.
  • Regression tests explicitly replaying the artesca#5155 incident shape (renamed source branch, same SHA on old and new w-branches, old completed/new in-progress → must raise BuildInProgress).

Root cause (artesca#5155, 2026-05-20)

The source branch was renamed; the new w-branch shared its commit SHA with the old w-branch. GitHub's API returned workflow runs for both names. AggregatedWorkflowRuns.state picked the best status across branches, returning SUCCESSFUL because the old w-branch completed. That result was latched into BUILD_STATUS_CACHE (which never overwrites SUCCESSFUL). When the peer approval landed 15 s later, Bert-E read SUCCESSFUL from cache and merged — while the actual push CI on the new w-branch was still running.

What this PR does

bert_e/git_host/github/__init__.py

  • state_for_branch(branch_name): filters _workflow_runs to branch_name before deduplicating by workflow_id, so a sibling branch cannot mask the result. Logs the filtered run snapshot (branch + runs + conclusion) for post-mortem diagnostics.

bert_e/workflow/gitwaterflow/__init__.py

  • revalidate_build_status(job, wbranches): live, cache-bypassing final check at the merge decision boundary. Calls get_commit_status() directly (always hits the API), uses state_for_branch for GitHub Actions keys, skips gracefully for hosts without get_commit_status (e.g. Bitbucket).
  • Called in _handle_pull_request after the interactive confirm and before build_queue_collection.

Test plan

  • test_artesca5155_renamed_branch_inprogress_blocked — exact incident shape, asserts BuildInProgress
  • test_artesca5155_old_aggregator_would_have_passed — documents the pre-fix bug: AggregatedWorkflowRuns.state returns SUCCESSFUL for the same data
  • test_pr5155_shape_raises_build_in_progress — same SHA on w-branch (in_progress) and destination (success)
  • Full matrix: bypass/no-key/no-method → skipped; clean PR → passes; failed/no-runs → correct exception
  • state_for_branch unit tests: cross-branch poisoning, dedup within branch, workflow_dispatch exclusion, no-runs case

🤖 Generated with Claude Code

Documents pytest binary location, GitHub release process, and
the devdocs release checklist reference.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@charlesprost charlesprost requested a review from a team as a code owner May 20, 2026 17:05
@charlesprost charlesprost force-pushed the fix/BERTE-602-live-build-status-revalidation branch from cd4b948 to 61b5e95 Compare May 20, 2026 17:13
@codecov
Copy link
Copy Markdown

codecov Bot commented May 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.51%. Comparing base (24545a4) to head (93143c7).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #268      +/-   ##
==========================================
+ Coverage   89.28%   89.51%   +0.22%     
==========================================
  Files          77       78       +1     
  Lines       10287    10493     +206     
==========================================
+ Hits         9185     9393     +208     
+ Misses       1102     1100       -2     
Flag Coverage Δ
integration 87.19% <23.80%> (-0.33%) ⬇️
tests 87.15% <23.80%> (-0.33%) ⬇️
tests-BuildFailedTest 26.68% <4.76%> (-0.12%) ⬇️
tests-QuickTest 34.23% <4.76%> (-0.15%) ⬇️
tests-RepositoryTests 26.34% <4.76%> (-0.11%) ⬇️
tests-TaskQueueTests 51.46% <11.90%> (-0.21%) ⬇️
tests-TestBertE 65.38% <23.80%> (-0.22%) ⬇️
tests-TestQueueing 53.69% <11.90%> (-0.22%) ⬇️
tests-api-mock 15.51% <0.48%> (-0.31%) ⬇️
tests-noqueue 77.35% <23.80%> (-0.28%) ⬇️
tests-noqueue-BuildFailedTest 26.68% <4.76%> (-0.12%) ⬇️
tests-noqueue-QuickTest 34.23% <4.76%> (-0.15%) ⬇️
tests-noqueue-RepositoryTests 26.34% <4.76%> (-0.11%) ⬇️
tests-noqueue-TaskQueueTests 51.46% <11.90%> (-0.21%) ⬇️
tests-noqueue-TestBertE 61.75% <23.80%> (-0.20%) ⬇️
tests-noqueue-TestQueueing 26.37% <4.76%> (-0.11%) ⬇️
tests-server 28.57% <0.97%> (-0.56%) ⬇️
unittests 42.80% <99.51%> (+1.16%) ⬆️
utests 27.38% <99.51%> (+1.47%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Introduces revalidate_build_status(), called immediately before the
merge/queue decision in _handle_pull_request. Unlike check_build_status,
it bypasses BUILD_STATUS_CACHE and fetches live state from the GitHub
Actions API, then filters workflow runs strictly to each wbranch's own
head_branch before aggregating.

Root cause (PR artesca#5155): a push run on the PR branch was still
in_progress when bert-e polled. The run shared a SHA with development/4,
which had already completed successfully. AggregatedWorkflowRuns.state
returned SUCCESSFUL because it grouped by branch then returned SUCCESSFUL
if *any* branch succeeded, poisoning the cache entry and hiding the
in-progress run on the w-branch.

New AggregatedWorkflowRuns.state_for_branch(name) filters and deduplicates
within a single branch before calling branch_state(), preventing cross-
branch SHA poisoning. revalidate_build_status uses this method. Hosts
without get_commit_status (e.g. Bitbucket) are skipped gracefully.

Tests added in test_github_build_status.py document the poisoning scenario
and verify the per-branch fix. test_revalidate_build_status.py covers the
full decision matrix: PR 5155 shape → BuildInProgress, clean PR → passes,
bypass/no-key/no-method → skipped, failed/no-runs → appropriate exception.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@charlesprost charlesprost force-pushed the fix/BERTE-602-live-build-status-revalidation branch from 61b5e95 to 93143c7 Compare May 20, 2026 17:34
Copy link
Copy Markdown
Contributor

@matthiasL-scality matthiasL-scality left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review — non-blocking observations

)


def _make_run(*, run_id, sha, branch, status, conclusion,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate helper_make_run is defined identically in test_github_build_status.py:232. Consider moving it to a shared conftest.py so both test modules can import it. Not a blocker, but avoids drift if the run schema changes.

that shares the same SHA cannot cause a false SUCCESSFUL result.
Unlike .state, this never returns SUCCESSFUL for a different branch.
"""
conclusion_ranking = {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate dedup logicconclusion_ranking and the best-by-workflow_id loop replicate the same logic already in remove_unwanted_workflows (lines 660–671). Consider extracting a private _best_runs_for(runs) helper that both methods can call. Not a blocker.

assert worst_status == 'SUCCESSFUL'


def revalidate_build_status(job, wbranches):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocker — long-term fix should be tracked as an improvement.

This function is a sound safety net for the immediate incident. However, the root cause is that BUILD_STATUS_CACHE is keyed only by SHA (BUILD_STATUS_CACHE[build_key][sha]), so two branches sharing the same SHA compete for the same cache slot. The "never overwrite SUCCESSFUL" guard then permanently locks the slot once any branch on that SHA succeeds.

The durable fix would be to include the branch name in the cache key:

# today
BUILD_STATUS_CACHE[build_key][sha]

# proposed
BUILD_STATUS_CACHE[build_key][(sha, branch_name)]

Each w-branch would then have its own independent cache slot; a renamed branch can never inherit a SUCCESSFUL result from its predecessor. This would also allow removing revalidate_build_status (or reducing it to a lightweight assertion) once the cache is correct.

This is not a blocker for this PR. Please open a follow-up Jira improvement ticket to track the cache key refactor.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants