Add Azure DevOps history-driven slow-test threshold enricher (#9140)#9182
Add Azure DevOps history-driven slow-test threshold enricher (#9140)#9182Evangelink wants to merge 5 commits into
Conversation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR extends the Microsoft.Testing.Extensions.AzureDevOpsReport extension to support Azure DevOps history-driven slow-test surfacing: it fetches per-test duration history, computes p95/p99, lowers the “still running after …” threshold for historically-fast tests, and decorates emitted slow-test lines with the computed stats.
Changes:
- Added new slow-test history CLI options (
--report-azdo-slow-test-history*) plus help/info output updates and validation. - Reused the existing Azure DevOps history fetch to optionally collect durations and publish per-test p95/p99/sample-count stats.
- Introduced a new
AzureDevOpsSlowTestReporterplus pure helpers (PercentileCalculator,DurationHistoryStats,AzureDevOpsSlowTestThresholds,TestNodeIdentity) and accompanying unit/acceptance tests and localized resources.
Show a summary per file
| File | Description |
|---|---|
| test/UnitTests/Microsoft.Testing.Extensions.UnitTests/AzureDevOpsSlowTestHistoryTests.cs | Adds unit tests for percentile math, duration stats, threshold computation, and formatting. |
| test/UnitTests/Microsoft.Testing.Extensions.UnitTests/AzureDevOpsHistoryServiceTests.cs | Adds coverage for duration-stat aggregation when slow-test history is enabled/disabled. |
| test/IntegrationTests/Microsoft.Testing.Platform.Acceptance.IntegrationTests/HelpInfoAllExtensionsTests.cs | Updates --help/--info expectations to include new AzDO slow-test history options. |
| test/IntegrationTests/Microsoft.Testing.Platform.Acceptance.IntegrationTests/AzureDevOpsCommandLineTests.cs | Adds acceptance validation tests for the new command-line options. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/TestNodeIdentity.cs | Centralizes stable test-name resolution for history lookups. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/PercentileCalculator.cs | Implements nearest-rank percentile computation for client-side p95/p99. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/IAzureDevOpsHistoryService.cs | Extends history service contract to expose duration stats. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/DurationHistoryStats.cs | Adds immutable p95/p99/sample-count container + builder from raw durations. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/AzureDevOpsSlowTestThresholds.cs | Adds pure logic for threshold selection and compact duration formatting. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/AzureDevOpsSlowTestReporter.cs | New background scanner that emits slow-test lines with exponential backoff and optional history decoration. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/AzureDevOpsReporter.cs | Refactors to reuse TestNodeIdentity for stable naming. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/AzureDevOpsHistoryService.cs | Reuses history fetch to optionally retain duration samples and publish p95/p99 stats. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/AzureDevOpsHistoryClient.cs | Plumbs duration (durationInMs) from results into the in-memory model. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/AzureDevOpsExtensions.cs | Registers the new slow-test reporter in the AzDO extension pipeline. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/AzureDevOpsCommandLineProvider.cs | Adds new options, descriptions, and validation rules. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/AzureDevOpsCommandLineOptions.cs | Defines new option names and default constants for slow-test history. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/Resources/AzureDevOpsResources.resx | Adds resource strings for new options, validation, and slow-test output formatting. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/Resources/xlf/AzureDevOpsResources.cs.xlf | Localization updates for new resource strings. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/Resources/xlf/AzureDevOpsResources.de.xlf | Localization updates for new resource strings. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/Resources/xlf/AzureDevOpsResources.es.xlf | Localization updates for new resource strings. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/Resources/xlf/AzureDevOpsResources.fr.xlf | Localization updates for new resource strings. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/Resources/xlf/AzureDevOpsResources.it.xlf | Localization updates for new resource strings. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/Resources/xlf/AzureDevOpsResources.ja.xlf | Localization updates for new resource strings. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/Resources/xlf/AzureDevOpsResources.ko.xlf | Localization updates for new resource strings. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/Resources/xlf/AzureDevOpsResources.pl.xlf | Localization updates for new resource strings. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/Resources/xlf/AzureDevOpsResources.pt-BR.xlf | Localization updates for new resource strings. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/Resources/xlf/AzureDevOpsResources.ru.xlf | Localization updates for new resource strings. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/Resources/xlf/AzureDevOpsResources.tr.xlf | Localization updates for new resource strings. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/Resources/xlf/AzureDevOpsResources.zh-Hans.xlf | Localization updates for new resource strings. |
| src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/Resources/xlf/AzureDevOpsResources.zh-Hant.xlf | Localization updates for new resource strings. |
Copilot's findings
Comments suppressed due to low confidence (1)
src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/AzureDevOpsCommandLineProvider.cs:122
- When '--report-azdo' is not enabled, ValidateCommandLineOptionsAsync only checks '--report-azdo-slow-test-history' but does not check the dependent options '--report-azdo-slow-test-history-min-sample' / '--report-azdo-slow-test-history-multiplier'. As a result, users can pass these options without '--report-azdo' and validation will incorrectly succeed (errorMessage stays null). Add checks for these options in the "!IsOptionSet(--report-azdo)" block so they fail fast.
else if (commandLineOptions.IsOptionSet(AzureDevOpsCommandLineOptions.AzureDevOpsSlowTestHistory))
{
errorMessage = AzureDevOpsResources.AzureDevOpsSlowTestHistoryRequiresAzureDevOps;
}
else if (commandLineOptions.IsOptionSet(AzureDevOpsCommandLineOptions.AzureDevOpsStackFrameFilter))
{
errorMessage = AzureDevOpsResources.AzureDevOpsStackFrameFilterRequiresAzureDevOps;
}
- Files reviewed: 30/30 changed files
- Comments generated: 1
This comment has been minimized.
This comment has been minimized.
Evangelink
left a comment
There was a problem hiding this comment.
Note
🤖 Automated review by GitHub Copilot. Posted via a maintainer's GitHub token, so it appears under their account — the account owner did not write or approve this content personally. Generated by the Expert Code Review workflow. To request a follow-up action, reply by tagging @copilot directly.
| # | Dimension | Verdict |
|---|---|---|
| 2 | Threading & Concurrency | 🟠 1 MAJOR |
| 13 | Test Completeness & Coverage | 🟠 1 MAJOR |
| 1 | Algorithmic Correctness | 🟡 1 MODERATE |
✅ 19/22 dimensions clean.
-
Threading —
_active,_multiplier, and_minimumSampleCountinAzureDevOpsSlowTestReporterare written on the session-lifecycle thread and read concurrently inConsumeAsyncwithoutvolatileorInterlocked.*. The JIT may cache the pre-start value of_active = false, causing the reporter to silently skip all slow-test tracking. See inline comment for fix. -
Test Completeness —
AzureDevOpsSlowTestReporteris a 302-line class with no direct unit tests. The pure-logic helpers (PercentileCalculator,DurationHistoryStats,AzureDevOpsSlowTestThresholds) are well-tested, but the class's own behaviour is entirely covered by the acceptance tests, which are expensive to run and don't exercise fine-grained paths. The following scenarios have no test coverage at all:TF_BUILDnot set → warning emitted, scan loop not startedConsumeAsyncwithInProgressTestNodeStateProperty→ entry added to_inProgresswith correct thresholdConsumeAsyncwith a terminal state → entry removedScanLoopAsyncemission path: with history decoration vs. static fallback- Exponential-backoff doubling of
NextEmitThreshold OnTestSessionFinishingAsyncdrains and cancels the loop task
-
Correctness —
ValidateSlowTestHistoryMultiplierArgumentsAsyncacceptsmultiplier > 0without an upper bound.p99 * double.MaxValueoverflows to∞;TimeSpan.FromMilliseconds(∞)throwsOverflowExceptioninResolveThreshold, whichConsumeAsync's catch block silently swallows — disabling slow-test detection for every test that has a history entry. See inline comment for fix.
Other observations (non-blocking):
-
LogUnexpectedExceptionlogs atWarninglevel. Per codebase conventions (Dimension 8), unexpected exceptions from internal callbacks are more naturallyError-level (warnings signal user-actionable conditions). Low impact given the catch is guarding an enhancement feature, but worth aligning with convention in a follow-up. -
TestNodeIdentity.GetTestNameuses.OfType<SerializableKeyValuePairStringProperty>().FirstOrDefault(...)— allocates an enumerator per call. This was pre-existing inAzureDevOpsReporter; the refactoring into a shared helper is the right move and doesn't introduce new regressions.
🤖 Automated content by GitHub Copilot. Posted via a maintainer's GitHub token, so it appears under their account — the account owner did not write or approve this content personally. Generated by the Expert Code Review (on PR ready) workflow. · 562.4 AIC · ⌖ 13 AIC · ◷
- Make slow-test history truly no-op outside Azure DevOps (no output-device line, downgrade log to Trace) - Use volatile fields / Volatile.Read for cross-thread publication in AzureDevOpsSlowTestReporter - Cap --report-azdo-slow-test-history-multiplier at 10000 and guard ComputeThreshold against infinity/NaN - Avoid floating-point equality in TrimTrailingZero - Document the expected OperationCanceledException swallow Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This comment has been minimized.
This comment has been minimized.
…ng for slow-test history - Extract ScanOnceAsync and add direct unit tests for AzureDevOpsSlowTestReporter (no-op outside AzDO, emission, history decoration, exponential backoff, terminal removal, finish drain). - Validate that --report-azdo-slow-test-history-min-sample/-multiplier require --report-azdo-slow-test-history even when --report-azdo is absent (Copilot review note), with acceptance coverage. - Dogfood --report-azdo-slow-test-history in test/Directory.Build.targets, gated on TF_BUILD. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| double historyThresholdMs = stats.P99Milliseconds * multiplier; | ||
| if (double.IsNaN(historyThresholdMs) || double.IsInfinity(historyThresholdMs) || historyThresholdMs < 0) | ||
| { | ||
| // A pathological multiplier (or overflow to infinity) would make TimeSpan.FromMilliseconds throw; | ||
| // fall back to the static threshold instead of silently disabling slow-test detection. | ||
| return staticThreshold; | ||
| } | ||
|
|
||
| var historyThreshold = TimeSpan.FromMilliseconds(historyThresholdMs); | ||
| return historyThreshold < staticThreshold ? historyThreshold : staticThreshold; |
There was a problem hiding this comment.
Addressed in fb67002 — ComputeThreshold now short-circuits to the static threshold whenever historyThresholdMs >= staticThreshold.TotalMilliseconds (in addition to NaN/<=0), so the TimeSpan.FromMilliseconds conversion is skipped entirely for large finite values that would exceed TimeSpan.MaxValue. Added regression test ComputeThreshold_WhenHistoryExceedsTimeSpanMax_UsesStaticThreshold.
| // Exponential backoff so a genuinely stuck test does not spam the log: T, 2T, 4T, ... | ||
| test.NextEmitThreshold = TimeSpan.FromTicks(test.NextEmitThreshold.Ticks * 2); |
There was a problem hiding this comment.
Addressed in fb67002 — the exponential-backoff doubling now clamps at TimeSpan.MaxValue before doubling: currentTicks > TimeSpan.MaxValue.Ticks / 2 ? TimeSpan.MaxValue : TimeSpan.FromTicks(currentTicks * 2), so Ticks * 2 can no longer overflow into a negative value.
| if (TryGetWindowInDays(AzureDevOpsCommandLineOptions.AzureDevOpsFlakyHistory, out int flakyWindow)) | ||
| { | ||
| anyEnabled = true; | ||
| historyWindowInDays = Math.Max(historyWindowInDays, flakyWindow); | ||
| } |
There was a problem hiding this comment.
Addressed in fb67002 — documented the behavior in TryGetHistoryConfiguration: flaky-history and slow-test-history deliberately share a single query using the maximum of the two windows so neither feature is starved of data, and the comment now calls out that the flaky in last {N}d annotation reflects the effective (maximum) window when the slow-test window is larger.
…-azdo-slow-test-history # Conflicts: # src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport/AzureDevOpsExtensions.cs
This comment has been minimized.
This comment has been minimized.
…red history window Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| using Microsoft.Testing.Extensions.AzureDevOpsReport; | ||
| using Microsoft.Testing.Extensions.Reporting; | ||
| using Microsoft.Testing.Platform.CommandLine; |
🧪 Test quality grade — PR #918229 test methods graded (15 A, 14 B) across 5 changed files — none below B. The dominant B driver is the
This advisory comment was generated automatically. Grades are heuristic
|
Fixes #9140.
Summary
Adds an Azure DevOps history-driven slow-test threshold enricher to the existing
Microsoft.Testing.Extensions.AzureDevOpsReportpackage (issue Option A — reuses the existing history-fetch infrastructure rather than a new package).For tests with a known-short historical runtime it lowers the per-test
still running after Nsthreshold and decorates each emitted line with the historical p95/p99:[slow] still running after 5s: Foo (historical p95 = 2s, p99 = 3s, samples = 120)Knobs (best defaults)
--report-azdo-slow-test-history <days>--report-azdo.--report-azdo-slow-test-history-min-sample <N>--report-azdo-slow-test-history-multiplier <X>min(60s static, p99 * multiplier).Emission uses exponential backoff (T -> 2T -> 4T ...). Falls back to the static 60s threshold when history is unavailable/below min-sample, and silently no-ops outside Azure DevOps (
TF_BUILDnot set) or when env/token are missing.Implementation
AzureDevOpsHistoryServicenow loads when flaky or slow-test history is enabled, capturesdurationInMsonly when slow-test history is on, and exposesTryGetDurationStats(p95/p99/samples). No extra Azure DevOps round-trip.PercentileCalculator(nearest-rank),DurationHistoryStats,AzureDevOpsSlowTestThresholds(threshold/decoration/formatting),TestNodeIdentity(shared FQN history key).AzureDevOpsSlowTestReporterperforms the surfacing/backoff.Note on #9139
#9140 was specced to hook into
IProgressEnricherfrom #9139, which isn't merged yet. The surfacing here is therefore self-contained, with all decision logic isolated in pure methods and documented in-code to migrate ontoIProgressEnricheronce #9139 lands.Tests
Microsoft.Testing.Extensions.UnitTestsAzureDevOps suite: 143 passing.--help/--infoacceptance expectations updated (both blocks, alphabetically); xlf regenerated for all 13 locales.All new types are
internal(no public API surface added).