One Anchor Said -3.6%. 100 Anchors Said -0.5%. The Perils of Single-Anchor Decompositions.

Chart Library Team · 6 min read

The striking single-anchor finding

The cohort API at chartlibrary.io returns 500 nearest-neighbor historical patterns for any chart anchor, plus a decomposition layer that slices those 500 matches by catalyst proximity, sector, market-cap bucket, and intraday behavior. The slice output ranks conditions by how far each subgroup's forward-return median shifts from the full-cohort baseline.

Run on NVDA for 2026-04-14, the top slice was unmistakable: matches that formed inside an earnings window (±5 days from a quarterly filing) had a 10-day forward return of -3.17%, against a cohort baseline of +0.48%. Delta: -3.65 percentage points. n=29, 34.5% hit rate. The write-up almost wrote itself — 'anchor patterns near earnings tend to underperform.'

The red flag

The 95% bootstrap CI on that slice was [-6.64pp, +1.90pp]. It crossed zero. The effect was not statistically significant at the per-anchor level; it was just large in magnitude on a single cohort. Whenever a result like that turns up in our own pipeline, we owe ourselves the same check we would demand of someone else: does it generalize?
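To make the check concrete, here is a minimal Python sketch of a percentile bootstrap on a slice's median delta. It is not the production implementation: the function names, the 2,000-resample default, and the choice to hold the cohort baseline fixed while resampling only the slice are all assumptions made for illustration, and the input returns are synthetic.

```python
import random
import statistics


def bootstrap_delta_ci(slice_returns, baseline_median, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for (slice median - cohort baseline median), in pp.

    Assumption: the baseline is treated as fixed and only the slice is resampled.
    """
    rng = random.Random(seed)
    n = len(slice_returns)
    deltas = []
    for _ in range(n_boot):
        # Resample the slice with replacement, same size as the original slice.
        resample = [slice_returns[rng.randrange(n)] for _ in range(n)]
        deltas.append(statistics.median(resample) - baseline_median)
    deltas.sort()
    lo = deltas[int((alpha / 2) * n_boot)]
    hi = deltas[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


def crosses_zero(ci):
    """True when the interval includes zero, i.e. the sign of the effect is unresolved."""
    lo, hi = ci
    return lo <= 0.0 <= hi


# Synthetic 10-day returns (in pp) for a hypothetical 29-match slice.
demo_rng = random.Random(1)
demo_slice = [demo_rng.gauss(-3.0, 10.0) for _ in range(29)]
demo_ci = bootstrap_delta_ci(demo_slice, baseline_median=0.48)
print(demo_ci, crosses_zero(demo_ci))
```

Applied to the interval reported above, crosses_zero((-6.64, 1.90)) returns True, which is the red flag in code form.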

The 100-anchor test

We sampled 100 random anchors stratified across 2023-2025, filtered to mid/large/mega cap ($2B+). For each anchor we built the cohort through the production API (top_k=500, horizon=10d) and computed the within_earnings_window slice delta, bootstrap CI, stat_sig flag, and direction. Nine anchors failed (missing embedding); 91 were usable. The per-slice counts below are smaller still because not every usable anchor had each slice populated. We also computed a placebo: dividend_within_7d, a different catalyst-proximity flag that should NOT have the same effect if the earnings story is real.

  • Earnings slice (n=87 anchors): mean delta -0.52pp, median -0.38pp. 64% of anchors negative, 7% per-anchor stat-sig. Stouffer meta-Z = -4.49, p ≈ 7e-6.
  • Dividend placebo (n=89 anchors): mean delta -0.24pp, median -0.18pp. 67% of anchors negative, 9% per-anchor stat-sig. Stouffer meta-Z = -5.15, p ≈ 3e-7.
  • NVDA's -3.65pp was about 7× the population mean (-0.52pp) and about 10× the population median (-0.38pp): a textbook single-anchor outlier.
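For readers who have not met Stouffer's method, the sketch below shows the unweighted form: sum the per-anchor z-scores and divide by the square root of their count. How the pipeline derives each per-anchor z-score is not described here, so the unweighted combination and the function names are assumptions. The two-sided normal p-value conversion does reproduce the orders of magnitude quoted in the bullets.

```python
import math


def stouffer_z(z_scores):
    """Unweighted Stouffer combination: sum of z-scores divided by sqrt(k)."""
    k = len(z_scores)
    if k == 0:
        raise ValueError("need at least one z-score")
    return sum(z_scores) / math.sqrt(k)


def two_sided_p(z):
    """Two-sided p-value for a standard normal z, using the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2))


# Meta-Z values reported for the two slices.
print(two_sided_p(-4.49))  # earnings slice
print(two_sided_p(-5.15))  # dividend placebo
```

Under this unweighted form, a meta-Z of -4.49 over 87 anchors corresponds to an average per-anchor z of only about -0.48. That is in line with just 7% of anchors being individually significant: many weak, same-direction signals add up to a strong aggregate without any single anchor being convincing on its own.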

What the placebo tells us

Both slices are negative, and the dividend placebo is actually more statistically significant by Stouffer's Z. That's a problem for any clean 'earnings windows underperform' narrative.

The honest read is that some part of the signal is general event-proximity, not earnings-specific. Catalyst-adjacent windows pull in matches that tend to have been drawn from volatile or mean-reverting setups — the cohort retrieval doesn't condition on whether the match was 'pre-event' or 'post-event', and both phases leak into the slice. The earnings effect is real AND bigger than the dividend effect by mean delta (about 2×), but it's not the 5× gap a trader would need to treat 'earnings overlap' as a useful per-cohort filter.

What this means for agents

If you're building an agent on top of any historical-pattern decomposition API — ours or anyone else's — a single striking slice finding is not a result. It's a hypothesis. Three rules that fell out of this run:

  • Single-anchor slice findings should never be quoted as a generalization. On a 500-match cohort, a slice with n<50 tends to have a bootstrap CI wider than ±3pp (NVDA's n=29 slice spanned [-6.64pp, +1.90pp]), and a striking point estimate inside that range cannot be distinguished from noise.
  • Always pair a real catalyst slice with a placebo catalyst slice. If dividend_within_7d shows the same direction and magnitude as within_earnings_window, the story is event-proximity, not earnings.
  • When you need to quote an aggregate claim, run the decomposition across 100+ anchors and report Stouffer-Z, mean-delta, and percent-directionally-consistent. The single-anchor slice is an exploration tool, not a report.
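The second and third rules can be sketched as two small helpers: one that turns a list of per-anchor slice deltas into the aggregate numbers worth reporting, and one that compares a catalyst slice against its placebo. The function names and the toy deltas are hypothetical; only the two mean deltas passed to placebo_ratio come from the run above.

```python
import statistics


def summarize_deltas(deltas):
    """Aggregate per-anchor slice deltas (in pp) into report-level statistics."""
    if not deltas:
        raise ValueError("need at least one per-anchor delta")
    return {
        "n": len(deltas),
        "mean": statistics.fmean(deltas),
        "median": statistics.median(deltas),
        "pct_negative": 100.0 * sum(d < 0 for d in deltas) / len(deltas),
    }


def placebo_ratio(catalyst_mean, placebo_mean):
    """How many times larger the catalyst effect is than the placebo effect."""
    if placebo_mean == 0:
        raise ValueError("placebo mean is zero; ratio is undefined")
    return catalyst_mean / placebo_mean


# Toy per-anchor deltas, purely illustrative.
print(summarize_deltas([-1.0, -0.5, 0.25, -0.25]))

# Mean deltas from the 100-anchor run: earnings -0.52pp vs. dividend placebo -0.24pp.
print(round(placebo_ratio(-0.52, -0.24), 2))
```

A ratio near 2× with the same sign on both slices is what the event-proximity reading looks like in practice: the placebo moves with the catalyst instead of staying flat.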

What we still owe you

The 100-anchor aggregate is tighter than a single NVDA cohort but still softer than we would like. Two obvious follow-ups are queued:

  • Paired test: compute earnings_delta - dividend_delta within the same anchor. That controls for per-anchor baseline drift and isolates the earnings-specific component. If the paired mean is still negative and statistically significant, the earnings-specific claim survives.
  • Regime-stratified version: run the 100 anchors across VIX-quartile buckets. Earnings-window underperformance might be large in high-vol regimes and flat in low-vol ones — that would be a real actionable segmentation.
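The paired test in the first bullet can be sketched as follows. The dict-of-deltas input shape, the anchor ids, and the function names are assumptions for illustration; the sketch computes the within-anchor differences, a paired t statistic, and a percentile bootstrap CI on the mean difference, and leaves the p-value lookup to a statistics library.

```python
import math
import random
import statistics


def paired_differences(earnings_deltas, dividend_deltas):
    """earnings_delta - dividend_delta for anchors where both slices are populated.

    Both inputs are dicts keyed by a (hypothetical) anchor id, values in pp.
    """
    shared = sorted(set(earnings_deltas) & set(dividend_deltas))
    return [earnings_deltas[a] - dividend_deltas[a] for a in shared]


def paired_t_stat(diffs):
    """t statistic for the null hypothesis that the mean paired difference is zero."""
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two paired differences")
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.fmean(diffs) / standard_error


def bootstrap_mean_ci(diffs, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean paired difference."""
    rng = random.Random(seed)
    n = len(diffs)
    means = sorted(statistics.fmean(rng.choices(diffs, k=n)) for _ in range(n_boot))
    return means[int((alpha / 2) * n_boot)], means[int((1 - alpha / 2) * n_boot) - 1]


# Toy inputs: anchor "c" lacks a dividend slice and "d" lacks an earnings slice,
# so only "a" and "b" contribute paired differences.
earnings = {"a": -1.0, "b": -0.5, "c": 0.25}
dividend = {"a": -0.5, "b": -0.25, "d": 1.0}
diffs = paired_differences(earnings, dividend)
print(diffs, paired_t_stat(diffs))
```

Dropping anchors that lack either slice is the design choice that matters here: the comparison is only meaningful within an anchor, so unpaired anchors carry no information about earnings specificity.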

Both of those are a follow-up post, not a reason to hold the finding we already have. The single-anchor-to-population gap (-3.65pp → -0.52pp) is the story worth publishing now. It's an unglamorous number, but it's honest.

Update (same day): We ran the paired test. Across 85 anchors with both slices populated, the within-anchor difference (earnings slice median − dividend slice median) was -0.23pp, with a 95% bootstrap CI of [-0.48pp, +0.02pp] and paired-t p = 0.075. The CI straddles zero, and the p-value misses the conventional 0.05 threshold. The earnings-specific effect is statistically indistinguishable from a generic event-proximity artifact. Honest read: use the aggregate -0.5pp narrative against cohort baseline if you want, but don't claim earnings specificity. Regime-stratified follow-up is still queued.

The decomposition endpoint we used for this audit is live at /api/v1/cohort/{id}/decompose — returns slices + bootstrap CIs + stat-sig flags on any anchor. Agent builders: grab an API key at chartlibrary.io/developers. Single-anchor findings are your exploration tool; 100-anchor aggregates are your report.
