Concept · Reference

Reading a cohort response —
a field guide.

A cohort intelligence response is dense. Read it well and you have a defensible multi-factor view of a (symbol, date) anchor. Read it badly and you over-anchor on the wrong field.

This is the field-by-field reference: what each block means, what to look at first, and the three judgment calls that separate a good read from a noisy one.

The response in full

A typical cohort_analyze response looks like this (formatted for clarity):

{
  "anchor": {
    "symbol": "NVDA",
    "date": "2024-08-05",
    "timeframe": "1h"
  },
  "cohort_size_actual": 300,
  "cohort_score": 0.81,
  "outcome_distribution": {
    "1":  {"n": 296, "mean": 0.21, "median": -0.04, "p10": -2.4, "p90": 2.7, "win_rate": 0.49, "std": 1.9},
    "5":  {"n": 298, "mean": -0.4, "median": -1.30, "p10": -11.3, "p90": 6.8, "win_rate": 0.44, "std": 7.1},
    "10": {"n": 295, "mean": -0.1, "median": -1.85, "p10": -15.1, "p90": 11.2, "win_rate": 0.46, "std": 10.4}
  },
  "feature_importance_5d": [
    {"feature": "credit_spread_state=tight",  "importance": 0.18, "direction": "positive"},
    {"feature": "macro_state=bullish",         "importance": 0.14, "direction": "positive"},
    {"feature": "vol_regime=low",              "importance": 0.12, "direction": "negative"},
    {"feature": "days_since_earnings",         "importance": 0.09, "direction": "negative"},
    {"feature": "sector_rs_60d",               "importance": 0.07, "direction": "positive"}
  ],
  "regime_stratification_5d": {
    "low_vol":   {"n": 84, "win_rate": 0.38, "median_return": -2.1},
    "high_vol":  {"n": 76, "win_rate": 0.51, "median_return":  0.4},
    "bull_macro":{"n": 122,"win_rate": 0.51, "median_return":  0.2},
    "bear_macro":{"n": 88, "win_rate": 0.34, "median_return": -3.6}
  },
  "anchor_metadata": {
    "momentum_5d":      0.041,
    "momentum_20d":     0.118,
    "sector_rs_60d":    0.067,
    "pct_off_ath":     -0.052,
    "days_since_earnings": 17,
    "realized_vol_20d": 0.342
  }
}

That’s 30+ fields. Most readers fixate on outcome_distribution.5.median and call it a forecast. That’s the biggest mistake you can make with a cohort response. The fields below the headline number matter more.

1. cohort_size_actual and cohort_score

cohort_size_actual is the number of analogs that survived all filters. By default we target 300 — but filters (regime, sector, news context, time-since-earnings) can narrow it. If you applied filters and cohort_size_actual dropped below ~30, the distribution statistics are too noisy to trust. The API warns when that happens.

cohort_score is a tightness measure — how concentrated the analogs are in embedding space. A high score (near 1) means the 300 analogs are genuinely similar to the anchor. A low score (near 0) means we’re scraping the bottom of the barrel for similarity. Anything below ~0.5 should be read with caution.

Read these two fields first. They tell you whether the rest of the response is meaningful.
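The two checks can be folded into a small gate before you read anything else. A minimal Python sketch, assuming the response shape shown above — the `cohort_trust` helper and its labels are ours for illustration, not part of the API:

```python
# Hypothetical helper: decide how much weight a cohort_analyze response deserves.
# The thresholds (~30 analogs, ~0.5 score) are the rules of thumb from this guide.

def cohort_trust(response):
    """Classify a response as 'trust', 'caution', or 'noisy'."""
    n = response["cohort_size_actual"]
    score = response["cohort_score"]
    if n < 30:
        return "noisy"    # too few analogs survived the filters
    if score < 0.5:
        return "caution"  # analogs are only loosely similar to the anchor
    return "trust"

print(cohort_trust({"cohort_size_actual": 300, "cohort_score": 0.81}))  # trust
```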

2. outcome_distribution — the headline numbers, in context

For each horizon (1, 5, 10 trading days forward), the response gives you:

  • median — the middle of the analog distribution
  • mean — the arithmetic average; can be skewed by tail events
  • p10 / p90 — the 10th and 90th percentiles; the “reasonable range” band
  • win_rate — fraction of analogs with positive forward return at that horizon
  • std — standard deviation of analog returns
  • n — analogs with complete forward returns at that horizon (may be smaller than cohort_size for very recent anchors)

How to read these together:

  1. Median vs mean. If they differ by more than ~50% in absolute terms, the distribution is skewed. Trust the median. In our example, 1d median is -0.04% but mean is +0.21% — a small right skew driven by a few big winners. Normal for short horizons.
  2. p10 / p90 spread. This is your reasonable downside / upside. NVDA 5d: -11.3% to +6.8%. That’s an asymmetric distribution — downside is wider than upside. Position size accordingly.
  3. Win rate vs mean. Win rate is the probability of a positive return; the mean carries magnitude, tails included. (The median can't disagree with the win rate — a win rate above 50% forces a positive median — so the mean is where the divergence shows up.) A cohort with a 60% win rate and a -0.5% mean has lots of small wins and a few big losses; the reverse (40% win rate, +0.5% mean) means a few large winners pulling the mean up. These are very different setups, and the win rate alone doesn't tell you which.
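The three checks above can be mechanized. A minimal Python sketch, assuming the per-horizon shape shown earlier — `read_horizon` and its specific skew cutoff are our illustration, not an API feature:

```python
def read_horizon(stats):
    """Apply the three outcome_distribution checks to one horizon's stats."""
    # 1. Median vs mean: a gap much larger than the median itself signals skew;
    #    when skewed, prefer the median as the central read.
    skewed = abs(stats["mean"] - stats["median"]) > 0.5 * abs(stats["median"])
    # 2. p10/p90: is the reasonable downside wider than the reasonable upside?
    downside_heavy = abs(stats["p10"]) > stats["p90"]
    # 3. Win rate travels alongside the central read; report both.
    return {
        "central": stats["median"] if skewed else stats["mean"],
        "downside_heavy": downside_heavy,
        "win_rate": stats["win_rate"],
    }

# The NVDA 5d horizon from the example response:
nvda_5d = {"n": 298, "mean": -0.4, "median": -1.30,
           "p10": -11.3, "p90": 6.8, "win_rate": 0.44, "std": 7.1}
print(read_horizon(nvda_5d))  # skewed, downside-heavy, 44% win rate
```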

3. feature_importance — directionality is the whole game

feature_importance_5d is the ordered list of features that separated winning analogs from losers within the 300-analog cohort. Each entry has:

  • feature — the feature name and (for categorical features) the value being indicated
  • importance — magnitude of the separation (Gini-style, 0 to 1)
  • direction — “positive” means analogs with this feature value outperformed; “negative” means they underperformed

The right read is to combine direction with whether the live anchor has that feature. From our example:

  • credit_spread_state=tight → positive (analogs in tight-credit periods outperformed)
  • macro_state=bullish → positive
  • vol_regime=low → negative (analogs in low-vol regimes underperformed)

Check anchor_metadata or call get_market_context to find out which of these the live anchor actually has. If the anchor is in tight credit + bullish macro + low vol, you have two positives and one negative. Net read: lean toward the positive analogs within the cohort, but the low-vol headwind is real.

Important: “positive” means correlated with positive forward returns within this cohort. It does NOT mean causal. Don’t reify the feature into an explanation; treat it as a conditioning variable.
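Netting the feature directions against the live anchor's state can be sketched in a few lines. Assumed shapes only — the `net_feature_read` helper and the hand-built `live` set are hypothetical; in practice the anchor's states come from anchor_metadata or get_market_context:

```python
def net_feature_read(importance, anchor_states):
    """Sum signed importances for the features the live anchor actually has."""
    total = 0.0
    for f in importance:
        if f["feature"] in anchor_states:
            sign = 1.0 if f["direction"] == "positive" else -1.0
            total += sign * f["importance"]
    return total

# Top three entries from the example feature_importance_5d:
importance = [
    {"feature": "credit_spread_state=tight", "importance": 0.18, "direction": "positive"},
    {"feature": "macro_state=bullish",       "importance": 0.14, "direction": "positive"},
    {"feature": "vol_regime=low",            "importance": 0.12, "direction": "negative"},
]
# Hypothetical live anchor: tight credit + bullish macro + low vol.
live = {"credit_spread_state=tight", "macro_state=bullish", "vol_regime=low"}
print(round(net_feature_read(importance, live), 2))  # two positives, one negative -> 0.2
```

A positive net is a conditioning lean, not a forecast — the same caveat as above.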

4. regime_stratification — the cohort isn't homogeneous

regime_stratification_5d splits the 300-analog cohort by the regime prevailing at each analog's date. The whole-cohort statistics blur regime differences; the stratification reveals them.

In our example:

  • Low-vol analogs: n=84, win rate 38%, median -2.1%
  • High-vol analogs: n=76, win rate 51%, median +0.4%
  • Bull-macro analogs: n=122, win rate 51%, median +0.2%
  • Bear-macro analogs: n=88, win rate 34%, median -3.6%

The headline cohort number (5d median -1.3%) hides a sharp split. If we’re in bull-macro right now, the regime-conditioned read is much more positive (+0.2% median, 51% win rate). If we’re in bear-macro, much more negative.

The judgment call: which regime are we in? get_market_context tells you. If the live regime is clearly one side, the regime-conditioned statistic dominates the headline statistic.

When the regime is mixed or transitioning, you can’t cleanly condition. Stay with the headline cohort and acknowledge the regime uncertainty.
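That decision rule is easy to encode. A hedged sketch — `choose_5d_read` is our helper, not an API call; the keys mirror the example response:

```python
def choose_5d_read(response, live_regime):
    """Prefer the regime-conditioned statistic when the live regime is clear;
    fall back to the headline cohort when it is mixed or unknown (None)."""
    strata = response["regime_stratification_5d"]
    if live_regime in strata:
        return strata[live_regime]
    return response["outcome_distribution"]["5"]

resp = {
    "outcome_distribution": {"5": {"median": -1.3, "win_rate": 0.44}},
    "regime_stratification_5d": {
        "bull_macro": {"n": 122, "win_rate": 0.51, "median_return": 0.2},
        "bear_macro": {"n": 88,  "win_rate": 0.34, "median_return": -3.6},
    },
}
print(choose_5d_read(resp, "bull_macro"))  # the regime-conditioned read
print(choose_5d_read(resp, None))          # mixed regime: headline cohort
```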

5. anchor_metadata — the input features

anchor_metadata is the live anchor’s feature values at the moment of the query. These are the same dimensions the cohort was retrieved against. Use them to check:

  • Does the anchor have the positive features? If feature_importance says days_since_earnings < 10 is negative and the anchor’s days_since_earnings = 17, that negative factor doesn’t apply. Adjust read up.
  • Where on the distribution does this anchor sit? realized_vol_20d = 0.342 is high vol. So we’re in the high-vol regime, which the stratification says has a 51% win rate. Use that.
  • Is the anchor anomalous? If most fields are extreme (very high momentum, very high vol, immediately post-earnings), the cohort might be drawing from analogs that don’t match the anchor’s actual situation. Cohort score will reflect this; double check.

The three judgment calls that matter

Reading a cohort response well comes down to three calls:

Call 1: Is the cohort tight enough to trust?

cohort_size_actual and cohort_score answer this. If size is below 30 after filters, or cohort_score is below 0.5, don’t make strong claims off the response. Loosen filters or accept the uncertainty.

Call 2: Headline cohort or regime-conditioned?

If the stratification reveals a sharp regime split AND the live anchor sits cleanly on one side, the regime-conditioned statistic is the right read. If the regime is transitioning or unclear, stay with the headline cohort.

Call 3: Which features are operative?

feature_importance + anchor_metadata together answer this. Don’t treat all features as equally operative. The ones that align with what the live anchor actually has are the ones to weight.

A read that holds up in front of a PM

Here’s the same NVDA · 2024-08-05 · 1h response, translated into the kind of paragraph an analyst would actually defend:

The 300-analog cohort suggests a slightly bearish 5-day outlook
(median -1.3%, win rate 44%, p10/p90 -11.3%/+6.8%). The signal
splits sharply on volatility regime: low-vol analogs win 38% of
the time vs 51% for high-vol. NVDA is currently in a high-vol
regime (realized_vol_20d 0.34, top quartile), which is the
better side of the split — 51% win rate, +0.4% median.

The two strongest positive features are tight credit and bullish
macro; both currently hold. The strongest negative is low-vol
regime, which does NOT apply here. So the regime-conditioned
read is roughly neutral.

Net: the headline statistic is bearish, but the regime-conditioned
read is closer to coin-flip. Cohort tightness is good (score 0.81),
so the analogs are meaningful. If you take the position, the p10
downside is -11.3%, which is wider than a typical NVDA setup, so
size accordingly.

That’s nine sentences. Every claim ties to a specific field. No predictions, no point forecasts. A senior PM can push back on individual claims but the structure is defensible.

Frequently asked questions

What if the cohort response has only one horizon?
Then you queried with only one horizon. By default cohort_analyze returns 1, 5, and 10. You can request other horizons by specifying them in the request — e.g. horizons: [1, 3, 5, 10, 20]. Match the horizon to your decision timeframe.
Should I trust the mean or the median?
Median, almost always. The mean can be pulled by a single outlier analog (one big winner or loser among 300). Median is more robust. Use the mean only when you specifically want to know average outcomes including tails — for example, in expected-value calculations.
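A three-line illustration of the pull, with made-up returns (not from any real cohort):

```python
import statistics

# 299 small losers and one huge winner among 300 analogs.
returns = [-0.5] * 299 + [60.0]
print(round(statistics.mean(returns), 2))  # dragged up toward zero by one outlier
print(statistics.median(returns))          # still -0.5
```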
How do I know if feature_importance is meaningful?
Importance values around 0.05+ are meaningful. Below 0.03 they're noise. The ordered list is more useful than the absolute numbers — the top 3 features are what separated winners; the bottom features are uninformative.
Why doesn't the response include a recommended action?
Deliberately. Cohort intelligence informs decisions; it doesn't make them. The response gives you the distribution and explanation; what to do with it depends on your risk profile, holding horizon, and other context (fundamentals, options skew, your own portfolio). Tools that return 'buy/sell/hold' are usually compressing too much information.
How do regime_stratification buckets get defined?
Each regime dimension has named cuts: vol_regime ∈ {low, high} split on rolling 20-day realized vol vs 1-year median; macro_state ∈ {bullish, neutral, bearish} from a composite of SPY/QQQ momentum and breadth. Bucket definitions are stable and documented in our methodology page.
Can I get the raw 300 analogs (symbols and dates)?
Yes. The cohort_analyze response includes a top_matches array (default top 10) with symbol, date, distance, and realized return for each. Full 300 is available with the include_full_cohort=true flag (Builder tier and up).
Try it

Run a cohort_analyze call.

Free Sandbox tier — 200 calls/day, no authentication. MCP install for Claude or Cursor takes 30 seconds.

Related