Guide · API

Trim the cohort_analyze response with one parameter.

The default /api/v1/cohort_analyze response is fat by design. It bundles the outcome distribution, per-feature importance, regime stratification, risk profile, Layer 5 memory, and a few derived scores — about 41 KB of JSON per call. That bundle is what makes the endpoint useful to an agent: one tool call, the whole multi-factor view.

If you’re calling the endpoint from a backtest loop, a signal generator, or any non-agent script, most of those bytes are wasted. Pass a fields= allowlist and the server drops every key you didn’t ask for before serialization.

The minimum example

You only need outcome_distribution:

curl -X POST https://chartlibrary.io/api/v1/cohort_analyze \
  -H "Authorization: Bearer $CHART_LIBRARY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "anchor": {"symbol": "NVDA", "date": "2024-08-05", "timeframe": "1d"},
    "fields": ["outcome_distribution"]
  }'

Response — about 1 KB:

{
  "anchor": {"symbol": "NVDA", "date": "2024-08-05", "timeframe": "1d"},
  "cohort_size_actual": 300,
  "outcome_distribution": {
    "1":  {"n": 296, "mean":  0.21, "median": -0.04, "p10": -2.4,  "p90":  2.7, "win_rate": 0.49, "std":  1.9},
    "5":  {"n": 298, "mean": -0.40, "median": -1.30, "p10": -11.3, "p90":  6.8, "win_rate": 0.44, "std":  7.1},
    "10": {"n": 295, "mean": -0.10, "median": -1.85, "p10": -15.1, "p90": 11.2, "win_rate": 0.46, "std": 10.4}
  },
  "elapsed_ms": 264,
  "warnings": []
}
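
The same minimal request from a script, as a sketch in Python using the requests library. The endpoint, header, and body come straight from the curl example above; the only assumption is that CHART_LIBRARY_API_KEY is set in the environment:

import os
import requests

resp = requests.post(
    "https://chartlibrary.io/api/v1/cohort_analyze",
    headers={"Authorization": f"Bearer {os.environ['CHART_LIBRARY_API_KEY']}"},
    json={
        "anchor": {"symbol": "NVDA", "date": "2024-08-05", "timeframe": "1d"},
        "fields": ["outcome_distribution"],
    },
    timeout=30,
)
resp.raise_for_status()
dist = resp.json()["outcome_distribution"]
print(dist["5"]["median"])  # 5-day median forward return, per the response above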

Bytes saved, by request shape

Measured against a real anchor (NVDA / 2024-08-05 / 1d, cohort_size = 300):

Request                                             Bytes  vs default
default (no fields)                                40,983
["outcome_distribution"]                            1,074  −97.4%
["outcome_distribution", "cohort_score"]           ~1,100  −97.3%
["outcome_distribution", "regime_stratification"]  ~2,400  −94.1%
["outcome_distribution", "feature_importance"]     39,106   −4.6%

The big saving is dropping feature_importance — that single key is where the long-tail bytes live (one entry per anchor feature per horizon, including the SHAP-style support points). If your code only uses distributions, you save 97%. If you keep feature_importance, you save about 5%.
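
The numbers above are easy to reproduce for your own anchors: compare len(resp.content) with and without the allowlist. A sketch under the same assumptions as the snippet above (requests, API key in the environment):

import os
import requests

API = "https://chartlibrary.io/api/v1/cohort_analyze"
HEADERS = {"Authorization": f"Bearer {os.environ['CHART_LIBRARY_API_KEY']}"}

def payload_bytes(anchor, fields=None):
    # fields=None sends no allowlist, i.e. the default full response.
    body = {"anchor": anchor}
    if fields is not None:
        body["fields"] = fields
    resp = requests.post(API, headers=HEADERS, json=body, timeout=30)
    resp.raise_for_status()
    return len(resp.content)

anchor = {"symbol": "NVDA", "date": "2024-08-05", "timeframe": "1d"}
full = payload_bytes(anchor)
slim = payload_bytes(anchor, ["outcome_distribution"])
print(f"{full:,} B -> {slim:,} B ({1 - slim / full:.1%} saved)")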

Valid fields

Pass any subset of:

outcome_distribution      // p10/p25/median/p75/p90/win_rate/std per horizon
feature_importance        // which Layer 2 features separated winners
regime_stratification     // outcomes sliced by vol / macro regime
risk_profile              // drawdown / runup percentiles
cohort_tightness_score    // how concentrated analogs are in embedding space
cohort_score              // composite signal-strength scalar in [0,1]
combined_conviction       // cohort_score + narrative_pulse boost
pulse_boost               // narrative-pulse contribution (scalar)
narrative_pulse           // realtime news sentiment summary
cohort_anchors            // the K analog rows themselves (opt-in)
anchor_metadata           // the anchor's own Layer 2 features (opt-in)
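
If you build the allowlist dynamically, failing fast client-side saves a round trip. A sketch; the VALID_FIELDS constant below simply mirrors the documented list and is not exported by any SDK:

# Mirrors the documented list above; not exported by any SDK.
VALID_FIELDS = {
    "outcome_distribution", "feature_importance", "regime_stratification",
    "risk_profile", "cohort_tightness_score", "cohort_score",
    "combined_conviction", "pulse_boost", "narrative_pulse",
    "cohort_anchors", "anchor_metadata",
}

def check_fields(fields):
    unknown = set(fields) - VALID_FIELDS
    if unknown:
        raise ValueError(f"Unknown fields: {sorted(unknown)}")
    return list(fields)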

Four keys are always returned regardless of what you ask for — anchor, cohort_size_actual, elapsed_ms, and warnings — so the response is always parseable.

A typo returns a structured 422 with the valid list inline, not a silent empty response:

{
  "error": "Unknown fields: ['feature_imp']",
  "valid_fields": [
    "anchor_metadata", "cohort_anchors", "cohort_score",
    "cohort_tightness_score", "combined_conviction",
    "feature_importance", "narrative_pulse", "outcome_distribution",
    "pulse_boost", "regime_stratification", "risk_profile"
  ]
}
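
A client can lean on that echo when handling the error. A sketch, reusing the API and HEADERS constants from the measurement snippet above:

body = {
    "anchor": {"symbol": "NVDA", "date": "2024-08-05", "timeframe": "1d"},
    "fields": ["feature_imp"],  # deliberate typo
}
resp = requests.post(API, headers=HEADERS, json=body, timeout=30)
if resp.status_code == 422:
    detail = resp.json()
    # The server echoes the allowlist, so the fix ships with the failure.
    raise ValueError(f"{detail['error']} (valid: {detail['valid_fields']})")
resp.raise_for_status()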

Through the MCP server

chartlibrary-mcp 5.0.1+ forwards fields= on the cohort tool when you call it with depth="full":

# From a Claude Desktop / Cursor / Codex agent session:

cohort(
  symbol="NVDA",
  date="2024-08-05",
  timeframe="1d",
  depth="full",
  fields=["outcome_distribution"],
)

For agents this matters less — Claude pays in tokens, not bandwidth, and the bigger payload gives the model richer context. For scripted callers wrapping the MCP server programmatically (or piping outputs into a JSON consumer), the trim is the same 97%.
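
For such a scripted caller, one way to drive the tool is the official MCP Python SDK (the mcp package). A sketch; the stdio launch command for chartlibrary-mcp is an assumption, so adjust it to however you run the server:

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Assumption: the server runs locally over stdio; swap in your launch command.
params = StdioServerParameters(command="chartlibrary-mcp")

async def main():
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool("cohort", arguments={
                "symbol": "NVDA",
                "date": "2024-08-05",
                "timeframe": "1d",
                "depth": "full",
                "fields": ["outcome_distribution"],
            })
            print(result.content)

asyncio.run(main())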

Compute is not skipped — bytes are

fields= is a serialization filter. The server still computes the full response (the feature-importance fit runs, the regime strata are built, the conformal bands are applied) and trims only the output. That’s intentional: the heavy work happens whether or not you read it, so the cache stays consistent across callers.

If you also want to skip computation, the separate options.include_* flags do that: options.include_feature_importance = false skips the SHAP fit entirely. Stack the two when you want both cheaper compute and a slim payload:

{
  "anchor": {"symbol": "NVDA", "date": "2024-08-05", "timeframe": "1d"},
  "options": {
    "include_feature_importance": false,
    "include_regime_stratification": false,
    "include_risk_profile": false
  },
  "fields": ["outcome_distribution"]
}

When this matters

Three concrete cases where it pays off:

  1. Backtest loops. Calling the endpoint once per (symbol, date) over thousands of historical anchors. The default response shape is designed for an agent reading one anchor; the script reading thousands wants the distribution row, nothing else. A minimal loop sketch follows this list.
  2. Signal-generation crons. A nightly job that pulls cohort statistics for every position in a portfolio. Trim to the distribution + conviction scalar; drop the rest. ~40× faster JSON parse, ~40× less log volume.
  3. Browser dashboards. A UI surface that renders only the headline median + p10/p90 band per horizon. No reason to ship feature_importance over the wire to a static page.
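
The loop sketch for case 1, under the same requests/API-key assumptions as the earlier snippets; the anchor list here is illustrative:

import os
import requests

API = "https://chartlibrary.io/api/v1/cohort_analyze"
HEADERS = {"Authorization": f"Bearer {os.environ['CHART_LIBRARY_API_KEY']}"}

def distribution(symbol, date, timeframe="1d"):
    # One trimmed call per (symbol, date): ~1 KB back instead of ~41 KB.
    resp = requests.post(API, headers=HEADERS, json={
        "anchor": {"symbol": symbol, "date": date, "timeframe": timeframe},
        "fields": ["outcome_distribution"],
    }, timeout=30)
    resp.raise_for_status()
    return resp.json()["outcome_distribution"]

anchors = [("NVDA", "2024-08-05"), ("AMD", "2024-08-05")]  # illustrative
rows = {(s, d): distribution(s, d) for s, d in anchors}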

For agent traffic specifically — Claude Desktop, Cursor, the Claude API tool-use loop — the default full response is usually still the right call. Agents pay in tokens, not bandwidth, and the broader context measurably improves tool-selection on follow-up turns.

Frequently asked questions

Does fields= affect rate limits or pricing?
No. Quota is per-call, not per-byte. The trim is purely a payload-size optimization.
Can I trim cohort_anchors or anchor_metadata?
Both are already opt-in via options.include_cohort_anchors and options.include_anchor_metadata (default off). If you turn those on, you can include them in fields=; if you don't, they're not present to trim.
What about other endpoints — does fields= work on /discover or /narrative?
Not yet. fields= ships on /api/v1/cohort_analyze first because that's the heaviest response and the first endpoint a real evaluator asked us to slim. If you have a similar need on another endpoint, email graham@chartlibrary.io — concrete asks ship fast.
Is the schema stable?
The list of valid fields is part of the API contract and won't shrink without a deprecation window. New optional keys may be added; passing fields= guarantees your client never breaks from a new key landing in the default response.

Try it

Trim a cohort_analyze response.

Free Sandbox tier covers 200 calls/day. Cohort intelligence with fields= is on Builder ($29/mo).