Guide · Agent builders

Build an AI trading agent with Claude: an end-to-end walkthrough.

This guide walks through building a paper-trading agent that uses Chart Library’s cohort intelligence as its analysis primitive, scores its picks against realized actuals, and compounds intelligence through a lesson memory loop.

We backtested this exact architecture over 19 recent trading days. The memory loop beat passive SPY by +1.6 percentage points and beat the no-memory baseline by +7.5pp. That is a tiny sample, but a real signal: the lesson loop adds value.

The full implementation is open source (see /agent-trader for the live paper-trading version). This guide gives you the scaffolding to build your own.

Architecture in one diagram

┌─────────────────────────────────────────────────────────┐
│  DAILY DECISION (runs after market close, Mon-Fri)      │
│                                                         │
│   1. Pull top-10 candidates (forward_tests / discover)  │
│   2. For each: cohort_analyze (n=300, full distribution)│
│   3. Read last 5 lessons from memory                    │
│   4. Claude picks 3 with conviction + reasoning         │
│   5. Log trades to DB                                   │
└─────────────────────────────────────────────────────────┘
                         │ wait 5 trading days
                         ▼
┌─────────────────────────────────────────────────────────┐
│  SCORING + LESSON GENERATION (runs nightly)             │
│                                                         │
│   1. Close due trades against daily_bars                │
│   2. Compute return vs SPY benchmark                    │
│   3. Claude writes 1-sentence lesson                    │
│   4. Lesson stored in memory, conditions next picks     │
└─────────────────────────────────────────────────────────┘

Prerequisites

  • Python 3.10+
  • pip install anthropic chartlibrary (the SDK is on PyPI as chartlibrary)
  • Anthropic API key (free credits available for new accounts)
  • Chart Library API key (Sandbox is free; Builder $29/mo for higher limits)
  • SQLite or any database for the trade log + lesson memory

Step 1: Pull candidates with cohort enrichment

# step1_candidates.py
from chartlibrary import ChartLibraryClient

cl = ChartLibraryClient(api_key="cl_...")  # or rely on env var

def get_enriched_candidates(top_n=10):
    """Return today's top-N picks with full cohort statistics."""
    setups = cl.agent.setups(top=top_n, timeframe="1d")
    enriched = []
    for s in setups:
        # cohort fields are pre-enriched in agent/setups response
        enriched.append({
            "symbol": s.symbol,
            "cohort_score": s.cohort.cohort_score,
            "win_rate_5d": s.cohort.win_rate_5d,
            "median_5d": s.cohort.median_5d,
            "p10_5d": s.cohort.p10_5d,
            "p90_5d": s.cohort.p90_5d,
            "top_features": [f.feature for f in s.top_features[:3]],
            "interest_score": s.interest_score,
        })
    return enriched

The /api/v1/agent/setups endpoint is built specifically for this — it returns the top picks pre-enriched with full-cohort stats and top features in one call, so you don’t need 10 separate cohort_analyze calls.
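
Before handing the enriched list to the LLM, you may want a mechanical pre-filter so obviously weak cohorts never reach the prompt. A minimal sketch over the dict fields built in Step 1 (the thresholds are illustrative, not values from the live agent):

```python
def prefilter(candidates, min_win_rate=0.55, min_median=0.0):
    """Drop candidates with weak cohort stats, then rank by cohort_score.

    Thresholds are illustrative -- tune them against your own backtests.
    """
    kept = [
        c for c in candidates
        if c["win_rate_5d"] >= min_win_rate and c["median_5d"] > min_median
    ]
    return sorted(kept, key=lambda c: c["cohort_score"], reverse=True)
```

Filtering mechanically and letting the LLM choose among survivors keeps the prompt short and the model's job well-scoped.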

Step 2: Read lesson memory

# step2_memory.py
import sqlite3

def init_db():
    db = sqlite3.connect("agent.db")
    db.executescript("""
        CREATE TABLE IF NOT EXISTS trades (
            id INTEGER PRIMARY KEY,
            trade_date DATE, symbol TEXT, conviction TEXT,
            entry REAL, exit REAL, return_pct REAL, spy_5d REAL,
            cohort_score REAL, win_rate_5d REAL,
            reason TEXT, status TEXT DEFAULT 'open'
        );
        CREATE TABLE IF NOT EXISTS lessons (
            id INTEGER PRIMARY KEY,
            lesson_date DATE UNIQUE, lesson_text TEXT,
            portfolio_5d REAL, spy_5d REAL
        );
    """)
    return db

def read_recent_lessons(db, n=5):
    rows = db.execute(
        "SELECT lesson_date, lesson_text, portfolio_5d, spy_5d "
        "FROM lessons ORDER BY lesson_date DESC LIMIT ?", (n,)
    ).fetchall()
    return list(reversed(rows))  # chronological for prompt

Step 3: Claude picks with conviction

# step3_pick.py
import anthropic, json, re

client = anthropic.Anthropic()

def claude_picks(candidates, lessons, slots=3):
    lessons_text = ""
    if lessons:
        lessons_text = "\nLESSONS FROM RECENT DAYS:\n"
        for d, text, port, spy in lessons:
            lessons_text += f"- {d} (you {port:+.2f}%, SPY {spy:+.2f}%): {text}\n"

    prompt = f"""You are a cohort-aware paper trader. Pick {slots} long positions for 5-day equal-weighted hold.

CANDIDATES (pre-enriched with full-cohort stats):
{json.dumps(candidates, indent=2)}
{lessons_text}
Pick exactly {slots} symbols. For each, output: conviction (low/medium/high) + ONE sentence reason (max 25 words).
Apply your accumulated lessons. High cohort_score is one signal, not the only one.

Output ONLY a JSON array:
[{{"symbol": "AAPL", "conviction": "medium", "reason": "60% win rate cohort, tight band"}}]"""

    resp = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=512,
        messages=[{"role": "user", "content": prompt}],
    )
    text = resp.content[0].text.strip()
    m = re.search(r"\[[\s\S]*\]", text)
    return json.loads(m.group()) if m else []
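
The diagram's step 5 ("log trades to DB") isn't shown above. A minimal sketch against the schema from Step 2 — the `entries` mapping (symbol to fill price) is a hypothetical parameter; source those prices however your setup allows:

```python
import sqlite3
from datetime import date

def log_trades(db, picks, candidates, entries, trade_date=None):
    """Insert the LLM's picks into the trades table from Step 2.

    `entries` maps symbol -> fill price (hypothetical here; fetch from
    your data source). Cohort fields are copied from the matching
    candidate dict so scoring can later correlate stats with outcomes.
    """
    trade_date = trade_date or date.today().isoformat()
    by_symbol = {c["symbol"]: c for c in candidates}
    for p in picks:
        c = by_symbol[p["symbol"]]
        db.execute(
            """INSERT INTO trades
               (trade_date, symbol, conviction, entry,
                cohort_score, win_rate_5d, reason)
               VALUES (?, ?, ?, ?, ?, ?, ?)""",
            (trade_date, p["symbol"], p["conviction"], entries[p["symbol"]],
             c["cohort_score"], c["win_rate_5d"], p["reason"]),
        )
    db.commit()
```

Storing the cohort stats alongside each trade is what makes the reflection step in Step 4 possible.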

Step 4: Score and reflect

# step4_score.py
import anthropic
from chartlibrary import ChartLibraryClient

cl = ChartLibraryClient()
client = anthropic.Anthropic()  # used by generate_lesson below

def score_due_trades(db):
    """Close any trade that's hit its 5d mark using daily bars."""
    rows = db.execute("""
        SELECT id, trade_date, symbol, entry FROM trades WHERE status='open'
    """).fetchall()
    for tid, td, sym, entry in rows:
        # Get close 5 trading days after td
        bars = cl.bars.daily(symbol=sym, after=td, limit=5)
        if len(bars) < 5: continue  # not enough days yet
        exit_price = bars[-1].close
        ret = (exit_price - entry) / entry * 100.0
        spy = cl.bars.daily(symbol="SPY", after=td, limit=5)
        spy_5d = (spy[-1].close - spy[0].close) / spy[0].close * 100.0
        db.execute("""
            UPDATE trades SET status='closed', exit=?, return_pct=?, spy_5d=?
            WHERE id=?""", (exit_price, ret, spy_5d, tid))
    db.commit()

def generate_lesson(db, trade_date):
    picks = db.execute("""
        SELECT symbol, return_pct, conviction, cohort_score, win_rate_5d
        FROM trades WHERE trade_date=? AND status='closed'""", (trade_date,)).fetchall()
    if not picks: return
    portfolio_5d = sum(p[1] for p in picks) / len(picks)
    # ... fetch SPY 5d ...
    prompt = f"""Reflect on {trade_date}: picks {picks}, portfolio {portfolio_5d:+.2f}%.
Write ONE concise lesson (1-2 sentences, max 35 words) for next time."""
    lesson = client.messages.create(
        model="claude-haiku-4-5-20251001", max_tokens=128,
        messages=[{"role": "user", "content": prompt}]
    ).content[0].text.strip()
    db.execute("INSERT OR IGNORE INTO lessons (lesson_date, lesson_text, portfolio_5d) VALUES (?,?,?)",
               (trade_date, lesson, portfolio_5d))
    db.commit()

Step 5: Wire up the daily cron

Two cron jobs, both running Mon-Fri after the US market close (times in UTC):

# crontab
30 22 * * 1-5  python /app/decide.py   # 22:30 UTC: pick today's trades
35 23 * * 1-5  python /app/score.py    # 23:35 UTC: close due, write lessons

decide.py runs steps 1-3 and logs trades. score.py runs step 4 and writes the lesson when all trades from a decision date are closed.
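
decide.py itself is mostly glue. A sketch of the orchestration, with the step functions passed in as parameters so the flow is testable with stubs (the function signature and the `log` callback are assumptions, not part of the step files above):

```python
def run_decision_day(get_candidates, read_lessons, pick, log, slots=3):
    """One decision cycle: enrich -> recall -> pick -> log.

    Each argument is one of the step functions defined earlier (or a
    stub, in tests). Returns the picks for inspection.
    """
    candidates = get_candidates()      # Step 1: enriched top-N
    lessons = read_lessons()           # Step 2: recent lesson memory
    picks = pick(candidates, lessons, slots=slots)  # Step 3: LLM picks
    if len(picks) != slots:
        raise ValueError(f"expected {slots} picks, got {len(picks)}")
    log(picks, candidates)             # persist to the trades table
    return picks
```

Validating the pick count before logging matters: LLMs occasionally return too few or too many entries, and a malformed decision day is easier to retry than to unwind.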

What we learned from running this in production

The Cohort-Aware Trader at chartlibrary.io/agent-trader is exactly this architecture, running live. Three observations:

  1. Memory matters more than we expected. The lesson-loop variant beat the same agent without memory by 7.5pp over 19 days. Most of the value comes from the agent learning to ignore signals that haven’t been working in the recent regime.
  2. Claude is honest about confidence when prompted. Asking for explicit conviction labels (low/med/high) revealed that the LLM is well-calibrated when you make it commit. Most picks come back as “medium” — that’s a feature, not a bug.
  3. The first 60 trades are noise. With 5-day overlapping windows, you need ~60 closed trades before the win rate or excess return is anything other than regime luck. Publish the disclaimer.
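
A quick way to see whether your excess return has left the noise floor is a standard error check over closed trades. A rough sketch (overlapping 5-day windows autocorrelate the samples, so treat the result as optimistic):

```python
import math

def excess_return_tstat(trade_returns, spy_returns):
    """t-statistic of the mean per-trade excess return vs SPY.

    |t| < 2 means the edge is indistinguishable from noise at this
    sample size -- and overlapping holds make even that generous.
    """
    excess = [r - s for r, s in zip(trade_returns, spy_returns)]
    n = len(excess)
    if n < 2:
        return 0.0
    mean = sum(excess) / n
    var = sum((x - mean) ** 2 for x in excess) / (n - 1)
    if var == 0:
        return 0.0 if mean == 0 else math.copysign(float("inf"), mean)
    return mean / math.sqrt(var / n)
```

Even a genuine +0.5% mean edge with typical single-trade dispersion tends to sit under t = 2 at 60 trades, which is exactly why the disclaimer belongs on the page.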

Frequently asked questions

Why use cohort intelligence as the primitive instead of point predictions?
LLMs reason poorly about point predictions — they either trust the number too much or hand-wave. Cohort intelligence returns a distribution, sample size, and feature attribution. The LLM has structured facts to reason about. Result: better-calibrated decisions, easier to debug, easier to audit.
Can I use a model other than Claude?
Yes. The pattern works with any model — GPT-4, Gemini, Llama, Mistral. We use Claude Haiku because it's cheap, fast, and follows the JSON-output constraint reliably. Adapt the prompt to your model's strengths.
What's the LLM cost per day for this agent?
Claude Haiku at current pricing: ~$0.001 per pick call + ~$0.001 per lesson call, or roughly $0.002 per trading day (about $0.50/year). The Chart Library API calls are free on the Sandbox tier ($29/mo on Builder for higher limits). The total cost of running this 24/7 is dominated by the database and cron infrastructure, not the AI calls.
How do I extend this with conviction-weighted sizing?
After the LLM returns conviction labels (low/med/high), map them to position sizes (e.g., 25%/33%/45%). Track whether high-conviction picks actually outperform — if they don't, the LLM's conviction labels aren't useful and you should equal-weight. We're testing both in production.
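
A minimal sketch of that mapping, using the 25/33/45 example split from the answer above, renormalized over whatever mix of labels actually comes back:

```python
# Example split from the FAQ answer; any monotone mapping works.
BASE = {"low": 0.25, "medium": 0.33, "high": 0.45}

def conviction_weights(picks, base=BASE):
    """Map conviction labels to position weights that sum to 1.

    Renormalizing keeps the book fully invested even when the LLM
    returns, say, three "medium" picks.
    """
    raw = [base[p["conviction"]] for p in picks]
    total = sum(raw)
    return {p["symbol"]: w / total for p, w in zip(picks, raw)}
```
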
What about stops and targets?
Production agents typically need them. We omit them in v1 to test the pure cohort signal — does the LLM's pick + 5-day hold actually work? Once you have signal, add stops to limit catastrophic drawdowns and targets to lock in winners.
Try it

See this agent running live.

The Cohort-Aware Trader paper-trades publicly. Hits, misses, reasoning, and lessons are all visible.

Related