Chart Library, explained for humans.
If you’ve looked at stock charts but haven’t built an ML model or written code, this page is for you. Every technical claim on /learn/methodology has a plain-English translation here, in the same order. Analogies instead of jargon. No math.
Chart Library is a search engine for the stock market. You give it a chart, and it finds the 10 historical charts that looked most like it, then shows you what happened next in those 10 cases.
The methodology page explains how we make sure the numbers we quote off that search are actually honest — not cherry-picked, not artificially tight, not hiding the experiments that didn’t work.
A few words of vocabulary
- Cohort — the group of historical matches. Think pulling the medical charts of 10 past patients whose symptoms looked most like yours.
- Distribution — the spread of outcomes across that group. Not one number, a range: “5 of 10 went up, 3 went down a little, 2 crashed.”
- Calibrated — adjusted so the confidence level we advertise matches what actually happens. If we say “80% confident the return lands in this range,” then historically 80% of the time it really does.
- Audited — we checked our own work and published the places it was wrong.
Every answer ships with a sample size and a range
Every answer comes with two things: how many past examples we based it on, and a realistic range — not a single point estimate. If we only had 4 matches, we tell you. We don’t quote “3.2% average return” without also telling you how much that number bounces around and how much data it’s built on.
We also publicly list the experiments that didn’t work. Anyone can claim their stuff works; the real tell is whether they tell you when it doesn’t.
1. How the search actually runs
Ingestion
We load price data for basically every US stock — about 20,000 tickers — going back to 2016. Both daily prices and minute-by-minute prices.
The important part: when a company goes bankrupt or gets acquired, it disappears from most databases. If you only study the stocks that survived to today, you’ll think the market is way safer than it is, because you never see the ones that went to zero. We deliberately load the dead ones back in, so our “history” includes the losers. A lot of financial research quietly cheats on this; we refused to.
Pattern representation (the fingerprint)
Every chart we’ve ever seen gets converted into a long list of numbers that captures its shape. Think of it like a fingerprint for a chart. Two charts with similar shapes get similar fingerprints; two very different shapes get very different fingerprints.
We don’t publish exactly how we make the fingerprint — that’s the recipe a competitor would copy in a weekend. What we do commit to: the same chart always produces the same fingerprint (no randomness), and when we compare two fingerprints, we care about both the shape AND the size of the move. A stock that went up 30% in a V-shape is a different pattern from one that went up 3% in the same shape — some rival systems ignore that; we don’t.
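For readers who do look at code: here is a toy fingerprint, invented purely for illustration (the real one is private and much more sophisticated). It only exists to show the two properties we commit to: the same chart always produces the same fingerprint, and the size of the move matters, not just its shape.

```python
# Toy illustration only - NOT our real fingerprint. It demonstrates:
#   1) determinism: same chart in, same fingerprint out, no randomness;
#   2) magnitude-awareness: a 30% V and a 3% V get different fingerprints.

def toy_fingerprint(prices):
    """Turn a price series into a fixed-length list of numbers."""
    base = prices[0]
    # Normalize to the starting price so different stocks are comparable,
    # but keep the size of the move (no rescaling to a unit range).
    returns = [p / base - 1.0 for p in prices]
    # Resample to a fixed length (8 points here) so all charts compare.
    n = len(returns)
    idx = [round(i * (n - 1) / 7) for i in range(8)]
    return [returns[i] for i in idx]

v_big   = toy_fingerprint([100, 85, 70, 85, 100, 115, 130])            # ~30% V
v_small = toy_fingerprint([100, 98.5, 97, 98.5, 100, 101.5, 103])      # ~3% V
```

Same V shape, different size, different fingerprint: that is the whole point of the second commitment.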
Retrieval
When you ask a question, we compare your chart’s fingerprint to every fingerprint in the library and pull the closest matches — typically the top 10. For each match, we already know what happened next (we precomputed it), so results come back instantly.
“Nearest-neighbor” is just “find the things most like this one.” Exactly like Shazam finding the song closest to the clip you hummed.
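If code is clearer to you than analogies, the whole retrieval step fits in a few lines. Everything below (the Euclidean distance, the field layout, the made-up tickers) is an illustrative stand-in, not our production search; it also shows the “no matching against yourself” rule from section 2 in action:

```python
# A minimal nearest-neighbor search over fingerprints. Sketch only.
import math

def distance(a, b):
    """Euclidean distance between two fingerprints (an assumed choice)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_matches(query, library, query_ticker, k=10):
    """library: list of (ticker, fingerprint, what_happened_next)."""
    candidates = [
        (distance(query, fp), ticker, outcome)
        for ticker, fp, outcome in library
        if ticker != query_ticker        # never match a stock against itself
    ]
    candidates.sort(key=lambda c: c[0])  # closest fingerprints first
    return candidates[:k]

library = [
    ("AAA",  [0.0, 0.1, 0.2], +0.03),
    ("BBB",  [0.0, 0.5, 0.9], -0.08),
    ("NVDA", [0.0, 0.1, 0.2], +0.30),    # same ticker: must be excluded
]
matches = top_matches([0.0, 0.1, 0.2], library, query_ticker="NVDA", k=2)
```

The “what happened next” outcomes ride along with each match, which is why results can come back instantly: they were computed ahead of time.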
Calibration (the 'widen the range' trick)
If we read the spread of outcomes straight off our 10 matches, we end up quoting a range that is too narrow. Why: we’re not drawing 10 random charts, we’re drawing the 10 most similar ones, which are by definition more alike than the broader universe. So the spread in our sample understates the spread in reality, and quoting it as-is would over-promise.
So we add a correction that widens the range. If we claim “80% confidence,” we test it on a separate batch of data and make sure 80% of the time the actual outcome really lands inside our band. Before the correction, our “80% band” only caught the real outcome 68% of the time — we were off by 12 percentage points, and we own it.
Decomposition
After we pull the 10 matches, we slice them by different attributes to see what’s actually driving the pattern. Were most of these matches near an earnings report? In the same industry? Big caps or small caps?
If 9 of 10 “matches” were all tech stocks near earnings, that’s worth knowing — it’s not the pattern that’s predicting, it’s earnings-season tech stocks. Decomposition surfaces that.
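Under the hood, decomposition is just counting. A sketch, with made-up field names and only three of the ten matches shown:

```python
# Slice a cohort of matches along one attribute and count the slices.
from collections import Counter

matches = [
    {"ticker": "AAPL", "sector": "tech",   "near_earnings": True},
    {"ticker": "MSFT", "sector": "tech",   "near_earnings": True},
    {"ticker": "XOM",  "sector": "energy", "near_earnings": False},
    # (a real cohort has 10 entries)
]

def decompose(cohort, attribute):
    """Count how the cohort splits along one attribute."""
    return Counter(m[attribute] for m in cohort)

by_sector = decompose(matches, "sector")
# If one slice dominates (say 9 of 10 are tech), the "pattern" may
# really be a sector effect wearing a chart-shape costume.
```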
2. How we grade ourselves
Three hard rules we never break.
Symbol-disjoint splits (no cheating by ticker)
When we test the system, we pick a set of tickers it has never seen during training, and only quote scores on those unseen tickers. The common way people cheat at this (usually without realizing): split by date — “train on 2016-2022, test on 2023.” Problem: NVDA is in both halves, so the model has already seen NVDA’s habits. We split by ticker instead — NVDA is either in training or in testing, never both.
We caught ourselves making exactly this mistake (a date-based split) in an earlier version. When we fixed it, our advertised accuracy dropped 2-3 percentage points. We shipped the fix anyway.
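For the curious, a symbol-disjoint split is only a few lines of code. This is a generic sketch, not our pipeline; the seeded hash just makes the split deterministic, so the same ticker always lands on the same side:

```python
# Symbol-disjoint split: each ticker goes entirely to train OR test,
# never both. Sketch only.
import hashlib

def split_by_ticker(tickers, test_fraction=0.2):
    train, test = set(), set()
    for t in sorted(tickers):
        # Stable hash in [0, 100): the same ticker always maps to the
        # same bucket, run after run.
        h = int(hashlib.sha256(t.encode()).hexdigest(), 16) % 100
        (test if h < test_fraction * 100 else train).add(t)
    return train, test

train, test = split_by_ticker({"NVDA", "AAPL", "MSFT", "XOM", "TSLA"})
# NVDA is in exactly one of the two sets - that's the whole rule.
```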
10-day quiet buffer (no sneaky autocorrelation)
We put a 10-trading-day “quiet buffer” between what the model saw during training and what we test it on. Why: yesterday’s chart and today’s chart look nearly identical just because it’s the same stock two days apart — that’s not the model being smart, that’s time being short. The buffer forces it to learn real patterns.
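In code, the buffer is just a date filter: drop any test example that starts too close to the end of the training data. Sample shapes and day numbers below are invented for illustration:

```python
# The 10-trading-day quiet buffer as a filter over trading-day indices.
BUFFER_DAYS = 10

def apply_quiet_buffer(samples, train_end_day, buffer_days=BUFFER_DAYS):
    """samples: list of (start_day, label) where start_day is a
    trading-day index. Keep only samples that begin after the buffer."""
    return [s for s in samples if s[0] > train_end_day + buffer_days]

samples = [
    (1998, "overlaps training"),
    (2005, "inside the buffer"),
    (2011, "ok to test on"),
]
kept = apply_quiet_buffer(samples, train_end_day=2000)
```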
No matching against yourself
When you search for NVDA today, we don’t let last week’s NVDA show up as a “match.” Otherwise our 10 matches would be “NVDA, NVDA, NVDA, NVDA, NVDA…” and we’d just be predicting that NVDA looks like NVDA — useless.
3. The 'widen the range' fix, in detail
Four steps. Don’t worry about the formulas on the methodology page — this is all they’re saying:
- 1. Set aside a big batch of historical cases where we already know what happened next. This is our grading pile.
- 2. For each case in the grading pile, ask: “by how much did the real outcome fall outside our advertised band?” If our band was −5% to +8% and the real answer was +12%, the band missed by 4 percentage points. If the outcome landed inside the band, the miss is zero.
- 3. Line up all those “by how much did we miss” numbers from smallest to largest and find the value that 80% of them stay at or under. That number is our correction.
- 4. Widen every band by that correction amount. Now by construction, the band covers the truth 80% of the time.
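The four steps translate almost line-for-line into code. This is a generic conformal-style sketch with invented numbers, not our production calibration:

```python
# Steps 1-4 of the "widen the range" fix, as a plain function.
import math

def widening_correction(bands, actuals, target=0.80):
    """bands: the (low, high) ranges we advertised; actuals: what really
    happened (the grading pile). Returns how much to widen every band."""
    # Step 2: how far outside its band did each real outcome land?
    # (Zero if it was already inside.)
    misses = sorted(
        max(lo - y, y - hi, 0.0)
        for (lo, hi), y in zip(bands, actuals)
    )
    # Step 3: the value that `target` of the misses stay at or under.
    k = max(math.ceil(target * len(misses)) - 1, 0)
    return misses[k]

bands   = [(-5, 8)] * 5          # step 1: what we advertised, 5 cases
actuals = [2, 12, -6, 7, 20]     # step 1: what actually happened (%)
c = widening_correction(bands, actuals, target=0.80)
# Step 4: every band becomes (lo - c, hi + c); by construction about
# 80% of the grading pile now lands inside.
```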
Every response from our API includes both numbers — the raw one and the corrected one. Use the corrected one when sizing a trade. Use the raw one only when you’re just ranking which pattern looks strongest.
4. The experiments we ran that didn't work
This is the “look at the dirt under our fingernails” section: three things we thought would work, found didn’t, and admitted publicly.
Market-mood filters don't help (regime conditioning)
A very popular idea in quant finance: the same pattern behaves differently in scared vs. calm markets, so filter matches to today’s mood. We tested five versions (using fear gauges like VIX, credit spreads, the yield curve, breadth). Every one moved the answer by less than half a percentage point. Basically useless.
Our best guess why: the chart shape itself already contains the market-mood information. Searching on shape implicitly filters on mood. Most shops would have either buried this result or sold “regime-aware” as a premium feature anyway.
One anchor lies, 100 anchors don't
We ran an experiment on one chart and it showed earnings-window patterns underperform by 3.6%. Dramatic. Except when we re-ran it on 100 different charts, the real effect was only 0.5%, and when we compared it against a fake placebo signal (dividend dates, which shouldn’t matter), the test didn’t clear the statistical bar. So the 3.6% was noise.
This is the most common way financial research lies — run it once on a case that looks compelling, declare victory. We made a habit of running it 100 times.
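The habit itself is simple enough to sketch. The measurement functions and the pass/fail bar below are invented placeholders; only the procedure (average over many anchors, then compare against a placebo) mirrors what the text describes:

```python
# "Run it 100 times, then check it beats a signal that shouldn't matter."
from statistics import mean

def robust_effect(charts, measure):
    """Average a measured effect over many anchor charts, not one."""
    return mean(measure(c) for c in charts)

def passes_placebo(charts, measure, placebo_measure):
    """Require the real signal to clearly beat a placebo signal
    (e.g. dividend dates) before believing it."""
    real = robust_effect(charts, measure)
    fake = robust_effect(charts, placebo_measure)
    return abs(real) > 2 * abs(fake)   # illustrative bar, not our real test

charts = list(range(100))              # stand-ins for 100 anchor charts
many = robust_effect(charts, lambda c: 0.5)   # the honest average effect
```

One dramatic single-chart result (the 3.6%) never gets quoted on its own; only the 100-anchor average, and only if it clears the placebo.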
Our own confidence bands were wrong
When we first shipped the “80% confidence” band, we later tested it and found the real coverage was 68%. That’s a big miss — we were overconfident by 12 percentage points. We shipped a fix (section 3 above) and published the before/after. The audit is public. The fix is public.
5. What we keep private, and why
This page exists so a professional evaluating us can trust that we’re doing the science right — without giving a competitor enough to copy it. A hospital can explain its rigorous trial protocol without leaking the drug formula.
We don’t publish: the exact model we use for fingerprints, how long the fingerprints are, what database trick we use to search them fast, our exact numerical corrections, how we generate practice data for training, or anything about the next version we’re building. A paying enterprise customer under NDA gets a deeper look.
6. Where we're weaker, stated out loud
We haven't held out 2020 cleanly
When we claim “here’s how this pattern performed in 2020,” part of our model was trained on data that included 2020. So it’s not a clean out-of-sample test for that specific period. We have a setting (as_of) that pretends today is an earlier date for the search step, but we can’t retroactively un-train the fingerprinter. If you’re evaluating us on a specific era, ask which era is cleanly held out.
Our live track record is only ~2 years long
Truly crazy market environments (2008, March 2020) show up so rarely in 2 years that we can’t tell you how the system performs in them. Don’t size off our numbers during a genuine panic — we simply don’t have the data yet.
The '80% means 80%' guarantee has fine print
It assumes the near future roughly resembles the near past. If the world changes very suddenly (COVID day 1), our bands will temporarily be wrong and we won’t know until after. We periodically re-fit to stay current, but there’s a lag.
Small slices in decomposition are exploration, not signal
Remember the slicing by sector, market cap, and catalyst in the Decomposition step of section 1? If any single slice has fewer than 30 examples, don’t trust the number. Look at those slices for questions to investigate, not for answers to trade on.
Chart Library is a search engine for chart patterns. You hand it a chart; it hands you the 10 most similar charts in history and what happened after them. Most “pattern-based” systems quietly cheat in four places — they ignore the stocks that went bankrupt, they test on data they already trained on, they quote a tight confidence range when the real range is wider, and they hide the experiments that didn’t work. This page is our promise, in writing, that we do none of those four things — and it lists the specific times we caught ourselves slipping and fixed it.