Cohort Analysis: How to Read a Retention Table Like a Pro
A cohort table is the single most information-dense artifact in SaaS analytics — and the most commonly misread. Once you know the three directions to read it in, it tells you whether your product actually works, whether it's getting better, and whether your growth is real or recycled.
What a cohort is, precisely
A cohort is a group of users who share a starting event in the same time window — most commonly "signed up in the same month." You then track what fraction of each cohort is still active 1, 2, 3… months after that start.
The result is the classic retention triangle (numbers illustrative):
| Cohort | M0 | M1 | M2 | M3 | M4 | M5 |
|---|---|---|---|---|---|---|
| Jan | 100% | 42% | 31% | 27% | 25% | 24% |
| Feb | 100% | 45% | 33% | 29% | 27% | |
| Mar | 100% | 44% | 34% | 30% | | |
| Apr | 100% | 51% | 39% | | | |
| May | 100% | 53% | | | | |
Each row is a cohort's life story. The triangle shape is just time: May's cohort has only reached M1, so its later cells do not exist yet.
Direction 1 — read across the row: does the product work?
Follow one row left to right. The question: does the curve flatten?
- Flattens above zero (Jan: 42 → 31 → 27 → 25 → 24): a durable core exists. Whatever percentage it flattens at is your product's honest retention floor.
- Keeps sliding toward zero: users are churning at every age. No amount of acquisition fixes this; growth just refills a leaking bucket.
The flattening level matters more than early retention. Illustratively, a product that flattens at 25% by month 4 has a real business hiding inside; one that reads 60 → 40 → 25 → 14 → 8 does not, despite the better-looking first month.
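One way to make "does the curve flatten?" concrete is to look at month-over-month ratios along a row. The sketch below is only an illustration: `step_ratios` is a hypothetical helper name, and the two rows are the illustrative curves quoted in the text. Ratios climbing toward 1.0 indicate flattening; ratios stuck well below 1.0 indicate a curve still sliding.

```python
def step_ratios(row):
    """Ratio of each month's retention to the previous month's (M2/M1, M3/M2, ...)."""
    return [round(later / earlier, 3) for earlier, later in zip(row, row[1:])]


# Retention in percent from M1 onward. M0 is skipped because it is 100% by construction.
jan = [42, 31, 27, 25, 24]      # the Jan row from the table above
sliding = [60, 40, 25, 14, 8]   # the counterexample curve from the text

print(step_ratios(jan))      # [0.738, 0.871, 0.926, 0.96]
print(step_ratios(sliding))  # [0.667, 0.625, 0.56, 0.571]
```

In the Jan row each step loses a smaller share than the one before, so the curve is settling. In the second row roughly 40% of the remaining users leave at every step, despite the stronger first month.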
Direction 2 — read down a column: are we improving?
Compare the same age across cohorts — the M1 column above reads 42, 45, 44, 51, 53. That climb from 42% to 53% says onboarding or targeting genuinely improved between January and May.
This is the honest scoreboard for product work. Blended retention ("34% of all users active this month") mixes old and new cohorts and moves for a dozen confounded reasons; the column comparison isolates cohort quality at equal age.
Annotate your changes: if the new onboarding shipped in April, the Apr/May jump in M1 is your evidence it worked. That is far cheaper than an experiment, but it only holds if you're disciplined about not shipping five other things at once.
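The column read can be sketched in a few lines, assuming the triangle is stored as one list per cohort with index k holding month k. The `table` dict and the `column` and `mean` helpers are hypothetical names for this illustration; the numbers are the illustrative ones from the table above, and the April ship date is the hypothetical from the text.

```python
# Retention in percent by cohort; index k holds month k (M0, M1, ...).
table = {
    "Jan": [100, 42, 31, 27, 25, 24],
    "Feb": [100, 45, 33, 29, 27],
    "Mar": [100, 44, 34, 30],
    "Apr": [100, 51, 39],
    "May": [100, 53],
}


def column(table, age):
    """Retention at a given age for every cohort old enough to have reached it."""
    return {cohort: row[age] for cohort, row in table.items() if len(row) > age}


def mean(values):
    return sum(values) / len(values)


m1 = column(table, 1)
print(m1)  # {'Jan': 42, 'Feb': 45, 'Mar': 44, 'Apr': 51, 'May': 53}

# Compare cohorts before and after the hypothetical April onboarding change.
before = mean([m1[c] for c in ("Jan", "Feb", "Mar")])
after = mean([m1[c] for c in ("Apr", "May")])
print(round(before, 1), round(after, 1))  # 43.7 52.0
```

Because `column` skips cohorts that have not reached the requested age, asking for M3 returns only Jan, Feb, and Mar rather than padding the younger cohorts with zeros.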
Direction 3 — read the diagonal: what happened that month?
A diagonal slice (Jan's M4, Feb's M3, Mar's M2, Apr's M1) is everyone's behavior during the same calendar month. If a whole diagonal dips, something happened in the world or in your product that month — an outage, a pricing change, seasonality — rather than something about any particular cohort.
This is the direction people forget, and it's the antidote to false alarms: a bad diagonal masquerades as several cohorts "getting worse" simultaneously.
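Extracting a diagonal is an index exercise: with calendar-month cohorts, cohort number i at age k was observed in calendar month i + k. The sketch below assumes that convention and the same list-per-cohort layout; `diagonal` and `calendar_index` are hypothetical names, and the numbers are the illustrative ones from the table above.

```python
cohorts = ["Jan", "Feb", "Mar", "Apr", "May"]  # in signup order
table = {
    "Jan": [100, 42, 31, 27, 25, 24],
    "Feb": [100, 45, 33, 29, 27],
    "Mar": [100, 44, 34, 30],
    "Apr": [100, 51, 39],
    "May": [100, 53],
}


def diagonal(table, cohorts, calendar_index):
    """Cells observed in one calendar month: cohort i at age calendar_index - i."""
    cells = {}
    for i, cohort in enumerate(cohorts):
        age = calendar_index - i
        if 0 <= age < len(table[cohort]):
            cells[f"{cohort} M{age}"] = table[cohort][age]
    return cells


# Calendar index 4 is May (Jan = 0), the slice named in the text.
print(diagonal(table, cohorts, 4))
# {'Jan M4': 25, 'Feb M3': 29, 'Mar M2': 34, 'Apr M1': 51, 'May M0': 100}
```

To spot a dip, compare each cell on the diagonal with its neighbors in the same column. A shortfall that appears in every column at once points to the calendar month rather than to any one cohort.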
Choosing your definitions (where analyses quietly diverge)
Two teams can compute wildly different tables from the same data. Pin down:
- The activity definition. "Logged in" is weak; "performed a core action" is honest. Pick the action closest to value delivered.
- The window convention. Calendar months vs. 30-day windows from signup date. The latter is cleaner: a Jan-28 signup gets a full 30 days in its first period, not the last few days of January.
- Bounded vs. unbounded. "Active in month 3" vs. "active in month 3 or later." Standard retention triangles use the bounded version.
- The population. All signups, or activated signups only? Both are useful — just label them. Retention of activated users isolates product stickiness from onboarding quality.
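Two of these definitions are easy to see in code. The sketch below is illustrative only: the function names, the dates, and the `active_ages` data are all hypothetical. It shows how the same activity event lands in a different period under each window convention, and how bounded and unbounded retention differ for the same users.

```python
from datetime import date


def calendar_month_age(signup, activity):
    """Age in calendar months: all January signups share the same month boundaries."""
    return (activity.year - signup.year) * 12 + (activity.month - signup.month)


def window_age(signup, activity, window_days=30):
    """Age in fixed-length windows counted from the user's own signup date."""
    return (activity - signup).days // window_days


signup = date(2024, 1, 28)
activity = date(2024, 2, 2)   # five days after signup
print(calendar_month_age(signup, activity))  # 1
print(window_age(signup, activity))          # 0

# Hypothetical data: the set of ages at which each user was active.
active_ages = {"u1": {0, 1, 2, 3}, "u2": {0, 1}, "u3": {0, 3}, "u4": {0}}


def bounded(active_ages, age):
    """Share of users active during that age itself."""
    return sum(age in ages for ages in active_ages.values()) / len(active_ages)


def unbounded(active_ages, age):
    """Share of users active at that age or any later one."""
    return sum(max(ages) >= age for ages in active_ages.values()) / len(active_ages)


print(bounded(active_ages, 2))    # 0.25 (only u1)
print(unbounded(active_ages, 2))  # 0.5  (u1, plus u3 who returned at age 3)
```

The gap between 0.25 and 0.5 comes entirely from u3, who skipped age 2 and came back later. That is why a table should state which version it reports.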
Revenue cohorts: the second table to build
Repeat the exercise with MRR instead of user counts: of the MRR a cohort started with, how much remains (including expansion) at each age? This is cohort-level net revenue retention, and it can flatten above 100% if expansion outpaces churn — the signature of the best SaaS businesses. Illustratively, a table where user retention flattens at 30% but revenue retention flattens at 85% tells you your survivors upgrade heavily.
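As a rough sketch of the revenue version, each cell is the cohort's current MRR divided by the MRR it started with. The dollar figures below are made up for illustration and are not from any real dataset; `net_revenue_retention` is a hypothetical helper name.

```python
# Hypothetical MRR in dollars for one cohort, indexed by months since signup.
# It includes expansion from surviving customers, net of churned revenue.
cohort_mrr = [10_000, 9_200, 8_900, 9_400, 10_300]


def net_revenue_retention(mrr_by_age):
    """Percent of the cohort's starting MRR that remains at each age."""
    start = mrr_by_age[0]
    return [round(100 * mrr / start, 1) for mrr in mrr_by_age]


print(net_revenue_retention(cohort_mrr))  # [100.0, 92.0, 89.0, 94.0, 103.0]
```

In this made-up row the cohort dips to 89% and then climbs past 100%, which is the "expansion outpaces churn" pattern described above.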
Beyond retention: cohort thinking as a general tool
Once the triangle clicks, apply the same structure to any metric where time-since-start matters:
- Cumulative revenue per cohort. How much has the average January signup paid you by month 6? Divide by acquisition cost per user in January and you have honest, cohort-level payback — far more trustworthy than blended LTV formulas.
- Feature adoption by cohort. Do March signups adopt your key feature faster than January's? This is how you verify onboarding changes actually changed behavior.
- Channel cohorts. Same triangle, but cohorted by acquisition channel instead of month. Illustratively: paid-social cohorts flattening at 12% while organic-search cohorts flatten at 34% is a budget-reallocation memo written in numbers.
- Time-to-value distribution. For each cohort, the share reaching activation within 1, 7, 14 days. A compressing distribution across cohorts is onboarding progress made visible.
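The cohort-level payback idea from the first bullet can be sketched as a running total. All figures here (revenue, cohort size, acquisition cost) are hypothetical, as are the function names; the point is only the shape of the calculation.

```python
from itertools import accumulate

# Hypothetical figures for one monthly cohort.
monthly_revenue = [4_000, 3_500, 3_200, 3_000, 2_900, 2_800]  # by months since signup
cohort_size = 200       # signups in the cohort
cac_per_user = 60.0     # acquisition cost per signup in that month


def cumulative_revenue_per_user(monthly_revenue, cohort_size):
    """Running total of revenue per original signup at each age."""
    return [total / cohort_size for total in accumulate(monthly_revenue)]


def payback_age(monthly_revenue, cohort_size, cac_per_user):
    """First age at which cumulative revenue per signup covers CAC, else None."""
    per_user = cumulative_revenue_per_user(monthly_revenue, cohort_size)
    for age, value in enumerate(per_user):
        if value >= cac_per_user:
            return age
    return None


print(cumulative_revenue_per_user(monthly_revenue, cohort_size))
# [20.0, 37.5, 53.5, 68.5, 83.0, 97.0]
print(payback_age(monthly_revenue, cohort_size, cac_per_user))  # 3
```

Dividing by the original cohort size, not by the survivors, is the deliberate choice here: churned users still cost money to acquire.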
The mental move is always the same: never compare users of different ages as if they were the same population; align everyone on their own clock.
The traps
- Small cohorts. In a 20-signup cohort each user is worth 5 points, so three users move retention by 15 points. Below ~100 users per cohort, group by quarter instead of month.
- The doomed last cell. The newest cell of every row often covers a period that is still in progress. Exclude in-progress windows or your latest numbers will always look catastrophic.
- Survivorship euphoria. "Users active at month 6 love us!" Of course — the ones who didn't leave. Never generalize from survivors to signups.
- Mixed populations. One enterprise pilot inside an SMB cohort distorts the whole revenue row. Segment big outliers out.
- Averaging across cohorts. The "average retention curve" hides the column-wise improvement that is the entire point.
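The small-cohort trap can be quantified with a back-of-the-envelope noise estimate. The sketch below uses the binomial standard error, which assumes users churn independently of one another; that is a simplifying assumption, and the 40% rate is an arbitrary illustrative value. Function names are hypothetical.

```python
import math


def points_per_user(cohort_size):
    """How many percentage points one user is worth in a cohort of this size."""
    return 100 / cohort_size


def standard_error_points(rate, cohort_size):
    """Binomial standard error of a retention rate, in percentage points."""
    return 100 * math.sqrt(rate * (1 - rate) / cohort_size)


for n in (20, 100, 500):
    print(n, points_per_user(n), round(standard_error_points(0.4, n), 1))
# 20 5.0 11.0
# 100 1.0 4.9
# 500 0.2 2.2
```

At 20 signups the noise is about as large as the month-to-month differences you are trying to read, which is the case for pooling small cohorts into quarters.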
Building the table without a data team
You don't need infrastructure for this. From a simple export of users (signup date) and activity events (user, date), a spreadsheet pivot gets you there: bucket signups by month, bucket activity by months-since-signup, count distinct users per cell, divide each row by its cohort size. Twenty minutes the first time, five thereafter. For revenue cohorts, the same construction runs off your billing export with MRR summed per cell instead of users counted. The point of doing it by hand at least once: you'll internalize exactly what each cell means — which makes you immune to the most common tool-induced confusion, unbounded versus bounded retention silently mislabeled.
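If a script is more convenient than a spreadsheet, the same four steps can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: calendar-month cohorts, bounded retention, and signing up counting as month-0 activity so that M0 is 100%. The `signups` and `events` data and all function names are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical exports: user -> signup date, and (user, activity date) events.
signups = {
    "u1": date(2024, 1, 5), "u2": date(2024, 1, 20),
    "u3": date(2024, 2, 3), "u4": date(2024, 2, 14),
}
events = [
    ("u1", date(2024, 2, 10)), ("u1", date(2024, 3, 1)),
    ("u2", date(2024, 1, 21)),
    ("u3", date(2024, 3, 9)),
]


def month_key(d):
    return (d.year, d.month)


def months_between(start, end):
    return (end.year - start.year) * 12 + (end.month - start.month)


def retention_table(signups, events):
    """Return {cohort month: {age: percent of the cohort active at that age}}."""
    cohort_users = defaultdict(set)   # cohort -> all signups
    active = defaultdict(set)         # (cohort, age) -> distinct active users
    for user, signed_up in signups.items():
        cohort = month_key(signed_up)
        cohort_users[cohort].add(user)
        active[(cohort, 0)].add(user)  # assumption: signup counts as M0 activity
    for user, day in events:
        if user not in signups:
            continue                   # ignore events from unknown users
        age = months_between(signups[user], day)
        if age >= 0:
            active[(month_key(signups[user]), age)].add(user)
    table = defaultdict(dict)
    for (cohort, age), users in active.items():
        table[cohort][age] = round(100 * len(users) / len(cohort_users[cohort]))
    return dict(table)


table = retention_table(signups, events)
print(table[(2024, 1)])  # {0: 100, 1: 50, 2: 50}
print(table[(2024, 2)])  # {0: 100, 1: 50}
```

Using sets for each cell is what makes the count "distinct users": a user with ten events in one month still counts once.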
A 20-minute monthly ritual
1. Rebuild the table (user + revenue versions), monthly cohorts, core-action definition.
2. Row check: is the newest mature cohort flattening, and at what level?
3. Column check: are M1 and M3 improving across the last six cohorts?
4. Diagonal check: any calendar-month dip across everything?
5. Write three sentences: what improved, what didn't, one hypothesis to test.
Do this monthly and you'll have a better grip on product-market fit than most decks ever show.
The bottom line
Rows tell you whether the product works. Columns tell you whether you're improving. Diagonals tell you what happened. Definitions decide whether any of it is trustworthy. That's cohort analysis — everything else is formatting.
Growth Pilot builds these cohort views automatically from your connected data, with consistent definitions across user and revenue retention, so your monthly ritual starts at step two.