How Cohort builds its models

Every projection, symptom forecast, and cohort comparison in Cohort rests on the four input categories below.

Publicly available clinical trial data

The bedrock layer. We use the registered, peer-reviewed randomized controlled trial results for every medication Cohort tracks. For GLP-1s that means STEP and STEP-HFpEF (semaglutide, NEJM 2021–2024), SURMOUNT-1 through SURMOUNT-5 (tirzepatide, NEJM 2022–2024), the SURPASS program (tirzepatide in type 2 diabetes), and the TRIUMPH program for retatrutide. All of it is publicly registered on ClinicalTrials.gov and published in journals you can read on PubMed.

Trial data tells us the mean outcome at the population level: roughly 15% body-weight loss at week 68 on semaglutide and roughly 22% at week 72 on tirzepatide. Those means are the anchor for every projection in Cohort.

Peer-reviewed scientific journals

Beyond the headline trials we read the secondary literature: mechanism papers, side-effect catalogs from independent cohort studies, body-composition substudies (the DEXA arm of STEP-3), maintenance and post-cessation analyses, lab-marker trajectories under GLP-1 therapy, and the growing literature on protein floors plus resistance training during caloric deficit. Sources include NEJM, JAMA, Diabetologia, Obesity, Cell Metabolism, Lancet Diabetes & Endocrinology — anything indexed in PubMed where the full text is publicly available or available through standard institutional access.

Anonymized user data

When Cohort users log a shot, a weight, a symptom, or a lab result, that data feeds the per-cohort distributions we show back to other members at the same phase and protocol. We never expose identifying information — your weight, your symptoms, your labs go into the cohort aggregate, not into anyone's feed. The cohort distribution is the point; individuals are not.

Users can export their own data and delete their account at any time. Deleting an account removes the link between your identity and your logged data. Contributions that were already folded into the anonymized cohort aggregate remain in the model, but they cannot be traced back to you.
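To make the idea of a cohort aggregate concrete, here is a minimal sketch of how logged values could be rolled up by protocol and phase so that only summary statistics leave the function. It is an illustration, not Cohort's actual pipeline: the function name, the tuple layout, the protocol labels, and the `MIN_COHORT_SIZE` threshold are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import quantiles

# Hypothetical threshold: groups smaller than this are not reported,
# so a summary can never describe just one or two people.
MIN_COHORT_SIZE = 5


def cohort_distributions(logs, min_size=MIN_COHORT_SIZE):
    """Summarize logs per (protocol, week) without returning user ids.

    logs: iterable of (user_id, protocol, week, pct_weight_change).
    The tuple layout is an assumption made for this sketch.
    """
    groups = defaultdict(dict)
    for user_id, protocol, week, value in logs:
        # Keep one value per user per group so each member counts once.
        groups[(protocol, week)][user_id] = value

    summaries = {}
    for key, by_user in groups.items():
        values = sorted(by_user.values())
        if len(values) < min_size:
            continue  # too few members to show safely
        p25, median, p75 = quantiles(values, n=4)
        summaries[key] = {
            "n": len(values),
            "p25": p25,
            "median": median,
            "p75": p75,
        }
    return summaries
```

The user id is used only to de-duplicate entries inside the function. The returned dictionary holds counts and quartiles per cohort, which is the shape of data the text describes as being shown back to other members.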

Publicly available community discussion

Peptide and GLP-1 communities have spent the last several years documenting their own experience — symptom timelines, dose-titration outcomes, off-ramp protocols, lab-marker patterns. Where this discussion is public, we read it as a qualitative complement to the trial data. The trials tell us the mean; the public conversation tells us the variance, the tail behaviors, and the questions researchers haven't studied yet (such as the post-cessation regain-risk window, or the “Thursday trough” wearing-off pattern that doesn't make it into a primary endpoint).

We never publish identifying information from these sources. We paraphrase rather than quote directly; usernames are never reproduced; topics are aggregated into category-level insights, not individual stories.

How these come together

Trial data sets the central estimate. Journals refine the variance and the mechanism. Anonymized user data updates the model with what's happening this week in this specific cohort. Public discussion fills the long tail — the edge cases trials don't enroll for.

Four inputs converge into a single cohort distribution. The central estimate comes from trials; the variance from journals; the weekly update from your cohort; the long-tail edges from public discussion.
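One way to picture a trial anchor being updated by weekly cohort logs is a precision-weighted (normal-normal) update. The sketch below is an assumption chosen for illustration, not a description of Cohort's actual model: the `Estimate` class, the `update_with_cohort` function, and every spread value used with it are hypothetical. The prior mean stands in for the trial result, the prior spread for what the journal literature suggests, and the observations for this week's cohort logs. The qualitative long-tail input from public discussion is not modeled here.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass(frozen=True)
class Estimate:
    mean: float  # e.g. percent body-weight change
    sd: float    # uncertainty about that mean


def update_with_cohort(prior, observations, obs_sd):
    """Combine a trial-based prior with cohort observations.

    Standard normal-normal conjugate update with a known
    per-observation spread (obs_sd). All inputs are illustrative.
    """
    if not observations:
        return prior  # nothing logged this week: keep the trial anchor

    n = len(observations)
    prior_precision = 1.0 / prior.sd ** 2
    data_precision = n / obs_sd ** 2

    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (
        prior_precision * prior.mean + data_precision * fmean(observations)
    )
    return Estimate(mean=post_mean, sd=post_var ** 0.5)
```

For example, starting from `Estimate(mean=-15.0, sd=2.0)` and adding four made-up logs of -13.0 with `obs_sd=4.0`, the prior and the data carry equal weight, so the updated mean lands at -14.0 and the spread narrows. With no logs, the trial anchor is returned unchanged.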

Every published synthesis is reviewed by a human before it ships. No AI output is published blind. No marketing copy invents data that the sources don't support.

What we won't do

Publish identifiable user content. We aggregate, paraphrase, and protect instead.
Sell user data or derived analyses. Cohort+ subscriptions are how the company makes money — never data resale.
Make medical claims or replace prescribers. We synthesize what the data show. Dose changes and clinical decisions belong with your medical team.
Trust AI output blindly. Every published synthesis has been read by a human before it ships.