In clinical research, rather than checking efficacy once, we measure a patient at multiple time points (e.g., Baseline, Week 4, Week 8, Week 12). These endpoints over time are repeated measurements on the same patient. This design shows how efficacy evolves (improves, declines, or stays stable) instead of relying on a single snapshot.
Core problem – correlation. Observations from the same individual are naturally correlated; today’s value is related to last week’s value. Methods must account for this within‑subject dependence.
Analogy. A student’s test scores over a semester: a high score on Test 1 makes a high score on Test 2 more likely. We cannot treat the scores as independent.
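A minimal base‑R sketch makes the stakes concrete (toy numbers only; the correlation `rho` and the 0.5 “true change” are made up, not from any trial). With two correlated visits per patient, a paired analysis of the change has a smaller standard error than one that pretends the visits are independent:

```r
# Toy illustration: within-subject correlation and standard errors.
set.seed(1)
n <- 200; rho <- 0.7
subj <- rnorm(n)                                     # shared patient-level component
v1 <- sqrt(rho)*subj + sqrt(1 - rho)*rnorm(n)        # visit 1
v2 <- sqrt(rho)*subj + sqrt(1 - rho)*rnorm(n) + 0.5  # visit 2 (true change = 0.5)
cor(v1, v2)                                          # within-subject correlation, near rho
se_naive  <- sqrt(var(v1)/n + var(v2)/n)             # pretends visits are independent
se_paired <- sd(v2 - v1)/sqrt(n)                     # respects the pairing
c(naive = se_naive, paired = se_paired)              # paired SE is smaller
```

The same logic runs in the other direction too: ignoring positive correlation when comparing visits within a person overstates the SE here, while ignoring it for between‑person contrasts typically understates it. Either way, the correlation must be modeled, not ignored.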
We start with two plots:
if (!requireNamespace("ggplot2", quietly = TRUE)) install.packages("ggplot2", repos="https://cloud.r-project.org")
if (!requireNamespace("dplyr", quietly = TRUE)) install.packages("dplyr", repos="https://cloud.r-project.org")
library(ggplot2); library(dplyr)
set.seed(1001)
n <- 80
id <- factor(seq_len(n))
arm <- factor(sample(c("Placebo","Drug"), n, replace=TRUE))
vis <- c(0,4,8,12)
d1 <- expand.grid(id=id, visit=vis)
d1 <- dplyr::left_join(d1, data.frame(id=id, arm=arm), by="id")
# outcome with group difference growing over time
mu0 <- 10 + 0.1*d1$visit
delta <- 0.0 + 0.2*(d1$visit>=4) + 0.5*(d1$visit>=8) + 1.0*(d1$visit>=12)
eta <- mu0 + ifelse(d1$arm=="Drug", delta, 0)
d1$y <- rnorm(nrow(d1), eta, sd=1.5)
# spaghetti
p_sp <- ggplot(d1, aes(visit, y, group=id, color=arm)) +
geom_line(alpha=.25) +
labs(title="Individual trajectories (spaghetti)", x="Visit (week)", y="Outcome")
# mean +/- 95% CI
d1s <- d1 |>
group_by(arm, visit) |>
summarise(m=mean(y), se=sd(y)/sqrt(n()), .groups="drop")
p_mean <- ggplot(d1s, aes(visit, m, color=arm)) +
geom_line() + geom_point() +
geom_errorbar(aes(ymin=m-1.96*se, ymax=m+1.96*se), width=.4) +
labs(title="Mean profiles with 95% CI", x="Visit (week)", y="Adjusted mean (approx.)")
p_sp; p_mean
Spaghetti (individual trajectories).
- Each line is one patient over time. Most lines trend upward in both groups, but the Drug lines tend to rise more steeply after mid‑study.
- There is meaningful patient‑to‑patient variability (lines are not perfectly parallel), which is normal and motivates a model that accounts for within‑person correlation.
- No single extreme outlier dominates; a few atypical slopes exist but do not drive the overall pattern.
Mean profiles with 95% CI (group averages).
- Baseline means are similar between groups, as expected under randomization.
- By Week 4 there is little separation.
- At Week 8, the Drug mean begins to pull above Placebo; the error bars overlap less.
- By Week 12, Drug is clearly higher and the 95% CIs show minimal overlap, suggesting a clinically meaningful difference.
- The widening gap indicates a time‑dependent treatment effect rather than an immediate one.
Implications for modeling.
- Include a treatment × visit interaction (the effect changes over time).
- Because repeated measures are correlated, use MMRM (comparing AR(1) vs UN covariance via AIC) for valid SEs and CIs.
Exploratory plots show broadly upward trajectories in both arms, with a steeper increase for Drug from Week 8 onward. Group mean profiles diverge progressively, with the largest separation at Week 12. These patterns motivate a model with a treatment‑by‑visit interaction and an appropriate within‑subject correlation structure (evaluated via AR(1) vs unstructured in the MMRM analysis).
Goal. Estimate group differences by visit using all available data, acknowledging within‑subject correlation.
Model. \[ Y_{ij} = X_{ij}\beta + \varepsilon_{ij}, \qquad \varepsilon_i \sim \mathcal{N}(0, \Sigma). \]
- \(Y_{ij}\): response of subject \(i\) at visit \(j\).
- \(X_{ij}\beta\): fixed‑effects mean (treatment, visit, treatment×visit, optional baseline).
- \(\Sigma\): within‑subject covariance across visits (captures correlation).
- Estimation via (RE)ML; valid under Missing At Random (MAR) if the mean/covariance are correctly specified.
Common covariance structures (what they mean).
- UN (Unstructured): each visit has its own variance; each visit pair has its own covariance. Most flexible; most parameters.
- AR(1): correlation decays with time lag (nearby visits are more alike); assumes regular spacing.
- CS (Compound Symmetry): all visit pairs share the same correlation; simplest; may be too rigid.
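A quick base‑R illustration of what AR(1) and CS imply for four visits (the value of `rho` is made up for display, not a fitted estimate); UN simply leaves every entry free:

```r
# Working correlation patterns for K = 4 visits (illustrative rho, not fitted).
rho <- 0.6
K <- 4
ar1 <- rho^abs(outer(1:K, 1:K, "-"))       # correlation decays with lag: rho, rho^2, rho^3
cs  <- matrix(rho, K, K); diag(cs) <- 1    # constant off-diagonal correlation
round(ar1, 2)
cs
# UN has no pattern: all K*(K-1)/2 = 6 pairwise correlations
# (plus K = 4 variances) are free parameters estimated from the data.
```

Comparing the printed matrices shows the trade‑off directly: AR(1) spends one parameter and forces distant visits to be nearly uncorrelated, CS spends one parameter and forces them all equal, and UN spends ten but imposes nothing.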
Choosing a structure. Compare AIC/BIC and consider scientific plausibility (do correlations decay with lag?).
Super Simplified Model (MMRM):
\[\text{Measured outcome at each visit} \approx \underbrace{\text{Average trend by group and visit}}_{\text{fixed effects }X\beta} \;+\; \underbrace{\text{individual deviation/noise}}_{\varepsilon_{ij}\ \text{with correlation }\Sigma}\]
The key is that the deviation part is filtered with a covariance \(\Sigma\) so that repeated measures on the same person are not treated as independent.
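In matrix form, this filtering is just generalized least squares (a standard result, stated here for intuition). Stacking subject \(i\)’s visits into \(Y_i\) and \(X_i\), the fixed effects and their variance weight each subject’s data by \(\Sigma^{-1}\):

\[ \hat{\beta} = \Big(\sum_i X_i^\top \Sigma^{-1} X_i\Big)^{-1} \sum_i X_i^\top \Sigma^{-1} Y_i, \qquad \widehat{\operatorname{Var}}(\hat{\beta}) = \Big(\sum_i X_i^\top \Sigma^{-1} X_i\Big)^{-1}. \]

Setting \(\Sigma = \sigma^2 I\) recovers ordinary least squares; with positively correlated visits, that independence assumption typically understates the standard errors of between‑group contrasts.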
Mini example (plain language). Two patients on Drug and two on Placebo are measured at Baseline/Week 4/Week 8. MMRM estimates the adjusted mean in each arm at each visit (e.g., Drug at Week 8), and their difference (Drug − Placebo). Because repeated measures are correlated, MMRM models that correlation (UN/AR(1)/CS) so your SEs and CIs are correct.
We simulate a trial with dropout and fit MMRM using nlme::gls with UN and AR(1) covariance structures.
if (!requireNamespace("nlme", quietly = TRUE)) install.packages("nlme", repos="https://cloud.r-project.org")
if (!requireNamespace("emmeans", quietly = TRUE)) install.packages("emmeans", repos="https://cloud.r-project.org")
if (!requireNamespace("dplyr", quietly = TRUE)) install.packages("dplyr", repos="https://cloud.r-project.org")
if (!requireNamespace("ggplot2", quietly = TRUE)) install.packages("ggplot2", repos="https://cloud.r-project.org")
library(nlme); library(emmeans); library(dplyr); library(ggplot2)
set.seed(2003)
simulate_mmrm_data <- function(n=200, visits=c(0,4,8,12), dropout_prob=0.15) {
id <- factor(seq_len(n))
trt <- factor(sample(c("Placebo","Drug"), n, TRUE))
d <- expand.grid(id=id, visit=visits)
d <- dplyr::left_join(d, data.frame(id=id, trt=trt), by="id")
# mean structure: drug effect grows over time
base <- 20 + 0.3*d$visit
eff <- 0.0 + 0.3*(d$visit>=4) + 0.8*(d$visit>=8) + 1.2*(d$visit>=12)
mu <- base + ifelse(d$trt=="Drug", eff, 0)
# AR(1)-like individual noise
rho <- 0.6
d <- d |> dplyr::arrange(id, visit)
y <- numeric(nrow(d))
for (ii in levels(id)) {
idx <- which(d$id==ii)
e <- as.numeric(arima.sim(model=list(ar=rho), n=length(idx), sd=1))
y[idx] <- mu[idx] + e
}
d$y <- y
# simulate dropout: a subset of patients miss their last one or two visits
keep <- rep(TRUE, nrow(d))
byid <- split(seq_len(nrow(d)), d$id)
for (ii in names(byid)) {
if (runif(1) < dropout_prob) {
idx <- byid[[ii]]
k <- sample(1:2, 1)
drop_idx <- tail(idx, k)   # rows for the last k visits
keep[drop_idx] <- FALSE    # mark them as missing
}
}
d <- d[keep, , drop=FALSE]
d$visit_f <- factor(d$visit, levels=visits)
d <- d |> dplyr::arrange(id, visit)
d$visit_index <- as.numeric(d$visit_f)
d
}
dat_mmrm <- simulate_mmrm_data()
# UN (via corSymm + varIdent) and AR(1) fits
fit_un <- nlme::gls(
y ~ trt*visit_f,
data = dat_mmrm,
correlation = nlme::corSymm(form = ~ visit_index | id),
weights = nlme::varIdent(form = ~ 1 | visit_f),
method = "REML",
na.action = na.omit
)
fit_ar1 <- nlme::gls(
y ~ trt*visit_f,
data = dat_mmrm,
correlation = nlme::corAR1(form = ~ visit_index | id),
method = "REML",
na.action = na.omit
)
AIC(fit_un, fit_ar1)
# Visit-specific adjusted means and pairwise differences
emm <- emmeans::emmeans(fit_un, ~ trt | visit_f)
emm
## visit_f = 0:
## trt emmean SE df lower.CL upper.CL
## Drug 22.1 0.229 181 21.7 22.6
## Placebo 22.4 0.229 181 21.9 22.8
##
## visit_f = 4:
## trt emmean SE df lower.CL upper.CL
## Drug 22.2 0.222 196 21.8 22.7
## Placebo 22.4 0.222 196 22.0 22.9
##
## visit_f = 8:
## trt emmean SE df lower.CL upper.CL
## Drug 22.2 0.222 191 21.7 22.6
## Placebo 22.4 0.221 189 21.9 22.8
##
## visit_f = 12:
## trt emmean SE df lower.CL upper.CL
## Drug 22.5 0.249 199 22.0 22.9
## Placebo 22.4 0.244 191 21.9 22.9
##
## Degrees-of-freedom method: appx-satterthwaite
## Confidence level used: 0.95
contrast(emm, method="pairwise")
## visit_f = 0:
## contrast estimate SE df t.ratio p.value
## Drug - Placebo -0.2282 0.324 181 -0.704 0.4826
##
## visit_f = 4:
## contrast estimate SE df t.ratio p.value
## Drug - Placebo -0.1810 0.314 196 -0.576 0.5655
##
## visit_f = 8:
## contrast estimate SE df t.ratio p.value
## Drug - Placebo -0.2108 0.314 190 -0.672 0.5026
##
## visit_f = 12:
## contrast estimate SE df t.ratio p.value
## Drug - Placebo 0.0802 0.348 195 0.230 0.8181
##
## Degrees-of-freedom method: appx-satterthwaite
# Plot LS-means with 95% CI
emm_df <- as.data.frame(emm)
ggplot(emm_df, aes(as.numeric(visit_f), emmean, color=trt)) +
geom_line() + geom_point() +
geom_errorbar(aes(ymin=lower.CL, ymax=upper.CL), width=.3) +
scale_x_continuous(breaks=seq_along(levels(dat_mmrm$visit_f)), labels=levels(dat_mmrm$visit_f)) +
labs(title="MMRM: Adjusted means by visit (95% CI)", x="Visit", y="Adjusted mean")
The table of estimated marginal means (emmeans) reports the model‑adjusted mean for each treatment at each visit, with standard errors and 95% confidence intervals. These estimates are adjusted for the model covariates and use all available data under the MAR assumption.
Plain-English interpretation of the example output:
- Baseline means are similar (≈22).
- Means remain close through Week 8.
- Drug ticks up slightly by Week 12, but the confidence intervals still overlap substantially → any difference is small.
The contrast(emm, method="pairwise") table gives Drug − Placebo at each visit (Δ), with SE, df, t, and p.
Plain-English interpretation of the example contrasts:
- All Δ are small in magnitude (|Δ| < 0.3) and not statistically significant (all p > 0.48).
- The sign turns slightly positive at Week 12, consistent with the tiny separation in the means, but still not meaningful.
The LS‑means plot visualizes adjusted mean profiles with 95% CIs by visit.
The mixed model for repeated measures (MMRM) with an unstructured covariance matrix showed no evidence of a treatment-by-visit interaction (p > 0.4). Adjusted group means were similar at all visits (≈22 units), and visit-specific contrasts (Drug − Placebo) were small (|Δ| < 0.3) with wide overlap in 95% confidence intervals (all p > 0.48). The LS‑means profiles mirrored these results, showing closely tracking trajectories without meaningful divergence. These conclusions were obtained while accounting for within-subject correlation and using all available data under the missing-at-random assumption.
\[\text{Outcome at visit} \approx \underbrace{\text{Average trend by group and visit}}_{\text{fixed effects}} \;+\; \underbrace{\text{individual deviation}}_{\varepsilon_{ij}\ \text{correlated within subject via } \Sigma}\] Idea: the deviation term is filtered by a covariance matrix \(\Sigma\), so repeated measures on the same person are not treated as independent, giving correct SEs and CIs for by-visit comparisons.
Goal. Test whether the mean response differs across visits and by treatment in balanced designs. Model idea. Treat visit as a within-subject factor; this requires sphericity (equal variances of all pairwise visit differences). Violations inflate type I error; apply Greenhouse–Geisser or Huynh–Feldt corrections.
Super Simplified Model (RM‑ANOVA): \[\text{Outcome} \approx \text{Group} + \text{Visit} + \text{Group}\times\text{Visit} + \text{within‑person error (sphericity)}\]
if (!requireNamespace("dplyr", quietly = TRUE)) install.packages("dplyr", repos="https://cloud.r-project.org")
if (!requireNamespace("tidyr", quietly = TRUE)) install.packages("tidyr", repos="https://cloud.r-project.org")
library(dplyr); library(tidyr)
set.seed(3005)
simulate_balanced <- function(n=60, visits=c(0,4,8,12)) {
id <- factor(seq_len(n))
trt <- factor(rep(c("Placebo","Drug"), length.out=n))
d <- expand.grid(id=id, visit=visits)
d <- dplyr::left_join(d, data.frame(id=id, trt=trt), by="id")
mu <- 15 + 0.4*d$visit + ifelse(d$trt=="Drug", 0.8*(d$visit>=8)+1.2*(d$visit>=12), 0)
d$y <- rnorm(nrow(d), mu, 1.2)
d$visit_f <- factor(d$visit, levels=visits)
d
}
dat_bal <- simulate_balanced()
rm_aov <- aov(y ~ trt*visit_f + Error(id/visit_f), data=dat_bal)
summary(rm_aov)
##
## Error: id
## Df Sum Sq Mean Sq F value Pr(>F)
## trt 1 37.07 37.07 18.93 5.57e-05 ***
## Residuals 58 113.59 1.96
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Error: id:visit_f
## Df Sum Sq Mean Sq F value Pr(>F)
## visit_f 3 1076.8 358.9 258.69 < 2e-16 ***
## trt:visit_f 3 43.8 14.6 10.52 2.15e-06 ***
## Residuals 174 241.4 1.4
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Two error terms:
- Error: id — between‑subject (tests the treatment main effect averaged over visits).
- Error: id:visit_f — within‑subject (tests the visit main effect and the treatment×visit interaction).
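Sphericity can be checked in base R via the multivariate-model route (Mauchly’s test), and the same machinery reports Greenhouse–Geisser/Huynh–Feldt corrections. A self-contained sketch on its own small simulated balanced dataset (wide format, mirroring the design above but independent of dat_bal):

```r
# Sketch: testing sphericity and applying GG/HF corrections in base R.
set.seed(42)
n <- 40
trt <- factor(rep(c("Placebo","Drug"), each = n/2))
Y <- matrix(rnorm(n*4, mean = rep(c(0, 1, 2, 3), each = n)), n, 4,
            dimnames = list(NULL, paste0("wk", c(0, 4, 8, 12))))
mlm <- lm(Y ~ trt)                            # multivariate linear model, one row per subject
idata <- data.frame(visit = factor(colnames(Y), levels = colnames(Y)))
mauchly.test(mlm, X = ~ 1, idata = idata)     # H0: sphericity of the visit contrasts holds
anova(mlm, X = ~ 1, idata = idata, test = "Spherical")  # reports GG/HF epsilons and corrected p
```

In practice, if Mauchly’s test rejects (or the epsilons are well below 1), report the GG‑corrected p‑values rather than the uncorrected RM‑ANOVA results, or move to MMRM/LMM, which do not require sphericity.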
if (!requireNamespace("ggplot2", quietly = TRUE)) install.packages("ggplot2", repos="https://cloud.r-project.org")
library(ggplot2)
ggplot(dat_bal, aes(visit_f, y, color=trt, group=trt)) +
stat_summary(fun = mean, geom = "line") +
# mean_se matches the "± SE" title and avoids the Hmisc dependency of mean_cl_normal
stat_summary(fun.data = mean_se, geom = "errorbar", width = 0.2) +
labs(title="RM-ANOVA: mean profiles ± SE", x="Visit", y="Outcome")
RM‑ANOVA revealed significant effects of treatment, visit, and their interaction, indicating that outcomes changed over time and that Drug’s trajectory differed from Placebo’s. Mean profiles showed a steeper improvement for Drug, consistent with the interaction. Because RM‑ANOVA assumes sphericity and complete, balanced data, results should be corroborated with MMRM or LMM when data are missing or visits are unequally spaced.
\[ Y_{ij} = \mu + \alpha_{\text{trt}} + \beta_{\text{visit}} + (\alpha\beta)_{\text{trt}:\text{visit}} + \varepsilon_{ij} \] with within‑subject errors satisfying sphericity (equal variances of differences across visits).
Goal. Describe individual trajectories via random effects.
Model. \[ Y_{ij} = X_{ij}\beta + Z_{ij}b_i + \varepsilon_{ij}, \qquad b_i\sim\mathcal{N}(0,D),\; \varepsilon_{ij}\sim\mathcal{N}(0,\sigma^2). \]
- \(\beta\): population‑level (fixed) effects.
- \(b_i\): subject‑specific random effects (e.g., random intercept/slope).
- Focus on variance components and their correlation (Verbeke and Molenberghs 2009).
Super Simplified Model (LMM):
\[\text{Outcome} \approx \underbrace{\text{Average baseline + average time trend}}_{\text{fixed effects}} \;+\; \underbrace{\text{personal baseline} + \text{personal trend}}_{\text{random effects}} \;+\; \text{noise}\]
LMM explains how individuals differ around the average curve. Random effects quantify how much people vary at baseline and in growth rates, and how these two vary together (correlation).
Mini example. In a growth study, some patients start higher (baseline) and some grow faster (slope). LMM estimates the average growth and the spread of individual growth around that average.
if (!requireNamespace("nlme", quietly = TRUE)) install.packages("nlme", repos="https://cloud.r-project.org")
if (!requireNamespace("ggplot2", quietly = TRUE)) install.packages("ggplot2", repos="https://cloud.r-project.org")
library(nlme); library(ggplot2)
set.seed(4007)
simulate_growth <- function(n=120, visits=0:6) {
id <- factor(seq_len(n))
trt <- factor(sample(c("Placebo","Drug"), n, TRUE))
d <- expand.grid(id=id, visit=visits)
d <- merge(d, data.frame(id=id, trt=trt), by="id")
# subject-specific intercept/slope
b0 <- rnorm(n, 10, 1.0); b1 <- rnorm(n, 0.7, 0.25)
mu <- with(d, b0[as.integer(id)] + b1[as.integer(id)]*visit + ifelse(trt=="Drug", 0.4*visit, 0))
d$y <- rnorm(nrow(d), mu, 1.0)
d
}
dat_lmm <- simulate_growth()
lme_fit <- nlme::lme(y ~ trt*visit, random = ~ visit | id, data = dat_lmm)
summary(lme_fit)
## Linear mixed-effects model fit by REML
## Data: dat_lmm
## AIC BIC logLik
## 2782.649 2820.478 -1383.324
##
## Random effects:
## Formula: ~visit | id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 0.9333925 (Intr)
## visit 0.2493579 -0.002
## Residual 0.9966184
##
## Fixed effects: y ~ trt * visit
## Value Std.Error DF t-value p-value
## (Intercept) 10.145674 0.16007045 718 63.38255 0.0000
## trtPlacebo -0.251820 0.21264116 118 -1.18425 0.2387
## visit 1.101633 0.04333513 718 25.42126 0.0000
## trtPlacebo:visit -0.344679 0.05756735 718 -5.98741 0.0000
## Correlation:
## (Intr) trtPlc visit
## trtPlacebo -0.753
## visit -0.296 0.223
## trtPlacebo:visit 0.223 -0.296 -0.753
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.01391071 -0.61447533 -0.02155762 0.63062161 2.85870836
##
## Number of Observations: 840
## Number of Groups: 120
How to read the output of nlme::lme with random effects ~ visit | id:
- Fixed effects: baseline (intercept) and growth rate (visit); trt×visit indicates different growth rates by group.
- Random effects: SD(intercept), SD(slope), and their correlation (do higher starters grow faster?).
- Subject-level predictions (BLUPs) can be used to visualize individual curves.
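Extracting those BLUPs is a one-liner with ranef()/coef(). A self-contained miniature of the growth model above (nlme ships with R; the simulation parameters here are illustrative, not the ones used earlier):

```r
# Sketch: pulling subject-level estimates (BLUPs) after an LMM fit.
library(nlme)
set.seed(9)
n <- 30; visits <- 0:4
db <- expand.grid(id = factor(seq_len(n)), visit = visits)
b0 <- rnorm(n, 10, 1); b1 <- rnorm(n, 0.7, 0.25)   # true per-subject intercepts/slopes
db$y <- b0[as.integer(db$id)] + b1[as.integer(db$id)]*db$visit + rnorm(nrow(db), 0, 1)
fit <- lme(y ~ visit, random = ~ visit | id, data = db)
re <- ranef(fit)                       # per-subject deviations: (Intercept), visit
co <- coef(fit)                        # fixed + random: each subject's own line
db$fit_subj <- fitted(fit, level = 1)  # individual fitted curves (use the BLUPs)
head(re)
```

Plotting db$fit_subj by id over visit reproduces a model-based spaghetti plot, which is a useful visual check that the random-effects structure captures the individual trajectories.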
Conclusion template (LMM): > Outcome increased by [β_time] per visit on average (p<…). Compared with Placebo, Drug increased the growth rate by [β_trt:time] per visit (p<…). Substantial heterogeneity in slopes was observed (SD=…), indicating meaningful between‑subject variation.
Repeated measurements occur when the same subject is observed multiple times, such as at baseline, Week 4, Week 8, and Week 12. These within-person observations are correlated, because a person’s measurement today tends to resemble their own measurement last week. Ignoring this correlation makes the analysis too optimistic — it underestimates variability and overstates significance.
Key idea: repeated-measures analysis is not about having “more data points per person,” but about correctly modeling how these points are related within each person.
Repeated-measure designs are powerful because they:
- Track change over time, not just snapshots.
- Allow each subject to serve as their own control, removing between-person noise.
- Handle dropout and unbalanced visits, when modern models like MMRM or LMM are used.
They are the backbone of longitudinal studies and clinical trials, where the question is rarely “Does the drug work at one point?” but rather “How does the effect evolve over time?”
| Method | Focus | Handles Missing | Typical Use | Key Assumptions |
|---|---|---|---|---|
| RM-ANOVA | Within-subject change under sphericity | No | Balanced, complete designs | Sphericity (equal variances of differences) |
| LMM (Random-Effects) | Individual trajectories (person-specific slopes/intercepts) | Yes | Modeling subject-level trends | Random effects correctly specified |
| MMRM | Visit-specific adjusted means (population averages per timepoint) | Yes (MAR) | Continuous outcomes in clinical trials | MAR assumption, unstructured covariance |
| GEE | Population-averaged effects for binary/count outcomes | Yes | Binary or count outcomes | Correct link and working correlation structure |
Final Thought
Repeated-measure analysis is about telling a story —
how individuals and populations change over time.
Choosing the right model ensures that story is statistically honest and scientifically meaningful.