Introduction

This document demonstrates:

Basic survival analysis in R
- Code OS and PFS from trial data
- Plot a survival curve
- Fit the Kaplan-Meier estimator
- Fit a Cox proportional hazard model
How to fit and extrapolate parametric survival models using flexsurv
How to select best fit
How to perform partitioned survival analysis (PartSA or PSM) using hesim
- Adding cost, effect (survival) and utility data
- Extract incrementals and ICER
- Plot ICER-plane and CEAC curve

This is relevant as most HTAs are based on cancer and ~ 70% of HTAs within cancer is based on partitioned survival models in places like NICE or DMC.

This example is based on real trial data from Project Data Sphere (PDS) for the clinical trial: the PRIME study - Panitumumab AMG 954 20050203 EudraCT: 2006-000170-70.

Protocol: A Randomized, Multicenter, Phase 3 Study to Compare the Efficacy of Panitumumab in Combination with Oxaliplatin/ 5-fluorouracil/ leucovorin (FOLFOX) to the Efficacy of Oxaliplatin/ 5-fluorouracil/ leucovorin (FOLFOX) Alone in Patients with Previously Untreated Metastatic Colorectal Cancer

Primary results 2010: 10.1200/jco.2009.27.4860
Final results 2014:10.1093/annonc/mdu141

It is important to note that the trial found a effect within the KRAS stratum of wild-type (WT). Because of this the data is mostly analyzed only within this subgroup (60% of the population). Additionally the used data only includes 80% of the trial data to conserve anonymity. The trial protocol can be found through PDS

R

library(tidyverse)
library(patchwork)
library(broom)
library(survival)
library(flexsurv)
library(hesim)
library(survminer)
library(haven)
library(reactable)
library(forestmodel)

th <- theme(panel.grid = element_blank())

scale_colour_discrete <- function(...) { scale_colour_brewer(..., palette = "Paired") }
scale_fill_discrete   <- function(...) { scale_fill_brewer(...,   palette = "Paired") }

Kaplan-Meier survival function

Time-to-event data are common in epidemiology and health economics

Time from diagnosis to death (OS)
Time from diagnosis to progression/death (PFS)

There are other endpoints (RFS, DFS, TTP, CR, MRD, etc.) but these two are the most common

The estimator of the survival function \(S(t)\) (the probability that life is longer than \(t\)) is given by:

\[\hat S(t) = \prod_{i:t_i\leq t} \left( 1-\frac{d_i}{n_i}\right)\]

with \(t_i\) a time when at least one event happened, \(d_i\) the number of events (e.g., deaths) that happened at time \(t_i\), and \(n_i\) the individuals known to have survived (have not yet had an event or been censored) up to time \(t_i\).

S <- function(data, time, status) {
 
  t_i <- data[["time"]] |> sort() |> unique()
  n_i <- map_int(t_i, ~ nrow(data[data[["time"]] >= .x,]))
  d_i <- map_int(t_i, ~ nrow(data[data[["time"]] == .x & data[["status"]] == 2,]))
  
  tibble(t_i, n_i, d_i, survival = cumprod(1 - d_i / n_i))
}

Example KM

First we need some data, the dataset lung can always be called in R

head(lung)

  inst time status age sex ph.ecog ph.karno pat.karno meal.cal wt.loss
1    3  306      2  74   1       1       90       100     1175      NA
2    3  455      2  68   1       0       90        90     1225      15
3    3 1010      1  56   1       0       90        90       NA      15
4    5  210      2  57   1       1       90        60     1150      11
5    1  883      2  60   1       0      100        90       NA       0
6   12 1022      1  74   1       1       50        80      513       0

This dataset contains a time (survival) and status (censoring) variable which is what we need to plot a survival curve. We can try our function with this data. aditionally i will use the common way to perform KM fit in R using the survfit function.

S(time = time, status = status, data = lung) |> 
  head()

# A tibble: 6 × 4
    t_i   n_i   d_i survival
  <dbl> <int> <int>    <dbl>
1     5   228     1    0.996
2    11   227     3    0.982
3    12   224     1    0.978
4    13   223     2    0.969
5    15   221     1    0.965
6    26   220     1    0.961

S(time = time, status = status, data = lung) |> 
  ggplot(aes(t_i, survival)) + 
  geom_step()

survfit(Surv(time, status) ~ 1, data = lung) |> 
  tidy() |> select(1:3,5) |> head()

# A tibble: 6 × 4
   time n.risk n.event estimate
  <dbl>  <dbl>   <dbl>    <dbl>
1     5    228       1    0.996
2    11    227       3    0.982
3    12    224       1    0.978
4    13    223       2    0.969
5    15    221       1    0.965
6    26    220       1    0.961

survfit(Surv(time, status) ~ 1, lung) |> 
  tidy() |> ggplot(aes(time, estimate)) + 
  geom_step()

However, when plotting survival curves we often want some more info like a risk table. Therefore i use the package survminer. As a standard it is a bit simple, but it can be customized alot to create nice survival curves like below.

Nice survival curves

see here for code example on Linkedin

Show code

lung2 <- lung |> 
  mutate(sex = if_else(sex == 1, "Male","Female"),
         age_group = case_when(between(age, 0,59) ~ "35 - 59 years",
                               between(age, 60,69) ~ "60 - 69 years",
                               between(age, 70,99) ~ "70 - 85 years"))

surv_plot <- function(data, strat, plot_title) {
  
  dat <- data[!is.na(data[[strat]]),]
  dat$time <- dat$time/365
  labs <- sort(unique(dat[[strat]]))
  
  splot <- survminer::ggsurvplot(fit = surv_fit(as.formula(paste0("Surv(time, status) ~", strat)), data = dat), 
                                 data = dat,
                                 palette = c("#0B0405FF", "#40498EFF","#38AAACFF", "#DEF5E5FF"), 
                                 ggtheme = theme_bw(), xlim = c(0,2.5),
                                 break.time.by = 0.5,
                                 pval = TRUE,
                                 pval.method = TRUE,
                                 pval.size = 3.5,
                                 pval.coord = c(0,0.05),
                                 pval.method.coord = c(0,0.12),
                                 risk.table = "nrisk_cumcensor",
                                 risk.table.y.text = FALSE,
                                 risk.table.height = 0.3,
                                 tables.y.text = TRUE,
                                 fontsize = 3, size = 0.2,
                                 conf.int = TRUE,
                                 conf.int.alpha = 0.2,
                                 conf.int.style = "ribbon",
                                 censor = T, censor.shape = 124, censor.size = 2, legend.labs = labs)
  
  splot$plot <- splot$plot + 
    labs(title = paste(plot_title), fill = NULL, color = NULL, x = "Time from diagnosis (years)") +
    scale_y_continuous(labels = scales::percent_format())+
    guides(color = guide_legend(override.aes = list(shape = NA, linewidth=0.6)))+
    theme(panel.grid = element_blank(),
          panel.border = element_rect(color="black", fill=NA, linewidth=0.6),
          legend.justification = c(1,1),
          legend.key.size = unit(4,"mm"),
          legend.spacing.y = unit(0,"mm"),
          legend.background = element_rect(color="black", fill=NA, linewidth = 0.6*0.5),
          legend.position = c(1,1),
          legend.box.margin = margin(t = 0.6*(.pt*72.27/96/4), r = 0.6*(.pt*72.27/96/4)),
          axis.ticks = element_line(color="black"),
          axis.text = element_text(color="black"),
          axis.title = element_text(size = 10, color="black"),
          plot.background = element_rect(fill="white"),
          plot.tag.position = c(0,1))
  
  splot$table <- splot$table +
    labs(title="No. at risk (No. censored)", x = NULL, y = NULL) + 
    coord_cartesian(clip = "off", xlim=c(0,2.5),ylim=c(0,3)) +
    theme(panel.grid = element_blank(),
          plot.title = element_text(size=10),
          panel.border = element_rect(color = NA),
          axis.ticks.x = element_blank(),
          axis.text.x = element_blank())
  splot
}

pl <- list(surv_plot(data = lung2, strat = "age_group", plot_title = "OS - age group"),
           surv_plot(data = lung2, strat = "sex", plot_title = "OS - sex"))

pl[[2]]$table <- pl[[2]]$table + coord_cartesian(clip = "off", xlim=c(0,2.5),ylim=c(-1,2))

p <- pl[[1]]$plot + pl[[2]]$plot + pl[[1]]$table + pl[[2]]$table +
  plot_layout(design = "AB\nCD", height=c(1,0.3))

p

Coding OS and PFS

Genereally to code the status and time variables 3 dates are needed

Date of start (e.g. diagnosis)
Date of event (e.g. death)
Date of end of study (No dates should be later than this)

In general the status indicator is coded 1 if there is a date of event, and 0 if this is missing. The time variable is coded as time difference between date of start and date of event for all that has the event, and date of start and date of end of study for everyone else.

This could look something like this: (the example is from the Simulacrum data)

data |> 
  mutate(
    d_date    = if_else(NEWVITALSTATUS == "D", VITALSTATUSDATE, NA),
    delta_OS  = if_else(is.na(d_date), 0 ,1),
    time_OS   = case_when(delta_OS == 1 ~ interval(DIAGNOSISDATE, d_date)      / months(1),
                          delta_OS == 0 ~ interval(DIAGNOSISDATE, VITALSTATUSDATE) / months(1)),
    progr     = if_else(is.na(DATE_FIRST_SURGERY), 0, 1),
    delta_PFS = if_else(delta_OS == 1 | progr == 1, 1, 0),
    time_PFS  = case_when(delta_OS == 1 & progr == 1 ~ interval(DIAGNOSISDATE, DATE_FIRST_SURGERY)  / months(1),
                          delta_OS == 1 & progr == 0 ~ interval(DIAGNOSISDATE, d_date)             / months(1),
                          delta_OS == 0 & progr == 1 ~ interval(DIAGNOSISDATE, DATE_FIRST_SURGERY) / months(1),
                          delta_OS == 0 & progr == 0 ~ interval(DIAGNOSISDATE, VITALSTATUSDATE)    / months(1)))

Trial data from PDS

Project data sphere provides real trial data that can be used for experimenting. We load the data from a SAS file.

pds <- read_sas("AMG_data/PDS_DSA_20050203/adsl_pds2019.sas7bdat", NULL) |> 
  select(-ATRT) |> 
  mutate(DTHMO = round(DTHDY/30.43,1),
         PFSMOCR = round(PFSDYCR/30.43,1))

To include the KRAS subtype we load the dataset with biomarker data.

biomark <- read_sas("AMG_data/PDS_DSA_20050203/biomark_pds2019.sas7bdat", NULL) |> 
  select(SUBJID, BMMTR1)

pds <- pds |> left_join(biomark) |> 
  mutate(KRAS = case_when(BMMTR1 == "Wild-type" ~ "WT",
                          BMMTR1 == "Mutant"    ~ "MT",
                          TRUE                  ~ "Missing"))

pds_WT <- pds |> filter(KRAS=="WT")

Here we can see some of the data.

pds |> select(-DTHDY,-PFSDYCR,-BMMTR1,-B_ECOG,-HISSUBTY,-DIAGMONS) |> 
  reactable(bordered = T, defaultPageSize = 6,
            theme = reactableTheme(style = list(fontSize = "14px")), 
            fullWidth = T, defaultColDef = colDef(minWidth = 40),
            columns = list(TRT = colDef(width = 100), RACE = colDef(width = 80)))

OS estimate

survfit(Surv(DTHMO, DTH) ~ TRT, data = pds)

Call: survfit(formula = Surv(DTHMO, DTH) ~ TRT, data = pds)

                           n events median 0.95LCL 0.95UCL
TRT=FOLFOX alone         466    350   19.4    17.4    21.8
TRT=Panitumumab + FOLFOX 469    340   19.6    18.0    21.7

Here we see that for the whole cohort there is no difference in OS.

survfit(Surv(DTHMO, DTH) ~ TRT, data = pds_WT)

Call: survfit(formula = Surv(DTHMO, DTH) ~ TRT, data = pds_WT)

                           n events median 0.95LCL 0.95UCL
TRT=FOLFOX alone         254    176   20.5    17.8    23.6
TRT=Panitumumab + FOLFOX 260    170   23.9    20.3    28.0

However when we only look at WT patients the median survival is 3.4 months longer.

PFS estimate

survfit(Surv(PFSMOCR, PFSCR) ~ TRT, data = pds_WT)

Call: survfit(formula = Surv(PFSMOCR, PFSCR) ~ TRT, data = pds_WT)

                           n events median 0.95LCL 0.95UCL
TRT=FOLFOX alone         254    213    9.2     7.7    10.9
TRT=Panitumumab + FOLFOX 260    212    9.6     9.2    11.3

We find 0.4 months longer median progression free survival for the treated group.

Plotting KM-curves

fit <- survfit(Surv(DTHMO, DTH) ~ TRT, data = pds_WT)
p1 <- ggsurvplot(fit, risk.table = T, risk.table.y.text = F, break.time.by = 6, xlim = c(0, 48),
                 censor = T, censor.shape = 124, censor.size = 2, palette=RColorBrewer::brewer.pal(2,"Paired"))

fit <- survfit(Surv(PFSMOCR, PFSCR) ~ TRT, data = pds_WT)
p2 <- ggsurvplot(fit, risk.table = T,risk.table.y.text = F, break.time.by = 6, xlim = c(0, 48),
                 censor = T, censor.shape = 124, censor.size = 2, palette=RColorBrewer::brewer.pal(2,"Paired"))

p1$plot + labs(title="OS") + p2$plot + labs(title="PFS") + p1$table + p2$table + 
  plot_layout(height = c(1, 0.3)) & theme(legend.position=c(0.75,0.9))

Cox-PH

coxph(Surv(DTHMO,DTH) ~ TRT, data = pds_WT) |> summary()

Call:
coxph(formula = Surv(DTHMO, DTH) ~ TRT, data = pds_WT)

  n= 514, number of events= 346 

                           coef exp(coef) se(coef)      z Pr(>|z|)
TRTPanitumumab + FOLFOX -0.1149    0.8915   0.1077 -1.066    0.286

                        exp(coef) exp(-coef) lower .95 upper .95
TRTPanitumumab + FOLFOX    0.8915      1.122    0.7219     1.101

Concordance= 0.515  (se = 0.014 )
Likelihood ratio test= 1.14  on 1 df,   p=0.3
Wald test            = 1.14  on 1 df,   p=0.3
Score (logrank) test = 1.14  on 1 df,   p=0.3

We can also output the Cox model as a forrest plot:

forest_model(coxph(Surv(DTHMO,DTH) ~ TRT + AGE + SEX + RACE + B_ECOG + KRAS + LIVERMET, data = pds), 
             format_options = forest_model_format_options(banded=F))

Extrapolating survival

dist_list <- c("exp", "weibull", "gompertz", "lnorm", "llogis", "gamma")
fit_results <- list()

for (d in dist_list) {
  fit_results[[d]] <- flexsurvreg(Surv(DTHMO, DTH) ~ TRT, data = pds_WT, dist = d)
}

map_df(fit_results, function(x) c(x$AIC, x$BIC)) |> signif(5) |> mutate(fit=c("AIC","BIC"), .before=1) |> 
  reactable(bordered=T)

For AIC/BIC lower is better so according to these model fit metrics a log normal distribution fitted to the hazard gives the best fit. However we know that this is likely unrealistic as this is also the most optimistic in relation to long term survival. Something like Weibull distribution is likely more realistic, depending on the disease.

Fitted survival functions

surv_data <- imap_dfr(fit_results, function(mod, name) {
  summary(mod, type="survival", tidy = TRUE, t = seq(0, 120, by = 1)) |> mutate(DIST = name)
})

surv_data |> ggplot(aes(time, est, color=DIST, linetype=TRT)) + 
  geom_line() + labs(title = "Survival functions across fits",x="Months",y=NULL) + 
  scale_x_continuous(breaks=seq(0,120,12)) + 
  scale_y_continuous(labels=scales::percent_format()) + theme_classic()

Fitted hazard functions

hazard_data <- imap_dfr(fit_results, function(mod, name) {
  summary(mod, type = "hazard", tidy = TRUE, t = seq(0, 120, by = 1)) |> mutate(DIST = name)
})

ggplot(hazard_data, aes(x = time, y = est, color = DIST, linetype=TRT)) +
  geom_line() + labs(title = "Hazard functions across fits",x="Months",y="Hazard") + coord_cartesian(ylim=c(0,0.15)) + 
  scale_x_continuous(breaks=seq(0,120,12)) + theme_classic()

KM-curve with fitted hazard (llogis)

Show code

fit_results_PFS <- list()

for (d in dist_list) {
  fit_results_PFS[[d]] <- flexsurvreg(Surv(PFSMOCR, PFSCR) ~ TRT, data = pds_WT|> filter(PFSDYCR!=0), dist = d)
}

fit <- survfit(Surv(DTHMO,DTH) ~ TRT, data = pds_WT)
pred_surv <- summary(fit_results[["llogis"]], t = seq(0, 120, by = 1), tidy = TRUE)

p <- ggsurvplot(fit, palette=RColorBrewer::brewer.pal(2,"Paired"),
                data = pds_WT,
                xlim = c(0, 120),
                break.time.by = 12,
                legend.labs = c("FOLFOX alone", "Panitumumab + FOLFOX"),
                conf.int = TRUE,
                conf.int.alpha = 0.2,
                conf.int.style = "ribbon",
                censor = F)

p$plot <- p$plot + 
  geom_line(data = pred_surv, aes(x = time, y = est, color = TRT), 
            linetype = "dashed", linewidth = 0.4) + labs(title="OS", color=NULL,fill=NULL) +
  scale_y_continuous(labels=scales::percent_format())

fit <- survfit(Surv(PFSMOCR,PFSCR) ~ TRT, data = pds_WT)
pred_surv <- summary(fit_results_PFS[["llogis"]], t = seq(0, 120, by = 1), tidy = TRUE)

p2 <- ggsurvplot(fit, palette=RColorBrewer::brewer.pal(2,"Paired"),
                data = pds_WT,
                xlim = c(0, 120),
                break.time.by = 12,
                legend.labs = c("FOLFOX alone", "Panitumumab + FOLFOX"),
                conf.int = TRUE,
                conf.int.alpha = 0.2,
                conf.int.style = "ribbon",
                censor = F)

p2$plot <- p2$plot + 
  geom_line(data = pred_surv, aes(x = time, y = est, color = TRT), 
            linetype = "dashed", linewidth = 0.4) + labs(title="PFS", color=NULL,fill=NULL) +
  scale_y_continuous(labels=scales::percent_format())
p$plot + p2$plot  & theme(legend.position=c(0.75,0.9))

Partitioned survival model

To make the data fit into the format of the hesim package we need to do some data management. I make some helper functions to reverse the changes i make.

# what i assume are people included on progression the PFS can't be calculated, these 16 individuals are excluded
pds_ex <- pds_WT |> filter(PFSDYCR!=0)

end_data <- pds_ex |> 
  mutate(strategy_id = if_else(TRT=="FOLFOX alone", 2, 1)) |> 
  select(age = AGE, sex = SEX, strategy_id,
         endpoint1_time = DTHMO,   endpoint1_status = DTH, 
         endpoint2_time = PFSMOCR, endpoint2_status = PFSCR)

add_strategy <- function(x) { x |> mutate(strategy_id = case_when(strategy_id == 1 ~ "Panitumumab + FOLFOX",
                                                                  strategy_id == 2 ~ "FOLFOX alone")) }
add_state <- function(x) { x |> mutate(state_id = case_when(state_id == 1 ~ "Progression free",
                                                            state_id == 2 ~ "Progressed",
                                                            state_id == 3 ~ "Dead")) }
add_curve <- function(x) { x |> mutate(curve = case_when(curve == 1 ~ "PFS",
                                                         curve == 2 ~ "OS")) }

I have to define patients (which i wont use here), strategies and states. These are defined from the trial. I collect all in the hesim_dat object.

patients   <- data.frame(patient_id = c(1))
strategies <- data.frame(strategy_id = c(1,2), strategy_name = c("Panitumumab + FOLFOX", "FOLFOX alone"))
states     <- data.frame(state_id = c(1,2), state_name = c("Progression free","Progressed"))

hesim_dat <- hesim_data(patients = patients, strategies = strategies, states = states)

I fit the survival using flexsurv and the log logistic hazard.

fit_os  <- flexsurvreg(Surv(endpoint1_time, endpoint1_status) ~ strategy_id, data = end_data, dist = "llogis")
fit_pfs <- flexsurvreg(Surv(endpoint2_time, endpoint2_status) ~ strategy_id, data = end_data, dist = "llogis")

psfit_wei <- flexsurvreg_list(fit_pfs, fit_os)

I define utility. Unfortunately they didn’t publizice the utility data from this trial, so this is based on an assumption.

utility_tbl <- stateval_tbl(tbl = data.frame(state_id = states$state_id, min = c(0.8,0.7), max = c(0.9,0.8)),
                            dist = "unif")

I define cost. This is based on the drug cost as i can find is around 80.000 USD for an average patient, this fits with around 4000 per month.

drugcost_tbl <- stateval_tbl(tbl = data.frame(strategy_id = strategies$strategy_id, est = c(5000, 1000)), 
                             dist = "fixed")

Some preparing the model

surv_input_data <- expand(hesim_dat, by = c("strategies", "patients"))
survmods <- create_PsmCurves(psfit_wei, input_data = surv_input_data, n = 500,
                             uncertainty = "bootstrap", est_data = end_data)

utilitymod <- create_StateVals(utility_tbl, n = 500, hesim_data = hesim_dat)
drugcostmod <- create_StateVals(drugcost_tbl, n = 500, hesim_data = hesim_dat)

psm <- Psm$new(survival_models = survmods,
               utility_model = utilitymod,
               cost_models = list(drug = drugcostmod))

The hesimpackage uses some fancy Object Oriented Programming (OOP) this is not very common in R outside internal package use, however here are some of it. This means that if this looks weird as this is “mutable” compared to the functional programming approach of “immutable” code, that is the code will have no side-effects (a function should only take inputs and return an output).

psm$sim_survival(t = seq(0, 120, by = 1))
psm$sim_stateprobs()
psm$sim_costs(dr = 0.035)
psm$sim_qalys(dr = 0.035)

Fitted curves

Not that the model is simulated we can make some output, first lets take a look of the bootstrapped survival data:

psm$survival_ |> 
  add_curve() |> add_strategy() |> 
  ggplot(aes(x=t,y=survival, color=curve)) + 
  geom_jitter(size=0.5) + facet_wrap(~strategy_id) + labs(title="Bootstrapped survival") +
  scale_y_continuous(labels=scales::percent_format())

psm$survival_ |> 
  add_curve() |> add_strategy() |> 
  group_by(strategy_id, t,curve) |> summarize(survival = mean(survival)) |> 
  ggplot(aes(x=t,y=survival, color=curve, fill=curve, linetype=strategy_id)) + 
  geom_line(size=0.5) + labs(title="Mean of bootstrapped survival") +
  scale_y_continuous(labels=scales::percent_format())

State diagram

psm$stateprobs_ |> add_strategy() |> add_state() |> 
  group_by(strategy_id, t,state_id) |> summarize(prob = mean(prob)) |> 
  ggplot(aes(y=prob, x=t, color=state_id)) + geom_line(size=1) + facet_wrap(~strategy_id) + 
  labs(title="State diagram", x="Months", y=NULL, color=NULL) +
  scale_y_continuous(labels = scales::percent_format())

psm$stateprobs_ |> add_strategy() |> add_state() |>  
  group_by(strategy_id, t, state_id) |> summarize(prob = mean(prob)) |> 
  ggplot(aes(y=prob, x=t, fill=state_id)) + geom_area(alpha=0.6) + 
  facet_wrap(~strategy_id) + labs(title="State diagram", x="Months", y=NULL,fill=NULL) +
  scale_y_continuous(labels = scales::percent_format())

CE-summary

We can run the cost-effectiveness analysis and return a summarized dataframe.

ce_sim <- psm$summarize()
cea_out <- cea(ce_sim, dr_qalys = 0.035, dr_costs = 0.035, k = seq(0, 300000, 1000))
cea_pw_out <- cea_pw(ce_sim, comparator = 2, dr_qalys = 0.035, dr_costs = 0.035) 


reactable(signif(cea_pw_out$summary,4)|> add_strategy(),bordered = T)

reactable(signif(cea_out$summary,4)|> add_strategy(),bordered = T)

CE-plane

#plot_ceplane(cea_pw_out, k = 100000)

cea_pw_out$delta |> add_strategy() |> 
  ggplot(aes(x=ie,y=ic)) + geom_point(color="#1F78B4") + 
  geom_hline(yintercept = 0) + geom_vline(xintercept = 0) + 
  geom_abline(intercept = 0, slope = 100000,linetype="dashed") +
  scale_y_continuous(limits=c(-15000,90000), breaks=seq(-15000,100000,15000), labels = scales::dollar_format()) +
  scale_x_continuous(limits=c(-1,3), breaks=seq(-5,5,0.5))+
  labs(x="Incremental QALYs",y="Incremental cost", title="ICER plane")

CEAC

cea_out$mce |> add_strategy() |> 
  ggplot(aes(x=k,y=prob,color=strategy_id)) + geom_line(size=1) +
  scale_y_continuous(labels = scales::percent_format()) +
  labs(y="Cost-effective", x="WTP",color=NULL, title="CEAC") +
  scale_x_continuous(breaks=seq(0,300000,50000), labels = scales::dollar_format())

Assignments

Load the data
Plot a histogram of age by treatment group
Estimate median OS for each group with KM
Plot a KM curve (try stratifying by RACE or ECOG)
Fit a coxph model adjusting for basic covariates
using flexsurv to fit different distributions (see AIC and BIC for best fit)
Plot extrapolation
Fit a PSM