library(tidyverse)
library(patchwork)
library(broom)
library(survival)
library(flexsurv)
library(hesim)
library(survminer)
library(haven)
library(reactable)
library(forestmodel)
th <- theme(panel.grid = element_blank())
scale_colour_discrete <- function(...) { scale_colour_brewer(..., palette = "Paired") }
scale_fill_discrete <- function(...) { scale_fill_brewer(..., palette = "Paired") }Introduction
This document demonstrates:
- Basic survival analysis in R
- Code OS and PFS from trial data
- Plot a survival curve
- Fit the Kaplan-Meier estimator
- Fit a Cox proportional hazard model
- How to fit and extrapolate parametric survival models using
flexsurv - How to select best fit
- How to perform partitioned survival analysis (PartSA or PSM) using
hesim- Adding cost, effect (survival) and utility data
- Extract incrementals and ICER
- Plot ICER-plane and CEAC curve
This is relevant as most HTAs are based on cancer and ~ 70% of HTAs within cancer is based on partitioned survival models in places like NICE or DMC.
This example is based on real trial data from Project Data Sphere (PDS) for the clinical trial: the PRIME study - Panitumumab AMG 954 20050203 EudraCT: 2006-000170-70.
Protocol: A Randomized, Multicenter, Phase 3 Study to Compare the Efficacy of Panitumumab in Combination with Oxaliplatin/ 5-fluorouracil/ leucovorin (FOLFOX) to the Efficacy of Oxaliplatin/ 5-fluorouracil/ leucovorin (FOLFOX) Alone in Patients with Previously Untreated Metastatic Colorectal Cancer
- Primary results 2010: 10.1200/jco.2009.27.4860
- Final results 2014:10.1093/annonc/mdu141
It is important to note that the trial found a effect within the KRAS stratum of wild-type (WT). Because of this the data is mostly analyzed only within this subgroup (60% of the population). Additionally the used data only includes 80% of the trial data to conserve anonymity. The trial protocol can be found through PDS
R
Kaplan-Meier survival function
Time-to-event data are common in epidemiology and health economics
- Time from diagnosis to death (OS)
- Time from diagnosis to progression/death (PFS)
There are other endpoints (RFS, DFS, TTP, CR, MRD, etc.) but these two are the most common
The estimator of the survival function \(S(t)\) (the probability that life is longer than \(t\)) is given by:
\[\hat S(t) = \prod_{i:t_i\leq t} \left( 1-\frac{d_i}{n_i}\right)\]
with \(t_i\) a time when at least one event happened, \(d_i\) the number of events (e.g., deaths) that happened at time \(t_i\), and \(n_i\) the individuals known to have survived (have not yet had an event or been censored) up to time \(t_i\).
S <- function(data, time, status) {
t_i <- data[["time"]] |> sort() |> unique()
n_i <- map_int(t_i, ~ nrow(data[data[["time"]] >= .x,]))
d_i <- map_int(t_i, ~ nrow(data[data[["time"]] == .x & data[["status"]] == 2,]))
tibble(t_i, n_i, d_i, survival = cumprod(1 - d_i / n_i))
}Example KM
First we need some data, the dataset lung can always be called in R
head(lung) inst time status age sex ph.ecog ph.karno pat.karno meal.cal wt.loss
1 3 306 2 74 1 1 90 100 1175 NA
2 3 455 2 68 1 0 90 90 1225 15
3 3 1010 1 56 1 0 90 90 NA 15
4 5 210 2 57 1 1 90 60 1150 11
5 1 883 2 60 1 0 100 90 NA 0
6 12 1022 1 74 1 1 50 80 513 0
This dataset contains a time (survival) and status (censoring) variable which is what we need to plot a survival curve. We can try our function with this data. aditionally i will use the common way to perform KM fit in R using the survfit function.
S(time = time, status = status, data = lung) |>
head()# A tibble: 6 × 4
t_i n_i d_i survival
<dbl> <int> <int> <dbl>
1 5 228 1 0.996
2 11 227 3 0.982
3 12 224 1 0.978
4 13 223 2 0.969
5 15 221 1 0.965
6 26 220 1 0.961
S(time = time, status = status, data = lung) |>
ggplot(aes(t_i, survival)) +
geom_step()survfit(Surv(time, status) ~ 1, data = lung) |>
tidy() |> select(1:3,5) |> head()# A tibble: 6 × 4
time n.risk n.event estimate
<dbl> <dbl> <dbl> <dbl>
1 5 228 1 0.996
2 11 227 3 0.982
3 12 224 1 0.978
4 13 223 2 0.969
5 15 221 1 0.965
6 26 220 1 0.961
survfit(Surv(time, status) ~ 1, lung) |>
tidy() |> ggplot(aes(time, estimate)) +
geom_step()However, when plotting survival curves we often want some more info like a risk table. Therefore i use the package survminer. As a standard it is a bit simple, but it can be customized alot to create nice survival curves like below.
Nice survival curves
see here for code example on Linkedin
Show code
lung2 <- lung |>
mutate(sex = if_else(sex == 1, "Male","Female"),
age_group = case_when(between(age, 0,59) ~ "35 - 59 years",
between(age, 60,69) ~ "60 - 69 years",
between(age, 70,99) ~ "70 - 85 years"))
surv_plot <- function(data, strat, plot_title) {
dat <- data[!is.na(data[[strat]]),]
dat$time <- dat$time/365
labs <- sort(unique(dat[[strat]]))
splot <- survminer::ggsurvplot(fit = surv_fit(as.formula(paste0("Surv(time, status) ~", strat)), data = dat),
data = dat,
palette = c("#0B0405FF", "#40498EFF","#38AAACFF", "#DEF5E5FF"),
ggtheme = theme_bw(), xlim = c(0,2.5),
break.time.by = 0.5,
pval = TRUE,
pval.method = TRUE,
pval.size = 3.5,
pval.coord = c(0,0.05),
pval.method.coord = c(0,0.12),
risk.table = "nrisk_cumcensor",
risk.table.y.text = FALSE,
risk.table.height = 0.3,
tables.y.text = TRUE,
fontsize = 3, size = 0.2,
conf.int = TRUE,
conf.int.alpha = 0.2,
conf.int.style = "ribbon",
censor = T, censor.shape = 124, censor.size = 2, legend.labs = labs)
splot$plot <- splot$plot +
labs(title = paste(plot_title), fill = NULL, color = NULL, x = "Time from diagnosis (years)") +
scale_y_continuous(labels = scales::percent_format())+
guides(color = guide_legend(override.aes = list(shape = NA, linewidth=0.6)))+
theme(panel.grid = element_blank(),
panel.border = element_rect(color="black", fill=NA, linewidth=0.6),
legend.justification = c(1,1),
legend.key.size = unit(4,"mm"),
legend.spacing.y = unit(0,"mm"),
legend.background = element_rect(color="black", fill=NA, linewidth = 0.6*0.5),
legend.position = c(1,1),
legend.box.margin = margin(t = 0.6*(.pt*72.27/96/4), r = 0.6*(.pt*72.27/96/4)),
axis.ticks = element_line(color="black"),
axis.text = element_text(color="black"),
axis.title = element_text(size = 10, color="black"),
plot.background = element_rect(fill="white"),
plot.tag.position = c(0,1))
splot$table <- splot$table +
labs(title="No. at risk (No. censored)", x = NULL, y = NULL) +
coord_cartesian(clip = "off", xlim=c(0,2.5),ylim=c(0,3)) +
theme(panel.grid = element_blank(),
plot.title = element_text(size=10),
panel.border = element_rect(color = NA),
axis.ticks.x = element_blank(),
axis.text.x = element_blank())
splot
}
pl <- list(surv_plot(data = lung2, strat = "age_group", plot_title = "OS - age group"),
surv_plot(data = lung2, strat = "sex", plot_title = "OS - sex"))
pl[[2]]$table <- pl[[2]]$table + coord_cartesian(clip = "off", xlim=c(0,2.5),ylim=c(-1,2))
p <- pl[[1]]$plot + pl[[2]]$plot + pl[[1]]$table + pl[[2]]$table +
plot_layout(design = "AB\nCD", height=c(1,0.3))
pCoding OS and PFS
Genereally to code the status and time variables 3 dates are needed
- Date of start (e.g. diagnosis)
- Date of event (e.g. death)
- Date of end of study (No dates should be later than this)
In general the status indicator is coded 1 if there is a date of event, and 0 if this is missing. The time variable is coded as time difference between date of start and date of event for all that has the event, and date of start and date of end of study for everyone else.
This could look something like this: (the example is from the Simulacrum data)
data |>
mutate(
d_date = if_else(NEWVITALSTATUS == "D", VITALSTATUSDATE, NA),
delta_OS = if_else(is.na(d_date), 0 ,1),
time_OS = case_when(delta_OS == 1 ~ interval(DIAGNOSISDATE, d_date) / months(1),
delta_OS == 0 ~ interval(DIAGNOSISDATE, VITALSTATUSDATE) / months(1)),
progr = if_else(is.na(DATE_FIRST_SURGERY), 0, 1),
delta_PFS = if_else(delta_OS == 1 | progr == 1, 1, 0),
time_PFS = case_when(delta_OS == 1 & progr == 1 ~ interval(DIAGNOSISDATE, DATE_FIRST_SURGERY) / months(1),
delta_OS == 1 & progr == 0 ~ interval(DIAGNOSISDATE, d_date) / months(1),
delta_OS == 0 & progr == 1 ~ interval(DIAGNOSISDATE, DATE_FIRST_SURGERY) / months(1),
delta_OS == 0 & progr == 0 ~ interval(DIAGNOSISDATE, VITALSTATUSDATE) / months(1)))Trial data from PDS
Project data sphere provides real trial data that can be used for experimenting. We load the data from a SAS file.
pds <- read_sas("AMG_data/PDS_DSA_20050203/adsl_pds2019.sas7bdat", NULL) |>
select(-ATRT) |>
mutate(DTHMO = round(DTHDY/30.43,1),
PFSMOCR = round(PFSDYCR/30.43,1))To include the KRAS subtype we load the dataset with biomarker data.
biomark <- read_sas("AMG_data/PDS_DSA_20050203/biomark_pds2019.sas7bdat", NULL) |>
select(SUBJID, BMMTR1)
pds <- pds |> left_join(biomark) |>
mutate(KRAS = case_when(BMMTR1 == "Wild-type" ~ "WT",
BMMTR1 == "Mutant" ~ "MT",
TRUE ~ "Missing"))
pds_WT <- pds |> filter(KRAS=="WT")Here we can see some of the data.
pds |> select(-DTHDY,-PFSDYCR,-BMMTR1,-B_ECOG,-HISSUBTY,-DIAGMONS) |>
reactable(bordered = T, defaultPageSize = 6,
theme = reactableTheme(style = list(fontSize = "14px")),
fullWidth = T, defaultColDef = colDef(minWidth = 40),
columns = list(TRT = colDef(width = 100), RACE = colDef(width = 80)))OS estimate
survfit(Surv(DTHMO, DTH) ~ TRT, data = pds)Call: survfit(formula = Surv(DTHMO, DTH) ~ TRT, data = pds)
n events median 0.95LCL 0.95UCL
TRT=FOLFOX alone 466 350 19.4 17.4 21.8
TRT=Panitumumab + FOLFOX 469 340 19.6 18.0 21.7
Here we see that for the whole cohort there is no difference in OS.
survfit(Surv(DTHMO, DTH) ~ TRT, data = pds_WT)Call: survfit(formula = Surv(DTHMO, DTH) ~ TRT, data = pds_WT)
n events median 0.95LCL 0.95UCL
TRT=FOLFOX alone 254 176 20.5 17.8 23.6
TRT=Panitumumab + FOLFOX 260 170 23.9 20.3 28.0
However when we only look at WT patients the median survival is 3.4 months longer.
PFS estimate
survfit(Surv(PFSMOCR, PFSCR) ~ TRT, data = pds_WT)Call: survfit(formula = Surv(PFSMOCR, PFSCR) ~ TRT, data = pds_WT)
n events median 0.95LCL 0.95UCL
TRT=FOLFOX alone 254 213 9.2 7.7 10.9
TRT=Panitumumab + FOLFOX 260 212 9.6 9.2 11.3
We find 0.4 months longer median progression free survival for the treated group.
Plotting KM-curves
fit <- survfit(Surv(DTHMO, DTH) ~ TRT, data = pds_WT)
p1 <- ggsurvplot(fit, risk.table = T, risk.table.y.text = F, break.time.by = 6, xlim = c(0, 48),
censor = T, censor.shape = 124, censor.size = 2, palette=RColorBrewer::brewer.pal(2,"Paired"))
fit <- survfit(Surv(PFSMOCR, PFSCR) ~ TRT, data = pds_WT)
p2 <- ggsurvplot(fit, risk.table = T,risk.table.y.text = F, break.time.by = 6, xlim = c(0, 48),
censor = T, censor.shape = 124, censor.size = 2, palette=RColorBrewer::brewer.pal(2,"Paired"))
p1$plot + labs(title="OS") + p2$plot + labs(title="PFS") + p1$table + p2$table +
plot_layout(height = c(1, 0.3)) & theme(legend.position=c(0.75,0.9))Cox-PH
coxph(Surv(DTHMO,DTH) ~ TRT, data = pds_WT) |> summary()Call:
coxph(formula = Surv(DTHMO, DTH) ~ TRT, data = pds_WT)
n= 514, number of events= 346
coef exp(coef) se(coef) z Pr(>|z|)
TRTPanitumumab + FOLFOX -0.1149 0.8915 0.1077 -1.066 0.286
exp(coef) exp(-coef) lower .95 upper .95
TRTPanitumumab + FOLFOX 0.8915 1.122 0.7219 1.101
Concordance= 0.515 (se = 0.014 )
Likelihood ratio test= 1.14 on 1 df, p=0.3
Wald test = 1.14 on 1 df, p=0.3
Score (logrank) test = 1.14 on 1 df, p=0.3
We can also output the Cox model as a forrest plot:
forest_model(coxph(Surv(DTHMO,DTH) ~ TRT + AGE + SEX + RACE + B_ECOG + KRAS + LIVERMET, data = pds),
format_options = forest_model_format_options(banded=F))Extrapolating survival
dist_list <- c("exp", "weibull", "gompertz", "lnorm", "llogis", "gamma")
fit_results <- list()
for (d in dist_list) {
fit_results[[d]] <- flexsurvreg(Surv(DTHMO, DTH) ~ TRT, data = pds_WT, dist = d)
}
map_df(fit_results, function(x) c(x$AIC, x$BIC)) |> signif(5) |> mutate(fit=c("AIC","BIC"), .before=1) |>
reactable(bordered=T)For AIC/BIC lower is better so according to these model fit metrics a log normal distribution fitted to the hazard gives the best fit. However we know that this is likely unrealistic as this is also the most optimistic in relation to long term survival. Something like Weibull distribution is likely more realistic, depending on the disease.
Fitted survival functions
surv_data <- imap_dfr(fit_results, function(mod, name) {
summary(mod, type="survival", tidy = TRUE, t = seq(0, 120, by = 1)) |> mutate(DIST = name)
})
surv_data |> ggplot(aes(time, est, color=DIST, linetype=TRT)) +
geom_line() + labs(title = "Survival functions across fits",x="Months",y=NULL) +
scale_x_continuous(breaks=seq(0,120,12)) +
scale_y_continuous(labels=scales::percent_format()) + theme_classic()Fitted hazard functions
hazard_data <- imap_dfr(fit_results, function(mod, name) {
summary(mod, type = "hazard", tidy = TRUE, t = seq(0, 120, by = 1)) |> mutate(DIST = name)
})
ggplot(hazard_data, aes(x = time, y = est, color = DIST, linetype=TRT)) +
geom_line() + labs(title = "Hazard functions across fits",x="Months",y="Hazard") + coord_cartesian(ylim=c(0,0.15)) +
scale_x_continuous(breaks=seq(0,120,12)) + theme_classic()KM-curve with fitted hazard (llogis)
Show code
fit_results_PFS <- list()
for (d in dist_list) {
fit_results_PFS[[d]] <- flexsurvreg(Surv(PFSMOCR, PFSCR) ~ TRT, data = pds_WT|> filter(PFSDYCR!=0), dist = d)
}
fit <- survfit(Surv(DTHMO,DTH) ~ TRT, data = pds_WT)
pred_surv <- summary(fit_results[["llogis"]], t = seq(0, 120, by = 1), tidy = TRUE)
p <- ggsurvplot(fit, palette=RColorBrewer::brewer.pal(2,"Paired"),
data = pds_WT,
xlim = c(0, 120),
break.time.by = 12,
legend.labs = c("FOLFOX alone", "Panitumumab + FOLFOX"),
conf.int = TRUE,
conf.int.alpha = 0.2,
conf.int.style = "ribbon",
censor = F)
p$plot <- p$plot +
geom_line(data = pred_surv, aes(x = time, y = est, color = TRT),
linetype = "dashed", linewidth = 0.4) + labs(title="OS", color=NULL,fill=NULL) +
scale_y_continuous(labels=scales::percent_format())
fit <- survfit(Surv(PFSMOCR,PFSCR) ~ TRT, data = pds_WT)
pred_surv <- summary(fit_results_PFS[["llogis"]], t = seq(0, 120, by = 1), tidy = TRUE)
p2 <- ggsurvplot(fit, palette=RColorBrewer::brewer.pal(2,"Paired"),
data = pds_WT,
xlim = c(0, 120),
break.time.by = 12,
legend.labs = c("FOLFOX alone", "Panitumumab + FOLFOX"),
conf.int = TRUE,
conf.int.alpha = 0.2,
conf.int.style = "ribbon",
censor = F)
p2$plot <- p2$plot +
geom_line(data = pred_surv, aes(x = time, y = est, color = TRT),
linetype = "dashed", linewidth = 0.4) + labs(title="PFS", color=NULL,fill=NULL) +
scale_y_continuous(labels=scales::percent_format())
p$plot + p2$plot & theme(legend.position=c(0.75,0.9))Partitioned survival model
To make the data fit into the format of the hesim package we need to do some data management. I make some helper functions to reverse the changes i make.
# what i assume are people included on progression the PFS can't be calculated, these 16 individuals are excluded
pds_ex <- pds_WT |> filter(PFSDYCR!=0)
end_data <- pds_ex |>
mutate(strategy_id = if_else(TRT=="FOLFOX alone", 2, 1)) |>
select(age = AGE, sex = SEX, strategy_id,
endpoint1_time = DTHMO, endpoint1_status = DTH,
endpoint2_time = PFSMOCR, endpoint2_status = PFSCR)add_strategy <- function(x) { x |> mutate(strategy_id = case_when(strategy_id == 1 ~ "Panitumumab + FOLFOX",
strategy_id == 2 ~ "FOLFOX alone")) }
add_state <- function(x) { x |> mutate(state_id = case_when(state_id == 1 ~ "Progression free",
state_id == 2 ~ "Progressed",
state_id == 3 ~ "Dead")) }
add_curve <- function(x) { x |> mutate(curve = case_when(curve == 1 ~ "PFS",
curve == 2 ~ "OS")) }I have to define patients (which i wont use here), strategies and states. These are defined from the trial. I collect all in the hesim_dat object.
patients <- data.frame(patient_id = c(1))
strategies <- data.frame(strategy_id = c(1,2), strategy_name = c("Panitumumab + FOLFOX", "FOLFOX alone"))
states <- data.frame(state_id = c(1,2), state_name = c("Progression free","Progressed"))
hesim_dat <- hesim_data(patients = patients, strategies = strategies, states = states)I fit the survival using flexsurv and the log logistic hazard.
fit_os <- flexsurvreg(Surv(endpoint1_time, endpoint1_status) ~ strategy_id, data = end_data, dist = "llogis")
fit_pfs <- flexsurvreg(Surv(endpoint2_time, endpoint2_status) ~ strategy_id, data = end_data, dist = "llogis")
psfit_wei <- flexsurvreg_list(fit_pfs, fit_os)I define utility. Unfortunately they didn’t publizice the utility data from this trial, so this is based on an assumption.
utility_tbl <- stateval_tbl(tbl = data.frame(state_id = states$state_id, min = c(0.8,0.7), max = c(0.9,0.8)),
dist = "unif")I define cost. This is based on the drug cost as i can find is around 80.000 USD for an average patient, this fits with around 4000 per month.
drugcost_tbl <- stateval_tbl(tbl = data.frame(strategy_id = strategies$strategy_id, est = c(5000, 1000)),
dist = "fixed")Some preparing the model
surv_input_data <- expand(hesim_dat, by = c("strategies", "patients"))
survmods <- create_PsmCurves(psfit_wei, input_data = surv_input_data, n = 500,
uncertainty = "bootstrap", est_data = end_data)
utilitymod <- create_StateVals(utility_tbl, n = 500, hesim_data = hesim_dat)
drugcostmod <- create_StateVals(drugcost_tbl, n = 500, hesim_data = hesim_dat)
psm <- Psm$new(survival_models = survmods,
utility_model = utilitymod,
cost_models = list(drug = drugcostmod))The hesimpackage uses some fancy Object Oriented Programming (OOP) this is not very common in R outside internal package use, however here are some of it. This means that if this looks weird as this is “mutable” compared to the functional programming approach of “immutable” code, that is the code will have no side-effects (a function should only take inputs and return an output).
psm$sim_survival(t = seq(0, 120, by = 1))
psm$sim_stateprobs()
psm$sim_costs(dr = 0.035)
psm$sim_qalys(dr = 0.035)Fitted curves
Not that the model is simulated we can make some output, first lets take a look of the bootstrapped survival data:
psm$survival_ |>
add_curve() |> add_strategy() |>
ggplot(aes(x=t,y=survival, color=curve)) +
geom_jitter(size=0.5) + facet_wrap(~strategy_id) + labs(title="Bootstrapped survival") +
scale_y_continuous(labels=scales::percent_format())psm$survival_ |>
add_curve() |> add_strategy() |>
group_by(strategy_id, t,curve) |> summarize(survival = mean(survival)) |>
ggplot(aes(x=t,y=survival, color=curve, fill=curve, linetype=strategy_id)) +
geom_line(size=0.5) + labs(title="Mean of bootstrapped survival") +
scale_y_continuous(labels=scales::percent_format())State diagram
psm$stateprobs_ |> add_strategy() |> add_state() |>
group_by(strategy_id, t,state_id) |> summarize(prob = mean(prob)) |>
ggplot(aes(y=prob, x=t, color=state_id)) + geom_line(size=1) + facet_wrap(~strategy_id) +
labs(title="State diagram", x="Months", y=NULL, color=NULL) +
scale_y_continuous(labels = scales::percent_format())psm$stateprobs_ |> add_strategy() |> add_state() |>
group_by(strategy_id, t, state_id) |> summarize(prob = mean(prob)) |>
ggplot(aes(y=prob, x=t, fill=state_id)) + geom_area(alpha=0.6) +
facet_wrap(~strategy_id) + labs(title="State diagram", x="Months", y=NULL,fill=NULL) +
scale_y_continuous(labels = scales::percent_format())CE-summary
We can run the cost-effectiveness analysis and return a summarized dataframe.
ce_sim <- psm$summarize()
cea_out <- cea(ce_sim, dr_qalys = 0.035, dr_costs = 0.035, k = seq(0, 300000, 1000))
cea_pw_out <- cea_pw(ce_sim, comparator = 2, dr_qalys = 0.035, dr_costs = 0.035)
reactable(signif(cea_pw_out$summary,4)|> add_strategy(),bordered = T)reactable(signif(cea_out$summary,4)|> add_strategy(),bordered = T)CE-plane
#plot_ceplane(cea_pw_out, k = 100000)
cea_pw_out$delta |> add_strategy() |>
ggplot(aes(x=ie,y=ic)) + geom_point(color="#1F78B4") +
geom_hline(yintercept = 0) + geom_vline(xintercept = 0) +
geom_abline(intercept = 0, slope = 100000,linetype="dashed") +
scale_y_continuous(limits=c(-15000,90000), breaks=seq(-15000,100000,15000), labels = scales::dollar_format()) +
scale_x_continuous(limits=c(-1,3), breaks=seq(-5,5,0.5))+
labs(x="Incremental QALYs",y="Incremental cost", title="ICER plane")CEAC
cea_out$mce |> add_strategy() |>
ggplot(aes(x=k,y=prob,color=strategy_id)) + geom_line(size=1) +
scale_y_continuous(labels = scales::percent_format()) +
labs(y="Cost-effective", x="WTP",color=NULL, title="CEAC") +
scale_x_continuous(breaks=seq(0,300000,50000), labels = scales::dollar_format())Assignments
- Load the data
- Plot a histogram of age by treatment group
- Estimate median OS for each group with KM
- Plot a KM curve (try stratifying by RACE or ECOG)
- Fit a coxph model adjusting for basic covariates
- using flexsurv to fit different distributions (see AIC and BIC for best fit)
- Plot extrapolation
- Fit a PSM