===Cox proportional hazards (PH) regression analysis===
Kaplan–Meier curves and log-rank tests are most useful when the predictor variable is categorical (e.g., drug vs. placebo), or takes a small number of values (e.g., drug doses 0, 20, 50, and 100 mg/day) that can be treated as categorical. The log-rank test and KM curves do not work easily with quantitative predictors such as gene expression, white blood count, or age. For quantitative predictor variables, an alternative method is [[Proportional hazards model#The Cox model|Cox proportional hazards regression]] analysis. Cox PH models also work with categorical predictor variables, which are encoded as {0,1} indicator or dummy variables. The log-rank test is a special case of a Cox PH analysis, and can be performed using Cox PH software.

====Example: Cox proportional hazards regression analysis for melanoma====
This example uses the melanoma data set from Dalgaard Chapter 14.<ref name="Dalgaard2008">{{Citation |last1= Dalgaard |first1= Peter |title= Introductory Statistics with R |edition=Second |year=2008 |publisher= Springer |isbn= 978-0387790534 }} </ref> The data are in the R package ISwR. Cox proportional hazards regression using{{nbsp}}R gives the results shown in the box.

[[File:Cox proportional hazards regression output for melanoma data set.png|thumb|400px|right|Cox proportional hazards regression output for melanoma data. Predictor variable is sex (1: female, 2: male).]]

The Cox regression results are interpreted as follows.
*Sex is encoded as a numeric vector (1: female, 2: male). The R{{nbsp}}summary for the Cox model gives the hazard ratio (HR) for the second group relative to the first group, that is, male versus female.
*coef = 0.662 is the estimated logarithm of the hazard ratio for males versus females.
*exp(coef) = 1.94 = exp(0.662). The log of the hazard ratio (coef = 0.662) is transformed to the hazard ratio using exp(coef).
*The estimated hazard ratio of 1.94 indicates that males have a higher risk of death (lower survival rates) than females in these data.
*se(coef) = 0.265 is the standard error of the log hazard ratio.
*z = 2.5 = coef/se(coef) = 0.662/0.265. Dividing coef by its standard error gives the z score.
*p = 0.013. The p-value corresponding to z = 2.5 for sex is p = 0.013, indicating that there is a statistically significant difference in survival as a function of sex.

The summary output also gives the lower and upper bounds of the 95% confidence interval for the hazard ratio: lower 95% bound = 1.15; upper 95% bound = 3.26.

Finally, the output gives p-values for three alternative tests for overall significance of the model:
*Likelihood ratio test = 6.15 on 1 df, p=0.0131
*Wald test = 6.24 on 1 df, p=0.0125
*Score (log-rank) test = 6.47 on 1 df, p=0.0110

These three tests are asymptotically equivalent. For large enough N, they will give similar results. For small N, they may differ somewhat. The last row, "Score (logrank) test", is the result for the log-rank test, with p=0.011. It matches the result of a separately run log-rank test because the log-rank test is a special case of Cox PH regression. The likelihood ratio test has better behavior for small sample sizes, so it is generally preferred.
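
The arithmetic linking coef, se(coef), the hazard ratio, the z score, the p-value and the confidence bounds can be checked by hand. The following is a minimal Python sketch, not the R code that produced the output; the function name <code>wald_summary</code> is hypothetical, and it assumes the usual Wald-type (normal approximation) formulas, taking as inputs only the rounded values coef = 0.662 and se(coef) = 0.265 quoted above.

```python
import math
from statistics import NormalDist


def wald_summary(coef, se, level=0.95):
    """Recompute Wald-type quantities from a log hazard ratio and its standard error.

    Hypothetical helper for illustration; assumes a normal approximation
    for the estimated coefficient.
    """
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    z = coef / se
    # Two-sided p-value for the null hypothesis coef = 0
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    # Critical value for the requested confidence level (about 1.96 for 95%)
    crit = std_normal.inv_cdf(1 - (1 - level) / 2)
    return {
        "hazard_ratio": math.exp(coef),
        "z": z,
        "p_value": p_value,
        "wald_chi_square": z ** 2,  # Wald statistic on 1 df
        "ci_lower": math.exp(coef - crit * se),
        "ci_upper": math.exp(coef + crit * se),
    }


# Rounded inputs taken from the melanoma output described in the text
result = wald_summary(coef=0.662, se=0.265)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

With these rounded inputs, the recomputed hazard ratio, z score, Wald statistic and confidence bounds round to the values quoted from the R output (1.94, 2.5, 6.24, 1.15 and 3.26). The recomputed p-value is close to the Wald test p-value of 0.0125. Small differences in later decimal places are expected because the inputs are themselves rounded.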

====Cox model using a covariate in the melanoma data====
The Cox model extends the log-rank test by allowing the inclusion of additional covariates.<ref>{{Cite journal |last1=Saegusa |first1=Takumi |last2=Di |first2=Chongzhi |last3=Chen |first3=Ying Qing |date=September 2014 |title=Hypothesis testing for an extended cox model with time-varying coefficients |journal=Biometrics |language=en |volume=70 |issue=3 |pages=619–628 |doi=10.1111/biom.12185 |pmid=24888739 |issn=0006-341X|pmc=4247822 }}</ref> This example uses the melanoma data set, where the predictor variables include a continuous covariate, the thickness of the tumor (variable name = "thick").

[[File:Histograms of melanoma thickness.png|thumb|700px|Histograms of melanoma tumor thickness]]

In the histograms, the thickness values are [[Skewness|positively skewed]] and do not have a [[Normal distribution|Gaussian]]-like, [[Symmetric probability distribution|symmetric]] distribution. Regression models, including the Cox model, generally give more reliable results with normally-distributed variables.{{Citation needed|date=February 2023}} For this example we may use a [[logarithm]]ic transform. The log of the thickness of the tumor looks to be more normally distributed, so the Cox models will use log thickness. The Cox PH analysis gives the results in the box.

[[File:Cox PH output for melanoma with thickness.png|thumb|500px|Cox PH output for melanoma data set with covariate log tumor thickness]]

The p-values for all three overall tests (likelihood ratio, Wald, and score) are significant, indicating that the model is significant. The p-value for log(thick) is 6.9e-07, with a hazard ratio HR = exp(coef) = 2.18, indicating a strong relationship between the thickness of the tumor and increased risk of death. By contrast, the p-value for sex is now p=0.088. The hazard ratio for sex is HR = exp(coef) = 1.58, with a 95% confidence interval of 0.934 to 2.68.
Because the confidence interval for HR includes 1, these results indicate that sex makes a smaller contribution to the hazard after controlling for the thickness of the tumor, and that its effect only trends toward significance. Examination of graphs of log(thickness) by sex and a t-test of log(thickness) by sex both indicate that there is a significant difference between men and women in the thickness of the tumor when they first see the clinician.

The Cox model assumes that the hazards are proportional. The proportional hazards assumption may be tested using the R{{nbsp}}function cox.zph(). A p-value less than 0.05 is taken as evidence that the hazards are not proportional. For the melanoma data we obtain p=0.222, so we cannot reject the null hypothesis that the hazards are proportional. Additional tests and graphs for examining a Cox model are described in the textbooks cited.

====Extensions to Cox models====
Cox models can be extended to deal with variations on the simple analysis.
*Stratification. The subjects can be divided into strata, where subjects within a stratum are expected to be relatively more similar to each other than to randomly chosen subjects from other strata. The regression parameters are assumed to be the same across the strata, but a different baseline hazard may exist for each stratum. Stratification is useful for analyses using matched subjects, for dealing with patient subsets, such as different clinics, and for dealing with violations of the proportional hazards assumption.
*Time-varying covariates. Some variables, such as gender and treatment group, generally stay the same in a clinical trial. Other clinical variables, such as serum protein levels or dose of concomitant medications, may change over the course of a study. Cox models may be extended for such time-varying covariates.
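
The two extensions above can be written compactly as modifications of the basic Cox model. The notation here is an assumption introduced for illustration and is not used elsewhere in this section: <math>h_0(t)</math> denotes the baseline hazard, <math>\beta</math> the vector of regression coefficients, <math>x</math> the covariate vector, and <math>s</math> an index over strata.

```latex
\begin{align}
  % Basic Cox model: one shared baseline hazard
  h(t \mid x) &= h_0(t)\,\exp\!\left(\beta^{\top} x\right) \\
  % Stratified model: same coefficients, a separate baseline hazard per stratum s
  h_s(t \mid x) &= h_{0s}(t)\,\exp\!\left(\beta^{\top} x\right), \qquad s = 1, \dots, S \\
  % Time-varying covariates: the covariate vector depends on time
  h(t \mid x(t)) &= h_0(t)\,\exp\!\left(\beta^{\top} x(t)\right)
\end{align}
```

In the stratified form, only the baseline hazard carries the stratum index, which reflects the statement that the regression parameters are shared across strata while each stratum may have its own baseline hazard.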