===Tree-structured survival models===
The Cox PH regression model is a linear model, similar to linear regression and logistic regression. Specifically, these methods assume that a single line, curve, plane, or surface is sufficient to separate groups (alive, dead) or to estimate a quantitative response (survival time). In some cases, alternative partitions give more accurate classification or quantitative estimates. One set of alternative methods is tree-structured survival models,<ref>{{Cite journal|last=Segal|first=Mark Robert|date=1988|title=Regression Trees for Censored Data|url=https://www.jstor.org/stable/2531894|journal=Biometrics|volume=44|issue=1|pages=35–47|doi=10.2307/2531894|jstor=2531894|s2cid=60974957 |url-access=subscription}}</ref><ref>{{Cite journal|last1=Leblanc|first1=Michael|last2=Crowley|first2=John|date=1993|title=Survival Trees by Goodness of Split|url=http://www.tandfonline.com/doi/abs/10.1080/01621459.1993.10476296|journal=Journal of the American Statistical Association|language=en|volume=88|issue=422|pages=457–467|doi=10.1080/01621459.1993.10476296|issn=0162-1459|url-access=subscription}}</ref><ref>{{Cite journal|last1=Ritschard|first1=Gilbert|last2=Gabadinho|first2=Alexis|last3=Muller|first3=Nicolas S.|last4=Studer|first4=Matthias|date=2008|title=Mining event histories: a social science perspective|url=http://www.inderscience.com/link.php?id=22538|journal=International Journal of Data Mining, Modelling and Management|language=en|volume=1|issue=1|pages=68|doi=10.1504/IJDMMM.2008.022538|issn=1759-1163}}</ref> including survival random forests.<ref name=":0">{{Cite journal|last1=Ishwaran|first1=Hemant|last2=Kogalur|first2=Udaya B.|last3=Blackstone|first3=Eugene H.|last4=Lauer|first4=Michael S.|date=2008-09-01|title=Random survival forests|journal=The Annals of Applied Statistics|volume=2|issue=3|doi=10.1214/08-AOAS169|s2cid=2003897|issn=1932-6157|doi-access=free|arxiv=0811.1645}}</ref> Tree-structured survival models may give more
accurate predictions than Cox models. Examining both types of models for a given data set is a reasonable strategy.

====Example survival tree analysis====
This example of a survival tree analysis uses the R{{nbsp}}package "rpart".<ref name=":1">{{Cite web|last1=Therneau|first1=Terry J.|last2=Atkinson|first2=Elizabeth J.|title=rpart: Recursive Partitioning and Regression Trees|url=https://CRAN.R-project.org/package=rpart|access-date=November 12, 2021|website=CRAN}}</ref> The example is based on 146 stage{{nbsp}}C prostate cancer patients in the data set stagec in rpart. Rpart and the stagec example are described in Atkinson and Therneau (1997),<ref>{{Cite book|last1=Atkinson|first1=Elizabeth J.|url=https://www.researchgate.net/publication/235665541|title=An introduction to recursive partitioning using the RPART routines|last2=Therneau|first2=Terry J.|publisher=Mayo Foundation|year=1997}}</ref> which is also distributed as a vignette of the rpart package.<ref name=":1" />

The variables in stagec are:
*'''pgtime''': time to progression, or last follow-up free of progression
*'''pgstat''': status at last follow-up (1=progressed, 0=censored)
*'''age''': age at diagnosis
*'''eet''': early endocrine therapy (1=no, 2=yes)
*'''ploidy''': diploid/tetraploid/aneuploid DNA pattern
*'''g2''': % of cells in G2 phase
*'''grade''': tumor grade (1-4)
*'''gleason''': Gleason grade (3-10)

The survival tree produced by the analysis is shown in the figure.
[[File:Survival tree for prostate cancer.png|thumb|700px|Survival tree for prostate cancer data set]]

Each branch in the tree indicates a split on the value of a variable. For example, the root of the tree separates subjects with grade < 2.5 from subjects with grade 2.5 or greater. Each terminal node shows the number of subjects in the node, the number of those subjects who had an event, and the event rate relative to the root.
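
As a rough illustration of these node summaries, the following Python sketch counts events and computes a relative event rate for the two children of a single split. The toy records, the helper names <code>event_rate</code> and <code>split</code>, and the plain events-per-follow-up-time rate are assumptions made for illustration. It is not the rpart implementation, which may rescale time or adjust the rates, and the numbers are not taken from the stagec data.

```python
# Toy records: (follow-up time, event indicator, tumor grade).
# All values are invented for illustration; they are NOT the stagec data.
subjects = [
    (8.0, 0, 1), (6.5, 0, 2), (7.2, 1, 2), (9.1, 0, 1),
    (2.1, 1, 3), (3.4, 1, 3), (5.0, 0, 4), (1.2, 1, 4),
]


def event_rate(rows):
    """Events per unit of follow-up time (a simple constant-hazard estimate)."""
    events = sum(status for _, status, _ in rows)
    follow_up = sum(time for time, _, _ in rows)
    return events / follow_up


def split(rows, threshold):
    """Partition rows into grade < threshold and grade >= threshold."""
    left = [r for r in rows if r[2] < threshold]
    right = [r for r in rows if r[2] >= threshold]
    return left, right


root_rate = event_rate(subjects)
low_grade, high_grade = split(subjects, 2.5)

for label, node in (("grade < 2.5", low_grade), ("grade >= 2.5", high_grade)):
    n_events = sum(status for _, status, _ in node)
    relative = event_rate(node) / root_rate  # rate compared to the root node
    print(f"{label}: {n_events}/{len(node)} events, relative rate {relative:.2f}")
```

A relative rate below 1 marks a node whose subjects progress more slowly than the cohort as a whole, and a rate above 1 marks a node with faster progression.
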
In the node on the far left, the values 1/33 indicate that one of the 33 subjects in the node had an event, and that the relative event rate is 0.122. In the node at the bottom far right, the values 11/15 indicate that 11 of the 15 subjects in the node had an event, and that the relative event rate is 2.7.

====Survival random forests====
An alternative to building a single survival tree is to build many survival trees, each constructed from a bootstrap sample of the data, and to average the trees to predict survival.<ref name=":0" /> This is the method underlying survival random forest models. Survival random forest analysis is available in the R{{nbsp}}package "randomForestSRC".<ref>{{Cite web|last1=Ishwaran|first1=Hemant|last2=Kogalur|first2=Udaya B.|title=randomForestSRC: Fast Unified Random Forests for Survival, Regression, and Classification (RF-SRC)|url=https://CRAN.R-project.org/package=randomForestSRC|access-date=November 12, 2021|website=CRAN}}</ref>

The randomForestSRC package includes an example survival random forest analysis using the data set pbc. The data come from the Mayo Clinic trial in primary biliary cirrhosis (PBC) of the liver, conducted between 1974 and 1984. In the example, the random forest survival model gives more accurate predictions of survival than the Cox PH model. The prediction errors are estimated by [[Bootstrapping (statistics)|bootstrap re-sampling]].
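
The idea of building many trees on resampled data and averaging them can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the randomForestSRC algorithm: each "tree" is a single split at the sample median of one covariate, each leaf predicts with a Kaplan–Meier estimate, and the forest averages the survival probabilities of its trees. The toy data and all function names (<code>kaplan_meier</code>, <code>fit_stump</code>, <code>fit_forest</code>, <code>predict_forest</code>) are hypothetical.

```python
import random

# Toy records: (time, event indicator, risk covariate x). Invented values.
data = [
    (9.0, 0, 1.0), (8.5, 0, 1.5), (7.0, 1, 2.0), (8.0, 0, 2.5),
    (6.0, 1, 3.0), (4.0, 1, 3.5), (5.0, 0, 4.0), (3.0, 1, 4.5),
    (2.0, 1, 5.0), (1.5, 1, 5.5),
]


def kaplan_meier(rows, t):
    """Kaplan-Meier estimate of the survival probability S(t)."""
    surv = 1.0
    event_times = sorted({time for time, status, _ in rows
                          if status == 1 and time <= t})
    for event_time in event_times:
        at_risk = sum(1 for time, _, _ in rows if time >= event_time)
        deaths = sum(1 for time, status, _ in rows
                     if time == event_time and status == 1)
        surv *= 1.0 - deaths / at_risk
    return surv


def fit_stump(rows):
    """One-split 'tree': split on x at the median of the given sample."""
    xs = sorted(x for _, _, x in rows)
    threshold = xs[len(xs) // 2]
    left = [r for r in rows if r[2] < threshold]
    right = [r for r in rows if r[2] >= threshold]
    # If the left leaf is empty, fall back to the whole sample.
    return threshold, left or rows, right


def predict_stump(stump, x, t):
    """Survival probability at time t from the leaf that x falls into."""
    threshold, left, right = stump
    return kaplan_meier(left if x < threshold else right, t)


def fit_forest(rows, n_trees=200, seed=0):
    """Fit each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(rows) for _ in rows])
            for _ in range(n_trees)]


def predict_forest(forest, x, t):
    """Average the per-tree survival probabilities."""
    return sum(predict_stump(s, x, t) for s in forest) / len(forest)


forest = fit_forest(data)
low_risk = predict_forest(forest, x=1.5, t=5.0)
high_risk = predict_forest(forest, x=5.0, t=5.0)
print(f"S(5 | x=1.5) = {low_risk:.2f}, S(5 | x=5.0) = {high_risk:.2f}")
```

Real survival forests grow deeper trees, choose split variables and split points by a survival-specific criterion, and may average a different quantity than the survival probability used here. The sketch only shows the resample-then-average structure.
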