Random Forest Based on Pseudo-Values for Competing Risks Analysis
Ulla B. Mogensen, Thomas A. Gerds
Department of Biostatistics, University of Copenhagen, Copenhagen, Denmark

A challenging task in medical statistics is to establish computer algorithms that gather information from previous patients and provide evidence-based medical guidance. Tools from machine learning are popular for analyzing complex and high-dimensional data. One such tool is the random forest, an ensemble method that combines the results of many classification or regression trees to achieve accurate predictions. The idea presented here is to use time-dependent pseudo-values to build trees that can predict an event of interest in the presence of competing risks. Data from the Copenhagen Stroke Study are used for illustration. The predictive performance of the proposed method is compared with that of an extension of the random survival forest approach, and with alternative regression strategies based on cause-specific Cox and Fine-Gray models.
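
As a rough illustration of the pseudo-value idea, the sketch below computes leave-one-out (jackknife) pseudo-values for the cumulative incidence of one cause at a fixed time point. It assumes the pseudo-values are derived from the Aalen-Johansen estimator, which is the usual construction but is not spelled out in this abstract. The function names and the coding of censoring as cause 0 are hypothetical choices made for the example.

```python
def cumulative_incidence(times, causes, t, cause=1):
    """Aalen-Johansen estimate of P(T <= t, event type == cause).

    Assumed coding: causes[i] == 0 means subject i was censored.
    """
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    surv = 1.0  # probability of being free of any event just before the current time
    cif = 0.0
    i = 0
    while i < len(data) and data[i][0] <= t:
        s = data[i][0]
        d_cause = d_any = n_leaving = 0
        # Collect all subjects whose observed time is tied at s.
        while i < len(data) and data[i][0] == s:
            c = data[i][1]
            if c == cause:
                d_cause += 1
            if c != 0:
                d_any += 1
            n_leaving += 1
            i += 1
        cif += surv * d_cause / n_at_risk
        surv *= 1.0 - d_any / n_at_risk
        n_at_risk -= n_leaving
    return cif


def pseudo_values(times, causes, t, cause=1):
    """Jackknife pseudo-values: n * F(t) - (n - 1) * F_without_i(t)."""
    times, causes = list(times), list(causes)
    n = len(times)
    full = cumulative_incidence(times, causes, t, cause)
    values = []
    for i in range(n):
        loo_times = times[:i] + times[i + 1:]
        loo_causes = causes[:i] + causes[i + 1:]
        loo = cumulative_incidence(loo_times, loo_causes, t, cause)
        values.append(n * full - (n - 1) * loo)
    return values
```

Without censoring, each pseudo-value reduces to the indicator that the subject experienced the event of interest by time t. With censoring, the pseudo-values are defined for every subject and could serve as the response when growing regression trees, repeated over a grid of time points.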

Keywords: survival analysis; competing risks; random forests; prediction

Biography: Ulla Mogensen, Ph.D. student in biostatistics, University of Copenhagen.