A drug is developed with the intention of treating the entire population of patients with a given disease. However, if a drug is efficacious in only a fraction of that population, a conventional clinical trial is unlikely to detect its efficacy unless the responsive subpopulation is sufficiently large. Conversely, approved drugs are sometimes withdrawn from the market after the post-marketing discovery of unexpected toxicity that was not detected in extensive pre-clinical and clinical studies. A plausible explanation for such unanticipated adverse events is the existence of relatively small, unidentified, hypersensitive subpopulations: the adverse events become apparent only when the drug is administered to a large segment of the general population. A main goal of pharmacogenomics in personalized medicine is to develop genomic signatures that predict patients' responses to drug or biologic therapy and thereby guide treatment decisions.
Recent advances in molecular technologies make it possible to screen a large number of potential markers in a single experiment and may provide more sensitive identification methods with increased predictive accuracy. This article presents a statistical model that distinguishes two types of biomarkers for treatment decisions, biomarkers of susceptibility and biomarkers of response, and proposes an approach for identifying the fraction of susceptible patients who should be spared unnecessary treatment. The approach involves two steps. The first step is to identify a set of biomarkers of susceptibility from a mixture of biomarkers of susceptibility and biomarkers of response. The second step is to develop a classifier based on the identified biomarkers; because the number of susceptible patients is generally much smaller than the number of non-susceptible patients, the classifier is built with an ensemble classification algorithm suited to class-imbalanced data. A simulation experiment is used to illustrate the approach and to discuss important issues and applications in developing biomarker classifiers that identify a small number of susceptible patients.
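As a rough illustration of the second step, the sketch below trains an ensemble on simulated class-imbalanced data. The abstract does not specify the ensemble algorithm, so everything here is an assumption made for illustration: the members are nearest-centroid classifiers, each trained on all susceptible samples plus an equally sized random draw from the non-susceptible samples, and predictions are combined by majority vote. The names `simulate`, `fit_balanced_ensemble`, and `predict`, and all parameter values, are hypothetical.

```python
import random


def simulate(n_susceptible=20, n_other=180, n_markers=5, shift=2.0, seed=1):
    """Simulate marker values; susceptible patients (label 1) are the minority.

    Assumption: each marker is Gaussian with unit variance and a mean shift
    of `shift` in susceptible patients. This is not the model in the talk.
    """
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_susceptible + n_other):
        label = 1 if i < n_susceptible else 0
        X.append([rng.gauss(shift * label, 1.0) for _ in range(n_markers)])
        y.append(label)
    return X, y


def centroid(rows):
    """Column-wise mean of a list of equal-length marker vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two marker vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def fit_balanced_ensemble(X, y, n_members=25, seed=0):
    """Fit nearest-centroid members on class-balanced subsamples.

    Each member sees every minority sample and a random majority subsample
    of the same size, so no single member is dominated by the majority class.
    """
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == 1]
    majority = [x for x, label in zip(X, y) if label == 0]
    members = []
    for _ in range(n_members):
        drawn = rng.sample(majority, len(minority))
        members.append((centroid(minority), centroid(drawn)))
    return members


def predict(members, x):
    """Majority vote: 1 (susceptible) if most members place x nearer class 1."""
    votes = sum(1 for c1, c0 in members if sq_dist(x, c1) < sq_dist(x, c0))
    return 1 if 2 * votes > len(members) else 0


X, y = simulate()
members = fit_balanced_ensemble(X, y)
predictions = [predict(members, x) for x in X]

# Report accuracy separately for each class; overall accuracy is misleading
# when one class makes up only a small fraction of the patients.
sensitivity = sum(p for p, t in zip(predictions, y) if t == 1) / y.count(1)
specificity = sum(1 - p for p, t in zip(predictions, y) if t == 0) / y.count(0)
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```

The figures printed here are computed on the training data and are therefore optimistic; a realistic assessment would use cross-validation or an independent test set.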
Keywords: Class Prediction; Class Imbalanced; Gene Expression; Personalized Medicine
Biography: Dr. Chen is a Senior Mathematical Statistician at the U.S. Food and Drug Administration. He has over 200 scientific publications in peer-reviewed journals and numerous invited subject review articles in the areas of statistics, toxicology, and pharmacogenomics. His current research interests are (1) statistical and data mining methods for the analysis of high-dimensional data and (2) statistical modeling for risk assessment. Dr. Chen is an elected Fellow of the American Statistical Association and a member of the editorial boards of several pharmacogenomics, bioinformatics, and nutrition journals.