This article provides an introduction to ensemble statistical procedures as a special case of algorithmic methods. The discussion begins with classification and regression trees (CART) as a didactic device to introduce many of the key issues. Following the material on CART is a consideration of cross-validation, bagging, random forests, and boosting. Major points are illustrated with analyses of real data.
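As a quick illustration of the procedures the article introduces, the sketch below fits a single CART-style classification tree, bagged trees, a random forest, and a boosted ensemble, and compares them by 10-fold cross-validation. This is a minimal sketch under stated assumptions: the scikit-learn API and the synthetic data are stand-ins for demonstration only, not the software or the real data analyzed in the article.

```python
# Minimal sketch (illustrative only): a single tree plus three ensemble
# variants discussed in the article, compared via cross-validation.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import (BaggingClassifier, RandomForestClassifier,
                              AdaBoostClassifier)

# Synthetic stand-in for the real data analyzed in the article.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

models = {
    "CART": DecisionTreeClassifier(random_state=0),
    # BaggingClassifier defaults to decision trees as base learners.
    "Bagging": BaggingClassifier(n_estimators=100, random_state=0),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    # AdaBoost with the default decision-stump base learner.
    "Boosting": AdaBoostClassifier(n_estimators=100, random_state=0),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10)  # 10-fold cross-validation
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```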