Abstract
In machine learning, classification involves identifying the categories or classes to which a new observation belongs based on a training set. The performance of a classification model is generally measured by its accuracy on a test set. The first step in developing a classification model is to divide an acquired dataset into training and test sets through random sampling. In general, random sampling does not guarantee that test accuracy reflects the performance of the developed classification model: if random sampling produces biased training/test sets, the resulting classification model may also be biased. In this study, we show the problems of random sampling and propose balanced sampling as an alternative. We also propose a measure for evaluating sampling methods. We perform empirical experiments using benchmark datasets to verify that our sampling algorithm produces proper training and test sets. The results confirm that our method produces better training and test sets than random sampling and several non-random sampling methods.
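The abstract does not specify the paper's balanced sampling algorithm, but the core idea it contrasts with random sampling can be sketched with a class-stratified split, in which each class contributes the same fraction of examples to the test set. The function names and the toy dataset below are illustrative assumptions, not the paper's method:

```python
# Sketch: plain random sampling vs. a class-stratified ("balanced") split.
# This is an illustration of the general idea, not the paper's algorithm.
import random
from collections import Counter, defaultdict

def random_split(labels, test_frac=0.3, seed=0):
    """Plain random sampling: class proportions are not guaranteed."""
    rng = random.Random(seed)
    idx = list(range(len(labels)))
    rng.shuffle(idx)
    cut = int(len(idx) * test_frac)
    return idx[cut:], idx[:cut]          # train indices, test indices

def balanced_split(labels, test_frac=0.3, seed=0):
    """Stratified sampling: each class contributes test_frac of its items."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        cut = int(round(len(idx) * test_frac))
        test.extend(idx[:cut])
        train.extend(idx[cut:])
    return train, test

# Toy imbalanced dataset: 90 examples of class 0, 10 of class 1.
labels = [0] * 90 + [1] * 10
tr, te = balanced_split(labels)
print(Counter(labels[i] for i in te))    # → Counter({0: 27, 1: 3})
```

With a 9:1 class imbalance, the stratified split always keeps that ratio in the test set, whereas plain random sampling may over- or under-represent the minority class, which is the kind of bias the abstract argues against.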
