For the development of computer-aided diagnosis (CAD) systems, a classifier that can effectively differentiate more than two classes is often needed. For example, a detected object on an image may need to be classified as a malignant lesion, a benign lesion, or normal tissue. Currently, a three-class problem is usually treated as a two-stage, two-class problem, in which the detected object is first differentiated as a lesion or normal tissue, and, in the second stage, the lesion is further classified as malignant or benign. In this work, we explored methods for classification of an object into one of the three classes, and compared the three-class approach with the common two-class approach. We conducted Monte Carlo simulation studies to evaluate the dependence of the performance of 3-class classification schemes on design sample size and feature space configurations. A k-dimensional multivariate normal feature space with three classes having different means was assumed. Linear classifiers and artificial neural networks (ANNs) were examined. Receiver operating characteristic (ROC) analysis for the 3-class approach was explored under simplifying conditions. A performance index representing the normalized volume under the ROC surface (NVUS) was defined. Linear classifiers for classification of three classes and two classes were compared. We found that a 3-class approach with a linear classifier can achieve a higher NVUS than that of a 2-class approach. We further compared the performance of ANNs having three output nodes or one output node with that of a linear classifier. At large sample sizes, the performance of a 3-output-node ANN was essentially the same as that of a one-output-node ANN.
When the three class distributions had equal covariance matrices and the distances between pairs of class means were equal, the linear classifiers could reach a higher performance for the test samples than the ANNs when the design sample size was small; the linear classifiers and the ANNs approached the same performance in the limit of large design sample size. However, under complex feature space configurations, such as class means located along a line, the class in the middle was poorly differentiated from the other two classes by the linear classifiers for any dimensionality; the ANN outperformed the linear classifier at all design sample sizes studied. This simulation study may provide some useful information to guide the design of 3-class classifiers for various CAD applications.
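
The kind of Monte Carlo experiment described above can be illustrated with a minimal sketch. This is not the authors' code: it assumes identity covariance matrices, equal class priors, a pooled-covariance linear discriminant with one score per class, and plain test accuracy as a stand-in for the NVUS index, whose definition is not given here. All function names and default parameters (`equidistant_means`, `mean_test_accuracy`, `k=4`, `d=2.0`, and so on) are hypothetical choices made for illustration.

```python
import numpy as np


def equidistant_means(k, d):
    """Three class means in k >= 2 dimensions with equal pairwise distance d."""
    means = np.zeros((3, k))
    means[1, 0] = d
    means[2, 0] = d / 2.0
    means[2, 1] = d * np.sqrt(3.0) / 2.0
    return means


def draw(rng, means, n_per_class):
    """Draw n_per_class samples per class from N(mean, I) (assumed covariance)."""
    k = means.shape[1]
    x = np.vstack([rng.normal(m, 1.0, size=(n_per_class, k)) for m in means])
    y = np.repeat(np.arange(3), n_per_class)
    return x, y


def fit_linear(x, y):
    """Fit a pooled-covariance linear discriminant with equal priors."""
    k = x.shape[1]
    mu = np.array([x[y == c].mean(axis=0) for c in range(3)])
    centered = x - mu[y]
    pooled = centered.T @ centered / (len(x) - 3)
    pooled += 1e-6 * np.eye(k)  # small ridge keeps the solve stable at tiny n
    w = np.linalg.solve(pooled, mu.T).T  # one weight vector per class
    b = -0.5 * np.sum(w * mu, axis=1)
    return w, b


def predict(model, x):
    """Assign each sample to the class with the largest linear score."""
    w, b = model
    return np.argmax(x @ w.T + b, axis=1)


def mean_test_accuracy(n_design, k=4, d=2.0, n_trials=50, n_test=2000, seed=0):
    """Average test accuracy over repeated design sets of n_design per class."""
    rng = np.random.default_rng(seed)
    means = equidistant_means(k, d)
    x_test, y_test = draw(rng, means, n_test)  # fixed independent test set
    accs = []
    for _ in range(n_trials):
        x_tr, y_tr = draw(rng, means, n_design)
        model = fit_linear(x_tr, y_tr)
        accs.append(np.mean(predict(model, x_test) == y_test))
    return float(np.mean(accs))


if __name__ == "__main__":
    for n in (5, 20, 100, 500):
        print(n, round(mean_test_accuracy(n), 3))
```

Running the loop at the bottom prints the mean test accuracy for each design sample size per class. Replacing `equidistant_means` with means placed along a line, or replacing accuracy with an ROC-based index, would be the natural next steps, but both lie outside what this sketch covers.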