Breast cancer is one of the leading causes of death among women in Western countries. In recent years, screening
mammography has been established worldwide for the early detection of breast cancer, and computer-aided diagnosis
(CAD) is being developed to assist physicians in reading mammograms. A promising method for CAD is content-based
image retrieval (CBIR). Recently, we developed a classification scheme for suspicious tissue patterns based on the
support vector machine (SVM). In this paper, we continue moving towards automatic CAD of screening mammography.
The experiments are based on a total of 10,509 radiographs collected from different sources. Of these,
3,375 images carry one and 430 carry more than one chain-code annotation of cancerous
regions. In separate experiments, the data are divided into 12 and 20 classes, distinguishing four categories of
tissue density, three categories of pathology and, in the 20-class problem, two types of lesions.
Balancing the class sizes leaves 233 and 45 images in each of the 12 and 20 classes,
respectively. Using two-dimensional principal component analysis, features are extracted from small patches of
128 × 128 pixels and classified by means of an SVM. The overall accuracy of the raw classification was 61.6 % and
52.1 % for the 12-class and 20-class problems, respectively. The confusion matrices are assessed for detailed
analysis. Furthermore, an implementation of an SVM-based CBIR system for CADx in screening mammography is presented. In conclusion, with a
smarter patch extraction, the CBIR approach might reach precision rates that are helpful to physicians. This,
however, requires a more comprehensive evaluation on clinical data.
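
The class-balancing step mentioned above can be illustrated with a minimal sketch. It assumes that balancing means random undersampling of every class to the size of the smallest class; the abstract does not state how the balancing was actually performed. The function name balance_classes and the toy labels are hypothetical and serve only as an illustration.

```python
import random
from collections import defaultdict


def balance_classes(labels, seed=0):
    """Randomly undersample every class to the size of the smallest class.

    labels: list of hashable class labels, one per image.
    Returns the sorted list of indices that are kept.
    (Assumed strategy; not confirmed by the abstract.)
    """
    # Group image indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    # The smallest class determines how many images every class keeps.
    target = min(len(indices) for indices in by_class.values())

    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, target))
    return sorted(kept)


# Toy data (hypothetical): each label is a (density, pathology) pair.
labels = [(1, "normal")] * 7 + [(1, "benign")] * 3 + [(2, "malignant")] * 5
kept = balance_classes(labels)

# Count how many images remain in each class after balancing.
counts = defaultdict(int)
for i in kept:
    counts[labels[i]] += 1
print(dict(counts))
```

In this toy example the smallest class has three images, so every class retains three. Under the same assumption, the 233 and 45 images per class reported above would correspond to the sizes of the smallest classes in the 12-class and 20-class partitions.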