In magnetic resonance imaging (MRI), intensity inhomogeneity refers to an acquisition artifact which introduces
a non-linear variation in the signal intensities within the image. Intensity inhomogeneity is known to significantly
affect computerized analysis of MRI data (such as automated segmentation or classification procedures), hence
requiring the application of bias field correction (BFC) algorithms to account for this artifact. Quantitative
evaluation of BFC schemes is typically performed using generalized intensity-based measures (percent coefficient
of variation, %CV) or information-theoretic measures (entropy). While some investigators have previously
empirically compared BFC schemes in the context of different domains (using changes in %CV and entropy
to quantify improvements), no consensus has emerged as to the best BFC scheme for any given application.
The motivation for this work is that the choice of a BFC scheme for a given application should be dictated by
application-specific measures rather than ad hoc measures such as entropy and %CV. In this paper, we
address the problem of determining an optimal BFC algorithm in the context of a computer-aided
diagnosis (CAD) scheme for prostate cancer (CaP) detection from T2-weighted (T2w) MRI. One goal of this work
is to identify a BFC algorithm that will maximize the CaP classification accuracy (measured in terms of the area
under the ROC curve or AUC). A secondary aim of our work is to determine whether measures such as %CV and
entropy are correlated with a classifier-based objective measure (AUC). Determining the presence or absence of
these correlations is important to understand whether domain independent BFC performance measures such as
%CV and entropy should be used to identify the optimal BFC scheme for any given application. In order to answer
these questions, we quantitatively compared 3 popular BFC algorithms (N3, PABIC, and the method of Cohen
et al.) on a cohort of 10 clinical 3 Tesla prostate T2w MRI datasets (comprising 39 2D MRI slices). The results
of BFC via each of the algorithms were evaluated in terms of %CV, entropy, and classifier AUC for CaP
detection from T2w MRI. The CaP classifier was trained and evaluated on a per-pixel basis using annotations
of CaP obtained via registration of T2w MRI and ex vivo whole-mount histology sections. Our results revealed
that different BFC schemes optimized different performance measures; that is, the BFC
scheme identified by minimization of %CV and entropy was not the one that maximized AUC. Moreover,
existing BFC evaluation measures (%CV, entropy) did not correlate with AUC (application-based evaluation),
but did correlate with each other, suggesting that domain-specific performance measures should be considered
when choosing an appropriate BFC scheme. Our results also revealed that N3 provided
the best correction of bias field artifacts in prostate MRI data, when the goal was to identify prostate cancer.
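
For readers unfamiliar with the two domain-independent measures discussed above, the following is a minimal sketch of how %CV and entropy are commonly computed from pixel intensities. It is not the implementation used in this work: the function names, the 256-bin histogram, and the synthetic bias field are all assumptions made for illustration.

```python
import numpy as np

def percent_cv(pixels):
    """Percent coefficient of variation: 100 * std / mean of the intensities."""
    pixels = np.asarray(pixels, dtype=float)
    return 100.0 * pixels.std() / pixels.mean()

def histogram_entropy(pixels, bins=256):
    """Shannon entropy (in bits) of the intensity histogram.

    The bin count is an assumed parameter; the value depends on it.
    """
    counts, _ = np.histogram(np.asarray(pixels, dtype=float), bins=bins)
    p = counts[counts > 0] / counts.sum()  # drop empty bins before taking logs
    return float(-(p * np.log2(p)).sum())

# Hypothetical toy region: uniform tissue with a smooth multiplicative bias field.
tissue = np.full((64, 64), 100.0)
bias = np.linspace(0.8, 1.2, 64)[None, :]  # assumed left-to-right gain ramp
biased = tissue * bias

print(percent_cv(tissue), percent_cv(biased))
print(histogram_entropy(tissue), histogram_entropy(biased))
```

In this toy setting both measures are lower for the uniform region than for the biased one, which is why lower %CV and entropy are usually read as better correction. Neither measure refers to the downstream classification task.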