1. Introduction
It is well established that optical scattering signals are sensitive to alterations in the morphology and internal structure of tissue constituents. This has paved the way for the development of numerous optical tools that can be employed to monitor cells or cell nuclei, mitochondria, lysosomes, and fibrous networks for various diagnostic purposes [1,2]. Cell nuclei serve as repositories of genetic material and have thus far been a major focus for the diagnosis of diseases. In particular, changes in chromatin organization within cell nuclei are important indicators of precancer progression. Previous studies provide strong evidence that such changes can be detected via scattering-based optical modalities [3–14].
Subnuclear chromatin distribution can be modeled as a continuum of refractive index fluctuations [15]. For a quantitative description, one parameter of interest is the functional factor, which controls the shape of the refractive index correlation function. This parameter is related to chromatin packing scaling and is a measure of the heterogeneity of chromatin distribution. Two other parameters of interest are the characteristic length scale of refractive index fluctuations and the extent of refractive index fluctuations, which together allow for a more intuitive characterization of chromatin organization. The length scale of fluctuations represents the characteristic size of subnuclear structures and can be defined in terms of the spatial correlation length, which roughly indicates the distance over which the correlation of refractive index values drops to a negligible level. The extent of fluctuations, on the other hand, can be defined as the standard deviation of refractive index values within the nucleus and is directly related to the inhomogeneity of macromolecular density.
These two parameters are often lumped into a single quantity referred to as the disorder strength, expressed as the product of the correlation length and the standard deviation raised to a power that is usually set to 1 or 2 [3,4,9,14,16]. It is possible to extract information on the functional factor, the disorder strength, or both using scattering-based optical techniques along with relevant analytical formulations of light propagation or algorithms to analyze measurements. In fact, a number of prior studies on low-coherence enhanced backscattering spectroscopy, inverse spectroscopic optical coherence tomography, partial-wave spectroscopic microscopy, or quantitative phase imaging show that the nuclear functional factor and disorder strength tend to increase with the progression of cancer [3–5,8,13,14]. This most likely corresponds to chromatin compaction, which is expected to manifest as an increase in the heterogeneity of subnuclear chromatin organization. Reporting on the correlation length and the extent of fluctuations separately, as in Refs. 6 and 10, can possibly be more informative since independent assessment of changes in both parameters can lead to a more direct interpretation of alterations in chromatin distribution.
We recently demonstrated that azimuth-resolved optical scattering signals obtained from cell nuclei can provide significant insights into their internal refractive index profile [11]. Features calculated based on azimuth-dependent intensity variations in these two-dimensional signals are sensitive to the length scale and extent of subnuclear refractive index fluctuations; further, these features are not susceptible to changes in the overall size, shape, and mean refractive index of nuclei. Therefore, our results indicate that precancer-related changes in chromatin organization can be selectively monitored via analysis of two-dimensional scattering signals. An important question that arises is whether two-dimensional scattering signals can be used in an inverse scheme to extract the spatial correlation length and extent of refractive index fluctuations and thereby obtain a quantitative measure of subnuclear chromatin distribution.
Since an analytical formulation that links azimuth-resolved signals to the correlation length and extent of refractive index fluctuations is not feasible, it is best to resort to a data-driven approach. Machine learning and deep learning methods are increasingly being used in the field of biomedical optics. These methods have also been applied to scattering-based measurements, yet mainly for classification purposes [12,13,17–23]. In this work, we present a convolutional neural network (CNN)-based regression analysis aimed at the extraction of the correlation length and extent of refractive index fluctuations from two-dimensional scattering signals. Our dataset consists of numerically computed signals for three-dimensional nuclear models constructed with varying values of these two parameters. The results obtained with this dataset show that CNN-based regression on scattering signals provides a potential means to extract both parameters and make predictions on subnuclear chromatin organization.
2. Methods
2.1. Two-Dimensional Optical Scattering Signals
The methodology for obtaining the set of optical scattering signals used in the study presented here has been previously described [11]. Briefly, nuclear models were constructed in voxelated grids as spheres with a radius of or ellipsoids with semiaxis lengths of , , and . The mean refractive index of the nuclei was set to 1.40, and the refractive index of the embedding cytoplasm was assumed to be 1.36. A stochastic approach was adopted to generate subnuclear refractive index fluctuations; the spatial refractive index profile conformed to a Gaussian correlation function. The values of the correlation length were selected from {0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0} μm, and the values of the extent of refractive index fluctuations were selected from {0.005, 0.010, 0.015, 0.020, 0.025, 0.030, 0.035}. Three different nuclear models were constructed for each combination of correlation length and extent of fluctuations considered. Overall, there were a total of 198 models (including 99 spherical and 99 ellipsoidal models) available for our study.
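As a rough illustration of the stochastic construction described above, the following Python sketch builds a spherical model whose internal refractive index fluctuations have an approximately Gaussian correlation function. It is not the authors' code: the spectral-filtering method, the correlation convention exp(-r^2 / lc^2), and the grid size, voxel size, and radius in the usage lines are assumptions chosen for illustration. Only the mean indices (1.40 and 1.36) and the parameter values come from the text.

```python
import numpy as np

N_NUCLEUS = 1.40    # mean refractive index of the nucleus (from the text)
N_CYTOPLASM = 1.36  # refractive index of the cytoplasm (from the text)


def correlated_field(shape, voxel_um, corr_len_um, rng):
    """Random field with an approximately Gaussian correlation function.

    White noise is filtered in the Fourier domain. Assumed convention:
    C(r) ~ exp(-r^2 / lc^2), whose power spectrum is ~ exp(-k^2 lc^2 / 4),
    so the amplitude filter is exp(-k^2 lc^2 / 8). The FFT makes the field
    periodic across the grid boundaries.
    """
    white = rng.standard_normal(shape)
    axes = [2.0 * np.pi * np.fft.fftfreq(n, d=voxel_um) for n in shape]
    kx, ky, kz = np.meshgrid(*axes, indexing="ij")
    k_squared = kx**2 + ky**2 + kz**2
    amplitude = np.exp(-k_squared * corr_len_um**2 / 8.0)
    return np.fft.ifftn(np.fft.fftn(white) * amplitude).real


def spherical_nucleus(grid_n, voxel_um, radius_um, corr_len_um, sigma_n, seed=0):
    """Voxelated sphere with a fluctuating index, embedded in uniform cytoplasm."""
    rng = np.random.default_rng(seed)
    axis = (np.arange(grid_n) - (grid_n - 1) / 2.0) * voxel_um
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    inside = x**2 + y**2 + z**2 <= radius_um**2

    field = correlated_field((grid_n,) * 3, voxel_um, corr_len_um, rng)
    # Normalize over the nuclear voxels so that the fluctuations there have
    # zero mean and a standard deviation equal to sigma_n.
    fluct = field[inside]
    fluct = (fluct - fluct.mean()) / fluct.std() * sigma_n

    index = np.full((grid_n,) * 3, N_CYTOPLASM)
    index[inside] = N_NUCLEUS + fluct
    return index, inside


# Hypothetical geometry: 48^3 grid, 0.15 um voxels, 3 um radius.
model, mask = spherical_nucleus(48, 0.15, 3.0, corr_len_um=0.6, sigma_n=0.02)
print(model[mask].mean(), model[mask].std())
```

An ellipsoidal model would follow the same pattern with the mask replaced by (x/a)^2 + (y/b)^2 + (z/c)^2 <= 1 for semiaxis lengths a, b, and c.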
Each constructed nuclear model was fed as input into an in-house finite-difference time-domain (FDTD) simulation code to numerically compute the resulting optical scattering response at a wavelength of 800 nm. The simulation output consisted of a two-dimensional array of scattered light intensities $I(\theta, \phi)$, where $\theta$ was the polar scattering angle, defined to be the angle between the incident and scattered light directions, and $\phi$ was the azimuthal scattering angle, defined to be the angle between the incident wave polarization direction and the scattering plane, both in degrees. Note that each simulation output covers a dynamic range of . The two-dimensional FDTD signals presented henceforth and employed for analysis were obtained by applying a full-scale contrast stretch algorithm, producing image-like functions with values varying between 0 and 255.
Figure 1 shows sample FDTD signals obtained for spherical or ellipsoidal nuclear models with different values of the correlation length and extent of fluctuations. Depictions of central cross-sections of the constructed models are also included for reference; the grayscale for these cross-sectional depictions is adjusted so that darker areas correspond to regions of higher refractive index. As discussed in detail in Ref. 11, the signals for the spherical nuclear models are characterized by vertical background fringes [Figs. 1(a)–1(c)], whereas the signals for the ellipsoidal nuclear models are characterized by curved background fringes [Figs. 1(d)–1(f)]. In both cases, however, the signals become more irregular, with significant intensity variations along the azimuthal direction, when the correlation length decreases or when the extent of fluctuations increases.
2.2. CNN-Based Regression Analysis
A convolutional neural network (CNN) is a type of deep neural network primarily applied for image classification and regression tasks.
We followed standard CNN frameworks [24–26] and designed an architecture for the regression task at hand to be able to predict the correlation length and extent of nuclear refractive index fluctuations from two-dimensional optical scattering signals. We implemented our design in Google Colab using the TensorFlow library [27]. The details regarding our CNN architecture are provided in the block diagram in Fig. 2. As described in Section 2.1, input signals had dimensions of . We used five convolutional (Conv) layers with the ReLU activation function; these Conv layers, from the first to the fifth layers, had 8, 16, 32, 64, and 128 filters of size , respectively. After each Conv operation, we applied the MaxPooling operation with a pool size of . At the end of the fifth Conv layer, batch normalization was performed and the data were flattened so that they could be input to a Dense layer with 512 neurons and the ReLU activation function. We also had a dropout operation after the Dense layer with a rate of 0.5; dropout is a technique to prevent overfitting, and a rate of 0.5 is typically observed to be effective in improving the generalization performance for a wide range of networks and tasks [25,28]. The final layer had only one neuron with the linear activation function to give the prediction results.
We used the Adam optimizer [29], which is an algorithm for the first-order gradient-based optimization of stochastic objective functions. This algorithm relies on adaptive estimates of lower-order moments and combines the advantages of two other common optimizers, namely AdaGrad and RMSProp; it is one of the most popular methods for training as it is computationally efficient, has low memory requirements, and needs minimal hyperparameter tuning. We selected and tuned our hyperparameters as suggested in Ref.
29: the step size or learning rate was set to 0.0005; the exponential decay rates $\beta_1$ and $\beta_2$ for the first- and second-moment estimates were set to 0.9 and 0.999, respectively; and $\epsilon$, a small constant to prevent division by zero, was assigned a value of . A callback function, which saved the weights that gave the lowest validation loss value, was also employed. We note that our dataset, consisting of 198 signals, was randomly partitioned into training, validation, and test subsets with split ratios of 60%, 20%, and 20%, respectively. Five-fold cross-validation was used to assess the prediction performance of the CNN.
3. Results and Discussion
Figure 3 shows the prediction results for 40 signals in a single test set. The triangular markers represent the true values, and the circular markers represent the predicted values. For this set, we observe a close agreement between the true and predicted values for both the correlation length [Fig. 3(a)] and the extent of refractive index fluctuations [Fig. 3(b)]. Similar results are observed for the other test sets.
To illustrate and quantify the overall performance of our CNN-based regression analysis, we combine the results obtained for all test sets in Fig. 4, and we compute the mean absolute percent errors (MAPEs) for both parameters. The central marks in the box plots for the correlation length [Fig. 4(a)] and the extent of fluctuations [Fig. 4(b)] show the median predicted values, and the bottom and top edges of the boxes indicate the 25th and 75th percentiles, respectively. The whiskers extend to the most extreme values that are not considered outliers, and the outliers are plotted individually using plus markers. The circular markers in the plots correspond to the means of the predicted values, with error bars indicating 95% confidence intervals of the means. Note that the dotted diagonal lines in the background represent perfect agreement and are meant to guide the eye. The MAPEs computed for the correlation length and the extent of refractive index fluctuations are 8.5% and 13.5%, respectively.
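For reference, the error metric quoted above can be written out explicitly. The sketch below assumes the standard definition of the MAPE, since the text does not spell out the formula. It also computes, from the parameter values listed in Section 2.1, the smallest relative step between successive values, which gives a scale against which the reported errors can be read. The function names and the two-sample example are hypothetical.

```python
def mape(true_values, predicted_values):
    """Mean absolute percent error, in percent (standard definition)."""
    if len(true_values) != len(predicted_values):
        raise ValueError("inputs must have the same length")
    errors = [abs(p - t) / abs(t) for t, p in zip(true_values, predicted_values)]
    return 100.0 * sum(errors) / len(errors)


def min_percent_step(grid):
    """Smallest relative increment between successive grid values, in percent."""
    return min(100.0 * (b - a) / a for a, b in zip(grid, grid[1:]))


# Parameter values listed in Section 2.1.
corr_lengths = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
extents = [0.005, 0.010, 0.015, 0.020, 0.025, 0.030, 0.035]

print(round(min_percent_step(corr_lengths), 1))  # 11.1
print(round(min_percent_step(extents), 1))       # 16.7

# Hypothetical predictions, each off by 10% of the true value.
print(round(mape([0.5, 1.0], [0.55, 0.9]), 6))   # 10.0
```

Both reported MAPEs (8.5% and 13.5%) fall below the corresponding smallest relative steps (about 11.1% and 16.7%).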
Since our nuclear models were constructed such that the correlation length varied in steps of 0.1 μm between 0.4 and 1.0 μm, and the extent of fluctuations varied in steps of 0.005 between 0.005 and 0.035, these errors are smaller than the minimum percent increment between successive values for the respective parameters (about 11.1% and 16.7%) and can thus be considered to signify a good prediction performance over the range of interest. Further, the MAPEs that we obtained here are comparable to those previously reported in a similar context [10].
It is important to point out that an increase in the extent of fluctuations and a decrease in the correlation length give rise to similar changes in two-dimensional scattering signals [Figs. 1(a)–1(f)]. Hence, isolating the effects of these two parameters is quite challenging. Our results reveal that a CNN-based approach can be used to tackle this issue; close agreement between the true and predicted values suggests that both parameters can be independently quantified. This is advantageous because separate information on the length scale and extent of refractive index fluctuations can provide more specific details regarding the internal structure of the cell nuclei. It is also worth reiterating that our dataset included optical scattering signals from both spherical and ellipsoidal nuclear models. We can thus claim that refractive index fluctuations can be well characterized despite shape-dependent differences in signals, as observed between Figs. 1(a)–1(c) and Figs. 1(d)–1(f).
The results presented here, albeit obtained with a limited dataset, offer strong evidence that CNN-based regression can be a powerful approach to exploit the information content of two-dimensional optical scattering signals obtained from cell nuclei and to monitor chromatin organization in a quantitative manner. Analysis on an extended dataset will potentially lead to a broader perspective on how to refine the CNN architecture for improved performance. It is apparent from Figs.
3 and 4 that the most notable deviations between the true and CNN-predicted values are observed for large values of the correlation length or of the extent of fluctuations. We presume that the relatively small number of nuclear models constructed with large values of both parameters is the main underlying factor for this trend. As such, training with an extended dataset that includes more cases with large parameter values is likely to result in a better prediction performance.
From a practical perspective, experimental setups for the acquisition of two-dimensional optical scattering signals from cells have already been implemented and described in a series of studies [17–21,30,31]. The specific angular ranges for signal acquisition vary depending on the particular setup employed, but it is common to see a wide range of side scattering covered with an angular sampling interval of one degree or less. In fact, we previously reported that azimuth-dependent intensity variations in scattering signals over the angular range of are extremely sensitive to subnuclear refractive index fluctuations [11]. Hence, limiting the range of interest in a CNN-based regression analysis to side scattering angles may even aid in improving the prediction performance, leading to smaller MAPEs. In addition, this will have the added benefit of reducing the dynamic range of optical scattering signals to be measured to or . CNN algorithms can also be combined with optimization routines to determine the angular range and angular sampling interval that minimize MAPEs so that scanning and detection systems can be designed and fine-tuned accordingly. On a related matter, we also note that a full assessment of our analysis approach requires a detailed characterization of any influence of noise that will inevitably be present in real measurements.
To offer some preliminary insights into the potential influence of noise on the prediction performance, we added zero-mean white Gaussian noise [32,33] to the two-dimensional optical scattering signals in the test sets and applied our CNN algorithm trained with noise-free signals to noise-degraded signals. Here, the noise level was quantified in terms of the signal-to-noise ratio given in decibels by $\mathrm{SNR} = 10 \log_{10}(P_s / P_n)$, where $P_s$ represents the average signal power and $P_n$ represents the average noise power, which is equal to the Gaussian noise variance. For an SNR of 25 dB, roughly corresponding to the case in which the noise standard deviation is 5% of the root-mean-square signal value in line with Ref. 34, the MAPEs obtained for the correlation length and the extent of refractive index fluctuations are 12.3% and 14.1%, respectively. These results do not point to a significant decrease in prediction performance for the noise level specified. This can be regarded as initial evidence that our CNN-based regression algorithm trained with simulated data can possibly be applied to real measurements; that said, the corroboration of predicted values via high-resolution imaging techniques will be the ultimate benchmark. A comprehensive analysis of performance deterioration due to higher noise levels will provide guidelines for devising noise reduction strategies that should be explored as part of any study involving an inverse scheme for prediction of relevant parameters from measurements.
In our study, we assumed that the main contribution to cellular scattering comes from the nucleus. Hence, our nuclear models were constructed in a homogeneous cytoplasm with a fixed refractive index. We remark that this is a valid assumption for cells that are characterized by a very low volume fraction of organelles [8,11]. For cells with a high volume fraction of organelles, however, we cannot exclude contributions from the cytoplasm. In that case, we need to assess the influence of mitochondria or other subcellular structures on two-dimensional scattering signals.
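The noise-degradation step described earlier in this discussion can be sketched as follows. This is an illustrative reading of the text, not the authors' implementation: it assumes that the average signal power is the mean of the squared signal values, that the noise is added without any subsequent clipping to the 0 to 255 range, and that the SNR is expressed in decibels. The function name and the toy signal are hypothetical.

```python
import math
import random


def add_white_gaussian_noise(signal, snr_db, rng):
    """Return a noisy copy of a 2D signal (list of rows) and the noise std.

    Assumes SNR(dB) = 10*log10(P_signal / P_noise), with P_signal taken as
    the mean of the squared signal values and P_noise as the noise variance.
    """
    values = [v for row in signal for v in row]
    signal_power = sum(v * v for v in values) / len(values)
    noise_std = math.sqrt(signal_power / 10.0 ** (snr_db / 10.0))
    noisy = [[v + rng.gauss(0.0, noise_std) for v in row] for row in signal]
    return noisy, noise_std


# Ratio of noise standard deviation to RMS signal value at 25 dB, in percent.
print(round(100.0 * 10.0 ** (-25.0 / 20.0), 1))  # 5.6

# Hypothetical 4 x 4 "signal" with values on the 0-255 scale.
rng = random.Random(0)
toy_signal = [[float(16 * (r * 4 + c)) for c in range(4)] for r in range(4)]
noisy_signal, sigma = add_white_gaussian_noise(toy_signal, 25.0, rng)
```

Under these assumptions, an SNR of 25 dB corresponds to a noise standard deviation of about 5.6% of the root-mean-square signal value, consistent with the "roughly 5%" figure quoted in the text.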
There is also a need to determine whether surface roughness as discussed in Refs. 21 and 35 can be a compounding factor for the prediction of parameters related to the subnuclear refractive index profile. A systematic investigation based on numerical studies as presented in this work will potentially reveal whether a CNN-based analysis can distinctively pick out features linked to different sources of scattering. We intend to address these issues as part of our future research efforts.
4. Conclusions
In summary, the research described here highlights the potential of CNN-based regression on two-dimensional optical scattering signals obtained from cell nuclei to extract the length scale and extent of internal refractive index fluctuations. Even though our work focuses on the prediction of chromatin organization, which is strongly linked to precancer progression, a similar methodology can be used to monitor the internal refractive index profiles of other subcellular organelles or tissue constituents. This can facilitate scattering-based delineation of the progressive development of a wide spectrum of diseases.
Disclosures
The authors have no relevant financial interests in this letter and no potential conflicts of interest to disclose.
Code and Data Availability
The set of simulated optical signals used in this letter is available from the corresponding author upon reasonable request.
References
1. N. N. Boustany, S. A. Boppart and V. Backman, “Microscopic imaging and spectroscopy with scattered light,” Annu. Rev. Biomed. Eng. 12, 285–314 (2010). https://doi.org/10.1146/annurev-bioeng-061008-124811
2. Z. A. Steelman et al., “Light-scattering methods for tissue diagnosis,” Optica 6, 479–489 (2019). https://doi.org/10.1364/OPTICA.6.000479
3. J. E. Chandler et al., “High-speed spectral nanocytology for early cancer screening,” J. Biomed. Opt. 18, 117002 (2013). https://doi.org/10.1117/1.JBO.18.11.117002
4. V. J. Konda et al., “Nanoscale markers of esophageal field carcinogenesis: potential implications for esophageal cancer screening,” Endoscopy 45, 983–988 (2013). https://doi.org/10.1055/s-0033-1344617
5. A. J. Radosevich et al., “Buccal spectral markers for lung cancer risk stratification,” PLoS One 9, e110157 (2014). https://doi.org/10.1371/journal.pone.0110157
6. J. Yi et al., “Spatially resolved optical and ultrastructural properties of colorectal and pancreatic field carcinogenesis observed by inverse spectroscopic optical coherence tomography,” J. Biomed. Opt. 19, 036013 (2014). https://doi.org/10.1117/1.JBO.19.3.036013
7. S. Uttam et al., “Early prediction of cancer progression by depth-resolved nanoscale mapping of nuclear architecture from unstained tissue specimens,” Cancer Res. 75, 4718–4727 (2015). https://doi.org/10.1158/0008-5472.CAN-15-1274
8. J. Yi et al., “Fractal characterization of chromatin decompaction in live cells,” Biophys. J. 109, 2218–2226 (2015). https://doi.org/10.1016/j.bpj.2015.10.014
9. L. M. Almassalha et al., “Label-free imaging of the native, living cellular nanoarchitecture using partial-wave spectroscopic microscopy,” Proc. Natl. Acad. Sci. U.S.A. 113, E6372–E6381 (2016). https://doi.org/10.1073/pnas.1608198113
10. L. Cherkezyan et al., “Reconstruction of explicit structural properties at the nanoscale via spectroscopic microscopy,” J. Biomed. Opt. 21, 025007 (2016). https://doi.org/10.1117/1.JBO.21.2.025007
11. D. Arifler and M. Guillaud, “Assessment of internal refractive index profile of stochastically inhomogeneous nuclear models via analysis of two-dimensional optical scattering patterns,” J. Biomed. Opt. 26, 055001 (2021). https://doi.org/10.1117/1.JBO.26.5.055001
12. P. N. Thota et al., “Prediction of neoplastic progression in Barrett’s esophagus using nanoscale nuclear architecture mapping: a pilot study,” Gastrointest. Endosc. 95, 1239–1246 (2022). https://doi.org/10.1016/j.gie.2022.01.007
13. A. Daneshkhah et al., “Early detection of lung cancer using artificial intelligence-enhanced optical nanosensing of chromatin alterations in field carcinogenesis,” Sci. Rep. 13, 13702 (2023). https://doi.org/10.1038/s41598-023-40550-6
14. A. Rancu et al., “Multiscale optical phase fluctuations link disorder strength and fractal dimension of cell structure,” Biophys. J. 122, 1390–1399 (2023). https://doi.org/10.1016/j.bpj.2023.03.005
15. J. D. Rogers et al., “Modeling light scattering in tissue as continuous random media using a versatile refractive index correlation function,” IEEE J. Sel. Top. Quantum Electron. 20, 7000514 (2014). https://doi.org/10.1109/JSTQE.2013.2280999
16. M. Takabayashi et al., “Disorder strength measured by quantitative phase imaging as intrinsic cancer marker in fixed tissue biopsies,” PLoS One 13, e0194320 (2018). https://doi.org/10.1371/journal.pone.0194320
17. X. Su et al., “Pattern recognition cytometry for label-free cell classification by 2D light scattering measurements,” Opt. Express 23, 27558–27565 (2015). https://doi.org/10.1364/OE.23.027558
18. L. Xie et al., “Automatic classification of acute and chronic myeloid leukemic cells with wide-angle label-free static cytometry,” Opt. Express 25, 29365–29373 (2017). https://doi.org/10.1364/OE.25.029365
19. H. Wei et al., “Automatic classification of label-free cells from small cell lung cancer and poorly differentiated lung adenocarcinoma with 2D light scattering static cytometry and machine learning,” Cytometry A 95A, 302–308 (2019). https://doi.org/10.1002/cyto.a.23671
20. X. Su et al., “Two-dimensional light scattering anisotropy cytometry for label-free classification of ovarian cancer cells via machine learning,” Cytometry A 97A, 24–30 (2020). https://doi.org/10.1002/cyto.a.23865
21. W. Y. Wan et al., “Integration of light scattering with machine learning for label free cell detection,” Biomed. Opt. Express 12, 3512–3529 (2021). https://doi.org/10.1364/BOE.424357
22. H. Zhang et al., “Deep learning classification of cervical dysplasia using depth-resolved angular light scattering profiles,” Biomed. Opt. Express 12, 4997–5007 (2021). https://doi.org/10.1364/BOE.430467
23. G. Cioffi et al., “Unknown cell class distinction via neural network based scattering snapshot recognition,” Biomed. Opt. Express 14, 5060–5074 (2023). https://doi.org/10.1364/BOE.492028
24. Y. LeCun et al., “Gradient-based learning applied to document recognition,” Proc. IEEE 86, 2278–2324 (1998). https://doi.org/10.1109/5.726791
25. A. Krizhevsky, I. Sutskever and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” Commun. ACM 60, 84–90 (2017). https://doi.org/10.1145/3065386
26. M. Fernández-Delgado et al., “An extensive experimental survey of regression methods,” Neural Netw. 111, 11–34 (2019). https://doi.org/10.1016/j.neunet.2018.12.010
28. N. Srivastava et al., “Dropout: a simple way to prevent neural networks from overfitting,” J. Mach. Learn. Res. 15, 1929–1958 (2014).
29. D. P. Kingma and J. Ba, “Adam: a method for stochastic optimization,” in 3rd Int. Conf. for Learn. Represent. (ICLR) (2015). https://doi.org/10.48550/arXiv.1412.6980
30. M. Giacomelli et al., “Size and shape determination of spheroidal scatterers using two-dimensional angle resolved scattering,” Opt. Express 18, 14616–14626 (2010). https://doi.org/10.1364/OE.18.014616
31. H. Shahin et al., “Physical characterization of hematopoietic stem cells using multidirectional label-free light scatterings,” Opt. Express 24, 28877–28888 (2016). https://doi.org/10.1364/OE.24.028877
32. H. Zhang et al., “Angular range, sampling, and noise considerations for inverse light scattering analysis of nuclear morphology,” J. Biophotonics 12, e201800258 (2019). https://doi.org/10.1002/jbio.201800258
33. K. J. Dunn and A. J. Berger, “Three-dimensional angular scattering simulations inform analysis of scattering from single cells,” J. Biomed. Opt. 28, 086501 (2023). https://doi.org/10.1117/1.JBO.28.8.086501
34. L. Cherkezyan, H. Subramanian and V. Backman, “What structural length scales can be detected by the spectral variance of a microscope image?,” Opt. Lett. 39, 4290–4293 (2014). https://doi.org/10.1364/OL.39.004290
35. D. Zhang et al., “Spectroscopic microscopy can quantify the statistics of subdiffractional refractive-index fluctuations in media with random rough surfaces,” Opt. Lett. 40, 4931–4934 (2015). https://doi.org/10.1364/OL.40.004931