This paper presents systematic lossless data compression studies conducted at the Cooperative Institute for Meteorological
Satellite Studies (CIMSS), University of Wisconsin-Madison in support of the real-time rebroadcast of NOAA's future
GOES ultraspectral sounders. Ultraspectral sounders provide data with high spectral and spatial resolutions. Since an
ultraspectral sounder could be either a grating spectrometer or a Michelson interferometer, we have
investigated/developed various 2D and 3D lossless compression techniques for both grating and interferometer data.
Lossless compression results are obtained and compared for wavelet/multiwavelet transform-based (e.g.
JPEG2000, 3D SPIHT, MWT), prediction-based (e.g. JPEG-LS, CALIC), projection-based (e.g. Lossless PCA,
Optimized Orthogonal Matching Pursuit-based Linear Prediction, PLT), and clustering-based (e.g. PPVQ, FPVQ,
AVQLP) methods. Robust data preprocessing schemes (e.g. BAR, MST reordering) are also demonstrated to improve
compression gains of existing state-of-the-art compression methods such as JPEG2000, 3D SPIHT, JPEG-LS, and
CALIC for high-spectral-resolution data compression. Our studies show that high lossless compression gains are
achievable for both grating and interferometer data.
JPEG-LS is the new ISO/ITU standard for lossless and near-lossless compression of 2D continuous-tone images. For contemporary and future ultraspectral sounder data that features good correlations in disjoint spectral channels, we develop an MST-embedded JPEG-LS (Minimum Spanning Tree embedded JPEG-LS) for achieving higher compression gains through MST channel reordering. Unlike previous non-embedded MST work with other cost functions used only for data preprocessing, the MST-embedded JPEG-LS uniquely uses the sum of absolute median prediction errors as the cost function for MST to determine each optimal pair of predicting and predicted channels. The MST can be embedded within JPEG-LS because of the same median prediction used in JPEG-LS. The advantage of this embedding is that the median prediction residuals are available to JPEG-LS after MST channel reordering without recalculation. Numerical experiments show that the MST-embedded JPEG-LS yields an average compression ratio of 2.81, superior to 2.46 obtained with JPEG-LS for the 10 standard ultraspectral granules.
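As a rough illustration of the cost function named in this abstract, the sketch below computes the sum of absolute median prediction errors for a single 2D channel, using the median edge detector (MED) predictor of standard JPEG-LS. How the paper pairs a predicting channel with a predicted channel is not stated in the abstract and is not modeled here; the function names and the simplified border handling (first row and first column are skipped) are assumptions made for this example.

```python
def med_predict(a, b, c):
    """JPEG-LS median (MED) prediction from left (a), above (b), upper-left (c)."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c


def sum_abs_med_error(img):
    """Sum of |pixel - MED prediction| over the interior of a 2D list of ints.

    Assumption: border pixels are skipped rather than padded, which differs
    from a full JPEG-LS implementation.
    """
    total = 0
    for r in range(1, len(img)):
        for col in range(1, len(img[r])):
            pred = med_predict(img[r][col - 1], img[r - 1][col], img[r - 1][col - 1])
            total += abs(img[r][col] - pred)
    return total


# Hypothetical 3x3 channel with a diagonal gradient.
gradient = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
cost = sum_abs_med_error(gradient)
```

A cost of this kind could serve as an edge weight when building a spanning tree over channels, since the same residuals are what the JPEG-LS coder would go on to encode.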
Contemporary and future ultraspectral sounders represent a significant technical advancement for environmental and meteorological prediction and monitoring. Given their large volume of spectral observations, the use of robust data compression techniques will be beneficial to data transmission and storage. In this paper, we propose a novel Adaptive Vector Quantization (VQ)-based Linear Prediction (AVQLP) method for ultraspectral data compression. The method is compared with several state-of-the-art methods such as CALIC, JPEG-LS and JPEG2000. The compression experiments show that our AVQLP method is the first to surpass the 4 to 1 lossless compression barrier for a selected set of AIRS ultraspectral sounder test data.
Most source coding techniques generate a bitstream in which different regions have unequal influence on data reconstruction. An uncorrected error in a more influential region can cause more error propagation in the reconstructed data. Given a limited bandwidth, unequal error protection (UEP) via channel coding with different code rates for different regions of the bitstream may yield much less error contamination than equal error protection (EEP). We propose an optimal UEP scheme that minimizes error contamination after channel and source decoding. We use JPEG2000 for source coding and turbo product code (TPC) for channel coding as an example to demonstrate this technique with ultraspectral sounder data. Wavelet compression yields unequal significance in different wavelet resolutions. In the proposed UEP scheme, the statistics of erroneous pixels after TPC and JPEG2000 decoding are used to determine the optimal channel code rates for each wavelet resolution. The proposed UEP scheme significantly reduces the number of pixel errors when compared to its EEP counterpart. In practice, with a predefined set of implementation parameters (available channel codes, desired code rate, noise level, etc.), the optimal code rate allocation for UEP needs to be determined only once and can be done offline.
The Karhunen-Loeve transform (KLT) is the optimal unitary transform that yields the maximum coding gain. The prediction-based lower triangular transform (PLT) features the same decorrelation and coding gain properties as the KLT, but with lower complexity. Unlike the KLT, the PLT has the perfect reconstruction property, which allows its use for lossless compression. In this paper we apply the PLT to lossless compression of ultraspectral sounder data. The experiment on the standard ultraspectral test dataset of 10 AIRS digital count granules shows that PLT compression outperforms JPEG2000, SPIHT, JPEG-LS, and CCSDS IDC 5/3.
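The perfect reconstruction property can be illustrated with a small sketch. It assumes the usual way a lower triangular predictive transform is made integer-reversible: each sample is predicted from the already processed samples, the prediction is rounded, and only the integer residual is kept. The coefficient matrix `P` and the sample values are hypothetical, and the abstract does not describe the paper's actual implementation.

```python
import math


def plt_forward(x, P):
    """Integer residuals e[k] = x[k] - round(sum_j P[k][j] * x[j]) for j < k."""
    e = []
    for k in range(len(x)):
        pred = sum(P[k][j] * x[j] for j in range(k))
        e.append(x[k] - math.floor(pred + 0.5))
    return e


def plt_inverse(e, P):
    """Rebuild x sample by sample; the same rounded prediction is recomputed."""
    x = []
    for k in range(len(e)):
        pred = sum(P[k][j] * x[j] for j in range(k))
        x.append(e[k] + math.floor(pred + 0.5))
    return x


# Hypothetical strictly lower triangular prediction coefficients.
P = [[], [0.9], [0.3, 0.6]]
x = [100, 92, 87]
e = plt_forward(x, P)
restored = plt_inverse(e, P)
```

Because the inverse recomputes exactly the same rounded prediction from samples it has already restored, the rounding never causes loss. A unitary transform such as the KLT generally produces non-integer outputs, so rounding them would not be exactly invertible.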
A previous study shows that the 3D Wavelet Transform with Reversible Variable-Length Coding (3DWT-RVLC) has much better error resilience than JPEG2000 Part 2 for 1-bit random errors remaining after channel decoding. Errors in satellite channels might have burst characteristics. Low-density parity-check (LDPC) codes are known to have excellent error correction capability, with performance near the Shannon limit. In this study, we investigate the burst error correction performance of LDPC codes via the new Digital Video Broadcasting - Second Generation (DVB-S2) standard for ultraspectral sounder data compressed by 3DWT-RVLC. We also study the error contamination after 3DWT-RVLC source decoding. Statistics show that 3DWT-RVLC produces significantly fewer erroneous pixels than JPEG2000 Part 2 for ultraspectral sounder data compression.
The ultraspectral sounder data features strong correlations in disjoint spectral regions due to the same type of absorbing gases. This paper compares the compression performance of two robust data preprocessing schemes, namely Bias-Adjusted Reordering (BAR) and Minimum Spanning Tree (MST) reordering, in the context of entropy coding. Both schemes can take advantage of the strong correlations for achieving higher compression gains. The compression methods consist of the BAR or MST preprocessing schemes followed by linear prediction with context-free or context-based arithmetic coding (AC). Compression experiments on the NASA AIRS ultraspectral sounder data set show that MST without bias adjustment produces lower compression ratios than BAR and bias-adjusted MST for both context-free and context-based AC. Bias-adjusted MST outperforms BAR for context-free arithmetic coding, whereas BAR outperforms MST for context-based arithmetic coding. BAR with context-based AC yields the highest average compression ratios in comparison to MST with context-free or context-based AC.
A method combining the empirical mode decomposition (EMD) and the principal component analysis (PCA) was recently proposed for lossless compression of ultraspectral sounder data. In that method, data residual is obtained via the linear regression of the data against m intrinsic mode functions (IMFs) which are obtained from the EMD of the data mean, followed by the linear regression of the IMF regression error against a truncated number, n, of their corresponding principal components (PCs). In this paper we show that this two-stage (m IMFs + n PCs) linear transform approach is not as good as its counterpart two-stage (m PCs + n PCs) linear transform approach in terms of data residual and compression ratio of ultraspectral data, given the same number of IMFs and PCs used respectively at the first stage, followed by the same number of PCs used at the second stage. Mathematically, the two-stage (m PCs + n PCs) linear transform approach is equivalent to a single linear transform with (m + n) PCs. In other words, the simple PCA compression method outperforms this combined EMD and PCA compression method. This is expected because the PCA (also called the Karhunen-Loève transform or the Hotelling transform) is known to be the optimal linear transform in the sense of minimizing the mean squared error.
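The equivalence claimed in this abstract can be sketched in a few lines, assuming the second-stage PCs are computed from the first-stage residual and that the singular values involved are distinct. The symbols $X$, $\sigma_i$, $u_i$, $v_i$ and $R_m$ are notation introduced for this sketch and do not come from the abstract.

```latex
% Mean-removed data matrix X with singular value decomposition
X = \sum_{i} \sigma_i\, u_i v_i^{T}, \qquad \sigma_1 > \sigma_2 > \cdots \ge 0 .

% Residual after regressing on the first m PCs (first stage)
R_m = X - \sum_{i \le m} \sigma_i\, u_i v_i^{T} = \sum_{i > m} \sigma_i\, u_i v_i^{T} .

% This sum is itself an SVD of R_m, so the n leading PCs of R_m are
% v_{m+1}, \dots, v_{m+n}. Removing them (second stage) leaves
R_m - \sum_{i = m+1}^{m+n} \sigma_i\, u_i v_i^{T}
  = \sum_{i > m+n} \sigma_i\, u_i v_i^{T} = R_{m+n} ,

% which is the residual of a single PCA with (m + n) PCs.
```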
This paper presents the current status of lossless compression of ultraspectral sounder data. The lossless compression results from the transform-based (e.g. JPEG2000, 3D SPIHT, and Lossless PCA), prediction-based (e.g. JPEG-LS, CALIC, and linear prediction using OOMP), and clustering-based (e.g. PVQ, DPVQ, PPVQ and FPVQ) methods are presented. The ultraspectral sounder data features strong correlations in disjoint spectral regions affected by the same type of absorbing gases. A robust data preprocessing scheme (e.g. BAR) is also demonstrated to improve compression gains of existing state-of-the-art compression methods such as JPEG2000, 3D SPIHT, JPEG-LS, and CALIC.
Research has been undertaken to examine the robustness of JPEG2000 when corrupted by transmission bit errors in a satellite data stream. Contemporary and future ultraspectral sounders such as Atmospheric Infrared Sounder (AIRS), Cross-track Infrared Sounder (CrIS), Infrared Atmospheric Sounding Interferometer (IASI), Geosynchronous Imaging Fourier Transform Spectrometer (GIFTS), and Hyperspectral Environmental Suite (HES) generate a large volume of three-dimensional data. Hence, compression of ultraspectral sounder data will facilitate data transmission and archiving. There is a need for lossless or near-lossless compression of ultraspectral sounder data to avoid potential retrieval degradation of geophysical parameters due to lossy compression. This paper investigates the simulated error propagation in AIRS ultraspectral sounder data with advanced source and channel coding in a satellite data stream. The source coding is done via JPEG2000, the latest International Organization for Standardization (ISO)/International Telecommunication Union (ITU) standard for image compression. After JPEG2000 compression the AIRS ultraspectral sounder data is then error correction encoded using a rate 0.954 turbo product code (TPC) for channel error control. Experimental results of error patterns on both channel and source decoding are presented. The error propagation effects are curbed via the block-based protection mechanism in the JPEG2000 codec as well as memory characteristics of the forward error correction (FEC) scheme to contain decoding errors within received blocks. A single nonheader bit error in a source code block tends to contaminate the bits until the end of the source code block before the inverse discrete wavelet transform (IDWT), and those erroneous bits propagate even further after the IDWT. Furthermore, a single header bit error may result in the corruption of almost the entire decompressed granule. 
JPEG2000 appears vulnerable to bit errors in a noisy satellite transmission channel, and thus has difficulty preserving the quality of ultraspectral sounder data. A channel-decoded bit error rate (BER) of 10^-11 or better may be necessary for a granule error rate of 0.00116 in a compressed ultraspectral sounder data stream that is transmitted in a satellite channel. This work at The Aerospace Corporation and the University of Wisconsin, CIMSS, was under separate contracting from and performed for the National Oceanic and Atmospheric Administration (NOAA) National Environmental Satellite, Data, and Information Service (NESDIS), a component of the U.S. Department of Commerce.
Nonreversible variable-length codes (e.g. Huffman coding, Golomb-Rice coding, and arithmetic coding) have been used in source coding to achieve efficient compression. However, a single bit error during noisy transmission can cause many codewords to be misinterpreted by the decoder. In recent years, increasing attention has been given to the design of reversible variable-length codes (RVLCs) for better data transmission in error-prone environments. RVLCs allow instantaneous decoding in both directions, which affords better detection of bit errors due to synchronization losses over a noisy channel. RVLCs have been adopted in emerging video coding standards (H.263+ and MPEG-4) to enhance their error-resilience capabilities. Given the large volume of three-dimensional data that will be generated by future space-borne ultraspectral sounders (e.g. IASI, CrIS, and HES), the use of error-robust data compression techniques will be beneficial to satellite data transmission. In this paper, we investigate a reversible variable-length code for ultraspectral sounder data compression, and present numerical experiments on its error propagation for ultraspectral sounder data. The results show that the RVLC achieves significantly better error containment than JPEG2000 Part 2.
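To make decoding in both directions concrete, here is a minimal sketch using a small symmetric RVLC whose codewords are palindromes and are both prefix-free and suffix-free. The four-symbol codebook is hypothetical, chosen only for illustration, and is not the code studied in the paper.

```python
# Hypothetical symmetric RVLC: every codeword reads the same in both directions,
# and no codeword is a prefix or a suffix of another.
CODE = {"a": "0", "b": "11", "c": "101", "d": "1001"}
DECODE = {bits: sym for sym, bits in CODE.items()}


def encode(symbols):
    return "".join(CODE[s] for s in symbols)


def decode_forward(bits):
    """Instantaneous left-to-right decoding (relies on the prefix-free property)."""
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in DECODE:
            out.append(DECODE[buf])
            buf = ""
    if buf:
        raise ValueError("trailing bits do not form a codeword")
    return out


def decode_backward(bits):
    """Right-to-left decoding; palindromic codewords let us reuse the same table."""
    return decode_forward(bits[::-1])[::-1]


message = list("abcd")
stream = encode(message)
```

If a bit error corrupts the middle of a stream, a decoder of this kind can recover symbols from both ends and confine the damage to the region where the two passes disagree, whereas a one-directional code can only decode forward from its last synchronization point.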
Improvements in weather and climate observation, analysis, and prediction will be achieved through advances in contemporary and future ultraspectral infrared sounders such as the Atmospheric Infrared Sounder (AIRS), Tropospheric Emission Spectrometer (TES), Geosynchronous Imaging Fourier Transform Spectrometer (GIFTS), and Hyperspectral Environmental Suite (HES). Given the unprecedented volume of 3D data they will generate each day, the use of robust data compression techniques will be beneficial to data transfer and archiving. Lossless or near-lossless compression of this ultraspectral sounder data is desired to avoid potentially significant degradation of the geophysical parameter retrieval in an associated ill-posed inverse problem. In this paper we investigate various 2D and 3D compression techniques applicable to ultraspectral sounder data. These techniques include transform-based (JPEG2000, 3D-SPIHT), prediction-based (JPEG-LS, CALIC), and clustering-based (PVQ, DPVQ, PPVQ) compression methods. Data preprocessing schemes for compression gains are also illustrated.
The unprecedented size of ultraspectral sounder data makes its compression a challenging task. Ultraspectral sounder data features strong correlations in disjoint spectral regions affected by the same type of absorbing gases. Previously, we proposed a reordering scheme to better exploit these correlations in the ultraspectral sounder data. With this preprocessing scheme, state-of-the-art compression algorithms such as CALIC, JPEG-LS and JPEG2000 significantly improve compression ratios, by up to 15% on average. In this paper, we investigate the effects of different starting channels for spectral reordering on the lossless compression of 3D ultraspectral sounder data obtained from Atmospheric Infrared Sounder (AIRS) observations. It is shown that the compression ratios and reordering indices are dependent on the choice of the starting channel for reordering.
The compression of hyperspectral sounder data is beneficial for more efficient archiving and transfer given its large 3-D volume. Moreover, since physical retrieval of geophysical parameters from hyperspectral sounder data is a mathematically ill-posed problem that is sensitive to errors in the data, lossless or near-lossless compression is desired. This paper provides an update on applications of state-of-the-art 2D and 3D lossless compression algorithms such as 3D EZW, 3D SPIHT, 2D JPEG2000, 2D JPEG-LS and 2D CALIC for hyperspectral sounder data. In addition, in order to better exploit the correlations between the remote spectral regions affected by the same type of atmospheric absorbing constituents or clouds, the Bias-Adjusted Reordering (BAR) scheme is presented, which reorders the data such that the bias-adjusted distance between any two neighboring vectors is minimized. This scheme coupled with any of the state-of-the-art compression algorithms produces significant compression gains.
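One possible reading of the reordering idea is sketched below. It assumes that the bias-adjusted distance between two vectors is the squared distance left after removing the best constant offset, and that the ordering is built greedily from a chosen starting vector. Both assumptions, along with the function names and sample vectors, are hypothetical; the abstract does not spell out the actual procedure.

```python
def bias_adjusted_distance(a, b):
    """Squared distance between a and b after removing the best scalar bias.

    The bias that minimizes sum((a - b - bias)^2) is the mean of (a - b).
    """
    diff = [p - q for p, q in zip(a, b)]
    bias = sum(diff) / len(diff)
    return sum((v - bias) ** 2 for v in diff)


def greedy_reorder(vectors, start=0):
    """Greedy chain: repeatedly append the unused vector closest to the last one."""
    order = [start]
    remaining = set(range(len(vectors))) - {start}
    while remaining:
        last = vectors[order[-1]]
        # Ties are broken by the lower index so the result is deterministic.
        nxt = min(remaining, key=lambda j: (bias_adjusted_distance(last, vectors[j]), j))
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Hypothetical channels: 0 and 2 differ by a constant offset, as do 1 and 3.
channels = [[0, 1, 2], [10, 0, 5], [100, 101, 102], [11, 1, 6]]
order = greedy_reorder(channels, start=0)
```

In this toy input, channels that differ only by a constant offset end up adjacent even though their raw values are far apart. The `start` argument also makes it clear why the resulting order can depend on the chosen starting vector.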
The compression of three-dimensional hyperspectral sounder data is a challenging task given its unprecedented size and nature. Vector quantization (VQ) is explored for the compression of this hyperspectral sounder data. The high-dimensional vectors are partitioned into subvectors to reduce codebook search and storage complexity in coding of the data. The partitions are made by use of statistical properties of the sounder data in the spectral dimension. Moreover, the data is first decorrelated to make it better suited for vector quantization. Due to the data characteristics, the iterative codebook generation procedure converges much faster and also leads to a better reconstruction of the sounder data. For lossless compression of the hyperspectral sounder data, the residual error and the quantization indices are entropy coded. The independent vector quantizers for different partitions make this scheme practical for compression of the large-volume 3D hyperspectral sounder data.
Hyperspectral sounder data is used for retrieval of atmospheric temperature, moisture and trace gas profiles, surface temperature and emissivity, and cloud and aerosol optical properties. This large volume of data is 3-D in nature with many scan lines containing cross-track footprints, each with thousands of IR channels. Unlike hyperspectral imager data compression, hyperspectral sounder data compression is desired to be lossless or near-lossless to avoid substantial degradation of the geophysical retrieval. For this new class of data for compression studies, a lossless compression algorithm combining the context-based adaptive lossless image codec (CALIC) and a novel bias-adjusted reordering (BAR) scheme is presented. The 3-D data are arranged into two dimensions with the original 2-D spatial domain converted into one dimension using a continuous scan order. In the BAR scheme, the data are reordered such that the bias-adjusted distance between any two neighboring vectors is minimized. The result is then encoded using the CALIC algorithm with significant compression gains over using the CALIC algorithm alone.
The next-generation NOAA/NESDIS GOES-R hyperspectral sounder, now referred to as the HES (Hyperspectral Environmental Suite), will have hyperspectral resolution (over one thousand channels with spectral widths on the order of 0.5 wavenumber) and high spatial resolution (less than 10 km). Hyperspectral sounder data is a particular class of data requiring high accuracy for useful retrieval of atmospheric temperature and moisture profiles, surface characteristics, cloud properties, and trace gas information. Hence compression of these data sets should preferably be lossless or near-lossless. Given the large volume of three-dimensional hyperspectral sounder data that will be generated by the HES instrument, the use of robust data compression techniques will be beneficial to data transfer and archiving. In this paper, we study lossless data compression for the HES using 3D integer wavelet transforms via lifting schemes. The wavelet coefficients are processed with the 3D set partitioning in hierarchical trees (SPIHT) scheme followed by context-based arithmetic coding. SPIHT provides better coding efficiency than Shapiro's original embedded zerotree wavelet (EZW) algorithm. We extend the 3D SPIHT scheme to handle 3D satellite data of any size, whose dimensions need not be divisible by 2^N, where N is the number of levels of wavelet decomposition performed. The compression ratios of various kinds of wavelet transforms are presented along with a comparison with the JPEG2000 codec.
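The reversibility of an integer wavelet transform built by lifting, including for lengths that are not a multiple of 2^N, can be seen in a 1D sketch. The example uses the 5/3 integer filter with mirrored boundaries purely as an illustration. The abstract does not say which wavelets were tested, and the function names are assumptions made for this example.

```python
def fwd53(x):
    """One level of the reversible 5/3 integer lifting transform (1D, any length)."""
    s = list(x[0::2])  # even samples become the approximation band
    d = list(x[1::2])  # odd samples become the detail band
    # Predict: subtract the floor-average of the two neighboring even samples.
    for i in range(len(d)):
        right = s[i + 1] if i + 1 < len(s) else s[i]  # mirror at the right edge
        d[i] -= (s[i] + right) >> 1
    # Update: add a rounded quarter of the two neighboring detail samples.
    for i in range(len(s)):
        left = d[i - 1] if i > 0 else (d[0] if d else 0)  # mirror at the left edge
        right = d[i] if i < len(d) else left              # mirror at the right edge
        s[i] += (left + right + 2) >> 2
    return s, d


def inv53(s, d):
    """Exact inverse: undo the update step, then the predict step."""
    s, d = list(s), list(d)
    for i in range(len(s)):
        left = d[i - 1] if i > 0 else (d[0] if d else 0)
        right = d[i] if i < len(d) else left
        s[i] -= (left + right + 2) >> 2
    for i in range(len(d)):
        right = s[i + 1] if i + 1 < len(s) else s[i]
        d[i] += (s[i] + right) >> 1
    x = [0] * (len(s) + len(d))
    x[0::2] = s
    x[1::2] = d
    return x


def fwd53_levels(x, levels):
    """Apply the transform repeatedly to the approximation band."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = fwd53(approx)
        details.append(d)
    return approx, details


def inv53_levels(approx, details):
    for d in reversed(details):
        approx = inv53(approx, d)
    return approx


signal = [3, 7, 1, 8, 2, 9, 4]  # length 7 is not divisible by 2**3
approx, details = fwd53_levels(signal, levels=3)
restored = inv53_levels(approx, details)
```

Each lifting step changes one band using only values from the other band, so the inverse can subtract exactly what the forward step added, whatever the rounding or the signal length. A 3D transform of the kind described would apply such 1D steps along each axis in turn.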
Hyperspectral sounder data is a particular class of data that requires high accuracy for useful retrieval of atmospheric temperature and moisture profiles, surface characteristics, cloud properties, and trace gas information. Therefore compression of these data sets should preferably be lossless or near-lossless. The next-generation NOAA/NESDIS GOES-R hyperspectral sounder, now referred to as the HES (Hyperspectral Environmental Suite), will have hyperspectral resolution (over one thousand channels with spectral widths on the order of 0.5 wavenumber) and high spatial resolution (less than 10 km). Given the large volume of three-dimensional hyperspectral sounder data that will be generated by the HES instrument, the use of robust data compression techniques will be beneficial to data transfer and archiving. In this paper, we study lossless data compression for the HES using 3D integer wavelet transforms via lifting schemes. The wavelet coefficients are then processed with the 3D embedded zerotree wavelet (EZW) algorithm followed by context-based arithmetic coding. We extend the 3D EZW scheme to handle 3D satellite data of any size, whose dimensions need not be divisible by 2^N, where N is the number of levels of wavelet decomposition performed. The compression ratios of various kinds of wavelet transforms are presented along with a comparison with the JPEG2000 codec.