2 November 2011 Is the CCSDS Rice coding suitable for GPU massively parallel implementation?
Proceedings Volume 8183, High-Performance Computing in Remote Sensing; 818308 (2011) https://doi.org/10.1117/12.896893
Event: SPIE Remote Sensing, 2011, Prague, Czech Republic
Abstract
The Consultative Committee for Space Data Systems (CCSDS) Rice coding is a recommendation for lossless compression of satellite data. It has also been integrated into the HDF (Hierarchical Data Format) software for lossless compression of scientific data, and has been proposed for lossless compression of medical images. The CCSDS Rice coding is an approximate adaptive entropy coder. It uses a subset of the family of Golomb codes to produce a simpler, suboptimal prefix code. The default preprocessor is a unit-delay predictor with positive mapping. The adaptive entropy coder concurrently applies a set of variable-length codes to a block of consecutive preprocessed samples. The code option that yields the shortest codeword sequence for the current block of samples is then selected for transmission. A unique identifier bit sequence is attached to the code block to tell the decoder which decoding option to use. In this paper we explore the parallel efficiency of the CCSDS Rice code running on Graphics Processing Units (GPUs) with Compute Unified Device Architecture (CUDA). The GPU-based CCSDS Rice encoder processes several codeword blocks in a massively parallel fashion on different GPU multiprocessors. We parallelized the CCSDS Rice coding by using a sum reduction for code option selection, a prefix sum for intra-block and inter-block bit stream concatenation, and asynchronous data transfer. For NASA AVIRIS hyperspectral data, the speedup is nearly 6× compared with the single-threaded CPU counterpart. The CCSDS Rice coding contains many flow-control instructions, which significantly reduce instruction throughput by causing threads of the same CUDA warp to diverge. Consequently, the different execution paths must be serialized, increasing the total number of instructions executed within the same warp. We conclude that this branching and divergence issue is the bottleneck of the Rice coding, and that it leads to a smaller speedup than that of other entropy coders on GPUs.
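The pipeline outlined in the abstract (unit-delay prediction, positive mapping, per-block code option selection, and offset computation for bit stream concatenation) can be illustrated with a small sequential sketch. This is not the authors' CUDA implementation. The function names, the choice to predict the first sample as 0, the 4-sample demonstration block, and the restriction to split-sample options with parameter k (omitting the identifier bits and the other CCSDS options) are all assumptions made for illustration. On a GPU, the per-block length sum would correspond to a sum reduction and the offset computation to an exclusive prefix sum.

```python
from itertools import accumulate


def map_residuals(samples, n_bits=8):
    """Unit-delay prediction, then map signed residuals to non-negative integers."""
    x_max = (1 << n_bits) - 1
    mapped = []
    prev = 0  # assumption: the first sample is predicted as 0
    for x in samples:
        delta = x - prev
        # theta is the distance from the prediction to the nearer range limit
        theta = min(prev, x_max - prev)
        if 0 <= delta <= theta:
            mapped.append(2 * delta)
        elif -theta <= delta < 0:
            mapped.append(2 * -delta - 1)
        else:
            mapped.append(theta + abs(delta))
        prev = x
    return mapped


def split_sample_length(block, k):
    """Bits needed when each sample is coded as a unary quotient, a stop bit, and k low bits."""
    # This sum is what a GPU version would compute with a sum reduction.
    return sum((d >> k) + 1 + k for d in block)


def best_option(block, max_k=5):
    """Return (k, length) for the split-sample option with the shortest output."""
    lengths = [split_sample_length(block, k) for k in range(max_k + 1)]
    k = min(range(len(lengths)), key=lengths.__getitem__)
    return k, lengths[k]


def block_offsets(mapped, block_size=4, max_k=5):
    """Choose an option per block and compute each block's starting bit offset."""
    blocks = [mapped[i:i + block_size] for i in range(0, len(mapped), block_size)]
    choices = [best_option(b, max_k) for b in blocks]
    lengths = [n for _, n in choices]
    # Exclusive prefix sum: block i starts where blocks 0..i-1 end.
    offsets = [0] + list(accumulate(lengths))[:-1]
    return choices, offsets


if __name__ == "__main__":
    mapped = map_residuals([10, 12, 11, 11, 200, 180, 190, 185])
    choices, offsets = block_offsets(mapped)
    print(mapped)   # mapped residuals
    print(choices)  # (k, bit length) per block
    print(offsets)  # starting bit position of each block
```

Because every block's length and every candidate k can be evaluated independently, this part of the encoder maps naturally onto parallel threads; the data-dependent branches in the mapping step are the kind of flow control that can cause warp divergence.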
© (2011) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Xianyun Wu, Yunsong Li, Chengke Wu, and Bormin Huang "Is the CCSDS Rice coding suitable for GPU massively parallel implementation?", Proc. SPIE 8183, High-Performance Computing in Remote Sensing, 818308 (2 November 2011); https://doi.org/10.1117/12.896893
KEYWORDS
Computer programming
Data compression
Image compression
Medical imaging
Satellites
Computer architecture
Parallel computing