Paper
16 January 2025
Performance optimization and comparative analysis of distributed big data processing framework
Xinyan Xie
Proceedings Volume 13447, International Conference on Mechatronics and Intelligent Control (ICMIC 2024); 134470B (2025) https://doi.org/10.1117/12.3052245
Event: International Conference on Mechatronics and Intelligent Control (ICMIC 2024), 2024, Wuhan, China
Abstract
Research on and applications of big data processing and analysis are fairly mature, but a growing number of fields now demand real-time analysis of, and rapid response to, fast-moving, massive distributed data, so the performance of distributed big data frameworks needs to be optimized. To this end, this article studies the platform architecture and data computing models of three representative technologies: Hadoop (batch processing), Spark (in-memory computing), and Storm (stream computing), and summarizes their similarities and differences. Then, building on a study of big data platforms and traditional cardinality estimation algorithms, it proposes an application model of the HyperLogLog algorithm on a streaming platform, in which cardinality is computed inside the Storm processing engine to optimize the performance of the distributed big data processing framework. The results show that HyperLogLog cardinality estimation on the Storm stream computing platform can be used to optimize the performance of the distributed big data processing framework.
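For readers unfamiliar with HyperLogLog, the following is a minimal Python sketch of the general technique, not the paper's implementation and not Storm code. It assumes a 64-bit hash taken from a SHA-1 digest and the standard bias-correction constants; the class name `HyperLogLog`, the parameter `p`, and the `merge` method are illustrative choices that do not come from the paper.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog cardinality estimator (hypothetical helper)."""

    def __init__(self, p: int = 12) -> None:
        if not 4 <= p <= 16:
            raise ValueError("p must be between 4 and 16")
        self.p = p
        self.m = 1 << p                  # number of registers, m = 2^p
        self.registers = [0] * self.m
        # Standard bias-correction constants for the raw estimate.
        if self.m == 16:
            self.alpha = 0.673
        elif self.m == 32:
            self.alpha = 0.697
        elif self.m == 64:
            self.alpha = 0.709
        else:
            self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item: str) -> None:
        # Assumption: the first 8 bytes of SHA-1 serve as a 64-bit hash.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        idx = h >> width                 # top p bits select a register
        rest = h & ((1 << width) - 1)    # remaining bits feed the rank
        # Rank = 1-based position of the leftmost 1-bit in `rest`;
        # an all-zero remainder gets the maximum rank, width + 1.
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other: "HyperLogLog") -> None:
        # Sketches built with the same p combine by register-wise maximum.
        if other.p != self.p:
            raise ValueError("cannot merge sketches with different p")
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def count(self) -> float:
        # Raw estimate from the harmonic mean of 2^register values.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        # Small-range correction: fall back to linear counting.
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            estimate = self.m * math.log(self.m / zeros)
        return estimate
```

With `p = 12` the sketch keeps 4096 small registers regardless of how many items are added, and the expected relative error is roughly 1.04 / sqrt(4096), about 1.6%. Because `merge` is a register-wise maximum, sketches maintained independently by parallel workers (for example, one per stream-processing task) can in principle be combined without revisiting the raw data; how the paper distributes this work across Storm components is not stated in the abstract.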
(2025) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Xinyan Xie "Performance optimization and comparative analysis of distributed big data processing framework", Proc. SPIE 13447, International Conference on Mechatronics and Intelligent Control (ICMIC 2024), 134470B (16 January 2025); https://doi.org/10.1117/12.3052245
KEYWORDS
Data processing
Analytical research
Data storage
Mathematical optimization
Distributed computing
Data modeling
Printing