The exponential increase in demand for high-quality user-generated content (UGC) videos, coupled with limited bandwidth, poses great challenges for hosting platforms in practice. Efficiently optimizing the compression of UGC videos has therefore become critical. As the ultimate receiver is the human visual system, there is a growing consensus that the optimization of video coding and processing should be fully driven by perceptual quality, so traditional rate-control-based methods may not be optimal. In this paper, a novel perceptual model of compressed UGC video quality is proposed by exploiting characteristics extracted from only the source video. In the proposed method, content-aware features and quality-aware features are explored to estimate quality curves against quantization parameter (QP) variations. Specifically, content-relevant deep semantic features from pre-trained image classification neural networks and quality-relevant handcrafted features from various objective video quality assessment (VQA) models are utilized. Finally, a machine-learning approach is proposed to predict the quality of compressed videos at different QP values. Hence, the quality curves can be derived; by estimating the QP for a given target quality, a quality-centered compression paradigm can be built. Experimental results show that the proposed method can accurately model quality curves for various UGC videos and control compression quality well.
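As a rough, hypothetical illustration of this pipeline (not the paper's implementation): the sketch below trains a regressor on stand-in source features plus QP to predict quality, then inverts the learned curve to pick a QP for a target quality. The random features, synthetic scores, QP grid, and choice of GradientBoostingRegressor are all illustrative assumptions.

```python
# Hypothetical sketch: predict compressed-video quality as a function of QP
# from source-only features, then invert the curve to hit a target quality.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Stand-ins for the paper's features: deep semantic features from a
# pre-trained classifier and handcrafted quality-aware VQA features.
n_videos = 200
content_feat = rng.normal(size=(n_videos, 64))   # placeholder deep features
quality_feat = rng.normal(size=(n_videos, 8))    # placeholder VQA features
X_src = np.hstack([content_feat, quality_feat])

# Training targets: per-video quality scores at several QPs (synthetic here).
qps = np.arange(18, 48, 2)
base = 95 - 0.02 * (qps - 18) ** 2               # toy quality-vs-QP curve
scores = base[None, :] + rng.normal(scale=2.0, size=(n_videos, len(qps)))

# Learn f(source features, QP) -> quality; rows are stacked QP-major.
X = np.vstack([np.hstack([X_src, np.full((n_videos, 1), qp)]) for qp in qps])
y = scores.T.reshape(-1)
model = GradientBoostingRegressor().fit(X, y)

def predict_curve(src_feat):
    """Predicted quality at each candidate QP for one source video."""
    grid = np.hstack([np.tile(src_feat, (len(qps), 1)), qps[:, None]])
    return model.predict(grid)

def qp_for_target(src_feat, target):
    """Largest QP whose predicted quality still meets the target."""
    curve = predict_curve(src_feat)
    feasible = qps[curve >= target]
    return int(feasible.max()) if feasible.size else int(qps.min())

print(qp_for_target(X_src[0], target=85.0))
```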
Significance: Fourier ptychography (FP) is a computational imaging approach that achieves high-resolution reconstruction. Inspired by neural networks, many deep-learning-based methods have been proposed to solve FP problems. However, the performance of FP still suffers from optical aberrations, which need to be accounted for.
Aim: We present a neural network model for FP reconstruction that can properly estimate the aberration and achieve artifact-free reconstruction.
Approach: Inspired by the iterative reconstruction of FP, we design a neural network model that mimics the forward imaging process of FP via TensorFlow. The sample and aberration are treated as learnable weights and optimized through back-propagation. In particular, we employ Zernike terms instead of the raw aberration map to reduce the degrees of freedom of pupil recovery and achieve a high-accuracy estimation. Owing to the auto-differentiation capabilities of the neural network, we additionally utilize total variation regularization to improve the visual quality (a minimal code sketch follows this abstract).
Results: We validate the performance of the reported method via both simulation and experiment. Our method exhibits higher robustness against sophisticated optical aberrations and achieves better image quality by reducing artifacts.
Conclusions: The forward neural network model can jointly recover the high-resolution sample and the optical aberration in iterative FP reconstruction. We hope our method can provide a neural-network perspective for solving iterative coherent or incoherent imaging problems.
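As a rough illustration of the approach above, here is a minimal TensorFlow sketch of a differentiable FP forward model: the sample's amplitude/phase and a handful of Zernike coefficients are tf.Variables fitted by gradient descent with a total variation penalty. The grid sizes, the simplified polynomial basis standing in for true Zernike polynomials, the LED shifts, and the TV weight are placeholder assumptions, not the authors' configuration.

```python
import numpy as np
import tensorflow as tf

N_HI, N_LO = 128, 64   # high-res sample grid and low-res capture grid
N_ZERN = 5             # number of Zernike-like terms for the pupil phase

# Circular pupil support and a crude low-order polynomial basis standing in
# for proper Zernike polynomials; both are fixed constants here.
yy, xx = np.mgrid[-1:1:N_LO * 1j, -1:1:N_LO * 1j]
rho2 = xx**2 + yy**2
support = tf.constant((rho2 <= 1.0).astype(np.complex64))
basis = tf.constant(
    np.stack([xx, yy, 2 * rho2 - 1, xx**2 - yy**2, 2 * xx * yy]).astype(np.float32))

amp = tf.Variable(tf.ones([N_HI, N_HI]))      # sample amplitude (learnable)
phase = tf.Variable(tf.zeros([N_HI, N_HI]))   # sample phase (learnable)
coeffs = tf.Variable(tf.zeros([N_ZERN]))      # Zernike coefficients (learnable)

def forward(shifts):
    """Predict the low-res intensity stack from the current sample/pupil."""
    sample = tf.complex(amp, tf.zeros_like(amp)) * \
             tf.exp(tf.complex(tf.zeros_like(phase), phase))
    spectrum = tf.signal.fftshift(tf.signal.fft2d(tf.signal.ifftshift(sample)))
    pupil_phase = tf.tensordot(coeffs, basis, axes=1)
    pupil = support * tf.exp(tf.complex(tf.zeros_like(pupil_phase), pupil_phase))
    preds = []
    for cy, cx in shifts:  # each LED sees a shifted sub-spectrum via the pupil
        sub = spectrum[cy:cy + N_LO, cx:cx + N_LO] * pupil
        field = tf.signal.fftshift(tf.signal.ifft2d(tf.signal.ifftshift(sub)))
        preds.append(tf.abs(field) ** 2)
    return tf.stack(preds)

# Toy LED shifts and flat "measurements"; a real run would load captures.
shifts = [(24, 24), (24, 40), (40, 24), (40, 40)]
measured = tf.ones([len(shifts), N_LO, N_LO])
opt = tf.keras.optimizers.Adam(0.01)
for step in range(100):
    with tf.GradientTape() as tape:
        # Amplitude-domain data fidelity plus total variation on the sample
        # amplitude; the 1e-3 TV weight is an arbitrary illustrative value.
        fidelity = tf.reduce_mean(
            (tf.sqrt(forward(shifts) + 1e-8) - tf.sqrt(measured)) ** 2)
        tv = tf.reduce_mean(tf.image.total_variation(
            tf.reshape(amp, [1, N_HI, N_HI, 1])))
        loss = fidelity + 1e-3 * tv
    grads = tape.gradient(loss, [amp, phase, coeffs])
    opt.apply_gradients(zip(grads, [amp, phase, coeffs]))
```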
Video quality assessment (VQA) technology has attracted a lot of attention in recent years due to the increasing demand for video streaming services. Existing VQA methods are designed to predict video quality in terms of the mean opinion score (MOS) calibrated by humans in subjective experiments. However, they cannot predict the satisfied user ratio (SUR) of an aggregated viewer group. Furthermore, they provide little guidance for video coding parameter selection, e.g., the quantization parameter (QP) of a set of consecutive frames, in practical video streaming services. To overcome these shortcomings, the just-noticeable-difference (JND) based VQA methodology has been proposed as an alternative. It is observed experimentally that the JND location is a normally distributed random variable. In this work, we explain this distribution by proposing a user model that takes both subject variabilities and content variabilities into account. This model is built upon a user's capability to discern the quality difference between video clips encoded with different QPs. Moreover, it analyzes video content characteristics to account for inter-content variability. The proposed user model is validated on the data collected in the VideoSet. It is demonstrated that the model is flexible enough to predict the SUR distribution of a specific user group.
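Assuming the Gaussian JND model stated above, the satisfied user ratio at a given QP can be read off the normal survival function. The following minimal sketch (with made-up mu/sigma values and an illustrative 75% target, not VideoSet statistics) shows how such an SUR curve could drive QP selection.

```python
# Minimal sketch, assuming the abstract's Gaussian JND model: if the JND
# location (in QP units) for a viewer group is ~ N(mu, sigma^2), the SUR at
# a given QP is the survival function of that normal distribution.
import numpy as np
from scipy.stats import norm

def sur(qp, mu, sigma):
    """Fraction of viewers whose JND lies above this QP (still satisfied)."""
    return norm.sf(qp, loc=mu, scale=sigma)

def max_qp_at_sur(target, mu, sigma, qp_range=range(1, 52)):
    """Largest QP that keeps at least `target` of the group satisfied."""
    feasible = [q for q in qp_range if sur(q, mu, sigma) >= target]
    return max(feasible) if feasible else min(qp_range)

mu, sigma = 36.0, 4.0                    # hypothetical per-content JND stats
print(sur(np.arange(30, 42, 2), mu, sigma))
print(max_qp_at_sur(0.75, mu, sigma))    # QP meeting a 75% SUR target
```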