Assessing smile genuineness from video sequences is a vital topic concerned with recognizing facial expressions and linking them with the underlying emotional states. A number of techniques have been proposed that are based on handcrafted features, as well as those that rely on deep learning to extract useful features. As both of these approaches have certain benefits and limitations, in this work we propose to combine the features learned by a long short-term memory network with the features handcrafted to capture the dynamics of facial action units. The results of our experiments indicate that the proposed solution is more effective than the baseline techniques and that it allows for assessing smile genuineness from video sequences in real time.
This paper tackles the problem of mixed Gaussian and impulsive noise suppression in color images. The proposed method comprises two essential steps. Firstly, we detect impulsive noise through an approach based on the concept of digital path exploring the local pixel neighborhood. Each pixel is assigned a cost of a path connecting the boundary of a local processing window with its center. When the central pixel exhibits a high value of the path with lowest cost, it is identified as an impulse. To achieve this, we use a thresholding procedure for detecting corrupted pixels. Analyzing the distribution of minimum path costs, we employ the k-means technique to classify pixels into three distinct categories: those nearly undistorted, those corrupted by Gaussian noise, and those affected by impulsive noise. Subsequently, we employ the Laplace interpolation technique to restore the impulsive pixels — a fast and effective method yielding satisfactory denoising results. In the second step, we address the residual Gaussian noise using the Non-Local Means method, which selectively considers pixels from the local window that have not been flagged as impulsive. The experimental results confirm that our proposed hybrid method consistently yields superior outcomes compared to state-of-the-art denoising techniques. Moreover, its computational complexity remains low, rendering it suitable for real-time applications.
Hyperspectral image analysis has been attracting research attention in a variety of fields. Since the size of hyperspectral data cubes can easily reach gigabytes, their efficient transfer, manual delineation, and intrinsic heterogeneity have become serious obstacles in building ground-truth datasets in emerging scenarios. Therefore, applying supervised learners for hyperspectral classification and segmentation remains a difficult yet very important task in practice, as segmentation is a pivotal step in the process of extracting useful information about the scanned area from such highly dimensional data. We tackle this problem using self-organizing maps and exploit an unsupervised algorithm for segmenting such imagery. The experimental study, performed over two benchmark hyperspectral scenes and backed up with a sensitivity analysis, showed that our technique is flexible enough to be applied for this purpose, delivers reliable segmentations, and operates fast.
Deep Neural Networks (DNNs) have been deployed in many real-world applications in various domains, both industrial and academic, and have proven to deliver outstanding performance. However, DNNs are vulnerable to adversarial attacks, which are small perturbations embedded in an image. As a result, the introduction of DNNs into safety-critical systems, such as autonomous vehicles, unmanned aerial vehicles or healthcare devices, would carry a very high risk of limiting their capability to recognize and interpret the environment in which they are used, and could therefore lead to devastating consequences. Thus, enhancing the robustness of DNNs through the development of defense mechanisms is a matter of the utmost importance. In this paper, we evaluate a set of state-of-the-art denoising filters designed for impulsive noise removal as defensive solutions. The proposed methods are applied as a pre-processing step, in which the adversarial patterns in the source image are removed before performing the classification task. As a result, the pre-processing defense block can be easily integrated with any type of classifier, without any knowledge of the training procedures used or the internal architecture of the model. Moreover, the evaluated filtering methods can be considered universal defensive techniques, as they are completely independent of the internal aspects of the selected attack and can be applied against any type of adversarial threat. The experimental results obtained on the German Traffic Sign Recognition Benchmark (GTSRB) show that the denoising filters provide high robustness against sparse adversarial attacks and do not significantly decrease the classification performance on non-altered data.
KEYWORDS: Signal detection, Signal processing, Video, Distance measurement, Environmental monitoring, Environmental sensing, Video surveillance, Signal analyzers, Underwater imaging
Autonomous underwater drone operation requires on-line analysis of signals coming from various sensors. In this paper we focus on the design of the visual front-end of an underwater drone, which is optimized for abrupt signal change detection to assist in maneuvering and underwater object search operations. The proposed method relies on tensor space comparison with the chordal kernel function. This kernel measures a distance expressed in terms of principal angles on Grassmann manifolds of unfolded tensors. Although tested on color videos, the method can be scaled to accept more signal types in the input tensors. Experiments show promising results.
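The chordal distance mentioned above can be sketched as follows. This is only a minimal illustration of the standard definition (square root of the sum of squared sines of the principal angles), not the authors' implementation; the function names `unfold`, `subspace_basis` and `chordal_distance`, as well as the `rank` parameter, are assumptions introduced here.

```python
import numpy as np


def unfold(tensor, mode):
    """Flatten a tensor into a matrix whose rows are indexed by `mode`."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def subspace_basis(tensor, mode, rank):
    """Orthonormal basis (first `rank` left singular vectors) of an unfolding."""
    u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
    return u[:, :rank]


def chordal_distance(a, b):
    """Chordal distance between the subspaces spanned by the orthonormal
    columns of `a` and `b`."""
    # Singular values of a^T b are the cosines of the principal angles.
    cosines = np.linalg.svd(a.T @ b, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)
    # sin^2 = 1 - cos^2, summed over all principal angles.
    return float(np.sqrt(np.sum(1.0 - cosines ** 2)))
```

Under these assumptions, an abrupt change between two consecutive frames would show up as a large `chordal_distance` between the bases returned by `subspace_basis` for each frame; how the distance is turned into a kernel and thresholded is not specified in the abstract.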
In this paper a novel method of impulsive noise removal in color images is presented. The proposed filtering design is based on a new measure of pixel similarity, which takes into account the structure of the local neighborhood of the pixels being compared. Thus, the new distance measure can be regarded as an extension of the reachability distance used in the construction of the local outlier factor, widely used in big data analysis. Using the new similarity measure, an extension of the classic Vector Median Filter (VMF) has been developed. The new filter is extremely robust to outliers introduced by the impulsive noise, retains details and has the unique ability to sharpen image edges. Using the structure of the developed filter, a new impulse detector has been constructed. The cumulated sum of the smallest reachability distances in the filtering window serves as a robust measure of pixel outlyingness. In this way, a pixel will be treated as corrupted if a predefined threshold is exceeded and will be replaced by the average of pixels which were found to belong to the original, pristine image; otherwise the processed pixel will be retained. This structure is similar to the Fast Averaging Peer Group Filter; however, the incorporation of the reachability measure makes this technique more robust. The new filtering design can be applied in real-time scenarios, as its computational efficiency is comparable with that of the standard VMF, which is fast enough to be used for the enhancement of video sequences. The new filter operates in a 3×3 filtering window; however, information acquired from a larger window is processed. The source of additional information is the local neighborhood of pixels, which is used for the determination of the novel reachability measure. The experiments performed on a large database of color images show that the new filter surpasses existing designs, especially in the case of highly polluted images.
The robust reachability measure assures that the clusters of impulses are being removed, as not only the pixels, but also their neighborhoods are considered. The novel measure of dissimilarity can be also used in other tasks whose main goal is the detection of outliers.
Recent advancements in single-image super-resolution reconstruction (SRR) are attributed primarily to convolutional neural networks (CNNs), which effectively learn the relation between low and high resolution and allow for obtaining high-quality reconstruction within seconds. SRR from multiple images benefits from information fusion, which improves the reconstruction outcome compared with example-based methods. On the other hand, multiple-image SRR is computationally more demanding, mainly due to required subpixel registration of the input images. Here, we explore how to exploit CNNs in multiple-image SRR and we demonstrate that competitive reconstruction outcome can be obtained within seconds.
Data augmentation is a popular technique which helps improve generalization capabilities of deep neural networks, and can be perceived as implicit regularization. It is widely adopted in scenarios where acquiring high-quality training data is time-consuming and costly, with hyperspectral satellite imaging (HSI) being a real-life example. In this paper, we investigate data augmentation policies (exploiting various techniques, including generative adversarial networks applied to elaborate artificial HSI data) which help improve the generalization of deep neural networks (and other supervised learners) by increasing the representativeness of training sets. Our experimental study performed over HSI benchmarks showed that hyperspectral data augmentation boosts the classification accuracy of the models without sacrificing their real-time inference speed.
Deep learning has been widely applied in many computer vision tasks due to its impressive capability of automatic feature extraction and classification. Recently, deep neural networks have been used in image denoising, but most of the proposed approaches were designed for Gaussian noise suppression. Therefore, in this paper, we address the problem of impulsive noise reduction in color images using Denoising Convolutional Neural Networks (DnCNN). This network architecture utilizes the concept of deep residual learning and is trained to learn the residual image instead of the directly denoised one. Our preliminary results show that direct application of DnCNN achieves significantly better results than the state-of-the-art filters designed for impulsive noise in color images.
Cortical surface extraction from magnetic resonance (MR) scans is a preliminary, yet crucial step in brain segmentation and analysis. Although there are many algorithms that address this problem, they often sacrifice execution speed for accuracy or they depend on many parameters that have to be tuned manually by an experienced practitioner. Therefore, fast, accurate and autonomous cortical surface extraction algorithms are in high demand and they are being actively developed to enable clinicians to appropriately plan a treatment pathway and quantify response in patients with brain lesions based on precise image analysis. In this paper, we present an automated approach for cortical surface extraction from MR images based on 3D image morphology, connected component labeling and edge detection. Our technique allows for real-time processing of MR scans: an average study of 102 slices, each 512×512 pixels, takes approximately 768 ms to process (about 7.5 ms per slice) with known parameters. To automate the process of tuning the algorithm parameters, we developed a genetic algorithm for this task. An experimental study performed using real-life MR brain images revealed that the proposed algorithm offers very high-quality cortical surface extraction, works in real time, and is competitive with the state of the art.
In this paper, a hybrid underwater drone maneuvering front-end that joins background subtraction and stereovision is presented. A novel formulation of median-based background subtraction allows for fast and reliable foreground/background scene segmentation based on the analysis of drone-environment relative movement. The subsequent stereovision block performs matching of the foreground objects detected by the background subtraction module. Based on this, information on the relative distance to the nearest objects can be provided to the drone in order to avoid collisions. The system does not assume any prior calibration and can operate in real time.
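For orientation, the classic temporal-median background model can be sketched as below. This is the textbook variant, not the novel formulation proposed in the paper; the function names and the `threshold` value are assumptions chosen for illustration.

```python
import numpy as np


def median_background(frames):
    """Per-pixel temporal median over a list of equally sized frames."""
    return np.median(np.stack(frames, axis=0), axis=0)


def foreground_mask(frame, background, threshold=25.0):
    """Mark pixels whose absolute difference from the background model
    exceeds `threshold` (an assumed, hand-picked value)."""
    diff = np.abs(frame.astype(np.float64) - background)
    if diff.ndim == 3:
        # For color frames, use the largest per-channel difference.
        diff = diff.max(axis=2)
    return diff > threshold
```

The median is robust to objects that appear in only a minority of the buffered frames, which is why it is a common choice for background modeling.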
In the paper, a novel approach to the enhancement of color images corrupted by impulsive noise is presented. The proposed algorithm first calculates for every image pixel the distances in the RGB color space to all elements belonging to the filtering window. Then, a sum of a specified number of smallest distances, which serves as a measure of pixel similarity, is calculated. This generalization of the Rank-Ordered Absolute Difference (ROAD) is robust to outliers, as the high distances are not considered when calculating this measure. Next, for each pixel, a neighbor with smallest ROAD value is searched for. If such a pixel is found, then the filtering window is moved to a new position and again a neighbor, with ROAD measure lower than the initial value is looked for. If it is encountered, the window is moved again, otherwise the process is terminated and the starting pixel is replaced with the last pixel in the path formed by the iterative procedure of the window shifting. The comparison with the filters intended for the removal of noise in color images revealed excellent properties of the new enhancement technique. It is very fast, as the ROAD values can be pre-computed, and the formation of the paths needs only comparisons of scalar values. The proposed technique can be applied for the restoration of color images distorted by impulsive noise and can also be used as a method of edge sharpening. Its low computational complexity allows also for its application in the processing of video sequences.
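The steps described above can be sketched roughly as follows. This is a simplified reading of the abstract rather than the authors' code: the Euclidean distance in RGB, the number of smallest distances `k`, the window `radius`, the edge padding and all function names are assumptions.

```python
import numpy as np


def road_color(window, k=4):
    """Sum of the k smallest RGB distances between the central pixel of an
    (n, n, 3) window and the remaining pixels of that window."""
    n = window.shape[0]
    pixels = window.reshape(-1, 3).astype(np.float64)
    center = (n * n) // 2
    dists = np.linalg.norm(pixels - pixels[center], axis=1)
    dists = np.delete(dists, center)  # skip the zero self-distance
    return float(np.sort(dists)[:k].sum())


def road_map(image, k=4, radius=1):
    """Pre-compute the measure for every pixel (edges are replicated)."""
    pad = ((radius, radius), (radius, radius), (0, 0))
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    h, w, _ = image.shape
    size = 2 * radius + 1
    out = np.empty((h, w))
    for y in range(h):
        for x in range(w):
            out[y, x] = road_color(padded[y:y + size, x:x + size], k)
    return out


def descend(road, y, x):
    """Follow strictly decreasing values among 8-neighbors until none is
    lower; return the coordinates of the last pixel on the path."""
    h, w = road.shape
    while True:
        best = (road[y, x], y, x)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and road[ny, nx] < best[0]:
                    best = (road[ny, nx], ny, nx)
        if (best[1], best[2]) == (y, x):
            return y, x
        y, x = best[1], best[2]
```

In this sketch the filtered value of pixel `(y, x)` would be `image[descend(road, y, x)]`. Because the map is computed once and path formation only compares scalars, the per-pixel cost of the second stage stays low, which matches the efficiency argument in the abstract.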
In many practical situations visual pattern recognition is vastly burdened by low quality of input images due to noise, geometrical distortions, as well as low quality of the acquisition hardware. However, although there are techniques of image quality improvement, such as nonlinear filtering, only a few attempts have been reported in the literature to build these enhancement methods into a complete chain for multi-dimensional object recognition, such as in color video or hyperspectral images. In this work we propose a joint multilinear signal filtering and classification system built upon the multi-dimensional (tensor) approach. Tensor filtering is performed by projecting the multi-dimensional input signal into the tensor subspace spanned by the best-rank tensor decomposition method. Object classification, in turn, is done by constructing a tensor subspace based on the Higher-Order Singular Value Decomposition method applied to the prototype patterns. In the experiments we show that the proposed chain allows high object recognition accuracy in real time, even from poor-quality prototypes. Even more importantly, the proposed framework allows unified classification of signals of any dimensions, such as color images or video sequences, which are exemplars of 3D and 4D tensors, respectively. The paper also discusses some practical issues related to the implementation of the key components of the proposed system.
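As a point of reference, the Higher-Order Singular Value Decomposition itself can be sketched in a few lines of NumPy. This shows only the decomposition and its exact reconstruction; the classification and best-rank filtering stages of the paper are not reproduced, and the function names are assumptions.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-`mode` matricization of a tensor."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along dimension `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)


def hosvd(tensor):
    """Return the core tensor and one orthonormal factor matrix per mode."""
    factors = [
        np.linalg.svd(unfold(tensor, m), full_matrices=False)[0]
        for m in range(tensor.ndim)
    ]
    core = tensor
    for m, u in enumerate(factors):
        core = mode_product(core, u.T, m)
    return core, factors


def reconstruct(core, factors):
    """Invert the decomposition by multiplying the core by every factor."""
    out = core
    for m, u in enumerate(factors):
        out = mode_product(out, u, m)
    return out
```

Truncating the columns of the factor matrices before projecting would give a lower-rank subspace; how many components to keep is a design choice the abstract does not state.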
A long-lasting inflammation of joints occurs, among others, in many arthritic diseases. When left untreated, it may affect other organs and the patient's general health. Therefore, early detection and proper medical treatment are of great value. The patient's organs are scanned with high-frequency acoustic waves, which enable visualization of interior body structures through an ultrasound sonography (USG) image. Although the procedure is standardized, different projections result in a variety of possible data, which have to be analyzed in a short period of time by a physician who uses medical atlases for guidance. This work introduces an efficient framework based on a statistical approach to the finger joint USG image, which enables automatic localization of skin and bone regions, which are then used for localization of the finger joint synovitis area. The processing pipeline performs the task in real time and achieves high accuracy when compared to annotations prepared by an expert.
In this paper we address the problem of the reduction of multiplicative noise in digital images. This kind of image distortion, also known as speckle noise, severely decreases the quality of medical ultrasound images and therefore their effective enhancement and restoration is of vital importance for proper visual inspection and quantitative measurements. The structure of the proposed Pixel-Patch Similarity Filter (PPSF) is a weighted average of pixels in a processing block and the weights are determined calculating the sum of squared differences between the mean of a patch and the intensities of pixels of the local window at the block center. The structure of the proposed design is similar to the bilateral and non-local means filters, however we neglect the topographic distance between pixels, which decreases substantially its computational complexity. The new technique was evaluated on standard gray scale test images contaminated with multiplicative noise modelled using Gaussian and uniform distribution. Its efficiency was also assessed utilizing a set of simulated ultrasonographic images distorted by means of the Field II simulation software and real ultrasound images of a finger joint. The comparison with the state-of-the-art techniques revealed very high efficiency of the proposed filtering framework, especially for strongly degraded images. Visually, the homogeneous areas are smoother, while image edges and small details are better preserved. The experiments have shown that satisfactory results were obtained with patches consisting of only 9 samples belonging to a relatively small processing block of 7x7 pixels, which ensures low computational complexity of the proposed denoising scheme and allows its application in real-time image processing scenarios.
In the paper a novel filtering design based on the concept of exploration of the pixel neighborhood by digital paths is presented. The paths start from the boundary of a filtering window and reach its center. The cost of transitions between adjacent pixels is defined in the hybrid spatial-color space. Then, an optimal path of minimum total cost, leading from pixels of the window's boundary to its center is determined. The cost of an optimal path serves as a degree of similarity of the central pixel to the samples from the local processing window. If a pixel is an outlier, then all the paths starting from the window's boundary will have high costs and the minimum one will also be high. The filter output is calculated as a weighted mean of the central pixel and an estimate constructed using the information on the minimum cost assigned to each image pixel. So, first the costs of optimal paths are used to build a smoothed image and in the second step the minimum cost of the central pixel is utilized for construction of the weights of a soft-switching scheme. The experiments performed on a set of standard color images, revealed that the efficiency of the proposed algorithm is superior to the state-of-the-art filtering techniques in terms of the objective restoration quality measures, especially for high noise contamination ratios. The proposed filter, due to its low computational complexity, can be applied for real time image denoising and also for the enhancement of video streams.
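The minimum path cost described above can be computed with a shortest-path search inside the window. The sketch below is only an illustration under strong assumptions: the transition cost is taken to be the plain Euclidean color distance between 8-connected neighbors (the paper defines it in a hybrid spatial-color space whose exact form is not given here), and the function name is hypothetical.

```python
import heapq

import numpy as np


def min_boundary_to_center_cost(window):
    """Lowest total cost of an 8-connected path joining any boundary pixel
    of an (n, n, 3) window (n odd) with its central pixel."""
    n = window.shape[0]
    pixels = window.astype(np.float64)
    cy = cx = n // 2
    dist = np.full((n, n), np.inf)
    dist[cy, cx] = 0.0
    # Costs are symmetric, so searching outward from the center is
    # equivalent to searching from the boundary inward.
    heap = [(0.0, cy, cx)]
    while heap:
        d, y, x = heapq.heappop(heap)
        if d > dist[y, x]:
            continue  # stale queue entry
        if y in (0, n - 1) or x in (0, n - 1):
            return float(d)  # first boundary pixel popped is the cheapest
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (dy or dx) and 0 <= ny < n and 0 <= nx < n:
                    step = float(np.linalg.norm(pixels[y, x] - pixels[ny, nx]))
                    if d + step < dist[ny, nx]:
                        dist[ny, nx] = d + step
                        heapq.heappush(heap, (d + step, ny, nx))
    return float("inf")
```

With this cost, an isolated impulse at the window center yields a high value because every path must cross the jump into the center, whereas a pixel lying on an edge still has a cheap path through pixels of its own color.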
In this paper a system for real-time recognition of objects in multidimensional video signals is proposed. Object recognition is done by pattern projection into the tensor subspaces obtained from the factorization of the signal tensors representing the input signal. However, instead of taking only the intensity signal, the novelty of this paper is to first build the Extended Structural Tensor representation from the intensity signal, which conveys information on signal intensities as well as on higher-order statistics of the input signals. In this way the higher-order input pattern tensors are built from the training samples. Then, the tensor subspaces are built based on the Higher-Order Singular Value Decomposition of the prototype pattern tensors. Finally, recognition relies on measurements of the distance of a test pattern projected into the tensor subspaces obtained from the training tensors. Due to the high dimensionality of the input data, tensor-based methods require high memory and computational resources. However, recent achievements in the technology of multi-core microprocessors and graphics cards allow real-time operation of the multidimensional methods, as is shown and analyzed in this paper based on real examples of object detection in digital images.
The rapid development of the Internet in the early 1990s caused an explosive growth of publicly accessible multimedia resources. It created a new viewpoint on the storage, distribution and processing of enormous collections of images. Along with the development of the World Wide Web, much effort has been dedicated to creating content-based image retrieval systems which are able to efficiently index, retrieve and manage large-scale databases. In this paper we propose a color indexing method based on the Gaussian Mixture Model of color histograms. The model parameters serve as signatures enabling fast and efficient color image retrieval. We show that the proposed approach is robust to color image distortions introduced by lossy compression artifacts and is therefore well suited for indexing and retrieval of Internet-based collections of color images stored in lossy compression formats.
KEYWORDS: Image segmentation, Image processing, RGB color model, Digital imaging, Image processing algorithms and systems, Image quality, 3D image processing, Video, Image filtering, Binary data
Colorization is a term introduced by W. Markle [1] to describe a computerized process for adding color to black and white pictures, movies or TV programs. The task involves replacing a scalar value stored at each pixel of the gray scale image by a vector in a three-dimensional color space with luminance, saturation and hue, or simply RGB. Since different colors may carry the same luminance value but vary in hue and/or saturation, the problem of colorization has no inherently "correct" solution. Due to these ambiguities, human interaction usually plays a large role.
In this paper we present a novel colorization method that takes advantage of the morphological distance transformation, changes of neighboring pixel intensities and gradients to propagate the color within the gray scale image. The proposed method frees the user from segmenting the image, as color is provided simply by scribbles which are then automatically propagated within the image. The effectiveness of the algorithm allows the user to work interactively and to obtain the desired results promptly after providing the color scribbles. In the paper we show that the proposed method allows for high quality colorization results for still images.
The rapid growth of image archives increases the need for efficient and fast tools that can retrieve and search through large amounts of visual data. In this paper we propose an efficient method of extracting the image color content, which serves as an image digital signature, making it possible to efficiently index and retrieve the content of large, heterogeneous multimedia databases. We apply the proposed method to the retrieval of images from the WEBMUSEUM Internet database, containing a collection of fine art images, and show that the new method of image color representation is robust to image distortions caused by resizing and compression and can be incorporated into existing retrieval systems which exploit the information on color content in digital images.
The smoothing function of widely used vector filters such as the vector median filter (VMF), basic vector directional filter (BVDF) and directional distance filter (DDF) is designed to perform a fixed amount of smoothing. This may be an undesired property, because in some image areas these filters introduce too much smoothing and blur thin details and image edges. In general, the common problem is how to preserve some desired signal features while the noise elements are removed. An optimal situation would arise if the filter could be designed so that the desired features were invariant to the filtering operation and only noise would be affected. In the case of impulsive noise corruption, the problem is often stated as searching for a switching function that restricts the filter effect to noisy samples only. In this paper, a new nonlinear filtering scheme for the removal of impulsive noise in multichannel digital images is presented. A new class of multichannel sigma filters is based on the combination of the standard sigma-filter concept provided by Lee and the robust order-statistics theory. With respect to a variety of measures (e.g. vector distance expressed through the Minkowski metric, angular distance or their combination) for quantifying the distance between multichannel samples, we provide a rich class of adaptive vector sigma filters taking advantage of the threshold structure with the approximation of the standard deviation and also the fully adaptive filter structure. Thus, by adaptive switching between the smoothing function and the identity operation, the behavior of the proposed method is attractive for filtering of image environments degraded by impulsive noise, bit errors and outliers. The new filtering scheme is computationally efficient and able to achieve an excellent balance between image detail preservation and noise suppression.
The achieved results show that the new filtering class has excellent preservation capabilities and provides significant improvement in comparison with well-known vector filters such as VMF, BVDF and DDF in terms of all commonly used quality measures.
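The VMF baseline referred to above selects, from the samples in the filtering window, the one with the smallest aggregated distance to all the others. A minimal sketch using the Euclidean norm is given below; the function name is an assumption, and the adaptive sigma filters proposed in the paper are not reproduced.

```python
import numpy as np


def vector_median(window):
    """Return the pixel of an (n, n, c) window that minimizes the sum of
    Euclidean distances to all other pixels in the window."""
    pixels = window.reshape(-1, window.shape[-1])
    as_float = pixels.astype(np.float64)
    # Pairwise distance matrix between all samples in the window.
    diffs = as_float[:, None, :] - as_float[None, :, :]
    totals = np.linalg.norm(diffs, axis=2).sum(axis=1)
    return pixels[np.argmin(totals)]
```

Because the output is always one of the input vectors, the filter introduces no new colors, but it is applied to every pixel regardless of whether that pixel is corrupted, which is the fixed-smoothing drawback discussed in the abstract.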
In this paper we address the problem of impulsive noise reduction in multichannel images. A new class of filters for noise attenuation is introduced and its relationship with commonly used filtering techniques is investigated. The computational complexity of the new filter is significantly lower than that of the Vector Median Filter (VMF). Extensive simulation experiments indicate that the new filter outperforms the VMF, as well as other techniques currently used to eliminate impulsive noise in color images.
We provide a unified framework of nonlinear vector techniques outputting the lowest ranked vector. The proposed framework constitutes a generalized filter class for multichannel signal processing. A new class of nonlinear selection filters is based on the robust order-statistic theory and the minimization of the weighted distance function to other input samples. The proposed method can be designed to perform a variety of filtering operations, including previously developed filtering techniques such as the vector median, basic vector directional filter, directional distance filter, weighted vector median filters and weighted directional filters. A wide range of filtering operations is guaranteed by the filter structure with two independent weight vectors for the angular and distance domains of the vector space. In order to adapt the filter parameters to varying signal and noise statistics, we also provide generalized optimization algorithms taking advantage of the weighted median filters and the relationship between the standard median filter and the vector median filter. Thus, we can deal with both statistical and deterministic aspects of the filter design process. It will be shown that the proposed method holds the required properties, such as the capability of modelling the underlying system in the application at hand, robustness with respect to errors in the model of the underlying system, the availability of a training procedure and, finally, the simplicity of filter representation, analysis, design and implementation. Simulation studies also indicate that the new filters are computationally attractive and have excellent performance in environments corrupted by bit errors and impulsive noise.
In this paper a novel approach to the problem of edge preserving noise reduction in color images is proposed and evaluated. The new algorithm is based on the combined forward and backward anisotropic diffusion with incorporated time dependent cooling process. This method is able to efficiently remove image noise, while preserving and even enhancing image edges. The proposed algorithm can be used as a first step of different techniques, which are based on color, shape and spatial location information.
We provide a new non-motion compensated adaptive multichannel filter for the detection and removal of impulsive noise, bit errors and outliers in color video or color image sequences. The proposed nonlinear filter takes advantage of the concept of local entropy contrast and the robust order-statistics theory. The new entropy-based vector median is computationally attractive, robust for a wide range of impulsive noise corruption, and significantly improves the signal-detail preservation capability of the standard vector median filter. Because the precision of statistical operators such as the mean or entropy increases with the number of observed samples, the spatiotemporal cube filter window used guarantees a high accuracy of the proposed method, which is able to achieve excellent results in terms of commonly used objective measures and clearly outperforms standard vector filtering schemes.
This paper presents a new filtering scheme for the removal of impulsive noise in color images. It is based on estimating the probability density function for color pixels in a filter window by means of the kernel density estimation method. A quantitative comparison of the proposed filter with the vector median filter shows its excellent ability to reduce noise while simultaneously preserving fine image details.
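One way to picture a density-based filter of this kind is sketched below. The abstract does not state the kernel, the bandwidth or the output rule, so the Gaussian kernel, the `bandwidth` value and the choice of returning the window sample with the highest estimated density are all assumptions made for illustration; the function name is hypothetical.

```python
import numpy as np


def kde_select(window, bandwidth=30.0):
    """Estimate a Gaussian kernel density at every pixel of an (n, n, c)
    window and return the pixel with the highest density, together with
    the (unnormalized) density values."""
    pixels = window.reshape(-1, window.shape[-1]).astype(np.float64)
    # Squared Euclidean distances between all pairs of color vectors.
    sq = ((pixels[:, None, :] - pixels[None, :, :]) ** 2).sum(axis=2)
    density = np.exp(-sq / (2.0 * bandwidth ** 2)).sum(axis=1)
    return pixels[np.argmax(density)], density
```

Under these assumptions, an impulse that is isolated in color space receives a density close to the contribution of its own kernel only, so it is never selected as the output when the window contains a cluster of similar pixels.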
KEYWORDS: Image filtering, Digital filtering, Optical filters, Nonlinear filtering, Signal to noise ratio, Image processing, Denoising, RGB color model, Color image processing, Image enhancement