With advancements in unmanned aerial vehicle (UAV) technology, UAV applications are rapidly growing, and their operations are becoming increasingly intelligent. Localization of UAVs commonly relies on global navigation satellite systems combined with inertial navigation systems through sensor fusion. However, this approach is vulnerable to significant risks, such as signal spoofing. In military conflicts, signal spoofing by hackers poses a severe security threat with potentially catastrophic outcomes. To address this issue, we propose a two-stage vision-based UAV localization method. This approach utilizes multi-category semantic segmentation and template matching to establish a connection between heterogeneous sensors. Experimental results demonstrate the method’s effectiveness in accurately identifying the UAV’s location within extensive geographical areas captured in remote sensing images. In addition, it achieves high precision in aligning UAV locations with Baidu maps, offering robust and accurate localization capabilities.
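The second stage of such a pipeline relies on template matching to pin a UAV view to a location in a larger reference image. As a minimal sketch of that idea (not the authors' implementation, which pairs it with multi-category semantic segmentation), the following slides a patch over a scene and scores each position by normalized cross-correlation; all names and the toy data are illustrative.

```python
import numpy as np

def match_template_ncc(image, template):
    """Slide `template` over `image` and return the (row, col) with the
    best normalized cross-correlation score. A brute-force toy stand-in
    for the template-matching stage, not the paper's method."""
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.linalg.norm(t)
    best, best_pos = -np.inf, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            w = image[r:r + th, c:c + tw]
            w = w - w.mean()
            denom = np.linalg.norm(w) * t_norm
            if denom == 0:
                continue                      # flat window, undefined NCC
            score = float((w * t).sum() / denom)
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos, best

# Toy usage: embed a patch in a larger "map" and recover its location.
rng = np.random.default_rng(0)
scene = rng.random((40, 40))
patch = scene[10:18, 22:30].copy()
pos, score = match_template_ncc(scene, patch)   # pos == (10, 22)
```

In practice the search would run over a coarse grid of a large remote sensing image, with the segmentation output rather than raw pixels as the matching feature.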
Recently, Mueller matrix polarimetry has been widely applied in fields such as biomedicine, remote sensing, and target decomposition. However, the measurement, decomposition, and depolarization characterization of mixed samples remain challenging because of their complex light-matter interactions. In this work, Mueller matrix measurement and decomposition methods are proposed for mixed samples with rough surfaces. First, it is proved that a linear combination of two different Mueller matrices results in depolarization. Second, an optimization method is used to decompose and correct the Mueller matrices of the sample. Finally, the accurate depolarization of the mixed sample is obtained using the eigenvalue method. The experimental results validate that the method has great potential for the accurate measurement, noise reduction, and decomposition of Mueller matrices of mixed samples with rough surfaces.
Feature extraction and matching of remote sensing images is becoming increasingly important, with a wide range of applications. It matches and superimposes images of the same scene acquired at different times, by different sensors, and from different angles, and maps the optimal alignment onto the target image. CNN-based algorithms have shown superior expressiveness compared with traditional methods in almost all image-related fields. This paper optimizes a network based on SuperPoint by replacing standard convolution with depthwise-separable convolution, which has fewer parameters, and replacing the convolutional block with a spindle-shaped inverted-residual block composed of dimension expansion, depthwise-separable convolution, and dimension reduction. The network depth is fine-tuned to preserve accuracy. The model is trained on the RSSCN7 remote sensing dataset. In a cross-sectional comparison with other traditional algorithms, each combined with SuperGlue, the optimized algorithm shows superior overall performance.
The task of segmenting small infrared targets, which have few pixels and weak features, has long been a difficult problem in small target image processing. Small targets appear not only in general images but also widely in footage from UAV cameras, communication base station cameras, rescue cameras, and vehicle cameras. The study of small target segmentation algorithms is therefore important for analyzing and exploiting these images, with significant applications in security, transportation, and rescue. Traditional small target segmentation algorithms can segment objects with simple contour edges and large differences in signal strength, but they often suffer from high false detection and missed detection rates when facing multiple targets with weak signal strength, and they perform poorly in complex scenes. In this paper, we introduce an infrared small target segmentation scheme designed for multiple target types and counts. We also produce an infrared UAV and pedestrian dataset for validation.
Distinguishing the target from the background, judging target occlusion, and real-time processing are problems that visual tracking algorithms still need to solve. Here, color information and position information of the target block are fused as new features to track the target under a particle filtering framework. First, the hue, saturation, value (HSV) space and the color integral graph of the image are constructed. The vector representation of the target is obtained on the color integral image by a sparse matrix. Then, candidate particles are produced by a particle filter, and the sampling mode of the particles is adjusted by a uniform acceleration model; the spread of the particles reflects the position and scale change of the target. Finally, the candidate with the smallest eigenvector projection error is taken as the tracking target, and the feature template is updated based on the tracking results. The presented algorithm can track a single target in a color image sequence and has some robustness to scale change, occlusion, and morphological change of the target. Experimental results on public datasets show that the proposed algorithm performs favorably in both speed and tracking effect compared with other conventional trackers.
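The predict-weight-resample loop of particle filter tracking can be sketched as follows. This is a deliberately minimal version on a synthetic sequence: the "color feature" is reduced to mean patch brightness, the motion model is plain diffusion rather than the uniform acceleration model described above, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_frame(cx, cy, size=64):
    """Synthetic frame: a bright 5x5 target on a dark background."""
    f = np.zeros((size, size))
    f[cy - 2:cy + 3, cx - 2:cx + 3] = 1.0
    return f

def likelihood(frame, px, py):
    """Simple appearance cue: mean brightness of the 5x5 patch around a
    particle (a stand-in for the paper's fused color/position feature)."""
    x = int(np.clip(px, 2, frame.shape[1] - 3))
    y = int(np.clip(py, 2, frame.shape[0] - 3))
    return frame[y - 2:y + 3, x - 2:x + 3].mean() + 1e-6

n = 300
particles = rng.uniform(0, 64, size=(n, 2))        # columns: (x, y)
true_path = [(20 + 2 * t, 30) for t in range(10)]  # target drifts right
for cx, cy in true_path:
    frame = make_frame(cx, cy)
    particles += rng.normal(0, 3, size=(n, 2))     # predict: diffusion step
    w = np.array([likelihood(frame, x, y) for x, y in particles])
    w /= w.sum()                                   # weight by the cue
    idx = rng.choice(n, size=n, p=w)               # multinomial resampling
    particles = particles[idx]
estimate = particles.mean(axis=0)                  # (x, y) after last frame
```

After ten frames the particle cloud should have collapsed onto the moving target, so `estimate` lands near the final true position (38, 30).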
As UAV applications in military and civilian fields become more widespread, detecting UAVs at low altitude has become an important research direction. Compared with radar and visible-light detection, infrared technology has become the major UAV detection method owing to its all-weather capability and long range. Most current infrared target detection methods are based on convolutional neural networks (CNNs), which achieve detection through feature extraction and feature classification. The performance of all such detection algorithms is highly dependent on their training set: a dataset with a large number of samples and wide coverage tends to train a more robust and accurate detector. To obtain better detection, we therefore perform data augmentation on an infrared UAV dataset using a generative adversarial network (GAN). First, we extract the targets from the training set and train a GAN, using its generator to produce many new targets that differ from the training samples; we then randomly insert these targets into the original dataset, and finally retrain the detectors on the new dataset to achieve better detection. We created an infrared UAV image dataset for our experiments, with only a single target in each image; after augmentation, multiple UAV targets are randomly generated. The experiments demonstrate that the new dataset trains models with better detection results, and that the GAN data augmentation can be combined with many advanced detectors to yield a large improvement in detection.
For the problems of large viewpoint variation, heavy distortion, and small overlap area in UAV image registration, this paper proposes a density-analysis-based method to remove mismatches from putative feature correspondences. Our method uses intra-cluster topological constraints for mismatch filtering, built on a density-based hierarchical clustering algorithm. Compared with other methods that filter mismatches using neighborhood topological relationships, ours is more robust to viewpoint changes in both the horizontal and vertical directions. The algorithm uses a coarse-to-fine strategy: it starts by establishing putative feature correspondences from local descriptors such as SIFT and ORB, and then removes outliers by clustering the feature points and verifying the topological consistency of the clusters across images. We view feature point matching as a correspondence problem between instances of the same visual model in two images, and clustering the feature points by density approximates the separation of multiple visual patterns. We tested our algorithm on a UAV image dataset that includes several image pairs with ground truth. These pairs contain horizontal, vertical, and mixed viewpoint changes, which produce low overlap, image distortion, and severe outliers. Experiments demonstrate that our method significantly outperforms the state of the art in matching precision.
The histomorphology of the retina is closely related to common human diseases such as glaucoma and macular degeneration, and deep-learning-assisted diagnosis reduces the misdiagnosis rate and supports early disease screening. Retinal vessel segmentation presents several difficulties. Small vessels at the ends of branches are hard to discern by eye. Insufficient or excessive camera illumination leads to an overly bright optic disc region, low contrast, and blurred vessel boundaries. The characteristic tree-like bifurcation structure of retinal vessels is hard to preserve because the vessels are too thin to detect reliably. In this paper, we use a U-net network with a stacked fully convolutional structure to achieve accurate segmentation of retinal vessels. The main work is as follows. First, the original data are preprocessed: the database images are RGB, so to improve segmentation accuracy a channel is first extracted for preprocessing. Second, CLAHE is applied to enhance the contrast of the vascular region. Finally, the data are fed into the network for training. U-net is a modified FCN model that mainly consists of feature extraction and upsampling: the feature extraction path captures contextual information in the image, and the upsampling path recovers location information. Compared with existing algorithms, the proposed algorithm segments retinal vessels more effectively; its sensitivity and accuracy are significantly improved.
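The preprocessing described above (extract a channel, enhance contrast) can be sketched in a few lines. For brevity this sketch uses plain global histogram equalization as a stand-in for CLAHE (CLAHE additionally operates on tiles and clips the histogram), takes the green channel as the one with the best vessel contrast, and runs on a dummy image; none of this is the authors' code.

```python
import numpy as np

def equalize_hist(channel):
    """Global histogram equalization of an 8-bit channel: map values
    through the normalized cumulative histogram. A simplified stand-in
    for the CLAHE step described in the abstract."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]                 # first occupied bin
    lut = np.clip(
        np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255), 0, 255
    ).astype(np.uint8)
    return lut[channel]

# Preprocessing sketch: green channel of an RGB fundus image, equalized.
rng = np.random.default_rng(0)
rgb = rng.integers(60, 120, size=(32, 32, 3), dtype=np.uint8)  # dummy image
green = rgb[:, :, 1]
enhanced = equalize_hist(green)   # values stretched to the full 0-255 range
```

The equalized channel would then be normalized and fed to the U-net for training.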
To identify camouflage materials on military targets, this paper extracts multiple features to study the difference in optical characteristics between natural targets and man-made camouflage materials. Since Fresnel reflection can be regarded as a statistical description of scattering, a multi-angle polarization measurement device is used to measure polarization and scattering characteristics. From the physical meaning of the Mueller-Jones matrix, expressions for the amplitude ratio and phase retardation are derived. Based on Pauli decomposition, a new scattering similarity parameter formula is defined. We discuss the curves of the three characteristic parameters and analyze the difference between natural objects and camouflage materials. The experimental results show that the characteristic curves change significantly at Brewster's angle, which clearly distinguishes the target from the camouflage material.
Infrared and visible images have different imaging principles and contain different information. Fusing them combines the information of both, while the complete edge structure of infrared images guarantees acquisition of image information in harsh and complex environments. This paper therefore proposes an infrared and visible image fusion method based on deep learning. Visible and infrared image pairs are decomposed into high-frequency and low-frequency parts. The low-frequency parts are fused directly with a weighted-average strategy. A ResNet network extracts features from the high-frequency parts of the visible and infrared images. The Fisher discriminant method screens the extracted features, and ZCA whitening is applied to the selected features to further remove redundant information. An initial weight map is obtained by taking the L1 norm of the whitened features, and the final weight map is obtained with a softmax operation. The high-frequency parts of the infrared and visible images are combined according to these weights to obtain the fused high-frequency part, and the high- and low-frequency parts are then added to obtain the final fused image. The experimental results were compared with other methods in terms of both subjective impression and objective indicators, showing that the proposed method produces a more natural fusion effect and holds advantages in objective indicators.
Optical frequency-domain reflectometry and frequency-modulated continuous wave (FMCW)-based sensing technologies, such as LiDAR and distributed fiber sensors, fundamentally rely on the performance of frequency-swept laser sources. Specifically, frequency-sweep linearity, which determines the level of measurement distortion, is of paramount importance. Sweep-velocity-locked semiconductor lasers (SVLLs) controlled via phase-locked loops (PLLs) have been studied for many FMCW applications owing to their simplicity, low cost, and low power consumption. We demonstrate an alternative, self-adaptive laser control system that generates an optimized predistortion curve through PLL iterations. The described self-adaptive algorithm was successfully implemented in a digital circuit. The results show that the phase error of the SVLL improved by around one order of magnitude relative to operation without this method, demonstrating that this self-adaptive algorithm is a viable way of linearizing the output of frequency-swept laser sources.
The accuracy of air target identification is of great significance for air defense operations and civilian management. A fine-grained aerial target recognition model based on a bilayer Faster Regions with Convolutional Neural Network (Faster R-CNN) architecture with feedback is proposed in this paper. Faster R-CNN is a typical deep-learning-based target detection model, but its ability to distinguish categories with subtle differences is limited. In the proposed model, a Faster R-CNN model is first trained to obtain a classification model, and cluster analysis of the classification results identifies confusable categories. The first model is then fine-tuned to retrain on the confusable categories. Tested on the FGVC-Aircraft-2013b dataset, the average training accuracy rises from 88.7% to 89.3% and the classification accuracy from 88.98% to 91.21%, which shows that this model is effective in improving the fine-grained identification of air targets.
Infrared small target detection is one of the key techniques in infrared search and track systems, and its essence is background suppression and target enhancement. Inspired by the fact that the phase spectrum of the Fourier transform has proved more effective than the amplitude spectrum at extracting salient areas, a new infrared small target detection method based on the phase spectrum of the quaternion Fourier transform (PQFT) is proposed in this paper. First, four features, namely intensity, motion, and the horizontal and vertical gradients, are used to construct the quaternion for the PQFT. Then, a target enhancement map that highlights the salient regions in the time domain is computed using the inverse PQFT. Finally, the real target is segmented directly by an adaptive threshold. Both qualitative and quantitative experiments on real infrared sequences evaluate the proposed method, and the results demonstrate that it is more robust and effective in background suppression and target enhancement than other conventional methods.
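The core intuition behind phase-spectrum saliency is that discarding the Fourier magnitude and inverting only the phase concentrates energy at compact, "unexpected" structures such as a small target. The sketch below is a single-channel simplification: the paper builds a quaternion from four features and smooths the reconstructed map, both of which are omitted here for brevity.

```python
import numpy as np

def phase_saliency(img):
    """Phase-only reconstruction: keep the Fourier phase, set the
    magnitude to one everywhere, invert, and square. A one-feature
    simplification of the PQFT saliency map."""
    F = np.fft.fft2(img)
    phase_only = np.exp(1j * np.angle(F))     # unit magnitude, phase kept
    return np.abs(np.fft.ifft2(phase_only)) ** 2

# A small bright blob on a flat background dominates the saliency map.
img = np.zeros((64, 64))
img[30:33, 40:43] = 1.0                       # 3x3 "small target"
sal = phase_saliency(img)
peak = np.unravel_index(int(np.argmax(sal)), sal.shape)  # near (31, 41)
```

In a full pipeline the saliency map would be smoothed and then passed to the adaptive threshold step described above.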
In order to obtain accurate and stable image stitching results, we propose a method for stitching two images captured from different viewpoints based on a correlation transformation. To overcome the limitation of the projective transformation commonly used in image stitching, a transformation called the dual-correlation transformation is proposed in this paper. First, the fundamental matrix is estimated by the direct linear transformation from corresponding points in the two images. Second, according to the presented dual-correlation transformation, a pair of correlation transformation matrices needed for the dual-correlation warp is obtained to realize the correspondence of each pixel across images; at this stage, stitching based on the transformation matrices is accomplished. Finally, an optimization method based on factorization is proposed to solve the discontinuity problem that may occur in the dual-correlation warp. The experimental results and analyses show that the proposed method achieves more accurate and natural stitching and requires less computing time for images of separate scenes compared with other similar methods.
Distributed optical fiber sensors are an increasingly utilized method of gathering distributed strain and temperature data. However, the large amount of data they generate presents a challenge that limits their use in real-time, in-situ applications. This letter describes a parallel and pipelined computing architecture that accelerates the signal-processing speed of sub-terahertz fiber sensor (sub-THz-fs) arrays, maintaining high spatial resolution while enabling expanded use in real-time sensing and control applications. The computing architecture described was successfully implemented in a field programmable gate array (FPGA) chip. The signal processing for the entire array takes only 12 system clock cycles. In addition, this design removes the necessity of storing any raw or intermediate data.
The recently proposed robust principal component analysis (RPCA) theory and its derived methods have attracted much attention in many computer vision and machine intelligence applications. Broadly, these methods model independently moving objects as pixel-wise sparse or structurally sparse outliers against a highly correlated background signal, and all are implemented via ℓ1-penalized optimization. Real-data experiments reveal that even though the ℓ1 penalty is convex, the optimization sometimes cannot be solved satisfactorily, especially when the signal-to-noise ratio is relatively high; moreover, unexpected background motion (e.g., periodic or stochastic motion) may also be included. We propose a moving object detection method based on a proximal RPCA combined with saliency detection. Convex penalties, including the low-rank and sparse regularizations, are substituted with proximal norms to achieve robust regression. After the foreground candidates are extracted, a motion saliency map is constructed using spatiotemporal filtering, and the foreground objects are filtered out by dynamically adjusting the penalty parameter according to the corresponding saliency values. Evaluations on challenging video clips, together with qualitative and quantitative comparisons with several state-of-the-art methods, demonstrate that the proposed approach works efficiently and robustly.
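The low-rank-plus-sparse decomposition at the heart of RPCA is solved with two proximal operators: singular-value thresholding for the nuclear norm and entrywise soft thresholding for the ℓ1 norm. The loop below is a textbook inexact-ALM baseline on synthetic data, shown only to make the decomposition concrete; it is not the proximal variant proposed in the paper, and the parameter choices are illustrative.

```python
import numpy as np

def svt(M, tau):
    """Singular-value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def soft(M, tau):
    """Entrywise soft thresholding: proximal operator of the l1 norm."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

# Synthetic data: rank-5 background plus sparse large-magnitude outliers.
rng = np.random.default_rng(0)
L_true = rng.standard_normal((40, 5)) @ rng.standard_normal((5, 40))
S_true = np.zeros((40, 40))
S_true.flat[rng.choice(1600, 80, replace=False)] = 10.0
D = L_true + S_true

mu = 0.25                       # fixed penalty parameter (illustrative)
lam = 1.0 / np.sqrt(40)         # standard RPCA sparsity weight
L = np.zeros_like(D); S = np.zeros_like(D); Y = np.zeros_like(D)
for _ in range(200):
    L = svt(D - S + Y / mu, 1.0 / mu)     # low-rank update
    S = soft(D - L + Y / mu, lam / mu)    # sparse update
    Y = Y + mu * (D - L - S)              # dual (multiplier) update
err = np.linalg.norm(L - L_true) / np.linalg.norm(L_true)
```

With the background recovered in `L`, the entries of `S` correspond to the foreground candidates that the saliency-based filtering stage would then refine.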
This paper proposes a finite adaptive neighborhood suppression algorithm based on singular value decomposition (SVD) for small target detection in infrared imaging systems. The algorithm first applies SVD to the whole gray image and reconstructs it from the larger singular values, suppressing noise and yielding an image matrix that retains only the weak target point and its possible surroundings. Then, within a fixed neighborhood, the pixels are divided into foreground and background, followed by contrast enhancement. Experimental results show that this method effectively preserves image details and achieves better background suppression.
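The SVD reconstruction step above can be sketched directly: keep the k largest singular values, which carry the correlated image structure, and discard the rest, which mostly carry noise. The synthetic rank-1 "background" and the choice of k here are illustrative only.

```python
import numpy as np

def svd_denoise(img, k):
    """Reconstruct an image from its k largest singular values,
    discarding the small ones that mostly carry noise."""
    U, s, Vt = np.linalg.svd(img, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

# Toy usage: a smooth rank-1 background corrupted by Gaussian noise.
rng = np.random.default_rng(0)
bg = np.outer(np.linspace(0, 1, 64), np.linspace(1, 2, 64))
noisy = bg + rng.normal(0, 0.05, bg.shape)
den = svd_denoise(noisy, k=3)   # closer to bg than the noisy input
```

The denoised matrix would then be passed to the neighborhood-based foreground/background split and contrast enhancement described above.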
Moving small target detection in infrared images is a crucial technique in infrared search and tracking systems. This paper presents a novel small target detection technique based on frequency-domain saliency extraction and sparse image representation. First, we exploit the Fourier spectrum image and the magnitude spectrum of the Fourier transform to roughly extract saliency regions, and use threshold segmentation to separate the regions that look salient from the background, yielding a binary image. Second, a new patch-image model and an over-complete dictionary are introduced, converting infrared small target detection into an optimization problem of patch-image reconstruction based on sparse representation. More specifically, the test image and the binary image are decomposed into image patches following certain rules; we select potential target areas according to the binary patch-image, which contains the salient region information, and then use the over-complete infrared small target dictionary to reconstruct the test image blocks that may contain targets, whose coefficients satisfy the sparsity constraint. Finally, for image sequences, the Euclidean distance between target positions in consecutive frames is used to reduce the false alarm rate and increase the detection accuracy of moving small targets in infrared images.
In the analysis of neural cell images acquired by optical microscopy, accurate and rapid segmentation is the foundation of a nerve cell detection system. In this paper, a modified image segmentation method based on a support vector machine (SVM) is proposed to reduce the adverse impact of the low contrast between objects and background and of interference from adherent and clustered cells. First, morphological filtering and Otsu's method are applied to preprocess the images and roughly extract the neural cells. Second, stellate vector, circularity, and histogram of oriented gradients (HOG) features are computed to train the SVM model. Finally, an incremental-learning SVM classifier is used to classify the preprocessed images, and the initial recognition areas it identifies are added to the library as positive samples for training the SVM model. Experimental results show that the proposed algorithm achieves much better segmentation than classic segmentation algorithms.
CFAR (constant false alarm rate) processing is a key technology in infrared dim-small target detection systems. The traditional CFAR detection algorithm estimates a probability density distribution from the pixel information of each area in the whole image and computes each area's target segmentation threshold from the CFAR formula; this makes the probability distribution statistics difficult, the computation heavy, and the delay long. To solve these problems effectively, a CFAR formula based on the distribution of target coordinates is presented. First, this paper improves the traditional CFAR formula, which is based on a single grayscale distribution, by introducing the statistical distribution features of the targets. False alarm control based on the target distribution information is thus implemented more accurately, and the high false alarm rates caused by complex local backgrounds, such as cloud reflections and ground clutter interference, are reduced. Second, to reduce the computational load and improve real-time performance, the CFAR statistical area is divided adaptively through the two-dimensional probability density distribution of the target count, unlike general methods of identifying the CFAR statistical area. Finally, the target segmentation threshold for the next frame is computed iteratively from the target distribution probability density over the image sequence, controlling the false alarms until they fall below the upper limit. The experimental results show that the proposed method significantly improves the running time and meets real-time requirements while maintaining target detection performance.
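For context on what a CFAR threshold does, the sketch below implements textbook one-dimensional cell-averaging CFAR, not the coordinate-distribution variant proposed above: each cell is compared against a threshold scaled from the mean of surrounding training cells, with guard cells excluded so the target does not contaminate its own noise estimate. The signal, guard/training sizes, and scale factor are all illustrative.

```python
import numpy as np

def ca_cfar(signal, guard=2, train=8, scale=5.0):
    """1-D cell-averaging CFAR: a cell is a detection if it exceeds
    `scale` times the mean of the `train` cells on each side, skipping
    `guard` cells adjacent to the cell under test."""
    n = len(signal)
    hits = np.zeros(n, dtype=bool)
    half = guard + train
    for i in range(half, n - half):
        window = np.r_[signal[i - half:i - guard],
                       signal[i + guard + 1:i + half + 1]]
        if signal[i] > scale * window.mean():
            hits[i] = True
    return hits

# Toy usage: exponential clutter with two injected point targets.
rng = np.random.default_rng(0)
noise = rng.exponential(1.0, 200)   # clutter/noise floor, mean 1
noise[60] += 40.0                   # injected target 1
noise[140] += 35.0                  # injected target 2
hits = ca_cfar(noise)
detected = np.flatnonzero(hits)     # should include cells 60 and 140
```

Because the threshold adapts to the local noise level, the false alarm rate stays roughly constant as the clutter floor changes, which is the property the coordinate-distribution formulation above sets out to preserve at lower computational cost.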
Considering the complex features of public places, such as massive passenger flow, congestion, and disorder, it is hard to count passengers precisely. In this paper, a passenger counting method based on range images is proposed. The system uses a Kinect sensor to acquire 3D depth information. First, the range image is smoothed with the mean shift algorithm, which shifts each local pixel toward the point of maximal probability density, so that the smoothed range image is better suited to subsequent processing. Second, a classical dynamic threshold segmentation method is applied to segment the head regions, and the 3D characteristics of the heads are analyzed; heads are differentiated by pixel width, area, and circle-like shape, which efficiently surpasses the limits of 2D images. In addition, a self-adaptive multi-window tracing method is applied to predict the possible trajectories, speeds, and positions of multiple windows, establishing tracing chains of multiple targets and locking onto the traced targets precisely. This method proves efficient for background noise removal and environmental disturbance suppression, and can be applied to identify and count heads in public places.