This PDF file contains the front matter associated with SPIE Proceedings Volume 12155, including the Title Page, Copyright information, and Table of Contents.
Access to the requested content is limited to institutions that have purchased or subscribe to SPIE eBooks. You are receiving this notice because your organization may not have SPIE eBooks access.*
*Shibboleth/OpenAthens users: please sign in to access your institution's subscriptions.
To obtain this item, you may purchase the complete book in print or electronic format on SPIE.org.
Noise degrades digital image quality to varying degrees. To obtain high-quality images for subsequent research, researchers have proposed a series of denoising algorithms based on wavelet transforms, non-local means, and partial differential equations. Removing noise while preserving the edges and details of the image has attracted wide attention. Methods based on anisotropic diffusion models have recently gained popularity, but they tend to over-smooth image details. In this paper, we propose an improved denoising algorithm based on the anisotropic diffusion model. Our method further modifies the diffusion coefficient of the denoising model based on the fractional differential operator and Gauss curvature (FDOGC). We use the edge-preserving characteristic of bilateral filtering to recover image texture, and adjust the diffusion coefficient according to the local variance. To balance denoising and edge preservation, we add a regularization term to the diffusion model. Ablation studies verify the effectiveness of each innovation. Our method can adjust the trade-off between noise removal and edge preservation. Extensive experiments on public standard datasets demonstrate the superiority of our algorithm in both quantitative evaluation and visual quality.
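The paper's FDOGC modifications to the diffusion coefficient are not reproduced in the abstract, but the anisotropic diffusion scheme it builds on is the classic Perona-Malik update, where an edge-stopping function suppresses smoothing across strong gradients. A minimal sketch on a small hypothetical grayscale grid (not the authors' implementation):

```python
import math

def perona_malik_step(img, kappa=30.0, dt=0.2):
    """One explicit Perona-Malik diffusion step on a 2D grayscale grid.

    g(|grad I|) = exp(-(|grad I| / kappa)^2) is the edge-stopping diffusion
    coefficient: diffusion is suppressed where gradients (edges) are large,
    which is what preserves edges while flat regions get smoothed.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            c = img[y][x]
            total = 0.0
            # 4-neighbour finite differences with replicated borders
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny = min(max(y + dy, 0), h - 1)
                nx = min(max(x + dx, 0), w - 1)
                grad = img[ny][nx] - c
                g = math.exp(-(grad / kappa) ** 2)  # diffusion coefficient
                total += g * grad
            out[y][x] = c + dt * total
    return out

# A flat region with one noisy spike: diffusion pulls the spike down
noisy = [[10.0] * 5 for _ in range(5)]
noisy[2][2] = 60.0
smoothed = perona_malik_step(noisy)
assert smoothed[2][2] < 60.0             # spike is attenuated
assert abs(smoothed[0][0] - 10.0) < 1.0  # flat areas barely change
```

The FDOGC model replaces this scalar g with a coefficient driven by fractional derivatives and Gauss curvature, but the update structure is the same.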
A camera trap is a digital camera automatically triggered by nearby activity; camera traps have been widely used in wildlife conservation for decades to capture animals on film for later analysis. With the growing volume of vision data (photos and videos) from camera traps in recent years, manually extracting useful information from these data has become prohibitively costly. In this project, I aim to help automate knowledge extraction from camera trap data with deep learning models. Specifically, a popular convolutional neural network (CNN) architecture, YOLOv3, was used as the pre-trained model for transfer learning. The model was then fine-tuned on thousands of camera trap images that I obtained from a crowdsourced Zooniverse dataset and subsequently labeled using an object tagging tool. Compared with previous work on wildlife recognition, my model further performs wildlife detection: in addition to identifying the species, it locates each detected animal and draws a bounding box around it. The trained model was applied to wildlife photos I took myself, and the results show that it can accurately and confidently classify and locate multiple animals in both photos and real-time video.
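YOLOv3 itself is far too large to reproduce here, but whether a predicted bounding box counts as a correct detection is conventionally judged by intersection-over-union (IoU) against the labeled box. A minimal sketch with hypothetical box coordinates:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / float(area_a + area_b - inter)

# Hypothetical ground-truth and predicted boxes for one animal
gt = (50, 50, 150, 150)
pred = (60, 60, 160, 160)
assert iou(gt, pred) > 0.5                  # a match at the usual 0.5 threshold
assert iou(gt, (300, 300, 400, 400)) == 0.0  # disjoint boxes never match
```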
In this study, a typical branch gully in the Wangmaogou watershed of the Loess Plateau was selected as the research object, combining a consumer UAV with 1:500 tilt photogrammetry. The flight control software GS RTK App was used to plan a zigzag course and simulate the effect achieved by a multi-lens tilt camera, and tilt and orthographic images were obtained with the help of ground control points. Pix4D, Smart3D, and other software were used to construct a high-resolution 3D model of the erosion gully, as well as a DOM and DEM. A detailed evaluation of the results shows that both the horizontal and elevation errors of the two methods meet the requirements of the Internal Practice Specification for Low Altitude Digital Aerial Photogrammetry (GH/Z 3003-2010). At the same time, the resulting point cloud is denser and more uniform than that of RTK manual measurement, which solves the problem of reduced measurement accuracy in areas inaccessible to people due to the complex and steep terrain of the erosion gully. Applied appropriately, this method can replace hand-held RTK manual measurement, and its convenience, maneuverability, and accuracy give it the potential for wide application in watershed erosion monitoring.
A laser point cloud has high density and a large amount of data, which makes coarse registration time-consuming and its accuracy unstable. Fine registration takes the transformation parameters obtained from coarse registration as initial values and typically uses the standard Iterative Closest Point (ICP) algorithm to find corresponding points and iteratively optimize the transformation parameters. To improve the accuracy and robustness of laser point cloud registration, this paper proposes to use the 3D Difference-of-Gaussians (DoG) operator to extract key points with curvature invariance, then feed the key-point cloud into the 4-Points Congruent Sets (4PCS) algorithm for coarse registration, and finally apply the standard ICP algorithm for fine registration. Registration experiments on three datasets verify the effectiveness of the method.
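The DoG and 4PCS stages are paper-specific, but the ICP fine-registration step has a well-known structure: alternate nearest-neighbour matching with a closed-form rigid fit until convergence. A minimal 2D sketch (the paper works in 3D; 2D keeps the closed-form rotation to a single atan2):

```python
import math

def best_rigid_2d(src, dst):
    """Closed-form 2D rotation + translation aligning paired points src -> dst."""
    n = len(src)
    csx = sum(p[0] for p in src) / n; csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n; cdy = sum(p[1] for p in dst) / n
    sxx = sxy = 0.0
    for (x, y), (u, v) in zip(src, dst):
        ax, ay = x - csx, y - csy
        bx, by = u - cdx, v - cdy
        sxx += ax * bx + ay * by   # cosine accumulator
        sxy += ax * by - ay * bx   # sine accumulator
    theta = math.atan2(sxy, sxx)
    c, s = math.cos(theta), math.sin(theta)
    return theta, cdx - (c * csx - s * csy), cdy - (s * csx + c * csy)

def icp_2d(src, dst, iters=20):
    """Minimal ICP: nearest-neighbour matching + closed-form alignment."""
    pts = list(src)
    for _ in range(iters):
        matched = [min(dst, key=lambda q: (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)
                   for p in pts]
        th, tx, ty = best_rigid_2d(pts, matched)
        c, s = math.cos(th), math.sin(th)
        pts = [(c * x - s * y + tx, s * x + c * y + ty) for x, y in pts]
    return pts

# Target shape and a rotated/translated copy of it (a decent initial guess,
# which is exactly what the 4PCS coarse stage is meant to provide)
dst = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)]
th0 = 0.3
src = [(math.cos(th0) * x - math.sin(th0) * y + 0.2,
        math.sin(th0) * x + math.cos(th0) * y - 0.1) for x, y in dst]
aligned = icp_2d(src, dst)
err = max(math.hypot(a[0] - b[0], a[1] - b[1]) for a, b in zip(aligned, dst))
assert err < 1e-6
```

Without the coarse stage, ICP started far from the solution falls into local minima, which is why the pipeline runs 4PCS first.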
The Semi-Global Matching (SGM) algorithm is a conventional method in dense stereo matching and provides acceptable results. Nevertheless, low accuracy and slow computation have been crucial factors restricting the processing of larger images. Meanwhile, repeated similar textures, which appear frequently in remote sensing images, often cause matching failures. This paper therefore presents RS-rSGM, a method that stratifies and refines the disparity search space through an SGM pyramid and local invariant features, improving computational efficiency, reducing the memory footprint, and shrinking the influence of similar textures. Experimental results indicate that RS-rSGM efficiently improves speed and reduces computation time on high-resolution images containing many similar textures.
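The core of any SGM variant is cost aggregation along scanline directions with the smoothness penalties P1 and P2. A minimal sketch of one aggregation direction on a hypothetical cost volume (the pyramid and search-space refinement of RS-rSGM are not shown):

```python
def sgm_aggregate_1d(costs, p1=1.0, p2=4.0):
    """SGM cost aggregation along one scanline direction.

    costs[x][d] is the matching cost of pixel x at disparity d.
    L(x, d) = C(x, d) + min(L(x-1, d),
                            L(x-1, d-1) + P1, L(x-1, d+1) + P1,
                            min_k L(x-1, k) + P2) - min_k L(x-1, k)
    P1 penalises small disparity changes, P2 larger jumps.
    """
    n, d_max = len(costs), len(costs[0])
    agg = [row[:] for row in costs]
    for x in range(1, n):
        prev = agg[x - 1]
        prev_min = min(prev)
        for d in range(d_max):
            candidates = [prev[d], prev_min + p2]
            if d > 0:
                candidates.append(prev[d - 1] + p1)
            if d + 1 < d_max:
                candidates.append(prev[d + 1] + p1)
            agg[x][d] = costs[x][d] + min(candidates) - prev_min
    return agg

# Hypothetical scanline: pixel 2 is ambiguous (flat costs), while its
# neighbours clearly prefer disparity 1; aggregation propagates that choice.
costs = [[5, 0, 5], [5, 0, 5], [2, 2, 2], [5, 0, 5]]
agg = sgm_aggregate_1d(costs)
best = min(range(3), key=lambda d: agg[2][d])
assert best == 1   # the ambiguous pixel now resolves to disparity 1
```

This propagation is also why similar textures hurt SGM: when many pixels along a path are ambiguous, there is no confident neighbour to propagate from, which motivates shrinking the disparity search space as RS-rSGM does.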
Very deep convolutional neural networks (CNNs) have shown great power in image compressed sensing (CS) reconstruction and achieved significant improvements over traditional methods. Among these CNN-based methods, the number of convolutional feature maps is critical to network performance. However, existing algorithms only apply average weighting to feature maps and do not make full use of the differences between image features to adaptively assign feature weights. To address this issue, we propose an attention mechanism network for image compressed sensing reconstruction (AM-CSNet). AM-CSNet uses multiple attention modules (AM) to adaptively learn feature weights in the channel and spatial dimensions, which makes the model more lightweight and efficient. To maximize the performance of AM-CSNet, we use a Residual Feature Aggregation Group (RFAG) to fully retain the features on different residual branches. Extensive CS experiments demonstrate that the proposed AM-CSNet is superior to many other state-of-the-art methods, such as TIP-CSNet and SCSNet.
We propose a new method based on Functional Maps to estimate dense point-pair correspondences between single-view 3D human point clouds and template point clouds. Most existing work estimates correspondences from triangle mesh information; our innovation is to work directly on the point clouds. Because single-view point clouds lack full human body information, existing methods cannot effectively find correspondences. First, the template is used to complete the missing body information and obtain a full human structure, so that the Laplace-Beltrami operator (LBO) can be computed effectively. Then, features are extracted with a deep learning method, and geometric information is converted to spatial information. Finally, a linear function map is computed to characterize the dense point-to-point correspondence.
Stereo matching usually consists of four steps: cost computation, cost aggregation, disparity optimization, and disparity refinement. Disparity refinement further eliminates mismatches caused by occlusion, low texture, and other factors. Popular refinement methods are based on a consistency check between the left and right disparity maps. For efficiency, we propose a novel multistep disparity refinement framework that uses only a one-sided image, organized into four main steps: leftmost occlusion detection, four-directional scanline outlier detection, black hole detection, and eight-directional disparity propagation. Experimental results on Middlebury datasets show that our method is comparable with other post-processing strategies, especially in handling occlusions, retaining object shapes, and preserving discontinuities.
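For context, the two-sided baseline the paper improves on is the left-right consistency check: a left-image pixel with disparity d should map to a right-image pixel reporting (approximately) the same disparity. A minimal one-scanline sketch with hypothetical disparity values:

```python
INVALID = -1

def lr_consistency_check(disp_left, disp_right, tol=1):
    """Invalidate left-disparity pixels whose right-image counterpart disagrees.

    A pixel x with disparity d maps to x - d in the right image; if the right
    disparity there differs by more than tol, the pixel is likely an occlusion
    or a mismatch and is marked for later filling.
    """
    out = []
    for x, d in enumerate(disp_left):
        xr = x - d
        if 0 <= xr < len(disp_right) and abs(disp_right[xr] - d) <= tol:
            out.append(d)
        else:
            out.append(INVALID)
    return out

# One scanline; pixel 3 is a mismatch (left says 2, its counterpart says 0)
left = [0, 0, 0, 2, 0]
right = [0, 0, 0, 0, 0]
checked = lr_consistency_check(left, right)
assert checked == [0, 0, 0, INVALID, 0]
```

The proposed framework avoids computing the second disparity map entirely, replacing this check with one-sided occlusion and outlier detection.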
When studying machine vision for engineering machinery, we found that target detection algorithms have a low detection rate on night-scene photos. To solve this problem, images are enhanced before entering the neural network. This paper compares common gray-transformation image enhancement methods: nonlinear transformation, linear transformation, logarithmic transformation, and contrast stretching. By adjusting parameters, each of the four algorithms was tuned to its best result on pictures taken in dark environments. The comparative experiments show that images processed with contrast stretching are most similar to images taken in ordinary daylight, and can be easily recognized by the target detection algorithm, achieving the research goal.
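Contrast stretching, the winner of the comparison, is a linear remap of the observed intensity range onto the full dynamic range. A minimal sketch on a hypothetical dark histogram:

```python
def contrast_stretch(pixels, lo=0, hi=255):
    """Linearly stretch pixel intensities to span [lo, hi]."""
    p_min, p_max = min(pixels), max(pixels)
    if p_max == p_min:
        return [lo] * len(pixels)   # constant image: nothing to stretch
    scale = (hi - lo) / float(p_max - p_min)
    return [round(lo + (p - p_min) * scale) for p in pixels]

# A night-time image crowded into 10..60 is spread over the full 0..255 range
dark = [10, 20, 30, 40, 60]
assert contrast_stretch(dark) == [0, 51, 102, 153, 255]
```

In practice the stretch endpoints are often taken from low/high percentiles rather than the absolute min/max, so single hot pixels do not dominate the mapping.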
As is well known, COVID-19 continues to cause human infections and deaths. To detect COVID-19 quickly and efficiently, this paper first proposes a detection framework based on reinforcement learning for COVID-19 diagnosis. We use the accuracy on the validation set as the reward value, and obtain the initial model for the next epoch by selecting the model with the maximum reward in each epoch. We also propose a prediction framework that integrates multiple detection frameworks through parameter sharing to predict the progression of patients' disease. We experimented with our own dataset, screened by professional physicians, and obtained excellent results. In external validation, we still achieved high accuracy without additional training. The experimental results show that our classification accuracy reaches 96.81%, with precision, sensitivity, specificity, and AUC (Area Under Curve) of 95.47%, 98.64%, 95.91%, and 0.9698, respectively. External validation accuracies reach 93.04% and 90.85%, and the accuracy of our prediction framework is 91.04%. Extensive experiments prove that the proposed method is effective and robust for COVID-19 detection and prediction.
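The metrics reported above all derive from the binary confusion matrix. A minimal sketch of their definitions, using a hypothetical confusion matrix rather than the paper's actual counts:

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity, specificity from a confusion matrix."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # recall on positive (e.g. COVID) cases
    specificity = tn / (tn + fp)   # recall on negative cases
    return accuracy, precision, sensitivity, specificity

# Hypothetical test split: 105 positive and 95 negative cases
acc, prec, sens, spec = binary_metrics(tp=95, fp=5, tn=90, fn=10)
assert abs(acc - 0.925) < 1e-9
assert abs(prec - 0.95) < 1e-9
assert abs(sens - 95 / 105) < 1e-9
assert abs(spec - 90 / 95) < 1e-9
```

Sensitivity and specificity are reported separately from accuracy because, on imbalanced medical datasets, accuracy alone can hide a high false-negative rate.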
To improve the accuracy of volume calculation for outdoor irregular objects, a signed projection method based on a reference plane is proposed to determine the reference plane of the projection method. First, the Cloth Simulation Filter (CSF) is used to filter the 3D point cloud model and obtain the ground point cloud; next, the RANSAC algorithm fits the ground point cloud to obtain the reference plane; then, the signed projection method computes the model volume; finally, the true height of the object is calculated with the EPnP algorithm and used as a scale. In the experiments, an iron box is modeled and its volume calculated. Compared with the slicing method and the traditional projection method, the proposed approach is better in both accuracy and running time.
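Once the reference plane is fitted, the signed projection reduces to summing signed prism volumes over a gridded height field: cells below the plane subtract volume instead of being clamped to zero, which is what distinguishes it from the traditional projection method. A minimal sketch with a hypothetical height grid (the CSF, RANSAC, and EPnP stages are omitted):

```python
def signed_projection_volume(heights, cell_area):
    """Volume of an object sampled as signed heights above a reference plane.

    heights[i][j] is the height of grid cell (i, j) measured along the plane
    normal; cells below the plane contribute negatively, which is the point
    of the *signed* projection method.
    """
    return cell_area * sum(h for row in heights for h in row)

# Hypothetical 3x3 grid over a box about 2 m tall on uneven ground,
# with one cell dipping 0.5 m below the fitted reference plane
heights = [[2.0, 2.0, 2.0],
           [2.0, 2.0, 2.0],
           [2.0, 2.0, -0.5]]
vol = signed_projection_volume(heights, cell_area=0.25)  # 0.5 m grid spacing
assert abs(vol - (8 * 2.0 - 0.5) * 0.25) < 1e-9
```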
Network pruning has achieved great success in compressing and accelerating neural networks on resource-limited devices. Previous pruning algorithms apply filter pruning or channel pruning with a specific global or local pruning rate. Methods that consider only a global pruning rate ignore the individual characteristics of each layer; conversely, methods that consider only local pruning rates can lead to fragmented connections between layers. In this paper, we propose a novel method named global and local pruning under knowledge distillation (GLKD), which combines filter pruning and channel pruning and is trained with a mixture of global and local pruning rates. GLKD accelerates the inference of ResNet-110 to a 56.2% speed-up with a 0.17% accuracy increase on the CIFAR-100 dataset, a strong trade-off between accuracy and compression. Additionally, experiments with ResNet-56 and ResNet-110 on ImageNet demonstrate its effectiveness on compressed models. Moreover, knowledge distillation is adopted in the pruning step of GLKD and improves the accuracy of the pruned network.
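GLKD's specific criterion is not given in the abstract, but the standard filter-pruning building block ranks a layer's filters by L1 norm and drops the weakest. A minimal sketch with a hypothetical global budget knob standing in for the mixed global/local rates:

```python
def prune_filters(filters, local_rate, global_budget=None):
    """Rank filters by L1 norm and keep the strongest ones.

    filters: list of flattened weight lists for one layer.
    local_rate: fraction of this layer's filters to drop.
    global_budget: optional cap on the kept count; a crude stand-in for
    combining a global pruning rate with the per-layer (local) rate.
    """
    norms = [sum(abs(w) for w in f) for f in filters]
    keep = len(filters) - int(len(filters) * local_rate)
    if global_budget is not None:
        keep = min(keep, global_budget)
    order = sorted(range(len(filters)), key=lambda i: norms[i], reverse=True)
    return sorted(order[:keep])   # indices of surviving filters

# Four hypothetical filters; dropping 50% locally keeps the two largest-norm
layer = [[0.1, -0.1], [2.0, 1.0], [0.0, 0.05], [-1.5, 0.5]]
assert prune_filters(layer, local_rate=0.5) == [1, 3]
assert prune_filters(layer, local_rate=0.5, global_budget=1) == [1]
```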
In traditional self-supervised visual feature learning, convolutional neural networks (ConvNets) trained on a pretext task with only unlabeled data encode high-level semantic visual representations for downstream tasks of interest. Proposed pretext tasks are mostly based on images or videos. In this work, starting from the feature layers, we propose a completely new pretext task formulated within the ConvNet itself, and use it to enhance supervised learning on fully labeled datasets. We discard channels of the feature maps after particular convolutional layers to generate self-supervised labels, and combine them with the original labels for classification. Our objective is to mine richer feature information by making the ConvNet understand which channels are missing while it classifies. Experiments show that our improvement is effective across multiple models and datasets.
When traditional image recoloring algorithms recolor local objects in an image, other regions with similar colors are also changed and color leakage occurs. This paper presents a local recoloring algorithm based on image segmentation. First, the proposed algorithm segments the target region using an improved GrabCut segmentation algorithm based on bilateral filtering, which smooths the edge of the target region, solves the under-segmentation problem, and reduces edge jaggedness in the segmentation result. Second, an image recoloring algorithm recolors the segmented target region. Finally, the target region is superimposed on the original image to realize local recoloring. Experimental and theoretical analysis show that, compared with traditional image recoloring algorithms, our algorithm solves the color-leakage problem and achieves local recoloring. The proposed algorithm also shows a significant improvement in structural similarity (SSIM) over other image recoloring algorithms.
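The bilateral filter used to improve GrabCut smooths within regions while leaving strong color boundaries intact, because each neighbour is weighted by intensity similarity as well as spatial distance. A minimal 1D sketch of that edge-preserving property:

```python
import math

def bilateral_filter_1d(signal, sigma_s=2.0, sigma_r=10.0, radius=2):
    """1D bilateral filter: neighbours are averaged with weights that decay
    with both spatial distance and intensity difference, so small noise is
    smoothed while large steps (edges) are preserved."""
    out = []
    for i, v in enumerate(signal):
        num = den = 0.0
        for j in range(max(0, i - radius), min(len(signal), i + radius + 1)):
            w = (math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2)) *
                 math.exp(-((v - signal[j]) ** 2) / (2 * sigma_r ** 2)))
            num += w * signal[j]
            den += w
        out.append(num / den)
    return out

# A step edge with small noise: the noise is smoothed, the 0/100 edge survives
step = [0, 2, 0, 100, 98, 100]
f = bilateral_filter_1d(step)
assert f[2] < 10   # left side stays near 0
assert f[3] > 90   # right side stays near 100; the edge is not blurred
```

A Gaussian filter with the same spatial sigma would pull f[2] and f[3] toward each other, which is exactly the edge blurring the segmentation step needs to avoid.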
This paper presents vision-only safe-zone detection and tracking methods for landing spacecraft on celestial bodies. A digital elevation model is used to generate a lunar surface image dataset, and a modified residual-convolutional neural network is trained to extract craters from binary images. For image sequences, safe landing areas without craters are recognized using Hough detection. Furthermore, when the camera loses the detected safe zone because of a violent shake and the safe zone later moves back into view, our method can recognize it again, as at the beginning. Experiments show that our method improves the accuracy of crater identification and can detect safe landing areas in image sequences; in the case of large camera movement, it provides robust tracking results.
Nowadays, people are gradually entering the digital information age. The rapid development of information technology has brought great challenges to people's lives and work, and people are now inseparable from this information age: in all walks of life, information brings convenience. Gradually, people began to connect computer technology with digital media, and the combination of the two has produced startling breakthroughs; this is also a direction of social development research. Through breakthroughs in applying computer vision art to digital media art, creators can skillfully use innovative computing, combined with the characteristics of digital art, to present digital media art through more polished and more innovative products, pushing the connections between existing technologies to the extreme. This paper discusses the application of computer vision art in digital media art, combining computer vision art with digital media art.
In recent years, deep convolutional features have been deployed in discriminative correlation filters (DCF) to boost object tracking performance. However, features captured from pre-trained classification networks are usually trained for image classification tasks, not object tracking. In this paper, we find that different convolutional feature channels play different roles when tracking different targets: some channels are favorable for tracking a given target and can be identified from that target, some are irrelevant, and some can be the primary cause of tracker performance degradation. Thus, we perform feature selection before learning the correlation filters, realizing the feature selection module with reinforcement learning. We penalize channels that do not contribute positively, obtaining a DCF tracker based on positive convolutional feature channels. Compared with DCF-based trackers without feature selection, our scheme improves the robustness of the target representation, reduces the dimensionality of the activations, and achieves better tracking performance. Extensive experiments on the OTB dataset demonstrate that our feature selection scheme is simple, robust, and effective for DCF-based trackers.
In this paper, several neural networks corresponding to the feature space were formed by a Boosting-method variant and RBF neural networks based on particle swarm optimization (PSO), and these networks were ensembled to produce classification information for CAD three-dimensional (3D) models. In CAD 3D model retrieval, the distance between classifier outputs and the distance in the feature space are combined with weights, which considers not only the differences between model content and features but also, through the added classification parameters, the semantic classification information of the models. Experimental results show that the classification method based on the neural network ensemble can effectively improve the classification accuracy of CAD 3D models while accounting for both the feature-space distance between models and their distance at the semantic classification level, so that the accuracy of 3D CAD model retrieval is greatly improved.
This article discusses in detail the development, processes, and stages of OCR, recently published frameworks aimed at handwritten scripts in various languages, and analyses of the algorithms and recognition methods that use different machine learning approaches. Some challenges of recognizing handwritten words during OCR are also covered.
Vehicle detection based on remote sensing images, as a new way to collect traffic flow information, provides new ideas for traffic management. A feature-fusion-based convolutional neural network vehicle detection method is proposed. After image preprocessing, the VGG16 convolutional neural network first extracts multi-level features; variable-scale stacking then produces the basic feature layers, acquiring deep convolutional features; a feature pyramid is constructed over the basic feature layers; and finally an attention mechanism fuses the hierarchical information to extract vehicle features efficiently. In an automatic vehicle detection experiment on high-resolution remote sensing images, the detection accuracy was 88.7% and the false detection rate 1.4%. The experiment shows that this model performs well for automatic vehicle detection in high-resolution remote sensing images, with especially good results in dense urban traffic scenes.
It has become a trend to exploit the advantages of multiple platforms to cooperatively solve the design problems of complex products. To handle the issue that assembly models lose their assembly constraints and modeling logs after being exported in a neutral format from a CAD/CAM platform, and to enhance the editability of neutral-format models loaded into another system, this paper presents a method, in accordance with ISO 10303-109 (which describes the kinematics and geometrical constraints of assembly models), that marks geometrical constraints with PMI modules and records the assembly models and constraints in STEP+XML files, guaranteeing the integrity of the assembly information when the models are reconstructed or exported. To verify the feasibility of this method, an executable module based on NX was developed, and the data structure of the XML files used to read and write the assembly constraints was designed and implemented. Test results on four kinds of cases show that this method can effectively help designers avoid the loss of constraint information in assembly models under cross-platform cooperation.
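Recording constraints in an XML sidecar next to the STEP geometry can be sketched with the standard library. The element and attribute names below are illustrative only, not the paper's actual schema; ISO 10303-109 defines the constraint semantics, not this XML layout:

```python
import xml.etree.ElementTree as ET

def constraints_to_xml(constraints):
    """Serialize assembly constraints as an XML fragment alongside a STEP file.

    Each constraint records its type (e.g. coaxial, coincident) and the two
    parts it relates, so the receiving system can rebuild the relations
    instead of getting bare, unconstrained geometry.
    """
    root = ET.Element("AssemblyConstraints")
    for c in constraints:
        e = ET.SubElement(root, "Constraint", type=c["type"])
        ET.SubElement(e, "PartA").text = c["part_a"]
        ET.SubElement(e, "PartB").text = c["part_b"]
    return ET.tostring(root, encoding="unicode")

xml_text = constraints_to_xml([
    {"type": "coaxial", "part_a": "shaft", "part_b": "bearing"},
    {"type": "coincident", "part_a": "flange", "part_b": "housing"},
])
# Round-trip: constraints can be read back when the model is reloaded
parsed = ET.fromstring(xml_text)
assert [c.get("type") for c in parsed] == ["coaxial", "coincident"]
assert parsed[0].find("PartA").text == "shaft"
```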
Mainstream segmentation methods may fail to separate boundaries in the presence of adhesion and overlap, so further post-processing is required for segmentation tasks in different applications to obtain better performance. This paper proposes a post-segmentation processing method to solve the problem of boundary adhesion and overlap. The method relies on morphological characteristics, using graphical rules to detect the target contour, find the adhesion area, and decide where to separate. The proposed method is verified on the prediction masks of ore images: an improved U-Net model is first applied for ore image segmentation, and the proposed method is then applied for better boundary separation. The improved U-Net model segments the ores at pixel level according to their types and outputs prediction masks, but adhesions and overlaps remain among the ores in these masks; our work strives to separate the adhered ores. The experimental results show that our method is effective at boundary separation.
In this work, we propose an effective self-supervised method for document image binarization, based on the image's second-order central moment and a multi-scale convolutional neural network (CNN). It effectively binarizes document images by addressing degradation issues such as uneven illumination, ink stains, and fading. We first remove noticeable noise and perform data normalization in a preprocessing step. A pseudo-binarization image is then generated by the second-order central moment algorithm, and a multi-scale self-supervised network is used to distinguish the foreground (characters) from the degraded background. We combine traditional image processing with self-supervised networks to ensure the efficiency and effectiveness of the method while improving its generalization across multiple datasets. Extensive experiments show that the proposed model performs best on the DIBCO benchmarks.
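The second-order central moment of the intensity distribution is its variance, and a simple moment-driven split thresholds at the mean plus a multiple of the standard deviation. A minimal stand-in for the pseudo-binarization step (the paper's exact formulation is not given in the abstract, so both the rule and the pixel values here are assumptions):

```python
import math

def moment_threshold(pixels, k=0.0):
    """Threshold grayscale intensities at mean + k * std.

    The second-order central moment (variance) of the intensities drives the
    split between bright paper and dark ink; k is a tuning offset.
    """
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n   # second central moment
    t = mean + k * math.sqrt(var)
    return [0 if p < t else 255 for p in pixels]

# Dark ink strokes (~30) on bright, slightly stained paper (~200)
page = [200, 195, 30, 205, 25, 198, 35, 202]
assert moment_threshold(page) == [255, 255, 0, 255, 0, 255, 0, 255]
```

In the proposed pipeline this rough pseudo-label is only a starting point; the multi-scale network learns to correct it on degraded regions where a single global threshold fails.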
The mission of face anti-spoofing is to prevent facial fraud techniques from creating security vulnerabilities in face recognition systems and to improve system security and surveillance capabilities. With the widespread use of deep learning, face anti-spoofing methods have also changed dramatically. Following the chronological development of face anti-spoofing technology and its key technical challenges, this paper reviews face anti-spoofing algorithms from two aspects: traditional methods and deep learning-based methods. First, based on an extensive reading of the literature, this study analyzes traditional face anti-spoofing methods from the perspectives of action-command liveness detection based on motion information and heuristic algorithms, liveness detection based on vital signs, and 3D face methods. Second, this paper analyzes deep learning-based face anti-spoofing from the perspectives of network structures and their variants, dual-stream training strategies, context features, and deep spatial and temporal information. Third, the common face anti-spoofing datasets are introduced, and the performance of representative algorithms in this field is compared and analyzed in detail. Finally, the article summarizes the open problems and predicts future research directions in face anti-spoofing.
From structural characteristics, water quality, ecological environment, human influence, and social function, this paper selected 25 indexes to establish an assessment system. Based on entropy-weight TOPSIS, a ranking analysis was performed to evaluate the strengths and weaknesses of each lake's remediation at the object layer and the system layer. Then, cluster analysis based on a modified DBSCAN was executed to categorize the remediation level of the lakes. Twenty-six lakes in the Hangjiahu Plain and Xiaoshao Plain were studied with the above methods. The results indicated that: 1) significant differences among lakes were detected at both the object layer and the system layer; Xiaozhu Lake was optimal and north Xiangfu Lake was worst, with a 4.45-fold difference; 2) correlation analysis showed that structural characteristics, water quality, and ecological environment were the determining factors in the overall ranking; 3) geographically, higher-ranking lakes existed in both regions, but almost all of the lower-ranking lakes were in the Hangjiahu Plain; 4) all lakes were divided into four clusters: the first cluster had been converted into wetland parks to different degrees and showed the best remediation; the second cluster had obvious deficiencies in some system layers; the third cluster represented the current remediation level of plain lakes; and the last cluster consisted mostly of unmanaged natural lakes with the worst remediation. The conclusions on the strengths and weaknesses of different aspects of lake remediation can serve as a reference for concept optimization and management decisions in lake remediation.
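The entropy-weight TOPSIS ranking used for the object and system layers can be sketched as follows: column entropy measures how little an index discriminates, the resulting weights scale a normalized decision matrix, and each lake is scored by its relative closeness to the ideal solution. The toy matrix below (rows = lakes, columns = benefit-type indexes) is hypothetical.

```python
import math

def entropy_topsis(matrix):
    """Entropy-weighted TOPSIS over benefit criteria: rows are
    alternatives (lakes), columns are indexes; returns the relative
    closeness of each row to the ideal solution."""
    m, n = len(matrix), len(matrix[0])
    # Column proportions feed the entropy computation.
    p = [[matrix[i][j] / sum(row[j] for row in matrix) for j in range(n)]
         for i in range(m)]
    ent = [-sum(p[i][j] * math.log(p[i][j]) for i in range(m) if p[i][j] > 0)
           / math.log(m) for j in range(n)]
    w = [(1 - e) / sum(1 - e2 for e2 in ent) for e in ent]
    # Weighted, vector-normalized decision matrix.
    norm = [math.sqrt(sum(r[j] ** 2 for r in matrix)) for j in range(n)]
    v = [[w[j] * matrix[i][j] / norm[j] for j in range(n)] for i in range(m)]
    best = [max(col) for col in zip(*v)]
    worst = [min(col) for col in zip(*v)]
    d = lambda row, ref: math.sqrt(sum((a - b) ** 2 for a, b in zip(row, ref)))
    return [d(v[i], worst) / (d(v[i], best) + d(v[i], worst)) for i in range(m)]

scores = entropy_topsis([[9, 7, 8], [5, 6, 4], [7, 5, 6]])
print(max(range(3), key=scores.__getitem__))  # -> 0, the dominant lake
```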
Production safety is a perennial concern in industrial production, and detecting whether workers on construction sites are wearing safety helmets is important for reducing production accidents. We collected pictures of construction-site workers wearing safety helmets, then preprocessed and labeled them to train and test our models. In this paper, a convolutional-neural-network-based object detection algorithm (YOLOv3) is used to detect whether workers wear safety helmets. The model is then improved and optimized through data augmentation, modified training parameters, and additional training iterations. Experimental results show that the model reaches an accuracy of 83.48%, indicating good generalization ability and real-time recognition and detection performance while maintaining accuracy.
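Detection models of this kind are typically evaluated by matching predicted and ground-truth boxes via intersection-over-union (IoU); the abstract does not spell out its matching rule, so the following is a generic sketch rather than the paper's exact metric.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # half-overlapping -> 0.333...
```

A prediction is usually counted as a correct helmet detection when its IoU with a ground-truth box exceeds a threshold such as 0.5.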
Case-Based Reasoning (CBR) is an important reasoning methodology in the field of artificial intelligence. The main idea of CBR is to solve new problems by using historical cases. Since a CBR model must be built on data from similar historical cases, the reusability of CBR models is usually low, and constructing CBR models for specific problems is one of the research hot spots in the field. This paper studies a CBR model for power engineering cost estimation and proposes a novel model that considers the characteristics of the power engineering industry. The multidimensional scaling (MDS) method and the K-means method are introduced into the proposed CBR model to reduce the data dimensionality and address the problem of low calculation accuracy. An artificial neural network (ANN) is constructed within the proposed CBR model, and deep learning is used to estimate the cost of engineering projects. Simulation results show that the proposed CBR model estimates the cost of power engineering projects accurately, with an estimation error of less than 8%.
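The K-means grouping of historical cost cases can be sketched with plain Lloyd iterations; the 2-D case features and the deterministic seeding below are illustrative assumptions, not the paper's setup.

```python
def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on 2-D case features; returns centroids.
    Deterministic seeding from the first k points keeps the demo simple."""
    cents = points[:k]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: (p[0] - cents[c][0]) ** 2
                                + (p[1] - cents[c][1]) ** 2)
            clusters[j].append(p)
        cents = [(sum(q[0] for q in cl) / len(cl), sum(q[1] for q in cl) / len(cl))
                 if cl else cents[j]
                 for j, cl in enumerate(clusters)]
    return cents

# Hypothetical 2-D case features (e.g., scaled capacity and line length):
cases = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.5, 4.5)]
print(kmeans(cases, 2))
```

A new project would then be matched against the centroid-nearest cluster before case retrieval, shrinking the search space.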
Computer Technology Application and Intelligent Design
With the continuous expansion of BIM technology in the field of engineering construction, the CADC's exploration of airport BIM design has likewise grown from nothing, moving from initial uncertainty to a firm technical route. Pavement engineering is the foundational discipline of flight-zone design. Drawing on the application of BIM technology in the design of many airports, the Civil Aviation Administration of China has summarized a set of airfield pavement BIM design methods and developed an airfield pavement BIM design system. The key issues encountered in its development are also introduced.
Contemporarily, with a deepening understanding of hydrodynamics, humans are gradually applying fluid knowledge to architecture and making breakthroughs in this field. This paper discusses the applications of hydrodynamics in construction based on a literature review and information retrieval. Specifically, we demonstrate methods that exploit the effect of wind on multistorey buildings and compare the impacts of different house layouts on the wind velocity around buildings by 3D modelling and simulation of the surrounding airflow. In addition, the stability of suspension bridges and ways to prevent damage are investigated by analysing static wind load and aerostatic instability. Based on calculations of hydrodynamic loads, rope dipole moments, and external fluid predictions, the safety of undersea tunnels is further addressed. These results shed light on the implementation of fluid mechanics in civil engineering.
The urban road landscape is a complex spatial system: it is not only an important part of urban open space but also plays a role in the road environment. Road landscape design refers to designing the landscape from an aesthetic point of view, taking full account of the harmony between the road landscape and the natural environment, so that drivers and passengers feel safe, comfortable, and at ease. However, current urban road landscape designs are often poorly recognisable. Therefore, this paper first introduces the basic theory of urban road landscape design and analyses the recognisability of road landscapes in combination with the design principles of urban road landscape. It then presents a design concept for the identifiability of urban road landscapes. This study provides theoretical support for the design of identifiable urban road landscapes and helps promote the innovative development of urban road landscape design for traffic safety.
At present, work on ground-to-air geolocation has mainly focused on the aligned setting, in which the orientation of the street-view image is accurately aligned with the corresponding satellite image. In real life, however, the orientations of the street-view image and the aerial image cannot be exactly aligned. In this work, we first study the problem of unaligned ground-to-air geolocation. Since no cross-view dataset with unaligned orientations has been published, we process the CVUSA dataset to generate an unaligned cross-view dataset. We introduce correlation-layer modules into the cross-view geolocation task for the first time, design a new regression network to estimate the similarity of the two views, and design a triplet loss function based on that similarity.
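The similarity-based triplet objective can be sketched as a hinge loss on similarity scores: the matching ground-aerial pair must score at least a margin above a mismatched pair. The margin value here is an illustrative assumption.

```python
def triplet_loss(sim_pos, sim_neg, margin=0.2):
    """Hinge-style triplet loss on similarity scores: the matching
    ground-aerial pair must beat a mismatched pair by `margin`."""
    return max(0.0, sim_neg - sim_pos + margin)

print(triplet_loss(0.9, 0.4))   # well separated -> 0.0
print(triplet_loss(0.5, 0.45))  # too close -> positive penalty
```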
Given that the joint design of precast concrete structures is complex and construction is difficult, this paper proposes a permanent-formwork technology based on an assembled monolithic shear-wall project and elaborates the key design and construction points of the technology for the connection joints of prefabricated building components. The technology improves assembly efficiency and supports the further development of prefabricated buildings.
Building Information Modeling (BIM) has been widely applied in the engineering field in recent years. In response to the Ministry of Education's reform plan to cultivate high-quality workers and technical personnel, integrating BIM technology into the teaching system is significant. This paper analyzes the research and practice of BIM-based teaching reform in architecture at Kunming University of Science and Technology Oxbridge College and highlights the various teaching reform initiatives and the results achieved. Taking the current state of BIM talent training in the department of architecture as its starting point, the paper proposes initiatives to support BIM-based reform. The results show that the reform significantly improves student performance, enhances teaching quality, and raises employment rates.
The safety of architectural design plays a very important role in overall safety and stability. Taking a large commercial building in Tai'an City as an example, this paper analyzes its safety grade and then extends the research scope to other fields through the analysis of this building. Because there are many evaluation indexes, it is difficult to reflect the design safety grade accurately; therefore, fuzzy comprehensive evaluation is selected to build a multi-level matrix evaluation system. Four first-level indexes and 16 second-level indexes are set up, the analytic hierarchy process is used to determine the weights of the indicators, and the first-level and second-level evaluation vectors are calculated in MATLAB. According to the maximum-membership principle underlying this method, the safety grade of the building's design is evaluated.
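The evaluation pipeline (weights from the analytic hierarchy process, fuzzy composition, maximum membership) can be sketched as follows; the weights and membership matrix are hypothetical, and plain Python stands in for the MATLAB computation.

```python
def fuzzy_evaluate(weights, R):
    """Weighted-average fuzzy composition B = W . R; the maximum-
    membership principle then picks the grade with the largest score."""
    grades = len(R[0])
    B = [sum(w * row[g] for w, row in zip(weights, R)) for g in range(grades)]
    return B, max(range(grades), key=B.__getitem__)

# 3 hypothetical indexes; memberships over 4 grades (I best .. IV worst).
W = [0.5, 0.3, 0.2]                  # AHP-style index weights
R = [[0.7, 0.2, 0.1, 0.0],
     [0.2, 0.5, 0.2, 0.1],
     [0.1, 0.4, 0.4, 0.1]]
B, grade = fuzzy_evaluate(W, R)
print(grade)   # index 0 -> grade I
```

In the paper's two-level scheme, each first-level row of R would itself be a fuzzy composition of its second-level indexes.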
The analysis of driving state has always been one of the important research directions in the field of autonomous driving. With the development of deep learning, convolutional neural networks have become a hot spot in optical flow computation. This article describes the differences between deep learning-based and classical low-level optical flow computation, and uses a low-level optical flow algorithm to analyze, compute, and recognize the driving state of an autonomous vehicle more accurately and efficiently, based on data from the KITTI database. Finally, it summarizes the remaining problems and puts forward ideas for future work.
When online learning became popular during the COVID-19 pandemic, tracking students' in-class attention became difficult. Our experiment is designed to assess the feasibility and reliability of using EEG signals to detect students' attention levels and, ultimately, to determine whether EEG monitoring can help online classes. The results show that a person's attention level can be determined from EEG signals, and this property could be used to develop devices that support online teaching.
Medical image segmentation has long suffered from a lack of datasets, since labelling pathological data is laborious and requires specialized skills that only professional doctors possess, especially for nuclei semantic segmentation. Moreover, a domain gap inevitably exists between different datasets, caused by diversified staining methods or the heterogeneous appearance of different tissues, so it is almost impossible to obtain labelled data under all circumstances. This paper applies domain adaptation as an effective and efficient method to align two domains in latent feature space. We evaluate our work on both IoU and Expected Calibration Error (ECE), an indicator widely used in biomedical segmentation. On two domain adaptation tasks, i.e., TNBC and MoNuSeg, we show that by exchanging the low-frequency components of the two datasets' styles, Fourier Domain Adaptation (FDA) achieves considerable improvements of 1% and 2.29% over simply training a U-Net on source images and applying it to the target dataset.
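The core FDA operation, swapping low-frequency spectral content between domains, can be shown on a 1-D toy signal with a naive DFT. Note that FDA proper swaps only the amplitude spectrum of 2-D images; swapping full complex coefficients and the band width beta = 1 here are simplifications for illustration.

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def fda_1d(src, tgt, beta=1):
    """Swap the beta lowest (and mirrored highest) frequency bins of the
    source signal with the target's, keeping the source's mid/high-
    frequency content -- a 1-D analogue of FDA's style transfer."""
    S, T = dft(src), dft(tgt)
    n = len(S)
    for k in list(range(beta + 1)) + list(range(n - beta, n)):
        S[k] = T[k]
    return idft(S)

src = [0, 1, 0, 1, 0, 1, 0, 1]   # source "texture": fast alternation
tgt = [5, 5, 5, 5, 5, 5, 5, 5]   # target "style": bright, flat level
out = fda_1d(src, tgt)
print([round(v, 3) for v in out])  # alternation preserved at target level
```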
Acquiring target information in satellite interactive missions plays a key role in the aerospace field. Tracking and segmentation of satellite components face problems such as insufficient illumination in the space environment and occlusion of components. This paper presents an effective approach to video object segmentation of satellite components under low light and occlusion. Our approach is based on Rethinking Space-Time Networks with Improved Memory Coverage (STCN), and it can track and segment satellite components in video sequences. To address target loss and low light during the overturning of satellite components, we propose a position-information encoding strategy: embedding a position-information matrix improves the model's generalization to image position information. Finally, we train the model on the DAVIS dataset and a satellite dataset we built. Experimental results verify that our model improves J&F by 3.9% compared with STCN, and its speed can exceed 20 frames per second (FPS).
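The abstract does not specify the form of the position-information matrix; a sinusoidal position encoding is one plausible embedding such a strategy could use, sketched here for illustration.

```python
import math

def position_matrix(length, dim):
    """Sinusoidal position-encoding matrix (length x dim): alternating
    sin/cos of the position at geometrically spaced wavelengths."""
    pe = [[0.0] * dim for _ in range(length)]
    for pos in range(length):
        for i in range(0, dim, 2):
            angle = pos / (10000 ** (i / dim))
            pe[pos][i] = math.sin(angle)
            if i + 1 < dim:
                pe[pos][i + 1] = math.cos(angle)
    return pe

pe = position_matrix(4, 6)
print(pe[0])  # position 0 -> [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
```

Adding such a matrix to the feature maps gives the network an absolute positional cue that survives low-light frames where appearance features fail.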
In coal mining, the separation of coal and gangue is a very important step. Traditional coal preparation methods include manual sorting, heavy-medium separation, and ray-projection separation; these methods cannot separate coal and gangue both safely and quickly. Therefore, to improve the recognition rate of coal-gangue separation, this paper proposes a coal-gangue recognition method based on an improved Support Vector Machine. First, the images of coal and gangue are preprocessed. Then, gray-level and texture features are extracted from the preprocessed images. Finally, each feature vector is input into a Support Vector Machine optimized by the Fruit Fly Optimization Algorithm for recognition and classification. The experimental results show an accuracy of 96.33%.
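The Fruit Fly Optimization step can be sketched as a smell-guided random search around the best-known position; in the paper it would tune SVM hyper-parameters, while the toy below minimizes a stand-in quadratic objective. The population size, step range, and seeding are illustrative assumptions.

```python
import random

def foa(objective, dim=2, flies=20, iters=100, seed=0):
    """Fruit-fly-style random search: flies scatter around the current
    best position; the best 'smell concentration' (objective value)
    becomes the new center."""
    rng = random.Random(seed)
    best = [rng.uniform(-5, 5) for _ in range(dim)]
    best_val = objective(best)
    for _ in range(iters):
        for _ in range(flies):
            cand = [b + rng.uniform(-1, 1) for b in best]
            val = objective(cand)
            if val < best_val:
                best, best_val = cand, val
    return best, best_val

sphere = lambda x: sum(v * v for v in x)   # stand-in for SVM CV error
best, val = foa(sphere)
print(round(val, 4))
```

In the paper's setting, the two coordinates would map to the SVM penalty and kernel parameters, with cross-validation error as the objective.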
In recent years, with the development of earth observation technology, satellite video captured by optical sensors on mobile satellite platforms provides continuous imagery and thus new data for detecting and tracking moving targets over large areas. Moving target detection algorithms are already widely used in ground surveillance video; however, applying existing detection algorithms directly to satellite video faces many challenges, including low resolution, small targets lacking appearance and texture features, low signal-to-noise ratio, and non-stationary camera platforms. We therefore propose a new moving target detection and tracking framework for this new computer vision task. First, we utilize a tensor data structure to exploit the inner spatial and temporal correlation and extract regions of interest for target movement. Then, we design a recognition strategy based on multiple morphological and motion cues to separate the true moving targets from the noise. Finally, we associate the detection results of each frame to achieve multi-target tracking. We manually annotated video data from the Jilin-1 satellite, tested the algorithm under different evaluation criteria, and compared the results with state-of-the-art baselines, demonstrating the advantages of our framework. The dataset can be downloaded from https://github.com/QingyongHu/VISO.
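The spatio-temporal step can be illustrated in miniature: stacking frames into a tensor and taking a per-pixel temporal median yields a background estimate, and pixels deviating from it are motion candidates. This median sketch is a simple stand-in for the paper's tensor-based method, with hypothetical 3x3 frames.

```python
def moving_target_mask(frames, thresh=30):
    """Per-pixel temporal median as the background estimate; pixels in
    the last frame that deviate strongly from it are motion candidates."""
    h, w = len(frames[0]), len(frames[0][0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            bg = sorted(f[y][x] for f in frames)[len(frames) // 2]
            mask[y][x] = 1 if abs(frames[-1][y][x] - bg) > thresh else 0
    return mask

# Three hypothetical 3x3 frames: a bright target moves along the top row.
f0 = [[200, 10, 10], [10, 10, 10], [10, 10, 10]]
f1 = [[10, 200, 10], [10, 10, 10], [10, 10, 10]]
f2 = [[10, 10, 200], [10, 10, 10], [10, 10, 10]]
print(moving_target_mask([f0, f1, f2]))  # target detected only at (0, 2)
```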
The traditional way to produce a true digital orthophoto map is orthophoto correction using a digital surface model; in this correction, tall buildings cause occlusion due to the displacement of image points. This paper studies a method for generating true digital orthophoto maps based on 3D point cloud geometry, completing the production through point cloud registration, absolute orientation, equal-interval sampling, vertical projection, and texture mapping. The equal-interval sampling and vertical projection steps are studied in detail; they reduce the time traditional methods spend on occlusion detection and repair, and the process is highly automated.
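Equal-interval sampling with vertical projection can be sketched as a grid that keeps, for each cell, the highest point of the registered cloud, so roof points override ground points directly beneath them and no relief displacement occurs. The cell size and point format are illustrative assumptions.

```python
def vertical_projection(points, cell=1.0):
    """Equal-interval grid sampling with vertical projection: each cell
    keeps the highest (x, y, z) point, so roof points override ground
    points directly beneath them."""
    grid = {}
    for x, y, z in points:
        key = (int(x // cell), int(y // cell))
        if key not in grid or z > grid[key]:
            grid[key] = z
    return grid

pts = [(0.2, 0.3, 0.0),    # ground point
       (0.6, 0.4, 12.0),   # roof point above the same cell
       (1.5, 0.5, 0.0)]    # ground point in the next cell
print(vertical_projection(pts))  # -> {(0, 0): 12.0, (1, 0): 0.0}
```

Texture mapping then colors each cell from the retained top surface, which is why the masking detection of the traditional pipeline becomes unnecessary.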
This paper reviews the application and implementation of the “Three Lines and One List” achievements. Based on Chongqing’s “Three Lines and One List” data, a “Three Lines and One List” information management platform for the strategic environmental assessment of the Chongqing Yangtze River Economic Belt was established using GIS, big data, and other technologies. The “Three Lines and One List” control requirements were combined with the daily management of environmental protection, integrating functions such as data and achievement management, query and display of comprehensive data, and intelligent judgment and analysis services. The platform standardizes and integrates the “Three Lines and One List” achievements and is applied intelligently to planning and project EIA, serving the implementation of major projects, optimizing the business environment, and providing decision support for strategy optimization, planning EIA, and ecological environment spatial planning.
Based on users’ perceptual demands, this paper uses Kansei Engineering to study the form of indoor activity spaces in public buildings. It establishes the relationship between perceptual semantics and the form elements of indoor activity space, then proposes design strategies for the form of indoor activity space informed by flow theory. First, the research collects pictures of various indoor activity spaces via the Internet and selects representative samples. Second, a semantic differential scale is designed based on perceptual words related to indoor activity space. Third, by processing the experimental data, we establish a regression model between the form elements of indoor activity space and the perceptual factors. Finally, from the perspective of flow theory, this paper identifies patterns in the quantitative results and applies them to computer-aided design, proposing an effective, accurate, and innovative strategy for indoor environment design.
The exploration of geology and geophysics, an ancient and mysterious enterprise, has uncovered the secrets of different corners of the earth, and advances in machine learning have encouraged geologists and geophysicists to explore further. Through practical research and academic theory, geologists develop and apply machine learning-based exploration tools to study different parts of the world. In this paper, the applications of machine learning to geological features including volcanoes, rocks, and glaciers are systematically introduced. The use of machine learning analysis in the exploration of earthquakes, geothermal energy, and geomagnetism is also reviewed. In summary, machine learning has enabled more in-depth data collection, material analysis, and model building for geophysical research objects, making geophysical research more active and effective. This state-of-the-art review provides a comprehensive understanding of machine learning applications in geology and geophysics and clear guidance for further high-efficiency, high-precision research.
As an important geometric feature of surface representation, the point cloud boundary plays an important role in the accuracy of 3D surface reconstruction. To address the complex calculation, low efficiency, and poor extraction quality of existing point cloud boundary extraction algorithms, a new boundary extraction method based on slicing technology is proposed. By setting an appropriate slice bandwidth, the point cloud is divided into a horizontal slice block set and a vertical slice block set, and the internal and external boundary feature points are extracted using a distance threshold between points. Compared with existing methods, the results show that this method retains the feature information of the target object's point cloud, effectively segments the boundary features of objects of different shapes, and has strong robustness and accuracy.
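The horizontal slicing pass can be sketched as follows: points are binned into bands of the chosen bandwidth, and the extreme points of each band approximate the outer boundary (the paper additionally slices vertically and applies a distance threshold for inner boundaries). The bandwidth and the data below are illustrative.

```python
def slice_boundary(points, band=1.0):
    """Bin 2-D points into horizontal bands of width `band`; the
    leftmost and rightmost point of each band approximate the outer
    boundary."""
    bands = {}
    for x, y in points:
        bands.setdefault(int(y // band), []).append((x, y))
    edge = set()
    for band_pts in bands.values():
        edge.add(min(band_pts))   # leftmost point in the band
        edge.add(max(band_pts))   # rightmost point in the band
    return sorted(edge)

# A dense 2x2 square of points on a 0.5 grid:
pts = [(i * 0.5, j * 0.5) for i in range(5) for j in range(5)]
print(slice_boundary(pts))  # only points with x = 0.0 or x = 2.0 remain
```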
To address the insecurity of user data at electric vehicle charging piles and the waste of charging pile resources, a shared charging pile management system for electric vehicles based on an energy blockchain is proposed. The blockchain's decentralization, smart contracts, and openness and transparency are used to construct a charging alliance chain algorithm for shared electric vehicle charging. Through the K-anonymity model, the data of users, charging pile operators, and managers are unified, and a management model of shared electric vehicle charging piles based on the energy blockchain is established. The test results show that the designed system can meet the daily charging needs of electric vehicles, effectively address the leakage of users' charging privacy and the allocation of charging pile resources, and provide a safe and efficient operation mode for the construction and development of charging piles.
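The K-anonymity model referenced above can be illustrated with a simple check: a table of charging records is k-anonymous when every combination of quasi-identifier values is shared by at least k records, so no individual user can be singled out. The record fields below are hypothetical.

```python
from collections import Counter

def is_k_anonymous(records, quasi_ids, k):
    """True if every combination of quasi-identifier values is shared
    by at least k records, so no charging user is singled out."""
    groups = Counter(tuple(r[q] for q in quasi_ids) for r in records)
    return min(groups.values()) >= k

records = [
    {"district": "A", "vehicle": "sedan", "kwh": 21.0},
    {"district": "A", "vehicle": "sedan", "kwh": 18.5},
    {"district": "B", "vehicle": "suv",   "kwh": 30.2},
    {"district": "B", "vehicle": "suv",   "kwh": 27.9},
]
print(is_k_anonymous(records, ["district", "vehicle"], 2))  # -> True
print(is_k_anonymous(records, ["district", "vehicle"], 3))  # -> False
```

Records failing the check would be generalized (e.g., coarser districts) before being written to the alliance chain.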