KEYWORDS: Video, Prototyping, Education and training, Video surveillance, Video coding, Performance modeling, Data modeling, Feature extraction, Cameras, Visualization
Many existing video anomaly detection methods fail to make full use of temporal information and ignore the diversity of normal behaviors. Previous work proposed an autoencoder anomaly detection model based on the dynamic prototype unit (DPU), which effectively improves detection performance but ignores the importance of features at different levels for modeling normal events. This paper therefore proposes a video anomaly detection model based on a multi-scale DPU, which uses memory units to connect the encoder and decoder so that normal patterns are learned at different scales. In addition, the Temporal Shift technique is used to mine the temporal information of the video more effectively when generating future video frames. Experimental results on the UCSD Ped2, CUHK Avenue, and ShanghaiTech datasets show that the proposed method outperforms current mainstream video anomaly detection methods while meeting real-time requirements.
With the rapid development of deep learning and image generation technology, research on generating face images from sketches has achieved remarkable results. However, existing methods still fall short in scenarios that require high face image fidelity, because they cannot generate images that are semantically and geometrically consistent with the input sketches. This paper proposes a semantically controllable method for generating face images from sketches. Dedicated modules extract the sketch semantics and merge them with the text semantics into a more expressive representation, which is fed into the generator along with the sketch to achieve semantic and geometric alignment. The method is validated experimentally on open-source and self-built datasets, and the results show that it effectively improves the quality of the generated images.
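The temporal shift idea mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the shift is configured, so the function name `temporal_shift`, the `fold_div` parameter, and the reduction of each frame to a flat list of channel values (spatial dimensions omitted) are all assumptions made for illustration, not the authors' implementation.

```python
def temporal_shift(frames, fold_div=4):
    """Shift part of the channels along the time axis (illustrative sketch).

    frames: list over time steps, each a list of per-channel values
            (spatial dimensions are omitted to keep the example small).
    fold_div: hypothetical parameter; 1/fold_div of the channels take their
              value from the next frame, another 1/fold_div from the previous
              frame, and the rest stay in place.
    """
    num_steps = len(frames)
    num_channels = len(frames[0])
    fold = num_channels // fold_div
    # Positions with no source frame (sequence boundaries) stay zero.
    out = [[0.0] * num_channels for _ in range(num_steps)]
    for t in range(num_steps):
        for c in range(num_channels):
            if c < fold:
                src = t + 1      # value comes from the next frame
            elif c < 2 * fold:
                src = t - 1      # value comes from the previous frame
            else:
                src = t          # channel is left unshifted
            if 0 <= src < num_steps:
                out[t][c] = frames[src][c]
    return out


frames = [[1.0] * 4, [2.0] * 4, [3.0] * 4]
shifted = temporal_shift(frames, fold_div=4)
```

After the shift, each time step mixes channel values from its neighbours at no extra arithmetic cost, which is why a per-frame 2D network applied afterwards can pick up temporal context.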
A novel method is proposed to construct relatively rich vector patterns from existing examples to address the problem of excessively simple and coarse details in automatically generated patterns. This method involves several key steps, including the extraction of vectorized primitives, the construction of primitive relationships, and the intelligent generation of patterns through optimization algorithms. Specifically, vectorized primitives are extracted from raster images, and directed graphs are used to establish relationships between primitives, taking into account the geometric relationships of the graph. Primitive relationships are calculated based on the extracted geometric relationships, and relevant constraints are used to transform the original pattern. The transformed pattern is then optimized to produce a more harmonious and aesthetically pleasing pattern variation. Experimental results show that the proposed algorithm can generate a diverse set of novel pattern variants, and the optimized variants demonstrate high levels of harmony and aesthetics. Users have the ability to influence the direction of pattern generation by adjusting the primitives, enabling them to compare and select the generated pattern variants that align with their implicit preferences. The proposed method provides an effective solution for pattern generation, catering to various requirements in practical applications and delivering a range of diverse pattern graphics for products.
To adjust the local color style of real images in a simple, low-threshold way while maintaining the harmony of the image and the realism of its content, this paper proposes a high-fidelity image adjustment method using color envelopes and sliced local optimal transport. The method uses a simple mask to segment the image; extracts the initial palette and palette weight matrix of each image block through the color envelope in the RGB and RGBXY dual color spaces; uses the sliced local optimal transport algorithm to migrate the color style of the image blocks related to the operator's adjustment target, which yields the template image blocks; and obtains the optimal color palette of each corresponding image block from its template image block and palette weight matrix, so that the color palette is adjusted automatically. The experimental results show that the proposed method can automatically process other image blocks related to the operator's adjustment target, effectively adjust their color styles, and ensure the global harmony of the image.
A novel intelligent method for recognizing electronic document layouts via deep learning is proposed. A text detection approach is used to detect the position and region of each string, adjacent regions are merged based on the distance between text zones, and the document layout style is then determined by calculating the degree of match between the printed document and a set of publication templates. The proposed method constructs an electronic document representation tree, to which the locations of the region bounding boxes are added. The maximum match distance between trees is calculated and used to judge the document layout based on structural similarity. Experimental results show that this method can quickly and accurately distinguish electronic documents with different layout styles. Users can not only recognize the layout of a printed publication in real time, but also find a desired layout style among a large number of printed publication images. The method can meet different usage needs in practical applications.
The least squares method is a common, classical tool in regression analysis and is often used to solve convex optimization problems. However, the traditional approach of solving least squares problems with hand-written code has clear disadvantages. One significant drawback is that users without knowledge of CPU/GPU architecture find it hard to work with such code or to produce a high-performance solver. Reviewing or improving existing solvers is also difficult, since many fine details related to the processor structure may be hard-coded into the source code. In this paper, we propose a new domain-specific language (DSL) for producing non-linear least squares solvers for research purposes, with a back end of Gauss-Newton and Levenberg-Marquardt methods implemented in cuSPARSE and cuBLAS. The DSL, paired with a C/C++ interface, has a user-friendly syntax that makes it easy to write energy functions and generate GPU solvers whose performance is close to that of hand-written CUDA solvers.
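The region-merging step mentioned in the abstract above can be sketched as follows. The abstract does not state the distance measure or threshold, so the Euclidean gap between axis-aligned boxes, the `max_gap` parameter, and the function names are assumptions made for illustration only.

```python
import math


def box_gap(a, b):
    """Euclidean gap between boxes (x0, y0, x1, y1); 0 if they overlap or touch."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return math.hypot(dx, dy)


def merge_boxes(boxes, max_gap):
    """Repeatedly union any two boxes whose gap is at most max_gap (hypothetical threshold)."""
    boxes = [tuple(b) for b in boxes]
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if box_gap(boxes[i], boxes[j]) <= max_gap:
                    a, b = boxes[i], boxes[j]
                    union = (min(a[0], b[0]), min(a[1], b[1]),
                             max(a[2], b[2]), max(a[3], b[3]))
                    # Replace the pair with its union and rescan from the start.
                    boxes = [bx for k, bx in enumerate(boxes) if k not in (i, j)]
                    boxes.append(union)
                    merged = True
                    break
            if merged:
                break
    return boxes


text_boxes = [(0, 0, 10, 5), (12, 0, 20, 5), (100, 100, 110, 105)]
zones = merge_boxes(text_boxes, max_gap=3)
```

Here the two boxes on the same text line are 2 units apart and collapse into one zone, while the distant box is left alone.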
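For readers unfamiliar with the back-end algorithms named above, the following is a minimal CPU-only sketch of a Levenberg-Marquardt iteration on a two-parameter toy problem. It does not reproduce the paper's DSL or its cuSPARSE/cuBLAS back end; the model `a * exp(b * x)`, the function names, and the damping schedule are assumptions chosen for illustration.

```python
import math


def residuals(params, xs, ys):
    a, b = params
    return [a * math.exp(b * x) - y for x, y in zip(xs, ys)]


def jacobian(params, xs):
    a, b = params
    # Each row holds the partial derivatives of one residual w.r.t. (a, b).
    return [[math.exp(b * x), a * x * math.exp(b * x)] for x in xs]


def cost(r):
    return 0.5 * sum(v * v for v in r)


def levenberg_marquardt(params, xs, ys, lam=1e-3, iters=200):
    params = list(params)
    r = residuals(params, xs, ys)
    for _ in range(iters):
        J = jacobian(params, xs)
        # Build J^T J and J^T r for the 2x2 normal equations.
        h00 = sum(row[0] * row[0] for row in J)
        h01 = sum(row[0] * row[1] for row in J)
        h11 = sum(row[1] * row[1] for row in J)
        g0 = sum(row[0] * ri for row, ri in zip(J, r))
        g1 = sum(row[1] * ri for row, ri in zip(J, r))
        # Damped system: (J^T J + lam * diag(J^T J)) * delta = -J^T r.
        a00, a11 = h00 * (1 + lam), h11 * (1 + lam)
        det = a00 * a11 - h01 * h01
        if det == 0:
            break
        d0 = (-g0 * a11 + g1 * h01) / det
        d1 = (-g1 * a00 + g0 * h01) / det
        trial = [params[0] + d0, params[1] + d1]
        r_trial = residuals(trial, xs, ys)
        if cost(r_trial) < cost(r):
            params, r = trial, r_trial
            lam *= 0.1   # step accepted: behave more like Gauss-Newton
        else:
            lam *= 10.0  # step rejected: behave more like gradient descent
    return params


xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(0.7 * x) for x in xs]
fit = levenberg_marquardt([1.0, 0.1], xs, ys)
```

With `lam` driven toward zero the update reduces to a Gauss-Newton step, which is why the two methods are commonly offered together in one solver.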
Virtual reality (VR) is a new type of media that can provide users with an unprecedented sense of immersion. In the process of achieving this sense of immersion, the spatial information perception of human vision plays a crucial role, which is one of the key requirements for human perception of the environment and virtual reality. The existing virtual reality display rendering often adopts a foveated rendering method, which utilizes the characteristics of human vision to save computational resources. Against the background of human visual characteristics, this thesis proposes a real-time computing method for peripheral vision metamer images based on the encoding of peripheral vision to address the lack of peripheral vision encoding in current foveated rendering methods. Our method renders visual metamer images more efficiently and achieves real-time computing with limited additional computing resources.
The goal is to migrate the local color information of the reference image and the target image and, on this basis, to maintain color consistency between the transferred and untransferred parts. The method has three steps: 1. based on the statistical histogram of each color channel, use the Newton difference method to fit the histogram and segment it; 2. extract the color histogram information of the region of interest and match the color histogram information of the two regions; 3. after the local color migration is complete, adjust the color information at the region boundaries so that the result looks more natural. Local color migration can be achieved for scenes of different complexity. The color distributions of the images are matched and transferred using the sliced method from optimal transport theory, which greatly improves the efficiency of the algorithm, and segmenting the channel histograms extracts local-region information well. Experiments show that the proposed sliced method achieves local color migration more accurately and efficiently.
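The slice-based matching described above can be sketched with a generic sliced optimal transport color transfer. This is not the authors' algorithm: the histogram segmentation, region selection, and boundary adjustment are omitted, both point sets are assumed to have the same size, and the function names, iteration count, and seed are hypothetical.

```python
import math
import random


def random_unit_vector(rng):
    """Draw a random direction in RGB space."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        n = math.sqrt(sum(c * c for c in v))
        if n > 1e-12:
            return [c / n for c in v]


def sliced_ot_transfer(source, target, iters=200, seed=0):
    """Move source colors toward the target color distribution.

    Each iteration projects both sets onto one random direction, sorts the
    projections (the 1D optimal matching), and shifts every source point
    along that direction to its matched target projection.
    """
    assert len(source) == len(target), "sketch assumes equal-sized point sets"
    rng = random.Random(seed)
    pts = [list(p) for p in source]
    for _ in range(iters):
        d = random_unit_vector(rng)
        proj_s = [sum(a * b for a, b in zip(p, d)) for p in pts]
        proj_t = sorted(sum(a * b for a, b in zip(q, d)) for q in target)
        order = sorted(range(len(pts)), key=proj_s.__getitem__)
        for rank, i in enumerate(order):
            shift = proj_t[rank] - proj_s[i]
            for k in range(3):
                pts[i][k] += shift * d[k]
    return pts


rng = random.Random(1)
src = [[rng.uniform(0.0, 0.3) for _ in range(3)] for _ in range(50)]
tgt = [[rng.uniform(0.6, 1.0) for _ in range(3)] for _ in range(50)]
moved = sliced_ot_transfer(src, tgt)
moved_mean = [sum(p[k] for p in moved) / len(moved) for k in range(3)]
tgt_mean = [sum(p[k] for p in tgt) / len(tgt) for k in range(3)]
```

Each slice costs only a sort, which is the usual reason sliced methods are cheaper than solving the full three-dimensional transport problem.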