Pedestrian trajectory prediction networks based on the Encoder-Decoder structure tend to lose part of the trajectory information when encoding and decoding long sequences. To address this problem, this paper proposes a Generative Adversarial Network based on Temporal Attention (TA). The TA module assigns influence weights to the trajectory information in the encoding and decoding layers, so that the model can fully exploit the trajectory information that is useful for predicting future trajectories while reducing the influence of redundant information. A TA module is added to the encoding and decoding layers of the Generative Adversarial Network, and the model is trained on the ETH and UCY datasets. Experimental results show that the proposed network achieves better prediction accuracy than existing methods.
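As a rough illustration of the temporal-attention idea described above, the sketch below weights a sequence of encoder hidden states by their relevance to a decoder query and forms a context vector. The function name, dot-product scoring, and dimensions are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def temporal_attention(encoder_states, query):
    """Assign an influence weight to each encoded time step and return the
    weighted context vector (a minimal dot-product attention sketch)."""
    scores = encoder_states @ query              # (T,) relevance score per step
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                     # softmax over time steps
    context = weights @ encoder_states           # weighted sum of hidden states
    return context, weights

# Example: 8 observed steps with 16-dimensional hidden states.
rng = np.random.default_rng(0)
H = rng.normal(size=(8, 16))                     # encoder hidden states
q = rng.normal(size=16)                          # decoder query state
ctx, w = temporal_attention(H, q)
print(w.round(3))                                # influence weight per observed step
```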
Action recognition in realistic scenes is a challenging task in computer vision. Although trajectory-based methods have demonstrated promising performance, background trajectories cannot be filtered out effectively, which reduces the proportion of valid trajectories. To address this issue, we propose a saliency-based sampling strategy named foreground trajectories on multiscale hybrid masks (HM-FTs). First, the motion boundary images of each frame are calculated to derive the initial masks. According to the characteristics of action videos, image priors and a synchronous updating mechanism based on cellular automata are exploited to generate an optimized weak saliency map, which is then integrated with a strong saliency map obtained via the multiple kernel boosting algorithm. Next, multiscale hybrid masks are obtained through a collaborative optimization strategy and mask intersection. Compensation schemes are designed to extract a set of foreground trajectories closely related to human actions. Finally, a hybrid fusion framework combining trajectory features and pose features is constructed to enhance recognition performance. Experimental results on two benchmark datasets demonstrate that the proposed method is effective and outperforms most state-of-the-art algorithms.
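The snippet below is only a generic sketch of mask-based trajectory filtering, not the HM-FTs pipeline itself: given per-frame binary foreground masks, it keeps trajectories whose sampled points mostly fall inside the masks. The function name and the keep_ratio threshold are hypothetical.

```python
import numpy as np

def filter_foreground_trajectories(trajectories, masks, keep_ratio=0.5):
    """Keep trajectories whose sampled points mostly lie on foreground masks.

    trajectories: list of (T, 3) arrays with rows (frame_index, x, y).
    masks: per-frame binary foreground masks, each of shape (H, W).
    keep_ratio: minimum fraction of trajectory points that must be foreground.
    """
    kept = []
    for traj in trajectories:
        hits = 0
        for f, x, y in traj:
            mask = masks[int(f)]
            if mask[int(round(y)), int(round(x))] > 0:
                hits += 1
        if hits / len(traj) >= keep_ratio:
            kept.append(traj)
    return kept
```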
An omnidirectional mobile platform is designed for building point clouds based on an improved filtering algorithm that processes depth images. First, the mobile platform can move flexibly, and its control interface is convenient to operate. Then, because the traditional bilateral filtering algorithm is time-consuming and inefficient, a novel method called local bilateral filtering (LBF) is proposed. LBF is applied to the depth images acquired by the Kinect sensor, and the results show that it removes noise more effectively than standard bilateral filtering. In the off-line condition, the color images and the processed depth images are used to build point clouds. Finally, experimental results demonstrate that the proposed method reduces the processing time of depth images and improves the quality of the resulting point clouds.
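For reference, standard bilateral filtering of a Kinect depth frame with OpenCV is shown below; this is the baseline the proposed LBF is compared against, not LBF itself. The file name is a placeholder and the filter parameters are illustrative.

```python
import cv2
import numpy as np

# Standard bilateral filtering of a Kinect depth frame with OpenCV (baseline,
# not the proposed LBF). "depth.png" is a placeholder path; Kinect depth maps
# are typically 16-bit, while cv2.bilateralFilter expects 8-bit or 32-bit
# float input, hence the conversion.
depth_raw = cv2.imread("depth.png", cv2.IMREAD_UNCHANGED)
depth = depth_raw.astype(np.float32)

# Neighbourhood diameter and range/spatial sigmas; larger values smooth more
# noise but blur depth discontinuities.
denoised = cv2.bilateralFilter(depth, 9, 75, 75)
```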
Most human action recognition methods use a multi-feature fusion strategy to improve classification performance, but the contribution of different features to specific actions has not received enough attention. We present an extensible and universal weighted score-level feature fusion method that uses the Dempster–Shafer (DS) evidence theory within the bag-of-visual-words pipeline. First, partially distinctive samples in the training set are selected to construct a validation set. Then, local spatiotemporal features and pose features are extracted from these samples to obtain evidence information. The DS evidence theory and the proposed survival-of-the-fittest rule are employed to combine evidence and calculate optimal weight vectors for every feature type belonging to each action class. Finally, the recognition results are obtained via a weighted summation strategy. The performance of the resulting recognition framework is evaluated on the Penn Action dataset and a subset of the joint-annotated Human Motion Database (sub-JHMDB). The experimental results demonstrate that the proposed feature fusion method adequately exploits the complementarity among multiple features and outperforms most state-of-the-art algorithms on the Penn Action and sub-JHMDB datasets.
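A minimal sketch of the final weighted-summation step is given below, assuming the per-class weight vectors have already been obtained (e.g. from evidence combination on the validation set); the DS evidence combination itself is not shown, and the function name is hypothetical.

```python
import numpy as np

def weighted_score_fusion(score_matrices, weight_vectors):
    """Fuse per-feature classification scores with class-specific weights.

    score_matrices: list of (N, C) arrays, one per feature type, holding the
        scores of N samples over C action classes.
    weight_vectors: list of (C,) arrays, the per-class weight of each feature
        type (e.g. derived from evidence combination on a validation set).
    Returns the predicted class index for each of the N samples.
    """
    fused = np.zeros_like(score_matrices[0], dtype=float)
    for scores, weights in zip(score_matrices, weight_vectors):
        fused += scores * weights        # broadcast class weights over samples
    return fused.argmax(axis=1)
```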
This paper presents a novel optical-flow-based technique for estimating the instantaneous motion velocity of mobile robots. The primary focus of this study is to determine the robot's ego-motion from the displacement field between temporally consecutive image pairs. In contrast to most previous approaches for estimating velocity, we employ a dense optical flow method based on polynomial expansion and propose a quadratic-model-based RANSAC refinement of the flow fields, which makes the method more robust to noise and outliers. Techniques for the geometrical transformation and interpretation of the inter-frame motion are then presented. The advantages of our approach are validated by real-world experiments conducted on a Pioneer robot.
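As an illustration of the polynomial-expansion-based dense flow estimation mentioned above, the snippet below uses OpenCV's Farneback implementation on two consecutive frames; the frame paths and parameter values are placeholders, and the subsequent RANSAC refinement and ego-motion recovery are only indicated in comments.

```python
import cv2

# Dense optical flow between two consecutive grayscale frames using OpenCV's
# polynomial-expansion (Farneback) method; the frame paths are placeholders.
prev = cv2.imread("frame_t.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_t1.png", cv2.IMREAD_GRAYSCALE)

flow = cv2.calcOpticalFlowFarneback(
    prev, curr, None,
    pyr_scale=0.5, levels=3, winsize=15,
    iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

# flow[..., 0] and flow[..., 1] hold per-pixel horizontal and vertical
# displacements; a robust fit of a parametric (e.g. quadratic) flow model with
# RANSAC would then reject outlier vectors before recovering the ego-motion.
dx, dy = flow[..., 0].mean(), flow[..., 1].mean()
print(f"mean displacement: ({dx:.2f}, {dy:.2f}) px")
```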
Estimation efficiency is a key issue in computationally intensive optical flow algorithms. Traditional numerical iterative methods are effective at eliminating the high-frequency components of the estimation error, while leaving most of the low-frequency components unchanged. In this paper, we consider a multigrid-based real-time implementation of dense optical flow computation using the classical Horn-Schunck model. For this purpose, the construction of the linear system of equations required by the linear multigrid model is studied carefully, and the overall multigrid framework is presented. The efficiency and effectiveness of the proposed algorithm are validated by experimental results.
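For context, the sketch below implements a single-grid Horn-Schunck relaxation, i.e. the kind of iterative smoother a multigrid scheme would apply at each grid level; the restriction and prolongation steps of the multigrid cycle are omitted, and the function name and parameter values are illustrative.

```python
import numpy as np
from scipy.ndimage import convolve

def horn_schunck_relaxation(I1, I2, alpha=1.0, iters=100):
    """Classical Horn-Schunck iterations on a single grid; in a multigrid
    scheme this plays the role of the smoother applied at each level."""
    I1 = I1.astype(float)
    I2 = I2.astype(float)

    # Spatio-temporal image derivatives (standard 2x2 Horn-Schunck stencils).
    kx = np.array([[-1.0, 1.0], [-1.0, 1.0]]) * 0.25
    ky = np.array([[-1.0, -1.0], [1.0, 1.0]]) * 0.25
    kt = np.ones((2, 2)) * 0.25
    Ix = convolve(I1, kx) + convolve(I2, kx)
    Iy = convolve(I1, ky) + convolve(I2, ky)
    It = convolve(I2 - I1, kt)

    # Weighted neighbourhood average used for the flow field.
    avg = np.array([[1.0, 2.0, 1.0],
                    [2.0, 0.0, 2.0],
                    [1.0, 2.0, 1.0]]) / 12.0

    u = np.zeros_like(I1)
    v = np.zeros_like(I1)
    for _ in range(iters):
        u_bar = convolve(u, avg)
        v_bar = convolve(v, avg)
        d = (Ix * u_bar + Iy * v_bar + It) / (alpha**2 + Ix**2 + Iy**2)
        u = u_bar - Ix * d
        v = v_bar - Iy * d
    return u, v
```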
A technique is developed to calibrate a camera mounted on a mobile robot for detecting distant scenes. A flexible calibration target, on which calibration feature points are generated according to cross-ratio invariance, is employed in this paper. Because the target can be extended as needed, it has a flexible size and can be used at an arbitrary distance from the camera. To guarantee the generation accuracy of the calibration points, the coordinates of the principal point and the first two orders of radial lens distortion coefficients are determined in advance. Experimental results show that the presented method is effective and achieves higher accuracy than the traditional method.
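Two of the quantities mentioned above can be made concrete with the short sketch below: the cross-ratio of four collinear points (the projective invariant used to generate new calibration feature points) and the two-coefficient radial distortion model. The function names are hypothetical and the formulas are the standard textbook forms rather than the paper's exact procedure.

```python
def cross_ratio(a, b, c, d):
    """Cross-ratio (A, B; C, D) of four collinear points given by their scalar
    positions along the line; it is invariant under perspective projection,
    which is what allows new calibration feature points to be generated from
    known ones."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))

def radial_distortion(x, y, k1, k2):
    """Apply the two-coefficient radial distortion model to normalized image
    coordinates (x, y); k1 and k2 are the first- and second-order coefficients."""
    r2 = x * x + y * y
    factor = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * factor, y * factor
```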