Exploring indoor environments and finding unknown objects that have appeared in a scene are important for research on scene understanding by robots. However, background subtraction, the technique traditionally used for segmenting unknown object regions, cannot be applied directly to a moving camera mounted on a robot. In this paper, we propose a task called view-independent panoptic scene change detection, which is the task of segmenting unknown object regions by comparing two images taken from different viewpoints before and after the objects appear. We also propose a method that segments unknown object regions by modeling segmented known instance regions as background. For the background modeling, we introduce two methods: a histogram-based method and a deep metric-learning-based method. In addition, we create a new panoptic scene change detection dataset consisting of images taken from different camera views. Through experiments, we confirm that the proposed method can segment regions of unknown class instances, and that the deep metric-learning-based method is more accurate than the histogram-based method, achieving good performance on the change detection dataset.
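The histogram-based background modeling can be illustrated with a minimal sketch. The abstract does not specify the features, the similarity measure, or the decision rule, so everything below is an assumption made for illustration: regions are flat lists of grayscale pixel values, similarity is histogram intersection, and a region is treated as unknown when it resembles none of the known (background) instance regions. The names `region_histogram`, `histogram_intersection`, `is_unknown_region`, and the `threshold` value are hypothetical.

```python
def region_histogram(pixels, bins=8, max_value=256):
    """Normalized intensity histogram of one segmented region.

    `pixels` is a flat list of integer grayscale values in [0, max_value).
    (Assumption: the paper's actual features are not given in the abstract.)
    """
    if not pixels:
        raise ValueError("region must contain at least one pixel")
    counts = [0] * bins
    for value in pixels:
        index = min(value * bins // max_value, bins - 1)
        counts[index] += 1
    total = len(pixels)
    return [count / total for count in counts]


def histogram_intersection(hist_a, hist_b):
    """Similarity in [0, 1]; 1.0 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


def is_unknown_region(pixels, background_histograms, threshold=0.5):
    """Flag a region as unknown if it matches no background histogram.

    `threshold` is an illustrative value, not one taken from the paper.
    """
    hist = region_histogram(pixels)
    return all(
        histogram_intersection(hist, background) < threshold
        for background in background_histograms
    )


# Background model built from a known instance region in the "before" image.
background = [region_histogram([10, 20, 30, 15])]

# A region with similar intensities is treated as background,
# while a much brighter region is flagged as unknown.
similar_region = [12, 25, 18, 22]
bright_region = [200, 210, 220, 230]
print(is_unknown_region(similar_region, background))
print(is_unknown_region(bright_region, background))
```

Because the comparison is made per segmented region rather than per pixel, a sketch of this kind does not require the two images to be pixel-aligned, which is the property the view-independent task depends on.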
In recent years, human pose estimation based on deep learning has been actively studied for various applications. A large amount of training data is required to achieve good performance, but annotating human poses is an expensive task. Therefore, there is a growing need to make training data preparation more efficient. In this paper, we take an active learning approach to reduce the cost of preparing training data for human pose estimation. We propose an active learning method that automatically selects, from unlabeled image sequences, the images that are most effective for improving the performance of a human pose estimation model, exploiting the fact that the human pose changes continuously between adjacent frames in an image sequence. Specifically, by comparing the estimated human poses between frames, we select images whose poses are likely to have been estimated incorrectly as candidates for manual annotation. The human pose estimation model is then re-trained with this small portion of manually annotated data added to the training data. Through experiments, we confirm that the proposed method can effectively select training data candidates from unlabeled image sequences, and that it can improve the performance of the model while reducing the cost of manual annotation.
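The frame-to-frame comparison can be sketched as follows. This is only an illustration under stated assumptions, not the selection rule used in the paper: each pose is a list of 2D keypoints, pose difference is the mean Euclidean keypoint distance, and a frame is selected when its pose differs from both its previous and its next frame by more than a threshold. The names `pose_distance` and `select_candidates` and the threshold value are hypothetical.

```python
from math import hypot


def pose_distance(pose_a, pose_b):
    """Mean Euclidean distance between corresponding 2D keypoints."""
    if not pose_a or len(pose_a) != len(pose_b):
        raise ValueError("poses must be non-empty and have the same keypoints")
    total = sum(
        hypot(ax - bx, ay - by)
        for (ax, ay), (bx, by) in zip(pose_a, pose_b)
    )
    return total / len(pose_a)


def select_candidates(poses, threshold):
    """Return indices of frames that are candidates for manual annotation.

    Assumption: a pose that jumps away from BOTH neighbouring frames violates
    temporal continuity and is probably a wrong estimate. Only interior frames
    are checked, since the first and last frames have a single neighbour.
    """
    candidates = []
    for t in range(1, len(poses) - 1):
        jump_from_previous = pose_distance(poses[t - 1], poses[t])
        jump_to_next = pose_distance(poses[t], poses[t + 1])
        if jump_from_previous > threshold and jump_to_next > threshold:
            candidates.append(t)
    return candidates


# Five frames with two keypoints each; the estimate in frame 2 is an outlier.
estimated_poses = [
    [(0, 0), (10, 10)],
    [(1, 0), (11, 10)],
    [(50, 50), (60, 60)],
    [(3, 0), (13, 10)],
    [(4, 0), (14, 10)],
]
print(select_candidates(estimated_poses, threshold=10.0))  # [2]
```

Requiring a jump on both sides keeps the neighbours of a bad frame from being selected along with it, so the annotation budget is spent on the frame that is actually inconsistent.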