Although various studies have examined the factors that cause visual discomfort when watching stereoscopic 3D video, the brightness factor has not been dealt with sufficiently. In this paper, we analyze visual discomfort under various illumination conditions by considering eye-blinking rate and saccadic eye movement. In addition, we measure perceived depth before and after watching stereoscopic 3D video using our own 3D depth measurement instruments. Our test sequences consist of six background illumination conditions: the background illumination changes from bright to dark or vice versa, while the illumination of the foreground object remains constant. Our test procedure is as follows. First, the subjects rest until a baseline of no visual discomfort is established. Then, the subjects answer six questions to establish their subjective pre-stimulus discomfort level. Next, we measure perceived depth for each subject, and the subjects watch 30-minute stereoscopic 3D or 2D video clips in random order. During viewing, we measure the eye-blinking and saccadic movements of each subject with an eye-tracking device. Afterwards, we measure perceived depth again to detect any changes in depth perception, check the subject's post-stimulus discomfort level, and measure perceived depth once more after a 40-minute post-experiment rest to assess recovery. After 40 minutes, most subjects returned to normal levels of depth perception. From our experiments, we found that eye-blinking rates were higher with a dark-to-light video progression than vice versa, while saccadic eye movements were lower with a dark-to-light progression than vice versa.
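The two eye-movement measures above can be derived from raw eye-tracker samples. Below is a minimal sketch (not the authors' code) of how blink rate and saccade counts might be computed; the sample format, sampling rate, and velocity threshold are assumptions for illustration.

```python
# Hypothetical sketch: blink rate and saccade count from eye-tracker samples.
# Sampling rate and thresholds are assumed values, not from the paper.
import math

SAMPLE_RATE_HZ = 60          # assumed tracker sampling rate
SACCADE_DEG_PER_S = 30.0     # assumed velocity threshold for a saccade

def blink_rate(pupil_valid, sample_rate=SAMPLE_RATE_HZ):
    """Blinks per minute: count runs of invalid (closed-eye) samples."""
    blinks, in_blink = 0, False
    for valid in pupil_valid:
        if not valid and not in_blink:
            blinks += 1
            in_blink = True
        elif valid:
            in_blink = False
    minutes = len(pupil_valid) / sample_rate / 60.0
    return blinks / minutes if minutes else 0.0

def saccade_count(gaze_deg, sample_rate=SAMPLE_RATE_HZ,
                  threshold=SACCADE_DEG_PER_S):
    """Count velocity-threshold crossings in (x, y) gaze angles (degrees)."""
    count, moving = 0, False
    for (x0, y0), (x1, y1) in zip(gaze_deg, gaze_deg[1:]):
        velocity = math.hypot(x1 - x0, y1 - y0) * sample_rate
        if velocity > threshold and not moving:
            count += 1
            moving = True
        elif velocity <= threshold:
            moving = False
    return count
```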
Visual discomfort when watching stereoscopic 3D content is caused by various factors. In particular, brightness change is known to be one of the major factors related to visual discomfort. However, most previous research on visual discomfort has dealt with binocular disparity as it relates to the accommodation-vergence linkage. In this paper, we analyze visual discomfort caused by brightness change using eye movements and a subjective test. Eye movements are computed from eye pupil motion detected in a near-infrared eye image. We measure eye-blinking and pupil size while subjects watch stereoscopic 3D videos with global and local brightness variations. The results show that viewers felt more visual discomfort with local brightness changes than with global changes in a scene.
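As a rough illustration of the pupil-based measurement, the sketch below estimates pupil size and position from a near-infrared eye image by dark-pixel thresholding. This is an assumed approach, not the paper's method; the threshold value and grayscale input format are illustrative.

```python
# Illustrative sketch (assumption, not the paper's method): estimate pupil
# size and center from a near-infrared eye image via dark-pixel thresholding.
import numpy as np

def pupil_size_and_center(eye_image, threshold=40):
    """Return (area_in_pixels, (row, col) centroid) of the dark pupil region.

    eye_image: 2-D uint8 grayscale array. In dark-pupil NIR imaging the
    pupil is the darkest region, so a low intensity threshold isolates it.
    """
    mask = eye_image < threshold          # candidate pupil pixels
    area = int(mask.sum())                # pupil size in pixels
    if area == 0:
        return 0, None                    # eye closed (blink) or lost
    rows, cols = np.nonzero(mask)
    center = (rows.mean(), cols.mean())   # centroid approximates pupil center
    return area, center
```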
KEYWORDS: Video, Computer simulations, Cameras, Semantic video, Visualization, Systems modeling, Data modeling, Algorithm development, Detection and tracking algorithms, Information visualization
Scene boundary detection is important for the semantic understanding of video data and is usually determined by the coherence between shots. Two approaches have been proposed to measure coherence: a discrete approach and a continuous approach. In this paper, we adopt the continuous approach and propose modifications to the causal First-In-First-Out (FIFO) short-term memory-based model. One modification is a dynamic memory size, so that coherence can be computed reliably regardless of the size of each shot. Another is that some shots can be removed from the memory buffer outside the FIFO rule; these are shots with no or only small foreground objects. Using this model, we detect scene boundaries by computing shot coherence. In computing coherence, we add a new term, the number of intermediate shots between the two shots being compared, because the effect of intermediate shots is important in modeling shot recall. We also consider shot activity, which is important for reflecting human perception. We evaluate our model on videos of different genres and obtain reasonable results.
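The following is a minimal sketch of a FIFO short-term memory coherence model of this kind. The similarity function, fixed memory size, recall attenuation, and threshold are all assumptions for illustration (the paper's model uses a dynamic memory size and additional terms).

```python
# Minimal sketch, assuming a user-supplied shot-similarity function.
# Constants and the 1/gap recall attenuation are illustrative choices.
from collections import deque

MEMORY_SIZE = 8        # assumed fixed size (dynamic in the paper)
BOUNDARY_THRESH = 0.3  # assumed coherence threshold

def coherence(shot, memory, similarity):
    """Max similarity to remembered shots, attenuated by the number of
    intermediate shots (the 'recall' effect described in the abstract)."""
    best = 0.0
    for gap, past in enumerate(reversed(memory), start=1):
        recall = 1.0 / gap                 # assumed attenuation with distance
        best = max(best, recall * similarity(shot, past))
    return best

def scene_boundaries(shots, similarity):
    memory = deque(maxlen=MEMORY_SIZE)     # FIFO short-term memory buffer
    boundaries = []
    for i, shot in enumerate(shots):
        if memory and coherence(shot, memory, similarity) < BOUNDARY_THRESH:
            boundaries.append(i)           # low coherence => new scene starts
        memory.append(shot)
    return boundaries
```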
In this paper, we propose a highlight generation method using contextual information and perception. The proposed method consists of three steps. First, a long video is segmented into shots, each produced by an uninterrupted camera operation. Second, contextual information is computed from the video shots. We divide contextual information into local and global parts: the local contextual information of a shot is represented by foreground information, shot activity, and background information, while the global contextual information of a shot is represented by its interaction and coherence with other shots. Based on this contextual information, story unit boundaries are detected. For each story unit, we determine meaningful shot candidates by computing shot length, shot activity, contrast value, and foreground object size. Finally, the meaningful shots are selected from the candidates by applying a perceptual grouping rule inversely. By concatenating the selected shots, video highlights are generated.
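A minimal sketch of the candidate-scoring and selection steps follows. The feature names, equal weights, and top-k selection rule are assumptions, since the abstract does not specify how the four cues are combined.

```python
# Hypothetical sketch: score shots on the four cues named in the abstract
# and pick the top shots per story unit. Weights and fields are assumed.

def shot_score(shot, w_len=0.25, w_act=0.25, w_con=0.25, w_fg=0.25):
    """Weighted sum of shot length, activity, contrast, and foreground
    object size; each feature is assumed pre-normalized to [0, 1]."""
    return (w_len * shot["length"] + w_act * shot["activity"]
            + w_con * shot["contrast"] + w_fg * shot["fg_size"])

def highlight(story_units, per_unit=2):
    """Pick the top-scoring shots from each story unit, then restore
    temporal order so the concatenated highlight plays coherently."""
    selected = []
    for unit in story_units:
        ranked = sorted(unit, key=shot_score, reverse=True)[:per_unit]
        selected.extend(sorted(ranked, key=lambda s: s["start"]))
    return selected
```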