The purpose of this work was to estimate bias in measuring the size of spherical and non-spherical lesions by
radiologists using three sizing techniques under a variety of simulated lesion and reconstruction slice thickness
conditions. We designed a reader study in which six radiologists estimated the size of 10 synthetic nodules of various
sizes, shapes and densities embedded within a realistic anthropomorphic thorax phantom from CT scan data. In this
manuscript we report preliminary results for the first four readers (Readers 1-4). Two repeat CT scans of the phantom
containing each nodule were acquired using a Philips 16-slice scanner at slice thicknesses of 0.8 and 5 mm. The readers
measured the sizes of all nodules in each of the 40 resulting scans (10 nodules × 2 slice thicknesses × 2 repeat scans)
using three sizing techniques (1D longest in-slice dimension; 2D area from longest in-slice dimension and corresponding
longest perpendicular dimension; 3D semi-automated volume) in each of 2 reading sessions. The normalized size was
estimated for each sizing method and an inter-comparison of bias among methods was performed. The overall relative
biases (standard deviations) of the 1D, 2D, and 3D methods for the four-reader subset (Readers 1-4) were -13.4 (20.3),
-15.3 (28.4), and 4.8 (21.2) percentage points, respectively. The relative bias of the 3D volume sizing method was
statistically significantly lower than that of either the 1D or 2D method (p<0.001 for both 1D vs. 3D and 2D vs. 3D).
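The relative bias reported above is, in essence, the mean percent error of the measured sizes against the known nodule sizes. A minimal sketch of that computation, using hypothetical numbers (the sizes below are illustrative, not data from the study):

```python
def relative_bias_pct(measured, true):
    """Relative bias in percentage points: the mean of
    100 * (measured - true) / true over all measurements."""
    errors = [100.0 * (m - t) / t for m, t in zip(measured, true)]
    return sum(errors) / len(errors)

# Hypothetical example: three nodules with known diameters (mm)
true_sizes = [10.0, 10.0, 20.0]
one_d_estimates = [9.0, 8.0, 17.0]  # e.g., 1D longest in-slice dimension
print(relative_bias_pct(one_d_estimates, true_sizes))  # negative = undersizing
```

A negative value indicates systematic undersizing, matching the sign convention of the -13.4 and -15.3 percentage-point figures quoted for the 1D and 2D methods.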
We have conducted an evaluation comparing the interpretability potential of two standardized HD video formats,
1080p30 and 720p60. Although no motion imagery (MI) quality scale akin to NIIRS yet exists, we have drawn on
previous work on MI scale development to measure the critical imagery parameters affecting
interpretability. We developed a collection of MI clips that covers a wide parameter range. These well-characterized
clips provide the basis for relating perceived imagery interpretability to MI parameters, including resolution (related to
ground sample distance, GSD) and frame rate, and to target parameters such as motion and scene complexity. This report
presents key findings about the impact of resolution and frame rate on interpretability. Neither format is uniformly
preferred, but the analysis quantifies the interpretability difference between the formats and finds there are significant
effects of target motion and target size on the format preferences of the imagery analysts. The findings have implications
for sensor system design, systems architecture, and mission planning.
The motion imagery community would benefit from standard measures for assessing image interpretability. The National Imagery Interpretability Rating Scale (NIIRS) has served as a community standard for still imagery, but no comparable scale exists for motion imagery. Several considerations unique to motion imagery indicate that the standard methodology employed in the past for NIIRS development may not be applicable or, at a minimum, requires modifications. The dynamic nature of motion imagery introduces a number of factors that do not affect the perceived interpretability of still imagery—namely target motion and camera motion. We conducted a series of evaluations to understand and quantify the effects of critical factors. This paper presents key findings about the relationship of perceived interpretability to ground sample distance, target motion, camera motion, and frame rate. Based on these findings, we modified the scale development methodology and validated the approach. The methodology adapts the standard NIIRS development procedures to the softcopy exploitation environment and focuses on image interpretation tasks that target the dynamic nature of motion imagery. This paper describes the proposed methodology, presents the findings from a methodology assessment evaluation, and offers recommendations for the full development of a scale for motion imagery.
Motion imagery will play a critical role in future intelligence and military missions. The ability to provide a real time, dynamic view and persistent surveillance makes motion imagery a valuable source of information. The ability to collect, process, transmit, and exploit this rich source of information depends on the sensor capabilities, the available communications channels, and the availability of suitable exploitation tools. While sensor technology has progressed dramatically and various exploitation tools exist or are under development, the bandwidth required for transmitting motion imagery data remains a significant challenge. This paper presents a user-oriented evaluation of several methods for compression of motion imagery. We explore various codecs and bitrates for both inter- and intra-frame encoding. The analysis quantifies the effects of compression in terms of the interpretability of motion imagery, i.e., the ability of imagery analysts to perform common image exploitation tasks. The findings have implications for sensor system design, systems architecture, and mission planning.
A fundamental problem in image processing is finding objective metrics that parallel human perception of image
quality. In this study, several metrics were examined to quantify compression algorithms in terms of perceived loss
of image quality. In addition, we sought to describe the relationship of image quality as a function of bit rate. The
compression schemes used were JPEG 2000, MPEG-2, and H.264. The frame size was fixed at 848×480, and the
encoding bit rate varied from 6000 kbps down to 200 kbps. The metrics examined were peak signal-to-noise ratio (PSNR),
structural similarity (SSIM), edge localization metrics, and a blur metric. To varying degrees, the metrics displayed
desirable properties, namely they were monotonic in the bit rate, the group of pictures (GOP) structure could be
inferred, and they tended to agree with human perception of quality degradations. Additional work is being
conducted to quantify the sensitivity of these measures with respect to our Motion Imagery Quality Scale.
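Of the metrics listed, PSNR has the simplest closed form. A minimal sketch, with frames flattened to lists of pixel values (the sample values are illustrative):

```python
import math

def psnr(original, degraded, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two equal-size
    pixel sequences: 10 * log10(max_val^2 / MSE)."""
    mse = sum((a - b) ** 2 for a, b in zip(original, degraded)) / len(original)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

ref = [100, 110, 120, 130]
test = [101, 109, 121, 129]  # every pixel off by 1, so MSE = 1
print(round(psnr(ref, test), 2))
```

SSIM, by contrast, compares local luminance, contrast, and structure statistics rather than raw pixel error, which is one reason it tracks perceived quality more closely than PSNR at a given bit rate.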
The motion imagery community would benefit from the availability of standard measures for assessing image interpretability. The National Imagery Interpretability Rating Scale (NIIRS) has served as a community standard for still imagery, but no comparable scale exists for motion imagery. Previous studies have explored the factors affecting the perceived interpretability of motion imagery and the ability to perform various image exploitation tasks. More recently, a study demonstrated an approach for adapting the standard NIIRS development methodology to motion imagery. This paper presents the first step in implementing this methodology, namely the construction of the perceived interpretability continuum for motion imagery. We conducted an evaluation in which imagery analysts rated the interpretability of a large number of motion imagery clips. Analysis of these ratings indicates that analysts rate the imagery consistently, perceived interpretability is unidimensional, and that interpretability varies linearly with log(GSD). This paper presents the design of the evaluation, the analysis and findings, and implications for scale development.
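The reported finding that interpretability varies linearly with log(GSD) amounts to an ordinary least-squares fit of ratings against log-transformed GSD. A minimal sketch with hypothetical data (the GSD values and ratings below are illustrative, not from the evaluation):

```python
import math

def fit_log_gsd(gsd, ratings):
    """Ordinary least-squares fit of rating ~ a + b * log(gsd).
    Returns the intercept a and slope b."""
    xs = [math.log(g) for g in gsd]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ratings) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ratings))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

# Hypothetical: each doubling of GSD costs one rating point
gsd = [0.25, 0.5, 1.0, 2.0]
ratings = [8.0, 7.0, 6.0, 5.0]
a, b = fit_log_gsd(gsd, ratings)
print(round(a, 3), round(b, 3))
```

A negative slope b captures the expected behavior: interpretability falls as GSD (coarser resolution) grows.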
The resources for evaluation of moving imagery coding include a variety of subjective and objective methods for quality
measurement. These are applied to a variety of imagery, ranging from synthetically-generated to live capture. NIST has
created a family of synthetic motion imagery (MI) materials providing image elements such as moving spirals, blocks,
text, and spinning wheels. Through the addition of a colored noise background, the materials support the generation of
graded levels of MI coding impairments such as image blocking and mosquito noise, impairments that are found in
imagery coded with Moving Picture Experts Group (MPEG) and similar codecs. For typical available synthetic imagery,
human viewers respond unfavorably to repeated viewings; so in this case, the use of objective (computed) metrics for
evaluation of quality is preferred. Three such quality metrics are described: a standard peak-signal-to-noise measure, a
new metric of edge-blurring, and another of added-edge-energy. As applied to the NIST synthetic clips, the metrics
confirm an approximate doubling [1] of compression efficiency between two commercial codecs, one an implementation
of AVC/H.264 and the other of MPEG-2.
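The exact definitions of the edge-blurring and added-edge-energy metrics are not given in this abstract. The sketch below shows one plausible form of an added-edge-energy measure, using summed squared first differences as the edge-energy proxy; this is an assumption for illustration, not the NIST implementation:

```python
def edge_energy(img):
    """Sum of squared horizontal and vertical first differences --
    a simple proxy for edge energy in a 2D luminance array."""
    h, w = len(img), len(img[0])
    e = 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                e += (img[y][x + 1] - img[y][x]) ** 2
            if y + 1 < h:
                e += (img[y + 1][x] - img[y][x]) ** 2
    return e

def added_edge_energy(original, coded):
    """Edge energy the codec added relative to the source; positive
    values suggest ringing or mosquito-noise-like artifacts."""
    return edge_energy(coded) - edge_energy(original)

flat = [[50, 50], [50, 50]]       # uniform patch: no edges
ringing = [[50, 60], [60, 50]]    # hypothetical coded patch with artifacts
print(added_edge_energy(flat, ringing))
```

A metric of this shape is signed: blurring (lost edges) would drive it negative, while added ringing drives it positive, which is why blurring and added-edge-energy are treated as complementary measures.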
The motion imagery community would benefit from the availability of standard measures for assessing image interpretability. The National Imagery Interpretability Rating Scale (NIIRS) has served as a community standard for still imagery, but no comparable scale exists for motion imagery. Several considerations unique to motion imagery indicate that the standard methodology employed in the past for NIIRS development may not be applicable or, at a minimum, requires modifications. The dynamic nature of motion imagery introduces a number of factors that do not affect the perceived interpretability of still imagery - namely target motion and camera motion. A set of studies sponsored by the National Geospatial-Intelligence Agency (NGA) have been conducted to understand and quantify the effects of critical factors. This study discusses the development and validation of a methodology that has been proposed for the development of a NIIRS-like scale for motion imagery. The methodology adapts the standard NIIRS development procedures to the softcopy exploitation environment and focuses on image interpretation tasks that target the dynamic nature of motion imagery. This paper describes the proposed methodology, presents the findings from a methodology assessment evaluation, and offers recommendations for the full development of a scale for motion imagery.
The development of a motion imagery (MI) quality scale, akin to the National Imagery Interpretability Rating Scale (NIIRS) for still imagery, would have great value to designers and users of surveillance and other MI systems. A multiphase study has adopted a perceptual approach to identifying the main MI attributes that affect interpretability. The current perceptual study measured frame rate effects for simple motion imagery interpretation tasks of detecting and identifying a known target. By using synthetic imagery, we had full control of the contrast and speed of moving objects, motion complexity, the number of confusers, and the noise structure. To explore the detectability threshold, the contrast between the darker moving objects and the background was set at 5%, 2%, and 1%. Nine viewers were asked to detect or identify a moving synthetic "bug" in each of 288 10-second clips. We found that frame rate, contrast, and confusers had a statistically significant effect on image interpretability (at the 95% level), while the speed and background showed no significant effect. Generally, there was a significant loss in correct detection and identification for frame rates below 10 frames/s. Increasing the contrast improved detection, and at high contrast, confusers did not affect detection. Confusers reduced detection of higher-speed objects. Higher speed improved detection but complicated identification, although this effect was small. Higher speed made detection harder at 1 frame/s but improved detection at 30 frames/s. The low loss of quality at moderately lower frame rates may have implications for bandwidth-limited systems. A study is underway to confirm, with live-action imagery, the results reported here with synthetic imagery.
The motion imagery community would benefit from the availability of standard measures for assessing image interpretability. The National Imagery Interpretability Rating Scale (NIIRS) has served as a community standard for still imagery, but no comparable scale exists for motion imagery. Several considerations unique to motion imagery indicate that the standard methodology employed in the past for NIIRS development may not be applicable or, at a minimum, requires modifications. Traditional methods for NIIRS development rely on a close linkage between perceived image quality, as captured by specific image interpretation tasks, and the sensor parameters associated with image acquisition. The dynamic nature of motion imagery suggests that this type of linkage may not exist or may be modulated by other factors. An initial study was conducted to understand the effects that target motion, camera motion, and scene complexity have on the perceived interpretability of motion imagery. This paper summarizes the findings from this evaluation. In addition, several issues emerged that require further investigation:
- The effect of frame rate on the perceived interpretability of motion imagery
- Interactions between color and target motion which could affect perceived interpretability
- The relationships among resolution, viewing geometry, and image interpretability
- The ability of an analyst to satisfy specific image exploitation tasks relative to different types of motion imagery clips
Plans are being developed to address each of these issues through direct evaluations. This paper discusses each of these concerns, presents the plans for evaluations, and explores the implications for development of a motion imagery quality metric.
The development of new video processing, new displays, and new modes of dissemination and usage enables a variety of moving picture applications intended for mobile and desktop devices as well as the more conventional platforms. These applications include multimedia as well as traditional video and require novel lighting environments and bit rates previously unplumbed in Moving Picture Experts Group (MPEG) video compression. The migration to new environments poses a methodological challenge to testers of video quality. Both the viewing environment and the display characteristics differ dramatically from those used in well-established subjective testing methods for television. The MPEG Test Committee has adapted the television-centric methodology to the new testing environments. The adaptations that are examined here include: (1) The display of progressive scan pictures in the Common Intermediate Format (CIF at 352x288 pixel/frame) and Quarter CIF (QCIF at 176x144 pixel/frame) as well as other, larger moving pictures requires new ways of testing the subjects including different viewing distances and altered ambient lighting. (2) The advent of new varieties of display technologies suggests there is a need for methods of characterizing them to assure the results of the testing do not depend strongly on the display. (3) The use of non-parametric statistical tests in test data analysis. In MPEG testing these appear to provide rigorous confidence statements more in line with testing experience than those provided by classical parametric tests. These issues have been addressed in a recent MPEG subjective test. Some of the test results are reviewed; they suggest that these adaptations of long-established subjective testing methodology for TV are capable of providing practical and reliable measures of subjective video quality for a new generation of technology.
The proponents of digital cinema seek picture quality exceeding that of the best film-based presentation. Quantifying the performance of systems for the presentation of high quality imagery presents several challenges. One is that the dynamic range and the resolution may not be simply related to the nominal characteristics of bit-depth and pixel counts. We review some of the measurement methods that have been applied to determining these characteristics. One of the
presumed advantages of high bit depth systems is to reduce the visibility of image banding. Non-uniformity of the display can be compensated in test pattern design to enable the measurement of banding contrast. The subjective assessment of banding is compared to a contrast-weighted model of just noticeable image differences. Applied to a class of image banding test patterns, the metric relates dynamic range to contouring. The model produces an estimate of the visibility threshold for image contouring in a 10-bit system, superior to a simple Weber model. These measurement issues will
continue to be challenges as d-cinema systems improve.
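The link between bit depth and banding visibility can be illustrated with a back-of-envelope Weber-contrast calculation for a single code-value step. This assumes luminance proportional to code value, a deliberate simplification that ignores the display EOTF/gamma:

```python
def step_contrast(code_value):
    """Weber contrast (delta L / L) of a one-code-value step, assuming
    luminance proportional to code value -- an illustrative
    simplification; real displays apply a nonlinear EOTF."""
    return 1.0 / code_value

# One step at mid-gray: 8-bit system (level 128) vs 10-bit (level 512)
print(round(100 * step_contrast(128), 2),  # percent contrast, 8-bit
      round(100 * step_contrast(512), 2))  # percent contrast, 10-bit
```

Under this simplification, each added bit quarters nothing but halves the step contrast, which is why a 10-bit system pushes contouring much closer to (or below) the visibility threshold than an 8-bit system.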
Mosquito noise is a time-dependent video compression impairment in which the high-frequency spatial detail in video images having crisp edges is aliased intermittently. A new synthetic test pattern of moving spirals or circles is described which generates mosquito noise (MN) under Moving Picture Experts Group (MPEG) compression. The spiral pattern is one of several NIST-developed patterns designed to stress specific features of compression based on motion estimation and quantization. The 'Spirals' pattern has several spirals or circles superimposed on a uniform background. The frames are filtered to avoid interline flicker, which may be confounded with MN. Motion of the spirals and changing luminance of the background can be included to reduce the correlation between successive frames. Unexpectedly, even a static pattern of spirals can induce mosquito noise due to the stochastic character of the encoder. We consider metrics which are specific to the impairment being measured. For mosquito noise, we examine two separable detectors: each consists of a temporal (frame-to-frame) computation applied to the output of a spatial impairment detector which is applied to each frame. The two spatial detectors are: FLATS, which detects level (near-constant) 8×8-pixel image blocks; and the root-mean-square (RMS) applied to the image differences between original and compressed frames. The test patterns are encoded at low bit rates. We examine the measured mosquito noise as a function of the Group-of-Pictures (GOP) pattern in the MPEG-2 encoding and find that the GOP structure defines the periodicities of the MN.
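The separable detector described above (a temporal computation applied to per-frame spatial scores) can be sketched as follows, using the RMS spatial detector. The function names and data are illustrative, not from the NIST implementation:

```python
def spatial_rms(orig_frame, coded_frame):
    """Spatial impairment detector: RMS of pixel differences between
    an original frame and its compressed version (frames flattened)."""
    n = len(orig_frame)
    return (sum((a - b) ** 2 for a, b in zip(orig_frame, coded_frame)) / n) ** 0.5

def mosquito_noise(spatial_scores):
    """Temporal stage: mean absolute frame-to-frame change in the
    per-frame spatial impairment score. Steady impairment scores
    contribute nothing; only fluctuating (mosquito-like) error does."""
    diffs = [abs(b - a) for a, b in zip(spatial_scores, spatial_scores[1:])]
    return sum(diffs) / len(diffs)

# Hypothetical per-frame RMS scores oscillating with the GOP structure
scores = [1.0, 3.0, 1.0, 3.0, 1.0]
print(mosquito_noise(scores))
```

Because the temporal stage differences consecutive scores, a periodic GOP-driven oscillation in per-frame error shows up directly in the MN measure, consistent with the finding that the GOP structure defines the periodicities of the noise.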
The investigation examines two methodologies by which to control the impairment level of digital video test materials. Such continuous fine-tuning of video impairments is required for psychophysical measurements of human visual sensitivity to picture impairments induced by MPEG-2 compression. Because the visual sensitivity data will be used to calibrate objective and subjective video quality models and scales, the stimuli must contain realistic representations of actual encoder-induced video impairments. That is, both the visual and objective spatio-temporal response to the stimuli must be similar to the response to impairments induced directly by an encoder. The first method builds a regression model of the Peak Signal-To-Noise Ratio (PSNR) of the output sequence as a function of the bit rate specification used to encode a given video clip. The experiments find that for any source sequence, a polynomial function can be defined by which to predict the encoder bit rate that will yield a sequence having any targeted PSNR level. In a second method, MPEG-2-processed sequences are linearly combined with their unprocessed video sources. Linear regression is used to relate PSNR to the weighting factors used in combining the source and processed sequences. Then the 'synthetically' adjusted impairments are compared to those created via an encoder. Visual comparison is made between corresponding I-, B-, and P-frames of the synthetically generated sequences and those processed by the codec. Also, PSNR comparisons are made between various combinations of source sequence, the MPEG-2 sequence used for mixing, the mixed sequence, and the codec-processed sequence. Both methods are found to support precision adjustment of impairment level adequate for visual threshold measurement. The authors caution that some realism may be lost when using the weighted summation method with highly compression-impaired video.
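The second method's weighted summation can be sketched as a per-pixel linear blend of the source and processed sequences; the weight then becomes the continuous impairment-level control. Names and values below are illustrative:

```python
def mix(source, processed, w):
    """Linearly combine an unprocessed source frame with its
    MPEG-2-processed version: w = 0 gives the clean source,
    w = 1 the fully impaired frame (frames flattened)."""
    return [(1.0 - w) * s + w * p for s, p in zip(source, processed)]

src = [100.0, 120.0, 140.0]
proc = [104.0, 112.0, 148.0]   # hypothetical impaired version
half = mix(src, proc, 0.5)     # impairment at half strength
print(half)
```

Sweeping w in small increments yields the continuous fine-tuning of impairment level that threshold measurement requires, with linear regression relating PSNR to w, as described above.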
The present investigation compares performance of two objective video quality metrics in predicting the visual threshold for the detection of blocking impairments associated with MPEG-2 compression. The visibility thresholds for both saturated color and gray-scale targets are measured. The test material consists of image sequences in which either saturated color or gray-scale targets exhibiting blocking are varied in luminance contrast from -44 dB to -5 dB against a constant gray background. Stimulus presentation is by the 'method of limits' under International Telecommunications Union Rec. 500 conditions. Results find the detection of blocking impairments at Michelson contrast levels between -28 dB and -33 dB. This result is consistent with values reported by other investigators for luminance contrast detection thresholds. A small, but statistically significant, difference is found between the detection threshold of saturated color patterns versus luma-only images. The results suggest, however, that blocking impairment detection is controlled mainly by display luminance. Two objective metrics are applied to gray-scale image sequences, yielding measures of perceptible image blocking for each frame. A relatively simple blocking detector and a more elaborate discrete cosine transform error metric correlate well over the contrast range examined. Also, the two measures correlate highly with measured image contrast. Both objective metrics agree closely with visual threshold measurements, yielding threshold predictions of approximately -29 dB.
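The dB contrast figures quoted above are Michelson contrast expressed logarithmically. A minimal sketch of the conversion, with hypothetical target and background luminances chosen to land near the reported threshold range:

```python
import math

def michelson_db(l_max, l_min):
    """Michelson contrast (Lmax - Lmin) / (Lmax + Lmin),
    expressed in dB as 20 * log10(contrast)."""
    c = (l_max - l_min) / (l_max + l_min)
    return 20.0 * math.log10(c)

# Hypothetical luminances (cd/m^2) of target vs. background near threshold
print(round(michelson_db(101.0, 95.0), 1))
```

Contrasts around -30 dB correspond to Michelson contrasts of roughly 3%, which gives a physical sense of how subtle the just-detectable blocking impairments were.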
Lossy video compression systems such as MPEG-2 introduce picture impairments such as image blocking, color distortion and persistent color fragments, 'mosquito noise,' and blurring in their outputs. While there are video test clips which exhibit one or more of these distortions upon coding, there is a need for a set of well-characterized test patterns and video quality metrics. Digital test patterns can deliver calibrated stresses to specific features of the encoder, much as the test patterns for analog video stress critical characteristics of that system. Metrics quantify the error effects of compression by a computation. NIST is developing such test patterns and metrics for compression rates that typically introduce perceptually negligible artifacts, i.e., for high quality video. The test patterns are designed for subjective and objective evaluation. The test patterns include a family of computer-generated spinning wheels to stress luminance-based macro-block motion estimation algorithms and images with strongly directional high-frequency content to stress quantization algorithms. In this paper we discuss the spinning wheel test pattern. It has been encoded at a variety of bit rates near the threshold for the perception of impairments. We have observed that impairment perceptibility depends on the local contrast. For the spinning wheel we report the contrast at the threshold for perception of impairments as a function of the bit rate. To quantify perceptual image blocking we have developed a metric which detects 'flats': image blocks of constant (or near constant) luminance. The effectiveness of this metric is appraised.
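The 'flats' metric as described (counting image blocks of constant or near-constant luminance) can be sketched as a scan over non-overlapping blocks with a range test. Block size and tolerance parameters here are illustrative assumptions, not values from the paper:

```python
def flats(frame, block=8, tol=0):
    """Count 'flat' block x block pixel regions: blocks whose
    luminance range (max - min) is <= tol. High counts in coded
    frames suggest perceptible image blocking."""
    h, w = len(frame), len(frame[0])
    count = 0
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            vals = [frame[y][x]
                    for y in range(by, by + block)
                    for x in range(bx, bx + block)]
            if max(vals) - min(vals) <= tol:
                count += 1
    return count

# Hypothetical 8x16 frame: left 8x8 block constant, right block a gradient
frame = [[0] * 8 + list(range(8)) for _ in range(8)]
print(flats(frame))
```

Comparing the flats count of a coded frame against that of its source separates blocks that were genuinely uniform in the original from flats introduced by quantization.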
The National Institute of Standards and Technology (NIST) has initiated a new program on performance measurements for flat panel displays. Prior to this program, NIST completed an assessment of industry needs for measurements and standards to assist in the development of high-resolution displays. As a result of this study, a new laboratory has been established to characterize the electrical and optical performance of flat panel displays. The services of the laboratory will be available to commercial panel manufacturers and users. NIST, as a neutral third party, intends to provide technical assistance in the development of standards and measurement practices for flat panel display characterization.