A significant obstacle to developing high-performance deep learning algorithms for Automated Threat Detection (ATD) in security X-ray imagery is the difficulty of obtaining large training datasets. In our previous work, we circumvented this problem for ATD in cargo containers using Threat Image Projection and data augmentation. In this work, we investigate whether data scarcity in other modalities, such as parcels and baggage, can be ameliorated by transforming data from one domain so that it approximates the appearance of another. We present an ontology of ATD datasets to assess where transfer learning may be applied. We define frameworks for transfer at the training and testing stages, and compare the results for both methods against ATD where a common data source is used for training and testing. Our results show very poor transfer, which we attribute to the difficulty of accurately matching the blur and contrast characteristics of different scanners.
Previously, we investigated the use of Convolutional Neural Networks (CNNs) to detect so-called Small Metallic Threats (SMTs) hidden amongst legitimate goods inside a cargo container. We trained a CNN from scratch on data produced by a Threat Image Projection (TIP) framework that generates images with realistic variation to make performance robust. The system achieved 90% detection of containers that contained a single SMT, while raising 6% false positives on benign containers. The best CNN architecture used the raw high-energy (single-energy) image and its logarithm as input channels. Use of the logarithm improved performance, echoing studies of human operator performance, though it is an unexpected result for CNNs. In this work, we (i) investigate methods to exploit the material information captured in dual-energy images, and (ii) introduce a new CNN training scheme that generates ‘spot-the-difference’ benign and threat pairs on-the-fly. To the best of our knowledge, this is the first time that CNNs have been applied directly to raw dual-energy X-ray imagery in any field. To exploit dual-energy, we experiment with adapting several physics-derived approaches to material discrimination from the cargo literature, and introduce three novel variants. We hypothesise that CNNs can implicitly learn about the material characteristics of objects from the raw dual-energy images, and use this to suppress false positives. The best-performing method detects 95% of containers containing a single SMT, while raising 0.4% false positives on benign containers. This is a step-change improvement in performance over our prior work.
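As a sketch of the kind of input construction described above: stacking the raw high-energy image with its logarithm, and optionally a simple log-ratio channel in the spirit of the physics-derived dual-energy material features from the cargo literature. The function name, the epsilon guard, and the exact ratio definition are our illustrative choices, not the paper's precise channels.

```python
import numpy as np

def make_channels(high, low=None, eps=1e-6):
    """Stack CNN input channels (channels-first): the raw high-energy
    X-ray transmission image and its logarithm. If a low-energy image is
    supplied, append a crude material-sensitive ratio log(low)/log(high),
    one common form of dual-energy feature. Illustrative only.
    """
    h = high.astype(np.float64) + eps  # eps guards against log(0)
    chans = [h, np.log(h)]
    if low is not None:
        l = low.astype(np.float64) + eps
        chans.append(np.log(l) / np.log(h))  # material-sensitive ratio
    return np.stack(chans, axis=0)
```

A transmission value of 1 (no attenuation) makes the ratio channel singular, so in practice such pixels would need masking; this sketch ignores that detail.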
Existing approaches to automated security image analysis focus on the detection of particular classes of threat. However, this mode of inspection is ineffectual when dealing with mature classes of threat, for which adversaries have refined effective concealment techniques. Furthermore, these methods may be unable to detect potential threats that have never been seen before. Therefore, in this paper, we investigate an anomaly detection framework, operating at X-ray image patch level, based on: (i) image representations, and (ii) the detection of anomalies relative to those representations. We present encouraging preliminary results using representations learnt with convolutional neural networks, as well as several contributions to a general-purpose anomaly detection algorithm based on decision-tree learning.
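The "anomalies relative to a representation" idea can be illustrated with a much simpler baseline than the paper's decision-tree method: score each test patch by its distance to the nearest training patches in feature space. All names below are our own, and this is a stand-in sketch, not the authors' algorithm.

```python
import numpy as np

def anomaly_scores(train_feats, test_feats, k=5):
    """Score each test patch representation by its mean Euclidean
    distance to the k nearest training representations; larger score
    means more anomalous. A simple k-NN baseline, illustrative only.
    """
    # Pairwise distances: (n_test, n_train)
    d = np.linalg.norm(test_feats[:, None, :] - train_feats[None, :, :], axis=-1)
    d.sort(axis=1)
    return d[:, :k].mean(axis=1)
```

Patches far from anything seen during training receive high scores, which is the essence of detecting anomalies "relative to" a learnt representation.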
The current infrastructure for non-intrusive inspection of cargo containers cannot accommodate exploding commerce volumes and increasingly stringent regulations. There is a pressing need to develop methods to automate parts of the inspection workflow, enabling expert operators to focus on a manageable number of high-risk images. To tackle this challenge, we developed a modular framework for automated X-ray cargo image inspection. Employing state-of-the-art machine learning approaches, including deep learning, we demonstrate high performance for empty container verification and specific threat detection. This work constitutes a significant step towards the partial automation of X-ray cargo image inspection.
We present a novel framework for describing intensity-based multi-modal similarity measures. Our framework is based on a concept of internal, or self, similarity. First, the locations of multiple regions or patches that are "similar" to each other are identified within a single image; the term "similar" is used here to represent a generic intra-modal similarity measure. If we then examine a second image, registered to the first, at the same locations, we should find that the patches at these locations are also "similar" to each other, even though the actual features in the patches may be very different between the two images. We propose that a measure based on this principle can serve as an inter-modal similarity measure because, as the two images become increasingly misregistered, the patches within the second image become increasingly dissimilar. Our framework therefore yields an inter-modal similarity measure from two intra-modal similarity measures applied separately within each image.
In this paper we describe how popular multi-modal similarity measures, such as mutual information, can be described within this framework. In addition, the framework has the potential to allow the formation of novel similarity measures that register using regional information rather than individual pixel/voxel intensities. An example similarity measure is produced and its ability to guide a registration algorithm is investigated. Registration experiments are carried out using three datasets. The pairs of images to be registered were specifically chosen because they were expected to challenge (i.e. result in misregistrations with) standard intensity-based measures such as mutual information. The images include synthetic, cadaver and clinical data, and cover a range of modalities. Our experiments show that our proposed measure is able to achieve accurate registrations where standard intensity-based measures, such as mutual information, fail.
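Mutual information, the standard intensity-based measure these experiments compare against, is commonly estimated from a joint intensity histogram of the two images. A minimal sketch (bin count is an arbitrary choice here):

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """Histogram-based estimate of mutual information between the
    intensities of two equally sized images. Higher when the joint
    intensity distribution is far from the product of its marginals,
    i.e. when the images are well registered."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()                 # joint probability
    px = pxy.sum(axis=1, keepdims=True)       # marginal of a
    py = pxy.sum(axis=0, keepdims=True)       # marginal of b
    nz = pxy > 0                              # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())
```

As the abstract notes, misregistration decorrelates the intensity pairs, which drives the joint distribution towards the product of its marginals and the estimate towards zero; the failure cases studied are those where this intensity-only signal is insufficient.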
We report progress on an approach (Geometric Texton Theory, GTT) that, like Marr's 'primal sketch', aims to describe image structure in a way that emphasises its qualitative aspects. In both approaches, image description is by labelling points using a vocabulary of feature types, though compared to Marr we aim for a much larger feature vocabulary. We base GTT on the derivative-of-Gaussian (DtG) model of V1 measurement. Marr's primal sketch was based on DtG filters of derivative order up to 2nd; for GTT we plan to extend to the physiologically plausible limit of 4th. This is how we will achieve a larger feature vocabulary (we estimate 30-150 feature types) than Marr's 'edge', 'line' and 'blob'. The central requirement of GTT, then, is a procedure for determining the feature vocabulary that will scale up to 4th order. We have previously published feature category systems for 1-D 1st order, 1-D 2nd order, 2-D 1st order and 2-D pure 2nd order. In this paper we present results of GTT as applied to 2-D mixed 1st + 2nd order features. We also review various approaches to defining the feature vocabulary, including ones based on (i) purely geometrical considerations, and (ii) natural image statistics.
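A DtG filter bank up to the 4th order mentioned above can be built from the Hermite-polynomial form of Gaussian derivatives, d^n/dx^n G(x) = (-1)^n He_n(x/σ) G(x) / σ^n, where He_n is the probabilists' Hermite polynomial. A 1-D sketch (support radius and normalisation are our choices, not GTT's):

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval

def dtg_kernel(order, sigma, radius=None):
    """n-th derivative-of-Gaussian (DtG) kernel via the probabilists'
    Hermite polynomial identity. Order 0 is the Gaussian itself."""
    if radius is None:
        radius = int(4 * sigma)  # truncate at ~4 sigma
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    coeffs = np.zeros(order + 1)
    coeffs[order] = 1.0                      # select He_n
    return (-1)**order * hermeval(x / sigma, coeffs) / sigma**order * g

# Orders 0..4: the physiologically plausible limit discussed above.
bank = [dtg_kernel(n, sigma=2.0) for n in range(5)]
```

Filter responses at each image point then form the measurement vector over which a feature vocabulary could be defined.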
See-through augmented reality (AR) systems for image-guided surgery merge volume rendered MRI/CT data directly with the surgeon’s view of the patient during surgery. Research has so far focused on optimizing the technique of aligning and registering the computer-generated anatomical images with the patient’s anatomy during surgery. We have previously developed a registration and calibration method that allows alignment of the virtual and real anatomy to ~1 mm accuracy. Recently we have been investigating the accuracy with which observers can interpret the combined visual information presented with an optical see-through AR system. We found that depth perception of a virtual image presented in stereo below a physical surface was misperceived compared to viewing the target in the absence of a surface. Observers overestimated depth for a target 0–2 cm below the surface and underestimated the depth for all other presentation depths. The perceptual error could be reduced, but not eliminated, when a virtual rendering of the physical surface was displayed simultaneously with the virtual image. The findings suggest that misperception is due either to accommodation conflict between the physical surface and the projected AR image, or the lack of correct occlusion between the virtual and real surfaces.
We present a graph-based approach to the production of hierarchical segmentations of grey-level images. The technique is designed to be as responsive as possible to the structure of the image, but general in the sense that it is independent of the exact edge measure used. We present results using a novel edge measure which combines the first and second derivatives of the image to calculate its phase with respect to scale. We combine this with the gradient to produce a value which reflects both the strength and the stability of candidate edges.
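One plausible reading of a strength-and-stability edge measure, sketched with plain finite differences; the paper's exact definition is not reproduced here, and the phase construction and combination rule below are our assumptions.

```python
import numpy as np

def edge_strength_stability(img, sigma=1.0):
    """Illustrative edge measure: gradient magnitude (strength) weighted
    by a phase term built from the first- and second-derivative responses
    (a stability proxy). Peaks where the gradient is large and the
    second derivative crosses zero, i.e. at well-localised edges."""
    fy, fx = np.gradient(img)        # first derivatives (rows, cols)
    fyy, _ = np.gradient(fy)         # second derivative along rows
    _, fxx = np.gradient(fx)         # second derivative along cols
    grad = np.hypot(fx, fy)                         # edge strength
    phase = np.arctan2(sigma * (fxx + fyy), grad)   # 1st-vs-2nd order phase
    return grad * np.cos(phase)                     # strong, stable edges
```

On a step edge the second derivative vanishes at the edge centre, so the phase term is near zero there and the measure reduces to the gradient magnitude, falling off on either side.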
An efficient scheme for representation of the shape of anatomical and pathological structures is required for intelligent computer interpretation of medical images. We present an approach to the extraction and representation of shape which, unlike previous shape representations, does not require complete boundary descriptions. It is based on the 'Delaunay triangulation' and its dual, the 'Voronoi diagram'. Our method of using this dual leads to both a skeleton description and a boundary description. The basic step in the algorithm is that of deciding whether to treat any pair of neighboring points as adjacent (lying next to each other on the same boundary) or opposite (lying on opposing sides of a skeleton separating two boundaries). The duality of the skeleton and boundary descriptions produced means that the splitting of one object into two separate objects, or the merging of two objects into one, can be easily accomplished.
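The adjacent/opposite decision can be illustrated with a deliberately crude stand-in: triangulate the boundary points and label each Delaunay edge by a length threshold, short edges linking points on the same boundary and long edges spanning the skeleton. The real algorithm's criterion is more sophisticated; `thresh` and the function name are our assumptions.

```python
import numpy as np
from scipy.spatial import Delaunay

def classify_edges(points, thresh):
    """Triangulate boundary points and label each Delaunay edge
    'adjacent' (same boundary) or 'opposite' (spanning the skeleton)
    by comparing its length to a threshold. Illustrative sketch only."""
    tri = Delaunay(points)
    edges = set()
    for simplex in tri.simplices:          # collect unique edges
        for i in range(3):
            a, b = sorted((simplex[i], simplex[(i + 1) % 3]))
            edges.add((a, b))
    labels = {}
    for a, b in edges:
        d = np.linalg.norm(points[a] - points[b])
        labels[(a, b)] = 'adjacent' if d < thresh else 'opposite'
    return labels
```

'Adjacent' edges trace out the boundary description, while the midpoints of 'opposite' edges approximate the skeleton, reflecting the duality the abstract describes.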