Recently, boundary information has gained more attention in improving the performance of semantic segmentation. This paper presents a novel symmetrical network, called BASNet, which contains four components: a pre-trained ResNet-101 backbone, a semantic segmentation branch (SSB), a boundary detection branch (BDB), and an aggregation module (AM). More specifically, the BDB focuses exclusively on processing boundary-related information using a series of spatial attention blocks (SABs), while a set of global attention blocks (GABs) is used in the SSB to capture more accurate object boundary information and semantic information. Finally, the outputs of the SSB and BDB are fed into the AM, which merges their features to boost performance. Extensive experimental results show that our method not only predicts object boundaries more accurately, but also improves the performance of semantic segmentation.
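The abstract does not specify the internal design of the SABs or GABs; as a rough illustration only, the sketch below shows a generic CBAM-style spatial attention block in PyTorch, in which a spatial weight map is computed from channel-wise average and max statistics. The class name and kernel size are hypothetical and not taken from the paper.

```python
# Hypothetical sketch of a spatial attention block (SAB); the exact design used
# in BASNet is not given in the abstract, so this follows the common
# CBAM-style formulation: a spatial weight map from pooled channel statistics.
import torch
import torch.nn as nn

class SpatialAttentionBlock(nn.Module):
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        # 2-channel input: channel-wise average and max maps
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg_map = x.mean(dim=1, keepdim=True)           # (N, 1, H, W)
        max_map = x.max(dim=1, keepdim=True).values     # (N, 1, H, W)
        weights = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * weights                               # re-weight spatial positions
```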
Recent research in motion detection has shown that various outlier detection methods can be used for efficient detection of small moving targets. These algorithms detect moving objects as outliers in a properly defined attribute space, where an outlier is defined as an object distinct from the objects in its neighborhood. In this paper, we compare the performance of two incremental outlier detection algorithms, namely the incremental connectivity-based outlier factor and the incremental local outlier factor, to a modified Stauffer-Grimson algorithm. Each video sequence is represented by spatial-temporal blocks extracted from the raw video. Principal component analysis (PCA) is applied to these blocks in order to reduce the dimensionality of the extracted data. Extensive experiments performed on several data sets, including infrared sequences from the OSU Thermal Pedestrian Database repository and data collected at Delaware State University with FLIR Systems PTZ cameras, have shown promising results for the use of outlier detection in detecting small moving targets.
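As a rough sketch of this pipeline, the code below extracts spatio-temporal blocks from a grayscale video, reduces them with PCA, and scores them with scikit-learn's batch LocalOutlierFactor as a stand-in for the incremental LOF/COF algorithms compared in the paper; the block size, temporal depth, number of components, and neighborhood size are all assumptions.

```python
# Sketch of the pipeline, assuming 8x8x3 spatio-temporal blocks and using
# scikit-learn's batch LocalOutlierFactor as a stand-in for the incremental
# LOF/COF algorithms compared in the paper.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import LocalOutlierFactor

def extract_blocks(video: np.ndarray, bs: int = 8, depth: int = 3) -> np.ndarray:
    """video: (T, H, W) grayscale frames -> array of flattened spatio-temporal blocks."""
    T, H, W = video.shape
    blocks = [
        video[t:t + depth, y:y + bs, x:x + bs].ravel()
        for t in range(0, T - depth + 1, depth)
        for y in range(0, H - bs + 1, bs)
        for x in range(0, W - bs + 1, bs)
    ]
    return np.asarray(blocks, dtype=np.float32)

video = np.random.rand(30, 64, 64)                  # placeholder for real frames
features = PCA(n_components=10).fit_transform(extract_blocks(video))
labels = LocalOutlierFactor(n_neighbors=20).fit_predict(features)  # -1 marks outlier blocks
```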
KEYWORDS: Shape analysis, Control systems, Medical imaging, Detection and tracking algorithms, Stars, Image analysis, Brain, Medical diagnostics, Magnetic resonance imaging, Information science
In this work, we introduce a new representation technique for 2D contour shapes and a sequence similarity measure to characterize 2D regions of interest in medical images. First, we define a distance function on contour points in order to map the shape of a given contour to a sequence of real numbers. Thus, the computation of shape similarity is reduced to the matching of the obtained sequences. Since both a query and a target sequence may be noisy, i.e., contain some outlier elements, it is desirable to exclude the outliers in order to obtain robust matching performance. For the computation of shape similarity, we propose the use of an algorithm which performs elastic matching of two sequences. The contribution of our approach is that, unlike previous works that require images to be warped according to a template image for measuring their similarity, it obviates this need and can therefore estimate image similarity for any type of medical image in a fast and efficient manner. To demonstrate our method's applicability, we analyzed a brain image dataset consisting of corpus callosum shapes, and we investigated the structural differences between children with chromosome 22q11.2 deletion syndrome and controls. Our findings indicate that our method is quite effective and can easily be applied to medical diagnosis in all cases in which shape difference is an important clue.
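A minimal sketch of this idea is shown below, assuming the distance function is the distance of contour points to the contour centroid, and using plain dynamic time warping as a stand-in for the paper's elastic matching algorithm (which additionally handles outlier elements).

```python
# Sketch: map a 2D contour to a sequence of real numbers (centroid distance,
# an assumed choice of distance function) and compare two sequences with plain
# DTW as a stand-in for the paper's outlier-robust elastic matching.
import numpy as np

def contour_to_sequence(contour: np.ndarray) -> np.ndarray:
    """contour: (N, 2) ordered boundary points -> sequence of real numbers."""
    centroid = contour.mean(axis=0)
    return np.linalg.norm(contour - centroid, axis=1)

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Elastic (dynamic time warping) distance between two 1D sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])
```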
KEYWORDS: Visualization, Object recognition, Image segmentation, Chemical elements, Databases, Visual system, 3D image reconstruction, Visual process modeling, 3D image processing, 3D modeling
We will provide psychophysical evidence that recognition of parts of object contours is a necessary component of object recognition. It seems to be obvious that the recognition of parts of object contours is performed by applying a partial shape similarity measure to the query contour part and to the known contour parts. The recognition is completed once a sufficiently similar contour part is found in the database of known contour parts. We will derive necessary requirements for any partial shape similarity measure based on this scenario. We will show that existing shape similarity measures do not satisfy these requirements, and propose a new partial shape similarity measure.
We propose a novel approach to retrieve similar images from image databases that works in the presence of significant illumination variations. The most common method to compensate for illumination changes is to perform color normalization. Existing approaches to color normalization tend to destroy image content in that they map distinct color values to identical color values in the transformed color space; from the mathematical point of view, the normalization transformation is not reversible. In this paper we propose to use a reversible illumination normalization transformation. Thus, we are able to compensate for illumination changes without any reduction of content information. Since natural illumination changes affect different parts of images by different amounts, we apply our transformation locally to sub-images. The basic idea is to divide an image into sub-images, normalize each one separately, and then project it to an n-dimensional reduced space using principal component analysis. This process yields a normalized texture representation as a set of n-vectors. Finding similar images is then reduced to computing distances between sets of n-vectors. The results were compared with a leading image retrieval system.
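The sketch below illustrates the block-based representation under several assumptions: grayscale images, 16x16 sub-images, simple per-block standardization in place of the paper's reversible illumination normalization, a single PCA basis fitted on sub-images pooled from the database, and a symmetric mean nearest-neighbour distance between the resulting sets of n-vectors.

```python
# Sketch of the block-based representation; per-block standardization is used
# here only as an illustrative normalization, not the paper's reversible transform.
import numpy as np
from sklearn.decomposition import PCA

def block_vectors(image: np.ndarray, bs: int = 16) -> np.ndarray:
    """Split a grayscale image into bs x bs sub-images and normalize each one."""
    H, W = image.shape
    blocks = [
        image[y:y + bs, x:x + bs].astype(np.float64)
        for y in range(0, H - bs + 1, bs)
        for x in range(0, W - bs + 1, bs)
    ]
    normed = [(b - b.mean()) / (b.std() + 1e-8) for b in blocks]  # per-block normalization
    return np.stack([b.ravel() for b in normed])

def fit_projector(database_images, n: int = 8) -> PCA:
    """Fit one PCA basis on normalized sub-images pooled from the whole database."""
    return PCA(n_components=n).fit(np.vstack([block_vectors(im) for im in database_images]))

def image_signature(image: np.ndarray, pca: PCA) -> np.ndarray:
    """Represent an image as the set of n-vectors obtained from its sub-images."""
    return pca.transform(block_vectors(image))

def set_distance(sig_a: np.ndarray, sig_b: np.ndarray) -> float:
    """Symmetric mean nearest-neighbour distance between two sets of n-vectors."""
    d = np.linalg.norm(sig_a[:, None, :] - sig_b[None, :, :], axis=2)
    return float((d.min(axis=1).mean() + d.min(axis=0).mean()) / 2)
```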
Although a tremendous effort has been made to perform reliable analysis of images and videos over the past fifty years, the reality is that one cannot rely 100% on the analysis results. The only exception is applications in controlled environments, as dealt with in machine vision, where closed-world assumptions apply. In general, however, one has to deal with an open world, which means that the content of images may change significantly, and it seems impossible to predict all possible changes. For example, in the context of surveillance videos, the light conditions may suddenly fluctuate in parts of images only, video compression or transmission artifacts may occur, wind may cause a stationary camera to tremble, and so on. The problem is that video analysis has to be performed in order to detect content changes, but such analysis may be unreliable precisely because of those changes, and thus fail to detect them, leading to a "vicious cycle".
The solution pursued in this paper is to monitor the reliability of the computed features by analyzing their general properties. We consider statistical properties of feature value distributions as well as temporal properties. Our main strategy is to estimate the feature properties when the features are reliably computed, so that any set of features that does not have these properties is detected as being unreliable. This way we do not perform any direct content analysis, but instead analyze feature properties related to their reliability.
A 3D binary digital image is said to be well-composed if and only if the set of points in the faces shared by the voxels of foreground and background points of the image is a 2D manifold. Well-composed images enjoy important topological and geometric properties; in particular, there is only one type of connected component in any well-composed image, as 6-, 14-, 18-, and 26-connected components are equal. This implies that several algorithms used in computer vision, computer graphics, and image processing become simpler. For example, thinning algorithms do not suffer from the irreducible thickness problem if the image is well-composed, and the extraction of isosurfaces from well-composed images using the Marching Cubes (MC) algorithm or some of its enhanced variations can be simplified, as only six out of the fourteen canonical cases of cube-isosurface intersection can occur. In this paper, we introduce a new randomized algorithm for making 3D binary digital images that are not well-composed into well-composed ones. We also analyze the complexity and convergence of our algorithm, and present experimental evidence of its effectiveness when faced with practical medical imaging data.
It is obvious that image histograms are of very limited use in video analysis. For example, two images containing the same objects at different positions are mapped to the same histograms.
We show that a simple extension of image histograms to include the position information of the centroids of histograms bins leads to a useful representation for video analysis. This extension must be done carefully in order to obtain a representation that is stable with respect to noise. Moreover, the carefully extended histograms also add stability and reliability to the retrieval of still images.
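A minimal sketch of such an extended histogram is given below: each bin stores its pixel count together with the centroid of the pixel positions falling into that bin. The number of bins and the grayscale, 8-bit assumption are illustrative choices, not taken from the text.

```python
# Minimal sketch of the extension: each histogram bin stores its pixel count
# together with the centroid of the pixel positions falling into that bin.
# Assumes an 8-bit grayscale image and 16 bins (illustrative choices).
import numpy as np

def positional_histogram(image: np.ndarray, bins: int = 16):
    """image: (H, W) grayscale -> list of (count, centroid_y, centroid_x) per bin."""
    H, W = image.shape
    ys, xs = np.mgrid[0:H, 0:W]
    bin_idx = np.clip((image.astype(np.float64) / 256.0 * bins).astype(int), 0, bins - 1)
    features = []
    for b in range(bins):
        mask = bin_idx == b
        count = int(mask.sum())
        if count > 0:
            features.append((count, float(ys[mask].mean()), float(xs[mask].mean())))
        else:
            features.append((0, 0.0, 0.0))
    return features
```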
Recently Latecki and Lakamper (Computer Vision and Image Understanding 73:3, March 1999) reported a novel process for a discrete curve evolution. This process has various application possibilities, in particular, for noise removal and shape simplification of boundary curves in digital images. In this paper we prove that the process of the discrete curve evolution is continuous: if polygon Q is close to polygon P, then the polygons obtained by their evolution remain close. This result follows directly from the fact that the evolution of Q corresponds to the evolution of P if Q approximates P. This intuitively means that first all vertices of Q are deleted that are not close to any vertex of P, and then, whenever a vertex of P is deleted, then a vertex of Q that is close to it is deleted in the corresponding evolution step of Q.
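For illustration, the sketch below implements one common formulation of this evolution, in which the vertex with the least relevance K(v) = beta(v) * l1 * l2 / (l1 + l2) (the turn angle at v weighted by the lengths of its adjacent segments) is removed at each step; the exact relevance measure and normalization used by Latecki and Lakamper may differ.

```python
# Sketch of one interpretation of discrete curve evolution: repeatedly delete
# the vertex with the least relevance (turn angle weighted by adjacent segment
# lengths). The exact measure in the cited paper may use a different normalization.
import numpy as np

def relevance(prev: np.ndarray, v: np.ndarray, nxt: np.ndarray) -> float:
    a, b = v - prev, nxt - v
    l1, l2 = np.linalg.norm(a), np.linalg.norm(b)
    cos_beta = np.clip(np.dot(a, b) / (l1 * l2 + 1e-12), -1.0, 1.0)
    beta = np.arccos(cos_beta)                      # turn angle at v
    return beta * l1 * l2 / (l1 + l2 + 1e-12)

def discrete_curve_evolution(polygon: np.ndarray, target_vertices: int) -> np.ndarray:
    """polygon: (N, 2) closed polygon; iteratively delete the least relevant vertex."""
    pts = polygon.copy()
    while len(pts) > target_vertices:
        n = len(pts)
        scores = [relevance(pts[i - 1], pts[i], pts[(i + 1) % n]) for i in range(n)]
        pts = np.delete(pts, int(np.argmin(scores)), axis=0)
    return pts
```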
A similarity measure for silhouettes of 2D objects is presented, and its properties are analyzed with respect to retrieval of similar objects in an image database. Our measure profits from a novel approach to subdivision of objects into parts of visual form. To compute our similarity measure, we first establish the best possible correspondence of visual parts, which is based on a correspondence of convex boundary arcs. Then the similarity between correspondence arcs is computed and aggregated. We applied our similarity measure to shape matching of object contours in various image databases and compared it to well-known approaches in the literature. The experimental results justify that our shape matching procedure gives an intuitive shape correspondence and is stable with respect to noise distortions.
We introduce a class of planar arcs and curves, called tame arcs, which is general enough to describe the boundaries of planar real objects. A tame arc can have smooth parts as well as sharp corners; thus a polygonal arc is tame. On the other hand, this class of arcs is restrictive enough to rule out pathological arcs which have infinitely many inflections or which turn infinitely often: a tame arc can have only finitely many inflections, and its total absolute turn must be finite. In order to relate boundary properties of discrete objects obtained by segmenting digital images to the corresponding properties of their continuous originals, the theory of tame arcs is based on concepts that can be directly transferred from the continuous to the discrete domain. A tame arc is composed of a finite number of supported arcs. We define supported digital arcs and motivate their definition by the fact that they can be obtained by digitizing continuous supported arcs. Every digital arc is tame, since it contains a finite number of points, and therefore it can be decomposed into a finite number of supported digital arcs.
One of the main tasks of digital image analysis is to recognize the properties of real objects based on their digital images. These images are obtained by some sampling device, like a CCD camera, and are represented as finite sets of points that are assigned some value in a gray-level or color scale. A fundamental question in image understanding is which features in the digital image correspond, under a given set of conditions, to certain properties of the underlying objects. In many practical applications this question is answered empirically by visually inspecting the digital images. In this paper, a mathematically comprehensive answer is presented to this question with respect to topological properties. In particular, conditions are derived relating properties of real objects to the grid size of the sampling device which guarantee that a real object and its digital image are topologically equivalent. Moreover, we prove that a topology preserving digitization must result in well-composed or strongly connected sets. Consequently, only certain local neighborhoods are realizable for such a digitization. Using the derived topological model of a well-composed digital image, we demonstrate the effectiveness of this model with respect to the digitization, thresholding, correction, and compression of digital document images.
A special class of subsets of binary digital 3D pictures called `well-composed pictures' is defined by two simple conditions on a local voxel level. The pictures of this class have very nice topological and geometrical properties; for example, a very natural definition of a continuous analog leads to regular properties of surfaces, a digital version of the 3D separation theorem has a simple proof, and there is only one connectedness relation in a well-composed picture, since 6-, 18-, and 26-connectedness are equivalent. This implies that many algorithms used in computer vision and computer graphics and their descriptions can be simpler, and the algorithms can be faster.
In this paper we present conditions which guarantee that every digitization process preserves important topological and differential geometric properties. These conditions also allow us to determine the correct digitization resolution for a given class of real objects. Knowing that these properties are invariant under digitization, we can then use them in feature-based recognition. Moreover, these conditions imply that only a few digital patterns can occur as neighborhoods of boundary points in the digitization. This is very useful for noise detection, since if the neighborhood of a boundary point does not match one of these patterns, it must be due to noise. Our definition of a digitization approximates many real digitization processes. The digitization process is modeled as a mapping from continuous sets representing real objects to discrete sets represented as digital images. We show that an object A and the digitization of A are homotopy equivalent. This, for example, implies that the digitization of A preserves connectivity of the object and its complement. Moreover, we show that the digitization of A will not change the qualitative differential geometric properties of the boundary of A, i.e. a boundary point which is locally convex cannot be digitized to a locally concave pixel and a boundary point which is locally concave cannot be digitized to a locally convex pixel.
As was noted early in the history of computer vision, using the same adjacency relation for the entire digital picture leads to so-called `paradoxes' related to the Jordan Curve Theorem. The most popular idea to avoid these paradoxes in binary images was using different adjacency relations for the foreground and the background: 8-adjacency for black points and 4-adjacency for white points, or vice versa. This idea cannot be extended in a straightforward way to multicolor pictures. In this paper a solution is presented which guarantees avoidance of the connectivity paradoxes related to the Jordan Curve Theorem for all multicolor pictures. Only one connectedness relation is used for the entire digital picture, i.e., for every component of every color. The idea is not to allow a certain `critical configuration' which can be detected locally to occur in digital pictures; such pictures are called `well-composed.' Well-composed pictures have very nice topological properties. For example, the Jordan Curve Theorem holds and the Euler characteristic is locally computable. This implies that properties of algorithms used in computer vision can be stated and proved in a clear way, and that the algorithms themselves become simpler and faster. Moreover, if a digitization process is guaranteed to preserve topology, then the obtained digital pictures must be well-composed.
Starting with the intuitive concept of `nearness' as a binary relation, semi-proximity spaces (sp-spaces) are defined. The restrictions on semi-proximity spaces are weaker than the restrictions on topological proximity spaces. Nevertheless, semi-proximity spaces generalize classical topological spaces. Moreover, it is possible to describe all digital pictures used in computer vision and computer graphics as non-trivial semi-proximity spaces, which is not possible in classical topology. Therefore, we use semi-proximity spaces to establish a formal relationship between the `topological' concepts of digital image processing and their continuous counterparts in R^n. Especially interesting are continuous functions in semi-proximity spaces. The definition of a `nearly' bicontinuous function is given which does not require the function to be one-to-one. A nearly bicontinuous function preserves connectedness in both directions. Therefore, nearly bicontinuous functions can be used for characterizing well-behaved operations on digital images such as thinning. Further, it is shown that the deletion of a simple point can be treated as a nearly bicontinuous function. These properties and the fact that a variety of nearness relations can be defined on digital pictures indicate that nearly continuous functions are a useful tool in the difficult task of shape description.
In this paper a solution is presented which guarantees we avoid the connectivity paradoxes related to the Jordan Curve Theorem for all multicolor images. Only one connectedness relation is used for the entire digital image. We use only 4-connectedness (which is equivalent to 8-connectedness) for every component of every color. The idea is not to allow a certain `critical configuration' which can be detected locally to occur in digital pictures; such pictures are called `well-composed.' Well-composed images have very nice topological properties. For example, the Jordan Curve Theorem holds and the Euler characteristic is locally computable. This implies that properties of algorithms used in computer vision can be stated and proved in a clear way, and that the algorithms themselves become simpler and faster.
A special class of subsets of binary digital images, called `well-composed sets,' is defined. The sets of this class have very nice topological properties; for example, the Jordan Curve Theorem holds, the Euler characteristic is locally computable, and there is only one connectedness relation, since 4- and 8-connectedness are equivalent. This implies that properties of algorithms used in computer vision can be stated and proved in a clear way, and that the algorithms themselves become simpler and faster.
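Concretely, a 2D binary image is well-composed if and only if it contains no 2x2 "critical configuration" in which exactly the two diagonal pixels belong to the foreground; the sketch below checks this local condition (function name and array conventions are illustrative).

```python
# Sketch of the local check: a binary image is well-composed iff it contains
# no 2x2 critical configuration in which exactly the two diagonal pixels
# belong to the foreground.
import numpy as np

def is_well_composed(image: np.ndarray) -> bool:
    """image: 2D binary array (0 = background, 1 = foreground)."""
    a = image[:-1, :-1]   # top-left of each 2x2 block
    b = image[:-1, 1:]    # top-right
    c = image[1:, :-1]    # bottom-left
    d = image[1:, 1:]     # bottom-right
    diag1 = (a == 1) & (d == 1) & (b == 0) & (c == 0)
    diag2 = (b == 1) & (c == 1) & (a == 0) & (d == 0)
    return not (diag1.any() or diag2.any())
```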
Shape recognition plays a central role in object recognition and while a large number of technical papers on shape representation and similarity exist, shape recognition still remains an unsolved problem. The main goal of this course is to present results on human shape perception, including 2D as well as 3D shape representation and similarity. This course will provide needed background knowledge about important features of shape recognition from the point of view of human visual perception. It will include a tutorial about human shape perception with an emphasis on the most important psychophysical experiments on shape recognition and reconstruction that have been performed during the last 100 years, as well as on computational models of human shape perception. The course will also include an overview of computational approaches to shape similarity and to shape-based retrieval in multimedia databases. We will report an experimental evaluation of their performance on the dataset used in MPEG-7 Core Experiment CE-Shape-1. This dataset provides a unique opportunity to compare various shape descriptors.