Since most prior studies on similar-image retrieval focus on the category level, image similarity learning at the fine-grained level remains a challenge, which often leads to a semantic gap between low-level visual features and high-level human perception. To address this problem, we propose a Mahalanobis and kernel-based similarity (Mah-Ker) method combined with features extracted by a Convolutional Neural Network (CNN). First, triplet constraints are introduced to characterize the fine-grained image similarity relationships upon which the Mahalanobis metric is trained. Then a kernel-based metric is applied in the last layer of the model to provide a nonlinear extension of the Mahalanobis metric and further enhance performance. Experiments on the real VIP.com dress dataset show that the proposed method achieves a promisingly higher retrieval performance than both the state-of-the-art fine-grained similarity model and approaches based on hand-crafted visual features.
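The core idea of triplet-constrained Mahalanobis metric learning described above can be illustrated with a minimal sketch. The code below is not the authors' implementation: the feature dimension, learning rate, margin, and the plain SGD update on a hinge loss are all illustrative assumptions. The metric matrix is parameterized as M = LᵀL so it stays positive semidefinite, and each triplet (anchor, similar, dissimilar) pushes the similar pair closer than the dissimilar pair by a margin.

```python
import numpy as np

rng = np.random.default_rng(0)

def mahalanobis_sq(L, x, y):
    """Squared Mahalanobis distance (x-y)^T L^T L (x-y)."""
    d = L @ (x - y)
    return float(d @ d)

def triplet_hinge_step(L, anchor, pos, neg, margin=1.0, lr=0.01):
    """One SGD step on the triplet hinge loss
    max(0, margin + d(anchor, pos) - d(anchor, neg)),
    with M = L^T L so the learned metric stays PSD."""
    dp = anchor - pos
    dn = anchor - neg
    loss = margin + mahalanobis_sq(L, anchor, pos) - mahalanobis_sq(L, anchor, neg)
    if loss > 0:
        # gradient of d^2(x, y) w.r.t. L is 2 L (x-y)(x-y)^T
        grad = 2 * L @ (np.outer(dp, dp) - np.outer(dn, dn))
        L = L - lr * grad
    return L, max(loss, 0.0)

# Toy usage: in the paper the inputs would be CNN features;
# here they are synthetic vectors for illustration only.
dim = 8
L = np.eye(dim)
for _ in range(200):
    a = rng.normal(size=dim)
    p = a + 0.1 * rng.normal(size=dim)   # "similar" image: small perturbation
    n = rng.normal(size=dim)             # "dissimilar" image
    L, _ = triplet_hinge_step(L, a, p, n)
```

The kernel-based extension mentioned in the abstract would replace the linear map L with distances computed in an implicit feature space; this sketch covers only the linear Mahalanobis stage.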