Abstract:Most of the traditional approaches for image annotation based on distance metric learning generally suppose that the constraints on training data is explicit, which only works on small datasets with exact labels. As the scale of the dataset becomes larger and most images are accompanied by noisy labels, this ideal assumption will be not efficient. In this paper, we propose a novel distance metric learning method based on probabilistic topic model. The uncertain and latent side information can be mined by a probabilistic topic model and afterwards used in distance metric learning. The learned semantic distance metric then can be used in the searchbased image annotation. Experiments on Flickr dataset demonstrate that the proposed model outperforms the stateoftheart annotation methods.