Semantic Model Vectors
Michele Merler, Liangliang Cao, John R Smith, Bert Huang, Lexing Xie, Gang Hua, Apostol Natsev
Description
We propose an intermediate semantic layer between low-level features and high-level concepts in order to bridge the semantic gap. This representation, named Semantic Model Vectors, consists of hundreds of discriminative semantic detectors and is used as a basis for modeling and detecting complex concepts in unconstrained images and/or videos, such as those from a social media feed or YouTube. Each discriminative semantic classifier in the Semantic Model Vectors is trained from thousands of labeled images and/or videos, and the classifiers are organized in a visual taxonomy. Our experiments reveal that the proposed Semantic Model Vectors representation outperforms, and is complementary to, low-level visual descriptors such as deep embeddings. We demonstrate the effectiveness of Semantic Model Vectors, both alone and in combination with low-level descriptors, for multiple high-level recognition tasks, including:
- estimation of user attributes from social media visual feeds
- complex video event recognition
- action recognition
- medical image modality classification.
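As a rough illustration of the representation described above, the sketch below scores one low-level feature vector with a bank of detectors, stacks the confidences into a semantic vector, and combines that vector with the low-level descriptor. Everything in it is an assumption made for illustration: the function names are hypothetical, the detectors are random linear scorers standing in for trained classifiers, and concatenation is only one simple way to combine the two descriptors. It is not the implementation used in the papers listed below.

```python
import math
import random


def make_linear_detector(dim, seed):
    """Hypothetical stand-in for one trained semantic detector.

    Returns a function mapping a low-level feature vector to a
    confidence in [0, 1] via a fixed random linear model and a sigmoid.
    """
    rng = random.Random(seed)
    weights = [rng.gauss(0.0, 1.0) for _ in range(dim)]

    def score(features):
        z = sum(w * x for w, x in zip(weights, features))
        return 1.0 / (1.0 + math.exp(-z))

    return score


def semantic_model_vector(features, detectors):
    """Stack the detector confidences into one mid-level descriptor."""
    return [detector(features) for detector in detectors]


def fuse(semantic_vector, low_level_vector):
    """Combine both descriptors by concatenation (an assumed, simple choice)."""
    return list(semantic_vector) + list(low_level_vector)


LOW_LEVEL_DIM = 16   # toy size for the low-level descriptor
NUM_DETECTORS = 8    # toy size; the description above mentions hundreds

detectors = [make_linear_detector(LOW_LEVEL_DIM, seed=i)
             for i in range(NUM_DETECTORS)]

rng = random.Random(0)
low_level = [rng.random() for _ in range(LOW_LEVEL_DIM)]

smv = semantic_model_vector(low_level, detectors)
combined = fuse(smv, low_level)
```

A downstream classifier for a high-level task could then be trained on `smv` alone or on `combined`.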
Michele Merler, Bert Huang, Lexing Xie, Gang Hua, Apostol Natsev. Semantic model vectors for complex video event recognition. IEEE Transactions on Multimedia (TMM) 2012.
PDF
BibTeX
Project and Data
@article{Merler_TMM12,
author = {Michele Merler and Bert Huang and Lexing Xie and Gang Hua and Apostol Natsev},
title = {Semantic Model Vectors for Complex Video Event Recognition},
journal = {IEEE Transactions on Multimedia},
volume = {14},
number = {1},
pages = {88--101},
year = {2012}
}
Michele Merler, Liangliang Cao, John R Smith. You are what you tweet… pic! Gender prediction based on semantic analysis of social media images. IEEE International Conference on Multimedia and Expo (ICME) 2015.
PDF
BibTeX
Slides
Xiaolong Wang, Guodong Guo, Michele Merler, Noel CF Codella, MV Rohith, John R Smith, Chandra Kambhamettu. Leveraging multiple cues for recognizing family photos. Image and Vision Computing (IVC) 2017.
PDF
BibTeX
@article{Wang_IVC17,
author = {Wang, Xiaolong and Guo, Guodong and Merler, Michele and Codella, Noel C. F. and MV, Rohith and Smith, John R. and Kambhamettu, Chandra},
title = {Leveraging Multiple Cues for Recognizing Family Photos},
journal = {Image and Vision Computing},
issue_date = {February 2017},
volume = {58},
number = {C},
year = {2017},
pages = {61--75}
}
Junjie Cai, Michele Merler, Sharath Pankanti, Qi Tian. Heterogeneous semantic level features fusion for action recognition. ACM International Conference on Multimedia Retrieval (ICMR) 2015.
PDF
BibTeX
@inproceedings{Cai_ICMR15,
author = {Cai, Junjie and Merler, Michele and Pankanti, Sharath and Tian, Qi},
title = {Heterogeneous Semantic Level Features Fusion for Action Recognition},
booktitle = {Proceedings of the 5th ACM on International Conference on Multimedia Retrieval},
series = {ICMR '15},
pages = {307--314},
year = {2015}
}
Noel Codella, Jonathan Connell, Sharath Pankanti, Michele Merler, John R Smith. Automated medical image modality recognition by fusion of visual and text information. International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2014.
PDF
BibTeX
CLEF13 Slides
@inproceedings{Codella_MICCAI14,
author = {Noel Codella and Jonathan Connell and Sharath Pankanti and Michele Merler and John R Smith},
title = {Automated Medical Image Modality Recognition by Fusion of Visual and Text Information},
booktitle = {Medical Image Computing and Computer-Assisted Intervention -- MICCAI 2014: 17th International Conference, Boston, MA, USA, September 14-18, 2014, Proceedings, Part II},
pages = {487--495},
year = {2014}
}