Ciao! I am a Senior Research Scientist at IBM Research AI. Since 2012 I have been at the T. J. Watson Research Lab in New York. Before that, I did my PhD at Columbia University with Professor John Kender. My interests are in Multimedia and applications of Deep Learning in Computer Vision and NLP, with a focus on neural architecture search for vision and language models, and sports video analysis. I have publications in several peer-reviewed journals and conferences, including IEEE TMM, CVPR, ACM Multimedia, AAAI, ICMR, and MICCAI. Among other professional activities, I have served as an Associate Editor for the IEEE Transactions on Multimedia (2021-2023), as Area Chair for ECCV in 2024 and for ACM Multimedia in 2016 and 2017, and as Local Organization Chair and Web Chair for ICMR in 2016. My work has been recognized in the popular press (including the New York Times, Fortune, and NBC News) and I have been fortunate to win some awards, including the 2023 Tech Emmys. You can look at my CV here.
PhD in Computer Science, 2013
Columbia University
MS in Computer Science, 2008
Columbia University
MEng in Telecommunications Engineering, 2007
University of Trento, Italy
- We won the Best Paper Award at IEEE ICDH! paper (July 2024)
- The (Computer) Vision of Sports book chapter is out! June 2024
- Granite Code Models paper and models are released as open source! May 2024
- Code Lingua leaderboard is out! April 2024
- 10th Workshop on Computer Vision in Sports (CVSports) @CVPR 2024
- I am serving as Area Chair for ECCV 2024
- Code Lingua paper on evaluating CodeLLMs for translation accepted at ICSE 2024
- We helped IBM win a Tech Emmy Award for AI-ML curation of Sports Highlights @EMMYS 2023!
- 9th Workshop on Computer Vision in Sports (CVSports) @CVPR 2023
- 4th Workshop on Fair, Data Efficient and Trusted Computer Vision (FA.DE.TR.CV) @CVPR 2023
- 8th Workshop on Computer Vision in Sports (CVSports) @CVPR 2022
- Nominated Outstanding Reviewer @CVPR 2021
- I have been appointed Associate Editor for IEEE Transactions on Multimedia (TMM) (2021-2023)!
- 7th Workshop on Computer Vision in Sports (CVSports) @CVPR 2021
- 2nd Workshop on Fair, Data Efficient and Trusted Computer Vision (FA.DE.TR.CV) @CVPR 2021
- NASTransfer paper accepted to AAAI (Feb 2021)
- 6th Workshop on Computer Vision in Sports (CVSports) @CVPR 2020
- Workshop on Fair, Data Efficient and Trusted Computer Vision (FA.DE.TR.CV) @CVPR 2020
- 2nd Workshop on Bias Estimation in Face Analytics (BEFA) @CVPR 2019
- The Diversity in Faces dataset is out. Check it out here (Jan 2019)
- AI Self Portrait in the art gallery of the NIPS Workshop on Machine Learning for Creativity (Dec 2018)
- Our AI Self Portrait has been published in the New York Times! (Oct 2018)
- Cognitive Highlights work accepted to IEEE TMM (Sep 2018)
- Cognitive Highlights wins 2018 Best Digital Development at the Yahoo Sports Tech Awards for Wimbledon!
- Workshop on Bias Estimation in Face Analytics (BEFA) @ECCV 2018
- New Face Attributes models coming to WDC Visual Recognition mitigating bias (Feb 2018)
- Cognitive Highlights @US Open! (Sep 2017)
- Presenting Cognitive Highlights (@Wimbledon July 2017) in the CV in Sports Workshop @CVPR 2017
- Food recognition model launched in beta in WDC Visual Recognition (May 2017)
- Cognitive Highlights @Golf Masters! (Apr 2017)
- I am Area Chair for ACM Multimedia 2017 in Multimedia Search and Recommendation track
- Presenting our delicious Food analytics papers @ACM Multimedia 2016
- I am Local Organization Chair and Webmaster for ICMR 2016
- I acted as a Guest Editor for the Neurocomputing Special Issue on Advanced Learning for Large-Scale Heterogeneous Computing 2016
- I am Area Chair for ACM Multimedia 2016 in Multimedia Search and Recommendation track
- Received the outstanding reviewer award from the 2015 International Conference on Multimedia Retrieval (ICMR)
Carla Agurto, Michele Merler, Esteban Roitberg, Alan Taitz, Marcos A. Trevisan, Diego E. Shalom, Julian Peller, Lyle W. Ostrow, Indu Navar, Ernest Fraenkel, James Berry, Guillermo A. Cecchi and Raquel Norel. Harnessing Remote Speech Tasks for Early ALS Biomarker Identification. IEEE International Conference on Digital Health (ICDH) 2024. PDF BibTeX
Mayank Mishra, Matt Stallone, Gaoyuan Zhang, Yikang Shen, Aditya Prasad, Adriana Meza Soria, Michele Merler, Parameswaran Selvam, Saptha Surendran, Shivdeep Singh, Manish Sethi, Xuan-Hong Dang, Pengyuan Li, Kun-Lung Wu, Syed Zawad, Andrew Coleman, Matthew White, Mark Lewis, Raju Pavuluri, Yan Koyfman, Boris Lublinsky, Maximilien de Bayser, Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, Yi Zhou, Chris Johnson, Aanchal Goyal, Hima Patel, Yousaf Shah, Petros Zerfos, Heiko Ludwig, Asim Munawar, Maxwell Crouse, Pavan Kapanipathi, Shweta Salaria, Bob Calio, Sophia Wen, Seetharami Seelam, Brian Belgodere, Carlos Fonseca, Amith Singhee, Nirmit Desai, David D. Cox, Ruchir Puri, Rameswar Panda. Granite Code Models: A Family of Open Foundation Models for Code Intelligence. arXiv (arXiv) 2024. arXiv BibTeX
Rangeet Pan, Ali Reza Ibrahimzada, Rahul Krishna, Divya Sankar, Lambert Pouguem Wassi, Michele Merler, Boris Sobolev, Raju Pavuluri, Saurabh Sinha and Reyhaneh Jabbarvand. Lost in translation: A study of bugs introduced by large language models while translating code. International Conference on Software Engineering (ICSE) 2024. arXiv BibTeX
Rikke Gade, Michele Merler, Graham Thomas and Thomas B Moeslund. The (Computer) Vision of Sports: Recent Trends in Research and Commercial Systems for Sport Analytics. Computer Vision: Challenges, Trends, and Opportunities (CRC Press) 2024. book preview BibTeX
Masayasu Muraoka, Bishwaranjan Bhattacharjee, Michele Merler, Graeme Blackwood, Yulong Li, and Yang Zhao. Cross-Lingual Transfer of Large Language Model by Visually-Derived Supervision Toward Low-Resource Languages. ACM Multimedia (MM) 2023. PDF BibTeX
Takuma Udagawa, Aashka Trivedi, Michele Merler and Bishwaranjan Bhattacharjee. A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models. EMNLP Industry Track (EMNLP) 2023. arXiv BibTeX
Jiaqing Yuan, Michele Merler, Mihir Choudhury, Raju Pavuluri, Munindar P. Singh and Maja Vukovic. CoSiNES: Contrastive Siamese Network for Entity Standardization. ACL Matching Workshop (ACLW) 2023. arXiv code BibTeX
Aashka Trivedi, Takuma Udagawa, Michele Merler, Rameswar Panda, Yousef El-Kurdi and Bishwaranjan Bhattacharjee. Neural Architecture Search for Effective Teacher-Student Knowledge Transfer in Language Models. arXiv (arXiv) 2022. arXiv BibTeX
Rameswar Panda, Michele Merler, Mayoore Jaiswal, Hui Wu, Kandan Ramakrishnan, Ulrich Finkler, Chun-Fu Chen, Minsik Cho, David Kung, Rogerio S Feris, Bishwaranjan Bhattacharjee. NASTransfer: Analyzing Architecture Transferability in Large Scale Neural Architecture Search. 35th AAAI Conference on Artificial Intelligence (AAAI) 2021. arXiv BibTeX
Ulrich Finkler, Michele Merler, Rameswar Panda, Mayoore S Jaiswal, Hui Wu, Kandan Ramakrishnan, Chun-Fu Chen, Minsik Cho, David Kung, Rogerio Feris, Bishwaranjan Bhattacharjee. Large Scale Neural Architecture Search with Polyharmonic Splines. arXiv (arXiv) 2020. arXiv BibTeX
Michele Merler, Cicero Nogueira dos Santos, Mauro Martino, Alfio M Gliozzo, John R Smith. Covering the News with (AI) Style. arXiv (arXiv) 2020. arXiv BibTeX
Michele Merler, Nalini Ratha, Rogerio S Feris, John R Smith. Diversity in Faces. arXiv (arXiv) 2019. arXiv BibTeX Project
Michele Merler, Dhiraj Joshi, Quoc-Bao Nguyen, Stephen Hammer, John Kent, Jinjun Xiong, Minh N. Do, John R Smith, Rogerio S Feris. Automatic Curation of Sports Highlights using Multimodal Excitement Features. IEEE Transactions on MultiMedia (TMM) 2018. PDF BibTeX Project
Michele Merler, Dhiraj Joshi, Quoc-Bao Nguyen, Stephen Hammer, John Kent, John R Smith, Rogerio S Feris. Automatic Curation of Golf Highlights using Multimodal Excitement Features. 3rd Workshop of Computer Vision in Sports @CVPR (CVPRW) 2017. PDF BibTeX Slides Project
Dhiraj Joshi, Michele Merler, Quoc-Bao Nguyen, Stephen Hammer, John Kent, John R Smith, Rogerio S Feris. IBM High-Five: Highlights From Intelligent Video Engine. ACM Multimedia (MM) 2017. PDF BibTeX Project
Xiaolong Wang, Guodong Guo, Michele Merler, Noel CF Codella, MV Rohith, John R Smith, Chandra Kambhamettu. Leveraging multiple cues for recognizing family photos. Image and Vision Computing (IVC) 2017. PDF BibTeX
Michele Merler, Hui Wu, Rosario Uceda-Sosa, Quoc-Bao Nguyen, John R Smith. Snap, Eat, RepEat: a food recognition engine for dietary logging. 2nd International Workshop on Multimedia Assisted Dietary Management @ACM Multimedia (MADIMA) 2016. PDF BibTeX Project Slides Poster
Hui Wu, Michele Merler, Rosario Uceda-Sosa, John R Smith. Learning to make better mistakes: Semantics-aware visual food recognition. ACM Multimedia (MM) 2016. PDF BibTeX Project
Michele Merler, Liangliang Cao, John R Smith. You are what you tweet… pic! gender prediction based on semantic analysis of social media images. IEEE International Conference on Multimedia and Expo (ICME) 2015. PDF BibTeX Slides
Junjie Cai, Michele Merler, Sharath Pankanti, Qi Tian. Heterogeneous semantic level features fusion for action recognition. IEEE International Conference on Multimedia Retrieval (ICMR) 2015. PDF BibTeX
Mani Abedini, Noel CF Codella, Jonathan H Connell, Rahil Garnavi, Michele Merler, Sharath Pankanti, John R Smith, Tanveer Syeda-Mahmood. A generalized framework for medical image classification and recognition. IBM Journal of Research and Development (IBM-JRD) 2015. PDF BibTeX
Felix X Yu, Liangliang Cao, Michele Merler, Noel Codella, Tao Chen, John R Smith, Shih-Fu Chang. Modeling attributes from category-attribute proportions. ACM Multimedia (MM) 2014. PDF BibTeX
Noel Codella, Jonathan Connell, Sharath Pankanti, Michele Merler, John R Smith. Automated medical image modality recognition by fusion of visual and text information. International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2014. PDF BibTeX CLEF13 Slides
Michele Merler, Bert Huang, Lexing Xie, Gang Hua, Apostol Natsev. Semantic model vectors for complex video event recognition. IEEE Transactions on Multimedia (TMM) 2012. PDF BibTeX Project
Michele Merler, Rong Yan, John R Smith. Imbalanced rankboost for efficiently ranking large-scale image/video collections. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2009. PDF BibTeX Poster
Rong Yan, Marc-Olivier Fleury, Michele Merler, Apostol Natsev, John R. Smith. Large-Scale Multimedia Semantic Concept Modeling using Robust Subspace Bagging and MapReduce. First ACM workshop on Large-scale multimedia retrieval and mining @ACM Multimedia (LS-MMRM) 2009. PDF BibTeX
Michele Merler, John R Kender. Semantic keyword extraction via adaptive text binarization of unstructured unsourced video. IEEE International Conference on Image Processing (ICIP) 2009. PDF BibTeX Poster
Michele Merler, Carolina Galleguillos, Serge Belongie. Recognizing groceries in situ using in vitro training data. 2nd International Workshop on Semantic Learning Applications in Multimedia @CVPR (SLAM) 2007. PDF BibTeX Project
Jiaqing Yuan, Michele Merler, Mihir Choudhury, Venkata Nagaraju Pavuluri, Maja Vukovic. Entity standardization for application modernization. US Patent App. 18/160,301 2024 GooglePatents
Michele Merler, Paul Pritz. Attribute-based calibration for machine learning. US Patent App. 17/977,880 2024 GooglePatents
Anup Kalia, Mihir Choudhury, Jin Xiao, Divya Sankar, John Rofrano, Venkata Nagaraju Pavuluri, Lambert Pouguem Wassi, Maja Vukovic, Michele Merler. Adaptable and explainable application modernization disposition. US Patent App. 18/071,911 2024 GooglePatents
Michele Merler, Dhiraj Joshi, Apurv Gupta, Sebastien Gilbert, Shyama Prosad Chowdhury, Chidansh Amitkumar Bhatt, Nirmit V Desai. AI System and Method for Automatic Analog Gauge Reading. US Patent App. 17/936,519 2024 GooglePatents
Sebastien Gilbert, Michele Merler, Dhiraj Joshi, Apurv Gupta, Shyama Prosad Chowdhury, Chidansh Amitkumar Bhatt, Nirmit V Desai. Oblique Image Rectification. US Patent App. 18/048,975 2024 FreePatentsOnline
Dinesh C Verma, Franck Vinh Le, Michele Merler, Dhiraj Joshi, Supriyo Chakraborty, Seraphin Bernard Calo. Knowledge expansion for improving machine learning. US Patent App. 17/935,198 2024 GooglePatents
Raghu Kiran Ganti, Mudhakar Srivatsa, Shreeranjani Srirangamsridharan, Jae-Wook Ahn, Michele Merler, Dean Steuer. Transparent and controllable topic modeling. US Patent 11,941,038 2024 GooglePatents
Michele Merler, Aashka Trivedi, Rameswar Panda, Bishwaranjan Bhattacharjee, Taesun Moon, Avirup Sil. Neural architecture search of language models using knowledge distillation. US Patent App. 17/075,963 2022 GooglePatents
Ulrich Alfons Finkler, Michele Merler, Mayoore Selvarasa Jaiswal, Hui Wu, Rameswar Panda, Wei Zhang. Configuring a neural network using smoothing splines. US Patent App. 17/075,963 2022 GooglePatents
Michele Merler, Mauro Martino, Cicero Nogueira dos Santos, Alfio Massimiliano Gliozzo, John R. Smith. Automatic generation of content using multimedia. US Patent 11,170,270 2021 GooglePatents
Michele Merler, Dhiraj Joshi, Quoc-Bao Nguyen, Stephen C Hammer, John Joseph Kent, John R Smith, Rogerio Feris. Auto-curation and personalization of sports highlights. US Patent 10,595,101 2020 GooglePatents
Michele Merler, Jae-Eun Park, John R Smith, Rosario Uceda-Sosa. Individual and user group attributes discovery and comparison from social media visual content. US Patent 10,282,677 2019 GooglePatents
Michele Merler, John R Smith, Rosario Uceda-Sosa, Hui Wu. Image classification utilizing semantic relationships in a classification hierarchy. US Patent 9,928,448 2018 GooglePatents
Liangliang Cao, Michele Merler, John R Smith. Systems and methods for inferring gender by fusion of multimodal content. US Patent 9,684,852 2017 GooglePatents
Michele Merler, John R Kender. Kalman filter approach to augment object tracking. US Patent 9,177,229 2015 GooglePatents