Hi! This is Hitesh, welcome to my space! I am a Research Fellow at Microsoft Research India, working with Saikat Guha to digitize traditional newspapers, making them more accessible to visually challenged people by enabling easy navigation and access to the content. I also collaborate with Jianwei Yang at Microsoft Research Redmond to investigate the capabilities of diffusion models in image editing and video/GIF generation.

In 2022, I graduated from the Indian Institute of Technology Bombay (IITB) with a Bachelor's in Electrical Engineering. I also completed two minors, one in Computer Science and Engineering and one in Machine Learning and Data Science. I did my Bachelor's thesis on Compositional Zero-Shot Learning, advised by Professor Biplab Banerjee.

I'm deeply fascinated by the synergy between different types of data, including text, images, video, and audio, which mirror a subset of the human senses. Eventually, I want to work at the intersection of all these modalities. As a stepping stone towards this goal, I am currently studying the interaction between vision and language.

Publications

Pix2Gif: Motion-Guided Diffusion for GIF Generation
Hitesh Kandala, Jianfeng Gao, Jianwei Yang

Beyond Boundaries: A Novel Data-Augmentation Discourse for Open Domain Generalization
Shirsha Bose, Ankit Jha, Hitesh Kandala, Biplab Banerjee
TMLR | Transactions on Machine Learning Research

Exploring Transformer and Multi Label Classification for Remote Sensing Image Captioning
Hitesh Kandala, Sudipan Saha, Biplab Banerjee, Xiao Xiang Zhu
IEEE GRSL | IEEE Geoscience and Remote Sensing Letters

Multi-Stage Semantic Graph Embeddings for Compositional Zero-Shot Learning
Hitesh Kandala, Ruchika Chavhan, Ushasi Chaudhuri, Biplab Banerjee

IIT Bombay, 2018 - 2022
Endimension Technology, S2020
University of Tuebingen, S2021
Microsoft Research, 2022 - Present