Dr. Ayan Kumar Bhunia

I completed my Doctor of Philosophy (PhD) in 2022, focusing on Computer Vision and Deep Learning, at SketchX Lab, Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey, England, United Kingdom, under the supervision of Prof. Yi-Zhe Song and Prof. Tao (Tony) Xiang.

I currently work as a Senior Research Scientist at Sony PlayStation, London, specializing in Computer Vision and Deep Learning for the gaming industry. Previously, I worked at iSIZE, a London-based deep-tech startup focused on AI for video delivery, where I led the Compressed Video Denoising project. During this time, I designed and developed a deep model from the ground up: a low-cost neural solution for removing compressed-video artifacts. Following iSIZE's acquisition by Sony PlayStation, I moved into my current role in the Sony PlayStation R&D team, with a focus on Machine Learning.

I am also deeply passionate about using free-hand sketch as a user-friendly, interactive medium for a diverse range of computer vision tasks. These include, but are not limited to, fine-grained image retrieval, image generation, image editing, 3D shape generation/editing, object detection (CVPR'23 Top-12 best paper candidate), and few-shot learning. I believe free-hand sketch can serve as an effective and intuitive tool across many computer vision applications.

Top-venue Publications (Feb 2024): 25xCVPR, 4xICCV, 3xECCV, 1xSIGGRAPH Asia.

Google Scholar  /  GitHub  /  LinkedIn  /  DBLP

Recent Updates

  • New!! [Feb 2024]: Eight papers got accepted in CVPR'24. (More details coming soon!)
  • [July 2023]: One work on Sketch-Based 3D Shape Retrieval is accepted in ICCV'23!
  • [March 2023]: Our paper What Can Human Sketches Do for Object Detection? (CVPR'23) has been selected as one of 12 award candidates, out of 9155 submissions and 2360 accepted papers at CVPR 2023.
  • [March 2023]: Seven papers got accepted in CVPR 2023.
  • [Oct 2022]: Defended my PhD thesis before Prof. Stella Yu and Prof. Adrian Hilton, with no corrections.
  • [July 2022]: Two papers got accepted in ECCV 2022.
  • [March 2022]: Four papers got accepted in CVPR 2022.

Selected Publications

    2024
    DemoCaricature: Democratising Caricature Generation with a Rough Sketch

    Dar-Yen Chan, Ayan Kumar Bhunia, Subhadeep Koley, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024 (New!)

    Abstract / Code / arXiv / BibTex

    It's All About Your Sketch: Democratising Sketch Control in Diffusion Models

    Subhadeep Koley, Ayan Kumar Bhunia, Deeptanshu Sekhri, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024 (New!)

    Abstract / Code / arXiv / BibTex

    You’ll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval

    Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024 (New!)

    Abstract / Code / arXiv / BibTex

    How to Handle Sketch-Abstraction in Sketch-Based Image Retrieval?

    Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024 (New!)

    Abstract / Code / arXiv / BibTex

    SketchINR: A First Look into Sketches as Implicit Neural Representations

    Hmrishav Bandyopadhyay, Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Tao Xiang, Timothy Hospedales, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024 (New!)

    Abstract / Code / arXiv / BibTex

    Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers

    Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024 (New!)

    Abstract / Code / arXiv / BibTex

    What Sketch Explainability Really Means for Downstream Tasks

    Hmrishav Bandyopadhyay, Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024 (New!)

    Abstract / Code / arXiv / BibTex

    Doodle Your 3D: From Abstract Freehand Sketches to Precise 3D Shapes

    Hmrishav Bandyopadhyay, Subhadeep Koley, Ayan Das, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024 (New!)

    Abstract / Code / arXiv / BibTex

    2023
    Democratising 2D Sketch to 3D Shape Retrieval Through Pivoting

    Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Yi-Zhe Song.
    International Conference on Computer Vision (ICCV), 2023.

    Abstract / Code / arXiv / BibTex

    Sketch2Saliency: Learning to Detect Salient Objects from Human Drawings

    Ayan Kumar Bhunia, Subhadeep Koley, Amandeep Kumar, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023

    Abstract / Code / arXiv / BibTex

    Picture that Sketch: Photorealistic Image Generation from Abstract Sketches

    Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023

    Abstract / Code / arXiv / BibTex

    SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text

    Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023

    Abstract / Code / arXiv / BibTex

    Exploiting Unlabelled Photos for Stronger Fine-Grained SBIR

    Aneeshan Sain, Ayan Kumar Bhunia, Subhadeep Koley, Pinaki Nath Chowdhury, Soumitri Chattopadhyay, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

    Abstract / Code / arXiv / BibTex

    What Can Human Sketches Do for Object Detection?

    Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023 [Top 12 Award Candidates]

    Abstract / Code / arXiv / BibTex

    CLIP for All Things Zero-Shot Sketch-Based Image Retrieval, Fine-Grained or Not

    Aneeshan Sain, Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Subhadeep Koley, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023

    Abstract / Code / arXiv / BibTex

    Data-Free Sketch-Based Image Retrieval

    Abhra Chaudhuri, Ayan Kumar Bhunia, Yi-Zhe Song, Anjan Dutta.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023

    Abstract / Code / arXiv / BibTex

    2022
    FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context

    Pinaki Nath Chowdhury, Aneeshan Sain, Yulia Gryaditskaya, Ayan Kumar Bhunia, Tao Xiang, Yi-Zhe Song.
    European Conference on Computer Vision (ECCV), 2022

    Abstract / Code / arXiv / BibTex

    Adaptive Fine-Grained Sketch-Based Image Retrieval

    Ayan Kumar Bhunia, Aneeshan Sain, Parth Hiren Shah, Animesh Gupta, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song.
    European Conference on Computer Vision (ECCV), 2022

    Abstract / Code / arXiv / BibTex

    Doodle It Yourself: Class Incremental Learning by Drawing a Few Sketches

    Ayan Kumar Bhunia, Viswanatha Reddy Gajjala, Subhadeep Koley, Rohit Kundu, Aneeshan Sain, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022

    Abstract / Code / arXiv / Marktechpost Blog / BibTex

    Sketching without Worrying: Noise-Tolerant Sketch-Based Image Retrieval

    Ayan Kumar Bhunia, Subhadeep Koley, Abdullah Faiz Ur Rahman Khilji, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022

    Abstract / Code / arXiv / BibTex

    Partially Does It: Towards Scene-Level FG-SBIR with Partial Input

    Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Viswanatha Reddy Gajjala, Aneeshan Sain, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022

    Abstract / Code / arXiv / BibTex

    Sketch3T: Test-time Training for Zero-Shot SBIR

    Aneeshan Sain, Ayan Kumar Bhunia, Vaishnav Potlapalli, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022

    Abstract / Code / arXiv / BibTex

    2021
    Text is Text, No Matter What: Unifying Text Recognition using Knowledge Distillation

    Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Yi-Zhe Song.
    IEEE International Conference on Computer Vision (ICCV), 2021

    Abstract / arXiv / BibTex

    Towards the Unseen: Iterative Text Recognition by Distilling from Errors

    Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song.
    IEEE International Conference on Computer Vision (ICCV), 2021

    Abstract / arXiv / BibTex

    Joint Visual Semantic Reasoning: Multi-Stage Decoder for Text Recognition

    Ayan Kumar Bhunia, Aneeshan Sain, Amandeep Kumar, Shuvozit Ghose, Pinaki Nath Chowdhury, Yi-Zhe Song.
    IEEE International Conference on Computer Vision (ICCV), 2021

    Abstract / arXiv / BibTex

    Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting

    Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021

    Abstract / Code / arXiv / BibTex

    More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval

    Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yongxin Yang, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021

    Abstract / Code / arXiv / BibTex

    StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval

    Aneeshan Sain, Ayan Kumar Bhunia, Yongxin Yang, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021

    Abstract / arXiv / BibTex

    MetaHTR: Towards Writer-Adaptive Handwritten Text Recognition

    Ayan Kumar Bhunia, Shuvozit Ghose, Amandeep Kumar, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021

    Abstract / arXiv / BibTex

    2020
    Pixelor: A Competitive Sketching AI Agent. So you think you can beat me?

    Ayan Kumar Bhunia*, Ayan Das*, Umar Riaz Muhammad*, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yulia Gryaditskaya, Yi-Zhe Song.
    SIGGRAPH Asia, 2020.

    Abstract / Code / arXiv / BibTex / Try Online Demo (*equal contribution)

    Cross-Modal Hierarchical Modelling for Fine-Grained Sketch Based Image Retrieval

    Aneeshan Sain, Ayan Kumar Bhunia, Yongxin Yang, Tao Xiang, Yi-Zhe Song.
    British Machine Vision Conference (BMVC), 2020.

    Abstract / arXiv / BibTex (Oral Presentation)

    Fine-grained visual classification via progressive multi-granularity training of jigsaw patches

    Ruoyi Du, Dongliang Chang, Ayan Kumar Bhunia, Jiyang Xie, Zhanyu Ma, Yi-Zhe Song, Jun Guo.
    European Conference on Computer Vision (ECCV), 2020.

    Abstract / Code / arXiv / BibTex

    Sketch Less for More: On-the-Fly Fine-Grained Sketch Based Image Retrieval

    Ayan Kumar Bhunia, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.

    Abstract / Code / arXiv / BibTex (Oral Presentation)

    2019
    Handwriting Recognition in Low-Resource Scripts Using Adversarial Learning

    Ayan Kumar Bhunia, Abhirup Das, Ankan Kumar Bhunia, Perla Sai Raj Kishore, Partha Pratim Roy.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019

    Abstract / Code / arXiv / BibTex

    Improving Document Binarization via Adversarial Noise-Texture Augmentation

    Ankan Kumar Bhunia, Ayan Kumar Bhunia, Aneeshan Sain, Partha Pratim Roy.
    IEEE International Conference on Image Processing (ICIP), 2019

    Abstract / Code / arXiv / BibTex (Top 10% Papers)

    A Deep One-Shot Network for Query-based Logo Retrieval

    Ayan Kumar Bhunia, Ankan Kumar Bhunia, Shuvozit Ghose, Abhirup Das, Partha Pratim Roy, Umapada Pal
    Pattern Recognition (PR), 2019

    Abstract / Code / Third Party Implementation / arXiv / BibTex

    User Constrained Thumbnail Generation Using Adaptive Convolutions

    Perla Sai Raj Kishore, Ayan Kumar Bhunia, Shuvozit Ghose, Partha Pratim Roy
    International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019

    Abstract / Code / arXiv / BibTex (Oral Presentation)

    Texture Synthesis Guided Deep Hashing for Texture Image Retrieval

    Ayan Kumar Bhunia, Perla Sai Raj Kishore, Pranay Mukherjee, Abhirup Das, Partha Pratim Roy
    IEEE Winter Conference on Applications of Computer Vision (WACV), 2019

    Abstract / arXiv / BibTex / Video Presentation

    Script identification in natural scene image and video frames using an attention based Convolutional-LSTM network

    Ankan Kumar Bhunia, Aishik Konwer, Ayan Kumar Bhunia, Abir Bhowmick, Partha Pratim Roy, Umapada Pal
    Pattern Recognition (PR), 2019

    Abstract / Code / arXiv / BibTex



    Template credits: Dr. Jon Barron