CVPR 2024 Workshop on
Representation Learning with Very Limited Images

-Zero-shot, Unsupervised, and Synthetic Learning in the Era of Big Models-


June 18th (PM), 2024 at Summit 324

<Related Workshops by Organizers>

The 2nd Workshop on LIMIT

We propose the 2nd workshop on “Representation Learning with Very Limited Images: Zero-shot, Unsupervised, and Synthetic Learning in the Era of Big Models” in conjunction with CVPR 2024. The current era of ‘foundation models’ relies heavily on training datasets containing enormous numbers of samples (on the order of 100M or more). We have witnessed that such large-scale datasets tend to incur ethical issues such as societal bias, copyright infringement, and privacy violations, because data at this scale is uncontrollable. In contrast, settings with very limited data, such as self-supervised learning from a single image or synthetic pre-training with generated images, are free of these typical issues. Efforts to train visual/multi-modal models on very limited data resources have emerged independently from various academic and industry communities around the world. This workshop aims to bring these communities together to form a collaborative effort and find brave new ideas.

Broader Impact

For a long time, it was taken as established that visual representations had to be learned from “human-annotated labels” on a “large amount of real data”, with the trained models then fine-tuned for each visual task. However, large-scale datasets raise serious problems: (i) biased datasets can lead to, e.g., gender and racial discrimination; (ii) access to public datasets may be suspended due to offensive labels; (iii) ethically problematic images can be mixed into a large-scale collection. As long as large-scale datasets consisting of real images are used, these problems persist. Recent studies reveal that learning strategies using only very few real images [1], or supervision derived from a mathematical formula [2], can successfully acquire a learned representation of how to see the real world. Moreover, a model pre-trained on artificially generated data outperformed ImageNet-21k pre-training [3] and was found to acquire higher robustness [4]. It is therefore clear that self-supervised learning (SSL), formula-driven supervised learning (FDSL), and synthetic training in the very limited data setting can also produce DNN models with high accuracy and safety. In the era of foundation models, while these critical issues remain unresolved, there is growing attention on how pre-training can be achieved with very limited data, whether it is possible with synthetic images or generative models without any real images, and how adaptation can be carried out with zero/one/few-shot or otherwise very limited data. Although these topics have not yet attracted much attention in the computer vision field, they deserve focus, since they are expected to become a means to replace learning with real data and to resolve its ethical issues in the future.
* The listed papers are proposed by organizers or invited speakers.
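As a concrete illustration of the formula-driven idea in [2], the sketch below shows how labeled training images can be generated without any real data: each randomly sampled iterated function system (IFS) serves as one synthetic class, and images are rendered with the chaos game. This is our own minimal sketch, not the released FractalDB code; all parameter choices (number of maps, contraction factor, point count) are illustrative.

```python
import numpy as np
from PIL import Image

def sample_ifs(n_transforms=4, rng=None):
    """Sample a random iterated function system (IFS): a small set of 2D
    affine maps x -> A @ x + b. In FDSL-style pre-training, each sampled
    IFS defines one synthetic category."""
    rng = rng if rng is not None else np.random.default_rng()
    ifs = []
    for _ in range(n_transforms):
        A = rng.uniform(-1.0, 1.0, size=(2, 2))
        A *= 0.8 / max(np.linalg.norm(A, 2), 1e-8)  # keep each map contractive
        b = rng.uniform(-1.0, 1.0, size=2)
        ifs.append((A, b))
    return ifs

def render_fractal(ifs, n_points=100_000, size=256, seed=0):
    """Render the IFS attractor with the chaos game: repeatedly apply a
    randomly chosen affine map and rasterize the visited points."""
    rng = np.random.default_rng(seed)
    pts = np.empty((n_points, 2))
    x = np.zeros(2)
    for i in range(n_points):
        A, b = ifs[rng.integers(len(ifs))]
        x = A @ x + b
        pts[i] = x
    pts -= pts.min(axis=0)          # normalize points into [0, 1]^2
    pts /= max(pts.max(), 1e-8)
    img = np.zeros((size, size), dtype=np.uint8)
    ij = np.minimum((pts * (size - 1)).astype(int), size - 1)
    img[ij[:, 1], ij[:, 0]] = 255
    return Image.fromarray(img)

# Each random IFS acts as a class label; the resulting (image, label) pairs
# can pre-train a CNN/ViT with ordinary supervised losses, with no real
# images and no human annotation.
dataset = [(render_fractal(sample_ifs(rng=np.random.default_rng(c))), c)
           for c in range(10)]
```

Rescaling each map to be contractive keeps the chaos-game iterates bounded; the actual FDSL pipeline additionally filters sampled systems, e.g., by how much of the canvas the rendered attractor fills.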

Invited Talk 1: Phillip Isola (MIT)


Title: N=0: Learning Vision with Zero Visual Data [Slide]
Bio: Phillip Isola is the Class of 1948 Career Development Associate Professor in EECS at MIT. He studies computer vision, machine learning, robotics, and AI. He completed his Ph.D. in Brain & Cognitive Sciences at MIT, and has since spent time at UC Berkeley, OpenAI, and Google Research. His work has particularly impacted generative AI and self-supervised representation learning. Dr. Isola's research has been recognized by a Google Faculty Research Award, a PAMI Young Researcher Award, a Samsung AI Researcher of the Year Award, a Packard Fellowship, and a Sloan Fellowship. His teaching has been recognized by the Ruth and Joel Spira Award for Distinguished Teaching. His current research focuses on trying to scientifically understand human-like intelligence. (Source: http://web.mit.edu/phillipi/www/bio.html)
References:

Invited Talk 2: Zeynep Akata (Helmholtz Munich/TUM)


Title: Learning with Small Number of Images in Multimodal Large Language Models [Video]
Bio: Zeynep Akata is a Liesel Beckmann Distinguished Professor of Computer Science at the Technical University of Munich and the director of the Institute for Explainable Machine Learning at Helmholtz Munich. After completing her PhD at INRIA Rhône-Alpes with Prof. Cordelia Schmid (2014), she worked as a post-doctoral researcher at the Max Planck Institute for Informatics with Prof. Bernt Schiele (2014-17) and at the University of California, Berkeley with Prof. Trevor Darrell (2016-17), and as an assistant professor at the University of Amsterdam with Prof. Max Welling (2017-19). Before moving to Munich in 2024, she was a professor of computer science (W3) within the Cluster of Excellence Machine Learning at the University of Tübingen. She received the Lise Meitner Award for Excellent Women in Computer Science from the Max Planck Society in 2014, a young scientist honour from the Werner-von-Siemens-Ring Foundation in 2019, an ERC-2019 Starting Grant from the European Commission, the DAGM German Pattern Recognition Award in 2021, the ECVA Young Researcher Award in 2022, and the Alfried Krupp Award in 2023. Her research interests include multimodal learning and explainable AI. (Source: https://www.eml-unitue.de/people/zeynep-akata)
References:

Program (Date: June 18th PM, Room: Summit 324)

Oral Session (14:20 - 15:00; Room: Summit 324)

Oral presenters have 10 minutes each, including questions, and are also required to present at the poster session.

Poster Session (16:30 - 17:30; Room: Arch Building 4E)

Posters will be 84” x 42” = 213 cm x 107 cm (WxH, aspect ratio 2:1, landscape format). If you want to use the CVPR logo on your poster, you can download it as a zip file here. (Source: CVPR Official Website)

Paper Submission / Call For Papers


Important Dates


Organizers


Hirokatsu Kataoka
AIST

Yuki M. Asano
University of Amsterdam

Christian Rupprecht
University of Oxford

Rio Yokota
Tokyo Tech/AIST

Nakamasa Inoue
Tokyo Tech/AIST

Dan Hendrycks
Center for AI Safety

Xavier Boix
Fujitsu Research

Manel Baradad
MIT

Connor Anderson
BYU

Ryo Nakamura
Fukuoka University/AIST

Ryosuke Yamada
University of Tsukuba/AIST

Risa Shinoda
Kyoto University/AIST

Ryu Tadokoro
Tohoku University/AIST

Erika Mori
Keio University/AIST