#CVPR2023
Our "Prompting in Vision" Tutorial was a huge success. Thanks so much to our amazing speakers and all the participants!
- The tutorial slides and recordings will be uploaded to our tutorial website:
Looking for PhD/MPhil/RA candidates to work together on cutting-edge research on foundation models (LLM, VLM, ...). Feel free to reach out via email if you are interested.
Spent a long time trying to figure out why my neural network wasn't learning anything.
Eventually found that the values coming out of ReLU were all zeros (the "dying ReLU" problem) ... changed to LeakyReLU and everything works just fine.
Is there a good toolbox for quickly debugging neural networks?
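For anyone hitting the same issue, here's a minimal sketch of a check that catches dead ReLUs early; the toy model and hook below are illustrative, not my actual setup:

```python
import torch
import torch.nn as nn

# Toy model; any network with ReLU layers works the same way.
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))

def report_dead_units(name):
    def hook(module, inputs, output):
        dead = (output == 0).float().mean().item()
        print(f"{name}: {dead:.1%} of activations are zero")
    return hook

# Attach a forward hook to every ReLU to spot "dying ReLU" layers.
for name, module in model.named_modules():
    if isinstance(module, nn.ReLU):
        module.register_forward_hook(report_dead_units(name))

model(torch.randn(32, 64))

# If a layer is (almost) fully dead, LeakyReLU keeps a small gradient
# for negative inputs so the units can recover:
# nn.LeakyReLU(negative_slope=0.01)
```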
Our survey paper on domain generalization, titled "Domain Generalization: A Survey," has been accepted for publication in TPAMI, the flagship journal in AI!
Paper:
If you aim to pursue a PhD in #computervision, #machinelearning, and #AI, the Nvidia-HKBU Joint PhD Fellowship Scheme provided by our Dept would be a great program to join!
Web:
Deadline: 31 October 2023.
If you're working on open-vocabulary detection, you might be interested in this #ECCV2022 work, which shows how the Transformer-based detector, DETR, can be turned into an open-vocabulary detector using large vision-language models like CLIP.
NEWS: CoOp has been accepted to IJCV, the flagship journal in computer vision!
The paper presents comprehensive studies of adapting large, pre-trained vision-language models like CLIP using *prompt learning*
Have spare time when attending #ICML2022? Take a look at our paper :)
Learning to Prompt for Vision-Language Models
pdf:
abs:
A differentiable approach that focuses on continuous prompt learning to facilitate the deployment of pre-trained vision-language models on downstream datasets.
#ICLR2023: "This year, we are introducing a new program for ACs to (virtually) meet and discuss with reviewers only for borderline cases ..."
ICLR @iclr_conf has made a significant move towards building a better (fairer) community. Good for researchers choosing ICLR.
#CVPR2022: Conditional Prompt Learning for Vision-Language Models
w/ @JingkangY @ccloy @liuziwei7
TL;DR: A simple conditional prompt learning approach that addresses the generalizability issue of static prompts
paper:
code:
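For readers who want the gist in code: a minimal sketch of the conditional idea, where a lightweight meta-network maps image features to a shift added to learnable context vectors, so each image gets its own prompt. Layer sizes and names below are illustrative, not copied from the official code:

```python
import torch
import torch.nn as nn

class ConditionalPromptLearner(nn.Module):
    """Static learnable context + an image-conditioned shift (CoCoOp-style sketch)."""
    def __init__(self, n_ctx=4, ctx_dim=512, vis_dim=512):
        super().__init__()
        # Shared, static context tokens (as in vanilla prompt learning).
        self.ctx = nn.Parameter(torch.randn(n_ctx, ctx_dim) * 0.02)
        # Tiny meta-net: image features -> per-image context shift.
        self.meta_net = nn.Sequential(
            nn.Linear(vis_dim, vis_dim // 16),
            nn.ReLU(),
            nn.Linear(vis_dim // 16, ctx_dim),
        )

    def forward(self, image_feats):  # image_feats: (B, vis_dim)
        shift = self.meta_net(image_feats)                 # (B, ctx_dim)
        # Each image gets its own prompt: static tokens + its shift.
        return self.ctx.unsqueeze(0) + shift.unsqueeze(1)  # (B, n_ctx, ctx_dim)
```

Because the meta-net is tiny relative to the frozen backbone, conditioning adds little overhead while making the prompt instance-dependent rather than static.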
Interested in learning about the field of #DomainGeneralization? Check out our recently released survey on this topic at , which covers the history, related fields, datasets, methodologies, potential directions, and more.
Joint work with @ccloy @liuziwei7
Tired of seeing too many tweets about papers getting accepted to #ECCV2022? (Congrats!)
This one is different: I'm happy that all 3/3 papers I gave weak accept or above were accepted to #ECCV2022!
@ak92501 Interesting work👏
Foundation models are becoming a trend! And adapting them to downstream applications at as low a cost as possible is equally important!
You might be interested in reading our recent work on adapting CLIP-like models.
The special issue has been published online at . If you'd like to have an overview of all accepted papers, you can read our Guest Editorial at . Big thanks to the guest editor team, reviewers, EICs and the editorial team at Springer!
We have a paper to present at the conference: “What Makes Good Examples for Visual In-Context Learning?”
If you’re interested in visual in-context learning, talk to @zhang_yuanhan
I’ll be at #NeurIPS2023 from December 11th to 16th. Feel free to DM me if you’re interested in discussing multi-modal models, visual prompting, and related areas!
#CVPR2022 is approaching!
Our paper, CoCoOp, shows conditional prompt learning is more (i) generalizable to wider unseen classes, (ii) transferable across problems/tasks, and (iii) robust to domain shift.
paper:
code:
@ComputerSociety Congrats!
Would like to promote our recently accepted TPAMI survey, "Domain Generalization: A Survey" (), which gives a comprehensive summary of, and an outlook on, the emerging field of domain generalization, a.k.a. out-of-distribution generalization :-)
Just submitted my PC (reviewer) nominations for #AAAI2023.
Have seen/experienced low-quality reviews in the past (some even complained about "undergrad" reviewers who don't have publications).
All the reviewers I recommended are qualified. Together we make the community better.
🥳 Attending my first-ever in-person computer vision conference, #ECCV2022!
Let's talk about scene understanding, OOD detection & generalization, and also prompt engineering at @eccvconf 🇮🇱
Looking for object detectors that need fewer labeled examples, can leverage unlabeled data, and cope with long-tailed data distributions?
Take a look at our recent paper published in IJCV this year (arXiv version coming soon).
Thanks @ak92501.
Tired of tuning prompts for vision-language models like CLIP?
Why not use CoOp to learn prompts! It's both data-efficient and domain-generalizable😎
Joint work w/ @yangtafrog @ccloy @liuziwei7
Thanks, @ak92501. The main idea of CoOp is to model context in prompts using continuous representations and perform end-to-end learning from data. CoOp shows strong data-efficient learning capability as well as robustness to distribution shift.
Code:
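As a rough sketch of what "modeling context with continuous representations" looks like, assuming a frozen CLIP-like model whose text encoder accepts token embeddings; shapes and helper names are mine, not from the released code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptLearner(nn.Module):
    """Learnable context vectors shared across classes (CoOp-style sketch)."""
    def __init__(self, class_embeds, n_ctx=16, ctx_dim=512):
        super().__init__()
        # The only trainable parameters: continuous "context" tokens.
        self.ctx = nn.Parameter(torch.randn(n_ctx, ctx_dim) * 0.02)
        # Frozen embeddings of class-name tokens: (n_classes, n_name_tokens, ctx_dim).
        self.register_buffer("class_embeds", class_embeds)

    def forward(self):
        n_cls = self.class_embeds.size(0)
        ctx = self.ctx.unsqueeze(0).expand(n_cls, -1, -1)
        # Prompt = [learned context tokens][class-name tokens]
        return torch.cat([ctx, self.class_embeds], dim=1)

def classify(image_feats, text_encoder, prompt_learner, scale=100.0):
    # text_encoder is the frozen text tower; only the prompts are trained.
    text_feats = text_encoder(prompt_learner())
    image_feats = F.normalize(image_feats, dim=-1)
    text_feats = F.normalize(text_feats, dim=-1)
    return scale * image_feats @ text_feats.t()  # cosine-similarity logits
```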
**Upcoming talk**
Robust and accurate fine-tuning for large neural networks
by Mitchell Wortsman @Mitchnw from the University of Washington
Join us via
Interesting to see that our MixStyle method () has recently been applied to analyzing remote sensing images (), audio data (), and medical images ().
#MachineLearning
Our paper, "Domain Generalization with MixStyle", got acccepted to
#ICLR2021
.
The idea is simple: we mix the instance-level CNN style statistics between samples.
OpenReview:
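A minimal sketch of the mixing operation, following the paper's description: normalize features with their own instance statistics, then re-style them with statistics interpolated from a shuffled mini-batch. Hyperparameter values below are illustrative:

```python
import torch
import torch.nn as nn

class MixStyle(nn.Module):
    """Mix instance-level feature statistics (mean/std) across samples."""
    def __init__(self, p=0.5, alpha=0.1, eps=1e-6):
        super().__init__()
        self.p, self.alpha, self.eps = p, alpha, eps

    def forward(self, x):  # x: (B, C, H, W) feature maps
        if not self.training or torch.rand(1).item() > self.p:
            return x
        B = x.size(0)
        mu = x.mean(dim=[2, 3], keepdim=True)
        sig = (x.var(dim=[2, 3], keepdim=True) + self.eps).sqrt()
        x_norm = (x - mu) / sig  # strip instance-specific style
        # Interpolate statistics with those of a shuffled mini-batch.
        lam = torch.distributions.Beta(self.alpha, self.alpha).sample((B, 1, 1, 1)).to(x.device)
        perm = torch.randperm(B, device=x.device)
        mu_mix = lam * mu + (1 - lam) * mu[perm]
        sig_mix = lam * sig + (1 - lam) * sig[perm]
        return x_norm * sig_mix + mu_mix  # apply the mixed style
```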
@GoogleAI Very interesting and inspiring work!
Would like to share (i.e. self-promote) a relevant work of ours: "Learning to Prompt for Vision-Language Models" (), which shows that prompt learning works very well for adapting CLIP-like vision foundation models
𝗨𝗽𝗰𝗼𝗺𝗶𝗻𝗴 𝗧𝗮𝗹𝗸
Generalist Embodied AI in an Open World
by Xiaojian Ma @jeasinema from the Beijing Institute for General Artificial Intelligence (BIGAI).
Join us via
@ylzou_Zack Agreed!
Adapting large vision foundation models (w/ prompt learning) is critical and becoming a trend.
Not aware of any talks, but the following works might be of interest to you
- : adapt CLIP to downstream image recognition
Luckily, 6/8 of our #ECCV2022 submissions were accepted. Most of them were rejected once before, with good suggestions that helped improve our papers. Thanks to our reviewers and, more importantly, my students and collaborators for their hard work! Ps: I'm looking for interns and researchers to join us :)
The submission system has been set up.
When submitting a paper to this SI, please select the "S.I.: The Promises and Dangers of Large Vision Models" article type.
Looking for generalizable, off-the-shelf human features?
Try our OSNet or OSNet-AIN (ICCV'19, TPAMI'21), which were originally developed for person re-id but have since proven effective beyond re-id.
model zoo:
see below for some use cases:
"MixStyle Neural Networks for Domain Generalization and Adaptation" (published in IJCV 2023)
a more comprehensive study of our previously proposed MixStyle approach—a simple plug-and-play module for domain generalization and adaptation
pdf:
We will host the #ICLR2022 Workshop on Socially Responsible Machine Learning on Friday at 9:20 AM EDT.
A lineup of outstanding speakers will discuss emerging topics in machine learning: #security, #fairness, #ethics, and #privacy.
Workshop website:
Not sure if you've noticed the bunny bot, which also appeared in the Talks' new year poster; it was synthesized by generative AI!
Please subscribe to the newsletter to join our talks and interact with world-leading experts in AI. Suggestions are also welcome.
Check out the recording of @RinonGal's talk on personalizing your image with the #DIFFUSION model.
Stay tuned for @DrJimFan's talk on 16 Feb (Thu) :D
Don't forget to subscribe from for the Zoom link!
Have spare time while attending #CVPR2022?
Would like to recommend our survey paper on the topic of domain generalization, which discusses a wide spectrum of methods for improving generalization of deep neural networks
We indeed failed to "cherry-pick" examples that make sense😂
As discussed in the paper, using nearest words for interpretation might be misleading as the found words are still distant from the learned vectors and nearby vectors do not necessarily have the same meaning
Looking for a simple, efficient, plug-and-play module to improve your CNNs' generalization?
Try MixStyle, a parameter-free layer proven effective for domain generalization and adaptation.
paper:
code:
#machinelearning #computervision
Some updates to the survey: adds an "Evaluation" subsection; includes more datasets in Tab. 1 (w/ categorization based on applications); discusses test-time training (a related topic), RL methods (Sec. 3.8), theories (Sec. 4) & more recent work; etc.
A relevant codebase:
𝗨𝗽𝗰𝗼𝗺𝗶𝗻𝗴 𝗧𝗮𝗹𝗸
Multimodal Representation Learning with Deep Generative Models
by Shweta Mahajan @Matewhs from the University of British Columbia
Join us via
I always tell all my students: do not take the outcome of *any* conference seriously, no matter what others tell you. Focus on doing significant research and writing good papers. Treat conference submissions as a drill for sharpening your academic skills; that is all they are.
I'm extremely honored to work with four established researchers, and together, I believe we can provide the community with a timely collection of research addressing the emerging issues in LVMs
Thanks also to @kmoretticompsci for the help :-)
We at SAIC Cambridge still have multiple internships available for 2022 (official link: stay tuned), possibly working on one or more of the following topics:
- Self-supervision
- AutoML
- Multi-modal learning
- Federated learning
Please reach out to me if you or anyone you know is interested.
@3scorciav @TheBMVA @RealAAAI Hi Victor, I also heard your name while I was at Samsung. I haven't decided whether to attend but certainly miss London a lot! Will see.
"the simpler models seemed to fare better on the corrected data than the more complicated models ... In other words, we may have an inflated sense of how great these complicated models are because of flawed testing data"
We maintain a webpage, "CV-Highlight-Papers", which collects all Oral ("Highlight" at CVPR 2023) papers from CVPR/ICCV/ECCV/NeurIPS/ICLR since 2017. Each paper comes with its GitHub link, project page link, stars, and citations. Hope it helps.
The paper illustrates the problem definition, discusses the history, relates the problem to neighboring fields like domain adaptation and transfer learning, ...
@TheAITalksOrg @ShuaiYang1991 @MMLabNTU If you're interested in style transfer and image editing, please join the talk and interact with our speaker. Dr. Yang will "demystify" his recent work, VToonify, which has caused a sensation on social media (meaning that the community loves it!).
@louis_sherren Hi, the centers are updated along with the model parameters; please check the train() function. Please ask directly on GitHub next time, as I don't usually check Twitter :)
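For context, the pattern being described (class centers stored as trainable parameters, so the same optimizer updates them together with the model weights) can be sketched as follows; this is an illustrative reconstruction, not the repo's actual train() code:

```python
import torch
import torch.nn as nn

class CenterLoss(nn.Module):
    """Class centers as nn.Parameter, updated by the optimizer like any weight."""
    def __init__(self, num_classes, feat_dim):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, feats, labels):
        # Pull each feature toward the center of its class.
        return ((feats - self.centers[labels]) ** 2).sum(dim=1).mean()

# In the training loop, one optimizer covers both model and centers, e.g.:
# optimizer = torch.optim.SGD(
#     list(model.parameters()) + list(center_loss.parameters()), lr=0.01)
```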
@YapengTian @bypark___ Bad luck; perhaps other papers in the same batch had higher scores and the AC had to throw some away to keep the overall acceptance rate within a certain range. Just a guess; not sure if ACs were asked to maintain a certain acceptance rate.
It also gives a comprehensive survey of methods developed in the last decade (with an intuitive & concrete categorization) and points out promising directions for future work.
"Transformers is a new architecture that enables computers to understand language with unprecedented accuracy. Unlike prior language models that processed words sequentially, Transformers can discern connections between and among words in a sentence."
📄:
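To make the quoted claim concrete: attention lets every token relate directly to every other token regardless of position. Below is the textbook scaled dot-product self-attention, not code from the article:

```python
import torch
import torch.nn.functional as F

def attention(Q, K, V):
    """Scaled dot-product attention: each token attends to all others,
    so connections between distant words are captured directly."""
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5  # pairwise affinities
    weights = F.softmax(scores, dim=-1)            # normalized attention
    return weights @ V                             # mix value vectors

# Toy example: 5 tokens with 8-dim embeddings (self-attention: Q = K = V).
x = torch.randn(5, 8)
print(attention(x, x, x).shape)  # torch.Size([5, 8])
```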