Dmytro Mishkin 🇺🇦

@ducha_aiki

18,326
Followers
591
Following
3,285
Media
17,421
Statuses

Marrying classical CV and Deep Learning. I do things that work, rather than things that are novel but don't work.

Ukraine
Joined July 2017
Pinned Tweet
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
โ—๏ธ๐Ÿ‡บ๐Ÿ‡ฆ๐Ÿ‡บ๐Ÿ‡ฆ๐Ÿ‡บ๐Ÿ‡ฆโ—๏ธ My friends have created Notion wiki with many ways how you can help #Ukraine - donation, petitions, housing, DDOS and many others #HelpUkraine Please, share.
Tweet media one
9
175
302
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
Dear friends. I realize that you followed me for news on computer vision, image matching and machine learning, not for political bullshit. But my country is in grave danger. It is not politics, it is life & death. I will continue to write about ML & CV. And also about 🇺🇦.
58
185
4K
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
3 years
I have submitted my PhD thesis, yay!
Tweet media one
Tweet media two
Tweet media three
Tweet media four
37
11
594
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
3 years
Yesterday's Facebook blackout shows why I am afraid of self-driving cars. Not because of deep learning: I believe we will solve robust perception at some point. But mega-centralization... one wrong update and millions of self-driving cars become weapons of mass self-destruction.
24
82
535
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
If you are interested in doing a PhD in computer vision and/or robotics, consider CTU in Prague. We are bad at self-PR, but good at science. Also, Prague is a great and affordable city to live in.
Tweet media one
Tweet media two
24
70
445
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
4 years
That is 🔥 Linformer: Self-Attention with Linear Complexity Sinong Wang, Belinda Li, Madian Khabsa, Han Fang, Hao Ma
Tweet media one
Tweet media two
Tweet media three
6
104
379
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
3 years
Torch.manual_seed(3407) is all you need: On the influence of random seeds in deep learning architectures for computer vision @david_picard tl;dr: one can easily get +0.2 pp accuracy on ImageNet just by changing the random seed. And much more on CIFAR10
Tweet media one
Tweet media two
Tweet media three
Tweet media four
8
82
359
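For readers who want to try this themselves: a minimal seeding sketch in plain PyTorch (the exact setup used in the paper is not shown here, so treat this as an illustrative assumption):

import random
import numpy as np
import torch

def seed_everything(seed: int = 3407) -> None:
    # Seed Python, NumPy and PyTorch RNGs so repeated runs start identically.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Deterministic cuDNN kernels trade speed for run-to-run stability.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

seed_everything(3407)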
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
ReLU Fields: The Little Non-linearity That Could @AnimeshKarnewar , Tobias Ritschel, Oliver Wang, Niloy J. Mitra tl;dr: grid-based 3D representation + ReLU are almost NERFs, but much faster.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
10
58
347
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
1 year
Fully Differentiable RANSAC Tong Wei, @yash_patel2307 , Jiri Matas, @majti89 tl;dr: another (now working better?) differentiable RANSAC with clever pretraining (KL-div for inlier probabilities, then regular losses). Eval is good :) 1/2
Tweet media one
Tweet media two
Tweet media three
Tweet media four
7
60
298
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
3 years
Good practice, which I forget to do every other time. #OpenSource
Tweet media one
4
33
288
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
PhD defence done! Thank you all for support and coming online for the presentation!
Tweet media one
Tweet media two
Tweet media three
Tweet media four
29
4
266
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
6 months
It seems that RoMA by @Parskatt has almost solved WxBS. I have added it to the eval colab here
Tweet media one
Tweet media two
Tweet media three
Tweet media four
9
51
259
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
10 months
“Diffusion models are general purpose learning machineries”, not only generation. Talk by @YasutakaFuruka1 #CVPR2023 1/
Tweet media one
Tweet media two
Tweet media three
Tweet media four
3
41
254
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
Good news: Pytorch works on M1 GPU :) Bad news: not all Pytorch operations work there. I guess we should create a separate test case for device='mps' to evaluate that at @kornia_foss
Tweet media one
@PyTorch
PyTorch
2 years
We're excited to announce support for GPU-accelerated PyTorch training on Mac! Now you can take advantage of Apple silicon GPUs to perform ML workflows like prototyping and fine-tuning. Learn more:
Tweet media one
79
711
3K
6
41
248
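For the record, the kind of device='mps' smoke test meant above can be as small as this pytest sketch; it is only an illustration, not the actual @kornia_foss test suite:

import pytest
import torch

@pytest.mark.skipif(not torch.backends.mps.is_available(), reason="MPS backend not available")
def test_avg_pool_matches_cpu_on_mps():
    # Compare an op on the Apple-GPU backend against the CPU reference.
    x = torch.randn(4, 3, 8, 8)
    expected = torch.nn.functional.avg_pool2d(x, 2)
    result = torch.nn.functional.avg_pool2d(x.to("mps"), 2)
    torch.testing.assert_close(result.cpu(), expected, rtol=1e-4, atol=1e-4)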
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
SaaS: Schmidhubering as a Service
Tweet media one
3
13
241
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
4 years
We are happy to introduce our paper, rejected from #CVPR2020: "Image Matching across Wide Baselines: From Paper to Practice". It has so many messages that I will create a separate tweet-thread for each part.
Tweet media one
@kwangmoo_yi
Kwang Moo Yi
4 years
We are hosting a challenge/workshop at #CVPR2020 . Please see for more info. Here's also a paper related to it that you WON'T see at CVPR ;) TLDR: SIFT Keypoint+HardNet is still the best!
1
6
34
7
50
236
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
5 months
Bag of Image Patch Embedding Behind the Success of Self-Supervised Learning @Yubei_Chen , @AdrienBardes , @LiZengy , @ylecun tl;dr: a bag of patches and the compositional structure of the horse is all you need. (except: where is the DINOv2 comparison?) and
Tweet media one
Tweet media two
Tweet media three
3
41
232
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 months
Deep Networks Always Grok and Here is Why @imtiazprio , @randall_balestr , @rbaraniuk tl;dr: early stopping is bad; instead you should train much longer than you thought to get a more robust model. Bonus: BatchNorm hurts
Tweet media one
Tweet media two
Tweet media three
Tweet media four
6
39
231
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
1 year
Ouch. I haven't read this full story about the ChatGPT virtual machine. That is incredible.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
8
38
224
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
My slides on "How to navigate ML research literature" for the @ucu_apps Winter ML school. How to read papers? How to filter them out? Where to get them? What does peer review mean? Unfortunately, no part about re-implementing - that is covered by another speaker.
7
44
227
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
4 years
Wow, there is a huge revolution happening on #kaggle right now. More and more competitions are "kernel-only submission", which means you upload your model to the cloud and all the inference is done in a submission kernel with a runtime limit. Farewell, 100500-model ensembles!
3
37
225
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
Tl;dr: the Pytorch team is working on M1 GPU support and plans to release an alpha in 4 months.
@timothy_lkh_
Timothy Liu
2 years
It's coming!
1
40
250
2
21
216
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
4 years
Faster AutoAugment - the first paper which uses @kornia_foss Just in case anyone was wondering why one needs differentiable augmentation :)
Tweet media one
2
63
206
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
11 months
VanillaNet: the Power of Minimalism in Deep Learning Hanting Chen, Yunhe Wang, Jianyuan Guo, Dacheng Tao tl;dr: 4x4conv/4->n x {1x1conv->{seriesAct}->MaxPool2x2}. seriesAct = stack of BN(ReLU(BN(ReLU)))
Tweet media one
Tweet media two
Tweet media three
Tweet media four
6
46
205
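A rough PyTorch reading of that tl;dr (the real VanillaNet adds tricks such as deep training and conv merging, so this is only a sketch of the stage structure):

import torch.nn as nn

def series_act(channels: int, depth: int = 2) -> nn.Sequential:
    # "seriesAct": a small stack of ReLU + BatchNorm pairs.
    layers = []
    for _ in range(depth):
        layers += [nn.ReLU(inplace=True), nn.BatchNorm2d(channels)]
    return nn.Sequential(*layers)

def vanilla_stage(in_ch: int, out_ch: int) -> nn.Sequential:
    # One stage: 1x1 conv -> series activation -> 2x2 max-pool.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=1),
        series_act(out_ch),
        nn.MaxPool2d(kernel_size=2, stride=2),
    )

stem = nn.Conv2d(3, 128, kernel_size=4, stride=4)  # the "4x4conv/4" stem
body = nn.Sequential(vanilla_stage(128, 256), vanilla_stage(256, 512))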
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
Dear colleagues. Could you please not hardcode the "cuda" device in your example code? Some of us are running your code on a laptop :)
8
9
204
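The fix is a couple of lines; a minimal device-agnostic sketch (the model and tensors are placeholders):

import torch

# Pick the best available backend instead of assuming "cuda".
device = (
    "cuda" if torch.cuda.is_available()
    else "mps" if torch.backends.mps.is_available()
    else "cpu"
)

model = torch.nn.Linear(8, 2).to(device)
x = torch.randn(4, 8, device=device)
out = model(x)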
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
1 year
ML theory twitter, here is a question: why does classification work so much better than regression? Bonus question: is it true only for DL, or for XGBoost as well?
@eric_brachmann
Eric Brachmann
1 year
@ducha_aiki No clue, happy for pointers! I suspect there is some nice theory in the ML community. Stability of soft max? Robustness to multi-modality? Networks preferring over-parametrizations?
4
1
9
34
30
202
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
5 years
How good are classic navigation algorithms vs fancy RL? Spoiler: much better Read our paper "Benchmarking Classic and Learned Navigation in Complex 3D Environments" Paper: Source code (most of it) :
5
61
201
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
NeRF-SLAM: Real-Time Dense Monocular SLAM with Neural Radiance Fields @RosinolToni , John J. Leonard, @lucacarlone1 tl;dr: DroidSLAM + Instant-NGP.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
0
46
196
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
3 years
"RaspberryPI for mosquito neutralization by power laser" "The system can neutralize 2 mosquitos per sec and this result can be easily improved. "
Tweet media one
Tweet media two
Tweet media three
10
34
196
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
4 years
Tormentor : pyTORch augMENTOR. New augmentation library based on the #pytorch and #kornia @kornia_foss
Tweet media one
Tweet media two
0
60
194
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
1 year
NeRN -- Learning Neural Representations for Neural Networks Maor Ashkenazi, Zohar Rimon, Ron Vainshtein, Shir Levi, Elad Richardson, Pinchas Mintz, Eran Treister tl;dr: NERFs to compress the CNN classifiers. Not hypernetworks, although seem similar
Tweet media one
Tweet media two
Tweet media three
3
38
191
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
Kubric: A scalable dataset generator 34 co-authors. tl;dr: (Blender+PyBullet)-based framework for the dataset generation, suitable for many tasks code:
Tweet media one
Tweet media two
Tweet media three
Tweet media four
5
29
191
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
4 years
Metric learning: cross-entropy vs. pairwise losses Tl;dr: the paper derives that most metric learning losses are kind of equivalent to cross-entropy. And CE is not worse than them.
Tweet media one
7
42
187
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
Identical Initialization: A Universal Approach to Fast and Stable Training of Neural Networks Anon #ICLR2023 submission tl;dr: initialize your weights with (almost) identity.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
4
26
186
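PyTorch already ships (near-)identity initializers, so the basic idea can be sketched like this (an illustration of identity-style init, not the paper's exact scheme):

import torch.nn as nn

linear = nn.Linear(256, 256)
nn.init.eye_(linear.weight)    # identity matrix for a square linear layer
nn.init.zeros_(linear.bias)

conv = nn.Conv2d(64, 64, kernel_size=3, padding=1)
nn.init.dirac_(conv.weight)    # Dirac init: the conv initially acts as an identity
nn.init.zeros_(conv.bias)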
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
1 year
LERF: Language Embedded Radiance Fields Justin Kerr, Chung Min Kim, @Ken_Goldberg , @akanazawa , Matthew Tancik tl;dr: if a NERF is trained to predict DINO+CLIP features in addition to RGB and depth, you can use it later for volumetric segmentation.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
41
187
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
My PhD defence will be on Thursday Dec 16, 14:00 CET. Available online at this MS Teams link:
27
8
185
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
Wow, #CVPR2023 reviewing will not be on CMT, but on OpenReview!
Tweet media one
5
23
184
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
3 months
CLIP Can Understand Depth Dunam Kim, Seokju Lee tl;dr: learn input embedding for the monodepth, then use it for the input -> not sota, but quite good, unlike previous CLIP based depth.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
0
24
183
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
3 years
One could learn more about good software engineering from this 1 hour talk than from a university course, tbh. Incredible presentation of nbdev, jupyter notebook and live coding environments by @jeremyphoward
3
20
185
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
We announce "Affine Correspondences (AC) and their Applications in Practice" #CVPR2022 tutorial by @majti89 @prittjam , me and Levente Hajder. ✅ AC applications: rectification, surface normals, fast pose ✅ How to get AC @CVPR P.S. Dolls and photos by Olha Mishkina.
Tweet media one
3
29
182
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
1 year
Image Matching Challenge 2023 starts NOW! Task: 3D reconstructions from 10-100 images Entry Deadline: June 6, 2023. Prize Money: $50,000 #IMC2023 #CVPR2023 @CVPR
3
41
176
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
6 months
If you are into image matching and want to get a feeling for different algorithms - try this amazing webdemo. It has tons of modern local features, as well as many versions of RANSAC (from OpenCV)
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
31
177
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
3 years
Global Pooling, More than Meets the Eye: Position Information is Encoded Channel-Wise in CNNs Authors: all tagged + Neil D. B. Bruce tl;dr: CNNs with global average pooling encode the position in channel order. Nicely formulated and tested hypothesis
Tweet media one
Tweet media two
Tweet media three
Tweet media four
7
40
177
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
4 years
ArXiving before submission: - helps professional identity building; - protects against idea re-discovery/theft/gate-keeping; - facilitates open research distribution; - reduces inequality. Our refined statement, with @amy_tabb & Jiri Matas, is now on arXiv
10
40
175
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
3 months
Short guide to modern local features: - if you are GPU- and CPU-rich, use RoMA; - if you are allergic to learned matching, use DeDoDe; - otherwise use ALIKED+LightGlue.
Tweet media one
Tweet media two
Tweet media three
7
27
174
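For the ALIKED+LightGlue option, matching usually takes only a few lines with the authors' lightglue package; a sketch assuming its standard API (check the repo for the current interface; the image paths are placeholders):

import torch
from lightglue import ALIKED, LightGlue
from lightglue.utils import load_image, rbd

device = "cuda" if torch.cuda.is_available() else "cpu"
extractor = ALIKED(max_num_keypoints=2048).eval().to(device)
matcher = LightGlue(features="aliked").eval().to(device)

image0 = load_image("img0.jpg").to(device)
image1 = load_image("img1.jpg").to(device)

feats0 = extractor.extract(image0)
feats1 = extractor.extract(image1)
matches01 = matcher({"image0": feats0, "image1": feats1})
feats0, feats1, matches01 = [rbd(x) for x in (feats0, feats1, matches01)]  # drop batch dim
matches = matches01["matches"]
kpts0 = feats0["keypoints"][matches[..., 0]]
kpts1 = feats1["keypoints"][matches[..., 1]]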
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
4 years
Wow, fastai is on the front-page of top repositories selected for the github.archive project: together with Linux, Ruby, Postgres, jquery. Congrats!
Tweet media one
0
31
173
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
This comment gives me some hope. Context: M1 GPU support in PyTorch.
Tweet media one
6
12
168
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
OpenGlue: Open Source Graph Neural Net Based Pipeline for Image Matching @OstapViniavskyi , @DobkoMaria @ducha_aiki @dobosevych tl;dr: MIT license impl+additional geom input. Caveat: same as SGMNet, fails to match official ML perf
Tweet media one
Tweet media two
Tweet media three
Tweet media four
5
34
164
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
1 month
Track Everything Everywhere Fast and Robustly Yunzhou Song, Jiahui Lei, Ziyun Wang @LingjieLiu1 @KostasPenn tldr: track in (optimized)3D and use DINOv2 for long-term matching.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
28
166
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
3 years
Can Vision Transformers Learn without Natural Images? It turns out that one can pretrain vision networks solely on synthetic images - "formula supervision", fractal-based.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
35
161
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
1 year
@CSProfKGD Here is my opinion
Tweet media one
3
4
163
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
I am preparing for my lectures on how to navigate CV/ML literature, how to read papers in an efficient way, re-implement them, etc. Which issues in this area do you personally have? Questions? What is the most misleading advice you ever got? Please write in the comments below
13
8
160
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
10 months
Enjoy the Image Matching Challenge 2023 recap: tl;dr: - SfM is not solved - global desc similarity is hard - orientation invariance for SuperGlue is easy - PixSfM is a good idea, but needs follow-ups - KeyNetAffNet @kornia_foss rocks #IMC2023 #CVPR2023 @CVPR
Tweet media one
Tweet media two
Tweet media three
Tweet media four
3
32
158
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
1 month
We are launching Structured Semantic 3D Reconstruction (S23DR) Challenge at @CVPR @structGeom workshop on @huggingface ! Prizes: 1: $10,000 2: $7,000 3: $5,000 Deadline: June 4, 2024 Task: point cloud+segmentations+monodepth ->roof wireframe. 1/2
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
36
160
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
4 years
So true :)
Tweet media one
@colinraffel
Colin Raffel
4 years
Hot take: Mathiness [1] is like an adversarial patch [2] for ML conference reviewers: Mathiness causes a reviewer to classify the paper as "accept" regardless of whether the math is useful/valid and the paper is any good. [3] Fig. 6 has some empirical evidence of this. (refs ⬇️)
14
82
510
1
35
159
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
1 month
Image Matching Challenge 2024 is live on #Kaggle ! Prize fund: $50k. Entry deadline: May 28 2024. Goal: reconstruct camera poses given images. Catch: many nuisance factors: illumination, nature, season, rotation, doppelgängers etc. Examples from train(easy) #IMC2024 @CVPR 1/3
Tweet media one
Tweet media two
Tweet media three
Tweet media four
3
37
157
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
5 years
Our 7th place solution to the Whale competition. Great thanks to @Ana_Geneva and Igor for teaming up, @radekosmulski for the starter kit and @fastdotai for the great library.
3
32
154
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
5 years
Just to make it clear: Kornia () is a differentiable OpenCV in Pytorch, an ambitious project led by @edgarriba . It has filters, geometry, local features, losses. And much more is planned.
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
5 years
@michalwols What exactly, THFeat, library? kornia is much more than local features, it is more like "pytorch differentiable OpenCV".
Tweet media one
1
2
6
2
33
150
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
Image as Set of Points #ICLR2023 anon submission tl;dr: SuperPixel meet ViT
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
33
151
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
3 months
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, Hengshuang Zhao tl;dr: DINOv2 meets MIDAS and ZoeDepth and unlabeled images with strong augmentation like CutMix.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
26
150
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
SLAM people, I have a couple of questions. I understand that ORBSLAM(1,2,3) is a great and well-engineered piece of software. But it is a paper from 2015. Many things have changed, e.g. hardware; we have full-HD (GPU)-SIFT in real time now. Why don't we have SuperSLAM at least?
Tweet media one
21
18
150
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
1 year
DeepMatcher: A Deep Transformer-based Network for Robust and Accurate Local Feature Matching Tao Xie, Kun Dai, Ke Wang, Ruifeng Li, Lijun Zhao tl;dr: include position embeddings everywhere, not only at the beginning, plus better "fine" refinement -> better LoFTR 1/2
Tweet media one
Tweet media two
Tweet media three
Tweet media four
7
26
147
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
Personal announcement: I am joining @HOVER3D to work with @giridhar89 on the WxBS application in real life -- house 3D reconstruction from several photos. I am also keeping my CTU in Prague affiliation and will continue to work on #IMC2022 and @kornia_foss .
Tweet media one
20
1
147
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
3 years
Tune It or Don't Use It: Benchmarking Data-Efficient Image Classification tl;dr: when carefully tuned (hyperparams), cross entropy works as well as, or better than, specialized "low-data" losses/methods. P.S. Why am I not surprised? #ICCV2021
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
25
143
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
The map of occupied Ukrainian territory. The enemy is not as fast as they dreamed. But they are here.
Tweet media one
4
16
135
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
1 month
Benchmarking Object Detectors with COCO: A New Path Forward Shweta Singh, Aayan Yadav, Jitesh Jain, Humphrey Shi, Justin Johnson, Karan Desai tl;dr: SAM-refined masks for MS-CoCo -> re-evaluated benchmark -> all methods score higher.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
25
135
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
3 years
Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning @janundnik , Neil Band, @clarelyle , @AidanNGomez , @tom_rainforth , @yaringal tl;dr: deep learning makes kNN 2.0 by modeling interactions between data points.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
30
135
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
3 years
Exploring Simple Siamese Representation Learning Xinlei Chen, Kaiming He Tl;dr: Stop-grad + siamese similarity is all you need for semi-supervised learning (in agreement with recent FROST paper by @lnsmith613 )
Tweet media one
Tweet media two
Tweet media three
Tweet media four
0
28
130
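The core of that tl;dr fits in a few lines; a simplified sketch of the symmetric stop-gradient loss (the projector/predictor heads and training loop are omitted):

import torch.nn.functional as F

def simsiam_loss(p1, p2, z1, z2):
    # p*: predictor outputs, z*: projector outputs for the two augmented views.
    # The stop-gradient (detach) on z is the whole trick.
    def d(p, z):
        return -F.cosine_similarity(p, z.detach(), dim=-1).mean()
    return 0.5 * (d(p1, z2) + d(p2, z1))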
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
Brief reminder to myself and #IMC2021 #CVPR2020 paper reviewers. Our paper now has 80 citations and has become one of the standard image matching benchmarks. Ofc, weak reject :)
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
4 years
We are happy to introduce our paper, rejected from #CVPR2020: "Image Matching across Wide Baselines: From Paper to Practice". It has so many messages that I will create a separate tweet-thread for each part.
Tweet media one
7
50
236
1
14
133
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 months
The premise of the DeDoDe paper: you don't need a learned matcher, mutual nearest neighbor is enough. LightGlue: hold my beer DeDoDe: soon in @kornia_foss thanks to @Parskatt . DeDoDe-LightGlue - trained by @kornia_foss
Tweet media one
Tweet media two
3
18
131
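Mutual nearest neighbor matching itself is a one-liner in @kornia_foss ; a sketch assuming two descriptor sets are already extracted (random tensors stand in for real DeDoDe descriptors):

import torch
import kornia.feature as KF

desc1 = torch.randn(500, 256)   # descriptors from image 1 (stand-ins)
desc2 = torch.randn(600, 256)   # descriptors from image 2 (stand-ins)

# match_mnn keeps only pairs that are each other's nearest neighbor.
dists, idxs = KF.match_mnn(desc1, desc2)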
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
1 month
FeatUp: A Model-Agnostic Framework for Features at Any Resolution @xkungfu , Mark Hamilton, Laura Brandt, Axel Feldman, Zhoutong Zhang, William T. Freeman tl;dr: in title. Upscale DINOv2, anyone? #ICLR2024
Tweet media one
Tweet media two
Tweet media three
Tweet media four
4
30
129
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
5 months
GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting Chi Yan, Delin Qu, Dong Wang, Dan Xu, Zhigang Wang, Bin Zhao, Xuelong Li tl;dr: NERF-SLAM, now with Gaussian splatting. 8.34 FPS, of course from RGB-D, not just monocular.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
17
129
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
1 year
Out-Of-Distribution Detection Is Not All You Need @jorisguerin , Kevin Delmas, @raulsferreira , Jรฉrรฉmie Guiochet tl;dr: ability to predict if model can make mistake >> good OOD performance.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
3
22
125
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
Right now, when civilians in Mariupol are being killed by Russian bombs, @IEEESpectrum spreads Russian propaganda in their recent article. WTF? @CVPR , we are IEEE/CVF conf. Can we have a public stance regarding war and spreading Russian propaganda?
Tweet media one
Tweet media two
8
35
128
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
4 months
Image matching folks, I know you couldn't celebrate Christmas without knowing that Image Matching Workshop 2024 has been accepted to #CVPR2024 :) Kudos to Eduard, @kwangmoo_yi @fabio_bellavia , Jiři, @lcmorelli3 @3DOMFBK Fabio @CVPR
Tweet media one
3
17
128
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
5 years
I wrote a starter pack for the @kaggle Landmark Retrieval 2019 competition. Since the competition data is too huge to iterate on quickly, I have used the SfM120k dataset from the GeM paper. All the validation and fast NN-search code is already there. The library is @fastdotai
1
30
124
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
Pros of @weights_biases : everything Cons: now you can do loss watching even from your phone :(
8
9
127
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
3 years
Convolutional Hough Matching Networks for Robust and Efficient Visual Correspondence tl;dr: 6D (x-y-scale1, x-y-scale2) correlation ->scale max-pool -> refine -> softargmax -> correspondences. Juhong Min, Seungwook Kim, Minsu Cho
Tweet media one
Tweet media two
Tweet media three
Tweet media four
3
25
126
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
Dear computer vision colleagues, I am really sorry that I am not working against the social media ban at #CVPR now. Once we are done with Russia and the war they started in Ukraine, we will resume our call for open science.
3
3
127
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
Laugh is a weapon too. Today I will be posting news in meme format, please share :)
Tweet media one
3
13
124
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
13 days
Probing the 3D Awareness of Visual Foundation Models @_mbanani , Amit Raj, @kmaninis , Abhishek Kar, Yuanzhen Li, Michael Rubinstein, Deqing Sun, Leonidas Guibas, @jcjohnss , @jampani_varun tl;dr: DINOv2 rules, but read the paper. SD is good for semantic correspondences
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
26
125
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 months
Rethinking Inductive Biases for Surface Normal Estimation @BaeGwangbin , @AjdDavison tl;dr: fighting bitter lesson with geometry for better normals. Code is available, but non-commercial
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
20
125
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
10 months
Detector-Free Structure from Motion Xingyi He, @JiamingSuen , Yifan Wang, Sida Peng, @qixing_huang , Hujun Bao, @XiaoweiZhou5 tl;dr: LoFTR-like->SfM -> lightweight PixSfM + BA interleaved. #IMC2023 1st place
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
30
121
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
1 year
Designing BERT for convolutional networks: sparse and hierarchical masked modeling Keyu Tian, Yi Jiang, Qishuai Diao, Chen Lin, Liwei Wang, Zehuan Yuan tl;dr: make the masked image look to a CNN the same way it looks to a transformer -> MIM starts working
Tweet media one
Tweet media two
Tweet media three
Tweet media four
4
27
119
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
10 months
You think you can Segment Anything? Try segmenting these trees then! #CVPR2023
Tweet media one
1
17
120
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
Everyone in Ukraine understands that and thus does not agree to just "peace". It means that in 5-10 years we will be completely destroyed by a reformed Russian army. We need to win. Don't wish us "may this end soon". Wish us victory.
@kamilkazani
Kamil Galeev
2 years
The best formula of institutional evolution is: 1. Scare them 2. Don't finish them It skyrockets the chance that they evolve. Right now the regime is very scared. So they're working fervently on integration with China and "deescalation" will buy them time they need desperately
Tweet media one
25
328
2K
0
26
118
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
3 years
Efficient Initial Pose-graph Generation for Global SfM Our new paper, without deep learning involved! Tl;dr: a bag of tricks for a 7x speed-up in camera pose-graph generation (29 hours for 402,130 image pairs vs 202 hours originally) Details in the thread 1/6
Tweet media one
Tweet media two
7
25
119
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
3 years
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions tl;dr: Pyramid makes everything better.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
28
120
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
1 year
A short survey+benchmark of the SuperPoint family: - SuperPoint, - eric-yyjau-superpoint, - Reinforced SP by @eric_brachmann et al - KP2D - LANet. tl;dr: the original is mostly better than the rest, except LANet. They also differ in which keypoints they detect.
11
27
120
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
1 year
Other people: release an arXiv paper and an empty repo. Some FB guys release code with "as described in our paper [TODO link]". Screenshots in the repo look very promising: better than SuperGlue on #IMC2022
Tweet media one
Tweet media two
5
22
116
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
5 years
I have summed up all the cool and not-so-cool things about the @fastdotai library, from the point of view of a first-time user lazy enough not to watch the lessons
2
31
119
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
10 months
PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment Jianyuan Wang @chrirupp , @davnov134 tl;dr:PoseRegression network + refinement via camera pose diffusion. Uses SuperGlue to get the correspondences guiding the diffusion
Tweet media one
Tweet media two
Tweet media three
Tweet media four
3
25
116
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 months
So far: - Our evaluation metrics have become common in many papers, - better RANSACs, etc. have gained popularity, - more awareness of proper evaluation, - 300 citations, - the IJCV journal paper is not much different from the CVPR submission. Reviewers, are you happy with the rejection?)
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
4 years
We are happy to introduce our paper, rejected from #CVPR2020: "Image Matching across Wide Baselines: From Paper to Practice". It has so many messages that I will create a separate tweet-thread for each part.
Tweet media one
7
50
236
2
2
118
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
Panoptic Neural Fields: A Semantic Object-Aware Neural Scene Representation Abhijit Kundu et 8 al. tl;dr: Make NERF predict not only RGB, but also semseg, depth, instance seg. 1 MLP for background, several for dynamic obj. It proposes learned init. 1/2
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
27
119
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
4 years
I am struggling to find any sense in the PhD thesis as an information-unit form factor. I see use-cases, value and advantages of: - blogpost - paper - textbook - course - reference book - source code. I see no advantages and use-cases (except PhD req satisfaction) for the thesis.
23
14
117
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
1 year
Deep Learning for Camera Calibration and Beyond: A Survey Kang Liao, Lang Nie, Shujuan Huang, Chunyu Lin, Jing Zhang, Yao Zhao, Moncef Gabbouj, Dacheng Tao tl;dr: well, it is a good survey :) updated at :
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
32
116
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
2 years
2nd Place Solution to Facebook AI Image Similarity Challenge Matching Track SeungKee Jeon tl;dr: horizontally cat two images -> feed into a (SimCLR-pretrained) ViT -> classifier (match/non-match). As simple as that. Comp:
Tweet media one
Tweet media two
6
24
114
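The pipeline shape from that tl;dr, sketched with timm (the actual solution's SimCLR pretraining, augmentations and resolution are not reproduced here):

import torch
import timm

# Binary match / non-match classifier over a side-by-side image pair.
model = timm.create_model("vit_base_patch16_224", pretrained=False, num_classes=2)

img_a = torch.randn(1, 3, 224, 112)          # left half: query image
img_b = torch.randn(1, 3, 224, 112)          # right half: reference image
pair = torch.cat([img_a, img_b], dim=-1)     # concat along width -> 224x224 input
logits = model(pair)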
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
1 year
@vboykis What are you all talking about? That is a clear linear regression problem, they are typically using it for job interviews…
Tweet media one
1
3
111
@ducha_aiki
Dmytro Mishkin ๐Ÿ‡บ๐Ÿ‡ฆ
6 months
Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model Ruoxi Shi, Hansheng Chen, Zhuoyang Zhang, Minghua Liu, Chao Xu, Xinyue Wei,Linghao Chen,Chong Zeng, Hao Su tl;dr: generate 2x3 image views, change noise schedule, reference attention
Tweet media one
Tweet media two
Tweet media three
Tweet media four
4
23
113