📢 A new learning-based approach to SfM:
#ACEZero
No img-to-img matching, optimises image-to-scene correspondences directly. Needs no pose priors. Works on unordered image sets. Efficiently handles thousands of images.
Paper:
Page:
✨ Presenting MicKey (
#CVPR2024
, Oral) ✨
We regress and match 3D camera coordinates rather than 2D keypoints, all in metric space. Gives you a scaled relative pose between two RGB images.
Paper:
Page:
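MicKey's actual solver is not shown in this thread, so as a hedged illustration only: the reason metric 3D-3D correspondences yield a pose with real scale is that a rigid alignment such as the classic Kabsch algorithm recovers R and a metrically scaled t directly. Data and function names below are invented for the sketch.

```python
import numpy as np

def kabsch(P, Q):
    """Rigid transform (R, t) aligning P to Q, both (N, 3): Q ~ P @ R.T + t."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)                  # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cQ - R @ cP

# toy check: recover a known rotation and metric translation
rng = np.random.default_rng(0)
P = rng.normal(size=(50, 3))                   # "3D camera coordinates", image 1
a = 0.3                                        # rotation angle in radians
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
t_true = np.array([0.5, -1.0, 2.0])            # translation in metres
Q = P @ R_true.T + t_true                      # same points, frame of image 2
R_est, t_est = kabsch(P, Q)
```

With noisy predictions you would wrap this in RANSAC, but the scale of t stays metric either way, because the input points are.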
A
#CVPR2023
Highlight✨:
♥️♠️ACE: Accelerated Coordinate Encoding♣️♦️
We learn implicit, neural maps in minutes that let us relocalize with SOTA accuracy.
📄:
💻:
📽️:
work by
@NianticLabs
Research
We've published
@pytorch
code of Differentiable RANSAC for a toy problem: fitting lines. A CNN learns to predict points (middle) to which we robustly fit lines, trained end2end with DSAC. Right: A CNN which learns to predict line parameters directly. Code:
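A hedged, numpy-only sketch of the line-fitting toy (the released code is differentiable PyTorch; the data, thresholds, and hypothesis counts below are mine). The DSAC trick is the last step: a softmax distribution over hypotheses instead of RANSAC's hard argmax, so an expected task loss can be backpropagated through hypothesis selection.

```python
import numpy as np

rng = np.random.default_rng(1)

# noisy samples of y = 0.5x + 1, with the first 5 points turned into outliers
x = np.linspace(0.0, 1.0, 30)
y = 0.5 * x + 1.0 + 0.01 * rng.normal(size=30)
y[:5] = rng.uniform(0.0, 3.0, size=5)
pts = np.stack([x, y], axis=1)

def fit_line(p, q):
    """Slope and intercept from a minimal set of two points."""
    m = (q[1] - p[1]) / (q[0] - p[0] + 1e-9)
    return m, p[1] - m * p[0]

def soft_inliers(m, b):
    """Soft (differentiable) inlier count via a steep sigmoid."""
    r = np.abs(pts[:, 1] - (m * pts[:, 0] + b))
    return np.sum(1.0 / (1.0 + np.exp(100.0 * (r - 0.05))))

# sample minimal sets, fit hypotheses, score them
hyps = []
for _ in range(64):
    i, j = rng.choice(len(pts), size=2, replace=False)
    hyps.append(fit_line(pts[i], pts[j]))
scores = np.array([soft_inliers(m, b) for m, b in hyps])

# DSAC: a hypothesis *distribution* -> differentiable expected task loss
probs = np.exp(scores - scores.max())
probs /= probs.sum()
task_loss = np.array([abs(m - 0.5) + abs(b - 1.0) for m, b in hyps])
expected_loss = float(probs @ task_loss)
```

In the real pipeline the points come from a CNN and the gradient of `expected_loss` flows back through `probs` into the network; here the gradients are omitted.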
Read papers to write good papers.
Review papers to write great papers.
Read rebuttals to write good rebuttals.
Discuss rebuttals to write great rebuttals.
Great papers and great rebuttals are no guarantee to get accepted. Iterations are normal.
Today, we launch Map-Free Visual Relocalisation. A twist on the usual relocalisation formula. A new dataset. A new benchmark.
Check out our video!
#MapFreeReloc
to be presented at
#ECCV2022
Major update and code release for our
#relocalization
pipeline, DSAC*
(Updated) paper:
Code:
Use RGB or RGB-D, train with an SfM model or a 3D scan. Train from images and poses alone and DSAC* discovers coarse geometry by itself:
We dedicate our oral presentation at
#CVPR2024
in Seattle to R2, who - in a great moment of integrity and reason - raised their initial "weak reject" rating to "strong reject" after rebuttal.
We hope they enjoy every minute of our 15 minute talk.
RGB image in, set of 3D primitives out. A
#CVPR2021
paper with
@florian_kluger
, H. Ackermann, M. Yang and B. Rosenhahn!
#ComputerVision
abs:
code:
We take RANSAC out of its comfort zone into scene understanding territory.
With the
#CVPR2023
content being public now, so are the recordings of the talks I gave.
Firstly, I gave a talk at the
#IMC2023
workshop:
"Pose Estimation Beyond Feature Matching"
I recommend watching the whole session of this awesome workshop!
1/2
I really like the Twitter format for presenting papers. So I thought I'd replicate that for my personal website (neglected for years...)
It's a simple static HTML, feel free to use as template.
Due to requests at
#ECCV2022
and to make our
#MapFreeReloc
dataset useful for more tasks, we make the SfM reconstructions of our train set publicly available.
🔥460 SfM models of outdoor scenes all around the world 🔥
Want to train 460 NeRFs? Go ahead.
All talks of our
#ICCV2021
tutorial on visual localisation are available on YouTube.
In my talk about learning-based localisation, I discuss a challenge, a promise and a compromise.
Watch it here:
Links to all the amazing talks in 🧵👇
Third and last code package for NG-RANSAC (
#ICCV2019
) is online: NG-DSAC++ for camera re-localization, a re-implementation of DSAC++ in
@PyTorch
, extended with neural guidance.
Code:
Paper:
w/
@Carsten_Rother
@LabHeidelberg
🔥Map-relative Pose Regression🔥 (
#CVPR2024
highlight)
For years absolute pose regression did not work. There was some success by massively synthesising scene-specific data. We train scene-agnostic APR and it works.
Paper:
Page:
We are hiring! Want to work with me and an amazing team at
@NianticEng
on re-localisation at a global scale? Bring cutting-edge research into the hands of millions of users?
Consider applying for our Mapping and Localisation MLE role:
Niantic research has a strong presence at
#CVPR2023
with 5 papers (one highlight) and various contributions throughout workshops and tutorials.
Diffusion, NeRF, relocalisation, object pose, feature matching, depth and occlusions.
Here is where you can catch us:
2 papers accepted to
#cvpr2020
🥳
Reinforced Feature Points: Use classic REINFORCE to optimize feature detection and description for the task you care about
CONSAC: fit multiple parametric models by learned sequential search (w/
@florian_kluger
)
More info soon!
#ComputerVision
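The core REINFORCE mechanic behind Reinforced Feature Points, stripped of all feature-detection specifics: treat the discrete choice (which points to keep) as sampling, and use the score-function estimator to push probability toward selections that earn task reward. Everything here (rewards, learning rate, baseline) is an invented toy, not the paper's pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# invented per-point reward: the first 5 "keypoints" help the task, the rest hurt
reward_per_point = np.array([1.0] * 5 + [-1.0] * 5)

logits = np.zeros(10)       # selection parameters we optimise
baseline, lr = 0.0, 0.5

for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-logits))             # selection probabilities
    sel = (rng.random(10) < p).astype(float)      # non-differentiable sample
    R = float(sel @ reward_per_point)             # task reward of this sample
    # REINFORCE: grad E[R] = E[(R - baseline) * grad log p(sel)]
    logits += lr * (R - baseline) * (sel - p)     # (sel - p) = d log p / d logits
    baseline = 0.9 * baseline + 0.1 * R           # running baseline cuts variance

p_final = 1.0 / (1.0 + np.exp(-logits))
```

After training, the useful points are selected with high probability and the harmful ones are suppressed, despite never differentiating through the sampling step itself.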
.
@SattlerTorsten
talking about "Old School" methods at the
#ICCV2021
tutorial on visual localization. "Old" but not outdated!
Join us here:
I will talk about the "New School" later.
When you try to solve difficult image pairs, it's important that you do not overshoot and start to hallucinate connections between unrelated images. The
#MapFreeReloc
benchmark checks for that. The inlier count of MicKey is pretty good in separating solvable and unsolvable cases.
Some motivation if you didn't make it into
#CVPR
:
- our first object coordinates paper: rejected from CVPR14, barely made it into ECCV14 (>250 citations)
- follow up: rejected from ICCV15, made it into CVPR16 (~200 citations)
- ESAC: rejected from CVPR19, made it into ICCV19
To
#CVPR2023
reviewers: Remember there are humans on the other end. Be strict in the matter but respectful in tone. Consider even being friendly in tone. Someone worked hard on this and is proud. Don't bend, but sweeten the pill.
With the
#ACErelocalizer
, we reduced mapping times from 15 hours to 5 minutes. You can imagine that further speed improvements are exponentially harder. Yet, we managed to squeeze out another 15% speed-up. Let me walk you through the steps:
1. Upgrade to
@pytorch
2.x
Fin.
New retro wave of
#ComputerVision
: NG-RANSAC brings you the greatest hits of the 80s: RANSAC, multi-layer perceptrons, (classic) reinforcement learning. Lens flare for visualization of "cool" only. The paper:
#ICCV2019
#ICCV19
#ICCV
#DeepLearning
Automatic generation of ground truth is great but caution is advised.
Upcoming for
#ICCV2021
: For vis. relocalisation, we show that depending on how you generate GT, the ranking of relocalisers flips upside down:
@SattlerTorsten
@martinhu
@Carsten_Rother
Submitting relocalisation via pose regression to
#ICCV2023
? Rejected from
#CVPR2023
? Consider the
#MapFreeReloc
benchmark.
No need to beat DSAC*, hLoc, AS... they do not apply. This benchmark was created just for you 🫵
Also if you work on:
- features
- depth
- uncertainty
🧵
We uploaded alternative "ground truth" and full SfM models for the
#relocalisation
datasets 7Scenes and 12Scenes.
Working on reloc towards
#ECCV2022
and having trouble beating SOTA? The "ground truth" might play a role.
#ICCV2021
#betterlatethannever
Stumbled across this
#ICCV2023
paper extending
#MapFreeReloc
to panoramic indoor views. Makes a lot of sense: A panorama serves as a one-shot map with wide coverage.
Calibrating Panoramic Depth Estimation for Practical Localization and Mapping, Kim et al.
The moment has finally come. I tried to get the best education possible to prepare for this. Worked hard. Let's see whether it all paid off.
My son learned the word "why".
Personally, I find these results all the more remarkable considering that MicKey does not use any cross-attention. Keypoints and descriptors are predicted per image, without considering the other view. Just vanilla matching and RANSAC after that.
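"Vanilla matching" here presumably means mutual nearest neighbours on the per-image descriptors; a toy numpy sketch (not MicKey's actual matcher, descriptors invented):

```python
import numpy as np

def mutual_nn(desc_a, desc_b):
    """Matches (i, j) where a_i's nearest neighbour in b and vice versa agree."""
    # pairwise L2 distances between the two descriptor sets
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=-1)
    nn_ab = d.argmin(axis=1)          # best b index for each a
    nn_ba = d.argmin(axis=0)          # best a index for each b
    return [(i, int(j)) for i, j in enumerate(nn_ab) if nn_ba[j] == i]

rng = np.random.default_rng(0)
desc_b = rng.normal(size=(5, 8))                              # keypoints, image B
desc_a = desc_b[[2, 0, 4]] + 0.01 * rng.normal(size=(3, 8))   # noisy copies, image A
matches = mutual_nn(desc_a, desc_b)
```

The mutual check is what keeps unmatched keypoints from producing spurious correspondences; RANSAC then handles the rest.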
Beyond providing metric estimates, MicKey can also deal with extreme viewpoint changes, up to opposing shots. Here are a few examples of MicKey correspondences.
Learning-based visual relocalisation has made quite some progress over the years.
I'm giving an overview in our
#CVPR2023
tutorial on:
📢 "Large-Scale Visual Localisation"
Mark June 19th in your calendar. Check out the full schedule of amazing talks:
Strange. My Twitter feed somehow suppresses all the tweets of PhD students from small labs that complain about the
#CVPR2022
social media motion. I see, however, plenty of tweets by 1k+ follower accounts complaining that the former group will be at a disadvantage now.
What's
#MapFreeReloc
?
A learned model induces a scale-metric space, conditioned on a single reference frame. We localise new queries in that space.
I will talk about "Pose Estimation Beyond Feature Matching" on Monday morning at
#CVPR2023
at the Image Matching Workshop
#IMC2023
Dear
#ECCV2024
reviewer,
"see weaknesses" is not a good "justification of rating". Every paper has weaknesses. The AC needs to understand why you think the weaknesses outweigh the strengths (or vice versa).
Best,
Our call for research interns
@NianticEng
for 2022 is live! Interested in pushing the frontiers of AR? So are we. Work with us on a wide range of topics in the spectrum of
#ComputerVision
and
#MachineLearning
!
Interested? More info and application forms:
You would assume that since 2012
#DeepLearning
was applied to everything and their mother. But, as far as we know, CONSAC is the first learned multi-model robust estimator.
By
@florian_kluger
#CVPR2020
arXiv:
code:
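CONSAC's learned sequential sampling is the contribution; what it improves on is the classic sequential-RANSAC baseline, which fits in a few lines: fit one model, peel off its inliers, repeat. A hedged numpy sketch with invented data and thresholds:

```python
import numpy as np

rng = np.random.default_rng(0)

# two line instances plus uniform clutter, all invented for illustration
x = np.linspace(0.0, 1.0, 40)
pts = np.concatenate([
    np.stack([x, 1.0 * x], axis=1),             # line 1: y = x
    np.stack([x, -0.5 * x + 2.0], axis=1),      # line 2: y = -0.5x + 2
    rng.uniform(0.0, 2.0, size=(20, 2)),        # clutter
])

def ransac_line(P, iters=200, thresh=0.03):
    """Best line (point, unit normal) and its inlier mask."""
    best_model, best_inl = None, None
    for _ in range(iters):
        i, j = rng.choice(len(P), size=2, replace=False)
        d = P[j] - P[i]
        n = np.array([-d[1], d[0]])             # normal of the line through i, j
        n = n / (np.linalg.norm(n) + 1e-12)
        inl = np.abs((P - P[i]) @ n) < thresh   # perpendicular distance test
        if best_inl is None or inl.sum() > best_inl.sum():
            best_model, best_inl = (P[i], n), inl
    return best_model, best_inl

# sequential RANSAC: fit a model, remove its inliers, repeat
models, remaining = [], pts.copy()
while len(remaining) > 10:
    model, inl = ransac_line(remaining)
    if inl.sum() < 15:        # no strong model left in the residual points
        break
    models.append(model)
    remaining = remaining[~inl]
```

The weakness CONSAC attacks is visible here: the greedy peeling order is fixed, whereas a learned sampler can condition each search on what was found so far.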
Today, Wednesday morning, at
#ICCV2019
, I will present NG-RANSAC at poster stand 143 (far end of the expo hall). I will try to explain the method on a simple toy problem, such that even I could understand it :) Come by, say hello!
#ICCV
#ICCV19
#DeepLearning
#MachineLearning
Two papers accepted to
@ICCV19
! Neural-Guided RANSAC (NG-RANSAC): A neural network guiding RANSAC data point selection, and Expert Sample Consensus (ESAC): An ensemble of scene coordinate experts for scalable camera re-localization.
#ICCV2019
#ComputerVision
#DeepLearning
I keep reading in papers that RANSAC is so excruciatingly slow that getting rid of it is a contribution.
Watch DSAC* estimating poses from thousands of correspondences in 30ms per frame, including network execution and RANSAC.
Brrr.
You know these annoying 16bit PNGs, eg depth maps, that your default image viewer probably does not show correctly? Did you know that
#ImageJ
can open them? Did you also know that
#ImageJ
has been ported to JavaScript and runs in your browser!? 😲
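If you'd rather fix those PNGs in code than fire up ImageJ: a viewer just needs the 16-bit range stretched down to 8 bits. A minimal numpy sketch on a synthetic depth map (a real one would come from any PNG loader that preserves 16 bits):

```python
import numpy as np

def to_viewable(depth16):
    """Min-max stretch a 16-bit depth map to 8 bits so any viewer shows it."""
    d = depth16.astype(np.float64)
    lo, hi = d.min(), d.max()
    if hi == lo:
        return np.zeros_like(depth16, dtype=np.uint8)
    return np.round((d - lo) / (hi - lo) * 255.0).astype(np.uint8)

# synthetic stand-in for a loaded 16-bit depth PNG
depth = (np.linspace(0.0, 1.0, 16).reshape(4, 4) * 60000).astype(np.uint16)
view = to_viewable(depth)
```

A plain cast to uint8 would wrap around instead; the normalisation is why the default viewer output looks black or banded without it.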
Extending on my earlier post about research trends at
#CVPR2018
, I wrote a little python script that plots topic popularity (measured by keyword matches against paper titles) over time. It's a simple Jupyter notebook, so you can play around yourself:
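The notebook itself is not reproduced in this thread; its core is just keyword counting over titles per year. A minimal stand-in with hypothetical titles:

```python
import re

# hypothetical paper titles per year, stand-ins for the scraped CVPR lists
titles = {
    2017: ["Deep Residual Something", "Semantic Segmentation via CRFs"],
    2018: ["Deep Pose Estimation", "Deep Segmentation at Scale"],
}

def topic_counts(titles_by_year, keyword):
    """Fraction of titles per year matching a (case-insensitive) keyword."""
    pattern = re.compile(keyword, re.IGNORECASE)
    return {
        year: sum(bool(pattern.search(t)) for t in ts) / len(ts)
        for year, ts in titles_by_year.items()
    }

deep = topic_counts(titles, "deep")
```

Plotting the resulting fractions per year (e.g. with matplotlib) gives the popularity curves.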
Thanks
@ducha_aiki
for sharing our work faster than we could get it to arXiv. It's available there as well, now.
TL;DR? Upload the gist of it into your brain via YouTube:
▶️
w/
@axelbarrosotw
,
@viprad
, Gabe, and
@dantkz
Two-view Geometry Scoring Without Correspondences
Axel Barroso-Laguna,
@eric_brachmann
Victor Adrian Prisacariu Gabriel Brostow
@dantkz
tl;dr: in title + nice analysis of the metrics for the epipolar geometry. I almost want to write a blogpost-review :)
📢📢 Announcing the 8th workshop on Recovering 6D Object Pose (
#R6D
) at
#ICCV2023
📢📢 CFP: We invite paper submissions of unpublished works covering object pose estimation and related topics. Deadline: July 24
#BOP
challenge 2023 also in the works!
I really like that I'm not able to rate "borderline" in the final
#ICCV2021
assessment. I'm split regarding several submissions, now I have to think twice as I have to lean in some direction. An unexpectedly enjoyable inner fight.
6/6
#CVPR2022
reviews done. Strong stack this one: worst score is borderline - never had this before. Innovative ideas and above-average writing. I'm impressed.
The slides for my talk at the R6D workshop at
#iccv2019
are available at the workshop website:
A summary of our work on differentiating PnP, differentiable RANSAC, differentiable correspondence selection and differentiable expert selection.
#DeepLearning
Finally, there is an extensive benchmark for 6D object pose estimation, presented at
#ECCV2018
. No learned method in top 3. Best learned method uses Random Forests instead of CNNs. Want to defend the honor of deep learning? It's a running competition ;)
📢 Benchmark for 6D Object Pose Estimation 📢
#BOP
challenge 23 opened!
Results to be presented at the
#R6D
workshop at
#ICCV2023
.
The challenge has always pushed the field forward. This year: "unseen objects". Can you onboard a new object in 5 mins?
A small thread on 3D rotations: Both log-quaternions (log-q) and axis-angles (aa) represent rotations with 3 parameters. But they are not the same, related by a factor of 2. The length of aa gives you the rotation angle, the length of log-q gives you half that angle.
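The factor of 2 is quick to verify numerically. A small numpy check (helper names are mine): the log of the unit quaternion built from an axis-angle vector points along the same axis with half the length.

```python
import numpy as np

def aa_to_quat(aa):
    """Unit quaternion [w, x, y, z] from an axis-angle vector (length = angle)."""
    theta = np.linalg.norm(aa)
    if theta < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    axis = aa / theta
    return np.concatenate([[np.cos(theta / 2.0)], np.sin(theta / 2.0) * axis])

def quat_log(q):
    """Log map of a unit quaternion -> 3-vector: (angle / 2) * axis."""
    w, v = q[0], q[1:]
    s = np.linalg.norm(v)
    if s < 1e-12:
        return np.zeros(3)
    return np.arctan2(s, w) * v / s

aa = np.array([0.0, 0.0, 1.2])        # axis-angle: 1.2 rad about z
lq = quat_log(aa_to_quat(aa))         # same axis, half the length
```

So mixing up the two conventions silently doubles or halves your rotation angles, which is easy to miss for small rotations.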
This summer I've been working to finally understand Lie Theory, the basis for proper estimation on over-parameterised manifolds like SE(3). There are some great tutorials for the roboticist out there; I especially like Micro Lie Theory by Solà et al.
On Wednesday I will be giving a talk at
@naverlabseurope
on visual localization. I will talk about differentiable RANSAC and new stuff, including learning to guide RANSAC and scalable re-localization. If you are in Grenoble, come by!
To add to my list of motivating failures: I had 2 submissions lined up for
#ECCV2020
. One stopped a few weeks ago, one stopped a few hours before deadline. So that happens, too!
At the same time, this is my contribution to
#slowscience
. You are welcome.
One of our drafts was rejected from
#ICCV2021
. While I'm not happy with the reasoning of the ACs, I do appreciate the transparency. It is clear that the draft was flagged by the AC as needing discussion, that the discussion took place, and what it was about.
@ICCV_2021
setting new standards.
Hot off the arXiv press: Results of the
#BOPchallenge
of this year's
#ECCV20
in handy paper format:
What's the best 6D object pose estimator in town? Do we still need depth for high accuracy? Thanks
@tomhodan
and
@ma_sundermeyer
for putting it together.
📢 The
#BOPchallenge2022
report is here! How did object pose estimation evolve since
#BOP
2020?
- Avg precision 69.8% (2020) -> 83.7% (2022)
- Training on synthetic data costs you merely 1%
- top RGB method of 2022 beats top RGB-D method of 2020 🤼
In my visual relocalization tutorial talks, I always had an up-to-date slide measuring the progress of APR methods compared to classical approaches (dotted line). Took us less than 10 years 💪
If you are interested in visual camera re-localization: We improved DSAC++ a bit, and called it DSAC*. I still have to prepare the code but you can have a peek at the tech report below. Trains faster, tests faster, less memory, more accuracy. Also, RGB-D if you want.
Quick glimpse at our
#CVPR2020
oral on end-to-end learning of sparse feature detection, including discrete matching and RANSAC-based model fitting. But don't waste your time on the 1min version; you'll want to learn more anyway... The full sweet 5 mins:
Looks like we have a new top entry for
#MapFreeReloc
with huge improvements over SuperGlue and LoFTR.
@Parskatt
wasted no time after my
#IMC2023
talk at
#CVPR2023
.
Judging from the entry title, it's this paper:
The 2023 BOP challenge report is out, measuring progress in 6DoF instance pose estimation:
Accuracy is excellent if objects are known in advance. For unseen objects, still pretty good but slow!
Middle: best method for known objects, right: unseen obj.
Join us at
#3DV2021
, oral session 3 (9am/10pm GMT) where we are about to present
Visual Camera Re-Localization using Graph Neural Networks and Relative Pose Supervision
Stellar work by
@0ozgur0
during an internship at
@NianticEng
.
We should rethink the distinction of hand-crafted vs. learning-based. Learning-based methods are hand-crafted (architecture, flow of information, domain knowledge), and hand-crafted methods have free parameters. Often the distinction is: Who optimizes the parameters, you or ADAM?
Ommm. I will not become Schmidhuber.
Ommm. I will not become Schmidhuber.
Ommm. I will not become Schmidhuber.
Ommm. I will not become Schmidhuber.
Ommm. I will not become Schmidhuber.
Ommm. I will not become Schmidhuber.
...
@ducha_aiki
@ixtiyoruz312
@yash_patel2307
@majti89
If the paper claims that there is no differentiable RANSAC yet in the first paragraph while citing a paper that has differentiable RANSAC in the name, I certainly have questions that require more than 7 lines.
Today, Thursday afternoon, at
#ICCV2019
I'll present our 2nd poster. ESAC, a variant of (differentiable) RANSAC, allows you to train an ensemble of expert networks. I'll explain it using lines and circles but it can also solve useful problems - like re-localization. Poster
#153
Doing my CVPR reviews, I realize how desperately we need a unified evaluation framework for 6D object pose estimation from RGB images. Several great papers have been written on the topic in the last few years, but all of them evaluate slightly differently.
The abstract: Convince a potential reviewer, who has zero time, zero attention span, and might see his previous work diminished by your claims, that your research has some worth. In 10 lines. No pressure.
#CVPR2020
A new iteration of the
#R6D
workshop is in the making! What object pose estimation
#magic
have you been cooking up? How did the community progress on this task? Tell us with a paper submission! And get your GPUs warmed up, a new BOP
#challenge
will be organized as well.
#ECCV2020
Just wasted an hour trying to create a mesh from a noisy point cloud... Clearly, 2019 is not ready for this. Any hints highly appreciated. I was using MeshLab but found it quite useless for this task. On the other hand, I was just pressing random buttons like a monkey.