Robot Web: breakthrough many-robot localisation. Uses efficient, general message passing over dynamic graphs. Accurate and highly robust to sensor/comms failure.
@rmurai0610
@joeaortiz
@SaeediG
@paulhjkelly
Full video demo:
Paper:
Many cool demos in the exhibition hall at
#ICRA2023
of advanced robots of all types. I liked this one which seems quite simple in comparison but just very clearly showing very fast and precise motor control.
New: 3D neural fields like NeRF do automatic, unsupervised semantic scene decomposition. We reveal it with tiny interactions. Real-time SOTA room segmentation from 140 clicks; zero prior data! iLabel, Dyson Robotics Lab,
@SucarEdgar
@Shuaifeng_Zhi
et al.
This summer I've been working to finally understand Lie Theory, the basis for proper estimation on over-parameterised manifolds like SE(3). There are some great tutorials for the roboticist out there; I especially like Micro Lie Theory by Solà et al.
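A toy illustration of the core idea in those tutorials (my own sketch, not code from Solà et al.): updates to a rotation happen in the tangent space and are mapped back via the exponential map, so the estimate never leaves the manifold and needs no renormalisation.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to its skew-symmetric matrix (an element of so(3))."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Exponential map so(3) -> SO(3) via Rodrigues' formula."""
    theta = np.linalg.norm(w)
    if theta < 1e-9:
        return np.eye(3) + hat(w)  # first-order approximation near identity
    K = hat(w / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * K @ K

# A manifold update: perturb a rotation by a small tangent-space step.
R = exp_so3(np.array([0.0, 0.0, np.pi / 2]))          # 90 degrees about z
R_updated = R @ exp_so3(np.array([0.01, 0.0, 0.0]))   # right-perturbation
```

The same pattern (perturb in the tangent space, retract with exp) is what makes least-squares estimation on SE(3) well behaved despite the over-parameterised matrix representation.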
Very proud, and still a bit shocked, to share that I've been elected Fellow of the Royal Society. Thank you to my students and collaborators from Imperial College, Oxford, AIST, Dyson, Slamcore and beyond, and of course to my supportive family and friends!
We are very happy to announce the eighty exceptional scientists elected as Fellows of the Royal Society this year, selected for their outstanding contributions to the advancement of science. Meet the new Fellows and find out more about their research:
What an honour to win a PAMI Helmholtz Award at
#ICCV2021
. DTAM (ICCV 2011) is a paper with few results, and barely got in as a poster. But once
@rapideRobot
and
@stevenjl
got the big laptop to Barcelona we showed people a live dense SLAM and AR demo they had never seen before!
iMAP is a new way to do SLAM: we learn an implicit neural representation *in real time* and track an RGB-D camera against it. The implicit map fills holes; completes the unseen backs of objects; and maps a whole room in only 1MB of weights. From the Dyson Robotics Lab, Imperial.
Excited to share iMAP, first real-time SLAM system to use an implicit scene network as map representation.
Work with:
@liu_shikun
,
@joeaortiz
,
@AjdDavison
Project page:
Paper:
The thing I'm proudest of in my career is the work I've done with the PhD students I've supervised at Imperial College. Their final PhD theses are sometimes hard to find so I've gathered links to all of them on my homepage at ... please have a look! 1/n
Yes! Zero to One research is also something I'm always trying to do and explain to my students. I strongly feel it's too early for benchmarking in much of vision/robotics/AI, when basic things are still not possible. We need demos of new capability, not tables!
#ICCV2021
, Dyson Robotics Lab at Imperial: iMAP is the first SLAM system based on continual, real-time learning of an implicit neural representation. In 3 minutes a 1MB MLP model captures global shape and detail, with convincing scene completion despite no prior training data.
We will be presenting our new real-time SLAM system iMAP at
#ICCV2021
! With a neural implicit scene representation it can map scenes efficiently, fill holes, and jointly optimise the 3D map and camera poses
work with:
@liu_shikun
,
@joeaortiz
,
@AjdDavison
Who else can smell the end of big-data supervised learning in the air?
@ylecun
😃 Certainly densely hand-labelled image datasets don't make much sense to me after using iLabel.
New: 3D neural fields like NeRF do automatic, unsupervised semantic scene decomposition. We reveal it with tiny interactions. Real-time SOTA room segmentation from 140 clicks; zero prior data! iLabel, Dyson Robotics Lab,
@SucarEdgar
@Shuaifeng_Zhi
et al.
All researchers should fight against this. Every week I try to persuade my students that top papers often have few quantitative results. With work that's new, important, and clearly qualitatively different (zero to one!), you don't need quantitative results. Demos not tables!
Using Gaussian Belief Propagation as in Robot Web, we now show dynamic multi-robot *planning* via p2p comms, no central solver needed. In a scaled simulation, cars slide closely past each other at motorway speeds.
@AalokPat
@rmurai0610
, Dyson Robotics Lab,
Just an implementation of the Dynamic Window Approach planner (essentially sampling-based short-range MPC) I did for teaching. A reminder of how cool simple non-learned planning can be when perception is assumed solved so the planner has full state knowledge.
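A minimal sketch of the DWA idea (illustrative only, not the actual teaching code; the cost weights and coarse 3x3 command sampling are my own choices): sample velocity commands reachable within the acceleration limits, roll each out for a short horizon, discard rollouts that hit obstacles, and score the rest by progress to the goal.

```python
import math

def dwa_step(x, y, th, v, w, goal, obstacles,
             dt=0.1, horizon=1.0,
             v_max=1.0, w_max=2.0, a_v=0.5, a_w=2.0):
    """One Dynamic Window Approach step: sample reachable (v, w) commands,
    roll each out for a short horizon, and score the predicted endpoint."""
    best, best_cmd = -math.inf, (0.0, 0.0)
    # dynamic window: velocities reachable within one step given accel limits
    for dv in [-a_v * dt, 0.0, a_v * dt]:
        for dw in [-a_w * dt, 0.0, a_w * dt]:
            vc = min(max(v + dv, 0.0), v_max)
            wc = min(max(w + dw, -w_max), w_max)
            # constant-velocity rollout of the unicycle model
            px, py, pth = x, y, th
            t, ok = 0.0, True
            while t < horizon:
                px += vc * math.cos(pth) * dt
                py += vc * math.sin(pth) * dt
                pth += wc * dt
                if any(math.hypot(px - ox, py - oy) < r
                       for ox, oy, r in obstacles):
                    ok = False  # rollout collides: discard this command
                    break
                t += dt
            if not ok:
                continue
            progress = -math.hypot(goal[0] - px, goal[1] - py)  # closer is better
            score = progress + 0.1 * vc                         # mild speed bonus
            if score > best:
                best, best_cmd = score, (vc, wc)
    return best_cmd
```

Run in a loop, it is exactly short-range sampling MPC: re-plan every step from the current full state, which is why it works so cleanly when perception is assumed solved.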
New: Feature-realistic neural fusion for real-time, open set scene understanding. Our neural field renders to feature space, enabling real-time grouping and segmentation of similar objects or parts from ultra-sparse, online interaction. Dyson Robotics Lab.
Can anyone explain if there is a difference between unsupervised and self-supervised learning? To me they seem the same and I find myself using both terms interchangeably (I prefer unsupervised), but I feel like I'm confusing people who understand them to mean different things.
Great live demos in Marco Hutter's keynote at
#ICRA2022
. And they had a video of the robot doing the Rocky steps and celebration here in Philadelphia!
We add semantics outputs to NeRF models of 3D occupancy/colour. Joint representation allows very sparse or noisy in-place supervision to generate high quality dense prediction.
Dyson Robotics Lab
#ICCV2021
Oral
@Shuaifeng_Zhi
@tlaidlow
@StefanLeuteneg1
.
New demo, with turns and swerves, of distributed real-time multi-agent planning/MPC; no central control needed. Uses GBP and p2p message passing over the joint factor graph so arbitrarily scalable and robust.
@AalokPat
@rmurai0610
Dyson Robotics Lab
So according to new research from DeepMind, using 3D vision/SLAM tools to build an explicit representation of a scene is useful for robotics...who would have thought it?
Creating photorealistic simulations of unstructured scenes is hard. Using NeRFs we turn 5min videos into simulations, train vision-guided policies for humanoid robots, and show zero-shot transfer to the real world!
abs:
project:
Semantic-NeRF! Super simple, just add semantic outputs to a NeRF network and you can label a full 3D scene from highly sparse or noisy annotations.
#ICCV2021
Happy to introduce Semantic-NeRF.
Multi-view consistency and smoothness make NeRF-training a label fusion process, supervised by sparse or noisy labels only!
Work with:
@tlaidlow
,
@StefanLeuteneg1
,
@AjdDavison
Project page:
Paper:
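A sketch of the architectural idea (hypothetical layer sizes and class count, not the released code): semantic logits are just one more view-independent head on the NeRF MLP trunk, composited along rays exactly like colour, with cross-entropy applied only at the sparsely or noisily labelled pixels.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy NeRF-style trunk: encoded position -> hidden -> (density, colour, semantics).
# Weights are random here; in Semantic-NeRF they are trained jointly, with the
# semantic head supervised only where labels exist.
W_trunk = rng.normal(size=(63, 128)) * 0.1   # 63 = positionally-encoded xyz
W_sigma = rng.normal(size=(128, 1)) * 0.1
W_rgb   = rng.normal(size=(128, 3)) * 0.1    # (view direction omitted for brevity)
W_sem   = rng.normal(size=(128, 10)) * 0.1   # 10 semantic classes (illustrative)

def field(x_encoded):
    h = np.maximum(W_trunk.T @ x_encoded, 0.0)   # ReLU trunk
    sigma = np.log1p(np.exp(W_sigma.T @ h))      # softplus density
    rgb = 1.0 / (1.0 + np.exp(-(W_rgb.T @ h)))   # sigmoid colour
    sem_logits = W_sem.T @ h                     # view-independent logits
    return sigma, rgb, sem_logits
```

Because the logits share the trunk with geometry and appearance, multi-view compositing propagates the sparse labels coherently across the whole scene.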
Semantic labels are highly correlated with geometry and appearance. When we add semantic outputs to a neural implicit representation, very sparse or noisy supervision is enough to generate good quality labels for the whole scene. From the Dyson Robotics Lab at Imperial College.
Happy to introduce Semantic-NeRF.
Multi-view consistency and smoothness make NeRF-training a label fusion process, supervised by sparse or noisy labels only!
Work with:
@tlaidlow
,
@StefanLeuteneg1
,
@AjdDavison
Project page:
Paper:
📢 Code release for 𝗗𝗦𝗜𝗡𝗘 (
#CVPR2024
- Oral)
DSINE gives you surface normal prediction (+ uncertainty) in real-time. We have released the code for training, testing, and running real-time demos. Try it yourself!
Can anyone explain to me how a (kestrel?) is able to achieve this stabilisation accuracy? It is high; the ground is not that textured; effectively monocular vision; why is the rot/trans ambiguity not higher? Of course it has inertial and other cues too, but I'm still just surprised/amazed.
Not usually one to just agree with Elon, but I think he's saying the same thing here as I was in my last tweet: the hardest part of AI is perception: going from real sensor data to an efficient, but explicit, scene representation --- then your robot can do pretty much anything.
We will be live demoing iMAP on Monday at
#CORL2021
, which trains an MLP neural implicit model from scratch in seconds as a SLAM representation for both reconstruction and tracking. Come and try it! From the Dyson Robotics Lab at Imperial College.
Code now available for Gaussian Splatting SLAM from
@HideMatsu82
and
@rmurai0610
, Dyson Robotics Lab at Imperial. Includes real-time monocular demo with various interactive visualisations. Also supports RGB-D. Looking forward to seeing what people will do with it!
#CVPR2024
Code release of Gaussian Splatting SLAM!
#CVPR2024
As of now, our method is the only monocular SLAM system based solely on 3DGS. No depth information needed.
Work with
@rmurai0610
*
@paulhjkelly
@AjdDavison
.
(*Equal Contribution)
Details in the thread:
My (probably controversial) idea to improve the state of publishing/reviewing in computer vision's overloaded main conferences: a limit (e.g. 3) on the number of papers that any individual can submit as co-author to one conference.
Robot Web: distributed, asynchronous message-passing for simple, accurate multi-robot localisation, at last officially published in IEEE Transactions on Robotics. Towards the inter-operable robot future! For me maybe my most important work since MonoSLAM.
Robot Web: breakthrough many-robot localisation. Uses efficient, general message passing over dynamic graphs. Accurate and highly robust to sensor/comms failure.
@rmurai0610
@joeaortiz
@SaeediG
@paulhjkelly
Full video demo:
Paper:
I also couldn't believe how good that illusion was so I had to make one for myself tonight... it really works!
@ankurhandos
@SergeBelongie
(It's called the Ames Window if you want to download your own template to print out.)
It's impressive what can be done with two robot arms teleoperated by a human brain, showing again (as was done 10+ years ago, though now with even more dexterity) that perception and intelligent planning are holding robotics back more than hardware.
Mobile ALOHA's hardware is very capable. We brought it home yesterday and tried more tasks! It can:
- do laundry👔👖
- self-charge⚡️
- use a vacuum
- water plants🌳
- load and unload a dishwasher
- use a coffee machine☕️
- obtain drinks from the fridge and open a beer🍺
- open…
Neural scene models like NeRF can encode other properties, such as semantic maps. Joint representation means these maps share the coherence of occupancy/colour, allowing dense 3D prediction from very sparse or noisy in-place supervision (e.g. a fast 2D CNN, or clicks).
#ICCV2021
Happy to introduce Semantic-NeRF.
Multi-view consistency and smoothness make NeRF-training a label fusion process, supervised by sparse or noisy labels only!
Work with:
@tlaidlow
,
@StefanLeuteneg1
,
@AjdDavison
Project page:
Paper:
DreamFusion is remarkable and I'm trying to understand how it works. If I understand correctly, the key thing is that a pre-trained diffusion model can take in some starting image and a text prompt and output a new image which is more like what the text describes. 1/n
Happy to announce DreamFusion, our new method for Text-to-3D!
We optimize a NeRF from scratch using a pretrained text-to-image diffusion model. No 3D data needed!
Joint work w/ the incredible team of
@BenMildenhall
@ajayj_
@jon_barron
#dreamfusion
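A toy sketch of the score-distillation idea as I understand it (nothing here is from the DreamFusion code; the "renderer" is the identity and the "denoiser" is a stand-in that pulls towards a fixed target): add noise to the current render, ask the frozen model what noise it thinks was added, and use the difference from the true noise as a gradient on the scene parameters, without differentiating through the model.

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.array([0.8, 0.2, 0.5])   # stands in for "what the text describes"

def toy_denoiser(noisy, t):
    """Frozen 'diffusion model' stand-in: predicts noise that would move the
    input towards the target (the real model is text-conditioned)."""
    return noisy - target   # predicted noise, up to scaling

theta = np.zeros(3)        # stands in for NeRF parameters / the rendered image
for step in range(200):
    render = theta                      # identity 'renderer' for the toy
    t = rng.uniform(0.1, 0.9)
    eps = rng.normal(size=3)
    noisy = render + t * eps            # corrupt the render
    eps_hat = toy_denoiser(noisy, t)    # model's opinion of the noise
    grad = eps_hat - t * eps            # SDS-style gradient: predicted - true
    theta -= 0.05 * grad                # no gradient through the denoiser

# theta drifts towards what the frozen model 'wants to see'
```

The surprising part, as in the paper, is that this noisy per-step signal is enough to optimise a 3D representation from scratch with no 3D data at all.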
Super-accurate CAD model fitting to single RGB-D images, built into a real-time object-level SLAM system with scene graph optimisation and camera tracking. AR examples show the value of using semantically tagged object models. SLAM++
@nazcaspider
etal:
My talk today is available to watch in full here. It was a great experience and thanks again to all of the organisers. Looking forward to the upcoming talks!
What do future
#SpatialAI
systems have to do, and how will they work, as we bring together probabilistic and geometric computer vision with deep learning and the ongoing developments in sensing and processing hardware. Read about FutureMapping at .
Announcing that I’m getting into the humanoid robot space! Going to surpass all competitors instantly with our new robot Asimo which was designed by Honda over 20 years ago.
The enormous power of explicit 3D visual scene understanding is to enable varied, precise manipulation via standard motion planning. Works for many variations of object size/shape/placement with no demos or RL needed! Dyson Robotics Lab: NodeSLAM
New from the Dyson Robotics Lab at Imperial at
@3DVconf
by
@tlaidlow
: SLAM with Quadric Surfaces. Many scene elements can be represented accurately and efficiently with quadrics. Our new minimal representation enables their use in a standard factor graph.
New on arXiv:
FutureMapping 2: Gaussian Belief Propagation for Spatial AI,
with
@joeaortiz
GBP is ready for the new generation of super-parallel AI chips and edge networks, for general, graph-based
#SpatialAI
.
Try my Python demos and see what you think!
Everyone knows AlexNet (2012), but earlier pioneers of GPUs for vision were the gpu4vision project from TU Graz (Tom Pock and others): incredible real-time variational optical flow, denoising, range image fusion, etc. from 2008 onwards. More vids at:
Regular re-tweet of this from Bill Freeman... I spend a lot of my time trying to persuade students of this; they often don't believe me. There is very little to be gained by publishing an average, "pretty good" paper. Better to wait and work on something deep and long-term.
*It doesn't matter much.*
Vast majority of the papers won't matter in the long run. Your career will be shaped only by a few good ones. Instead of getting an "okay" paper accepted, it could be a blessing in disguise to revise and strengthen your paper.
Fig credit: Bill Freeman
Just did this paper in our reading group (thanks
@alzugarayign
) and it's impressive. A reminder about how the right representation lets you get back to just optimising and using all of the rich photometric data in multi-camera video with just basic priors and no neural networks!
Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis
We model the world as a set of 3D Gaussians that move & rotate over time. This extends Gaussian Splatting to dynamic scenes, with accurate novel-view synthesis and dense 3D trajectories.
Fit-NGP: millimetre-accurate 3D object model fitting from a single RGB robot-mounted camera rapidly scanning a scene. Super simple, auto-optimises camera poses, works even for tiny shiny objects like screws (that depth cameras can't see)! Dyson Robotics Lab at Imperial College.
Excited to announce Fit-NGP which will be presented in
#ICRA2024
!
Fit-NGP accurately estimates 6-DoF object poses (~ 1.6mm) leveraging Instant-NGP's density field.
With
@alzugarayign
&
@AjdDavison
.
Project page:
Video:
(1/3)
Large language models and web-scale data have some use in robotics as a user interface as nicely demonstrated here, but in my opinion they are not what we need to help with perception, object representation and precise planning which are the real current barriers in robotics.
Computers have long been great at complex tasks like analysing data, but not so great at simple tasks like recognizing & moving objects. With RT-2, we’re bridging that gap by helping robots interpret & interact with the world and be more useful to people.
I don't believe that computer vision needs big data. E.g. I bet that pretty soon someone will come up with something which can segment scenes as well as SAM but only needs a few unsupervised images for training (because segmentation is all about self-similarity).
This looks great; hash encoding for ultrafast training of neural fields. As
@zzznah
said, it's a lesson in going back to the basics of what runs well on today's ridiculously powerful parallel processors. I think we should be embarrassed when algorithms need hours on a modern GPU.
This presents the history of visual SLAM over the past 20 years or so in the way I like to think of it, with new real-time demo systems as the main markers of progress. Onwards to
#SpatialAI
!
"Premature optimisation" is a familiar sin in programming (Knuth), but I think my field of CV/robotics is increasingly suffering from "premature evaluation", with every component benchmarked to death but little thought given to how they could combine into something bigger! 1/2
A variation with multiple robots running DWA planning, all trying to reach the same target. I always wondered about making a game out of this but couldn't work out what the player would control to make it fun. Any ideas?
Premature optimisation was always a problem in research, but I do think it's got worse with deep learning. Students are often obsessed with the details of trendy networks when basic decisions about what's the input, what's the output, etc. are still up in the air.
Common issue in deep learning projects is that complex method designs are adopted before basic debugging.
Often this leads to situations where the data loader still loads black images, but we've already tried 10 loss functions... because a paper claimed these would improve results.
Officially announced today, the Dyson Robotics Laboratory at Imperial College which I direct has received £5M+ new funding from Dyson and EPSRC under the Prosperity Partnerships scheme. Thanks to all involved in the lab, and to
@Dyson
for the long-term support and collaboration.
Engineering and Physical Sciences Research Council
I liked the DeepMapping paper from Ding and Feng at
#CVPR2019
. A bit similar to DIP, they use deep learning machinery to solve a surprising optimisation problem (no learning on a dataset): pose graph alignment for a set of pose scans.
@czarnowskij
We are releasing interactive code for SuperPrimitives, a simple new way to do dense monocular SfM and visual odometry using strong generic segmentation and normal priors. Dyson Robotics Lab at Imperial College.
Code release for SuperPrimitives, and it comes with an interactive GUI!
#CVPR2024
SuperPrimitive is a new 3D representation which enables solving many 3D tasks at the level of image segments.
Three years ago today, the project that eventually became NeRF started working (positional encoding was the missing piece that got us from "hmm" to "wow"). Here's a snippet of that email thread between Matt Tancik,
@_pratul_
,
@BenMildenhall
, and me. Happy birthday NeRF!
Bundle Adjustment on a Graph Processor, by
@joeaortiz
, Mark Pupilli,
@StefanLeuteneg1
and me, CVPR 2020. Using Gaussian Belief Propagation we show breakthrough 20x speed for BA on a single
@graphcoreai
IPU compared to CPU/Ceres.
If you liked Semantic-NeRF, this is the next step where it really gets interesting. Train a geometric/semantic neural field in real-time and add ultra sparse open set labels as clicks to densely segment a room in a few minutes. No pre-trained networks or prior data needed at all.
New: 3D neural fields like NeRF do automatic, unsupervised semantic scene decomposition. We reveal it with tiny interactions. Real-time SOTA room segmentation from 140 clicks; zero prior data! iLabel, Dyson Robotics Lab,
@SucarEdgar
@Shuaifeng_Zhi
et al.
Live Monocular SLAM using 3DGS as the only scene representation. From a SLAM perspective, Gaussian blobs are an explicit, efficient scene representation similar to points or surfels, but with much better properties for optimisation.
Dyson Robotics Lab at Imperial College.
Happy to share Gaussian Splatting SLAM
We show the first 3DGS-based Monocular RGB SLAM, the hardest SLAM setting.
Using 3D Gaussians as a unified representation, the method only requires RGB images - No need for SfM, depth sensor, or learned prior.
Also from the Oxford Active Vision Lab, same era (2003), Walterio Mayol-Cuevas and David Murray's amazing wearable robot running MonoSLAM. Real-time active camera control enables stable object fixation and saccades as the user moves. Video and paper links:
I like the ideas in this paper... says that the early feature layers of a CNN can not only be learned in an unsupervised way, but very effectively just from a single image. Seems to confirm the strong generality of low level natural image statistics.
Our new work by
@y_m_asano
and Andrea Vedaldi is on arXiv now. We investigate the surprising effectiveness of unsupervised learning using only one single image. We can learn early layers using one image + heavy augmentations just as well as with ImageNet.
Lin Yen-Chen, Pete Florence, Jonathan T. Barron, Alberto Rodriguez, Phillip Isola, Tsung-Yi Lin, iNeRF: Inverting Neural Radiance Fields for Pose Estimation, arXiv, 2020
Paper:
Project page:
Monocular normal prediction used for highly robust real-time camera orientation estimation (and one of the best acronyms we've ever come up with...) New from the Dyson Robotics Lab at Imperial.
𝗜𝗠𝗨? How about 𝗨-𝗔𝗥𝗘-𝗠𝗘?
In this work, we show how monocular surface normal cues can be used for rotation estimation.
collab w/
@AalokPat
, Callum Rhodes,
@AjdDavison
Aalok will present Gaussian Belief Propagation Planning at the Multi-Agent Path-Finding workshop at
#AAAI23
next week (also at
#ICRA2023
in May). We believe this is the first truly distributed method for collaborative planning with general cost functions and dynamics constraints.
Q: How can many robots plan to *safely and smoothly* move around each other?
A: They collaborate and negotiate paths!
Find out at my talk at
#AAAI23
on Tuesday 14th Feb!
Link to paper/poster/video:
@AjdDavison
@rmurai0610
Inspired by the new wave of interactive publishing pioneered by
@distillpub
, and thanks to
@joeaortiz
's massive effort to become a Javascript ninja, we are very proud to share this article which explains how Gaussian Belief Propagation works, and why we think it's so important.
Very excited to share our interactive article:
A visual introduction to Gaussian Belief Propagation!
It's part proposition paper, part tutorial with interactive figures throughout to give intuition.
Article:
Work with:
@talfanevans
,
@AjdDavison
1/n
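As a minimal illustration of the mechanics the article covers (my own toy, not the article's code): scalar GBP in information form on a three-variable chain with unit-variance priors on the ends and unit-variance smoothness factors between neighbours. On a tree like this, the message passing converges to the exact marginal means.

```python
# Chain x0 - x1 - x2: priors pull x0 towards 0 and x2 towards 10;
# pairwise factors penalise (x_j - x_i). All stored as (eta, lambda).
priors = {0: (0.0, 1.0), 2: (10.0, 1.0)}
edges = [(0, 1), (1, 2)]
lam_f = 1.0   # pairwise factor precision

# messages[(edge, i)] = (eta, lam) from that factor to variable i
msgs = {(e, i): (0.0, 0.0) for e in edges for i in e}

def belief(i):
    eta, lam = priors.get(i, (0.0, 0.0))
    for e in edges:
        if i in e:
            eta += msgs[(e, i)][0]
            lam += msgs[(e, i)][1]
    return eta, lam

for _ in range(20):   # synchronous GBP sweeps
    new = {}
    for (a, b) in edges:
        for src, dst in [(a, b), (b, a)]:
            # variable -> factor: belief minus this factor's own message
            eta_v = belief(src)[0] - msgs[((a, b), src)][0]
            lam_v = belief(src)[1] - msgs[((a, b), src)][1]
            # factor -> variable: marginalise src out of the pairwise factor
            denom = lam_v + lam_f
            new[((a, b), dst)] = (lam_f * eta_v / denom,
                                  lam_f - lam_f ** 2 / denom)
    msgs = new

means = [belief(i)[0] / belief(i)[1] for i in range(3)]
# On a tree, GBP gives the exact marginal means: [2.5, 5.0, 7.5]
```

Every update is purely local to one factor and its neighbours, which is what makes the scheme suit per-robot p2p message passing and graph-shaped processors.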
I doubt that robotics needs big data right now. As we continue to improve scene reconstruction+representation, motion planning plus local learning becomes very powerful. E.g. no pre-trained networks in this demo; just a unified neural field scene representation, trained live.
If you're at CORL this week in NZ come and meet
@iainhaughton
and
@Ed__Johns
and see Iain's presentation in the oral session on Saturday. Dense, fully automatic segmentation of scene properties like softness via real-time neural field training; no priors! It's iLabel for robots.
Many tasks in robotics/AI involve *scene rearrangement*. How do we define a goal state or measure success? New environments with realistic simulation of perception and physics enable systematic research. We discuss in this major new collaborative report!
#CORL2022
oral: mapping non-visual properties (material, softness, force) from very few point sensor tests. iLabel-like neural field produces dense maps + guides actions. Live, autonomous, no priors!
@iainhaughton
et al,
@Dyson
@ICComputing
.
Paper/video:
📢 Our
#ECCV2022
paper (and code) on fast, accurate depth estimation and reconstruction is out now!
SimpleRecon: 3D Reconstruction without 3D Convolutions
(1/4)
Wow.. I guess we don't know how Sora works yet, but assuming there is no explicit 3D consistency check built into the generation pipeline then I am definitely surprised this is possible.
Normal prediction from a single image is something that neural networks are incredibly good at, and is extremely widely useful. See the new level of performance in this new work with
@BaeGwangbin
, Dyson Robotics Lab at Imperial College London.
Excited to introduce 𝗗𝗦𝗜𝗡𝗘! (
#CVPR2024
)
We push the limits of single-image surface normal estimation by rethinking the inductive biases needed for the task.
See you in Seattle!
Very nice papers at the 3D session at
#CVPR2019
, including DeepSDF, BAD-SLAM (real-time dense BA on surfel maps) and this remarkable one on showing that you can recover realistic images from 3D points clouds (best results with SIFT descriptors + colour):
DeepFactors with
@czarnowskij
, Tristan Laidlow,
@ronnieclark__
from the Dyson Robotics Lab.
Unified real-time monocular SLAM with a general factor graph formulation (GTSAM), pushing what's possible combining deep networks with probabilistic optimisation.
Getting excited for ICRA, the main international robotics conference, held in London for the first time next week. I'll be there all week; see you there if you're interested in *real* AI that's actually making contact with the world ;)
As an academic, honestly, this doesn't worry me in the slightest. I feel like we're just getting started in AI and there are so many interesting problems out there to work on. I'm sure that all you still need to do important long-term AI research is pen, paper and a laptop.
Do you feel anxious that AI's emphasis on large-scale language models (LLMs) will crowd out academic labs? Few can afford 1000 GPUs drawing a small country's worth of electricity
Live iLabel object segmentation, as demoed at CORL recently. Hand-held camera, zero training data or hand-designed rules, network trained in real-time. Highly accurate object boundaries emerge from sparse clicks, despite object similarity or contact.
Nice to see my friends from Zaragoza back with ORB-SLAM3 --- presumably continuing their line of the best-engineered academic visual SLAM systems you can get!
Cats negotiating obstacle courses. It's impressive that they can place their front paws so precisely; what really amazes me, though, is that they can do it with their back paws, when the obstacles are no longer in sight.
If you've seen the "brain with modules" picture from
@ylecun
's cool new paper, a reminder of my version. I agree the key challenge in AI is updating a persistent world model, and emphasize the route to efficiency by matching algorithm/representation graphs to computing hardware.
What do future
#SpatialAI
systems have to do, and how will they work, as we bring together probabilistic and geometric computer vision with deep learning and the ongoing developments in sensing and processing hardware. Read about FutureMapping at .
AI is clearly lacking this kind of intuitive spatial/physics reasoning, but I don't see enough research on what for me is the biggest challenge: building general, *efficient* composable 3D world models from real-time vision + sensors. My views:
#SpatialAI
Great to see there's more vision going on in the parkour work than I've seen before in Atlas demos; depth cams for real-time model-based tracking of boxes and structures, allowing some online modification of planned motion trajectories.
@czarnowskij
this must be fun to work on!
Robot perception algorithms convert data from cameras and sensors into something useful for decision making and planning physical actions. See how perception and adaptability enable varied, high-energy behaviors like parkour.
Scalable and resilient computation in robotics should be distributed, whether over many-robot graphs or within single chips. We present the new Workshop on Distributed Graph Algorithms for Robotics at
#ICRA2023
in London ; please submit paper and demos!
ISMAR has been very important to me over the years, inspiring in particular my love of real-time demos as the highest form of academic presentation! Thanks for this recognition and congrats to
@rapideRobot
and the other authors, now leading research on AR all over the world.
Congratulations to the recipients of the ISMAR 2021 Impact Paper Award! It's been 10 years since this paper was published and it has been cited countless times since. Wow!
Thirty years of Computer Vision research at work here. While the public does not hear much about basic science research, it's the generations of scientists and their passionate work that bring moments like this to life 🌹👍🙏
iSDF uses the main incremental neural field training methods of iMAP, but interprets the MLP output as a signed distance field rather than occupancy. Similar reconstruction quality, with auto hole-filling. Directly building an SDF could be useful for some robot planning cases.
Excited to share iSDF! Real-time mapping with neural (implicit) signed distance fields for robot navigation and manipulation.
Project page:
Paper:
Work with: Alex Clegg, Jing Dong,
@SucarEdgar
@davnov134
@MZollhoefer
@mhmukadam
The Joint CVPR workshop on Localisation, VO, SLAM is on Sunday and Monday; full programme here:
Looks like anyone can stream the talks live at this YouTube link. I'm speaking at 4pm UK time on Sunday.
Great results; for me this is what we should mean by the term "optical flow", not just 2-view correspondence. I'm interested in how to do this incrementally (rather than batch) with efficient distributed compute --- crucial general early vision for
#SpatialAI
.
h/t
@ronnieclark__
Very happy to share our
#ECCV2022
oral “Particle Video Revisited: Tracking Through Occlusions Using Point Trajectories”
Fine-grained tracking of anything, outperforming optical flow.
project:
abs:
code:
The Raspberry Pi project is the UK at its best, opening up creative interest in computers and hardware for over 10 years now. And most of them are actually made here, in the Sony factory in Wales. I've bought hundreds and we use them every year for teaching robotics.
How Raspberry Pis are made (Factory Tour)
Love watching videos like this.
Stumbled by while researching the new Pi 5.
Pis help build Pis!
One Pi gets built every ~3.14 seconds :D
I want to play Factorio now.
It's tricky to use deep learning in multi-view SLAM. New idea: learn a depth covariance function, predicting pixel depth correlations from a single image. Useful in many optimisation settings; e.g. real-time dense monocular VO; note the precise small details. See it live at CVPR!
Excited to announce "Learning a Depth Covariance Function" with
@AjdDavison
. A flexible framework for a variety of geometric vision tasks, such as dense monocular visual odometry shown below.
Dyson Robotics Lab, Imperial College
Project page:
#CVPR2023
We have a Dyson Fellow (post-doc) position in computer vision and robotics available in the Dyson Robotics Lab at Imperial College London. Come and work on cutting edge SLAM, scene understanding and manipulation with me and the rest of our team. Details:
If you want to learn more about Gaussian Belief Propagation and its properties for distributed computation, estimation and learning on general graphs, you can play with the demos in our interactive
@distillpub
-style article here.
Very excited to share our interactive article:
A visual introduction to Gaussian Belief Propagation!
It's part proposition paper, part tutorial with interactive figures throughout to give intuition.
Article:
Work with:
@talfanevans
,
@AjdDavison
1/n
Congratulations to Edgar who passed his PhD viva today, and thanks to examiners
@tolga_birdal
and José María Montiel! A reminder of Edgar's iMAP, a landmark as the first real-time neural field SLAM system from
#ICCV2021
.
Excited to share iMAP, first real-time SLAM system to use an implicit scene network as map representation.
Work with:
@liu_shikun
,
@joeaortiz
,
@AjdDavison
Project page:
Paper:
Real Time Height Map Fusion using Differentiable Rendering, with
@jz4411
@StefanLeuteneg1
, Dyson Robotics Lab, single RGB camera. Here used for dense, geometric drivable ground segmentation at <1cm height. (no learning needed).
Nice blog; I really agree with the main message. I hope people can link this with why I'm obsessed with distributed optimisation, especially Gaussian Belief Propagation with its 'magic' properties of convergence despite ad-hoc, noisy message passing.
Finally done with my first blog post "The Future of Artificial Intelligence is Self-Organizing and Self-Assembling"!
Covering work from our group and others on the combination of ideas from deep learning and self-organizing systems.
You can fuse arbitrary features (e.g. DINO) into 3D via real-time neural field SLAM, with all geometry and coherent feature maps held in a single neural field. This allows highly efficient open set object classification and scene segmentation.
#ICRA2023
New: Feature-realistic neural fusion for real-time, open set scene understanding. Our neural field renders to feature space, enabling real-time grouping and segmentation of similar objects or parts from ultra-sparse, online interaction. Dyson Robotics Lab.
I strongly agree that:
- 3D object graphs are the right (efficient, semantically optimal) representation for intelligence.
- Message passing is the computation pattern to focus on.
The biggest challenge is how to actually *build* scene graphs from real sensor data.
#SpatialAI
Semantic and Geometric Modeling with Neural Message Passing in 3D Scene Graphs for Hierarchical Mechanical Search
by Andrey Kurenkov et al. including
@ken_goldberg
#NeuralNetwork
#Vector