Personalization means a lot. We thus introduce the lightweight Motion LoRA into #AnimateDiff. Fine-grained motions like the following camera control are enabled. Models are available now on GitHub. Just create more motions as you want.
Update of #AnimateDiff: it now avoids watermarks and, more importantly, supports #SDXL, i.e., produces high-resolution videos (1024x1024x16 frames with various aspect ratios) with/without personalized models (~13GB VRAM for inference). Try it out on .
Yuwei (@GuoywGuo) just released #AnimateDiff v3 and #SparseCtrl, which allow you to animate ONE keyframe, generate transitions between TWO keyframes, and interpolate MULTIPLE sparse keyframes. RGB images and scribbles are supported for now.
Github:
Controllability matters for video generation. We thus propose SparseCtrl, which enables keyframe (sketch, depth, image) animation/transition/prediction/interpolation by feeding TEMPORALLY SPARSE condition maps. It's also compatible with #AnimateDiff. Page:
If your GAN-related research (e.g., image, video, and 3D) is also bottlenecked by its scale, just like mine, try our Aurora, a GAN-based text-to-image generator. Code and models will be publicly available soon.
ArXiv:
Thanks to @ak92501 for sharing our recent work! Our GenDA substantially improves synthesis quality and diversity with few-shot, even one-shot, target images. Code is coming soon. Stay tuned!
One-Shot Generative Domain Adaptation
abs:
project page:
The method works well even with large domain gaps, and robustly converges within a few minutes for each experiment.
LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models
paper page:
This work aims to learn a high-quality text-to-video (T2V) generative model by leveraging a pre-trained text-to-image (T2I) model as a basis.
Glad to share our #NeurIPS2022 work 'Improving GANs with A Dynamic Discriminator', which could easily improve 2D and 3D-aware image synthesis under various data regimes.
Paper:
Work with Yujun Shen, @YinghaoXu1, Deli Zhao, @doubledaibo, @zhoubolei
DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model
paper page:
We propose DMV3D, a novel 3D generation approach that uses a transformer-based 3D large reconstruction model to denoise multi-view diffusion.
Check out our #CVPR2020 work 'Temporal Pyramid Network for Action Recognition', a generic module that captures the visual tempos of action instances.
Paper:
Github:
Joint work with @YinghaoXu1, Jianping Shi, @doubledaibo, and @zhoubolei
Check out Qihang (@QihangZhang0224)'s previous work BerfScene, which designs BEV-conditioned equivariant radiance fields for large-scale 3D scene generation and manipulation.
Page:
Check our work 'Data-Efficient Instance Generation from Instance Discrimination', which advances image synthesis with limited data by a clear margin.
Github:
Paper:
Joint work with Yujun Shen, @YinghaoXu1, @zhoubolei
Glad to share that our Generative Hierarchical Features (GHFeat) got accepted by #CVPR2021 as an ORAL presentation; it learns hierarchical representations from a well-trained GAN.
Many thanks to my collaborators @YinghaoXu1, Yujun Shen, Jiapeng Zhu, and @zhoubolei.
Our Instance Localization was accepted by #CVPR2021 😆. Code and models are coming soon. Stay tuned!
Joint work with Zhirong Wu, @zhoubolei, and Stephen Lin.
The latest version of diffusers 🧨 ships with AnimateDiff and Motion LoRAs from @CeyuanY, @GuoywGuo, and team, and integrates with PEFT.
Create animations using text prompts with AnimateDiff and any SD 1.5 checkpoint. Control the movements using Motion LoRAs + PEFT.
Details👇🏽
Hi all, thanks to @GuoywGuo's hard work, our AnimateDiff now requires around 12GB of memory, i.e., it can run on a single RTX 3090 without sacrificing quality by reducing the number of #frames or the resolution.
Excited to share our recent attempt at #3D scene generation🪐. Given a text prompt, SceneWiz3D can generate high-fidelity 3D scenes and supports flexible object control.
Project webpage:
Paper:
Code:
@camenduru @raoanyi @doubledaibo The latest AnimateDiff optimizes memory, requiring around 12GB for now, so we don't have to reduce the number of frames or the resolution :)
The devil is in the details. It really took time and effort to design the framework, figure out the differences between various platforms, and debug day and night. Great teamwork! Hope this codebase helps everyone in generative modeling in the future🎉🎉🎉
No accepted papers, but I am still happy to announce that we have open-sourced GenForce, an efficient PyTorch library for generative modeling. StyleGAN and PG-GAN are perfectly reproduced. We also collect a zoo of 60+ pretrained StyleGAN models, with a Colab to play with.
@CiaraRowles1 Very impressive! Btw, a new checkpoint supporting 512x320 resolution and the same number of frames (based on SD 1.5) will be available tomorrow. Stay tuned!
3D shape generation and 3D-aware image generation have recently drawn a lot of attention. Check out the reading list collected by @YinghaoXu1, which includes awesome 3D generation papers.
Remote Sensing Image Scene Classification Meets Deep Learning: Challenges, Methods, Benchmarks, and Opportunities
by Gong Cheng et al.
#Autoencoder #ConvolutionalNeuralNetworks