📸 4.0 Launch Livestream -- Tuesday, Oct. 31 🎃
In <1 week, you'll be able to do MUCH more with your machine learning apps:
- change UI/UX of any component
- control event processing
- send server-side events
- generate custom share links 🔥
Join us:
Deploying a machine learning model...
Old way:
- read a bunch of blogs on serializing & dockerizing models
- write a Flask app to serve
- google how to create a simple front end
- set up an EC2 to host
New way:
<3 lines of code with Gradio
🚨Today's Hot Research Alert: Replace-Anything from Alibaba
It can be used in many scenes, such as human replacement, clothing replacement, background replacement, and so on as shown in these examples.
Excited to have this available soon on Spaces for our community😍
The fastest way to deploy a machine learning model:
- No Flask
- No HTML, JS, or CSS
- No Docker, EC2, or any hosting necessary
Just pip install gradio and get a shareable link in <3 lines of Python
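For reference, a minimal sketch of what those few lines can look like. The `greet` function and the `build_demo` wrapper are illustrative names, not from the original post; the usual short form is just the import, `gr.Interface(...)`, and `.launch(share=True)`.

```python
def greet(name: str) -> str:
    """Stand-in for a model's predict function (hypothetical example)."""
    return f"Hello, {name}!"


def build_demo():
    # Imported lazily so greet() can be used without Gradio installed.
    import gradio as gr

    # "text" is Gradio's string shortcut for a Textbox component.
    return gr.Interface(fn=greet, inputs="text", outputs="text")


# To serve the app and print a temporary public link:
# build_demo().launch(share=True)
```

Swapping `greet` for a real model's inference function is the only change needed to demo that model instead.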
The way ML is taught:
1. pick static dataset (MNIST)
2. split into train/test
3. train until high test acc
4. move on
How about instead of moving on,
5. deploy the model
6. get users to break models
7. adapt dataset to fix issues
That's why we built
MagicAnimate from ByteDance Inc team allows you to animate a human image following a given motion sequence.
Great research from team
@MikeShou1
et al.
Code released with official
@Gradio
demo -
Spaces demo coming out soon, stay tuned!
BIG NEWS 🥳🎈
Building Chatbot apps just got wayyy easier: announcing the new 𝙲𝚑𝚊𝚝𝙸𝚗𝚝𝚎𝚛𝚏𝚊𝚌𝚎 class 🙌
The *fastest* way to build a Chatbot UI in Python -- including streaming, undo/retry, API, all out of the box!
Let's take a look at a few examples...
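A minimal sketch of a streaming chatbot, assuming the chat function takes `(message, history)` and yields progressively longer partial replies. `slow_echo` and `build_chat` are made-up names for illustration, and the echo logic stands in for a real model.

```python
def slow_echo(message, history):
    """Toy chat function: streams the reply one character at a time."""
    reply = f"You said: {message}"
    for end in range(1, len(reply) + 1):
        # Each yielded string replaces the partial reply shown so far.
        yield reply[:end]


def build_chat():
    # Imported lazily so slow_echo() can be tried without Gradio installed.
    import gradio as gr

    return gr.ChatInterface(slow_echo)


# To serve the chatbot (queueing is needed for streaming on older versions):
# build_chat().queue().launch()
```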
🎨 Introducing Follow-Your-Click: Transforming the way we animate images with just a click and a short prompt!
New ImageToVideo model. Say goodbye to moving entire scenes and hello to precision and creativity.
🤯Text-to-Sing
@Gradio
demo.
🔥Results are unbelievably melodious! [attached]
Upload a melody of your choice, enter your own lyrics, and have the computer sing back your lyrics in the given melody!
Demo on
@huggingface
Spaces -
💫 We're excited to launch a NEW Python library 💫
The 𝚐𝚛𝚊𝚍𝚒𝚘_𝚌𝚕𝚒𝚎𝚗𝚝 library lets you run any Gradio app as an API 🚀
See a cool Hugging Face Space? Use it programmatically instantly:
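A rough sketch of calling a hosted app programmatically. The `call_space` helper and the Space id in the comment are placeholders, and the `/predict` endpoint name is an assumption that varies from app to app.

```python
def call_space(space_id, *inputs, api_name="/predict"):
    """Send inputs to a hosted Gradio app and return its prediction."""
    # Imported lazily; install with: pip install gradio_client
    from gradio_client import Client

    client = Client(space_id)  # e.g. a "user/space-name" id on Hugging Face
    return client.predict(*inputs, api_name=api_name)


# Example (placeholder Space id, not a real endpoint):
# result = call_space("some-user/some-space", "Hello world")
```

Calling `Client(space_id).view_api()` should list the endpoints and parameters a given app actually exposes.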
🌟 Introducing Gradio-Lite! 🌟
Use Gradio right in your web browser with our new library that leverages Pyodide to run full machine learning apps entirely in your browser.
🤯 Now, you can build AI apps without server-side infrastructure!
🚀Finally A Mixture of Experts for Large Vision-Language Models, called 𝐌𝐨𝐄-𝐋𝐋𝐚𝐕𝐀 !
🌋At just 3 billion sparsely activated params, MoE-LLaVA's perf is comparable to LLaVA1.5-7B on various visual understanding datasets and surpasses LLaVA1.5-13B on the object hallucination benchmark.
OOTDiffusion official Gradio demo is now available on 🤗Spaces.
🕺SOTA in virtual try-on (VTON)
Merges the garment features accurately with the target human image within the model's self-attention layers that help the model focus on relevant parts of the image.
⭐️Model on hub.
Now that Twitter has open-sourced their algorithm, we've been hard at work building a Gradio demo that lets you score your own Tweets 🧮
Try it out below ⤵️
We're excited to release something new: the Gradio Discord Bot 🤖
Allows you to use any demo from🤗Spaces as a bot in your own Discord Server. For example, you can use the same Bot to:
- Translate between languages 🌎
- Text-to-speech 🗣️
- Do math 🔢
- Generate images 🖼️
...
Excited to share that we have migrated our entire front end from React to
@Sveltejs
🚀
Based on our experience, Svelte is:
⚡️ Much faster at rendering the GUI (due to a lot of compile-time optimizations)
🥰A "joy to use" -- one of our front-end engineers
🎉 Introducing 𝐌𝐮𝐬𝐞𝐕 : latest diffusion-based virtual human video generation framework!🎥
🤯MuseV supports infinite-length generation using Visual Conditioned Parallel Denoising and delivers high-fidelity results.
💡Compatible with SD - base models, LoRA, ControlNet, more
Gradio 3.0 allows you to easily build Dashboards with dropdowns, plots, and interactive widgets, all in Python 💫
Take a look at this one we built to show PyPI installs for all the 🤗 open-source libraries in < 30 lines of code
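A stripped-down sketch of the dashboard pattern, assuming a dropdown that drives a summary field. The install numbers, library names, and function names are invented for illustration, and the plot from the real dashboard is left out.

```python
# Made-up numbers for illustration only; a real dashboard would query live data.
FAKE_INSTALLS = {"library-a": [120, 150, 180], "library-b": [90, 95, 130]}


def summarize(library: str) -> str:
    """Return a one-line summary for the selected library."""
    counts = FAKE_INSTALLS[library]
    return f"{library}: {sum(counts)} installs over {len(counts)} days"


def build_dashboard():
    # Imported lazily so summarize() can be used without Gradio installed.
    import gradio as gr

    with gr.Blocks() as demo:
        choice = gr.Dropdown(
            choices=list(FAKE_INSTALLS), value="library-a", label="Library"
        )
        summary = gr.Textbox(label="Summary")
        # Recompute the summary whenever the dropdown selection changes.
        choice.change(fn=summarize, inputs=choice, outputs=summary)
    return demo


# build_dashboard().launch()
```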
Meet LaVague, an AI-powered tool that automates repetitive web tasks, freeing up your time for more meaningful endeavors.🤖⏰
Built on open-source projects and models, LaVague turns natural language queries into Selenium code, making web workflow automation a breeze.💬🌐
It takes way too long to deploy an ML model:
- Wrangle with Flask
- Write HTML/JS/CSS
- Get an EC2 instance for hosting
What if you could:
- Generate an interface with a shareable link.
- Only use python.
- Only add 3 lines of code.
Meet Gradio
𝐒𝐔𝐏𝐈𝐑: An advanced image restoration method that combines generative prior and model scaling, using a 20-million image dataset, and can manipulate restorations with textual prompts.
Kudos to Fanghua Yu
@JasonGUTU
@xinntao
et al🚀
SUPIR supports Gradio demo in repo [links👇]
📢New Powerful Image Restoration Model Update:
𝐈𝐧𝐬𝐭𝐫𝐮𝐜𝐭𝐈𝐑- High-Quality Image Restoration Following Human Instructions
Novel approach that uses human-written instructions to guide the image restoration model🗣️
Gradio demo available on Spaces & on GitHub. Links below👇
High-Quality Image Restoration Following Human Instructions
demo:
paper page:
Image restoration is a fundamental problem that involves recovering a high-quality clean image from its degraded observation. All-In-One image…
A demo for
@StabilityAI
's Stable Diffusion in <65 lines of code using
@huggingface
diffusers==0.2.2 and Gradio Blocks (academic access needed to use model)
diffusers:
get started with gradio:
For all you weekend builders, we just shipped gradio 3.27, with a brand new component: 𝙰𝚗𝚗𝚘𝚝𝚊𝚝𝚎𝚍𝙸𝚖𝚊𝚐𝚎
Designed specifically for image segmentation and object detection demos 📸🖼️🚗
Docs:
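A small sketch of how an object detection demo might feed this component, assuming it accepts a base image plus a list of `(box, label)` pairs with boxes given as `(x1, y1, x2, y2)`. The `detect` function returns hard-coded boxes in place of a real detector.

```python
def detect(image_path):
    """Fake detector: returns the input image with two hard-coded boxes."""
    annotations = [
        ((10, 10, 120, 90), "car"),       # (x1, y1, x2, y2), label
        ((130, 40, 200, 160), "person"),
    ]
    return image_path, annotations


def build_demo():
    # Imported lazily so detect() can be used without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=detect,
        inputs=gr.Image(type="filepath"),
        outputs=gr.AnnotatedImage(),
    )


# build_demo().launch()
```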
Apple's🤩MGIE is now opensource! Edit your images with natural language
💡𝐌𝐮𝐥𝐭𝐢𝐦𝐨𝐝𝐚𝐥 𝐋𝐋𝐌-𝐆𝐮𝐢𝐝𝐞𝐝 𝐈𝐦𝐚𝐠𝐞 𝐄𝐝𝐢𝐭𝐢𝐧𝐠 or MGIE derives more expressive instructions for the model to follow the edits & leads to notable improvements in editing
Project links👇
😱Massive 3D release!!
🥳Introducing 𝐌𝐕𝐄𝐝𝐢𝐭: A powerful 3D toolbox for creating stunning 3D objects from text and images!
Watch the supercool and feature-rich app (akin to Auto1111🔥) in action!
MVEdit generates high-quality textured meshes from multi-view images. 🖼️➡️🗿
Introducing 𝐆𝐞𝐨𝐖𝐢𝐳𝐚𝐫𝐝: A powerful tool for estimating 3D geometry and depth from a single image🖼️➡️🌐
GeoWizard is a new generative model that 𝐩𝐫𝐞𝐝𝐢𝐜𝐭𝐬 𝐝𝐞𝐩𝐭𝐡 𝐚𝐧𝐝 𝐬𝐮𝐫𝐟𝐚𝐜𝐞 𝐧𝐨𝐫𝐦𝐚𝐥𝐬 𝐟𝐫𝐨𝐦 𝐚 𝐬𝐢𝐧𝐠𝐥𝐞 𝐢𝐦𝐚𝐠𝐞 📏🔍
Demo coming soon🦾
📢New Research: CRM- Single Image to 3D
Model integrates geometric relationships directly into its design by generating 6 orthographic views from input image.
These views are processed through a convolutional U-Net, which excels at creating a high-res triplane representation.
CRM
Single Image to 3D Textured Mesh with Convolutional Reconstruction Model
Feed-forward 3D generative models like the Large Reconstruction Model (LRM) have demonstrated exceptional generation speed. However, the transformer-based methods do not leverage the geometric
📢Exciting research alert!
LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation
Gradio demo generates 3D content with high fidelity and efficiency (within a few seconds). Links below.
LGM
Large Multi-View Gaussian Model for High-Resolution 3D Content Creation
paper page:
3D content creation has achieved significant progress in terms of both quality and speed. Although current feed-forward models can produce 3D objects in seconds,…
We're seeing a bunch of music generation models coming out, so in 𝚐𝚛𝚊𝚍𝚒𝚘 𝟹.𝟷𝟺, we've released a way for you to create beautiful waveform visualizations
𝚐𝚛.𝚖𝚊𝚔𝚎_𝚠𝚊𝚟𝚎𝚏𝚘𝚛𝚖(𝚊𝚞𝚍𝚒𝚘=..., 𝚋𝚐_𝚒𝚖𝚊𝚐𝚎=...)
Create videos like this, in Python, fast ⚡️
🤩Introducing 𝐄𝐌𝐀𝐆𝐄, which generates Full-Body Human Gestures from Audio, including Facial, Body, Hands, & Global Movements
🤯Audio-to-Gesture: EMAGE- A Unified Speech-Gesture Generation
🌟Uses BEAT2 (BEAT-SMPLX-FLAME) mesh-level holistic co-speech dataset for HD 3D motion
3 years ago, we sketched out some ideas for a "gui for machine learning" on a whiteboard. Since then, more than 500,000 machine learning demos have been built with Gradio 🚀
Enjoy this short history of Gradio 🍿
🎵AudioSep is a new model that can separate specific sounds from a mixed audio based on simple language commands. Can identify and isolate various sounds and voices zero-shot!
By
@LiuXub
et al.
Separation quality is simply magical🪄
@Gradio
demo on Spaces soon- Stay tuned!😍
🚀🚀 BIG upgrade to the Gradio Playground
You can now edit Gradio apps entirely in your browser and share a link to the final Gradio app
The edited Gradio app is stored entirely as a base64 string in the URL, so you can now share Gradio apps WITHOUT a server 🤯
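To illustrate the idea only: a sketch of packing source code into a URL as base64 and reading it back. The base URL and the `code` query parameter are hypothetical and are not the Playground's actual link format.

```python
import base64
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical base URL and parameter name, used only to illustrate the idea.
BASE_URL = "https://example.com/playground"


def encode_app(code: str) -> str:
    """Pack app source code into a shareable URL."""
    payload = base64.urlsafe_b64encode(code.encode("utf-8")).decode("ascii")
    return f"{BASE_URL}?{urlencode({'code': payload})}"


def decode_app(url: str) -> str:
    """Recover the app source code from a URL built by encode_app()."""
    payload = parse_qs(urlparse(url).query)["code"][0]
    return base64.urlsafe_b64decode(payload).decode("utf-8")
```

Because the whole app travels inside the link, nothing has to be stored on a server, though long apps make for long URLs.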
📢Fascinating research alert: 𝐎𝐎𝐓𝐃𝐢𝐟𝐟𝐮𝐬𝐢𝐨𝐧: 𝐎𝐮𝐭𝐟𝐢𝐭𝐭𝐢𝐧𝐠 𝐅𝐮𝐬𝐢𝐨𝐧 𝐛𝐚𝐬𝐞𝐝 𝐋𝐚𝐭𝐞𝐧𝐭 𝐃𝐢𝐟𝐟𝐮𝐬𝐢𝐨𝐧 𝐟𝐨𝐫 𝐂𝐨𝐧𝐭𝐫𝐨𝐥𝐥𝐚𝐛𝐥𝐞 𝐕𝐢𝐫𝐭𝐮𝐚𝐥 𝐓𝐫𝐲-𝐨𝐧
🔥How cool is that: A model and a gradio demo that can assist with Virtual Try-on!
One of the coolest things about Gradio on Spaces is that it allows you to blend the latest research into cool pipelines.
Yesterday
@bria_ai_
came up with Background Removal AI & today it's LGM's fast image-to-3D. Can we create an exciting & powerful pipeline using the two? More👇
🎭 Introducing 𝐀𝐫𝐜𝟐𝐅𝐚𝐜𝐞: a new identity-conditioned face foundation model!
🔍🌟Given an ArcFace embedding, Arc2Face generates diverse, photo-realistic images with unmatched face similarity.
🖼️Built upon Stable Diffusion, does ID-to-face generation using only ID vectors.
Introducing 𝐌𝐚𝐫𝐢𝐠𝐨𝐥𝐝-𝐋𝐂𝐌🌼
🚀Lightning-FAST version of the popular sota depth estimator! Get results in seconds for images and 3D, or in minutes for videos.
Combines the power of Marigold's 10-step model with LCM, delivering stunning results in JUST One step.
Zero-shot face-adapted image generation is a rapidly developing niche research field.
If you're looking to stay ahead of the curve or simply to explore current possibilities with Gradio apps, this thread is the perfect place to start.
1⃣IPAdapter
2⃣PhotoMaker
3⃣InstantID
🔄𝐒𝐰𝐚𝐩𝐀𝐧𝐲𝐭𝐡𝐢𝐧𝐠: A novel framework for arbitrary object swapping in personalized visual editing!
🖼️Key advantages:
🌟Precise control of arbitrary objects and parts
🌟Faithful preservation of context pixels
🌟Better adaptation of personalized concepts to the image
The Gemini API by Google has been released. A big shoutout to the team led by
@GoogleDeepMind
for making it possible!
🤯🤯 What is even more mind-blowing is that you can now create your own Chatbot using Gemini Pro and Gradio in just eight lines of code -
Why do so many machine learning models from academia & industry never get deployed as real applications?
The barrier is too high.
Gradio lowers it so that everyone can deploy their models into the real-world, and fast.
Try it out:
🦅 Eagle 7B: RWKV (RNNs) Soaring past Transformers with 1 Trillion Tokens Across 100+ Languages
🚀Outperforms all 7B class models in multi-lingual benchmarks. Perhaps, a good dataset + Scalable architecture: is all you need?🤓
👏Licensed under Apache 2.0. Demo on Spaces!
👉New in Text/Image-to-3D
🔥𝐆𝐑𝐌: A groundbreaking large-scale reconstructor that can recover 3D assets from sparse-view images in just 0.1 seconds!
🚀This transformer-based model efficiently incorporates multi-view information to create densely distributed 3D Gaussians.
Alibaba's AnyText is now on Spaces !
Try out the brilliant demo online now - AnyText includes two modes: Text Generation and Text Editing.
App link below-
DiffusionGPT seems amazing! The example provided in the paper is truly surreal.
Tbh, sdxl is pretty decent here, but the aesthetics of images from DiffusionGPT seem way ahead.
We can't wait for the authors to release a gradio demo for the community to play and give feedback.
ByteDance presents DiffusionGPT
LLM-Driven Text-to-Image Generation System
paper page:
propose a unified generation system DiffusionGPT, which leverages Large Language Models (LLM) to seamlessly accommodate various types of prompts input and…
🚀 Introducing 𝐈𝐃𝐌-𝐕𝐓𝐎𝐍 : A novel diffusion model for image-based virtual try-on! 👗
😍 Improves garment fidelity and generates authentic visuals.
🔧 Uses two modules to encode garment semantics: visual encoder for high-level & parallel UNet for low-level. Links below👇
Excited for our first major release of 2024:
Gradio 4 now works ENTIRELY in the browser, thanks to the power of @𝚐𝚛𝚊𝚍𝚒𝚘/𝚕𝚒𝚝𝚎! That means you can build serverless applications that work faster (and more privately) 💫
Read more:
📢Hot research alert: StableIdentity
🙌Generate customized images using single input image
🚀Combine the learned identity with ControlNet OR Inject it into a video with ModelScopeT2V OR make into 3D using LucidDreamer.
Example Inputs - President Biden &
@ylecun
Google presents LUMIERE
A Space-Time Diffusion Model for Video Generation
paper page:
Demonstrate state-of-the-art text-to-video generation results, and show that our design easily facilitates a wide range of content creation tasks and video editing…
🤯DiffBIR presents a breakthrough in blind image restoration, combining diffusion models and the LAControlNet feature.
🔥Run the
@Gradio
demo available on open-sourced Colab here-
✅Project Page-
🌟Introducing 𝐂𝐡𝐚𝐦𝐩: groundbreaking human image animation method!
🕺Leveraging a 3D human parametric model within a latent diffusion framework, Champ achieves unparalleled shape alignment & motion guidance
🔍Capturing intricate human geometry & motion has never been easier
Gradio 3.7 is out! Everything that is new in 3.7 🔽
1. Gradio now supports *batched* functions. You can specify a batch size and Gradio will automatically batch incoming requests so that your demo runs a lot faster on Spaces!
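A sketch of what a batched function can look like, assuming the function receives one list per input and returns one list per output. `trim_words` and `build_demo` are illustrative names, and the batch size of 16 is an arbitrary choice.

```python
def trim_words(words, lengths):
    """Batched function: each argument is a list with one entry per request."""
    trimmed = [word[: int(n)] for word, n in zip(words, lengths)]
    # One inner list per output component; here there is a single output.
    return [trimmed]


def build_demo():
    # Imported lazily so trim_words() can be used without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=trim_words,
        inputs=["textbox", "number"],
        outputs=["textbox"],
        batch=True,          # let Gradio group incoming requests
        max_batch_size=16,   # upper bound on requests per call
    )


# Batching relies on the queue:
# build_demo().queue().launch()
```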
🔥InstantID demo is now out on Spaces.
Thanks
@Haofan_Wang
et al, for building a brilliant Gradio demo for the community🙌
Check out the path-breaking demo now! Here is an example of a Marvel superhero, 🦸♂️
@ylecun
, generated using InstantID within seconds!
GenerativeAI Community is relentless!
💪AudioGradio : A one-click installer for
@Meta
's cutting-edge AudioCraft - with MusicGen and AudioGen built in.
Great work
@cocktailpeanut
🙌 (image credits too)
Find more at:
New release! 3.19 🚀
𝚙𝚒𝚙 𝚒𝚗𝚜𝚝𝚊𝚕𝚕 --𝚞𝚙𝚐𝚛𝚊𝚍𝚎 𝚐𝚛𝚊𝚍𝚒𝚘
💻 nicer UI/UX if you embed a hosted Gradio app (screenshot 👇)
📈 support for Bokeh plots and a native gr.Barplot
🤖 chatbot UI improvements, e.g. single-sided messages
⛓️share links run longer!
📢New research alert: 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering (Guanjun Wu, Taoran Yi,
@JaminFong
et al.)
A technique that can learn dynamic scenes in under 20 minutes🔥
When does the community get to play with a
@Gradio
demo on Spaces?🤩
🤯𝐀𝐧𝐢𝐏𝐨𝐫𝐭𝐫𝐚𝐢𝐭 : a framework for generating High-Quality Animation driven by Audio & a reference portrait image
🚀You can also provide a video to achieve face reenactment.
Gradio Community has already started building tools using AniPortrait. Don't fall behind!👇
𝐂𝐇𝐀𝐌𝐏: 𝐂𝐨𝐧𝐭𝐫𝐨𝐥𝐥𝐚𝐛𝐥𝐞 𝐚𝐧𝐝 𝐂𝐨𝐧𝐬𝐢𝐬𝐭𝐞𝐧𝐭 𝐇𝐮𝐦𝐚𝐧 𝐈𝐦𝐚𝐠𝐞 𝐀𝐧𝐢𝐦𝐚𝐭𝐢𝐨𝐧
💻Model weights on
@huggingface
:
👉Code:
🤔Curious to see Champ in action? Build your app & submit for Spaces GPU grants!
🎨Introducing 𝐅𝐚𝐜𝐞-𝐭𝐨-𝐀𝐥𝐥: a powerful diffusers workflow that lets you customize your face with any style LoRA!🌟
💡Inspired by the Face-to-Many ComfyUI workflow
🖼️Usage: Input a face, Choose a style LoRA, and Get a stunning stylized portrait
🔥 𝐁𝐢𝐫𝐞𝐟𝐍𝐞𝐭 is really good at making masks.
💪 Build your UIs on top of this Gradio app, or maybe just use the app as an API endpoint with Gradio Clients to generate quick masks with the app🎭
🧊Explore Wonder3D, which generates high-fidelity 3D meshes from single-view images.
🙌Great work Xiaoxiao Long, Yuan-Chen Guo,
@chenglin6161
,
@YuanLiu41955461
, and team.
Code-
Gradio Demo-
📢SALMONN - A model that can be regarded as a step towards AI with generic hearing abilities (speech, audio, music).
✅New research from
@bytedance
✅Code-
✅A
@Gradio
Demo is inside the repo.
SALMONN: Towards Generic Hearing Abilities for Large Language Models
paper page:
Hearing is arguably an essential ability of artificial intelligence (AI) agents in the physical world, which refers to the perception and understanding of general auditory…
🎉 Now live! From 𝚐𝚛𝚊𝚍𝚒𝚘 𝟹.𝟿 on, your Gradio app runs 𝐋𝐎𝐂𝐀𝐋𝐋𝐘 in Colab notebooks, which means:
🔒More security since your data stays local
⚡️Your app is lightning fast
Huge props to
@thechrisperry
,
@peteblois
, & the whole
@GoogleColab
team for helping make this🔥
🔥Introducing 𝐋𝐥𝐚𝐦𝐚𝐅𝐚𝐜𝐭𝐨𝐫𝐲
A game-changing framework for efficient fine-tuning of 100+ language models!🦙💯
This powerful tool integrates cutting-edge training methods and allows users to customize their models without coding. 💻
Introducing: Gradio Deploy ⚡️
The fastest way to push a Gradio application from your computer to 🤗 Spaces
Literally two words (𝚐𝚛𝚊𝚍𝚒𝚘 𝚍𝚎𝚙𝚕𝚘𝚢) in your terminal, followed by pressing Enter a few times to get a fully working Space:
📢What if you are no longer limited by image size due to memory constraints
This paper introduces a training-free method 𝐓𝐨𝐃𝐨 that downsamples key & value tokens to accelerate SD inference (2x to 4.5x) for high-res images like 2048x2048.
Outperforms in throughput & fidelity.
🚨Text-to-Video alert: Have you checked out the brilliant new AnimateLCM-SVD space yet?
It is the most recent recipient of a
@huggingface
GPU grant. Results are super nice🤩
Demo:
A 500M parameter VLLM that can do multi-turn chat and shows impressive results [ref attached video]🤯
UForm - From
@ashvardanian
and rest of the
@unum_cloud
team.
UForm
Pocket-Sized Multimodal AI For Content Understanding and Generation
demo:
a tiny generative VLM for chat, vqa and captioning with LLM, which has only 500M parameters
🔥Introducing 𝐏𝐢𝐱𝐀𝐫𝐭-Σ: Groundbreaking Diffusion Transformer model🎨
📝Generating 4K images directly from prompts
💪Evolution from PixArt-α to PixArt-Σ through "weak-to-strong training"
🌟Experience great image fidelity and alignment with text
Did you know that you can create neat UIs for Hugging Face models with a few lines of code, using
@gradio
? 🤯 In this blog post, we walk you through how to build UIs for your models and share them in Spaces 🤗🧡
Have you checked out the cool Whisper-WebUI yet?
1⃣Generate subtitles from: Files, Youtube, Mic
2⃣In subtitle formats: SRT, WebVTT, etc
3⃣Whisper's end-to-end STT translation From other languages to English.
4⃣Text to Text: Translate subtitle files using Meta's NLLB, DeepL API
𝐏𝐢𝐱𝐀𝐫𝐭-Σ
🤯At just 600M parameters, it achieves better image quality & prompt adherence than bigger diffusion models like SDXL (2.6B) and SD Cascade (5.1B)
Demo video compares results with MJ and Dalle3. Results are Exciting!
Stay tuned🤩for the models and a demo!
PixArt-Σ
Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
In this paper, we introduce PixArt-Σ, a Diffusion Transformer model (DiT) capable of directly generating images at 4K resolution. PixArt-Σ represents a significant
Gradio 4.0 launch in 4⃣ hours 🚀
Come join us to see everything Gradio 4.0 has to offer, from 🎨 custom components to ⚡️ blazing fast server-side events.
Livestream:
🎀 Gradio Guides 🎀
The new way to learn how to build machine learning demos!
Guides are:
🤜 Detailed: step-by-step tutorial + code for different use cases (e.g. chatbots)
🤜 Interactive: thanks to embedded Gradio demos
🤜 Linked to Related🤗 Spaces
A new model for converting Images-to-Videos has been introduced: 𝐌𝐨𝐭𝐢𝐨𝐧-𝐈𝟐𝐕
Researchers from: NVIDIA AI, The Chinese University of Hong Kong, SenseTime Research, Tsinghua University, CPII, Shanghai AI Laboratory, Avolution AI
Eagerly anticipating a demo on Spaces.
Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling
paper page:
introduce Motion-I2V, a novel framework for consistent and controllable image-to-video generation (I2V). In contrast to previous methods that…
📢
@GroqInc
API access is available to everyone now.
Models available: Llama2-70b-4k & Mixtral-32k context window
We went ahead and built a Gradio chatbot with Mixtral and used the 32k window to feed the entire SD3 paper from
@StabilityAI
and asked it questions. It went 🚀🚀.
Now that Google has released Gemma, it's about time for
@OpenAI
to launch a commercially available 7B LLM (and/or smaller version) to show us what it's really got!
Psst: Learn everything about using Gemma on
@huggingface
Blog here -