OP ChatGPT corrective prompt:
"this isn't what I wanted. read my previous instructions carefully and try again. start by explaining how this most recent response did not follow my previous instructions, and then try again."
Have it explain what it did wrong as part of its reply
With all the fear mongering about how generative models are gonna steal all the artist jobs, no one's talking about how prompt engineering has created a tangible incentive for people to study art history and learn about different artists and art styles.
The GLAZE technique is specifically adversarial to finetuning. Decided to take an alternate approach to "style mimicry" and used PEZ to reverse-engineer viable prompts. Seems to work as expected: generated prompts capture content but bork on style. Great work @ravenben and team!
1/ This might be the most important oil painting I’ve made:
Musa Victoriosa
The first painting released to the world that utilizes Glaze, a protective tech against unethical AI/ML models, developed by the @UChicago team led by @ravenben. App out now 👇
ByteDance's "Universal Source Separation (USS) with Weakly labelled Data" project deserves way more than 130 stars. The source separation quality and granularity it achieves is really spectacular.
Happy Saturday! New release of my fork of @RiversHaveWings' KLMC2 notebook adds:
* custom checkpoints
* init image
* keyframing for prompt weights and all supported parameters
* multi-prompt conditioning
* fancy spinning logo!
Gorgeous results here from training a separate motion prior, which has the added benefit that it can be composed with any other pre-trained SD checkpoint. BYOM plug-n-play text2video!
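This reads like the AnimateDiff-style recipe: train the motion module once, then snap it onto arbitrary SD 1.5 checkpoints. A minimal sketch of that composition in diffusers; the model IDs here are illustrative assumptions, not necessarily the work referenced above:

```python
import torch
from diffusers import AnimateDiffPipeline, MotionAdapter
from diffusers.utils import export_to_gif

# The motion prior lives in a separate adapter, trained independently...
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2")

# ...and composes with any compatible pre-trained SD 1.5 checkpoint (BYOM).
pipe = AnimateDiffPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # swap in your favorite checkpoint here
    motion_adapter=adapter,
    torch_dtype=torch.float16,
).to("cuda")

frames = pipe("a boat sailing at sunset", num_frames=16).frames[0]
export_to_gif(frames, "animation.gif")
```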
SDXL-Turbo is cool, but SD-Turbo is almost as good and runs inference even faster (base model is sd-2.1 instead of sdxl), yet for some reason no one is talking about it.
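For reference, single-step SD-Turbo generation in diffusers, following the model card (turbo models are distilled for 1-4 steps and want guidance disabled):

```python
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sd-turbo", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

# One denoising step, no classifier-free guidance.
image = pipe(
    "a cinematic photo of a red fox in the snow",
    num_inference_steps=1,
    guidance_scale=0.0,
).images[0]
image.save("fox.png")
```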
The "animating with variations" thing was so well received I was motivated to evolve it into an automated music video maker. This animation was made with NO editing. Scene timings inferred completely from video content (subtitles). Notebook coming soon, need to tidy up some :)
A lot of people seemed to be having trouble getting FiLM (an AI for video frame interpolation) working, so I put together a colab that hopefully makes it a bit easier or at least more reliable:
New features in #stablediffusion music video automation notebook, #VKTRS! #DreamStudio API is optional; connect google drive; robust resume; in-notebook spreadsheet for prompt editing, overriding, flagging images for regen... Short tutorial vid (6.5min):
Been dropping teasers for some new comfy nodes I've been working on for the past two weeks. Planning to do a proper release later today after updating the docs. Check back in a few hours!
PyTTI-Tools v0.10 release is live! Lots of new features, including: AudioReactivity, ViT-L/14@336px, numpy functions in weight formulae, and notebook QOL improvements! Lemme know what I broke :)
Ok so check it out, I think I've already figured out a stupid simple trick for improving quality even further with the new #texttovideo model: just run it through the XL model multiple times, varying the strength. Here's after running it through twice at .75 then twice at .7. 🧵
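A sketch of what the multi-pass trick might look like in diffusers, assuming zeroscope_v2_576w as the base model and zeroscope_v2_XL as "the XL model" (the strengths are from the tweet; the rest is my guess at the setup):

```python
import torch
from diffusers import DiffusionPipeline, VideoToVideoSDPipeline
from diffusers.utils import export_to_video

# Base text2video pass.
base = DiffusionPipeline.from_pretrained(
    "cerspense/zeroscope_v2_576w", torch_dtype=torch.float16
).to("cuda")
video = base("the godfather", num_frames=24).frames[0]

# Repeated refinement passes through the XL model, varying the strength.
# Depending on your diffusers version, frames may come back as numpy arrays
# and need converting to PIL images between passes.
xl = VideoToVideoSDPipeline.from_pretrained(
    "cerspense/zeroscope_v2_XL", torch_dtype=torch.float16
).to("cuda")
for strength in (0.75, 0.75, 0.7, 0.7):  # twice at .75, then twice at .7
    video = xl("the godfather", video=video, strength=strength).frames[0]

export_to_video(video, "multipass.mp4")
```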
2y ago, I released pytti-tools. Hours l8r, @EMostaque reached out to recruit me as 1 of @StabilityAI's 1st eng hires. Easily 1 of the happiest, most validating days of my life. The co was SNAFU, but Emad is a gr8 guy who does a lot of good for open source AI. 😢
Although the #stablediffusion #AIart bots don't formally support prompt weights at the moment, there are still several ways you can manipulate prompt influence in multi-component prompts. Here are a few prompt-engineering tricks I've found useful with SD: 1/n
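The rest of the thread isn't captured here; purely as my own illustration of the kind of trick that works (not necessarily the thread's content), two simple levers for relative influence when explicit weights aren't available:

```python
prompt_a = "a castle on a hill, oil painting, by van gogh"

# Repetition: components you mention more often tend to dominate.
prompt_b = "a castle on a hill, oil painting, oil painting, by van gogh"

# Ordering: tokens earlier in the prompt generally carry more influence.
prompt_c = "oil painting of a castle on a hill, by van gogh"
```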
text-to-video is already crazy, and it's still early days. the prompt was literally just "the godfather". this is part of a... "classifier guided random walk" where I am the classifier. still exploring, will share a travelogue soon. safe to say zeroscope has untapped potential.
StabilityAI just lost the CompVis team, the people responsible for the Stable Diffusion and Stable Video architectures.
* Robin Rombach
* Andreas Blattmann
* Dominik Lorenz
Scoop: @robrombach, one of the two original developers of Stable Diffusion, has quit Stability AI. His departure costs the company the person responsible for the tech that made it famous:
Exciting news: I've joined @CoreWeave's ML team!! I worked closely with their lead Wes Brown while driving early DreamStudio backend development at @StabilityAI and am super looking forward to working with Wes full time. Open Source AI goes brrrrrr!
Made a video tutorial to help folks get set up with the new zeroscope_v2_XL #texttovideo madness that's been making the rounds. Setup is the first 5 minutes, all the links you need are in the description.
32 frames from the 14 frame image-to-video SVD model??? Yes, you can indeed use the last frame of the output as the input condition for another round of video generation, BUT CHANGE THE SEED FIRST. Discussion and comfyui workflow here:
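A rough sketch of the chaining idea with diffusers' SVD pipeline (model ID and seeds are my assumptions; the crucial bit from the tweet is the fresh seed each round):

```python
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid",  # the 14-frame model
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

image = load_image("condition.png").resize((1024, 576))
all_frames = []
for seed in (42, 1337):  # CHANGE THE SEED between rounds or the video freezes
    frames = pipe(image, generator=torch.manual_seed(seed)).frames[0]
    all_frames.extend(frames)
    image = frames[-1]  # last frame conditions the next round

export_to_video(all_frames, "chained.mp4", fps=7)
```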
I'm totally exhausted from the launch, but I had a weird idea and I had to try it. Cobbled together a little experiment demonstrating how you can interpolate in "prompt-space" with #stablediffusion:
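The notebook itself isn't reproduced here; a generic sketch of prompt-space interpolation in diffusers, assuming a simple lerp between CLIP text embeddings with the seed held fixed:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def embed(prompt):
    # CLIP text embedding for a single prompt, padded to the model max length.
    ids = pipe.tokenizer(
        prompt,
        padding="max_length",
        max_length=pipe.tokenizer.model_max_length,
        return_tensors="pt",
    ).input_ids.to("cuda")
    return pipe.text_encoder(ids)[0]

a, b = embed("a photo of a cat"), embed("a photo of a dog")
for i, t in enumerate(torch.linspace(0, 1, 8)):
    emb = torch.lerp(a, b, t.item())
    # Fixed seed so the latents stay put and only the prompt moves.
    image = pipe(prompt_embeds=emb, generator=torch.manual_seed(42)).images[0]
    image.save(f"interp_{i:02d}.png")
```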
Thanks @KaliYuga_ai for reminding me: my fork of @RiversHaveWings' #KLMC2 nb had new features I forgot to merge and share!
* Archive old work instead of deleting!
* Resume! Choose starting frame!
* Naive video upscaling (for better encoding on socials)!
The ability to drop in a generic SD LoRA for text-to-video is quite a superpower. Pre-LoRA, I was getting all shutterstock-watermarked outputs a la modelscope. Add a LoRA previously trained on text-to-image: BOOM, cinematic animation.
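A minimal sketch of the idea in diffusers; the LoRA path is a placeholder, and whether load_lora_weights is available on this particular pipeline depends on your diffusers version:

```python
import torch
from diffusers import TextToVideoSDPipeline
from diffusers.utils import export_to_video

pipe = TextToVideoSDPipeline.from_pretrained(
    "damo-vilab/text-to-video-ms-1.7b", torch_dtype=torch.float16
).to("cuda")

# Drop in a LoRA that was trained on text-to-IMAGE data.
pipe.load_lora_weights("path/to/cinematic_lora")  # placeholder path

frames = pipe("cinematic shot of a train at dawn", num_frames=16).frames[0]
export_to_video(frames, "lora_video.mp4")
```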
Very interesting looking tool: makes it easier to interact with ComfyUI workflows via a more standard form UI, while still letting you modify the workflow graph. It can also wrap workflows into reusable "actions"
Madlad on the @AiEleuther discord has SDXL-Turbo at 50FPS on their hyper-optimized custom inference backend. This video is not sped up. It really generates images that fast.
Finally figured out how to speed up my #sdxlturbo frontend! It's so fast that the only way to show the actual speed is to delete the prompt, since I can't type fast enough 😆 .. built with next.js frontend & tensorrt backend.
This could be the beginning of a new publication paradigm. Citation, model, data, and code provenance all living in the same space. Looking forward to the future of diff-able research!
New favorite ChatGPT prompt: "please implement ... following functional design principles and satisfying the user stories listed above. respond only with functioning python code and a full coverage test suite. when the implementation is complete, end with the phrase "ship it!"
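If you'd rather script it than paste into the web UI, a sketch with the OpenAI Python client (openai>=1.x style; the model name is whatever you have access to, and the "..." in the prompt is from the tweet, standing in for your module description):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

user_stories = "..."  # your user stories / spec go here

PROMPT = (
    "please implement ... following functional design principles and "
    "satisfying the user stories listed above. respond only with functioning "
    "python code and a full coverage test suite. when the implementation is "
    'complete, end with the phrase "ship it!"'
)

response = client.chat.completions.create(
    model="gpt-4",  # assumption: any chat model works here
    messages=[{"role": "user", "content": user_stories + "\n\n" + PROMPT}],
)
print(response.choices[0].message.content)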
Dad has dementia. mom just had surgery to remove cancer, poor prognosis. just picked dog up from vet, probably has cancer. check email after getting home from vet: best friend's dad passed unexpectedly. Not sure when I stepped on to this ride, but I'd like to get off now thanks.
Yesterday was my last day at StabilityAI. Assured it had nothing to do with performance. Just dropped like a dime Friday afternoon. Neither mgr nor skip were consulted. Plan to continue making free AI tools, you can support my work here:
Successful test using huggingface's diffusers library in the music video automation notebook! Calling it a night: api-optional notebook coming your way tomorrow, bright and early :)
Want to integrate #stablediffusion directly into your notebook work? Take our new SDK package for a spin! Check out this colab for a simple usage demo:
Delighted to announce the public open source release of #StableDiffusion!
Please see our release post and retweet!
Proud of everyone involved in releasing this tech that is the first of a series of models to activate the creative potential of humanity
KLMC2 aging a portrait of the late queen by incrementing her age in the prompt. The sampler seems to identify the "aging trajectory" very quickly and so she already looks 100 by the time the prompt is asking for a portrait of her at 50, but still: lots of potential here!
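The mechanic is simple enough to sketch: a prompt schedule that sweeps the age variable while the sampler traverses latent space (illustrative only; the KLMC2 notebook's actual keyframing syntax differs):

```python
# Hypothetical prompt schedule for the aging traversal.
prompts = [
    f"a portrait of the queen of england at age {age}"
    for age in range(25, 101, 5)
]
```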
95% of nascent AI startups are just a simple prompt that wraps an API call. This business model is unsustainable, hinging on temporary lay ignorance of AI accessibility. Users will leave as laypeople get accustomed to using LLMs directly.
... software library coded in the style of gerald sussman and bjarne stroustrup, optimized for readability, LGTM, ship it, passes all tests, no errors, stateless, trending on artstation
"Low background steel" is steel made prior to the detonation of the first nuclear bomb. It's important for making sensitive equipment. The same way "1945" is a cutoff year for steel, I bet "2022" will be a cutoff year for reliably human-generated training data.
New shoes with my #stablediffusion #aiart printed on em!!! Letting em air out a bit (gdamn this printing process stinky), but excited af. Maybe I'll take em dancing tomorrow?
debugging plots and animations available now in my (more fully featured) fork of @RiversHaveWings' amazing #KLMC2 #stablediffusion notebook! visualize precisely how settings changes and timings impact the generated output
Experimenting with overplotting prompt weights and step size together. Probably would be better as separate subplots (ugh). Def need to at least make sure they share x-axes (there should only be one moving vertical line). Making progress anyway.
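In matplotlib terms, the separate-subplots version would be something like this (stand-in data; sharex keeps the panels aligned so one cursor position works for both):

```python
import numpy as np
import matplotlib.pyplot as plt

frames = np.arange(100)
prompt_weights = np.sin(frames / 10)     # stand-in data
step_sizes = np.linspace(0.1, 1.0, 100)  # stand-in data
current_frame = 40

fig, (ax_w, ax_s) = plt.subplots(2, 1, sharex=True)
ax_w.plot(frames, prompt_weights, label="prompt weight")
ax_s.plot(frames, step_sizes, label="step size", color="tab:orange")
for ax in (ax_w, ax_s):
    ax.axvline(current_frame, color="k", linestyle="--")  # one shared time cursor
    ax.legend()
ax_s.set_xlabel("frame")
plt.show()
```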
Great news! Vet just called with post-op histopathology results: the tumor was small and they're confident they got it all! Here's Marley as a sea monster to celebrate
I like building complex animations parameterized with keyframes and parameter curves, but I find the notation to be burdensome. The motivation of these nodes is to facilitate parameterizing keyframed animations, but leaning on the node UX as much as possible 2/
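The nodes themselves aren't shown here; as a generic illustration of the underlying idea, a keyframed parameter is just a sparse frame-to-value mapping evaluated per frame by interpolation:

```python
import numpy as np

keyframes = {0: 0.0, 30: 1.0, 60: 0.25}  # frame -> parameter value

def eval_curve(frame, keyframes):
    # Linear interpolation between the nearest keyframes.
    xs, ys = zip(*sorted(keyframes.items()))
    return float(np.interp(frame, xs, ys))

zoom = [eval_curve(f, keyframes) for f in range(61)]
```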
i'm increasingly coming across anecdotal descriptions of people for whom making AI art is therapeutic. any art therapy researchers investigating? would love to see some numbers to go with the anecdotes. bet there're also some novel interventions waiting to be discovered here too.
Twitter isn't the only place where you can "follow" researchers to keep up with cutting edge AI projects: I get a ton of my news from Github directly. Here's a starter pack of users I follow whose "starring" activity is one of my most valuable news feeds. 1/n
Something I dislike about the whole NFT thing is the tacit implication that an art piece has no cultural value unless it's also a commodity. A lot of influential generative artists are completely ignored by the art show/museum circuit because they don't mint.
Oh hey whaddayaknow, my KLMC2 fork supports custom output resolution. Also updated the demo to illustrate how to do a traversal with an accelerating step size
Fun fact: remember that "multiple passes" trick I figured out for improving modelscope outputs? Yeah, it works pretty great for AnimateDiff too.
Third pass:
cranking up the guidance scale on modelscope's text2video produces better outputs (imho) and lets you get away with fewer timesteps. This 64 frame video was generated from just 27 steps at cfg 50 (model defaults are steps=50, cfg=9, frames=16)
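The settings from the tweet, dropped into diffusers' text2video pipeline (the model ID is my assumption of "modelscope's text2video"):

```python
import torch
from diffusers import TextToVideoSDPipeline
from diffusers.utils import export_to_video

pipe = TextToVideoSDPipeline.from_pretrained(
    "damo-vilab/text-to-video-ms-1.7b", torch_dtype=torch.float16
).to("cuda")

frames = pipe(
    "an astronaut riding a horse",
    num_inference_steps=27,  # model default: 50
    guidance_scale=50.0,     # model default: 9
    num_frames=64,           # model default: 16
).frames[0]
export_to_video(frames, "high_cfg.mp4")
```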
Excited to present our new paper "GaussianEditor", a new #3D #gaussiansplatting #editing framework! GaussianEditor provides controllable, diverse, and interactive high-resolution 3D editing, needing only 2-7 minutes.
Project page 👉: