If OpenAI is going to continue to eat AI startups sector-by-sector, they should go public ASAP.
Building the new economy where only 500 people benefit is a shit future.
It gets even worse: these copyrighted books are being used to train human writers! A widespread underground network of writing classes and workshops is studying books and repurposing them without royalties or attribution. Shameful.
NEW: Meta, Bloomberg, and EleutherAI have trained generative AI on a dataset including upwards of 170,000 pirated books from authors like Stephen King, Zadie Smith, and Margaret Atwood. Legality is complex. We have new details and context. Tip: @Techmeme
GitHub Copilot exists despite the AI fear on display this week.
Fears of productizing Copilot almost halted its creation within OpenAI and led directly to the forking off of Anthropic.
Today Copilot helps a lot of people. It would not exist if fear limited access to AI.
Host your own LLM. It’s not that hard. You just need a GPU from the last few years!
Finetune and you can shape the exact qualities you want. All it takes is some input-output pairs. Simple!
Train your own small models and fill all sorts of niches. They’re super cheap and fast!…
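For illustration, here's a minimal sketch of the "input-output pairs" step, assuming a JSONL prompt/completion schema (an assumption; match whatever format your fine-tuning toolchain actually expects):

```python
import json

def pairs_to_jsonl(pairs, path):
    """Write (input, output) pairs as JSONL records for supervised fine-tuning.

    The {"prompt": ..., "completion": ...} schema is hypothetical; adapt it
    to the template your trainer consumes.
    """
    with open(path, "w") as f:
        for prompt, completion in pairs:
            f.write(json.dumps({"prompt": prompt, "completion": completion}) + "\n")

pairs = [
    ("Summarize: The cat sat on the mat.", "A cat sat on a mat."),
    ("Translate to French: hello", "bonjour"),
]
pairs_to_jsonl(pairs, "train.jsonl")
```

A few hundred such pairs is often enough to shape tone or output format on a small model.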
I liked $10 and the price sensitivity meter confirmed. $10 means almost everyone can afford Copilot, people rarely cancel, the human feedback data flows, and competitors have slim margins. You’re welcome 🫡
Here’s why your agent doesn’t work:
- compounding errors
- no trajectory RL
- reality doesn’t few shot well
- APIs don’t do enough
- irrelevant context hurts
- subchains destroy nuance
And that’s just what we know about.
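The first failure mode is easy to quantify: if each step succeeds independently with probability p, an n-step chain succeeds with probability p^n, which collapses fast.

```python
def chain_success(p_step: float, n_steps: int) -> float:
    """Probability an agent completes all n steps, assuming each step
    independently succeeds with probability p_step."""
    return p_step ** n_steps

# A 95%-reliable step looks great in isolation...
print(round(chain_success(0.95, 1), 3))   # 0.95
# ...but a 20-step trajectory succeeds only about a third of the time.
print(round(chain_success(0.95, 20), 3))  # 0.358
```

The independence assumption is generous: in practice errors also poison downstream context, so real agents do worse than this bound suggests.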
We’re at the beginning of human-computer interaction all over again, at a time before PARC.
Significant chance the idea of a “product” or a “UI” doesn’t make sense in a few years.
Exited founder
high school dropout
principal engineer
Hiked 2k+ miles
Lifted 500 lbs
Fell in love
40+ countries visited
Cancer survivor
Hitchhiked across SE Asia
Been rich, been poor
Seeking does not lead to finding.
It’s always an inside job 🙏🏻
Treating engineers as a commodity is a symptom of promoting to the management track too early. Predictive in every zombie tech co. The answer is small teams, high autonomy, and firing fast.
I think Yann might underestimate the potential of AI if people have API access to strong generative AI. LLMs are capable of generating code which could be executed *automatically* by *anyone* without any human *oversight*, also in a loop and open-endedly.
This is very hard to…
Q*: Trading inference time MCTS for model capacity.
Meaning you can spend 1000x time picking the next token, to approximate a model 1000x the size.
Which you then distill down to a today-sized model. Implications for sample efficient self-play.
I wish I understood what any of…
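No one outside knows what Q* actually is, but the compute-for-capacity trade can be sketched as a toy: sample many next-token candidates and keep the one a value function scores highest (best-of-N standing in for real tree search; the value function here is hypothetical).

```python
import random

def pick_next_token(candidates, value_fn, n_samples):
    """Toy stand-in for spending extra inference compute on one token:
    sample many candidates and keep the one a value function likes best.
    Real MCTS would also roll out and evaluate continuations."""
    sampled = [random.choice(candidates) for _ in range(n_samples)]
    return max(sampled, key=value_fn)

random.seed(0)
vocab = ["the", "cat", "sat", "on", "mat"]
# Hypothetical scores standing in for a learned value model.
scores = {"the": 0.1, "cat": 0.9, "sat": 0.3, "on": 0.2, "mat": 0.4}
best = pick_next_token(vocab, scores.get, n_samples=100)
```

Distillation then means training a small model to emit `best` directly, without the 100x sampling.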
Copilot was made using startup principles, with a tiny team, in under a year, inside a very dysfunctional GitHub/MSFT org.
Imagine how much good could be unleashed within existing orgs if they put trust in individuals instead of hierarchy.
Talked to a programmer today who said AI coding tools made him about 10x more productive. Though 10 seems like a round number, this was an attempt at a precise estimate.
Will go a little further than @simonw.
You can run a 65B-parameter model on your Mac. In a few weeks there will be a serviceable Copilot and ChatGPT you can run yourself.
We are now in the awesome timeline.
As you become an adult, you realize that things around you weren't just always there; people made them happen. But only recently have I started to internalize how much tenacity *everything* requires. That hotel, that park, that railway. The world is a museum of passion projects.
Story time. In 2016, I deployed my first AI system. I had wanted to work on AI Personal Assistants, and joined a text-based PA company.
We trained a small seq2seq model on all the chats, and for the PAs it would suggest responses. (Kind of like Copilot grey text).
It worked…
Part of the motivation with meetups is to remove the class divide between AI tinkerers, practitioners, researchers. Also break the Bay Area centrism. Make place to share openly and build high-trust community. Mixed results so far, looking for new ideas!
No one is asking why an App Store is so important to OpenAI - it’s required for their version of AGI.
It will teach the model when and how to use external APIs, which they can weave together as demonstrated with the code-interpreter/dall-e APIs.
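The core loop is simple to sketch: the model emits a structured API call, a dispatcher executes it, and the result goes back into context. This is a minimal toy (the tool registry and JSON call format are assumptions, not OpenAI's actual plugin protocol):

```python
import json

# Hypothetical tool registry; a real system would wrap external APIs here.
TOOLS = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
}

def dispatch(model_output: str):
    """Parse a model-emitted JSON tool call and execute the named tool."""
    call = json.loads(model_output)
    fn = TOOLS[call["tool"]]
    return fn(**call["args"])

# Pretend the model decided to call the "add" tool.
result = dispatch('{"tool": "add", "args": {"a": 2, "b": 3}}')
```

The hard part isn't the dispatcher; it's teaching the model *when* to emit the call, which is exactly what an App Store's worth of usage data would provide.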
Come stay in my scenic SF apartment Airbnb, 2x8 A100 80gb nodes included, great views and 5 min walk to the Mission. Relax while you train models in urban luxury. - $1500/night
If you care about AI Safety, fire up LLaMA at home. Start poking around at activations. Make open analysis tools and test suites.
Where do slurs, apocalypse ideas, biases live in the network? How to remove them without hurting perf? YOU can help make AI safe.
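A minimal sketch of "poking around at activations," using a toy NumPy MLP in place of a real model (with an actual LLaMA checkpoint you'd capture hidden states via framework forward hooks instead):

```python
import numpy as np

# Toy 2-layer MLP standing in for a transformer block.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 8))
W2 = rng.standard_normal((8, 2))

activations = {}

def forward(x):
    h = np.maximum(x @ W1, 0.0)   # post-ReLU hidden state
    activations["hidden"] = h     # capture for later inspection
    return h @ W2

x = rng.standard_normal((3, 4))
out = forward(x)
# Now probe activations["hidden"]: norms, dead units, feature directions...
```

The same pattern, applied layer by layer to an open model, is where interpretability work on bias and dangerous capabilities starts.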
Google is hiring smart people to work on agents because it's the one thing that can disrupt their monopoly on the web. A fundamentally defensive strategy.
Generate example task
Break down into steps
Loop (
Generate code for step
Run code
Expose error
)
Save working code as kshot example
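The loop above can be sketched in Python, with dummy stand-ins for the LLM call and the sandboxed executor:

```python
def refine_until_working(task, generate_code, run, max_tries=5):
    """Generate code for a task, run it, and feed errors back until it works.
    `generate_code(task, error)` and `run(code)` are stand-ins for an LLM
    call and a sandboxed executor."""
    error = None
    for _ in range(max_tries):
        code = generate_code(task, error)
        ok, error = run(code)
        if ok:
            return code  # save as a k-shot example
    return None

# Dummy model that only gets it right after seeing an error message.
def generate_code(task, error):
    return "print('hi')" if error else "printt('hi')"

def run(code):
    try:
        exec(compile(code, "<agent>", "exec"))
        return True, None
    except Exception as e:
        return False, str(e)

example = refine_until_working("greet", generate_code, run)
```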
If this is exciting to you, we're hiring
@ai_minion
🤖
Looking for a web-focused full stack senior engineer to join
@ai_minion
in SF. Willing to salary match.
Join us to work at the bleeding edge of AI: prompting, fine-tuning, synthetic data, codegen, planning, reasoning, and memory for embodied agents.
jobs+eng@minion.ai or DM.
Why you should never use pgvector (e.g. @supabase Vector Store) for production:
😮 pgvector is 20x slower than a decent vector DB (e.g. @qdrant_engine)
🤯 And it's a full 18% worse at finding relevant docs for you
And this can happen at as little as 10K documents when chunked!
Has anyone looked into “attention expansion” where you’d replace a large section of prompt with an ellipsis token, and if attention on ellipsis is high you expand with original content? Ideally automatically without having to repeat inference.
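To my knowledge no one has published this; a purely hypothetical toy of the mechanism (the placeholder token, threshold, and attention vector are all assumptions):

```python
import numpy as np

def maybe_expand(tokens, attention, placeholder, original, threshold=0.2):
    """If the model attends heavily to the ellipsis placeholder, splice the
    elided original content back in. `attention` holds the weight the
    current query puts on each token (sums to 1)."""
    i = tokens.index(placeholder)
    if attention[i] >= threshold:
        return tokens[:i] + original + tokens[i + 1:]
    return tokens

tokens = ["intro", "…", "question"]
attn = np.array([0.1, 0.6, 0.3])
expanded = maybe_expand(tokens, attn, "…", ["long", "context", "here"])
```

The open question in the tweet remains: doing this without a second inference pass after expansion.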
One way to think of the Open in OpenAI is that they are teaching a lot of devs about ML/LLM through the pacing of their releases.
Prompting, context length, attention, RAG, finetuning, param count, datamix, tok/s... all these concepts unknown outside academia a year ago.
Something different about AI projects: it’s best to iterate by starting over from scratch when you find a better abstraction. Just cannibalize the old stuff and delete it. Inaccuracy compounds with the wrong metaphor & slows dev down.
The people tinkering with small LLMs are undervalued by industry rn: AI apps become ensembles of small, fast dedicated models in support of big honking models.
People dunking on Google for messing up historic images should spend a day imagining how to solve the problem.
Draw out a little decision tree to see how hard it is. Fact vs fiction vs diversity vs helpful vs problematic isn’t easy to calibrate.
Best possible solution: upload all public tweets in a parquet file daily. Keep the login wall up permanently. Minimal ops impact, zero incentive to scrape.
@TimSweeneyEpic
Several hundred organizations (maybe more) were scraping Twitter data extremely aggressively, to the point where it was affecting the real user experience.
What should we do to stop that? I’m open to ideas.
Emerging combination of LLM + RL + codegen for (agent || robot) is an interesting unification I didn’t expect.
Seems like many/most AI systems in the future will be some variant of this?
I think there’s a serious new inequality introduced by Copilot et al: adding code is much easier than refactoring. Previously they were comparable.
Top programmer now means keeping the system in your head, not being lazy, finding clean abstractions.
+5% on evals this weekend. Finetuning on all data, then again on high quality. TBD is boost from DPO - gives +8% on small model.
Proved we are data limited. Scaling up self-play next week.
New canned response when people ask me to help them with their product:
“Yeah I don’t care about any of that marketing crap. Tell me about the core technical problems, how you’re breaking them down to solve them, your key metrics, and where your baseline performance is today.”
When I met @pmarca, he asked me how multi-agent systems would collaborate in the future.
My response: I’m just trying to order pizza online reliably 🤣
@markchen90
Consider an outsider's perspective: OpenAI has progressively published fewer details on its models. It was non-profit, now for-profit. It routinely ships features that its customers sell. Now it is contributing to hindering others from replicating.
I implemented Self-Extend on Mistral 7B last night.
It extends context length from 8k to 16k and beyond without fine-tuning, using a new bi-level attention technique based on grouped attention.
code: paper:
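My reading of the bi-level trick, as a toy (window and group sizes are illustrative, and this omits the real attention plumbing): relative positions stay exact inside a local window and are floor-divided into groups beyond it, so the pretrained position range covers a longer context.

```python
def self_extend_rel_pos(q_pos, k_pos, window=4, group=2):
    """Bi-level relative position in the spirit of Self-Extend:
    exact distances within a local neighbor window, grouped
    (floor-divided) distances beyond it."""
    d = q_pos - k_pos
    if d < window:
        return d                           # neighbor attention: exact
    return window + (d - window) // group  # grouped attention: compressed

# Distances 0..3 stay exact; beyond that they grow group-times slower,
# so a 16k context reuses position values seen during 8k pretraining.
```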
Simple LLM technique that helps a lot (but you might not be using): add a constraint checker to ensure valid generation. On violation, inject what was generated and the rule violation, and regenerate.
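A minimal sketch of that retry loop, with dummy stand-ins for the model and the checker:

```python
def constrained_generate(prompt, generate, check, max_retries=3):
    """Generate, validate against a constraint checker, and on violation
    feed the bad output plus the violated rule back for a retry.
    `generate` and `check` are stand-ins for an LLM call and a validator."""
    for _ in range(max_retries):
        out = generate(prompt)
        violation = check(out)
        if violation is None:
            return out
        prompt += f"\nPrevious output: {out}\nViolated rule: {violation}\nTry again."
    return None

# Dummy model: emits lowercase until the prompt mentions a violation.
def generate(prompt):
    return "OK" if "Violated rule" in prompt else "ok"

def check(out):
    return None if out.isupper() else "output must be uppercase"

result = constrained_generate("say ok", generate, check)
```

The key detail is injecting both the offending output and the specific rule, so the retry is conditioned on the failure rather than starting blind.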
Excited to see everyone at The Commons in SF tonight starting at 6 🤗
Based on the austin/seattle meetups we expected ~20, and rented a space to support 80 people. We currently have 450 people who have RSVPd 🤯
I’ll be at door to personally apologize to people we can’t let in…
That feeling when everything aligns and you’re 100% certain you’re holding the future. Cosmic? Enlightened?
Felt it with ghost text prototype for Copilot. Felt it last week with
@ai_minion
. Cannot wait to get this in people’s hands.