Stanislas Polu

@spolu

13,790
Followers
607
Following
415
Media
8,676
Statuses

_co-founder+engineer(), _alumni(, , , )

Paris
Joined November 2007
Pinned Tweet
@spolu
Stanislas Polu
5 months
A list of predictions for 2024 for the field of LLMs🧵
@spolu
Stanislas Polu
1 year
A list of predictions for 2023 for the field of LLMs🧵
47
108
707
2
15
137
@spolu
Stanislas Polu
1 year
ChatGPT is not (only) about the model, it’s (mainly) about the UI/X. Many shared their amazement at capabilities that have been here for year(s); because ChatGPT’s UI made them accessible… Reinforcing the idea that AI deployment is more about product than research at this point.
36
156
1K
@spolu
Stanislas Polu
4 years
Posted my first paper on arXiv💥🙌 GPT-f is a Transformer-based automated theorem prover. We show that Transformer + Search is suitable for formal reasoning and continuous self-improvement 🦾
Tweet media one
Tweet media two
Tweet media three
Tweet media four
16
195
903
@spolu
Stanislas Polu
1 year
If ChatGPT is slow for you too, you have a fast paid alternative now... and it does much more!🙃 👉
Tweet media one
29
82
826
@spolu
Stanislas Polu
2 years
When I started this project 2 years ago I couldn't have dreamt of us getting that far. But this is also only the beginning💥 Some thoughts on what we achieved so far 🧵
@OpenAI
OpenAI
2 years
We trained a neural network that solved two problems from the International Math Olympiad.
Tweet media one
168
1K
9K
15
97
752
@spolu
Stanislas Polu
1 year
A list of predictions for 2023 for the field of LLMs🧵
47
108
707
@spolu
Stanislas Polu
3 months
Some day I’ll tell my grand children, « I was there » 🫶 reasoning team, OpenAI 2019
Tweet media one
11
16
693
@spolu
Stanislas Polu
1 year
For the past couple weeks I've been experimenting with a new way to interact with LLMs: a GPT-based assistant that has access to my browser tabs content. It's called XP1 🧵 ... and It's now available here:
33
73
635
@spolu
Stanislas Polu
1 year
Every now and then I need to remind myself of the details of the self-attention mechanism. This picture (the dimensions in particular) is generally ~all I need to recover everything. Also this primer is fantastic, among the best I’ve seen:
Tweet media one
10
74
566
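For readers who want the dimension bookkeeping spelled out, here is a minimal single-head self-attention sketch in NumPy. The shapes follow the standard formulation; the variable names and toy sizes are illustrative, not taken from the primer the tweet links to.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention; the shapes are the whole story.

    X          : (n, d_model)   -- n tokens, model width d_model
    Wq, Wk, Wv : (d_model, d_head) projection matrices
    """
    Q = X @ Wq                                  # (n, d_head)
    K = X @ Wk                                  # (n, d_head)
    V = X @ Wv                                  # (n, d_head)
    scores = Q @ K.T / np.sqrt(K.shape[-1])     # (n, n) token-to-token affinities
    # softmax over the last axis: each row sums to 1
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)
    return weights @ V                          # (n, d_head): per-token mix of values

# shape check with toy sizes
n, d_model, d_head = 4, 8, 2
rng = np.random.default_rng(0)
X = rng.normal(size=(n, d_model))
out = self_attention(X, *(rng.normal(size=(d_model, d_head)) for _ in range(3)))
print(out.shape)  # (4, 2)
```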
@spolu
Stanislas Polu
2 years
I’ll make a bold statement: creating a delightful AI assistant is no longer a problem of getting smarter models. It is now a product problem. Better models will help but the main blocker is 100% a product problem at this point.
16
32
456
@spolu
Stanislas Polu
2 years
Request for (AI) Startups A 2022 list of AI-related startup ideas (I would use or see myself working on if I had the time)🧵
19
71
436
@spolu
Stanislas Polu
2 years
Yesterday was my last day at OpenAI. I had the most fantastic experience there over the past 3 years. I deeply believe everything I say in my farewell email. @OpenAI truly is an exceptional place to work on AI. This is Day 0 of a new project for me, code-named "Dust" 🚀
Tweet media one
15
13
419
@spolu
Stanislas Polu
1 year
A thread on what I believe are the 2 hottest AI/LLM research questions for 2023 🔥
9
46
409
@spolu
Stanislas Polu
1 year
This paper is quite under-rated IMHO: It shows that fine-tuning only the input/output layers of a large pre-trained LM leads to comparable performance to fine-tuning. Related is Learned soft-prompts:
8
62
393
@spolu
Stanislas Polu
7 months
I quit my job, shipped a product in 3 weeks, convinced my former co-founder to join, ported 100k lines of JS to Typescript, built a team, debugged the finickiest Rust eventloop bugs, re-oriented the product 90 degrees, shipped it to 1000s of end-users, convinced their bosses to…
Tweet media one
10
8
312
@spolu
Stanislas Polu
1 year
Dust is now officially a 👨‍❤️‍👨 (2 person show) I started and sold a company to @stripe with @gabhubert a few years ago (hum… a decade tbh). He’s the sharpest and most product driven person I know🔥 Dust up until now was just a warm-up. Now we’re getting serious. Game is on!
30
4
281
@spolu
Stanislas Polu
11 months
But Paris has all the talent 🇫🇷
@jordnb
Jordan Burgess
11 months
London now has offices of OpenAI Anthropic Deepmind Inflection Humanloop FAIR Cohere Conjecture 🇬🇧
30
127
1K
15
17
256
@spolu
Stanislas Polu
1 year
"No GPUs before PMF" No amount of GPUs going brr will produce smthg that people want. Hype yes, FOMO yes, PMF unlikely. The best models are available behind APIs or OS. Will you repro years of research in months? Unlikely. Are they expensive? Yes. Is that an issue early on? No.
25
20
254
@spolu
Stanislas Polu
1 year
A lot of people have been asking me lately about summarization of long documents. Created a Dust app to help people understand the technique: The steps are easy to follow, perfect intro to understanding how to circumvent LLMs limited context size.
12
28
248
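The technique the Dust app walks through is, at its core, map-reduce summarization: summarize chunks that each fit the context window, then summarize the concatenated summaries, recursing until the result fits. A minimal sketch, assuming a stand-in `llm` callable; the whitespace "tokenizer" and the fake model are illustrative only.

```python
# Sketch of long-document summarization under a context limit
# (map-reduce style: summarize chunks, then summarize the summaries).
def summarize(text, llm, max_tokens=1000):
    tokens = text.split()                     # crude tokenization for the sketch
    if len(tokens) <= max_tokens:
        return llm(text)                      # fits: a single call suffices
    # split into chunks that each fit the context window
    chunks = [" ".join(tokens[i:i + max_tokens])
              for i in range(0, len(tokens), max_tokens)]
    partials = [llm(c) for c in chunks]       # summarize each chunk
    # recurse on the concatenated partial summaries
    return summarize(" ".join(partials), llm, max_tokens)

# stand-in "LLM" that keeps the first 10 words, to show the recursion shape
fake_llm = lambda t: " ".join(t.split()[:10])
doc = "word " * 5000
print(len(summarize(doc, fake_llm, max_tokens=1000).split()))  # 10
```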
@spolu
Stanislas Polu
1 year
ChatGPT is the Pong of LLMs. Next we’ll inevitably go through a trough of disillusionment… … but imagine, one day we’ll get the DOOM, Civ, Red Alert, and Counter Strike of LLMs🤩 Let alone multiplayer modes🔥
22
14
217
@spolu
Stanislas Polu
1 year
As LLMs get bigger and smarter they will also get more expensive and slower, restricting their uses to the most complex tasks. But such tasks are harder to review by humans, so our tolerance to errors will decrease. This is the potential "LLM uncanny valley hypothesis"📉📈
7
16
197
@spolu
Stanislas Polu
1 year
It's game over. Fabrice Bellard[0] is into LLM inference now: h/t @natfriedman Look no further for cost-saving and speed🔥 GPUs go brrr. [0]
9
14
162
@spolu
Stanislas Polu
1 year
There will be an opensource Chinchilla-style LLM released this year at the level of text-davinci-*. Maybe not from the ones we expect🤔This will obliterate ChatGPT usage and enable various types of fine-tuning / soft-prompting and cost/speed improvements.
2
5
145
@spolu
Stanislas Polu
1 year
I've used this math problem and others to track models' performance (and "hence" size):
👍 latest GPT-3.5 gets it right ~50% of the time.
🔥 GPT-4 gets it right ~100% of the time.
🙃 Bard got it right ~10% of the time.
Tweet media one
7
8
121
@spolu
Stanislas Polu
1 year
AI assistants will pop-up in every existing product silo. Existing AI-first company defensibility will be battle-tested, big time.
2
1
110
@spolu
Stanislas Polu
1 year
Oh and we finally get good papers on "science of fine-tuning", aka, scaling laws on fine-tuning based on pre-training / fine-tuning compute and pre-training / fine-tuning data size
10
1
105
@spolu
Stanislas Polu
1 year
XP1 has hit 100 paying customers 🚀
🦾 EAs are using it to generate call prep bios
🗃️ Investors are using it to turn lists of companies in emails into CSVs
🏗️ Customer support operators are using it to generate replies based on stock availability
How will you use it? Next steps👇
5
8
102
@spolu
Stanislas Polu
1 year
Maybe RLHF is not only a great alignment strategy but also a powerful distillation strategy. Let me explain...
3
5
100
@spolu
Stanislas Polu
1 year
It has to be right. Right? Is ChatGPT a new text-based OS? Are we going to speak/write to our computers all day? Or are we still going to use UIs/UXs? UIs/UXs, but they have to be reinvented with LLMs in mind. Some thoughts on that question and the future of computing.
@thesephist
Linus
1 year
This is your periodic reminder that user interfaces are important, and text is a good lowest common denominator, not the endgame. The world and our senses have a lot more to offer.
16
27
382
17
7
95
@spolu
Stanislas Polu
7 months
Outer model loop is probably the most underestimated research opportunity. CoT, ToT, Code interpretation, is not the end of the story, more likely the beginning. I genuinely believe that any kid in a garage with 10k of credits has a fair chance of defining the future of this…
@ChrSzegedy
Christian Szegedy
7 months
I think Yann might underestimate the potential of AI if people have API access to strong generative AI. LLMs are capable of generating code which could be executed *automatically* by *anyone* without any human *oversight*, also in a loop and open-endedly. This is very hard to…
43
64
481
4
10
93
@spolu
Stanislas Polu
1 year
Product idea: recursive summarisation as an inference to consume content💡 Start from the summary, click sentences to expand into intermediary summaries, recursively all the way down to the original content. Deep-dive & rapid fly through. Any content.
@spolu
Stanislas Polu
1 year
A lot of people have been asking me lately about summarization of long documents. Created a Dust app to help people understand the technique: The steps are easy to follow, perfect intro to understanding how to circumvent LLMs limited context size.
12
28
248
10
11
90
@spolu
Stanislas Polu
1 year
"Memorizing transformer"-style or similar approaches become predominant as they are always up-to-date and infinite in size. They enable a new wave of products that can operate on entire company datasets, entire codebases, or entire chat histories.
5
8
92
@spolu
Stanislas Polu
1 year
Context size becomes the clear limiting factor for many use-cases. OpenAI releases a massive context-size model in 2023 and creates a moat for 6 months, time for open-source models to catch up and even prevail, likely with a memorizing-transformer-style approach.
3
2
91
@spolu
Stanislas Polu
1 year
Agreed. All I want(ed) for Christmas is the following API:
- create vector db
- add doc to vector db
- completion(prompt, vector_db)
@danlovesproofs
Dan Robinson
1 year
RETRO models are a giant capability unlock for LLM tech, and they're shockingly under the radar. The first ones should come out this year. They might be even more significant than GPT-4.
15
45
371
8
5
87
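That wished-for API can be sketched end to end. Everything below is a toy stand-in: a hypothetical `VectorDB` class with a bag-of-words "embedder" in place of a real embedding model, and a `completion` that simply returns the retrieval-augmented prompt rather than calling an LLM.

```python
import math
from collections import Counter

def embed(text):
    # toy unit-norm bag-of-words vector; a real system would call an embedding model
    c = Counter(text.lower().split())
    norm = math.sqrt(sum(v * v for v in c.values()))
    return {w: v / norm for w, v in c.items()}

def cosine(a, b):
    return sum(a[w] * b.get(w, 0.0) for w in a)

class VectorDB:
    def __init__(self):
        self.docs = []                          # (vector, text) pairs

    def add(self, text):
        self.docs.append((embed(text), text))

    def search(self, query, k=2):
        q = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(q, d[0]), reverse=True)
        return [text for _, text in ranked[:k]]

def completion(prompt, vector_db):
    # real version: stuff retrieved chunks into an LLM call; here we just
    # return the retrieval-augmented prompt to show the plumbing
    context = "\n".join(vector_db.search(prompt))
    return f"CONTEXT:\n{context}\n\nPROMPT: {prompt}"

db = VectorDB()
db.add("Dust orchestrates calls to large language models")
db.add("RETRO attends to an external database at inference time")
print(completion("how does RETRO use a database?", db))
```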
@spolu
Stanislas Polu
4 years
GPT-f also found new short proofs that were accepted into the main Metamath library, which is to our knowledge, the first time a deep learning based system has contributed proofs that were adopted by a formal mathematics community.
2
13
87
@spolu
Stanislas Polu
1 year
(as per above) ChatGPT will remain free until it dies out. Replaced by a paid-for equivalent based on (a) better model(s). The balance of the force will be restored at this stage. So it's not such a silly strategy to build a product for free on ChatGPT 🤔... dangerous game still.
3
2
87
@spolu
Stanislas Polu
10 months
An intuitive high level take on why RL does not work (quite yet for real) for language modelling. It’s super hard to “generate knowledge” with MCTS or RL applied to LM. RL on LM works to align (like a final late sanding of the model) because it just rewires the model against…
@typedfemale
typedfemale
10 months
i swear i've seen "use MCTS for language modeling" idea like 6 different times on twitter but no one has tried it because MCTS is a major pain in the ass 😜
Tweet media one
17
0
99
2
18
80
@spolu
Stanislas Polu
3 years
📓Today we're releasing The goal of the project is to provide an overdue shared benchmark to evaluate and directly compare automated theorem proving systems based on the formal systems targeted, initially Lean, and Metamath.
2
15
83
@spolu
Stanislas Polu
1 year
By the end of 2023 at least one state actor starts a program to compete with OpenAI, Meta AI and Google/DeepMind.
6
5
81
@spolu
Stanislas Polu
1 year
AGI does not pop up yet (even if one of the big labs has an internal LLM AlphaZero moment in 2023). Investor craziness dies out, short-context LLM assistants (as per above #1 #3 ) are heavily commoditized. Still, 1-2 massive companies start to emerge in the B2B space #product
1
2
77
@spolu
Stanislas Polu
11 months
It takes a village to raise a child. We have the talent and we're only getting started. 🇫🇷🦾
@Laurent_Daudet
Laurent Daudet
11 months
Deeply honored to have been invited, together with top AI experts, at Élysée palace to discuss AI strategy with President Macron.
Tweet media one
8
8
132
9
4
76
@spolu
Stanislas Polu
1 year
@johnjnay @TheEconomist @stateofaireport 3k papers per year vs 1 ChatGPT
5
1
69
@spolu
Stanislas Polu
9 months
There’s a fascinating paper from the early days of OpenAI written by @ilyasut and @johnschulman2 from which most of this talk is inspired. I thought about it many times since I left OpenAI. Now that it is 99% spoiled by this talk, would love to see it released 🙏
@DrJimFan
Jim Fan
9 months
There're few who can deliver both great AI research and charismatic talks. OpenAI Chief Scientist @ilyasut is one of them. I watched Ilya's lecture at Simons Institute, where he delved into why unsupervised learning works through the lens of compression. Sharing my notes: -…
55
433
3K
8
5
68
@spolu
Stanislas Polu
2 years
After 4 weeks of intensive coding, v0 of Dust is live with a first demo app! I've poured my heart in that product, now is time to get feedback!
@dust4ai
Dust
2 years
Dust is Live🔥 As a first demo: a Dust app showcasing the now famous capability of models to generate code to answer math questions. (some comments in 🧵)
6
25
119
3
3
67
@spolu
Stanislas Polu
1 year
LLM based assistants will automate more complex tasks than current workflow automation. But another (less obv) value is that they might automate the automation process, reducing the cost of automation to ~0 (even for simple tasks) Unclear which one will unlock the most value?
5
6
66
@spolu
Stanislas Polu
1 year
Generative AI capabilities are moving extremely fast. Don’t fear future models, embrace them. I think it means some amt of:
- Building integrations
- Exploring truly novel UX
- Inserting yourself at the point of intent
Future improvs will be tailwinds for you not tsunamis.
5
5
65
@spolu
Stanislas Polu
7 years
Developer friends! Wish you could interact with co-workers with higher bandwidth than slack or video-conference?
Tweet media one
2
18
61
@spolu
Stanislas Polu
2 years
"Autonomous vehicles from non-egocentric sensors in cities" Prediction: any company can leapfrog Tesla (massive dataset) in cities by building an AV stack from non-egocentric sensor inputs (cameras on buildings instead of the car). Infinite stream of expert trajectories.
4
1
62
@spolu
Stanislas Polu
1 year
This is the first research question. Is it possible to train a model to productively attend to a database of content that is not drawn from its training distribution but from a database of arbitrary data, like internal proprietary data, emails, slack messages, ...
5
6
65
@spolu
Stanislas Polu
1 year
I hear left right and center, and believe myself, that there is a super strong need in the market right now for a Chinchilla style RETRO-style model. First, chinchilla style, because we want davinci-003 performance at 70B parameters (which ~ what saturates 1 GPU at inference).
2
0
62
@spolu
Stanislas Polu
2 years
Broadly speaking our work demonstrates that neural networks are indeed capable of advanced mathematical reasoning when used in conjunction with a verifier (the formal system).
1
4
61
@spolu
Stanislas Polu
1 year
Asking for a friend... what if they're also ex-Stripe?😂
@bonatsos
Niko Bonatsos
1 year
2/ - Did the founders work at OpenAI and already have a product in place at their newco? Let’s make them a unicorn! #vclogic
1
1
24
6
0
56
@spolu
Stanislas Polu
1 year
LLMs performing useful tasks will accelerate w/ the emergence of “LLM apps”: structured orchestration of calls to LLMs and external data-sources. @dust4ai is a great place to design and deploy LLM apps. But for now such apps have not been “accessible” like ChatGPT is… 👇
1
4
57
@spolu
Stanislas Polu
10 months
As shareholders of @dust4ai it was no question we had to make a job offer to @uzpg_ at the end of his freshman year internship. Despite him being only 19 yr old, by Stripe or OpenAI standards, he already is an incredibly accomplished and mature software engineer with deep…
@uzpg_
Uzay
10 months
Finished my internship at @dust4ai today! was a great experience, really enjoyed working with everyone and being mentored by @spolu , and am excited about where they're going. I worked on Gens (see image), a writing interface that does a deep search on your data, where (1/n)
Tweet media one
4
2
52
1
4
53
@spolu
Stanislas Polu
1 year
If ChatGPT is Pong for LLMs, this is at least Space Invaders level🤯
- Facts extraction
- Automated "Database" creation and population for quantitative questions
- Recursive Q&A for qualitative questions
👏 @markiewagner @yasyf
@markiewagner
Markie Wagner
1 year
@yasyf and I built Summ, an open-source tool that provides intelligent search and question-answering across large sets of transcripts. ⚡️ We turn your unstructured transcripts into queryable SQL and JSON! Try it out and read how we did it here:
21
38
253
1
7
56
@spolu
Stanislas Polu
9 months
There’s a world in which we don’t need larger models but new algorithms on top of existing models that already exhibit base reasoning capabilities. Let’s not forget that AlphaZero is at the same time a pretty weak player without MCTS and basically unbeatable by a massive…
5
6
55
@spolu
Stanislas Polu
1 year
The company, Permutation Labs incorporated on 23/02/2023. True story. Under MUH (Ultimate Ensemble Mathematical Universe Hypothesis) coincidences are highly unlikely! So either Dust is very special or our Universe is 🙃
@dust4ai
Dust
1 year
Dust is now incorporated \o/ The company name is « Permutation Labs » for all the ones that have a « theory » as to why 🤟 We’re building a very special team to tackle a very special problem, with very special partners. Don’t be a stranger. Reach out.
3
2
70
1
2
52
@spolu
Stanislas Polu
2 years
We need a system that connects models to structured LLM apps and access rights to the services where users data live and where they want to take action. …and a mechanism to orchestrate these interactions, probably, once again, leveraging one central LLM app.
Tweet media one
2
3
56
@spolu
Stanislas Polu
2 years
The most exciting detail to me is that we use language models that are capable of generating original mathematical terms required to prove most interesting statements. So the system demonstrates ingenuity 💥
1
3
53
@spolu
Stanislas Polu
1 year
Lots of hints on the future of ChatGPT in We can expect web navigation and iterative code execution at least. What else? 🤔 That is the question🔥
Tweet media one
1
5
55
@spolu
Stanislas Polu
2 years
"Image/Video editing from text" The tech is ready all we miss is a great product.
9
2
53
@spolu
Stanislas Polu
2 years
Using LLMs with 4096 Tokens contexts feels strangely analogous to coding a game with 4096 KB RAM. Mastery and trickery will advance the state of what we can do until we get our GBs of context. Since you're asking, yes I did listen to the 5h of @ID_AA_Carmack / @lexfridman 🙈🙉
5
3
55
@spolu
Stanislas Polu
2 years
We’re entering a new weird era where the dilemma « should I go solve X » or « should I go work on foundational AI so that machines solve X » will become harder and harder to resolve. And it probably only gets weirder from here.
2
2
52
@spolu
Stanislas Polu
6 months
@sidorszymon @sama @miramurati Nice opportunity to re-negotiate your compensation 🔥❤️
1
0
52
@spolu
Stanislas Polu
10 months
This is the usage of the Dust product suite, basically the metric we're optimizing for. It's so early that this will look insignificant soon. But if there's something awesome about working on a startup project, it's seeing the metric you optimize go up and to the right✨
@dust4ai
Dust
10 months
📈
Tweet media one
1
1
18
5
2
50
@spolu
Stanislas Polu
1 year
There’s been a disturbance in the force today. The future of Paris AI is incredibly bright.
3
6
53
@spolu
Stanislas Polu
1 year
🤔
Tweet media one
@spolu
Stanislas Polu
2 years
Prediction: the first individual business to become a unicorn will start in 2022. It’ll be heavily based on Language Models and a product that has yet to be built and consists in something like PromptChainer () meets Zapier.
3
1
42
4
5
52
@spolu
Stanislas Polu
1 year
Second, RETRO-style, not to lower perplexity (attend to the training set in the style of RETRO), and this is crucial, but to be able to condition on arbitrary, hot-swappable, vector databases.
4
1
52
@spolu
Stanislas Polu
2 years
{Figma} ⊕ {Zapier} for LLM productization. "Prompt engineers" paradise: Lets you connect model APIs, define processing steps with prompts, manage few-shot datapoints for each; exposes an endpoint or a stream in the end.
6
1
51
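One hypothetical shape for such a tool: each processing step is a prompt template plus optional few-shot examples, and a chain of steps is exposed as a single callable endpoint. All names below are invented for illustration, and a toy echo function stands in for the model API.

```python
# Hypothetical sketch of the prompt-chaining product idea: steps pipe into
# each other and the whole chain behaves like one endpoint.
def make_step(template, few_shot=()):
    def step(llm, value):
        # prepend few-shot examples, then fill the template with the input
        shots = "".join(f"IN: {i}\nOUT: {o}\n" for i, o in few_shot)
        return llm(shots + template.format(input=value))
    return step

def chain(steps):
    def endpoint(llm, value):
        for step in steps:
            value = step(llm, value)   # output of one step feeds the next
        return value
    return endpoint

# toy "model" that echoes the last line of its prompt, to show the plumbing
echo_llm = lambda prompt: prompt.strip().splitlines()[-1]
pipeline = chain([
    make_step("Extract the company name from: {input}"),
    make_step("Write a one-line pitch for: {input}"),
])
print(pipeline(echo_llm, "Dust raised $5m"))
```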
@spolu
Stanislas Polu
2 years
We built a neural theorem prover that is just better than I am on a large variety of problems. It can solve problems that I was never able to solve (though I'm not a math competitor). It also fails at times on problems that look obvious to me. So we still have a long way to go!
1
2
50
@spolu
Stanislas Polu
1 year
As users, we stop "experiencing" the magic of scale as larger and larger models are released (after all it's a power law).
2
1
49
@spolu
Stanislas Polu
2 years
"A game platform where you train agents and make them compete" Tamagotchi meets Fantasy Football meets competitive gaming. I checked: it's possible to train a decent Pong agent (in browser) in under 30 games, and it's fun to watch them compete. Some ideas:
3
3
49
@spolu
Stanislas Polu
6 months
The pitch to VCs:
1. Build an inference API
2. Profit
The reality:
1. Build an inference API
2. Find GPUs
3. Add support for top 10 open source foundational models
4. Build your alignment pipeline and chat format
5. Add function calling
6. Make it stateful (you can continue…
3
3
46
@spolu
Stanislas Polu
2 years
"Support team AI augmentation" This one I wouldn't use nor build myself. But with current tech, more than 1-2 support humans per 10m users is just a shame. Yes, fewer than 1k humans to support the entire world. Most productive support teams use TextExpander today 🤔
2
1
48
@spolu
Stanislas Polu
1 year
10GB was pure madness when Gmail was first worked on. Google was ready to bankroll its users until storage costs decreased, to create one of the most successful products ever. What is the equivalent for the LLM age? A list of ideas 🧵
3
4
48
@spolu
Stanislas Polu
9 months
Oh! Officially a published first author at a major AI conference. It’s my first “real real paper” (miniF2F was @KunhaoZ and GPTf is unpublished), so I’m kind of psyched. 🤙 @jessemhan @KunhaoZ @ibab_ml @ilyasut Best co-authors.
6
2
41
@spolu
Stanislas Polu
2 months
People ask me (often with a weird look): "But dude, why did you leave OpenAI ??" This is why 👇
@dust4ai
Dust
2 months
The core motivations behind building Dust, how we believe the generative AI ecosystem will evolve and why we are building products at the intersection of models, human and productivity ⤵️ List of the main hypotheses we made so far
3
7
33
2
2
46
@spolu
Stanislas Polu
3 months
We made two hard bets with Dust:
- A horizontal platform with access to all the SaaS relied on by our users (Notion, Github, Slack, Drive, Intercom, ...)
- Not one Assistant, but many Assistants specialized on specific tasks.
- Capability to do semantic retrieval but also…
@dust4ai
Dust
3 months
Éléonore's strategic use of Dust at Pennylane showcases a future where AI and human expertise combine to scale customer care without compromise. Curious? Dive into their journey 💡🖤
2
5
14
2
3
47
@spolu
Stanislas Polu
2 years
"Copilot but for your entire OS" This one is an oldie but it comes back every now and then. Some old ideas here: By-product: opportunity to reinvent the OS
3
1
43
@spolu
Stanislas Polu
2 years
Day 0 update: Dust has a website now 🔥
4
1
44
@spolu
Stanislas Polu
6 months
It's not just about doing RAG on top of company data. It's about building a product that unleashes the creative power of hundreds of employees looking to augment themselves, and catalyzes the dissemination of these new assistive use-cases across teams. We're the best bet a…
@dust4ai
Dust
6 months
Alan ( @avec_alan ) has been a tremendous partner. We're proud to enable hundreds of Alaners to equip themselves with super powers in their daily work life. The numbers speak for themselves and are evidence that the right product put in the hands of the right people can realize…
Tweet media one
1
2
32
4
3
45
@spolu
Stanislas Polu
6 months
We (at least I) have a strong humano-centric tendency to over-estimate human intelligence and under-estimate artificial intelligence. I have no strong evidence that "raw" artificial intelligence is noticeably inferior to "raw" human intelligence. If I think hard about it, I can…
6
5
44
@spolu
Stanislas Polu
1 year
Just tried the new Bing. One thing the videos don’t show is how slow the search/retrieval process is. To be clear it’s rather fast given the work performed but in a “web search” context it feels like waiting forever (like 4-5s)… Interesting product trade-off.
7
2
44
@spolu
Stanislas Polu
2 years
Prediction: the first individual business to become a unicorn will start in 2022. It’ll be heavily based on Language Models and a product that has yet to be built and consists in something like PromptChainer () meets Zapier.
3
1
42
@spolu
Stanislas Polu
11 months
Couldn’t have put it better 🙌 @romaindillet We’re after making work work better, easier and more fun, with LLMs at the core. Oh and yeah we raised $5m to do just that 👌Welcome onboard @KostaBuhler 🔥
@TechCrunch
TechCrunch
11 months
Dust uses large language models on internal data to improve team productivity by @romaindillet
1
7
30
3
2
43
@spolu
Stanislas Polu
3 years
📔 New MiniF2F paper! Introduces MiniF2F, a benchmark of Olympiad-level problem statements formalized in Lean/Metamath/Isabelle.
GPT-f applied to MiniF2F/Metamath ~ 2% 🥶
GPT-f applied to MiniF2F/Lean ~ 29% 🔥
Code: 👇
Tweet media one
2
9
42
@spolu
Stanislas Polu
11 months
4
0
42
@spolu
Stanislas Polu
11 months
Ever wondered how ChatGPT can be so fast? It's likely thanks to "speculative sampling". Makes 30B and 65B models much more usable... easily adaptable to the Falcon family as well.
@dust4ai
Dust
11 months
Speculative Sampling FTW🔥 Use a small model to draft for a larger one to achieve up to 3x speed-up (h/t DeepMind). We tried it for ourselves with🦙models. Check out the blog post🗞️or if you want to try it yourself, our open source implementation💻
4
22
149
3
2
42
@spolu
Stanislas Polu
11 months
And the state is not kidding about that. For the first time in my life I had the opportunity to say something one week and see the highest instance of the state make an announcement based on it the next week 🤯 France literally went over this decade from a “complain about…
@dust4ai
Dust
11 months
We decided to incorporate in 🇫🇷 because we believe we can seed and build a global enduring company from there. We’ll be pushing with the rest of these companies for a strong vibrant and productive AI ecosystem there 💪 Awesome that the state is in with us on this!
1
3
42
4
7
40
@spolu
Stanislas Polu
2 years
Looking at the replies to that tweet, one thing is clear. This is only the beginning. This is the iPhone in 2008; I was there, I was even interning at Apple at the time. It was cool but no one had any clue how big it would get.
@ProductHunt
Product Hunt 😸
2 years
Drop the coolest AI tool ⬇️
95
37
284
0
5
38
@spolu
Stanislas Polu
1 year
If I were to travel back in time to tell my OpenAI self something, I would urge myself to work on this. Researchers of the world, please unite! for a richer future of productive LLM deployment! 😅🙏
3
0
39
@spolu
Stanislas Polu
2 years
Automatically produce fixes based on stack traces. GitHub - Codex - Sentry. High signal input. Can show model relevant part of code only (no repo finetuning). Can prolly create good fixes most of the time for that long tail of 500s everyone wants fixed but no one fixes.
2
2
38
@spolu
Stanislas Polu
1 year
The biggest technological rupture introduced by LLMs in enterprise setups will be the disentanglement of how information is produced / stored and how it is consumed 🧵
3
2
38
@spolu
Stanislas Polu
6 months
We are in good shape with the P of PMF. Pushing on M now. This week: - We opened our self-serve upgrade flow - Turned most of our design partners into paying customers Dust is the go-to platform for companies looking to place a bet on AI for their team. Also it's a ton of fun.
@dust4ai
Dust
6 months
sitrep
Tweet media one
0
1
7
1
2
37
@spolu
Stanislas Polu
2 years
Auto-formalisation FTW! Excited by the progress on miniF2F (at this rate we’ll soon need a bigF2F!). What a week!
@Yuhu_ai_
Yuhuai (Tony) Wu
2 years
After showing a few examples, large language models can translate natural language mathematical statements into formal specifications. We autoformalize 4K theorems as new data to train our neural theorem prover, achieving SOTA on miniF2F! 1/ Paper:
Tweet media one
Tweet media two
5
111
459
0
3
37
@spolu
Stanislas Polu
1 year
The goal is a fully programmable LLM-based assistant. Customisation coming in the form of loadable LLM apps able to pull your data, use your tone, output the structure you expect.
@dust4ai
Dust
1 year
XP1 v0.2.0 Incoming Fully revamped tab search + ability to focus on selected text in tabs. Next step, persistence & ability to run Dust apps. Will enable eg. recursive summarisation from N tabs, customising tone / structure when generating content… 👉
0
2
20
2
0
37
@spolu
Stanislas Polu
4 years
The work is motivated by the possibility that a major limitation of automated theorem provers compared to humans – the generation of original mathematical terms – might be addressable via generation from language models.
1
2
37
@spolu
Stanislas Polu
2 years
Making progress on Dust v0! Had forgotten how daunting frontend programming is at first (starting from blank page). So many choices to be made. But the more you make progress the fewer choices are needed at each step, and initial drag turns into velocity. Exhilarating feeling.
Tweet media one
3
1
35
@spolu
Stanislas Polu
7 months
Learn how to get acquired by Stripe and join OpenAI 3 years before ChatGPT 🍿
@mattturck
Matt Turck
7 months
[New on the MAD pod!🎙️] Really fun, no BS chat with @spolu about starting @dust4AI , his experience at Stripe and OpenAI, the path to Generative AI success in the enterprise and why France has such strong AI talent. 🔥 *** All links in first comment below***
2
5
21
1
8
36
@spolu
Stanislas Polu
4 years
We achieve a new state of the art for the Metamath environment with our best model capable of closing 56.22% of proofs from a held-out test set (vs 21.16% for the current state of the art, MetaGen-IL).
2
3
35