Sahil Chaudhary Profile
Sahil Chaudhary

@csahil28

1,584 Followers
437 Following
27 Media
432 Statuses

Building @glaiveAI

Global
Joined March 2013
@csahil28
Sahil Chaudhary
1 year
This past weekend I finetuned LLaMA 7B and 13B following the Stanford Alpaca repo, on 20k code generation/editing/optimization instructions. The 13B model performs impressively on small-scoped, well-defined instructions. I have released the code and data here -
7
38
230
@csahil28
Sahil Chaudhary
1 year
Releasing instruct-codegen-16B today. It is a finetuned version of codegen-16B-multi on a dataset of 250k alpaca style codegen instruction samples, and achieves a pass@1 of 37.1%
5
24
165
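For readers unfamiliar with the metric quoted above: pass@1 is usually reported with the unbiased pass@k estimator popularised by the HumanEval benchmark. The following is a minimal sketch of that estimator, not the evaluation code behind the 37.1% figure; the per-problem tallies in `example` are made up for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples generated, c of them pass the tests."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the chance that a random size-k subset
    # of the n samples contains no passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; results holds (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical tallies for three problems, 10 samples each (illustrative only).
example = [(10, 10), (10, 3), (10, 0)]
print(f"pass@1 = {mean_pass_at_k(example, 1):.3f}")
```

With k = 1 the estimator reduces to the fraction of passing samples per problem, averaged over problems.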
@csahil28
Sahil Chaudhary
1 year
CodeAlpaca dataset now available on @huggingface hub -
6
34
161
@csahil28
Sahil Chaudhary
11 months
Excited to announce that I'm building @GlaiveAI, helping companies train use-case specific small language models with the help of synthetic data, with the goal of commoditising language models
10
13
144
@csahil28
Sahil Chaudhary
1 year
Built a gradio demo to try out CodeAlpaca on huggingface spaces -
3
33
136
@csahil28
Sahil Chaudhary
9 months
Really excited to share that @glaiveAI has raised a $3.5M seed round led by @sparkcapital with participation from @villageglobal and @amasad
8
11
122
@csahil28
Sahil Chaudhary
8 months
Releasing a new code model and dataset today, glaive-coder-7b and the 130k+ samples used to train the model licensed as apache-2.0. Instead of just reporting the benchmarks, we also release the Code Models Arena to change the way we evaluate code models.
3
17
84
@csahil28
Sahil Chaudhary
11 months
Finetuned replit-code-3B on a glaive generated dataset consisting of 1B tokens gets pass@1 of 63.5
@Teknium1
Teknium (e/λ)
11 months
Okay so @csahil28, the maker of codealpaca and now running the company Glaive, made a dataset of 1B tokens on code generation. It makes Replit3B outperform every opensource code model in the HumanEval benchmark (Code Gen Benchmark). Benchmark has been replicated by @abacaj
16
46
327
5
11
71
@csahil28
Sahil Chaudhary
9 months
Along with the announcement, releasing a 2.7B open source model with function calling abilities similar to gpt-3.5 while being significantly smaller.
8
10
58
@csahil28
Sahil Chaudhary
1 year
For anyone wanting to use CodeAlpaca as an api, you can deploy this template on @BananaDev_ in <5 mins -
3
8
55
@csahil28
Sahil Chaudhary
1 year
@O42nl Nothing beats GPTrillion
0
0
22
@csahil28
Sahil Chaudhary
2 years
Working on some very exciting tech @BananaDev_ to unlock a true serverless gpu experience, this is what cold starts on banana are gonna look like soon
1
1
21
@csahil28
Sahil Chaudhary
1 year
Really impressive model, try it out on banana -
@_akhaliq
AK
1 year
Riffusion, real-time music generation with stable diffusion @huggingface model: project page:
64
628
3K
1
4
20
@csahil28
Sahil Chaudhary
1 year
Starting a blog series on how to write CUDA kernels to improve model performance, first one is an introduction to basic concepts and terms
1
1
18
@csahil28
Sahil Chaudhary
7 months
@minimaxir released an open source function calling model + data a couple of months ago. Also working on the next version of the model with @abacaj
0
0
19
@csahil28
Sahil Chaudhary
8 months
Rankings are now live on the Glaive Code Arena, seeing some interesting results
5
1
17
@csahil28
Sahil Chaudhary
11 months
Though HumanEval isn’t necessarily indicative of how well the model will do for users in real-life use cases, having a pass@1 higher than all open source models with just a 3B model and 1B tokens shows how good small models can get given high quality data
4
4
17
@csahil28
Sahil Chaudhary
9 months
You can run the glaive-function-calling model in production using @BananaDev_
@BananaDev_
Banana
9 months
You may have seen @csahil28's announcement for @GlaiveAI and his new 2.7B open source chat model with function calling abilities. Well, we've got a playground in which you can try it out right now: 🔗
0
2
9
0
1
16
@csahil28
Sahil Chaudhary
1 year
Time for a whisper hackathon - I’m sharing the model key to call whisper hosted on banana, free to use over the weekend. Use it to build a project and share your demo here by Sunday. Best demo gets $100 worth of @BananaDev_ credits
2
6
17
@csahil28
Sahil Chaudhary
11 months
Also, no closed source api was used for data generation
3
0
15
@csahil28
Sahil Chaudhary
2 years
@_dmca @Suhail @BananaDev_ We have stable diffusion inferences running in ~2 secs (~5sec with cold boot), let me know if you’d like to try turboboot -
@erikdunteman
Erik Dunteman 🍌
2 years
Introducing @BananaDev_ GPU TurboBoot 🔥🥾 Get 1 second cold boots for large ML models.🤯 yes, REALLY. (we spent months on this) Here's a clip of Stable Diffusion cold boot AND inference in <4.5s. TurboBoot beta access here👇
15
19
199
2
2
15
@csahil28
Sahil Chaudhary
1 year
reached front page of HN
0
2
14
@csahil28
Sahil Chaudhary
1 year
The data for this was not generated using any LLM api so this model is fully suitable for commercial use as well.
3
0
13
@csahil28
Sahil Chaudhary
1 year
Reducing cold boot time has always been one of the most important engineering problems at @BananaDev_, and with turboboot we make significant progress towards a serverless experience for gpus which is as fast as having always-on machines
@BananaDev_
Banana
1 year
Announcing, Turboboot. 🎉🍌 For a year we’ve been grinding R&D, on a quest to reduce serverless GPU cold boots. Turboboot is the result. This new orchestrator now powers Banana's serverless GPUs, and is available for ALL customers. details:
2
17
67
0
3
12
@csahil28
Sahil Chaudhary
11 months
@RickLamers @Teknium1 @goofirnoth @abacaj I’ll be publishing the data (hopefully) within this week. I fully verified that there are no exact matches with HumanEval
2
0
12
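The tweet above does not say how the exact-match check was done. As an illustration only, one simple way to look for exact overlaps between training samples and benchmark solutions is to normalise whitespace and compare hashes. All names here (`normalize`, `fingerprint`, `find_exact_matches`) are hypothetical, and this catches only exact duplicates, not paraphrased or near-duplicate code.

```python
import hashlib


def normalize(code: str) -> str:
    """Drop trailing whitespace and blank lines so formatting alone cannot hide a match."""
    lines = [line.rstrip() for line in code.strip().splitlines()]
    return "\n".join(line for line in lines if line)


def fingerprint(code: str) -> str:
    """Stable hash of the normalised code."""
    return hashlib.sha256(normalize(code).encode("utf-8")).hexdigest()


def find_exact_matches(train_samples: list[str], benchmark_solutions: list[str]) -> list[int]:
    """Return indices of training samples whose normalised text equals a benchmark solution."""
    bench = {fingerprint(s) for s in benchmark_solutions}
    return [i for i, s in enumerate(train_samples) if fingerprint(s) in bench]


# Toy data (illustrative only): the first training sample duplicates the benchmark entry.
train = ["def add(a, b):\n    return a + b\n", "def sub(a, b):\n    return a - b"]
bench = ["def add(a, b):   \n\n    return a + b"]
print(find_exact_matches(train, bench))
```

An empty result from such a check rules out verbatim copies only; fuzzier contamination would need n-gram or embedding-based comparison.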
@csahil28
Sahil Chaudhary
11 months
This is why I decided to build Glaive, to help companies get high quality data at a fraction of the cost and time and train small models on this data, achieving higher quality, lower latency and lower costs than LLM APIs
1
1
12
@csahil28
Sahil Chaudhary
2 years
checkout- Generate research (abstracts, for now) using AI, going to add full length paper generations soon
2
3
10
@csahil28
Sahil Chaudhary
1 year
You can try out the model using the web demo- and also on banana -
1
1
9
@csahil28
Sahil Chaudhary
11 months
While I'm looking forward to this new adventure, I'm sad about leaving the wonderful people @BananaDev_. I've learnt a lot over the past couple of years and did some of the best work of my life along with them
0
1
10
@csahil28
Sahil Chaudhary
1 year
Built a tool to get prompts from stable diffusion generated images -
5
0
9
@csahil28
Sahil Chaudhary
1 year
The data was generated using text-davinci-003, similar to Alpaca, using a slightly modified initial prompt -
3
1
8
@csahil28
Sahil Chaudhary
9 months
DM if you’re interested in this
@mattshumer_
Matt Shumer
9 months
. @csahil28 and I may have access to a 32x A100 80GB cluster for 6 months. Looking for a few startups that might want to go in on it together and share the time. More details below. DM me if interested.
1
2
15
0
0
8
@csahil28
Sahil Chaudhary
1 year
@amasad @Teknium1 Just ran and got pass@1 of 0.262
2
0
7
@csahil28
Sahil Chaudhary
1 year
For no specific reason, my go to test for a new llm is always to have it explain CUDA and write basic kernels. I just generated a 4 part cuda tutorial series using GPT-4 in under 30 mins, and it's pretty good when compared to other models i've tried -
1
0
7
@csahil28
Sahil Chaudhary
6 years
Just Published! Discussed how iOS 12 and the new iPhone series will impact on-device machine learning from a developer's perspective
0
0
5
@csahil28
Sahil Chaudhary
1 year
Run StableLM in prod using @BananaDev_
@BananaDev_
Banana
1 year
StableLM is here! 👀🔥 deploy it in minutes with our template 👇
3
3
55
0
0
7
@csahil28
Sahil Chaudhary
1 year
👀
1
0
6
@csahil28
Sahil Chaudhary
10 years
I’m following France versus Germany in the FIFA Global Stadium #FRAGER #worldcup #joinin http://t.co/LqXHss8EKV
0
0
3
@csahil28
Sahil Chaudhary
2 years
only 🥕 demo that matters
2
0
5
@csahil28
Sahil Chaudhary
9 months
@tomchapin @jxnlco Working on building this @GlaiveAI
0
0
6
@csahil28
Sahil Chaudhary
11 months
Having spent the past 2 years building ML hosting infra @BananaDev_, I worked with companies to get their models in prod, and saw patterns emerge where companies want to move to custom models but are bottlenecked by high quality use-case specific data
1
1
6
@csahil28
Sahil Chaudhary
6 years
A really nice way to learn about sql injections using swift playgrounds!
@aroraharshita33
Harshita Arora
6 years
I created Alice in codeLand for WWDC '18 student scholarships. It's an offline hacker simulator and a fun story of a hacker that teaches about SQL injections. Check it out here: … And video Feedback is welcome :)
5
2
26
0
1
4
@csahil28
Sahil Chaudhary
2 years
@erikdoingthings Probably more credit than I deserve but it’s twitter so I’ll take it
0
0
5
@csahil28
Sahil Chaudhary
1 year
@ajohansson1981 Not yet, but I’ll probably release it soon
1
0
5
@csahil28
Sahil Chaudhary
5 years
Some really cool apps in the list!
@fritzlabs
Fritz AI
5 years
Contributor @thesupercoder_ put together this excellent list that represents some of his favorite #iOS camera apps in 2018—photos, videos, GIFs, and more. Some really cool tools in here!
0
0
4
0
0
4
@csahil28
Sahil Chaudhary
2 years
Can confirm the hype, super excited for this
@erikdunteman
Erik Dunteman 🍌
2 years
Convinced our founding engineer to move to Mexico City. He arrives Thursday. Very big hype.
0
0
16
0
0
4
@csahil28
Sahil Chaudhary
11 months
@amasad @khandelia1000 @Teknium1 I fixed the early cutoff problem here by adding logit bias so it doesn’t generate the eos token too early, should improve quality
1
0
3
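The fix described above is not shown in the feed. As a rough sketch of the general idea, a negative bias can be added to the end-of-sequence logit so that decoding does not stop too soon. The `min_new_tokens` threshold and the bias value below are assumptions for illustration; the original fix may have applied the bias differently.

```python
import math


def apply_eos_bias(logits, eos_id, step, min_new_tokens, bias=-100.0):
    """Return a copy of logits with a negative bias on EOS while step < min_new_tokens."""
    biased = list(logits)
    if step < min_new_tokens:
        biased[eos_id] += bias
    return biased


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(logits):
    """Index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Toy vocabulary of four tokens where token 0 plays the role of EOS (illustrative only).
EOS_ID = 0
raw = [2.5, 1.0, 2.0, 0.5]  # unbiased, greedy decoding would stop here
early = apply_eos_bias(raw, EOS_ID, step=3, min_new_tokens=16)
late = apply_eos_bias(raw, EOS_ID, step=20, min_new_tokens=16)
print(greedy_pick(early), greedy_pick(late))
print(softmax(early)[EOS_ID])
```

Once the step count passes the threshold the logits are left untouched, so the model can still end the sequence normally.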
@csahil28
Sahil Chaudhary
1 year
@CubanBTC I use @LambdaAPI, they are the best cloud provider for gpus in my opinion
1
0
4
@csahil28
Sahil Chaudhary
6 years
I got "You're a Threat Researcher." on "Are you a cybersecurity superstar?". What about you?
0
0
2
@csahil28
Sahil Chaudhary
1 year
Finetuned our carrot image-captioning model on a dataset of stable diffusion prompt-image pairs. Carrot has excellent captioning capabilities and after being trained on SD prompts it seems to do very well on predicting prompts
0
0
4
@csahil28
Sahil Chaudhary
1 year
Will aim to write 1 tutorial per week, probably on weekends, I usually avoid writing of any kind so this is going to be a fun challenge
0
0
4
@csahil28
Sahil Chaudhary
1 year
@osanseviero @OrenElbaum @nxz1z @BananaDev_ @huggingface Yeah it took a long time uploading to HF hub, but the huggingface_hub package made it simple 😁
0
0
2
@csahil28
Sahil Chaudhary
2 years
@erikdoingthings Very grateful to be working with you guys and the entire @BananaDev_ team, looking forward to many more amazing years
0
0
3
@csahil28
Sahil Chaudhary
1 year
This runs whisper large-v2 (same as openai api) hosted on Banana turboboot, with extremely fast cold boots and in most cases comparable (if not faster) inference speed to openai api
1
0
3
@csahil28
Sahil Chaudhary
5 months
@Teknium1 @abacaj @artificialguybr @main_horse Training on glaive-code-assistant data right now
2
0
3
@csahil28
Sahil Chaudhary
2 years
👀
@erikdunteman
Erik Dunteman 🍌
2 years
Introducing @BananaDev_ GPU TurboBoot 🔥🥾 Get 1 second cold boots for large ML models.🤯 yes, REALLY. (we spent months on this) Here's a clip of Stable Diffusion cold boot AND inference in <4.5s. TurboBoot beta access here👇
15
19
199
0
1
3
@csahil28
Sahil Chaudhary
1 year
@wasauce You can deploy custom repos on banana so you can implement any pre/post processing, or any form of batching on top of whisper and deploy that. Aim is to enable generic api level performance on all models running on banana
1
0
3
@csahil28
Sahil Chaudhary
1 year
@O42nl @NVIDIAAI . @erikdunteman we should probably look into open sourcing banana_model_utils, comparable performance and easier to use
0
0
3
@csahil28
Sahil Chaudhary
8 months
Check them out here -
0
0
3
@csahil28
Sahil Chaudhary
2 years
This is true, and feels great
@erikdunteman
Erik Dunteman 🍌
2 years
“Being in a startup at this stage is like being popular in high school” @csahil28, whose DMs are still clogged up with people wanting TurboBoot
2
0
8
0
0
3
@csahil28
Sahil Chaudhary
4 years
Just wrote a tutorial for flutter and tflite, check it out-
0
1
2
@csahil28
Sahil Chaudhary
1 year
Example- "astronaut riding a horse on mars, desert in the background, soft cinematic lighting, 8k, artstation, octane render, cinema 4 dv"
1
1
2
@csahil28
Sahil Chaudhary
2 years
@erikdoingthings All thanks to the wonderful team and culture we have @BananaDev_, enabling people to do their best!
0
0
2
@csahil28
Sahil Chaudhary
6 years
@clarifai
Clarifai
6 years
On the sixth day of #12DaysOfHacks, @clarifai gave to me six trash items sorted automatically 🎁 RT w/ @Clarifai + #12DaysOfHacks for a chance to win swag 📱 Submit your own hack to bounties@clarifai.com for a chance to win @shopforce1rc drone!
0
4
3
0
0
2
@csahil28
Sahil Chaudhary
1 year
@Teknium1 This is great, I’ll try and train this further on the 250k samples
3
0
2
@csahil28
Sahil Chaudhary
1 year
@sethforsgren Would you like to host the public demo on @BananaDev_? Here’s a colab to try it out
0
0
2
@csahil28
Sahil Chaudhary
11 months
@ajohansson1981 @Teknium1 I have a deduped version of that here -
1
0
2
@csahil28
Sahil Chaudhary
6 years
Great opportunity!
@fritzlabs
Fritz AI
6 years
Mobile devs and machine learners—Heartbeat has a new, updated call for contributors! Check it out and become a leading voice in mobile machine learning.
0
5
9
0
0
1
@csahil28
Sahil Chaudhary
6 years
#12DaysOfHacks @clarifai Wanting to win oodles of swag
@clarifai
Clarifai
6 years
On the second day of #12DaysOfHacks, Clarifai gave to me two carbon footprints from our AI @Terrabeasts 🎁 RT w/ @Clarifai + #12DaysOfHacks for a chance to win swag 📱 Submit your own hack for a chance to win @shopforce1rc drone
0
1
4
0
0
2
@csahil28
Sahil Chaudhary
2 years
@analyseian @BananaDev_ Had rate limited it, should be good now
1
0
2
@csahil28
Sahil Chaudhary
11 months
@ajohansson1981 @Teknium1 I did but have had multiple runs fail, so will try again soon
1
0
2
@csahil28
Sahil Chaudhary
6 years
5th day in a row. Some great hacks. @clarifai #12DaysOfHacks
@clarifai
Clarifai
6 years
On the fifth day of #12DaysOfHacks, Clarifai gave to me five gift recommendations for my friends and family 🎁 RT w/ @Clarifai + #12DaysOfHacks for a chance to win swag 📱 Submit your own hack for a chance to win @shopforce1rc drone
0
4
2
0
0
2
@csahil28
Sahil Chaudhary
1 year
@ajohansson1981 @Teknium1 Will start the training run today
1
0
2
@csahil28
Sahil Chaudhary
8 months
Given the recent questions around benchmarks for code models, it is clear that existing benchmarks are not sufficient and this is why we release the Code Models Arena, to let users vote on model outputs so we can have a better understanding of user preference on code models.
1
0
2
@csahil28
Sahil Chaudhary
2 years
Model is great, hosting is even better
0
0
2
@csahil28
Sahil Chaudhary
6 years
#12DaysOfHacks @clarifai New hacks daily. I’m lovin it.
@clarifai
Clarifai
6 years
On the eighth day of #12DaysOfHacks, Clarifai gave to me eight brand new amazing recipes 🎁 RT w/ @Clarifai + #12DaysOfHacks for a chance to win swag 📱 Submit your own hack to bounties@clarifai.com for a chance to win @shopforce1rc drone!
0
4
3
0
0
1
@csahil28
Sahil Chaudhary
1 year
These are by no means perfect. All previous models (including ChatGPT) required a lot of prompting to generate usable kernels, but with GPT-4 the quality of code and explanations generated with absolutely minimal prompts is amazing
0
0
1
@csahil28
Sahil Chaudhary
6 years
@clarifai #12DaysOfHacks Oodles of swag on the line
@clarifai
Clarifai
6 years
On the fourth day of #12DaysOfHacks, Clarifai gave to me four sign language translations 🎁 RT w/ @Clarifai + #12DaysOfHacks for a chance to win swag 📱 Submit your own hack for a chance to win @shopforce1rc drone
0
4
5
0
0
1
@csahil28
Sahil Chaudhary
5 years
@maxcodes1 Yeah TunnelBear is good
0
0
1
@csahil28
Sahil Chaudhary
2 years
@jajoosam Yeah gpt-j for now, will experiment with a bigger model next
1
0
1
@csahil28
Sahil Chaudhary
6 years
I just published “Building the new internet”
0
0
1
@csahil28
Sahil Chaudhary
6 years
@clarifai #12DaysOfHacks Some great hacks on show
@clarifai
Clarifai
6 years
On the tenth day of #12DaysOfHacks, Clarifai gave to me ten parking spots for my auto car - yippee! 🎁 RT w/ @Clarifai + #12DaysOfHacks for a chance to win swag 📱 Submit your own hack to bounties@clarifai.com for a chance to win @shopforce1rc drone!
0
4
1
0
0
1
@csahil28
Sahil Chaudhary
6 years
I just published “Future of AI”
0
0
0
@csahil28
Sahil Chaudhary
6 years
Check out my new article on mental health and AI #ArtificialInteligence #blogging
0
0
0
@csahil28
Sahil Chaudhary
7 years
Lionel Richie says “Hello” to startup investing
0
0
1
@csahil28
Sahil Chaudhary
6 years
#12DaysOfHacks @clarifai Final day Oodles of swag on the line
@clarifai
Clarifai
6 years
On the twelfth day of #12DaysOfHacks, Clarifai gave to me 12 perfect background tunes for my videos 🎁 RT w/ @Clarifai + #12DaysOfHacks for a chance to win swag 📱 Submit your own hack to bounties@clarifai.com for a chance to win @shopforce1rc drone!
0
3
2
0
0
1
@csahil28
Sahil Chaudhary
1 year
@CubanBTC 35-40 GB
1
0
1
@csahil28
Sahil Chaudhary
7 years
🙌🏻🙌🏻
0
0
0
@csahil28
Sahil Chaudhary
1 year
@Teknium1 Dm’d you
0
0
1
@csahil28
Sahil Chaudhary
6 years
@clarifai #12DaysOfHacks Chance to win oodles of swag
@clarifai
Clarifai
6 years
On the first day of #12DaysOfHacks, @clarifai gave to me an application in a @slackhq directory RT w/ @Clarifai + #12DaysOfHacks for a chance to win swag Submit your own hack for a chance to win @shopforce1rc drone
0
9
3
0
0
1
@csahil28
Sahil Chaudhary
1 year
@CubanBTC Not in fp16, but it can be quantized to 8/4 bit
1
0
1
@csahil28
Sahil Chaudhary
11 years
Talk And Drive - operate your map applications simply by talking
0
0
0