Shishir Patil Profile
Shishir Patil

@shishirpatil_

2,859 Followers · 854 Following · 27 Media · 193 Statuses

CS PhD @ UC Berkeley. Creator of Gorilla, GoEx, RAFT, OpenFunctions and Berkeley Function Calling Leaderboard. Previously researcher @GoogleAI @MSFTResearch

Berkeley, CA
Joined July 2009
Pinned Tweet
@shishirpatil_
Shishir Patil
1 year
📢 Excited to release Gorilla🦍 Gorilla picks from 1000s of APIs to complete user tasks, surpassing even GPT-4! LLMs need to interact with the world through APIs, and Gorilla teaches LLMs APIs. Presenting Gorilla-Spotlight demo🤩 Webpage:
32
206
975
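The retrieve-then-generate pattern described above, retrieving a relevant API doc and prompting the model with it, can be sketched in a few lines. This is a toy illustration: the catalog entries, the word-overlap scorer, and the prompt format are all assumptions for exposition, not Gorilla's actual retriever or training setup.

```python
# Toy sketch of Gorilla-style API selection: score a catalog of API docs
# against the user task, then build a retriever-aware prompt that asks
# the LLM to emit the matching call.

def score(task: str, doc: str) -> int:
    """Count overlapping lowercase words between the task and an API doc."""
    return len(set(task.lower().split()) & set(doc.lower().split()))

def retrieve_api(task: str, catalog: dict[str, str]) -> str:
    """Return the name of the best-matching API in the catalog."""
    return max(catalog, key=lambda name: score(task, catalog[name]))

def build_prompt(task: str, api_name: str, api_doc: str) -> str:
    """Retriever-aware prompt: the retrieved API doc is shown with the task."""
    return f"Use this API.\nAPI: {api_name}\nDoc: {api_doc}\nTask: {task}\nCall:"

catalog = {
    "image_classifier": "classify an image into labels using a vision model",
    "translator": "translate text into another language",
}
task = "translate this sentence into French"
best = retrieve_api(task, catalog)
print(best)  # translator
```

A real system would swap the word-overlap scorer for an embedding retriever and send the prompt to the fine-tuned model.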
@shishirpatil_
Shishir Patil
6 months
📢 Introducing Gorilla OpenFunctions! 🔥 We've listened to your calls for an open-source function calling model, and are thrilled to present Gorilla OpenFunctions 🦍 And yes, we've made parallel functions a reality in open-source! 😎 Curious about typical scenarios where GPT-4
Tweet media one
11
60
317
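The "parallel functions" idea above, a single model response containing several calls that the client then executes, can be sketched as follows. The bracketed call string and the tool registry are illustrative assumptions, not OpenFunctions' actual output schema.

```python
# Minimal sketch of parallel function calling: the model emits several
# calls in one response; the client parses and executes each one.
import ast

def parse_calls(response: str) -> list:
    """Parse a response like "[f(x=1), g(y=2)]" into ast.Call nodes."""
    tree = ast.parse(response, mode="eval")
    assert isinstance(tree.body, ast.List)
    return [c for c in tree.body.elts if isinstance(c, ast.Call)]

def execute(calls, registry):
    """Dispatch each parsed call to its registered implementation."""
    results = []
    for call in calls:
        fn = registry[call.func.id]
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
        results.append(fn(**kwargs))
    return results

registry = {
    "get_weather": lambda city: f"weather:{city}",
    "get_time": lambda tz: f"time:{tz}",
}
response = "[get_weather(city='Berkeley'), get_time(tz='PST')]"
print(execute(parse_calls(response), registry))
# ['weather:Berkeley', 'time:PST']
```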
@shishirpatil_
Shishir Patil
21 days
📊Delighted to welcome Command-R-Plus, Llama-3, and Gemini-Pro-1.5 into the Berkeley Function Calling Leaderboard. Check out how they stack up across different categories, P95 latency, and costs at Congratulations to @cohere , @AIatMeta , and
Tweet media one
Tweet media two
12
62
323
@shishirpatil_
Shishir Patil
3 months
📢Excited to release the live Berkeley Function-Calling Leaderboard! 🔥 Also debuting openfunctions-v2 🤩 the latest open-source SoTA function-calling model, on par with GPT-4 🆕 Native support for JavaScript, Java, REST! 🫡 Leaderboard: Blog:
Tweet media one
9
68
300
@shishirpatil_
Shishir Patil
1 month
📢Excited to release GoEx⚡️a runtime for LLM-generated actions like code, API calls, and more. Featuring "post-facto validation" for assessing LLM actions after execution 🔍 Key to our approach is "undo" 🔄 and "damage confinement" abstractions to manage unintended actions &
6
53
203
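The "post-facto validation" idea above, execute the action, validate afterwards, and fall back on undo if the check fails, can be sketched in a few lines. The bank-balance action and the closure names are hypothetical stand-ins for real API calls, not GoEx's API.

```python
# Sketch of GoEx-style post-facto validation: run the LLM-proposed
# action, keep an undo handle, validate *after* execution, and roll
# back if validation fails, confining any damage.

def run_with_undo(action, undo, validate):
    """Execute, then validate; invoke undo if validation fails."""
    result = action()
    if validate(result):
        return result
    undo()  # damage confinement: restore the pre-action state
    raise RuntimeError("validation failed; action undone")

state = {"balance": 100}

def debit():
    state["balance"] -= 150   # an overdraft the LLM shouldn't commit
    return state["balance"]

def undo_debit():
    state["balance"] += 150

try:
    run_with_undo(debit, undo_debit, validate=lambda bal: bal >= 0)
except RuntimeError:
    pass
print(state["balance"])  # 100, state restored after the failed action
```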
@shishirpatil_
Shishir Patil
2 months
🌀Check out RAFT: Retrieval Augmented Fine-Tuning! A simple technique to prepare data for fine-tuning LLMs for in-domain RAG, i.e., question answering over your own set of documents 📄 Exciting collaboration with @berkeley_ai 🤝 @Azure 🤝 @AIatMeta MSFT-Meta blog:
@tianjun_zhang
Tianjun Zhang
2 months
📢 Excited to release RAFT: Retrieval Augmented Fine-Tuning for domain-specific RAG, a three-way collaboration between @berkeley_ai @Azure and @AIatMeta ! 🤝 Drawing parallels between LLMs and students in open-book (RAG) 📔 and closed-book exams (SFT) 🧠, we present a better recipe
Tweet media one
6
49
231
4
43
169
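The data-preparation recipe the RAFT tweets describe, pairing each question with the "oracle" document plus distractors, and sometimes dropping the oracle entirely, can be sketched as below. The fractions, field names, and distractor count are illustrative, not the paper's exact settings.

```python
# Sketch of RAFT-style data preparation for in-domain RAG fine-tuning:
# each example mixes distractor documents with (usually) the oracle
# document containing the answer; omitting the oracle in some examples
# trains the model to fall back on internalized domain knowledge.
import random

def make_raft_example(question, answer, oracle_doc, distractors,
                      p_oracle=0.8, k=3, rng=random):
    docs = rng.sample(distractors, k)    # k distractors per example
    if rng.random() < p_oracle:
        docs.append(oracle_doc)          # oracle present: learn to cite it
    rng.shuffle(docs)                    # answer position shouldn't matter
    return {"question": question, "context": docs, "answer": answer}

rng = random.Random(0)
distractors = [f"distractor {i}" for i in range(10)]
ex = make_raft_example("Q?", "A.", "oracle doc", distractors, rng=rng)
print(sorted(ex))  # ['answer', 'context', 'question']
```

Fine-tuning on such examples teaches the model to answer from relevant context while ignoring the distractors.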
@shishirpatil_
Shishir Patil
11 months
🦍Introducing the all-new gorilla-cli, now available as a pip package!✍️ With a vast collection of ~1500 🆕APIs, including 👀 Kubernetes, AWS, GCP, Azure, GitHub, Conda, Curl, Sed, and more🤩 simply state your goal, and let Gorilla CLI generate the commands for execution.
5
29
148
@shishirpatil_
Shishir Patil
2 years
Train BERT on Smartphones 🤩 Announcing POET 📢 Find out how we train memory-hungry SOTA models on smartphones! #ICML2022 Thu 21 Jul 3:45 pm EDT at Room 327 🧵👇 Paper: Web:
6
34
134
@shishirpatil_
Shishir Patil
2 months
🏆New Leader Alert: @AnthropicAI 's Claude-3 tops our Berkeley Function Calling Leaderboard! 📊 Exciting updates include: ⏰Latency and cost for understanding trade-offs💲 🆕Explore new additions like - Claude-3 Haiku, Sonnet & Opus, @DbrxMosaicAI DBRX and more! ⁉️Dive into our
Tweet media one
4
18
122
@shishirpatil_
Shishir Patil
1 year
🦍Big news! Gorilla is now Apache 2.0 licensed🤩We are delighted to welcome 2 new models into the Gorilla family ⛳️Use commercially with zero obligations! Our colab has been used for 6000+ invocations in the last week 🚀Check it out:
4
23
101
@shishirpatil_
Shishir Patil
1 month
🙇‍♂️ Humbled to have @AndrewYNg talk about Gorilla 🦍 This is the most succinct write-up on tool use and function calling. When we started Gorilla back in January 2023, this was precisely our hypothesis 🙂
@AndrewYNg
Andrew Ng
1 month
Tool use, in which an LLM is given functions it can request to call for gathering information, taking action, or manipulating data, is a key design pattern of AI agentic workflows. You may be familiar with LLM-based systems that can perform a web search or execute code. Some of
78
324
2K
2
8
83
@shishirpatil_
Shishir Patil
1 year
+ We are building Gorilla to be an LLM API appstore - you can add your APIs to Gorilla! + Github: + Join our Discord to stay in the loop! + Gorilla-Spotlight sign-up: + Fun collaboration with @tianjun_zhang , @xinw_ai and @mejoeyg
8
10
64
@shishirpatil_
Shishir Patil
20 days
Berkeley Function Calling Leaderboard: Introducing a consistent 8×V100 setup with pay-as-you-go pricing for measuring costs and latency. In depth: We fix inconsistencies in the cost and latency calculation for open-source models, which are now all calculated when serving the model with
@shishirpatil_
Shishir Patil
21 days
📊Delighted to welcome Command-R-Plus, Llama-3, and Gemini-Pro-1.5 into the Berkeley Function Calling Leaderboard. Check out how they stack up across different categories, P95 latency, and costs at Congratulations to @cohere , @AIatMeta , and
Tweet media one
Tweet media two
12
62
323
2
14
61
@shishirpatil_
Shishir Patil
1 month
We study the inherent challenges in relying on LLMs: their unpredictability, the trust mechanisms essential to their decision-making, and the hurdles in failure recognition & resolution. Our system, GoEx, presents abstractions and policies to overcome these for RESTful
Tweet media one
5
19
55
@shishirpatil_
Shishir Patil
1 year
API invocations are extremely brittle, requiring LLMs to generate accurate input arguments. Gorilla improves accuracy while reducing hallucination, and generalizes to 1600+ APIs (and counting). 📢 Given recent debates, we also find Fine-tuning >> Prompting.
Tweet media one
2
10
44
@shishirpatil_
Shishir Patil
1 month
How to do better RAG? 🤔 Check out this webinar with @jerryjliu0 on the shortcomings of today's RAG 👀 and how a few simple tricks for creating a fine-tuning dataset can vastly improve performance for in-domain RAG! And thanks to @ravithejads RAFT is now already part of
@llama_index
LlamaIndex 🦙
1 month
🎥 ICYMI: Check out the recording of our latest LlamaIndex Webinar on retrieval-augmented fine-tuning (RAFT)! In the webinar, @tianjun_zhang and @shishirpatil_ , the lead co-authors of RAFT, explain how combining an “open-book exam” approach (RAG) with a “closed-book exam”
Tweet media one
2
31
136
2
7
44
@shishirpatil_
Shishir Patil
7 months
🤩Check out our latest release MemGPT 🦙📚 Inspired by how an OS manages pages, we explore: can the LLM manage its own context length, and page out / page in from archival memory?
@charlespacker
Charles Packer
7 months
Introducing MemGPT 📚🦙 a method for extending LLM context windows. Inspired by OS mem management, it provides an infinite virtualized context for fixed-context LLMs. Enables perpetual chatbots & large doc QA. 🧵1/n Paper: GitHub:
9
107
465
3
4
39
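The OS-paging analogy in the MemGPT tweets can be sketched as a fixed-capacity context that evicts its oldest messages to an archival store and searches them back in on demand. The FIFO policy, tiny capacity, and substring search are simplifications of the actual system.

```python
# Toy sketch of the MemGPT idea: treat the LLM context window like RAM
# managed by an OS, paging the oldest messages out to archival memory
# when the window fills, and paging relevant ones back in via search.
from collections import deque

class PagedContext:
    def __init__(self, capacity: int):
        self.context = deque()   # "in-RAM" working context
        self.archive = []        # paged-out archival memory
        self.capacity = capacity

    def add(self, message: str):
        self.context.append(message)
        while len(self.context) > self.capacity:
            self.archive.append(self.context.popleft())  # page out

    def page_in(self, query: str) -> list[str]:
        """Naive archival search: substring match over paged-out messages."""
        return [m for m in self.archive if query in m]

ctx = PagedContext(capacity=2)
for m in ["hi, I'm Ada", "let's talk APIs", "what about RAFT?"]:
    ctx.add(m)
print(list(ctx.context))   # ["let's talk APIs", 'what about RAFT?']
print(ctx.page_in("Ada"))  # ["hi, I'm Ada"]
```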
@shishirpatil_
Shishir Patil
1 month
Excited to see @cohere 's new Command R+ model's focus on tool use capabilities! Function Calling / Tool use has become a deciding factor in choosing models as we move beyond simple chatbots into integrating LLMs in workflows and agents. Check out our live Berkeley Function
@aidangomez
Aidan Gomez
1 month
R+ is quite performant. It's very competitive with models in its price range, and sometimes even those in a category above. Tool Use is a place we've seen considerable improvement in R+ over R.
Tweet media one
2
5
89
0
6
39
@shishirpatil_
Shishir Patil
8 months
The last time @profjoeyg interviewed me, it was for PhD admissions to Berkeley.. Thankfully the stakes were lower this time 😜 Check out the latest interview where @tianjun_zhang and I dive into the details as we discuss #GorillaLLM and LLM APIs 🦍🦍🦍
@vsreekanti
Vikram Sreekanti
8 months
Gorilla's a super cool LLM application: You can autogenerate API calls for things like Kubernetes or transformers. 🛠️ @profjoeyg interviewed @shishirpatil_ & @tianjun_zhang about how the model works:
1
13
41
2
3
38
@shishirpatil_
Shishir Patil
19 days
Excited to welcome Snowflake-Arctic on the Berkeley Function Calling Leaderboard ❄️ How does Snowflake-arctic-instruct, an Apache-2.0-licensed, 480B-parameter MoE model, perform at invoking functions (aka tools)? Attached is a quick comparison with gpt-4-0125-preview (yellow).
Tweet media one
0
10
36
@shishirpatil_
Shishir Patil
1 year
Our unique Abstract Syntax Tree (AST) evaluation presents the first systematic evaluation of SOTA LLMs such as GPT-4 and Claude-v1. We benchmark them for accuracy and hallucination. Our experiments demonstrate what most people "felt": GPT-4 hallucinates more than GPT-3.5 👀
Tweet media one
1
7
34
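The AST evaluation described above compares generated calls to references structurally rather than by string match. A minimal single-call, keyword-only version might look like this; the real leaderboard harness handles far more cases, so treat this purely as an illustration of the idea.

```python
# Sketch of AST-based function-call evaluation: parse the generated
# call and the reference into syntax trees, then compare function name
# and arguments structurally, so whitespace and argument order don't
# cause false negatives, while unparseable output counts as a failure.
import ast

def call_signature(src: str):
    """Extract (function name, keyword args) from a call expression."""
    call = ast.parse(src, mode="eval").body
    assert isinstance(call, ast.Call)
    name = call.func.id
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    return name, kwargs

def ast_match(generated: str, reference: str) -> bool:
    try:
        return call_signature(generated) == call_signature(reference)
    except (SyntaxError, AssertionError, ValueError):
        return False  # malformed output is scored as incorrect

# Whitespace and argument order don't matter; wrong values do.
print(ast_match("f(a=1,  b='x')", "f(b='x', a=1)"))  # True
print(ast_match("f(a=2)", "f(a=1)"))                 # False
```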
@shishirpatil_
Shishir Patil
1 year
We have continued developing Gorilla: your go-to open-source API marketplace for LLMs 🦍 Our colab has seen 3k+ users in 3 days 🚀 A big thank you for your enthusiastic response 🫡 Gorilla is open source and will always be community driven. ⏲ Check out Gorilla in 60 seconds:
@tianjun_zhang
Tianjun Zhang
1 year
📢It has been a thrilling 5 days since Gorilla release 🚀 Updates: + We released HF, TF and Torch Hub Gorilla zero-shot models + Opened up APIZoo for community contributions + 2k GitHub stars 🤩 and 780+ welcoming discord community 🎙️ Now, to answer the most pressing question
0
2
17
1
3
27
@shishirpatil_
Shishir Patil
9 days
Check out @AIatMeta 's post on RAFT for better in-domain RAG. Fun fact: you can access it through: Meta and Microsoft: Divided by shareholders, united by Berkeley 😉
2
5
58
@shishirpatil_
Shishir Patil
1 year
Gorilla's retriever-aware training enables it to react to test-time changes in the APIs. Gorilla is able to respond to model upgrades and changes in the model registry at test time. More information in our paper:
Tweet media one
1
5
24
@shishirpatil_
Shishir Patil
11 months
🦍Week 3 🚀 📢 Introducing Gorilla-7b-hf-delta-v1, a major update turning user queries into insightful code suggestions! 👋Thrilled to rank #2 on Hacker News again! A hearty welcome to those who found us there. 🤝We're easing API discovery by transforming simple user queries into
@tianjun_zhang
Tianjun Zhang
11 months
📢How can Gorilla generate meaningful code snippets? This has been one of the first asks from the community 🤝 Today we have a major release: gorilla-7b-hf-delta-v1, a new model that gives code suggestions based on the query 🚀 Give it a try and let us know what you think! Check
2
0
10
0
6
24
@shishirpatil_
Shishir Patil
8 months
Thoroughly enjoyed this webinar with @jerryjliu0 from @llama_index and @jobergum ! Check out some key takeaways on integrating LLMs with Retrievers 🦙
@jerryjliu0
Jerry Liu
9 months
Here are six amazing insights from @jobergum and @shishirpatil_ on how to effectively use finetuning to optimize your LLM apps over your data: 1️⃣ Fine-tuning is actually remarkably effective at internalizing knowledge: retrieval algorithms in comparison can be inaccurate.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
4
71
293
0
5
23
@shishirpatil_
Shishir Patil
21 days
Models are increasingly diverging in their areas of focus! This is exciting, as we see fragmentation in the use of function-calling capabilities! 🤔 The ability of a model to refuse to present an answer continues to remain challenging!!
Tweet media one
Tweet media two
Tweet media three
0
5
22
@shishirpatil_
Shishir Patil
9 months
Honored to share the stage with luminaries at the @SimonsInstitute LLM workshop last week. In my talk, I delved into how Gorilla 🦍 adapts to imprecise retrievers and introduced a novel methodology to measure hallucination in LLMs. Watch the session here:
1
2
21
@shishirpatil_
Shishir Patil
10 months
We're thrilled to announce that our hosted Gorilla models have successfully processed over👏50,000 👏 user 👏 requests 🙇‍♂️ We're incredibly grateful for your overwhelming response, and this has given us immense confidence to tag the initial stable releases on Github: 🦍Gorilla
@tianjun_zhang
Tianjun Zhang
10 months
Efficient Information retrievers are the solution to equip language models with the most up-to-date knowledge. 🚀 📢Releasing Gorilla+Retrievers to help keep APIs updated! We provide support for BM25 and GPT based retrievers in our GitHub: 🦍 Gorilla v0.0.1:
3
1
20
0
1
20
@shishirpatil_
Shishir Patil
9 months
Check out the @weaviate_io paper summary on #GorillaLLM 🦍 @CShorten30 does a phenomenal job of diving deep into the core research and presents lots of useful advice to those working on retrievers, LLMs, or APIs 🤩
@CShorten30
Connor Shorten
9 months
I am SUPER excited to present a paper summary video of Gorilla: Large Language Models Connected to Massive APIs! 🦍🛠️ API instruction following, Self-Instruct, Retrieval-Aware Training, and more! One of the most exciting papers I've seen recently! 🤯🚀
Tweet media one
5
23
84
1
4
19
@shishirpatil_
Shishir Patil
2 years
Moving data between cloud object stores is a growing phenomenon. Our work presents the first solution for optimizing data movement across AWS (S3 buckets), Azure (Blobs), and Google Cloud (Storage).
@_parasj
Paras Jain
2 years
Releasing Skyplane, a new open-source tool to move huge datasets between clouds. Skyplane is: 1. 🔥 Blazing fast (110x faster) 2. 🤑 Cheap (4x cheaper) 3. 🌐 Universal (AWS, Azure and GCP) Read more: 1/
8
57
259
0
2
17
@shishirpatil_
Shishir Patil
9 months
I'm always amazed by the energy @CShorten30 injects into his podcast! Had the privilege to be a part of it 🤩 Don't miss the latest episode of Weaviate 🤝 Gorilla podcast!
@CShorten30
Connor Shorten
9 months
Beyond excited to publish our newest Weaviate Podcast with @shishirpatil_ and @tianjun_zhang , co-authors of the Gorilla LLMs!! 🦍🛠️🎉 Loved this discussion on all things Gorilla, from the APIZoo to Self-Instruct, Retrieval-Aware Training, and more! 📚
Tweet media one
2
14
38
1
5
17
@shishirpatil_
Shishir Patil
1 year
I don't know how @arankomatsuzaki does it 🙇‍♂️ but our advisor @profjoeyg saw his tweet before he saw our arXiv notification!
@arankomatsuzaki
Aran Komatsuzaki
1 year
Gorilla: Large Language Model Connected with Massive APIs Releases Gorilla, a finetuned LLaMA-based model that surpasses the performance of GPT-4 on writing API calls. proj: abs:
Tweet media one
18
194
746
0
0
17
@shishirpatil_
Shishir Patil
10 months
@AlphaSignalAI Thanks for featuring our work! Gorilla 🦍 is an open source project from UC Berkeley with some exciting new releases planned! Stay tuned! 😉
0
1
16
@shishirpatil_
Shishir Patil
2 months
In today's updates to the Berkeley Function Calling Leaderboard: 📊Enhanced leaderboard with additional models and a summary table: @MistralAI -large-2402, @GoogleAI Gemini 1.0 Pro, and Gemma now included. 🤖 Gradio for interactive exploration! Includes function calling demos, and
0
5
13
@shishirpatil_
Shishir Patil
27 days
Check out how good Llama 3 is at tool calling 👀 Thorough work by @RickLamers on the Berkeley Tool Calling Leaderboard!
@RickLamers
Rick Lamers
28 days
Frontier level Tool Calling now live on @GroqInc powered by Llama 3 🫡 Outperforms GPT-4 Turbo 2024-04-09 and Claude 3 Opus (FC version) in multiple subcategories At 300 tokens/s 🚀 I've personally been working on this feature, and man, the new Llama is good!
Tweet media one
21
39
305
0
2
14
@shishirpatil_
Shishir Patil
8 months
Delighted to see Gorilla's newest infant 🦍 Weaviate Gorilla 😊 Check out @CShorten30 's blog and video on how you can use LLMs to invoke GraphQL APIs 🛠️🚀 Blog: Youtube:
@CShorten30
Connor Shorten
8 months
We trained LlaMA 7B to use Weaviate!! 🦍🛠️ Presenting... Weaviate Gorilla Part 1: GraphQL! 🎉 Blog Post: YouTube: 🧵 With some more details 👇
16
76
244
1
4
14
@shishirpatil_
Shishir Patil
9 months
Fine-tune smaller models or switch to fine-tuning 3.5 from #OpenAI ? Exciting experiment by @morgymcg on the #GorillaLLM dataset. Watch along and track the progress in real time 🤩
@morgymcg
Morgan McGuire
9 months
Put together a quick colab to fine-tune @OpenAI ChatGPT-3.5 on the huggingface api code from the gorilla dataset Idea being to see if something like this can help improve ChatGPT-3.5's use of tools and mimic GPT-4's `functions` capability
5
9
43
0
2
13
@shishirpatil_
Shishir Patil
1 month
Thanks for sharing our work @arankomatsuzaki 🫡
@arankomatsuzaki
Aran Komatsuzaki
1 month
GoEX: Perspectives and Designs Towards a Runtime for Autonomous LLM Applications Presents a runtime for LLMs with an intuitive undo and damage confinement abstractions, enabling the safer deployment of LLM agents in practice repo: abs:
Tweet media one
1
23
84
0
2
13
@shishirpatil_
Shishir Patil
3 months
Using LLMs to call functions is 🚀🚀🚀 When we released OpenFunctions-v1 the community loved the models 🙏 but we quickly realized evaluation needed attention 👀 It took a while, but we are delighted to share the Berkeley Function-Calling Leaderboard. + Abstract Syntax Tree (AST)
Tweet media one
1
1
12
@shishirpatil_
Shishir Patil
8 months
Agreed!! Love the graphic 🔥🔥🔥 Weaviate Gorilla is now out! Check it out 🚀🦍
@CShorten30
Connor Shorten
8 months
Preview of our initial Weaviate Gorilla release, GPT-4 is amazing at generating synthetic Weaviate database schemas! 🦍🦍🦍 Love the way this slide turned out haha, all sorts of applications for Weaviate! Just ask an LLM! 🤗🍱🐕💊🍿📱🏀🎶🎮👕⛈️🌍🎻🚗
Tweet media one
3
15
57
0
3
11
@shishirpatil_
Shishir Patil
10 months
🦍 in AI Business’s top 12 models - Read more at
@profjoeyg
Joey Gonzalez
10 months
I am excited to announce that two of the LLMs from my group (Gorilla and Vicuna) are on AI Business’s top 12 models. Congrats @shishirpatil_ , @tianjun_zhang , @xinw_ai , and the @lmsysorg team.  We are looking forward to working with @Meta on Llama-v2 versions.
2
7
51
0
0
9
@shishirpatil_
Shishir Patil
6 months
🫡 The stellar team behind this: @fanjia_yan @AbecidAdam @tianjun_zhang and yours truly, shepherded incredibly by Ion Stoica and @profjoeyg 🙏
2
0
9
@shishirpatil_
Shishir Patil
8 months
@CShorten30
Connor Shorten
8 months
Super happy that Weaviate Gorilla was featured in @1littlecoder 's AI Updates series!! Thank you so much and massive kudos on the series - really impressive diversity! One of the best AI / ML news shows out! 🔥🙏
5
10
35
0
1
8
@shishirpatil_
Shishir Patil
1 month
🌐 Pioneering a future where LLMs empower microservices & apps, evolving from mere data retrievers 🧵to autonomous decision-makers within our digital world 🧙 Wondering about the safety and correctness of these interactions🤔? Our latest vision paper explores these questions,
Tweet media one
1
1
9
@shishirpatil_
Shishir Patil
20 days
While one can further improve throughput with optimizations, and of course costs vary with contracts and the vintage of GPUs, that is beside the point. With cost and latency, our goal is to identify the magnitude of costs and latency, and understand the
2
2
8
@shishirpatil_
Shishir Patil
1 year
🤩 Shoutout to @omarsar0 who did a much better job explaining our paper on Twitter than we ever could 🙏
@omarsar0
elvis
1 year
Finetuning LLMs to call APIs Present Gorilla, a finetuned LLaMA-based model that surpasses GPT-4 on writing API calls. This capability can help identify the right API, boosting the ability of LLMs to interact with external tools to complete specific tasks. Huge potential for
Tweet media one
14
208
880
1
0
7
@shishirpatil_
Shishir Patil
2 months
@simonw @AnthropicAI @DbrxMosaicAI Yeah - spot on! And we also describe the methodology and eval-metrics here:
1
0
8
@shishirpatil_
Shishir Patil
23 days
💯
@martin_casado
martin_casado
24 days
A few (maybe obvious) challenges using LLMs within software applications I've seen as companies roll out their use: - versioning : While it's possible to tie a program to a specific model version, there is no structured way to handle new model versions (e.g. deprecation, sub
10
9
133
0
1
7
@shishirpatil_
Shishir Patil
8 months
@CShorten30 Putting the "W" back in @weaviate_io 🤩 Congratulations @CShorten30 and @philipvollet Exciting days 🚀🚀🚀
1
1
6
@shishirpatil_
Shishir Patil
3 months
Joint work with @UCBerkeley 's finest undergrads @charlie_jcj02 , @fanjia_yan , @HuanzhiMao and project co-led by @tianjun_zhang , Ion Stoica, and @profjoeyg 🫡
1
0
6
@shishirpatil_
Shishir Patil
3 months
🎁In OpenFunctions-v2, we natively train the model to support parallel functions (generate multiple functions at a time) and multiple functions (select one or more functions). Java/REST/Python APIs are also supported for the first time, with extended data types. Looking forward
1
1
6
@shishirpatil_
Shishir Patil
7 months
🔥
@saihv
Sai Vemprala
7 months
Introducing GRID: the General Robot Intelligence Development platform, designed for prototyping smart and safe robots rapidly using foundation models, LLMs, and simulation. Paper: Try now: GitHub: 🧵👇(1/N)
18
188
875
0
0
5
@shishirpatil_
Shishir Patil
2 years
Picking the right loss functions for adaptation on the fly! Excited for @zicokolter talk on @goyalsachin007 work #ICML2022
@goyalsachin007
Sachin Goyal @ ICLR ‘24 🏰
2 years
How do you find the "right" loss for test-time adaptation to distribution shifts? Turns out we can use convex conjugates! New paper📢 with @Eric_jie_thu , Aditi Raghunathan, @zicokolter . Test-Time Adaptation via Conjugate Pseudo-labels
4
24
112
0
1
5
@shishirpatil_
Shishir Patil
10 months
@marktenenholtz Thanks for featuring our work @marktenenholtz Appreciate it 🙇‍♂️ 🦍
0
0
3
@shishirpatil_
Shishir Patil
2 years
Special thanks to @jainprateek_ , @smolix , and @petewarden for the intense brainstorming sessions 🙂 @drothkid , @charlespacker for ✍️ suggestions.
0
0
4
@shishirpatil_
Shishir Patil
11 months
@sunilmallya Thanks for your interest @sunilmallya Yes, indeed! We will open-source the dataset for the newer APIs as well!
1
1
4
@shishirpatil_
Shishir Patil
10 months
@CShorten30 @tianjun_zhang @weaviate_io Absolutely @CShorten30 ! Would love to make Gorilla 🤝 @weaviate_io happen 🤩
2
0
4
@shishirpatil_
Shishir Patil
8 months
Had a great chat with @chsrbrts on #GorillaLLM How often do you think hyperscalers change their APIs? Spoiler: More than once a day!!! Check out the blog to learn more 🦍🦍🦍
@chsrbrts
Chase Roberts
8 months
We interview @shishirpatil_ on Episode #5 of Neural Notes about Gorilla, an LLM that generates API calls.🦍 Why does this matter? LLMs are well-suited for code generation bc the task is forgiving since there are multiple ways to write a function, but they struggle with API code
1
5
13
0
0
4
@shishirpatil_
Shishir Patil
2 years
Exciting to see the tides shift. 'ML sensors' decouple the ML-enabled sensors from the rest of the system.
@petewarden
Pete Warden
2 years
I'm finally able to talk about what I've been up to for the last six months!
18
22
209
0
0
3
@shishirpatil_
Shishir Patil
9 months
@CShorten30 @tianjun_zhang Note to self: Need to set-up good lighting 🙈😀
1
0
2
@shishirpatil_
Shishir Patil
1 month
@jiayq Agree on (1) and (2). Although for (3), from the set of our users, we actually see a healthy number of scenarios where you are choosing between a handful of functions. Especially in enterprise use-cases, it's either choose A/B/C or default D (usually support, doc q-a, etc)..
1
1
3
@shishirpatil_
Shishir Patil
3 months
@intrstllrninja Thanks for your kind words @intrstllrninja Eval Code: Eval Dataset: Let us know if you run into any issues, or as with any open-source feel free to raise a PR!
1
0
3
@shishirpatil_
Shishir Patil
2 years
Our framework, Private Optimal Energy Training (POET), takes memory and run-time budgets as input. POET schedules nodes of the training graph to satisfy the constraints by exploiting integrated rematerialization and paging. POET's MILP formulation is provably energy-optimal!
Tweet media one
1
0
3
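The scheduling problem POET solves can be illustrated on a toy instance: for each node of the training graph, choose between keeping its activation in memory, rematerializing it, or paging it out, minimizing energy under a memory budget. Brute force over the choices stands in for POET's MILP solver, and every cost below is a made-up number.

```python
# Toy illustration of the POET scheduling problem: pick KEEP, REMAT, or
# PAGE per node to minimize total energy while kept activations fit in
# the memory budget. Exhaustive search is optimal on this tiny instance.
from itertools import product

# (memory if kept, rematerialization energy, paging energy) per node
nodes = [(4, 3, 5), (2, 6, 2), (3, 2, 4)]
MEM_BUDGET = 5

def plan(nodes, budget):
    best = None
    for choice in product(("KEEP", "REMAT", "PAGE"), repeat=len(nodes)):
        mem = sum(m for (m, _, _), c in zip(nodes, choice) if c == "KEEP")
        if mem > budget:
            continue  # infeasible: kept activations exceed memory
        energy = sum({"KEEP": 0, "REMAT": r, "PAGE": p}[c]
                     for (_, r, p), c in zip(nodes, choice))
        if best is None or energy < best[0]:
            best = (energy, choice)
    return best

print(plan(nodes, MEM_BUDGET))  # (3, ('REMAT', 'KEEP', 'KEEP'))
```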
@shishirpatil_
Shishir Patil
1 month
@Teknium1 @ivanfioravanti @Teknium1 it's added with PR This plot might be useful for understanding where different models shine! @ivanfioravanti depending on your use-case, your mileage may vary! Of course let us know if you have any comments, suggestions, find bugs, or
Tweet media one
2
0
3
@shishirpatil_
Shishir Patil
1 month
How are LoRAs and longer contexts for LLMs related? Check out @xiuyu_l and @sijun_tan 's latest work on training LoRA adapters to support in-domain long-context 🗞️
@xiuyu_l
Xiuyu Li
1 month
Handling long context in LLMs is expensive, but can we cut the cost by learning them offline for a specific set/genre of documents? Introducing LLoCO, our new technique that learns documents offline through context compression and in-domain finetuning using LoRA, which achieves
Tweet media one
7
53
269
1
1
3
@shishirpatil_
Shishir Patil
16 days
@djfarrelly @inngest You may find this to be a useful tool:
1
0
2
@shishirpatil_
Shishir Patil
8 months
@_parasj Congratulations on the release Paras! 🔥
1
0
2
@shishirpatil_
Shishir Patil
10 months
@AustinNWharton @marktenenholtz Hey @AustinNWharton thanks for sharing your feedback! Gorilla CLI is fully open-source, and commands are executed only with the user's explicit approval! Let us know if there is anything we can do that would make you feel more comfortable giving it a try!
0
0
2
@shishirpatil_
Shishir Patil
9 months
@ishaan_jaff Wow, that's a nice side-by-side comparison tool 🚀
0
0
2
@shishirpatil_
Shishir Patil
8 months
@1littlecoder @CShorten30 Yeah echo that - thanks for covering it @1littlecoder 🙏
1
0
2
@shishirpatil_
Shishir Patil
28 days
@JenniferHli Congratulations @JenniferHli and best wishes 🍾 😊
1
0
2
@shishirpatil_
Shishir Patil
10 months
@dav_ell @AlphaSignalAI Thanks for trying it and your kind words @dav_ell 🫡
2
0
2
@shishirpatil_
Shishir Patil
10 months
@dotey Hey @dotey 👋 We do have some metrics in the paper, but would love it if you gave it a try and have any feedback for us on how we can improve it! 🙂
0
0
2
@shishirpatil_
Shishir Patil
10 months
@simonw Loved to see the clean integration! This is another great modality to use gorilla right in your CLI!
0
0
0
@shishirpatil_
Shishir Patil
8 months
🦍
@osanseviero
Omar Sanseviero
9 months
Which are the best ML discord servers to join? So many 😱 How to keep up with all of them? Which ones am I missing? LAION Eleuther OS Shrunkworks AI (MoE project) Hugging Face CarperAI Open Assistant Harmonai (Dance Diffusion) Replicate Suno (bark, TTS) Stable Foundation Invoke
31
42
289
0
2
2
@shishirpatil_
Shishir Patil
2 years
With POET, we are the first to demonstrate training of memory-hungry SOTA ML models such as BERT and ResNets on smartphones and tiny ARM Cortex-M devices! 💪 With my wonderful collaborators @_parasj , @prabaldutta , Ion Stoica, and @mejoeyg at @berkeley_ai @ucbrise @Berkeley_EECS
1
0
2
@shishirpatil_
Shishir Patil
20 days
0
0
1
@shishirpatil_
Shishir Patil
1 month
@jeankaddour Thanks @jeankaddour Appreciate it 🙏
0
0
1
@shishirpatil_
Shishir Patil
1 month
0
0
1
@shishirpatil_
Shishir Patil
15 years
Looking out for followers
0
0
1
@shishirpatil_
Shishir Patil
2 months
@skypilot_org cyberpunk 2077 when? Congratulations 🎉
1
0
1
@shishirpatil_
Shishir Patil
3 months
@kihemaitien Thanks for bringing this to our attention, @kihemaitien . In the process of adding!
0
0
1
@shishirpatil_
Shishir Patil
2 years
Fine-tuning models on the edge satisfies privacy constraints and enables offline operation. However, limited memory on edge devices makes training memory-hungry deep learning models infeasible.
1
0
1
@shishirpatil_
Shishir Patil
3 months
@ibuildthecloud Excited to see how it goes @ibuildthecloud ! Let us know if you have any feedback on the leaderboard or the model.
0
0
1
@shishirpatil_
Shishir Patil
1 year
@nickarner Here's our work on training memory intensive models on smartphones and microcontrollers (ARM Cortex M4s)
0
0
1
@shishirpatil_
Shishir Patil
2 months
0
0
1
@shishirpatil_
Shishir Patil
7 months
@lisabdunlap +1 Can confirm. Lisa has her way with matplotlib 🫡
0
0
1
@shishirpatil_
Shishir Patil
8 months
@conor_power23 @eternalroree *feature not bug 🤝
0
0
1
@shishirpatil_
Shishir Patil
9 months
@alexchaomander Was great chatting with you @alexchaomander Really appreciated the insightful questions!!
0
0
1
@shishirpatil_
Shishir Patil
2 years
@EugeneVinitsky Yeah, and you want to make sure you power them sufficiently. If you don't give them enough juice, they'll be severely under-clocked even in headless mode.
1
0
1
@shishirpatil_
Shishir Patil
4 years
@pschafhalter Multics. Does it ring a bell? 😉
1
0
1
@shishirpatil_
Shishir Patil
8 months
@CShorten30 The pleasure was mine, Connor!! Excited to see this partnership blossom ❤️ and great job on the release 🫡
1
0
1