After 5 amazing years at @Waabi_ai & @Uber ATG, surrounded by the most brilliant minds in AI and mentored by @RaquelUrtasun, I decided to embark on a new adventure with my close friend @jerryjliu0.
Today I’m excited to announce that we have started a company around @llama_index…
I’m super excited to make it official: @disiok and I have started a company around @llama_index, and we’ve raised an $8.5M seed round led by @GreylockVC! 🔥🚀
We are building the open-source data framework to unlock LLM capabilities on your private data.
Llama 2 Prompt Structure
is a fantastic playground for the latest Llama 2 model.
But wait, is the suggested prompt structure the best way to interact with the model?
"""
User: <your prompt goes here>
Assistant:
"""
Some initial notes 👇
2023 is a whole lot of “let’s do it”
- startup
- move to SF
- hire team
- new relationship
- mindset upgrades
Grateful for family, friends, believers, advisors.
The journey is just beginning.
Llama 2 Protip
If you are hacking with the @replicate hosted 13B or 70B models (playground or API), this is relevant for you!
The `max_length` parameter limits the total number of tokens (i.e. sum of input prompt + output response).
More details 👇
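A quick sketch of the practical consequence: the longer your prompt, the fewer tokens are left for the response. The 4-chars-per-token heuristic and the helper names below are my own rough illustration, not part of Replicate's API:

```python
# Sketch: `max_length` caps input + output together, so the output
# budget shrinks as the prompt grows. Token counting here is a crude
# heuristic (~4 chars/token for English), not the model's tokenizer.

def rough_token_count(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token."""
    return max(1, len(text) // 4)

def output_budget(prompt: str, max_length: int = 500) -> int:
    """Tokens left for the response after the prompt is counted."""
    return max(0, max_length - rough_token_count(prompt))

prompt = "User: Summarize the plot of Hamlet in one paragraph.\nAssistant:"
budget = output_budget(prompt, max_length=500)
```

If `budget` comes out near zero, the model will truncate its answer mid-sentence, which is easy to misread as a model failure rather than a parameter issue.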
prototype: one weekend
productionize: weeks? months?
We are tackling this gap by open-sourcing production-quality app templates, so you can hit the ground running.
Goes way beyond the typical Streamlit demo, complete with advanced features like…
We’re excited to open-source a full-stack, production-ready RAG app! 🦙🏦
Supports streaming, reasoning steps, citations, intuitive UI
This can save you weeks/months of hard work in trying to build a prod LLM app from scratch🔥
It's a nudge! It's a U-turn! It's S̶u̶p̶e̶r̶m̶a̶n̶ TrafficSim!
Excited to share our work on learning to simulate realistic multi-agent behaviors at #CVPR2021. Come to our poster session at 10pm ET today to learn more!
Joint work with Sebastian, @sergioksas, and @RaquelUrtasun
My notes on the new @OpenAI fine-tuning release:
1. currently limited to GPT-3.5 Turbo with a 4K token context (GPT-4, function calling, and gpt-3.5-turbo-16k all coming this fall)
2. similar DX to the old GPT-3 fine-tuning API, just a list of chat messages instead of (prompt, completion) pairs…
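For illustration, here's roughly what one training example looks like in the chat-message format (the role names come from the public fine-tuning docs; the message content is made up):

```python
import json

# One training example in the new chat-message format. The old GPT-3
# fine-tuning API used flat {"prompt": ..., "completion": ...} pairs.
example = {
    "messages": [
        {"role": "system", "content": "You are a terse assistant."},
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "Paris."},
    ]
}

# Fine-tuning files are JSONL: one JSON object per line.
jsonl_line = json.dumps(example)
```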
Similar to autonomous driving, maybe we should be explicit about automation levels for knowledge work as well.
L5: fully autonomous agents (AutoGPT et al.)
L4: semi-autonomous with human-in-the-loop support
L3: copilot, assistant, constrained cognitive tasks
Haha.
Auto-Regressive LLMs gonna auto-regress.
Your hands must remain on the keyboard at all times.
Level-2 Writing assistance? Yes!
Level-5 autonomous writing? No!
"Here’s What Happens When Your Lawyer Uses ChatGPT"
Designing a robust system requires clear interfaces and well-behaved components.
The future of LLM-powered systems is not one "monolithic agent" that does everything. It will be many specialized components (think query routing, knowledge retrieving, API calling).
New @OpenAI…
Few people know this: LlamaIndex got its name before the original Llama model came out.
Great Minds Think Alike? 🤔
Incredible to see Meta continuing its commitment to open source. And glad we can stack more 🦙.
Is reasoning inherently coupled with knowledge?
My intuition says there’s a small kernel of knowledge that can bootstrap reasoning.
Right now, to get improved reasoning, we also get useless facts.
I just want my LLM OS to be pure compute, and not have the pre-installs.
Setting a goal to tweet an original / personal thought every day.
Purely for the online to IRL friend pipeline. Optimizing for that tweet-to-friend ratio
A small milestone on this wild ride! 🚀
Super grateful for the love from the @llama_index community, and support from our investors @GreylockVC and partners.
Excited for many more celebrations ahead with our incredible team.
Excited to announce that @disiok and I are featured in @Forbes 30U30 2024! 🎉
It's been a wild year, and we couldn't have done this without our community, partners, investors, and of course our wonderful @llama_index team.
Can we stop trying to replace lawyers and scientists with LLMs? Let’s just figure out how to get really, really good at analyzing multiple documents first…
ChatGPT Plugins have not reached PMF.
This is unsurprising, but refreshing to hear it stated plainly by @sama.
Browsing Redfin through ChatGPT (or any chat interface) is painful, I think for 2 reasons:
1. When my intent/goal is clear, I might as well use the website’s own…
I also admire @sama’s ability to confront reality and state the truth: ChatGPT Plugins have not reached PMF!
I don't wake up in the morning and book my travel and food thru ChatGPT. Instead, the apps are all integrating chat.
The Western consumer continues to resist the…
If you’re around at the @agihouse_org hackathon today, come check out @disiok’s talk on data agents! 🤖💾
- Advanced search/retrieval
- Act over a large range of external APIs (email, search, etc. )
Get a head start with our full agent guide here:
Today is my last day @Uber after 4 wonderful years. Thank you @dkhos for your support and leadership. Solving #selfdriving is my life's passion and I'm super excited for what comes next. Will share an update soon. Stay tuned!
> For those trying to educate, consider writing longform, designed for someone to get "sweaty", especially in today's era of quantity > quality ... This is what I aspire to do. My audience will decrease. The ones that remain might not like it. But at least we'll learn something.
Want to learn how to build and evaluate a production RAG app with @llama_index and @raydistributed?
Join #RaySummit Training Day!
1️⃣ Implement reliable eval methods for LLMs
2️⃣ Run experiments to optimize app components
3️⃣ Take best configs to production
GPT4 is great and all but fully autonomous agents (AutoGPT et al.) aren’t happening any time soon.
Compounding error is still too severe for long horizon, multi-step tasks. The agent “diverges” and gets stuck in a thought loop.
Bullish on human-in-the-loop concepts though
The DSP project carries a lot of insights for improving RAG:
💡 value of few-shot examples
💡 declarative modules
💡 compiling an optimized system with distilled LMs
We had a GREAT time chatting about this and more w/ @lateinteraction on our latest webinar! 👉
Disclaimer: this is an ad hoc investigation, and I've only tried a couple of prompts. I also don't know whether the Replicate-hosted model handles encoding differently.
But the takeaway stands: watch out for subtle train/test mismatch bugs, especially when you don't own the API.
We have reached an agreement in principle for Sam Altman to return to OpenAI as CEO with a new initial board of Bret Taylor (Chair), Larry Summers, and Adam D'Angelo.
We are collaborating to figure out the details. Thank you so much for your patience through this.
Startup is like
paddling a kayak on the ocean,
with 10ft waves crashing,
while aircraft carriers fight each other nearby.
Kinda fun though, would recommend
In 2023 my Twitter feed shifted from academics to startup founders.
Striking how everyone lives in their own bubbles & optimizes their own imperfect proxy metrics.
Goodhart’s law
Reminder that you get to pick your own metric.
Realized I’ve never fully appreciated the beauty of human language.
It’s so perfectly imperfect. A single representation that allows us to express thought across various abstraction levels.
Structured enough to be compositional building blocks, yet flexible enough for ad hoc…
The history of ML feels like a pendulum swinging between domain-agnostic and domain-specific models.
So far, domain-agnostic approaches have produced most big step changes, but what ends up getting deployed are highly optimized domain specific models.
Gorilla: Large Language Model Connected with Massive APIs
Releases Gorilla, a finetuned LLaMA-based model that surpasses the performance of GPT-4 on writing API calls.
proj:
abs:
First, the "llama13b-v2-chat" refers to the fine-tuned variant of the Llama 2 model.
While the base model simply completes text, the "chat" model is fine-tuned for conversations and instruction following.
But how does it recognize messages and instructions?
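The question has a concrete answer in Meta's released chat template: the chat variants are fine-tuned on prompts wrapped in special instruction tags, which is how they learn to treat some text as a message and some as an instruction. A minimal sketch (the tags are from Meta's llama repo; the helper name is mine):

```python
# Llama 2 chat template tags, per Meta's reference implementation.
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

def build_prompt(user_msg: str, system_msg: str = "You are helpful.") -> str:
    """Wrap a single-turn conversation in the Llama 2 chat template."""
    return f"{B_INST} {B_SYS}{system_msg}{E_SYS}{user_msg} {E_INST}"

prompt = build_prompt("Explain attention in one sentence.")
```

Note this looks nothing like the plain "User: / Assistant:" structure, which is exactly the kind of train/test mismatch worth checking for.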
@OpenAI The launch blog () did a great job of highlighting use cases and setting expectations.
Fine-tuning helps with both quality/reliability and cost/latency.
* quality/reliability: improve steerability, reliable output formatting, custom tone
* cost/latency:…
Microsoft is training a custom, narrow-focus LLM specifically on the regulatory process for small nuclear plants. They need to build SMRs to power Bing's brain. MS expects the LLM to eliminate 90% of the costs and human hours involved.
@jxnlco At one point in my life I thought happiness was about reading philosophy and grasping the truth of nature and self.
Turns out it’s just sleeping, eating, and exercising well.
@OpenAI There's also an updated fine-tuning guide with more details ().
Some practical advice from the guide:
1. split the fine-tuning dataset into train & test (the fine-tuning job will provide stats on both)
2. check data formatting with the provided script locally before…
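A minimal sketch of points 1 and 2 (the validation rules here are a small subset of what the official script checks, and the function names are mine):

```python
import json
import random

def basic_format_ok(example: dict) -> bool:
    """Minimal local check: a non-empty messages list with valid roles."""
    msgs = example.get("messages")
    if not isinstance(msgs, list) or not msgs:
        return False
    return all(
        m.get("role") in {"system", "user", "assistant"}
        and isinstance(m.get("content"), str)
        for m in msgs
    )

def split_dataset(examples, test_frac=0.2, seed=0):
    """Drop malformed examples, then split into (train, test)."""
    examples = [e for e in examples if basic_format_ok(e)]
    random.Random(seed).shuffle(examples)
    n_test = int(len(examples) * test_frac)
    return examples[n_test:], examples[:n_test]

data = [{"messages": [{"role": "user", "content": f"q{i}"},
                      {"role": "assistant", "content": f"a{i}"}]}
        for i in range(10)]
train, test = split_dataset(data)
```

Catching a malformed example locally is much cheaper than discovering it after uploading and kicking off a fine-tuning job.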
The term “hallucination” for large language models seems really misleading.
All ungrounded generations are hallucinations, just sometimes you get lucky and what comes out of the model is true
This is why retrieval augmented generation is so crucial for making this stuff useful
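To make the grounding idea concrete, here's a toy sketch: retrieve supporting text first, then condition the generation on it. The keyword-overlap retriever is a stand-in for a real embedding index:

```python
# Toy retrieval-augmented generation skeleton: ground the model's
# answer in retrieved text instead of letting it free-associate.

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank docs by word overlap with the query; return the top k."""
    q_words = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

docs = [
    "The Eiffel Tower is in Paris and was completed in 1889.",
    "Llamas are domesticated South American camelids.",
]
question = "When was the Eiffel Tower completed?"
context = retrieve(question, docs)

# The model now answers from evidence it can cite, not from thin air.
grounded_prompt = f"Context: {context[0]}\nQuestion: {question}"
```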
Hey everyone, serious post for a second.
As you may have seen, I have unfortunately gotten my identity and stealth startup doxxed by reporters via voice forensics and web sleuthing. As the day has finally come, I thought I’d share more about who I am.
I’ve kept my identity…