Simon Willison

@simonw

71,072
Followers
5,468
Following
2,950
Media
50,662
Statuses

Creator @datasetteproj , co-creator Django. PSF board. @nichemuseums . Hangs out with @natbat + @cleopaws . He/Him. Mastodon:

San Francisco, CA
Joined November 2006
Pinned Tweet
@simonw
Simon Willison
24 days
I gave a 50m talk at the Story Discovery at Scale data journalism conference at Stanford a few weeks ago. The video is now out, and I've written an extensive annotated version AI for Data Journalism: demonstrating what we can do with this stuff right now
7
35
166
@simonw
Simon Willison
2 years
New hobby: prototyping video games in 60 seconds using a combination of GPT-3 and DALL-E Here's "Raccoon Heist"
119
1K
8K
@simonw
Simon Willison
2 years
If someone gives you a CSV file with 100,000 rows in it, what tools do you use to start exploring and understanding that data?
2K
894
7K
@simonw
Simon Willison
3 years
Here's a piece of information that will send a chill down the spine of anyone who's ever designed a database schema: Our new house that we just moved into... has two zip codes!
273
760
6K
@simonw
Simon Willison
1 year
Leaked Google document: “We Have No Moat, And Neither Does OpenAI” The most interesting thing I've read recently about LLMs - a purportedly leaked document from a researcher at Google talking about the huge strategic impact open source models are having
126
1K
5K
@simonw
Simon Willison
2 years
TIL you can run SQL queries directly against CSV files as a one-liner using the default sqlite3 command line utility
57
779
5K
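The tweet above does not spell out its command in text, so the following is only a rough Python standard-library sketch of the same idea: load a CSV into an in-memory SQLite database and query it with SQL. The function name `query_csv`, the sample data, and the table name `animals` are illustrative assumptions, not anything taken from the tweet.

```python
import csv
import io
import sqlite3


def query_csv(csv_text, table, sql):
    """Load CSV text into an in-memory SQLite table, then run one SQL query."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # first row supplies the column names
    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


# Hypothetical sample data, standing in for a real CSV file.
sample = "name,species\nCleo,dog\nAzi,pelican\nBo,dog\n"
result = query_csv(
    sample,
    "animals",
    "SELECT species, COUNT(*) FROM animals GROUP BY species ORDER BY species",
)
print(result)
```

Every column is created without a type here, so values are stored as text; a real exploration session would probably want type detection as well.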
@simonw
Simon Willison
3 years
"Hosting SQLite databases on Github Pages" is absolutely brilliant: it adds a virtual filesystem to SQLite-compiled-to-WebAssembly in order to fetch pages from the database using HTTP range requests
46
966
4K
@simonw
Simon Willison
1 year
If you're just starting to learn software engineering right now but you're considering dropping it because you think the field might be made obsolete by AI, I have an alternative approach to suggest for you: Start learning now, and use AI tools to learn FASTER
69
367
3K
@simonw
Simon Willison
16 days
"Do stuff and then blog about it" remains one of the most underrated pieces of career advice
@vboykis
vicki
17 days
An absolutely fantastic way to increase this is to start a blog. Almost all the cool fun stuff in my professional life for me has come from doing stuff then blogging about it.
11
141
1K
25
319
3K
@simonw
Simon Willison
5 months
@d_feldman It's got me to the point where I can read Spanish language news articles and understand ~80% of them - spoken Spanish is much harder
25
30
3K
@simonw
Simon Willison
1 year
@tomgara That is the most "Society of Human Resource Management" story I could possibly imagine
5
37
3K
@simonw
Simon Willison
6 years
How about if, instead of ditching Twitter for Mastodon, we all start blogging and subscribing to each other's Atom feeds again instead? The original distributed social network could still work pretty well if we actually start using it
102
644
2K
@simonw
Simon Willison
6 months
Found the system prompt that drives this thing here: It works by generating a base64 encoded PNG of the drawn components, then passing that to GPT-4 Vision with that system prompt and instructions to "Turn this into a single html file using tailwind"
@tldraw
tldraw
6 months
let's go
148
1K
8K
23
244
2K
@simonw
Simon Willison
5 months
I tried this out yesterday and it's incredible: download a 4GB binary, chmod 755 it and now you have a full LLM and the software needed to run it ready to go, with multiple operating system platforms supported by that single file
@JustineTunney
Justine Tunney
5 months
I spent the last month building llamafile which is the fattest executable file format ever. It lets you turn LLM weights into runnable llama.cpp binaries using cosmo libc. Blog post:
49
449
3K
15
190
2K
@simonw
Simon Willison
1 month
I built a new tool! It's a single page web app that runs OCR against images and PDFs entirely in your browser (no file upload needed) using Tesseract.js and PDF.js You can drop files onto it, or you can click to select and open them (which works on Mobile Safari as well)
36
230
2K
@simonw
Simon Willison
1 year
MiniGPT-4 is pretty astonishing: an AI chatbot you can use to ask questions about an image (a feature that's been promised but not yet shipped by GPT-4), building on top of the Vicuna-13B LLM (derived from LLaMA) and BLIP-2 vision-language model
36
238
2K
@simonw
Simon Willison
2 years
This is really grim, if not entirely unexpected: apparently the Instagram mobile app injects additional JavaScript into every page that's loaded using the in-app embedded browser - here's the tool @KrauseFx built to track changes made to the DOM when loading a page
@KrauseFx
Felix Krause
2 years
💥 New Post: Instagram & Facebook tracks everything you do on any website in their in-app browser
95
2K
5K
26
695
2K
@simonw
Simon Willison
2 months
TIL about binary vector search... apparently there's a trick where you can take an embedding vector like [0.0051, 0.017, -0.0186, -0.0185...] and turn that into a binary vector just reflecting if each value is > 0 - so [1, 1, -1, -1, ...] and still get useful cosine similarities!
68
158
2K
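A minimal sketch of the binary-vector trick described in the tweet above, using only the Python standard library. The first vector reuses the four values quoted in the tweet; the `similar` and `different` vectors are made-up illustrations, and real embeddings have hundreds or thousands of dimensions.

```python
import math


def binarize(vector):
    """Map each value to 1 if it is greater than 0, otherwise to -1."""
    return [1 if value > 0 else -1 for value in vector]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


embedding = [0.0051, 0.017, -0.0186, -0.0185]  # values quoted in the tweet
similar = [0.0049, 0.021, -0.0150, -0.0200]    # hypothetical: same sign pattern
different = [-0.0100, 0.0030, 0.0120, -0.0040]  # hypothetical: two signs flipped

binary = binarize(embedding)
print(binary)
print(cosine_similarity(binary, binarize(similar)))
print(cosine_similarity(binary, binarize(different)))
```

For vectors made only of 1 and -1, the cosine similarity depends solely on how many positions disagree, which is why each dimension can be packed into a single bit and compared cheaply.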
@simonw
Simon Willison
3 years
GitHub issues tip: if you paste in a link to an issue or PR in another repo it will display it as a truncated URL, but if you instead add it in a hyphenated bullet point it will display the title of the issue and indicate if it is open or closed
11
324
2K
@simonw
Simon Willison
3 months
AI may enable anyone to produce code, but that's not the same thing as enabling anyone to develop software The typing-out-code bit is one of the least challenging parts of building useful software that solves real problems
@Carnage4Life
Dare Obasanjo🐀
3 months
Jensen Huang, CEO of Nvidia, argues that we should stop saying kids should learn to code. He argues the rise of AI means we can replace programming languages with human language prompts thus enabling everyone to be a programmer. AI will kill coding.
1K
5K
21K
124
165
2K
@simonw
Simon Willison
2 years
Whoa. runs a full Debian VM entirely in your browser via WebAssembly... and it ships with working Perl, Python, Ruby and Node.js!
@leaningtech
Leaning Technologies
2 years
We have made a server-less virtual Linux environment that runs unmodified Debian binaries in the browser. This is powered by CheerpX, a WebAssembly virtualization platform. Feel free to play with it and report bugs:
16
151
488
26
394
2K
@simonw
Simon Willison
1 year
I expect GPT-4 will have a LOT of applications in web scraping The increased 32,000 token limit will be large enough to send it the full DOM of most pages, serialized to HTML - then ask questions to extract data
60
104
1K
@simonw
Simon Willison
2 years
Reddit conversation about using GPT-3 to write your homework. A teacher comments: "Grading something an AI wrote is an incredibly depressing waste of my life."
61
188
1K
@simonw
Simon Willison
4 years
SQL is a better API language than GraphQL. Convince me otherwise!
128
174
1K
@simonw
Simon Willison
2 years
Fascinating HN comment from someone whose company built a custom distributed data warehouse using compressed SQLite DB files in S3 that were queried using Lambda functions orchestrated by PostgreSQL running a custom foreign data wrapper
23
210
1K
@simonw
Simon Willison
1 year
Notes on how I ran Facebook's 7B LLaMA model on my 64GB M2 MacBook Pro using llama.cpp by @ggerganov It's genuinely possible to run a LLM that's hinting towards the performance of GPT3 on your own hardware now! I thought that was still a few years away
41
239
1K
@simonw
Simon Willison
11 months
Understanding GPT tokenizers: I wrote about how the tokenizers used by the various GPT models actually work, including an interactive tool for experimenting with their output
24
256
1K
@simonw
Simon Willison
3 years
A lesson I re-learn on every project: always have an automatically populated "created_at" column on every single database table. Any time you think "I won't need it here" you're guaranteed to want to use it for debugging something a few weeks later.
46
131
1K
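One way to get the automatically populated "created_at" column from the tweet above, sketched with Python's built-in sqlite3 module. The `notes` table and its columns are hypothetical; other databases offer similar column defaults with their own syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE notes (
        id INTEGER PRIMARY KEY,
        body TEXT NOT NULL,
        -- SQLite fills this in (as UTC text) whenever an insert omits it
        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    )
    """
)

# The insert never mentions created_at, yet the row still gets a timestamp.
conn.execute("INSERT INTO notes (body) VALUES (?)", ("first note",))
row = conn.execute("SELECT body, created_at FROM notes").fetchone()
print(row)
```

Putting the default in the schema rather than in application code means every writer gets the timestamp, including one-off scripts and manual inserts.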
@simonw
Simon Willison
2 years
The stuff happening on the Stable Diffusion subreddit right now is pretty wild - since the model can be run by anyone on their own machine if they have a decent GPU
25
250
1K
@simonw
Simon Willison
7 months
Prompt injection comes to GPT-4V
@mn_google
Patel Meet 𝕏
7 months
In GPT-4V Image content can override your prompt and be interpreted as commands.
18
44
247
23
120
1K
@simonw
Simon Willison
2 years
16
60
1K
@simonw
Simon Willison
1 year
We accidentally invented computers that can lie to us and we can't figure out how to make them stop
83
264
1K
@simonw
Simon Willison
3 years
PostgreSQL 14 adds new syntax for accessing JSON data: SELECT * FROM shirts WHERE details['attributes']['color'] = '"neon yellow"' I like this so much more than the -> operators, which stubbornly refuse to stick in my head
@craigkerstiens
Craig Kerstiens
3 years
Y'all this new JSON subscript syntax in Postgres 14 is sweet. Super excited to see Postgres just getting better bit by bit -
7
105
415
26
251
1K
@simonw
Simon Willison
7 months
If you haven't tried Claude yet it's absolutely worth spending time with - I lean on it a lot for working with longer documents, since it can handle 100,000 tokens (GPT-4 is only 8,000) at a time Plus you can upload PDFs to it - I've used it with 100+ page documents
@AnthropicAI
Anthropic
7 months
We’re rolling out access to to more people around the world. Starting today, users in 95 countries can talk to Claude and get help with their professional or day-to-day tasks. You can find the list of supported countries here:
158
362
2K
57
94
1K
@simonw
Simon Willison
3 years
We really need to start teaching web developers how to use links
25
202
1K
@simonw
Simon Willison
1 year
A new post about prompt injection attacks, which I'm increasingly concerned about now that people are hooking LLMs up to external tools through Auto-GPT, ChatGPT Plugins etc
27
250
1K
@simonw
Simon Willison
2 years
Not sure if this is a controversial opinion or not: unit tests should make up a minority segment of your overall automated test suite I'd absolutely take a project with integration tests and no unit tests over one with unit tests but no integration tests
130
121
1K
@simonw
Simon Willison
1 year
I see people being deceived by this again and again: ChatGPT can NOT read content from URLs that you give it, but will convincingly pretend that it can Crucial to spread this message any time you see anyone falling into this trap
78
245
1K
@simonw
Simon Willison
2 years
"GitLab plans to automatically delete projects if they've been inactive for a year and are owned by users of its free tier." Absolutely shocking decision from @gitlab , I very much hope they reconsider this
35
216
1K
@simonw
Simon Willison
3 years
My favourite reactions to this come from people who work in civic tech, because unlike regular corporate gig programmers they're not allowed to just ignore weird edge-cases like this
@cydharrell
Cyd Harrell
3 years
oh no
7
23
258
16
56
1K
@simonw
Simon Willison
1 year
@emnode @tobyordoxford I don't want my search engine to be vengeful
20
13
1K
@simonw
Simon Willison
2 years
Love this idea that the reason voice assistants don't seem to stick for most people is that they're actually command line interfaces, but even less discoverable because they don't provide any visible feedback at all
@edent
Terence Eden is on Mastodon
2 years
@daviddlow @charlesarthur @benedictevans I've droned on endlessly about how you can't expect normal people to use the command line. That's what Alexa is. If you don't say the *precise* invocation correctly, you get an error. And because there's no display, you have to remember dozens of different commands. It's too hard
10
30
306
33
249
1K
@simonw
Simon Willison
5 months
Love that we live in a time where "your software got lazier" is a legit piece of feedback
@ChatGPTapp
ChatGPT
5 months
we've heard all your feedback about GPT4 getting lazier! we haven't updated the model since Nov 11th, and this certainly isn't intentional. model behavior can be unpredictable, and we're looking into fixing it 🫡
894
665
9K
20
82
1K
@simonw
Simon Willison
1 year
If you're a programmer and you're still thinking that all of this ChatGPT stuff is a waste of your time, I strongly suggest reviewing this example It's over-hyped, sure - but it's not something anyone in our profession should continue to ignore
@simonw
Simon Willison
1 year
This entire benchmarking project took just three prompts
10
32
352
37
127
1K
@simonw
Simon Willison
5 years
Sentences with the word "just" in them always work better if you drop that word. "Why don't you just add caching?" - that one word implies "I don't value your expertise or expect you to have thought this through" "Why don't you add caching?" - now we can have a conversation.
47
226
1K
@simonw
Simon Willison
2 months
Just got ChatGPT Code Interpreter to write me a SQLite extension in C from scratch, then compile it with GCC, then load it into Python and test it out, then iterate on it to fix the bugs All on my phone while pottering around the house
29
72
1K
@simonw
Simon Willison
4 years
Made myself a self-updating GitHub personal README! It uses a GitHub Action to update itself with my latest GitHub releases, blog entries and TILs
22
134
1K
@simonw
Simon Willison
6 months
Looks like the reason that letter only had 500 signatures out of 770 might be that the rest of the company were asleep
@lilianweng
Lilian Weng
6 months
About 650 / 770 signed at this moment. As people start waking up, more will come. All the efforts started after 1:30 AM, 500+ within two hours and all of this after 2 crazy days with very little sleep.
161
605
5K
16
54
983
@simonw
Simon Willison
6 months
Sigh. Tip for if you're planning on suing an AI company: asking a model if something is included in its training data is not a reliable method for telling what is in its training data
@minimaxir
Max Woolf
6 months
ಠ_ಠ
39
14
287
27
49
974
@simonw
Simon Willison
2 months
TIL Google Chrome has a --headless option you can use to take a screenshot from the CLI that's built into the default installation
21
94
965
@simonw
Simon Willison
6 years
I just released datasette - a new tool for turning any SQLite database into a web interface and JSON API:
22
308
946
@simonw
Simon Willison
6 years
Maybe the solution to the Fermi paradox is that significantly advanced civilizations discover crypto currencies and then furiously burn through all available energy sources until they go extinct
@EricHolthaus
Eric Holthaus
6 years
Uhhh... about bitcoin... it's actually ruining the planet. The bitcoin computer network currently uses as much electricity as Denmark. In 18 months, it will use as much as the entire United States. Something's gotta give. This simply can’t continue.
732
11K
16K
29
484
935
@simonw
Simon Willison
10 months
Has anyone managed to run Llama 2 GPU accelerated on an M1/M2 Mac yet? Bonus points if you can provide extremely meticulous step-by-step instructions for replicating what you did!
26
55
955
@simonw
Simon Willison
2 years
@moorehn @dancow @sewellchan @guardian I'm fascinated by their use of the term "build" - they talk about building a lot, took me a while to realize that their version of building is funneling money into speculative investments and convincing others to do the same
20
28
930
@simonw
Simon Willison
24 days
What's the cheapest option right now for me to spin up a Linux server somewhere for an hour with enough GPU to run the latest Mixtral model?
111
63
952
@simonw
Simon Willison
3 years
One of the most obvious flaws in using blockchains for anything involving regular human beings is one I've not seen much discussion of: Regular human beings cannot protect their passwords, credentials or private keys. They just can't.
52
155
921
@simonw
Simon Willison
2 years
Weird, GitHub have deprecated one of their GraphQL APIs in favor of a REST one
32
129
905
@simonw
Simon Willison
7 months
Embeddings: What they are and why they matter I took my recent PyBay talk and turned it into the most comprehensive answer I could possibly provide to the question "What are embeddings?"
21
133
894
@simonw
Simon Willison
2 years
I built a new tool: s3-ocr, a utility for running OCR (with Amazon Textract) against every PDF file in a S3 bucket and getting the results back as a searchable SQLite database
22
113
848
@simonw
Simon Willison
2 years
My own answer: I either open the CSV directly in the Datasette Desktop Mac application () or I do this:
sqlite-utils insert /tmp/data.db rows big.csv --csv
datasette /tmp/data.db
That gives me a table called "rows" in a fresh SQLite database
13
41
849
@simonw
Simon Willison
4 years
This video is a great example of why I just can't agree with historians who are pushing back at efforts to use AI to colorize and up-rez historic videos - it's honestly like a magic portal window back to 124 years ago
14
144
829
@simonw
Simon Willison
1 year
Notes on how I turned an hour-long video of a digital thermometer into temperature readings over time by using ffmpeg to split the video into a frame every 10s and Google Cloud Vision to run the OCR I got ChatGPT/GPT-4 to show me how to use both of those:
27
87
840
@simonw
Simon Willison
1 year
This is the thing I find most interesting about ChatGPT as a learning assistant: it turns out having a teacher that's mostly right but occasionally very wrong makes me think much more critically about what I'm learning, which I think is helping me build more robust mental models
@random_walker
Arvind Narayanan
1 year
When ChatGPT came out I thought I wouldn't use it for learning because of its tendency to slip in some BS among 10 helpful explanations. Then I tried it, and found that it forces me to think critically about every sentence, which is the most effective mindset for learning.
40
189
2K
43
100
825
@simonw
Simon Willison
6 years
Couldn't agree with this more: I Google the most trivial code things in languages that I've used for over a decade dozens of times a day. The amount of detail you actually need to commit to memory in 2018 keeps getting smaller - remember what CAN be done, not exactly how to do it
@patio11
Patrick McKenzie
6 years
Periodic observation for the benefit of junior developers: You do not have to be embarrassed about not knowing a particular bit of syntax or API. Googling things efficiently is a core job skill. ~15 years in I'll still look up "append to array javascript."
70
766
3K
20
303
817
@simonw
Simon Willison
3 years
Amusingly I first learned this when someone sent us an encrypted PDF where the password was our zip code, and what I thought was the zip code didn't work so I ran a brute force password cracker against it
5
25
818
@simonw
Simon Willison
2 years
This is a crazy cool hack: Crunchy Data have a new PostgreSQL tutorial series which runs a full PostgreSQL server compiled to WebAssembly entirely in your web browser so you can try things out!
@craigkerstiens
Craig Kerstiens
2 years
And here it is, come join us @crunchydata and learn some Postgres at our playground -
19
75
293
10
192
822
@simonw
Simon Willison
18 days
Realized I've reached the point now with GitHub Copilot autocomplete where I can often guess exactly what it's going to suggest, pause for a moment to wait for it to make the suggestion and accept it and move on It's mostly a typing assistant now and I really like it for that
42
30
822
@simonw
Simon Willison
1 year
Good thing my blog is behind Cloudflare
@elonmusk
Elon Musk
1 year
Might need a bit more polish …
4K
4K
40K
34
16
804
@simonw
Simon Willison
6 months
One thing I missed from yesterday: the cost of storing data in an OpenAI assistant for retrieval question answering etc is VERY steep: $0.20/GB/assistant/day Compare to S3 standard pricing which is about $0.023 per GB per month, so I think ChatGPT assistant storage is 260x more!
39
62
809
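The "260x" figure in the tweet above can be checked with a few lines of arithmetic. The two prices come from the tweet; the 30-day month is an assumption made here to put both prices on the same per-month basis.

```python
ASSISTANT_USD_PER_GB_DAY = 0.20   # OpenAI assistant storage price quoted in the tweet
S3_USD_PER_GB_MONTH = 0.023       # S3 standard price quoted in the tweet
DAYS_PER_MONTH = 30               # assumption: a 30-day month

# Convert the per-day price to a per-month price, then compare.
assistant_usd_per_gb_month = ASSISTANT_USD_PER_GB_DAY * DAYS_PER_MONTH
ratio = assistant_usd_per_gb_month / S3_USD_PER_GB_MONTH

print(f"Assistant storage: ${assistant_usd_per_gb_month:.2f} per GB per month")
print(f"Roughly {ratio:.0f}x the S3 price")
```

The result lands within rounding distance of the tweet's estimate; the tweet also says "per assistant", so the gap would widen if the same data were attached to several assistants.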
@simonw
Simon Willison
4 months
What are some LLM-driven products that you use at least once a week that aren't ChatGPT, Bard, Bing or GitHub Copilot - things that are built on LLMs but aren't direct chat interfaces to LLMs
206
47
793
@simonw
Simon Willison
2 years
A tip for writing more: expand your definition of completing a project (any project, no matter how small) to include writing a blog post (or README or similar) that explains that project
23
115
784
@simonw
Simon Willison
3 years
This is fascinating: as-of 2017 university instructors have been increasingly encountering students who have absolutely no idea how files and folders on a computer work
@dancow
Dan Nguyen
3 years
omg turns out I wasn't completely bullshitting all the times when I fussed to students that "file not found" would be their most common and soul-destroying "bug" when trying to learn programming
26
138
587
43
247
785
@simonw
Simon Willison
3 years
Also fun: Instacart refuse to deliver to one of the zip codes, but we tried the other one and successfully placed an order
8
9
780
@simonw
Simon Willison
2 years
A public librarian on why two-factor authentication is a huge barrier to people with limited technology access
@heyakwan
Aydin Kwan
2 years
The assumption that people will have consistent access to the same mobile number simply isn’t true for a lot of people. Phones cost money. So do phone plans. Phones break or get stolen. We all know this. Yet people who don’t have phones still need access to their email accounts.
20
269
2K
10
341
777
@simonw
Simon Willison
7 months
New release of llm-gpt4all - it's now really easy to run models like mistral-7b-instruct-v0 on your own machine:
pip install llm
llm install llm-gpt4all
llm -m mistral-7b-instruct-v0 "ten facts about pelicans"
22
110
782
@simonw
Simon Willison
2 months
@AdamRackis Engineers who understand web fundamentals are vastly more useful and productive than engineers who can only work with whatever the latest popular framework is
27
15
780
@simonw
Simon Willison
6 months
I had not realized quite how easy voice cloning is now. I used the free demo on just now (also used for this Attenborough demo) and got a passable imitation of my own voice on first try, from a poor quality 10m audio sample I had lying around already
@charliebholtz
Charlie Holtz
6 months
David Attenborough is now narrating my life Here's a GPT-4-vision + @elevenlabsio python script so you can star in your own Planet Earth:
728
5K
27K
18
57
777
@simonw
Simon Willison
9 months
The video for my North Bay Python talk is out, and I've put together an accompanying edited transcript with annotated slides and links If you haven't been completely immersed in this world for the last year, my hope is this can help catch you up!
20
154
772
@simonw
Simon Willison
6 months
This is really worth your time - a very solid technical introduction to LLMs, great if you've not been paying close attention but I picked up quite a few useful details from it too
@karpathy
Andrej Karpathy
6 months
New YouTube video: 1hr general-audience introduction to Large Language Models Based on a 30min talk I gave recently; It tries to be non-technical intro, covers mental models for LLM inference, training, finetuning, the emerging LLM OS and LLM Security.
585
3K
18K
16
62
767
@simonw
Simon Willison
5 years
@PitHoyaFan88 @drvox Absolutely this - the mythology of Person of Interest kept getting deeper and more interesting and they nailed the ending. Recent developments in AI have made it even more relevant - the tech it imagined feels more and more realistic every year
7
39
732
@simonw
Simon Willison
7 months
Frustrating limitation from Claude by @AnthropicAI - I uploaded a 30 page PDF and asked it "What are the key findings in this paper? Illustrate with quotes" And it refused! "I cannot provide lengthy excerpts or summaries from the copyrighted material you shared."
62
52
754
@simonw
Simon Willison
1 year
Some wild speculation here, but I think it might be possible to train a LLaMA 7B sized model for $85,000 now, and maybe run that model directly in your web browser - with more capabilities than ChatGPT, through hooking up extra tools to it (like Bing)
25
101
751
@simonw
Simon Willison
3 years
"Cryptocurrency is one of the worst inventions of the 21st century. I am ashamed to share an industry with this exploitative grift. It has failed to be a useful currency, invented a new class of internet abuse, further enriched the rich, wasted staggering amounts of electricity,
@Rich_Harris
Rich Harris
3 years
"If not for cryptocurrency, these services would still be available."
29
336
911
12
311
735
@simonw
Simon Willison
27 days
Have any of the large scale LLM training organizations - Anthropic, OpenAI, Gemini, Mistral, the Llama team - published anything notable about this idea of "model collapse" yet - the worry that LLM quality will drop as their training data becomes polluted by model output?
68
62
744
@simonw
Simon Willison
2 years
DALL-E mini has renamed itself to Craiyon, in response to a request from OpenAI. You can use it for free (with no waiting list) at
15
166
718
@simonw
Simon Willison
2 months
Well this @OpenAI refusal is new and I don't like it at all
86
36
711
@simonw
Simon Willison
2 years
Want to know the secret to blogging more often? Lower your standards! A post which you don't think is ready yet is a LOT better than a giant folder full of drafts that no-one ever gets to see (Your readers won't ever know how good the thing you wanted to write would have been)
33
94
703
@simonw
Simon Willison
2 years
I'm finding Mastodon a lot easier to understand having realized that it's just blogs. Everyone gets a little blog, on their own server or someone else's. Following someone is pretty much subscribing to their feed. You can even roll your own implementation from scratch.
17
144
697
@simonw
Simon Willison
1 year
Made some quick notes on how to use the OpenAI Python library to make ChatGPT API calls and stream out the response tokens as they arrive
10
65
688
@simonw
Simon Willison
1 year
Got access to Google Bard! I'm pleased to report that it is ethically opposed to necromancy
37
76
684
@simonw
Simon Willison
9 months
This talk is the most complete version of my thinking around Large Language Models to date, including thoughts on personal AI ethics, practical applications and the enormous impact we are already feeling from Llama 2 Video, slides, transcript & links:
10
115
685
@simonw
Simon Willison
7 years
Accidentally deleted my Python source code but it was still resident in a running process - here's how I got it back
13
415
668
@simonw
Simon Willison
4 years
Anyone else getting tired of meticulously erasing ?s=21 several times a day?
20
31
675
@simonw
Simon Willison
5 months
This Q story is deeply concerning - if it's true that Q has access to private data like the location of AWS data centers that would suggest the team working on it have not been taking things like prompt injection attacks seriously at all
@emollick
Ethan Mollick
5 months
I know I say it a lot, but using LLMs to build customer service bots with RAG access to your data is not the low-hanging fruit it seems to be. It is, in fact, right in the weak spot of current LLMs - you risk both hallucinations & data exfiltration.
47
250
2K
21
77
680