Brendan Bycroft

@BrendanBycroft

3,319
Followers
543
Following
27
Media
77
Statuses

kiwi, on a random walk. LLM Viz --

New Zealand
Joined June 2014
@BrendanBycroft
Brendan Bycroft
6 months
Project #2: LLM Visualization
So I created a web-page to visualize a small LLM, of the sort that's behind ChatGPT. Rendered in 3D, it shows all the steps to run a single token inference. (link in bio)
114
1K
6K
@BrendanBycroft
Brendan Bycroft
6 months
Here's a technical guide on how I wrote & structured some of the LLM Visualization. A few people have been asking about it, so I thought I'd write it up. Lots of code screenshots, so not for everyone.
15
56
443
@BrendanBycroft
Brendan Bycroft
6 months
It also contains a walkthrough/guide of the steps, as well as a few interactive elements to play with. Why, you ask? For what purpose did I put all the time & effort into this project?
5
13
214
@BrendanBycroft
Brendan Bycroft
6 months
With this, you can see the whole thing at once. You can see where the computation takes place, its complexity, and relative sizes of the tensors & weights.
1
18
187
@BrendanBycroft
Brendan Bycroft
6 months
Oh yeah, the link is here: Works best on desktop (sorry mobile). Left-click drag, right-click rotate, scroll to zoom. And hover over the tensor cells. Blue cells are weights/parameters, green cells are intermediate values. Each cell is a single number!
Tweet media one
Tweet media two
Tweet media three
3
20
171
@BrendanBycroft
Brendan Bycroft
6 months
The model with all the animations is tiny, to make it tractable. For comparison, I threw in a few of the larger models (GPT-2, GPT-3), render-only. And when you see what it takes to just produce a single value in a mat-mul, the sheer scale of these things becomes apparent.
4
10
130
@BrendanBycroft
Brendan Bycroft
6 months
There's a real advantage to unpacking a set of abstractions, flattening them out. Abstractions can be useful for terseness and management, but they can be a real blocker to seeing the big picture.
3
5
113
@BrendanBycroft
Brendan Bycroft
6 months
(Here's what goes into calculating a _single_ output value of a matrix-multiply)
2
7
99
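To make the "single output value" point concrete, here is a minimal sketch in TypeScript. It is not taken from the project; `matmulCell` and the sample matrices are hypothetical names chosen for illustration. One output cell is the dot product of one row of the left matrix with one column of the right matrix, so it costs one multiply-add per element of the shared dimension.

```typescript
// Hypothetical helper: computes ONE cell of C = A x B, where A is [m x k] and B is [k x n].
// Matrices are plain nested arrays here purely for readability.
function matmulCell(a: number[][], b: number[][], row: number, col: number): number {
  const k = b.length; // shared inner dimension
  let acc = 0;
  for (let i = 0; i < k; i++) {
    acc += a[row][i] * b[i][col]; // one multiply-add per step
  }
  return acc;
}

const matA = [
  [1, 2, 3],
  [4, 5, 6],
]; // 2 x 3
const matB = [
  [7, 8],
  [9, 10],
  [11, 12],
]; // 3 x 2

const cell01 = matmulCell(matA, matB, 0, 1); // 1*8 + 2*10 + 3*12 = 64
```

With an inner dimension in the thousands, each output cell needs thousands of multiply-adds, and a full mat-mul repeats that for every cell.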
@BrendanBycroft
Brendan Bycroft
6 months
Well, I hope you find it interesting. Let me know your thoughts! And if someone makes it through the walkthrough and finds it a little ~incomplete towards the end, I might even get around to fixing it (my attention has largely turned to other projects, oops)
16
4
82
@BrendanBycroft
Brendan Bycroft
6 months
@Algomancer Yup, it's all here: and the LLM code is under /src/llm (oops a bit of a scope change) But things that are more useful are 1) motivation & 2) reach/visibility. I've got another project coming up, which is, uhh, probably more ambitious
5
3
78
@BrendanBycroft
Brendan Bycroft
6 months
What about understanding what each layer does? Uhh, sorry, won't be much help. The project just came out of "Let's build a 3D viz!", so the scope is a bit limited. It's more: here's a way to learn & digest the algorithm, and perhaps think about how to optimize the process.
1
3
70
@BrendanBycroft
Brendan Bycroft
6 months
I also learnt a good amount of GL (dF/dx, fwidth, ubos, instancing), and animation approaches. So, uhh, even if no-one sees this, the project definitely has some value to me.
3
3
61
@BrendanBycroft
Brendan Bycroft
6 months
As for what I got out of creating this: before I made it, I mostly knew how image convolution nets worked, but language-based models seemed kinda magical in comparison. Well, now I know them in a fair amount of detail!
1
2
56
@BrendanBycroft
Brendan Bycroft
6 months
That's certainly been quite the response! 200k site visits, 1M X views, #1 on HN for a day or so, and plenty of very positive feedback. Here's a quick doc I put together with my sister a couple months ago, "setting expectations" I guess.
Tweet media one
2
1
29
@BrendanBycroft
Brendan Bycroft
6 months
Some quick notes:
* took me maybe 200 hours
* written in Typescript with next.js, react for anything DOM related
* all the 3D stuff is written directly against WEBGL2
* GPT algo itself run in WASM, written in Odin (nice lang @TheGingerBill !)
* here are my client-side deps:
Tweet media one
3
1
22
@BrendanBycroft
Brendan Bycroft
6 months
Short follow-up thread:
1
0
18
@BrendanBycroft
Brendan Bycroft
1 year
it's happening
twitter is dying, a death of a thousand cuts (missing padding on android)
Tweet media one
0
1
6
@BrendanBycroft
Brendan Bycroft
4 years
Project: Robotic arm #1
I'm in the process of designing/building a little robotic arm. I'm using little servos + laser-cut wood as the base materials (it's what I have). The purpose/utility? Hmm haven't put much thought into that. Not important tbh.
Tweet media one
Tweet media two
1
0
15
@BrendanBycroft
Brendan Bycroft
6 months
All this code is open-source on my github, with the bulk of it under /src/llm/
2
1
18
@BrendanBycroft
Brendan Bycroft
6 months
The actual eval & population of the green-block values is done in WASM, written in Odin. It runs at init in a few ms (not optimized, eg uses naive matmul). We then pass that data to the GPU in texture-maps. The intro animation of the entire process is all done after-the-fact.
Tweet media one
1
2
17
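As a rough illustration of what "naive matmul" means here, the sketch below uses TypeScript rather than Odin/WASM, so it is a stand-in and not the project's code. `naiveMatmul` and the buffer names are assumptions. It works on flat, row-major `Float32Array` buffers, which is the sort of layout that could then be handed to the GPU as a float texture.

```typescript
// Naive triple-loop mat-mul over flat, row-major float32 buffers.
// a is [m x k], b is [k x n]; the result is [m x n].
function naiveMatmul(
  a: Float32Array,
  b: Float32Array,
  m: number,
  k: number,
  n: number
): Float32Array {
  if (a.length !== m * k || b.length !== k * n) {
    throw new Error("buffer sizes do not match the given dimensions");
  }
  const c = new Float32Array(m * n);
  for (let row = 0; row < m; row++) {
    for (let col = 0; col < n; col++) {
      let acc = 0;
      for (let i = 0; i < k; i++) {
        acc += a[row * k + i] * b[i * n + col];
      }
      c[row * n + col] = acc;
    }
  }
  return c;
}

const bufA = new Float32Array([1, 2, 3, 4, 5, 6]); // 2 x 3
const bufB = new Float32Array([7, 8, 9, 10, 11, 12]); // 3 x 2
const bufC = naiveMatmul(bufA, bufB, 2, 3, 2); // 2 x 2: [58, 64, 139, 154]
```

No tiling or blocking is attempted. For a model small enough to animate, that is plausibly why the unoptimized version is still fast enough to run once at init.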
@BrendanBycroft
Brendan Bycroft
8 months
It's doing its best😭😭😭
Tweet media one
0
2
11
@BrendanBycroft
Brendan Bycroft
6 months
@cunha_tristan In a proper impl, you have all steps running on the GPU (CPU<-->GPU bandwidth is super low). And then the ops themselves are dominated by the dot products in the mat-muls, i.e. a series of multiply-adds. Getting mat-muls to run fast is a bit of an art, to hide high vram latency.
0
1
12
@BrendanBycroft
Brendan Bycroft
6 months
But when we have the blocks hovered, and a row/column/cell is highlighted (+ other animation effects), we split the blocks into sub-blocks (splitGrid). That way they can take unique color/opacity as needed. With a bit of careful maths, the texture-map lookups remain consistent.
Tweet media one
Tweet media two
Tweet media three
1
0
12
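The splitting idea can be sketched in one dimension. This is a simplified, hypothetical version and not the project's `splitGrid`: `SubBlock`, `splitAt` and `textureIndex` are names invented for this example. Each piece remembers its offset into the original block, so a texture lookup for a given cell gives the same index before and after the split.

```typescript
// Hypothetical, simplified block: a 1D run of cells along one axis.
interface SubBlock {
  cellStart: number; // index of this piece's first cell within the original block
  cellCount: number; // how many cells it spans
  opacity: number; // per-piece styling, so a highlighted slice can stand out
}

// Split `block` so that cell `index` becomes its own piece.
// Returns up to three pieces: before, the highlighted cell, after.
function splitAt(block: SubBlock, index: number): SubBlock[] {
  const local = index - block.cellStart;
  if (local < 0 || local >= block.cellCount) return [block]; // index not in this block
  const result: SubBlock[] = [];
  if (local > 0) result.push({ ...block, cellCount: local });
  result.push({ ...block, cellStart: index, cellCount: 1 });
  const rest = block.cellCount - local - 1;
  if (rest > 0) result.push({ ...block, cellStart: index + 1, cellCount: rest });
  return result;
}

// Lookup index for the i-th cell of a piece: the stored offset keeps it
// identical to the index the unsplit block would have used.
function textureIndex(sub: SubBlock, i: number): number {
  return sub.cellStart + i;
}

const whole: SubBlock = { cellStart: 0, cellCount: 8, opacity: 1 };
// Dim everything except the hovered cell (index 3).
const pieces = splitAt(whole, 3).map((p) =>
  p.cellStart === 3 ? p : { ...p, opacity: 0.3 }
);
```

A real 3D block would split along x, y or z and also adjust its geometry, but the bookkeeping is the same: carry the offset so the shader reads the same data.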
@BrendanBycroft
Brendan Bycroft
6 months
So this was a really fun project, engineering-wise. Lots of experimenting with new techniques & approaches. Naturally this was 10x harder than I'd planned. And I've got a new, unrelated one on the go, which is probably more ambitious, oops.
1
0
11
@BrendanBycroft
Brendan Bycroft
3 years
The base was a lot easier & more satisfying to put together 😌 A bit small and light to keep the robot upright, but that's a future problem
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
0
10
@BrendanBycroft
Brendan Bycroft
3 years
Next up was the gripper assembly. Probably the most complex part, but it went together nicely with only a few minor hacks
Tweet media one
Tweet media two
Tweet media three
0
0
9
@BrendanBycroft
Brendan Bycroft
6 months
Block rendering: each block is a single 3D, six-sided cube (two tris per side). Drawing the individual cell value circles is done in-shader, querying float32 texture-maps. So the circles & the white grids are all done per-pixel in the fragment shader.
Tweet media one
1
0
9
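The per-pixel logic can be imitated on the CPU. The sketch below is a TypeScript analogue of the kind of decision a fragment shader could make, not the project's GLSL. `shadePixel`, `cellValue` and the `radius` default are assumptions. Given a position on a block face, it works out which cell the pixel falls in, whether the pixel lies inside that cell's circle, and which value to read from a flat float32 buffer standing in for the texture.

```typescript
// (u, v) are coordinates across one block face in [0, 1); cols x rows is the cell grid.
interface CellHit {
  col: number;
  row: number;
  insideCircle: boolean;
}

function shadePixel(u: number, v: number, cols: number, rows: number, radius = 0.4): CellHit {
  const x = u * cols;
  const y = v * rows;
  const col = Math.floor(x);
  const row = Math.floor(y);
  // Position within the cell, re-centred so (0, 0) is the cell centre.
  const dx = x - col - 0.5;
  const dy = y - row - 0.5;
  return { col, row, insideCircle: dx * dx + dy * dy <= radius * radius };
}

// Read the cell's value from a flat, row-major float32 buffer (stand-in for a texture).
function cellValue(data: Float32Array, cols: number, hit: CellHit): number {
  return data[hit.row * cols + hit.col];
}

const gridData = new Float32Array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6]); // 3 cols x 2 rows
const centreHit = shadePixel(0.5, 0.75, 3, 2); // centre of col 1, row 1
const cornerHit = shadePixel(0.34, 0.51, 3, 2); // near a cell corner, outside the circle
const centreValue = cellValue(gridData, 3, centreHit);
```

Because the circles come from arithmetic on the pixel position, the geometry stays at twelve triangles per block regardless of how many cells it holds.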
@BrendanBycroft
Brendan Bycroft
4 years
first twete just gonna post into the ether for now and that's ok
1
0
8
@BrendanBycroft
Brendan Bycroft
10 months
@gptbrooke 28 tweets
65.9k likes
Someday I might even out that ratio, idk
1
0
9
@BrendanBycroft
Brendan Bycroft
6 months
A technical guide to how I built it:
0
1
9
@BrendanBycroft
Brendan Bycroft
6 months
Of note, the block generation for the small LLM is run for every frame, and the code looks like this. Very terse, but there are ~50 unique blocks. There's info for how to look-up on-GPU texture-maps for the cell data, and the block's dep structure (for hover & anim).
Tweet media one
Tweet media two
1
0
9
@BrendanBycroft
Brendan Bycroft
6 months
A few acknowledgements: @karpathy 's mingpt repo was vital to ensuring my GPT impl. was working correctly (plus his YT vids are great), @telmudic for that first QT that got me off the ground, and my sister for getting me to actually have a plan & set a release deadline.
1
0
6
@BrendanBycroft
Brendan Bycroft
4 months
@finbarrtimbers I figured it out by, uhh, fully implementing one. And then only understood that Q/K/V came from hashtables/dicts jargon like 90% of the way through
0
0
7
@BrendanBycroft
Brendan Bycroft
6 months
Maybe I'll get around to animating those last few pages, or fixing mobile, but no promises. I'll write a thread or two on how I actually wrote the app, because that's interesting to me.
1
0
6
@BrendanBycroft
Brendan Bycroft
6 months
Here's some walkthrough code samples. First we have the commentary strings (template-strings handy), interleaved with these t_<name> variables. These provide the sequencing for the animations. Below that, we have the logic for animating the blocks, making direct changes to them.
Tweet media one
Tweet media two
1
0
7
@BrendanBycroft
Brendan Bycroft
4 years
Today I redesigned the carriage between the two base servos, since the old one had a few fit issues (nasty interference; broken bits). This one's way better. I do the layout/design in Inkscape because it's familiar. Then the construction looks something like this:
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
0
7
@BrendanBycroft
Brendan Bycroft
4 years
I still messed up though. The resulting distance between the pivot points (red) was about 1.5mm too short for the lower arm. So need to adjust in the designs (blue). A pain to start over, so fixed with a wooden spacer for a nice snug fit.
Tweet media one
Tweet media two
Tweet media three
1
0
7
@BrendanBycroft
Brendan Bycroft
6 months
Most of the logic is executed within a requestAnimationFrame (rAF) loop, within the ~top-level runProgram() function. The IProgramState has all the state hanging off of it, some cross-frame, and some generated per-frame. This JS logic all takes ~5ms.
Tweet media one
Tweet media two
2
0
7
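A loop of this shape might look like the following sketch. `ProgramState`, `createState` and `runFrame` are hypothetical stand-ins, not the project's `IProgramState` or `runProgram()`. The state object mixes fields that persist across frames with fields that are rebuilt on every frame, and the per-frame function is kept separate from the scheduler so it can run outside a browser.

```typescript
// Hypothetical state shape: some fields persist across frames, others are rebuilt each frame.
interface ProgramState {
  lastTimeMs: number | null; // cross-frame
  elapsedMs: number; // cross-frame
  frameBlocks: string[]; // per-frame: regenerated on every call
}

function createState(): ProgramState {
  return { lastTimeMs: null, elapsedMs: 0, frameBlocks: [] };
}

// One iteration of the loop: update timing, rebuild per-frame data, then (in a real app) render.
function runFrame(state: ProgramState, nowMs: number): void {
  const dt = state.lastTimeMs === null ? 0 : nowMs - state.lastTimeMs;
  state.lastTimeMs = nowMs;
  state.elapsedMs += dt;
  // Placeholder for per-frame block generation.
  state.frameBlocks = ["embedding", "attention", "mlp"];
}

// In a browser this would be driven by requestAnimationFrame, e.g.:
//   const state = createState();
//   const loop = (now: number) => { runFrame(state, now); requestAnimationFrame(loop); };
//   requestAnimationFrame(loop);
const demoState = createState();
runFrame(demoState, 1000);
runFrame(demoState, 1016);
```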
@BrendanBycroft
Brendan Bycroft
6 months
We have a per-chapter value t, and when we create each t_<name> variable, it figures out its local t value from that, which ranges from 0 -> 1. These 0 -> 1 values are then used to drive the animations, typically via lerp functions.
1
0
6
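One way to get local 0 -> 1 values out of a single chapter-level t is sketched below. `createSequencer` and `segment` are hypothetical names and the durations are made up; the real t_<name> variables may be built differently. Each segment claims the next slice of the chapter timeline and reports how far through that slice the chapter currently is.

```typescript
function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

function lerp(a: number, b: number, t: number): number {
  return a + (b - a) * t;
}

// Hypothetical sequencer: each call to `segment` claims the next `duration` units of the
// chapter timeline and returns that segment's local t in [0, 1].
function createSequencer(chapterT: number) {
  let cursor = 0;
  return {
    segment(duration: number): number {
      const start = cursor;
      cursor += duration;
      return clamp01((chapterT - start) / duration);
    },
  };
}

// At chapter time 2.5: the first segment is finished, the second is halfway,
// and the third has not started.
const seq = createSequencer(2.5);
const tFadeIn = seq.segment(2); // 1
const tSlide = seq.segment(1); // 0.5
const tHighlight = seq.segment(3); // 0

const blockX = lerp(0, 100, tSlide); // 50
```

Declaring the segments in order is what gives the sequencing: reordering the `segment` calls reorders the animation, with no explicit start times to maintain.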
@BrendanBycroft
Brendan Bycroft
6 months
I did a bunch of other mini-projects in this, for the fun (I'll spare details).
* The text layout code for the hover tooltips (2D; could have been DOM?)
* Constructing & rendering the ribbons with beziers
* A mimalloc-inspired memory allocator
* The ToC diagram highlighting
1
0
6
@BrendanBycroft
Brendan Bycroft
6 months
Original thread here:
0
0
5
@BrendanBycroft
Brendan Bycroft
6 months
@joebarnard You should give "Theft of Fire" by Devon Eriksen a go. A great read, with tight physics, I got through it in 2 sittings - was riveting
0
2
6
@BrendanBycroft
Brendan Bycroft
2 years
@__frye serious answer from youtube-for-amateurs: try to dig it up from other parts of your road or maybe other projects have a pile of surplus somewhere
1
0
4
@BrendanBycroft
Brendan Bycroft
6 months
Oh yeah, I threw in a Stripe tip-jar on my website (to be clear, this is just a hobby project & I have a full-time job). P.S. @stripe , @patio11 , your onboarding process is smooth as silk.
Tweet media one
2
0
5
@BrendanBycroft
Brendan Bycroft
4 years
gonna try playing this game, we'll see how it goes
@visakanv
Visakan Veerasamy
5 years
twitter_rpg_strategy_guide.txt
26
175
1K
0
0
3
@BrendanBycroft
Brendan Bycroft
6 months
@ccreikey @Algomancer Oh yeah, I wrote the actual GPT algorithm in Odin that runs in WASM. Just did naive mat-mul though. Originally wrote a webgl-based GPT evaluator, but perf was terrible for some reason (odin=2ms, webgl=500ms!?!?).
1
0
3
@BrendanBycroft
Brendan Bycroft
2 years
@goblincodes first thing is to check the dev tools css styles for that text: check the computed values & the style inheritances for differences
also check chrome & firefox to rule out per-domain browser overrides like full page zoom
2
0
3
@BrendanBycroft
Brendan Bycroft
6 months
@InFuzeLLC Thanks! Yeah, I might do a thread soon on how I did some of the architecture, particularly around the animations
1
0
4
@BrendanBycroft
Brendan Bycroft
4 months
@dmvaldman yeah lol, and I gotta finish that thread. I mean I got it working, but it's nothing spectacular
1
0
3
@BrendanBycroft
Brendan Bycroft
2 years
@goblincodes when the css gets minified, there's a bug where they clamp opacity values to [0, 1], which breaks for percentages. fix is to use 0.2 instead of 20% etc
i ran into that issue like 2 years ago oh no
1
0
3
@BrendanBycroft
Brendan Bycroft
2 years
@dhomochameleon there might be confusion about what the pattern is since on a phone it's easy to get cross-eyed across more than one of the pattern repeats
the main one looks like a lion 2 me, but only when I'm gently cross-eyed. the 2nd/3rd ones have extra pop-out layers & don't make sense
2
0
2
@BrendanBycroft
Brendan Bycroft
9 months
@acidshill It's so much fun
Tweet media one
0
0
3
@BrendanBycroft
Brendan Bycroft
2 months
@MorlockP 5. It seems like Tesla are really pulling on that late-structural-combination thread for mfg efficiency. It's risky of course: modern unibody design is a bit of a dark art, so a lot of care & skill has to go into keeping structural integrity, crash behavior, etc. up to scratch
0
0
2
@BrendanBycroft
Brendan Bycroft
2 years
@tszzl good enough for today i think
Tweet media one
0
0
2
@BrendanBycroft
Brendan Bycroft
2 months
@tautologer Seems reasonable, but +1 for redis pub/sub. Big thing with SSE is the max 6 conns per [domain+browser], so if you have a bunch of tabs open you run into trouble. Can use visibilitychange to deal with that, though
1
0
1
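The visibilitychange workaround could be sketched like this. `createVisibilityGate` is a hypothetical helper and `"/events"` is a placeholder URL. The connection factory is injected so the logic can run outside a browser; the commented wiring uses the standard `EventSource`, `document.visibilityState` and `visibilitychange` browser APIs.

```typescript
// Minimal interface for anything that can be closed (EventSource satisfies this).
interface Closeable {
  close(): void;
}

// Hypothetical helper: keeps a connection open only while the tab is visible,
// so background tabs do not hold on to one of the limited connections.
function createVisibilityGate(connect: () => Closeable) {
  let conn: Closeable | null = null;
  return {
    update(visibility: "visible" | "hidden"): void {
      if (visibility === "visible" && conn === null) {
        conn = connect();
      } else if (visibility === "hidden" && conn !== null) {
        conn.close();
        conn = null;
      }
    },
    isOpen(): boolean {
      return conn !== null;
    },
  };
}

// Browser wiring (not executed here; "/events" is a placeholder URL):
//   const sseGate = createVisibilityGate(() => new EventSource("/events"));
//   const sync = () =>
//     sseGate.update(document.visibilityState === "visible" ? "visible" : "hidden");
//   document.addEventListener("visibilitychange", sync);
//   sync();

// Stand-in connection so the sketch runs outside a browser.
let openCount = 0;
let closeCount = 0;
const gate = createVisibilityGate(() => {
  openCount++;
  return { close: () => { closeCount++; } };
});
gate.update("visible"); // opens
gate.update("hidden"); // closes
gate.update("visible"); // reopens
```

Events sent while a tab is hidden are missed with this approach, so a real client would probably need to refetch state when it reconnects.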
@BrendanBycroft
Brendan Bycroft
6 months
@NakramR Cheers! Yeah it's a bit unfinished towards the end (haven't animated much). But with the amount of traction it's getting... worth tidying up I think
1
0
2
@BrendanBycroft
Brendan Bycroft
6 months
@pepijndevos @MuzafferKal_ Yeah so will I. Esp with context window & the new tricks that came in this year. Also it's worth noting the incremental cost of processing a new token is limited to just one of the columns in the embedding (provided the intermediates are stored across evals)
2
1
2
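Storing the intermediates across evals is commonly called a KV cache. Below is a minimal single-head sketch in TypeScript under stated assumptions: `KVCache` and `attendNewToken` are hypothetical names, the query/key/value vectors are passed in directly instead of being computed from weights, and there is no batching or multi-head handling. It shows that a new token only needs its own q/k/v plus the cached keys and values of earlier tokens.

```typescript
type Vec = number[];

function dot(a: Vec, b: Vec): number {
  let acc = 0;
  for (let i = 0; i < a.length; i++) acc += a[i] * b[i];
  return acc;
}

function softmax(xs: number[]): number[] {
  const max = Math.max(...xs);
  const exps = xs.map((x) => Math.exp(x - max));
  const sum = exps.reduce((s, e) => s + e, 0);
  return exps.map((e) => e / sum);
}

// Cached keys and values for every token seen so far (single head, hypothetical layout).
interface KVCache {
  keys: Vec[];
  values: Vec[];
}

// Process ONE new token: append its key/value to the cache, then attend over the cache.
// Earlier tokens' keys and values are reused, not recomputed.
function attendNewToken(cache: KVCache, q: Vec, k: Vec, v: Vec): Vec {
  cache.keys.push(k);
  cache.values.push(v);
  const scale = 1 / Math.sqrt(q.length);
  const weights = softmax(cache.keys.map((key) => dot(q, key) * scale));
  const out: Vec = new Array(v.length).fill(0);
  weights.forEach((w, t) => {
    for (let i = 0; i < out.length; i++) out[i] += w * cache.values[t][i];
  });
  return out;
}

const kvCache: KVCache = { keys: [], values: [] };
// Only one token so far: the output equals its own value vector, [2, 4].
const firstOut = attendNewToken(kvCache, [1, 0], [1, 0], [2, 4]);
// A zero query scores every key equally, so the output is the mean of the values, [3, 6].
const secondOut = attendNewToken(kvCache, [0, 0], [0, 1], [4, 8]);
```

The per-token work still grows with the number of cached tokens, since the new query is compared against every stored key.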
@BrendanBycroft
Brendan Bycroft
3 years
@EggProphet too long didn
0
0
1
@BrendanBycroft
Brendan Bycroft
3 years
@welovemath @LuxIgnisStylus @jessesingal Looks like Kindle pre-order is only available in US & Canada. Might be hurting your numbers @jessesingal !
0
0
1
@BrendanBycroft
Brendan Bycroft
4 years
@ollybot_redux Have you looked into the JBP route? From my own relatively mild encounters with anxiety/depression, I think finding the right sort of frame is quite valuable. And JBP's frame was pretty novel/useful, and one that I hadn't really encountered in therapy or elsewhere
0
0
1
@BrendanBycroft
Brendan Bycroft
6 months
@raysplacenspace Ray!! Haha yeah it's me and, it took off like I never dreamed 🥲
1
0
1
@BrendanBycroft
Brendan Bycroft
2 months
@MorlockP 4. One of the key driving differences here is that the battery forms part of the structure, and can be attached later in the process. You can see this partially done with the Cybertruck, and Munro appreciate this even though it helps with just 2 seats.
1
0
1
@BrendanBycroft
Brendan Bycroft
2 years
@goblincodes quick prediction: on localhost:3000, your page has a zoom of 150% (ctrl-0 to fix)
rem is affected by page zoom, but vmin isn't
+ responsive mode ignores page zoom
1
0
1
@BrendanBycroft
Brendan Bycroft
3 years
@ollybot_redux here it is so far
i decided to impulse buy a cheapo cnc laser router so thought I'd better make use of it
wiring and software to do (mostly), and then we will see if it can lift its own weight and maybe even something else
Tweet media one
1
0
1
@BrendanBycroft
Brendan Bycroft
3 years
@cheascake hello I'd like to add a submission
def baz(xlst):
    cnt = minCnt = 0
    for x in xlst + [0]:
        if not x and cnt and (cnt < minCnt or not minCnt):
            minCnt = cnt
        cnt = cnt + 1 if x else 0
    return minCnt
0
0
0
@BrendanBycroft
Brendan Bycroft
2 years
@dhomochameleon it works for me when my phone is about a foot away, and I'm focusing on my other hand about 2 inches behind that
0
0
1
@BrendanBycroft
Brendan Bycroft
6 months
@singhhcoder Chrome devtools, the Performance tab. I use it all the time
0
0
1
@BrendanBycroft
Brendan Bycroft
2 years
@eurydicelives listening to this gives me goosebumps, wonderful!
0
0
1
@BrendanBycroft
Brendan Bycroft
1 year
@storebrandguy ooo yes please
1
0
1
@BrendanBycroft
Brendan Bycroft
2 months
@MorlockP 2. This assembly phase is:
Long (many stations; cars are big),
Complicated (all sub-assemblies need to be inserted into a semi-enclosed shell with relatively small openings), and
Linear (any stoppage halts all stations immediately)
1
0
0