Joran Dirk Greef

@jorandirkgreef

4,246
Followers
1,075
Following
94
Media
3,659
Statuses

Founder & CEO of @TigerBeetleDB — the distributed financial transactions database designed for mission critical safety and performance.

Joined October 2020
Pinned Tweet
@jorandirkgreef
Joran Dirk Greef
1 year
@TigerBeetleDB It’s been 2.5 years since @TigerBeetleDB began, with a dream to make ledgers more efficient. Today, I’m excited to announce our Series Seed of $6.4M from @AmplifyPartners and @Coil, as we look to power the future of financial accounting infrastructure.
Tweet media one
19
20
155
@jorandirkgreef
Joran Dirk Greef
2 months
The secret to TigerBeetle’s performance is not: - @ziglang - io_uring - static allocation etc. Although these are all important for performance, the extreme 1000x performance technique in TigerBeetle is surprisingly simple: A complete rethink of DBMS concurrency control.
@TigerBeetleDB
TigerBeetle
2 months
Why do general purpose DBMS designs increasingly struggle to scale the growing OLTP workload? Why is the effect counterintuitively worse with horizontally partitioned DBMS designs? And what does TigerBeetle do completely differently to “unlock” 😎 three orders more OLTP scale?
0
3
40
9
32
243
@jorandirkgreef
Joran Dirk Greef
4 months
From the creators of FoundationDB comes… Antithesis Deterministic Simulation Testing for the world. Hard to overstate how big this is for distributed systems—DistSys suddenly became easy.
3
47
233
@jorandirkgreef
Joran Dirk Greef
10 months
Redis was my first encounter with databases beyond SQL—the idea that a bag of data structures, an append-only log (and excellent docs with big-O notation) could be useful. What are some of the interesting (hard) problems you’ve seen solved by Redis beyond SQL?
16
7
199
@jorandirkgreef
Joran Dirk Greef
2 years
Some exciting personal news! I’m thrilled to share that @Coil will be spinning @TigerBeetleDB out as a startup and that I will be joining the team as founder and CEO.
27
23
165
@jorandirkgreef
Joran Dirk Greef
10 months
Memory is fast becoming the new frontier: - allocation and fragmentation, - bandwidth (serialization overhead, hidden memcpy’s, Direct I/O and user space page caches), - syscalls and context switches will matter more and more.
@lemire
Daniel Lemire
10 months
Remember when they told you that your code could be inefficient since your performance was limited by the speed of your disk? “Samsung's 990 Pro 4 TB drive offers a 7,450 MB/s sequential read speed and a 6,900 MB/s sequential write speed”
15
13
146
6
19
157
@jorandirkgreef
Joran Dirk Greef
5 months
Look what arrived in the mail! Thank you @unmeshjoshi for Patterns of Distributed Systems. Excited to dive in.
Tweet media one
0
12
139
@jorandirkgreef
Joran Dirk Greef
8 months
We’ve discovered (at least in our own mental model) a common over-specification in consensus, with performance implications. TLDR: State machine replication, with full durability, can be cheaper than single node “fsync always”—and the same techniques apply to single nodes. 1/8
5
20
133
@jorandirkgreef
Joran Dirk Greef
1 year
Systems Distributed '23 is a new conference presented by @TigerBeetleDB: - how to design, build and test systems, - at work, as a hobby or as a business, - from chips to compilers to databases and distributed systems, - in a beautiful city, - with some special speakers...
Tweet media one
4
30
132
@jorandirkgreef
Joran Dirk Greef
10 months
Entering the kingdom of vectorization…
Tweet media one
1
3
131
@jorandirkgreef
Joran Dirk Greef
1 year
The cache eviction algorithm in @TigerBeetleDB’s user space page cache is CLOCK N’th Chance (2 bits). We chose it intuitively, for performance and elegance. It’s a beautiful algorithm that makes sense. Great to see this recent analysis now, showing that CLOCK can indeed be not
@1a1a11a
Juncheng Yang
1 year
You might have heard about FIFO-Reinsertion/CLOCK being faster and more scalable than LRU. Do you know it is also more efficient/effective with a lower miss ratio than LRU? Take a look at our recent work on HotOS, which studied over 5000 traces collected from the last two
2
17
103
3
20
131
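A note on the CLOCK N'th Chance idea mentioned in the tweet above: the following is a minimal sketch of a CLOCK cache with a 2-bit counter per slot, written in Python purely for illustration. It is not TigerBeetle's implementation; the class name `ClockCache`, its methods, and the choice to start new entries at a count of zero are all assumptions made for this example.

```python
class ClockCache:
    """Fixed-capacity cache using CLOCK eviction with a 2-bit counter per slot.

    Hypothetical sketch: None is reserved to mean "empty slot", so None
    cannot be used as a key, and a stored value of None looks like a miss.
    """

    MAX_COUNT = 3  # largest value a 2-bit counter can hold

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.keys = [None] * capacity
        self.values = [None] * capacity
        self.counts = [0] * capacity
        self.index = {}  # key -> slot number
        self.hand = 0    # position of the clock hand

    def get(self, key):
        slot = self.index.get(key)
        if slot is None:
            return None
        # A hit earns the entry one more "chance", saturating at 2 bits.
        self.counts[slot] = min(self.counts[slot] + 1, self.MAX_COUNT)
        return self.values[slot]

    def put(self, key, value):
        slot = self.index.get(key)
        if slot is None:
            slot = self._find_victim()
            self.keys[slot] = key
            self.index[key] = slot
            self.counts[slot] = 0
        self.values[slot] = value

    def _find_victim(self):
        # Sweep the hand: take an empty slot, evict an entry whose counter
        # is zero, otherwise decrement the counter and move on.
        while True:
            slot = self.hand
            self.hand = (self.hand + 1) % self.capacity
            if self.keys[slot] is None:
                return slot
            if self.counts[slot] == 0:
                del self.index[self.keys[slot]]
                self.keys[slot] = None
                return slot
            self.counts[slot] -= 1
```

With a capacity of 3, inserting `a`, `b`, `c`, reading `a`, then inserting `d` evicts `b`: the hand passes `a`, spends its one earned chance, and lands on `b`, whose counter is still zero. No list reordering happens on a hit, which is part of why CLOCK is often described as cheaper than LRU.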
@jorandirkgreef
Joran Dirk Greef
3 months
This is the most comprehensive tour of TigerBeetle yet. We put everything in here: - LSM optimizations like moving from RUM to RUMBA, and even - consensus tricks like how a cluster with fsync can rival a single node without fsync (!). Hope you enjoy.
@TigerBeetleDB
TigerBeetle
3 months
“Redesigning OLTP for a New Order of Magnitude” by @jorandirkgreef at @QConSF is out! A dense deep dive into TigerBeetle’s: - network - storage - consensus Plus Online General Purpose Processing (OLGP) vs OLTP, and looking ahead to speculative state machine execution. Enjoy!
1
7
46
1
26
130
@jorandirkgreef
Joran Dirk Greef
1 year
This is why I find TCP so hard. And why we built @TigerBeetleDB ’s networking instead on a message passing abstraction that makes _less_ guarantees… The network fault model—indeed, even the boundary (!) between user/kernel space—must be explicit, not leaky.
@DominikTornow
Dominik Tornow
1 year
Thinking about Distributed Systems and Message Loss?! Although TCP is considered a reliable protocol, TCP will *not* save you!! 📨🕳️👇 #ThinkingInDistributedSystems #Goals2023
Tweet media one
1
25
170
6
12
129
@jorandirkgreef
Joran Dirk Greef
7 months
The sheer engineering investment going into @ziglang as an extreme systems compiler and language (and what you can do with it already) is staggering. Fantastic interview by Kris Jenkins with @croloris.
@krisajenkins
Kris Jenkins
7 months
Zig might be the most ambitious language we’ve looked at on Developer Voices - it’s trying to replace C, compete with LLVM and be the foundation of the whole compilation story. It’s a feast for language fans… 🎧 📺
6
41
237
1
16
122
@jorandirkgreef
Joran Dirk Greef
4 months
In 2015, I was working on a proprietary full-duplex file sync system. The 1st version failed. The 2nd version failed. The 3rd version was perfect. Rock solid. Surviving scenarios that would fail OneDrive or GDrive at the time and performing on par with Dropbox, albeit with
@DominikTornow
Dominik Tornow
4 months
Enabling assertions in production is a core tenet of TigerStyle, a collection of principles employed at @TigerBeetleDB TigerStyle has been instrumental at @resonatehqio in crafting reliable software systems
Tweet media one
2
5
25
4
11
124
@jorandirkgreef
Joran Dirk Greef
1 year
This is a deep dive into database durability. Commemorating the 5 year anniversary of Fsyncgate, almost to the day—and with a twist in the tale... See you on the 31st for the live chat!
@TigerBeetleDB
TigerBeetle
1 year
Missed Joran's talk "A New Era for Database Design" from QCon London? Pivotal moments in durability, I/O, systems languages and testing techniques, and how they influenced TigerBeetle's design decisions. Premiering on YouTube, Wed May 31 at 9AM PT!
Tweet media one
1
15
69
2
15
120
@jorandirkgreef
Joran Dirk Greef
8 months
@vaibhaw_vipul For developing a “nifty” DistSys toolbox: - OSTEP by the Arpaci-Dusseau’s is a favorite systems book—the design thinking in here was imbibed by TigerBeetle: - Hacking bug bounties, finding CVEs, and composing cryptographic primitives (AEADs, HKDF etc.),
2
15
117
@jorandirkgreef
Joran Dirk Greef
1 year
Excited to announce something new for systems programming soon…
3
10
116
@jorandirkgreef
Joran Dirk Greef
4 months
@iamatradernoob @sunbains Tips for going deep on databases: 1. Pick a conference, like FAST (nice because it’s less DBMS and more focused on the broader hardware/software interactions) and follow the papers/talks from there each year. They might not make sense. Keep chewing the cud. 2. A DBMS at heart
1
20
112
@jorandirkgreef
Joran Dirk Greef
4 months
Looking forward to a special stream with @ThePrimeagen 🙌 Friday, Feb 23, circa 8:30AM PT
Tweet media one
@ThePrimeagen
ThePrimeagen
4 months
Tomorrow stream will be special, CEO and founder of tiger beetle will join us and explain how amazing the database is
20
19
381
5
19
106
@jorandirkgreef
Joran Dirk Greef
2 months
This is the most incredible talk I have ever seen. It was never published. Until now. Wed 17th — 11am PT / 2pm ET / 6pm UTC
@TigerBeetleDB
TigerBeetle
2 months
Join us for a special live premiere. The stealth talk by Will Wilson of @AntithesisHQ that brought the house down at the inaugural Systems Distributed in early '23. The talk was only ever seen live... Until now. Wed 17th 11am PT / 2pm ET / 6pm UTC
4
32
157
1
20
105
@jorandirkgreef
Joran Dirk Greef
5 months
One of the things I appreciate most about @ziglang is the vision, leadership and stewardship of Andrew Kelley… …and then being able to support the Zig Software Foundation personally through a simple sponsorship on GitHub. If you’ve never chipped in…
2
14
104
@jorandirkgreef
Joran Dirk Greef
4 months
This is the secret. And why we have a zero dependency policy for @TigerBeetleDB given it’s foundational and that any performance faux pas add up. There’s a time and a place. However, dependencies introduce not only safety risk but also performance risk, and we need to own this.
@mitchellh
Mitchell Hashimoto
4 months
Breaking news: computers can be fast when you aren't forty two billion abstraction layers above the CPU instructions.
6
60
580
1
9
102
@jorandirkgreef
Joran Dirk Greef
8 months
I'm excited to speak at @QConSF this Monday: “Redesigning OLTP for a New Order of Magnitude” Some thought-provoking OLTP and Scalability questions in here, with I hope some surprising (and defining) answers!
0
15
98
@jorandirkgreef
Joran Dirk Greef
5 months
@sunbains @gesalous @LewisCTech At TigerBeetle, we like to make decisions on the basis of reasoning from first principles, drawing from research, and optimizing for quality and total cost of ownership—as opposed to making technological choices by appealing to authority or appealing to popularity (so common in
2
12
94
@jorandirkgreef
Joran Dirk Greef
7 months
Direct IO can easily be misunderstood: - Is it slow (if you use it wrong)? - Is it fast (if you use it right)? - Is it required for database durability? There's plenty of anecdote (art and skill) around the first two questions (which come down to how you think of memory
2
16
92
@jorandirkgreef
Joran Dirk Greef
10 months
A pleasure to read this over my morning coffee, and see TB next to heroes like Scylla, Seastar and Redpanda. Glauber's writing on io_uring (already in 2020) made an impact on @TigerBeetleDB, and his advice here (to do meaningful open source) is also our secret...
Tweet media one
@glcst
Glauber Costa
10 months
Most of the career advice I see on this platform is not applicable to systems level software (OS, compilers, etc). As I promised a cpl of weeks back, here's some tips from my career on how to develop your own career, including a real life story of how we hired @iavins:
12
41
276
1
11
86
@jorandirkgreef
Joran Dirk Greef
1 year
One of my favorites. Because once you know how to optimize for the network, you can apply some of the same big ideas to disk, and then even memory. Right up there, along with OSTEP.
@eatonphil
Phil Eaton
1 year
High Performance Browser Networking is one of the most high-value books you can read as a programmer IMO.
Tweet media one
17
112
1K
2
8
83
@jorandirkgreef
Joran Dirk Greef
1 year
I see static memory allocation and @ziglang’s explicit allocator strategies as a sea change in systems programming: “The gap between processor speeds and memory access latencies is an ever-increasing impediment..” “…depending on the allocator, the performance can vary by 10x.”
@sigarch
SIGARCH
1 year
“Just like a baby in a big family, the memory allocator is growing up. Time has come when we need to give it a new room (core) in our house (CPU).”
2
14
43
4
13
82
@jorandirkgreef
Joran Dirk Greef
1 year
My old favorite is ZFS. I learned so much about databases… from a filesystem. Most of all, it was that spirit of ZFS that brought us together as a team to do @TigerBeetleDB today.
@settermjd
Matthew Setter
1 year
What’s your preferred #database of choice? No shame. No judgement. Just curious which one and why. I’ll go first. Mine’s sqlite because of the low overhead so it’s trivial to get going. Plus, it handles more than enough traffic for my needs. You?
42
3
50
6
9
81
@jorandirkgreef
Joran Dirk Greef
1 year
We use @ziglang comptime to compile our VSR consensus and LSM storage engine for @TigerBeetleDB’s data types to: - reduce L1-L3 churn with zero copy deserialization, and - eliminate length prefixes in disk/wire/cache formats to reduce write amp/bandwidth/memory usage.
@kellabyte
Kelly Sommers
1 year
I wonder how fast a database would accelerate if it was recompiled to be hard coded against the data model it serves. I know there’s optimizers that do JIT like compilations but the DBMS still has a ton of CPU branch predictions for decisions & lookups before the query itself
29
5
153
1
12
77
@jorandirkgreef
Joran Dirk Greef
1 year
We leverage @ziglang’s comptime in @TigerBeetleDB’s LSM-Forest to: - generate LSM-Trees for different key/value types, - eliminate key/value length prefixes, and - optimize for the different “churn” workloads of different key/value types
@jdegoes
John A De Goes
1 year
Zig has a brilliant metaprogramming system that subsumes generics. There are no type parameters, just value parameters, and types simply happen to be value parameters that are passed at compile-time. Here's a generic 'max' function that works on any type: fn max(comptime T:
12
6
102
2
6
75
@jorandirkgreef
Joran Dirk Greef
5 months
One of the things that excited us in the creation of TigerBeetle was the lucky timing. The opportunity to shine a spotlight on the incredible storage fault research by UW-Madison. For the distributed database to embrace an explicit storage fault model and solve it with
@LewisCTech
Lewis Campbell
5 months
You ever see the same term pop up in a few places and think "hmmm"? @TigerBeetleDB often talks of "fault models". Wonder if there's more to this idea than I think. (Paper is "A Transaction Model", Jim Gray, IBM Research Laboratory 1980)
Tweet media one
0
2
32
2
7
75
@jorandirkgreef
Joran Dirk Greef
11 months
@ThePrimeagen Zero deserialisation with @ziglang’s bitCast and cache line-aligned fixed-size structures. Less is more.
3
4
74
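To illustrate the "zero deserialisation" idea from the tweet above, here is a small Python sketch using `ctypes`. It overlays a fixed-size, little-endian record directly on a received byte buffer, so no parsing step or per-field copy is needed. The `Record` layout (64 bytes, four `u64` fields plus padding) is hypothetical and chosen only so that its size matches a common cache line size; it is not TigerBeetle's wire format. Python also cannot guarantee that the buffer's address is cache-line aligned, so this shows the fixed-size part of the technique, not the alignment part.

```python
import ctypes


class Record(ctypes.LittleEndianStructure):
    """Hypothetical 64-byte record; the field names are illustrative only."""

    _fields_ = [
        ("id", ctypes.c_uint64),
        ("debit_account", ctypes.c_uint64),
        ("credit_account", ctypes.c_uint64),
        ("amount", ctypes.c_uint64),
        ("reserved", ctypes.c_uint8 * 32),  # pads the record to 64 bytes
    ]


# No length prefixes: every record is exactly 64 bytes.
assert ctypes.sizeof(Record) == 64


def view_records(buffer):
    """Interpret a writable buffer as an array of Record without copying."""
    count, remainder = divmod(len(buffer), ctypes.sizeof(Record))
    if remainder != 0:
        raise ValueError("buffer length is not a multiple of the record size")
    return (Record * count).from_buffer(buffer)


# Pretend these bytes arrived from recv(): two records, the second filled in.
wire = bytearray(128)
wire[64:72] = (42).to_bytes(8, "little")   # id of record 1
wire[88:96] = (500).to_bytes(8, "little")  # amount of record 1

records = view_records(wire)
print(records[1].id, records[1].amount)
```

Because `from_buffer` shares memory with the `bytearray`, changing `wire` afterwards is visible through `records`; that sharing is what makes the read "zero copy" in this sketch.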
@jorandirkgreef
Joran Dirk Greef
6 months
Indeed. TigerBeetle would not have been possible without being designed (from the beginning) as a Deterministic Distributed Database. And now, having tasted the quality of Deterministic Simulation Testing (FoundationDB showing the way), we wouldn’t want to go back…
@ricardonunez_io
Ricardo Nunez
6 months
@isamlambert Obligatory @TigerBeetleDB / VOPR reference, but agreed for the most part. Seen too many outages coming from trying to make existing SQL engines “distributed” and serverless, not worth the risk to try sometimes
0
1
11
2
12
74
@jorandirkgreef
Joran Dirk Greef
2 years
Writing your own event loop over io_uring is such an awesome way into systems, with io_uring’s unified first-class API for async disk and networking. Shoutout to @axboe for making single-threaded control planes cool again 😎
@TigerBeetleDB
TigerBeetle
2 years
In our latest post, consider a tale of I/O and performance. Starting with traditional blocking I/O, and working up to a libuv-style event loop, we explore TigerBeetle's (and @oven_sh's!) I/O stack.
Tweet media one
0
17
90
2
12
66
@jorandirkgreef
Joran Dirk Greef
11 months
@AndreyPechkurov @ThePrimeagen @ziglang Starting on the design for TigerBeetle in 2020, I had this gut feel from prior work that it would be a good principle... Always to think about alignment throughout the data plane: - as you recv() data from the network, - then write() to disk (with 4 KiB alignment for DIO, even
4
8
67
@jorandirkgreef
Joran Dirk Greef
3 months
Deterministic Simulation Testing is not only Fred Brooks’ “silver bullet” in terms of developer velocity… … with simulators such as @AntithesisHQ, it’s also the beginning of a “post-Jepsen” era for distributed systems. @richardartoul explains why:
Tweet media one
1
17
66
@jorandirkgreef
Joran Dirk Greef
8 months
All thanks to @ziglang. And TigerBeetle uses the same technique also for IO, Messaging and Time. Anything non-deterministic, so that TB can test and simulate deterministically, plus accelerate time when testing.
@sarna_dev
Piotr Sarna
8 months
Brilliant pattern by @TigerBeetleDB and @ziglang : pass an allocator only as a parameter, so you know if a function allocates just by looking at its params. I fully recommend the whole talk!
0
3
40
0
3
65
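The pattern described in the two tweets above (pass the allocator, IO, or clock in as a parameter rather than reaching for a global) can be sketched for the time case as follows. This is an assumption-laden Python illustration, not TigerBeetle code: `RealClock`, `SimulatedClock`, and `HeartbeatTimer` are hypothetical names invented for this example.

```python
import time


class RealClock:
    """Production clock: reads the operating system's monotonic time."""

    def monotonic_ns(self):
        return time.monotonic_ns()


class SimulatedClock:
    """Test clock: time only moves when the simulator says so."""

    def __init__(self):
        self.now_ns = 0

    def monotonic_ns(self):
        return self.now_ns

    def advance(self, delta_ns):
        self.now_ns += delta_ns


class HeartbeatTimer:
    """Component under test. It never calls time.* itself; the clock is
    injected, so the caller can see from the constructor that it depends
    on time."""

    def __init__(self, clock, interval_ns):
        self.clock = clock
        self.interval_ns = interval_ns
        self.last_ns = clock.monotonic_ns()

    def due(self):
        return self.clock.monotonic_ns() - self.last_ns >= self.interval_ns

    def reset(self):
        self.last_ns = self.clock.monotonic_ns()


# In a simulation, an hour of "system time" passes instantly and repeatably.
clock = SimulatedClock()
timer = HeartbeatTimer(clock, interval_ns=1_000_000_000)
clock.advance(3_600 * 1_000_000_000)
print(timer.due())
```

Swapping `SimulatedClock` for `RealClock` changes nothing in `HeartbeatTimer`, which is the property that lets a simulator replay the same run from the same seed and also skip ahead in time.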
@jorandirkgreef
Joran Dirk Greef
8 months
When we started @TigerBeetleDB, I never imagined someone would describe this as a quest to “build the perfect database”. Thanks to Kris Jenkins for having me on the show, to imagine a world where databases are planes, and testing is a flight simulator… …safer for the pilots!!
@krisajenkins
Kris Jenkins
8 months
How far would you go to build the perfect database?🤔 Let's go down a rabbit hole of performance problems, fsync gotchas and network reliability myths that may only get fixed with a new approach, and some smart testing… 🧪 🎧 📺
1
3
33
1
7
65
@jorandirkgreef
Joran Dirk Greef
1 year
And we’re on our way from NYC to Cape Town for Systems Distributed ‘23! ✈️ (Where’s Andrew Kelley?) @ziglang
Tweet media one
5
2
64
@jorandirkgreef
Joran Dirk Greef
3 months
This is why TigerBeetle’s global consensus protocol and local storage engine were co-designed: - not only to share the WAL, but also to - run directly on a raw block device - without requiring a filesystem
@strlen
Alex Feinberg ⬜️🟥⬜️
3 months
Log stacking can happen when using Raft/Paxos to replicate DBs w/ their own WALs. Apache Kudu consolidates: the tablet WAL is also the Raft log. There have been experiments in other systems that disable the local WAL and use the consensus log for recovery. Is there a generic solution?
6
6
49
0
8
64
@jorandirkgreef
Joran Dirk Greef
30 days
Surreal to sit next to @RattrayAlex (at a dinner hosted by @natalievais on Wednesday): - whose HN comment encouraged me to open source a security tool, - leading to work with Microsoft, then @Coil on the Gates Foundation @mojaloop switch, - from whence sprung @TigerBeetleDB !
@RattrayAlex
Alex Rattray
1 month
Small world; I (~rattray in the screenshot) met @jorandirkgreef (~jorangreef) randomly at a dinner, and he recognized my name from an hn comment 5 years ago. Apparently it led to a chain of events relating to the founding of @TigerBeetleDB , which is some extraordinarily cool
3
4
55
3
6
61
@jorandirkgreef
Joran Dirk Greef
1 year
We've connected TigerBeetle's deterministic simulator, The VOPR, to GitHub—to open issues automatically, classify bugs as { correctness, liveness, crash } and basically write a nice report. This (and more) in our April newsletter... 📰
Tweet media one
@TigerBeetleDB
TigerBeetle
1 year
Our April newsletter is out! Read on for a ton of detail into all the code changes happening in TigerBeetle!
1
4
19
2
8
61
@jorandirkgreef
Joran Dirk Greef
11 months
An accidental memcpy, buried somewhere in TigerBeetle's data plane: - burning memory bandwidth, - thrashing the L1-L3 cache, and - destroying performance, is something we think about... And now verify, with a new character in the TB Cinematic Universe... CopyHound!
@TigerBeetleDB
TigerBeetle
11 months
matklad has a super power he wants to share with you. This is the super power: LLVM IR is text.
Tweet media one
1
14
76
0
2
59
@jorandirkgreef
Joran Dirk Greef
4 months
TigerBeetle making IMDb 🎬😆😎
Tweet media one
0
4
57
@jorandirkgreef
Joran Dirk Greef
9 months
@iavins If you want something that’s faster to construct than XOR / Binary Fuse Filters (e.g. if write-intensive), and simple to intuit/implement as a middle ground, plus significantly faster than Bloom Filters… …then take a look at the Split Block Bloom Filter in Apache Impala, which
2
8
56
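For readers unfamiliar with the Split Block Bloom Filter named in the tweet above, here is a rough Python sketch of the general technique as commonly described: the filter is divided into 256-bit blocks, each key maps to exactly one block, and eight bits (one per 32-bit word) are set inside that block. Several things here are assumptions for illustration: the class and method names, the use of `hashlib.blake2b` as a stand-in 64-bit hash, and the specific odd multiplier constants in `SALTS`. It should not be read as Apache Impala's actual code.

```python
import hashlib

# Eight odd 32-bit multipliers, one per word in a block (illustrative values).
SALTS = (
    0x47B6137B, 0x44974D91, 0x8824AD5B, 0xA2B7289D,
    0x705495C7, 0x2DF1424B, 0x9EFC4947, 0x5C6BFB31,
)
MASK32 = 0xFFFFFFFF


class SplitBlockBloomFilter:
    """Bloom filter split into 256-bit blocks of eight 32-bit words."""

    def __init__(self, num_blocks):
        if num_blocks <= 0:
            raise ValueError("num_blocks must be positive")
        self.num_blocks = num_blocks
        self.blocks = [[0] * 8 for _ in range(num_blocks)]

    @staticmethod
    def _hash64(key):
        # Stand-in hash for the sketch; any well-mixed 64-bit hash would do.
        digest = hashlib.blake2b(key, digest_size=8).digest()
        return int.from_bytes(digest, "little")

    def _locate(self, key):
        h = self._hash64(key)
        # The upper 32 bits pick the block (a multiply-shift range reduction).
        block = ((h >> 32) * self.num_blocks) >> 32
        # The lower 32 bits pick one bit position (0..31) in each word.
        low = h & MASK32
        bits = [((low * salt) & MASK32) >> 27 for salt in SALTS]
        return block, bits

    def add(self, key):
        block, bits = self._locate(key)
        words = self.blocks[block]
        for i, bit in enumerate(bits):
            words[i] |= 1 << bit

    def might_contain(self, key):
        block, bits = self._locate(key)
        words = self.blocks[block]
        return all((words[i] >> bit) & 1 for i, bit in enumerate(bits))
```

The point of the design is that both `add` and `might_contain` touch a single block, so a lookup costs one cache-line-sized memory access instead of several scattered ones, and the eight word updates map naturally onto SIMD instructions in a lower-level language.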
@jorandirkgreef
Joran Dirk Greef
3 months
“Make it work, make it fast, make it pretty…” …but design first! (Because back of the envelope gets you to a better maxima) You want to be staring up at the slopes of Mount Doom before you start the ascent.
@Malix_off
𝗠𝗮𝗹𝗶𝘅™
3 months
But design first See tiger_style (which is a set of design principles), from @TigerBeetleDB
0
2
13
2
11
57
@jorandirkgreef
Joran Dirk Greef
9 months
Most LSMs design the in-memory table for inserts, scans and lookups (after probing the cache). The insight here is to NOT use the in-memory table for lookups, but only as a k-way log for inserts/scans. And then use only the cache to serve lookups, since hash maps are optimal.
@TigerBeetleDB
TigerBeetle
9 months
The in-memory data structure of an LSM-tree is hard to optimize: - Do you design for point lookups? - Or scans on the read path? - Do you optimize for inserts on the write path? - What about k/v caching? - And undo? PR 1180 has new TigerBeetle ideas...
Tweet media one
2
4
53
1
3
55
@jorandirkgreef
Joran Dirk Greef
8 months
Look at that little orange line go… 🧡
@OSSInsight
OSS Insight
9 months
The ranking of attention to open-source programming language repositories shows that Go, Node, and TypeScript maintain the top three positions. CoffeeScript is declining the fastest, while Zig is rising the fastest. Swift, which ranked first in 2015 and 2016, has dropped all the
Tweet media one
4
26
91
0
4
54
@jorandirkgreef
Joran Dirk Greef
7 months
If you’ve ever wanted to walk through TigerBeetle’s code... this is the perfect way to do it. Tune in. We’re going deep!
@TigerBeetleDB
TigerBeetle
7 months
Don't miss the 3rd show of IronBeetle🤘with matklad! Live on Twitch today at 9am PT / 12pm ET / 5pm UTC. (And if you missed the 2nd, well... here it is!)
0
1
18
3
10
55
@jorandirkgreef
Joran Dirk Greef
4 months
In our own experience with TigerBeetle, the most valuable thing that static allocation gives you (beyond the many performance and safety benefits)… …is that it forces the developers of the DBMS to think through what we call “the physics of the database”.
@sunbains
Sunny Bains @TiDB
4 months
@LewisCTech @DominikTornow @ziglang @TigerBeetleDB Static allocation upfront is probably the best way to write database (any?) software. InnoDB was written using this strategy. It was changed to dynamic allocation for some transient data in 5.1 IIRC. Once you let that genie out there is no going back, it slowly leaks everywhere.
2
1
12
5
4
54
@jorandirkgreef
Joran Dirk Greef
1 year
The Animal Database Alliance #QConLondon: @redpandadata’s @jcsp_tweets and @maslankamichal @duckdb’s @hfmuehleisen @TigerBeetleDB and our friends @_belliottsmith and Aleksey Topics: “ECC”, work-stealing costs, and Hannes’ boat strategy for the Netherlands’ rising sea levels
Tweet media one
2
9
53
@jorandirkgreef
Joran Dirk Greef
1 year
Excited for our new @TigerBeetleDB blog post tomorrow.
@mitchellh
Mitchell Hashimoto
1 year
@eatonphil @kingprotty Y’all got some smart people over there. 🥳 Your tech content is top notch, I’m excited anytime there’s a new blog post or talk.
0
1
23
3
3
53
@jorandirkgreef
Joran Dirk Greef
1 year
Tweet media one
1
3
52
@jorandirkgreef
Joran Dirk Greef
8 months
Here’s a rough principle of scaling, in terms of linear/exponential resources, that I’ve been working on: “The decision to linearly scale an exponential resource (e.g. CPU under Moore’s Law)… by introducing communication overhead (e.g. across the network)… becomes twice as
2
4
53
@jorandirkgreef
Joran Dirk Greef
9 months
The year has gone faster than I could have imagined. So many adventures, and yet more to come! Cheers to everyone for your support.
@TigerBeetleDB
TigerBeetle
9 months
Yesterday we celebrated TigerBeetle’s 1st birthday as a company—a year since we took the leap and left the Shire! It’s a joy to be part of the fellowship with you all, and we’re looking forward to the year ahead... “It’s going to be an adventure!”
Tweet media one
2
5
79
5
1
52
@jorandirkgreef
Joran Dirk Greef
1 year
Some languages are best spoken with your hands... @ziglang
Tweet media one
Tweet media two
Tweet media three
1
2
52
@jorandirkgreef
Joran Dirk Greef
7 months
Traveltime with @ziglang 😊
Tweet media one
Tweet media two
2
0
51
@jorandirkgreef
Joran Dirk Greef
2 months
If you could paint a database from first principles, for a write-heavy workload with extreme contention... - How do you rethink networking to solve row locks and memory bandwidth? - Do you use 1 LSM-Tree or 20? - And can you apply speculative CPU techniques to consensus?
Tweet media one
@TigerBeetleDB
TigerBeetle
2 months
Our @QConSF talk is on YouTube! - With the world becoming 3 orders of magnitude more transactional in the last 10 years, - how can we take the 4 primary colors of systems (network, storage, memory, compute), - and blend them into a new design for OLTP?
1
6
35
0
6
51
@jorandirkgreef
Joran Dirk Greef
2 months
The storage fault-tolerance in TigerBeetle has some of our favorite algorithms… …join matklad for a tasting tour.
@TigerBeetleDB
TigerBeetle
2 months
How a database with storage fault-tolerance stores data on disk might seem complicated at first, but it actually can be easy. In today's episode of IronBeetle, we start looking at the on-disk data structures:
0
15
74
1
8
50
@jorandirkgreef
Joran Dirk Greef
7 months
Checking your model but against the actual code (and in accelerated time) means you can move so much faster. It's also fun.
@TigerBeetleDB
TigerBeetle
7 months
How do you catch up (and overtake) 30 years of test time? - We've ramped to 100 CPU cores. - 100 simultaneous TigerBeetle simulations, 24x7. - Each simulation (in 3.3 seconds) tests 39 minutes on avg. of real world runtime. 10 cores = 20 years 100 cores = 200 years (every day)
1
18
116
3
9
49
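The figures in the quoted tweet above can be checked with a few lines of arithmetic. The sketch below takes only the numbers stated there (3.3 seconds of wall-clock time per simulation, 39 minutes of simulated runtime on average) and assumes each core runs simulations back to back around the clock; the function name and the 365.25-day year are choices made for this example.

```python
SIM_WALL_SECONDS = 3.3          # wall-clock time per simulation (from the tweet)
SIM_VIRTUAL_SECONDS = 39 * 60   # simulated runtime per simulation (from the tweet)
DAYS_PER_YEAR = 365.25          # assumption for the conversion


def simulated_years_per_day(cores):
    """Years of simulated runtime covered in one wall-clock day."""
    speedup = SIM_VIRTUAL_SECONDS / SIM_WALL_SECONDS  # roughly 709x
    simulated_days = cores * speedup  # each core covers `speedup` days per day
    return simulated_days / DAYS_PER_YEAR


for cores in (10, 100):
    print(cores, "cores:", round(simulated_years_per_day(cores), 1), "years per day")
```

This gives roughly 19.4 years per day for 10 cores and 194.1 for 100 cores, consistent with the tweet's rounded "20 years" and "200 years (every day)".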
@jorandirkgreef
Joran Dirk Greef
9 months
DBMS gravity has inverted.
@refset
Jeremy Taylor
9 months
"This shift in hardware balance is probably the biggest change to fundamental assumptions in database architecture since SSDs started to become a thing, and most people haven't internalized the implications yet." - @jandrewrogers The race now is to optimise for latency.
Tweet media one
9
36
224
3
6
50
@jorandirkgreef
Joran Dirk Greef
4 months
I find it fascinating that, for example Deterministic Simulation Testing (and the ability to accelerate the time of the system under test) is enabled almost entirely by the right abstractions.
@DominikTornow
Dominik Tornow
4 months
The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise -Edsger W. Dijkstra
Tweet media one
4
22
119
1
3
50
@jorandirkgreef
Joran Dirk Greef
3 months
Taking @rbatiati to see the cave on Table Mountain where I spent those 15 years coding before @TigerBeetleDB 😎
Tweet media one
5
1
50
@jorandirkgreef
Joran Dirk Greef
8 months
Great to “walk and talk” animal databases with @emaxerrno in the gardens this morning!
Tweet media one
3
2
49
@jorandirkgreef
Joran Dirk Greef
1 year
What are the toughest distributed systems to build?
23
5
48
@jorandirkgreef
Joran Dirk Greef
10 months
Protocol-Aware Recovery's findings impact: - DBMS WAL designs, - LSM trees “bolted on” with off-the-shelf consensus protocols like Raft, and - formal proofs that verify components like consensus protocols and stable storage in isolation. Join us for the live interview...
@TigerBeetleDB
TigerBeetle
10 months
Join us live on Twitch tomorrow, August 9th at 9am PT, as we discuss the ins/outs of Protocol Aware Recovery with the authors of the paper, Prof. Ram and Aishwarya!
Tweet media one
2
8
27
3
7
48
@jorandirkgreef
Joran Dirk Greef
5 months
@halvarflake With TigerBeetle, it was refreshing to go back to recv/send… …enforcing little endian, fixed size cache line aligned structs, and then casting bytes off the wire for zero-deserialization.
3
5
47
@jorandirkgreef
Joran Dirk Greef
6 months
Beyond performance, the 2nd order effect of an engineering culture of static memory allocation can be powerful (and positive): PR authors think through “the physics of the code” they contribute. This deep understanding is so valuable (and the policy viral in achieving this).
@bmcnett
bmcnett
6 months
even among performance minded folks, there's a persistent myth that malloc is "pretty fast" when in fact it's so performance hostile that big game productions ban it at link time, to eliminate the damage it causes
17
20
197
0
2
45
@jorandirkgreef
Joran Dirk Greef
1 year
Fsyncgate marked the beginning of a new era for database design: - DAIO and @axboe's io_uring - @ramnatthan's “Protocol-Aware Recovery” - Explicit systems programming and @ziglang - Deterministic simulation testing Premiering tomorrow at 9am PT:
@jorandirkgreef
Joran Dirk Greef
1 year
This is a deep dive into database durability. Commemorating the 5 year anniversary of Fsyncgate, almost to the day—and with a twist in the tale... See you on the 31st for the live chat!
2
15
120
0
9
45
@jorandirkgreef
Joran Dirk Greef
1 year
I love the Bitcask design. Implementing a version of it years back was one of my first forays into storage systems. We also use the pattern in @TigerBeetleDB to track all the tables in our LSM manifest, to log them to disk as they get created/compacted.
@criccomini
Chris Riccomini
1 year
Beautifully simple and remarkably similar to Kafka’s early persistence layer.
1
4
40
3
4
46
@jorandirkgreef
Joran Dirk Greef
7 months
@forked_franz @fleming_matt @AndreyPechkurov @ThePrimeagen @ziglang Yes, @TigerBeetleDB was inspired by @mjpt777 here, with his @QCon talk on the “Evolution of Financial Exchange Architectures”. It came out right as we were in the middle of the “sketching” phase, and helped nail down some of the design decisions.
1
9
44
@jorandirkgreef
Joran Dirk Greef
1 year
This follows my own optimization journey. First, it was minimizing network requests with CRDTs back in ‘12, then random seeks with spinning disk, and then optimizing memory accesses for erasure coding, and now TB. Network or spinning disk is a great way to think of memory.
@paramaggarwal
Param Aggarwal
2 years
After databases that optimise for spinning disks (Cassandra) and flash storage (Aerospike) now we are optimising for RAM allocation. In a way we are back to the basics. TigerBeetle is written in Zig just like bun.js, the node.js alternative.
0
0
12
1
7
44
@jorandirkgreef
Joran Dirk Greef
4 months
Looking at the chapters in here… @thegeeknarrator and I really did “delve deep into the world of online transaction processing”. And here, one of the most surprising things for me has been to see how far OLTP and general purpose DBs are diverging.
Tweet media one
0
5
43
@jorandirkgreef
Joran Dirk Greef
3 months
@sunbains To be clear, Deterministic Simulation Testing is more than simply fault injection or chaos engineering (common misconception). To do DST, the “system under test” itself must also be written 100% deterministically. Very few DBMS systems (I know of only 3!) actually do DST. And
2
12
44
@jorandirkgreef
Joran Dirk Greef
10 months
PAR has had a huge influence on TigerBeetle: - how consensus must be co-designed with storage engine, - to handle corruption correctly, without risk of latent data loss, and - to maximize high availability. Looking forward to diving into the details with Ram & Aishwarya.
@TigerBeetleDB
TigerBeetle
10 months
Protocol Aware Recovery makes TigerBeetle resilient to radioactive storage corruption. Join us August 9th at 9am PT, as we discuss the ins/outs of PAR with the authors of the paper, Prof. Ram and Aishwarya, in a live interview on Twitch!
Tweet media one
0
7
58
1
3
43
@jorandirkgreef
Joran Dirk Greef
8 months
A thread where Andy Pavlo unpacks some “pop database” references…
@andy_pavlo
Andy Pavlo
11 months
The @TigerBeetleDB team released a fancy simulator that shows off the Viewstamped Replication implementation in their DBMS. I have a cameo in the 2nd level sitting in a bathtub. I had nothing to do with this brilliance. Lots of deep cuts in this reference.
2
32
242
1
6
43
@jorandirkgreef
Joran Dirk Greef
2 months
Excited to speak at @money2020 in June! “The Next 30 Years of Transactions Processing”
0
5
42
@jorandirkgreef
Joran Dirk Greef
5 months
From the beginning, as we were sketching TigerBeetle, we tried to be explicit about methodology: - how to anticipate and then explain experiments - survey different architectures - solve 2nd order problems like safety This engineering methodology became known as TigerStyle.
0
2
42
@jorandirkgreef
Joran Dirk Greef
2 months
Inasmuch as the shift from row-major to column-major changed everything for OLAP, increasing awareness of the problem of “contention at scale” (transactions are transactional, i.e. multi-row, not single-row) will change everything for OLTP. Here's the TigerBeetle take on this.
@InfoQ
InfoQ
3 months
Redesigning OLTP for a New Order of Magnitude by @jorandirkgreef
0
26
93
2
4
42
@jorandirkgreef
Joran Dirk Greef
3 months
DST can be devastating when applied to a distributed system for the first time. (We fixed 30 bugs in TigerBeetle after just three weeks of our own DST in ’21.) But that’s what you want, right? To celebrate each bug you find, knowing there’s one less bug out there in the wild.
@AntithesisHQ
Antithesis
3 months
@sunbains @jorandirkgreef "write lock is expected for command kv::command::pessimistic_rollback keys([]) @ 447992310654566405 447992310654566405 | region_id: 94 region_epoch { conf_ver: 5 version: 22 } peer { id: 95 store_id: 1 } max_execution_duration_ms: 20000 resource_control_context {} keyspace_id:
1
2
8
2
2
41
@jorandirkgreef
Joran Dirk Greef
1 year
We were inspired by @EdwardTufte, to see how much distributed consensus state we could pack into an 80 column terminal line. What you're seeing here is @TigerBeetleDB's deterministic simulator, as it runs and tests strict serializability. More details in our internal docs...
@TigerBeetleDB
TigerBeetle
1 year
Our internals reference for developers contributing to the database just got a little more accessible. One of the great parts of these pages is the glossaries explaining what we mean by "LSM Forest", the "Grid", or "VSR State Sync". Check it out!
0
14
72
0
1
41
@jorandirkgreef
Joran Dirk Greef
2 months
The sketch of things to come...
@TigerBeetleDB
TigerBeetle
2 months
Something's coming tomorrow...
4
4
55
0
1
40
@jorandirkgreef
Joran Dirk Greef
1 year
The energy that @hfmuehleisen put into his talk on @duckdb at #QConLondon was a joy to watch. And then, at dinner afterwards, a joy to hear Hannes ask “So… what sound does the @TigerBeetleDB make?!” Thanks to @r39132 for putting us together in a quacking awesome track.
3
7
40
@jorandirkgreef
Joran Dirk Greef
1 year
Our experience with @TigerBeetleDB was much the same. We were analyzing an open source switch and were stunned to realize that it would in fact be faster to design a new ledger database, for bigger order of magnitude wins in the design phase, than to incrementally optimize.
@jarredsumner
Jarred Sumner
1 year
It is much harder to take something slow and make it fast than to design for performance from the beginning
12
6
102
2
3
39
@jorandirkgreef
Joran Dirk Greef
4 months
A pleasure to get to chat with @thegeeknarrator about TigerBeetle: - how to classify OLTP vs general purpose databases, - thinking about “data direction”, workload and contention as you optimize, - why consensus is not expensive (!), - how to push the RUM conjecture with an
@thegeeknarrator
Kaivalya Apte - The Geek Narrator
4 months
Just released an episode with @jorandirkgreef talking about the fastest, smallest and toughest database, @TigerBeetleDB This episode is packed with deep discussions around: - OLGP, OLTP and OLAP workloads - Why SQL isn't the best data format language? - What is the problem
0
12
74
0
8
39
@jorandirkgreef
Joran Dirk Greef
1 year
The “Protocol-Aware Recovery” paper by @ramnatthan and @AishwaryaGanlat is less known than fsyncgate, but just as scary, affecting most WAL and consensus implementations out there. An example of how we apply this to TigerBeetle:
@DominikTornow
Dominik Tornow
1 year
If you want to check out protocol-aware failure-handling in action, check out @TigerBeetleDB The beetle puts the theory into practice and recovers from local storage failures in the context of the global consensus protocol
1
2
10
1
8
39
@jorandirkgreef
Joran Dirk Greef
1 year
With filesystem bugs like this (and others still out in the wild), it’s good if your database: 1) has an explicit storage fault model, which 2) is exercised by simulation testing. Join us on YouTube tomorrow at 9am PT to dive into database durability:
0
8
38
@jorandirkgreef
Joran Dirk Greef
3 months
TigerStyle!
@mitchellh
Mitchell Hashimoto
3 months
I'm still not perfect at this, but every "TODO" that I don't turn into a runtime assertion ends up costing me an inordinate amount of debugging time later. So, tip: every `// TODO` should be a runtime assertion and crash instead.
27
21
395
1
3
39
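One way to read the tip quoted above, as a hedged Python sketch rather than code from TigerBeetle or from the quoted author: each unhandled case that would have been a `TODO` comment becomes a check that crashes at runtime. The helper `require` and the function `debit` are hypothetical names used only for illustration.

```python
def require(condition, message):
    """Crash loudly when an assumption is violated (hypothetical helper)."""
    if not condition:
        raise AssertionError(message)


def debit(balance, amount):
    """Hypothetical ledger helper, for illustration only."""
    # Instead of "# TODO: handle non-positive amounts":
    require(amount > 0, "amount must be positive")
    # Instead of "# TODO: handle overdrafts":
    require(amount <= balance, "overdraft is not handled yet")
    return balance - amount
```

An explicit `raise` is used here instead of Python's `assert` statement because `assert` is stripped when the interpreter runs with `-O`, which would silently turn the check back into a comment.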
@jorandirkgreef
Joran Dirk Greef
30 days
A real pleasure to talk with @amanda_robs and @tnachen about the business (and open source) engineering going into @TigerBeetleDB
@OssStartup
Open Source Startup Podcast🎙
30 days
Do financial transactions need a specialized database 🤔? Our co-hosts @amanda_robs & @tnachen talk with @TigerBeetleDB Founder @jorandirkgreef & dig into: ⚡️General purpose vs specialized DBs ⚡️Open source vs source available ⚡️Their unique take on monetization
1
5
19
0
8
38
@jorandirkgreef
Joran Dirk Greef
7 months
I remember the first time we switched on TigerBeetle’s VOPR—the adrenaline and velocity of being able to use Deterministic Simulation Testing to find (and fix) 30 distributed systems heisenbugs in 3 weeks. That was the 2nd time (since July 2020) that I knew we could build
@tangledbytes
Utkarsh Srivastava
7 months
First bug reported by the GH action simulator 🎉! I have never been this happy seeing an issue getting opened 😂.
3
0
27
0
5
38
@jorandirkgreef
Joran Dirk Greef
8 months
One of the things we enjoy most at @TigerBeetleDB is our “Walk and Talks”, where we deep dive into problems and explore the code together (in our heads, away from screens) as we walk and talk through solutions. Here, we get to do one with you!
@ScyllaDB
ScyllaDB
8 months
How do predictable performance and 700x faster tests sound? 😲 If you're interested, don't miss @TigerBeetleDB's Aleksey Kladov at this year's free, virtual, & highly interactive @P99CONF. #ScyllaDB #lowlatency #database #P99CONF
0
1
11
0
5
37
@jorandirkgreef
Joran Dirk Greef
5 months
@dodyg @TigerBeetleDB @davidfowl Thanks @dodyg! Binary size is something we care about and are starting to track. For example, matklad wrote about a tool called copyhound that can be repurposed as a technique to catch generic code explosion in the binary:
5
4
37
@jorandirkgreef
Joran Dirk Greef
1 year
Getting the nouns and verbs “just right” is crucial to a crisp, clear mental model for building (great!) distributed systems. And this post by @DominikTornow is a sublime example: “Concurrency is a statement about logical time, parallelism is a statement about physical time.”
@DominikTornow
Dominik Tornow
1 year
New blog post just dropped: A Tale of Two Dimensions The post explores the synchronous vs asynchronous and sequential vs concurrent dimensions of distributed systems #ThinkingInDistributedSystems #Goals2023
1
8
26
2
5
36
@jorandirkgreef
Joran Dirk Greef
2 years
Awesome to see how many people are sponsoring @MichalZiulek's and @slimsag's projects. All open source. And all @ziglang.
1
5
36
@jorandirkgreef
Joran Dirk Greef
3 months
I’ll never forget the thrill of switching on TigerBeetle’s Deterministic Simulator for the first time. Feeling time accelerate and finding and fixing so many bugs in such a short time (while listening to Crawl by Kings of Leon). Join matklad behind the scenes to experience this
@TigerBeetleDB
TigerBeetle
3 months
Ever wondered how debugging works with a deterministic simulator? Watch this week's episode of IronBeetle ⚡️with matklad… …where we chase a real consensus availability bug.
0
3
20
0
7
36
@jorandirkgreef
Joran Dirk Greef
3 months
Is Deterministic Simulation Testing “mainstream”? (Would ❤️ it to be!) A deep dive discussion with @DominikTornow recorded last week.
@jorandirkgreef
Joran Dirk Greef
3 months
@sunbains To be clear, Deterministic Simulation Testing is more than simply fault injection or chaos engineering (common misconception). To do DST, the “system under test” itself must also be written 100% deterministically. Very few DBMS systems (I know of only 3!) actually do DST. And
2
12
44
1
7
35