A super helpful data engineering handbook.
It lists resources from:
- Certification Courses
- Communities
- Conferences
- Data Engineering Whitepapers
- Great Podcasts
- Great YouTube Channels
- Great books
- Newsletters
- People from LinkedIn, Twitter
Building a Data Engineering Project in 20 min: you'll learn web scraping of real-estate listings, uploading them to S3, processing with Spark and Delta Lake, adding data science with Jupyter, ingesting into Druid, dataviz with Superset, and managing everything with Dagster.
We've open-sourced an "Open Enterprise Data Platform", integrating the Modern Data Stack into a single portal.
It features state-of-the-art tools like dbt for SQL data modeling, Airflow for task orchestration, and Superset for BI dashboards, all on a Postgres database.
Big update to my Practical Data Engineering project on GitHub. Three years on, this hands-on guide remains a key resource, now refreshed with the latest from Dagster and Delta Lake. Goodbye, Spark (locally) - hello, delta-rs. And much more.
GH or the YT 👇🏻
I'm upgrading my practical data engineering project. Interestingly, the tools I used three years ago are still valid today. Except I'm ditching Spark locally—it's such a nightmare—and using delta-rs.
Getting there... 🙃.
Announcement: I'm writing a book! ✨ But wait... it's not your usual IT book.
1. Debuting as a digital book & website.
2. It does *not* come finished. I will steadily release new chapters and carefully listen to all feedback.
But the topic? 👉🏻 Data Engineering Design Patterns
🎉 Celebrating the release of Pandas 2.0. With Apache Arrow as its backbone, Pandas is now faster and more powerful than ever before. We'll explore why and compare it to alternatives (Polars, Vaex, Koalas, or even DuckDB), all of which consolidate their in-memory format around an open standard.
Found a new glossary for data engineering. Check it out.
Some terms explained extensively:
* Fan-Out:
* Partition:
* Wrangle:
Well done,
@dagster
team 👏🏻. Next up, backlinks? :)
My Journey through Data Modeling: Navigating the Levels.
Reflecting on my 20-year data modeling journey, I'm amazed by the evolution of approaches and levels in the field. No longer limited to Inmon and Kimball, we now have diverse techniques, each with its value.
Great quick start list to start your data engineering journey with various templates for domains like Marketing, Product, Finance, Operations, and more.
➡️
Great work by Thalia and the Airbyte team.
> It's scary how versatile/productive your terminal, and specifically Neovim, can get.
In one screen:
1. data integration/dbt code
2. analysis of SQL queries
3. db connections/browser
4. result of queries
5. docker build
6. dbt run
7. postgres
8. more windows/sessions (tmux)
«Data Engineering Vault»
> More than a mere collection of terms, it’s a curated network of data engineering knowledge to facilitate exploration and discovery.
Like a digital garden, 100+ interconnected terms are a gateway to deeper insights.
Rill Developer and Dagster are still my favorite tools; running on top of DuckDB, they're a blast to use.
Currently building my personal finance dashboard, reading from exported CSV, and categorizing groups of transactions in main and subcategories.
GH:
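As an illustration of the categorization step, here is a stdlib-only sketch; the rules, merchants, and CSV columns are all made up, not taken from the actual project:

```python
import csv
import io

# Hypothetical keyword rules: (keyword, main category, subcategory).
RULES = [
    ("migros", "Living", "Groceries"),
    ("sbb", "Transport", "Train"),
    ("netflix", "Leisure", "Streaming"),
]

def categorize(description: str) -> tuple[str, str]:
    """Assign a main and subcategory based on simple keyword matching."""
    desc = description.lower()
    for keyword, main, sub in RULES:
        if keyword in desc:
            return main, sub
    return "Uncategorized", "Uncategorized"

# Made-up export in the shape of a typical bank CSV.
export = (
    "date,description,amount\n"
    "2024-01-03,MIGROS BASEL,-54.20\n"
    "2024-01-05,SBB TICKET,-12.80\n"
)

rows = []
for row in csv.DictReader(io.StringIO(export)):
    row["main"], row["sub"] = categorize(row["description"])
    rows.append(row)

print(rows[0]["main"], rows[0]["sub"])
```

A real setup would likely push these categorized rows into DuckDB or a dashboard tool, but the tagging logic stays this simple.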
Data lakes consist of mainly three parts:
1. Storage-Layer (S3, google/azure blob)
2. File Formats (Parquet, ORC, Avro, "Arrow")
3. Table Formats (Delta, Iceberg, Hudi)
Recapping my data lake/lakehouse guide article shows that there are favorites in each category.
Data engineering (DE) is still not well defined; it's a discipline that shifted from the DBA, ETL developer, and BI specialist roles and merged with software engineering into the data engineer. If you are like me and confused about the latest terms, I started a DE concept page.
Amazed by how
@Readwiseio
improves my workflow whenever I check the latest. Replaced my RSS feeder with Readwise Reader, which combines Instapaper, RSS, web highlights, tweets, books, email, PDF into one. All highlights/notes are automatically synced into
@obsdmd
, my
#secondbrain
> The history of SQL
SQL -> Data Mart -> Materialized View -> Business Intelligence Dashboard -> OLAP Cube -> dbt tables -> One Big/Wide/Super Table -> Semantic Layer
Nice illustration on the different data modeling techniques:
> Enterprise Data Warehouse (Inmon)
> Star Schema (Kimball)
> Data Vault
> One Big Table (OBT)
Source:
🌟 Data Modeling: The Unsung Hero of Data Engineering. In my upcoming blog post, I'll explore the significance of data modeling, its various approaches, and its role in the broader context of data engineering.
#datamodeling
#dataengineering
#dataarchitecture
If you use SQL IDEs (DBeaver but in the terminal), you might enjoy .
Supports: BigQuery, ClickHouse, Impala, jq, MongoDB, MySQL, Oracle, osquery, PostgreSQL, Presto, Redis, SQL Server, SQLite, DuckDB (on the way).
Or Harlequin, if vim is not a thing.
Please create your own website. Don't give away all your content to social media. That was always my philosophy; therefore, my website has a lot of content. I created knowledge for myself, not for other huge companies.
Check out Eric's explanations: .
Interesting
#ModernDataStack
: «PRQL + DuckDB + Dagster».
> I evaluated the space for work at my current company (handling ~300 sources, ~1k downstream dbt tables + hundreds of dashboards)
Found on HN () about the PRQL as a DuckDB extension announcement.
Just released a new chapter in my DEDP book, exploring the evolution of SQL. Dive into concepts like Materialized View, OLAP Cube, dbt Table, Traditional OLAP, and DWA.
Discover common patterns such as reusability, caching, and business transformations.
As a data professional with 20 years of experience, I've seen repeated terms in tech over and over again. Today, I discovered "Personalized API", yet another new term for something that already existed.
A fantastic presentation about `dagster-embedded-elt` with Dagster.
Talking about:
> Types of data ingestion
> What makes data integration difficult
> Lessons from DuckDB ("smol" is better)
> Ingesting from API and a database are inherently different
👉🏻
Many asked me how to get started with data engineering. I suggest solving a problem or something you are passionate about with an actual project.
I collected a list of projects if you need help—get inspired and choose according to your skills.
With the release of my book, I added 60+ more terms to my second brain. To make these terms more discoverable, I added a map of content dedicated to
#dataengineering
.
All notes are interconnected, similar to our brain, making learning new terms easy.
Some 🔮 for 2023
> DuckDB standard for working with data
> Rust will be more mainstream (and spark will compete with it)
> MDS will be renamed and become better known outside the US
> Semantic layers will gain adoption
> Orchestration is seen as a key component
> Open standards everywhere
It's best to keep an updated CV—even if not searching. I do not like this process. Everything I do is online already. But as in Europe, CVs are still a thing, so I converted mine into Markdown and keep it updated on .
Not perfect, but it's a start.
📘 Just released the next chapter in my Data Engineering Design Pattern book. It covers the evolutionary journey of
#ETL
and dives into the realms of Data Warehouses, Master Data Management, Data Lakes, Reverse ETL, and CDPs.
📊🔨 Launching the final part of our series: "Data Modeling: The Unsung Hero of Data Engineering." Delving into data architecture patterns, their influence on data modeling, & the importance of strategic decisions.
#DataModeling
#DataEngineering
I'm exploring the evolution of orchestration, comparing different CEs: From Bash scripts and Cron to stored procedures and Python's modern frameworks. How did we transition from basic scripts to complex, data-aware orchestration?
Any anecdotes or specifics I should include?
Quick Update: I'm no longer at
@AirbyteHQ
. Tremendously thankful to Michel & John, who believed in me and created a unique position as a writer and data engineer. Also, huge thanks to Ari, who worked behind the scenes.
Some learning 👇🏻
TIL— Instead of doing some dbt magic, I can use
@RillData
to analyze my exported transactions and build an analytics dashboard without any extra steps 🤯
Beautiful how Rill visualizes time/number data automatically, playfully, and interactively.
I've recently replaced DBeaver (for the most part) with an extension for my IDE of choice, Neovim. It works surprisingly well.
🪄 Check out a short demo: .
Extension on GitHub: kristijanhusak/vim-dadbod-ui
#DuckDB
is hot these days, but what are its use cases? Here are 3:
* Ultra-fast analytical use cases locally
* SQL wrapper with zero copies (e.g., on top of Parquet files in S3)
* Bring your data to the users instead of incurring big round trips and latency through REST calls
What else?
It's always a pleasure to listen to
@schrockn
and Tobias.
TIL—
✅ MLOps is mostly data engineering
🤔 SQLMesh is a better dbt
💯 Orchestration shouldn't be an afterthought; instead, the first thing when starting a data project
WASM and DuckDB to get the Parquet schema by hovering in BigQuery 🆒.
Usually, you need to download parts of the metadata to read it, typically in notebooks or similar, but WASM runs entirely inside the browser, which makes this an excellent use case. Thanks for sharing
@_Blef
.
Ready to unleash the power of
#DataModeling
? Dive into the dynamic world of data modeling techniques! In Part 1, we explored the importance of data modeling & its role in unlocking the value of your organization's data. In Part 2 (), let's delve deeper.
The moment I discovered the efficiency of Vim’s modal editing, my journey has been about finding clarity in my work.
Going from Notepad++ and SSMS to embracing Vim represented a significant shift in how I approach tasks in data engineering and writing.
Vim is a popular text editor that relies heavily on keyboard shortcuts to get stuff done fast.
Once
@sspaeti
started learning its language, he was hooked.
Here he explains why Vim is more than just an editor & discusses its language, motions, & modes.
Learning
#rust
with
#duckdb
🧐.
So far: Converting exported transactions in XLSs to CSV and importing them into DuckDB.
Next step: Find missing rows 🙈.
🔗
Why do I always end up justifying Facts and Dimensions instead of directly creating a one-big-table with the fixed group when implementing dimensional modeling? How do you argue against it despite the added complexity no business user ever understands? 🫠
Hey everyone! Have you tried Ballista, a distributed compute platform primarily built in Rust and powered by Apache Arrow and Datafusion? It competes with Apache Spark for distributed SQL query processing. I'd be curious to hear anyone's thoughts 🤔.
Diving deep into "Business Intelligence, Semantic Layer, Modern OLAP, and Data Virtualization." Each has unique attributes but with intersecting goals. A thread👇
It takes a lot of work to keep up with the latest data engineering. We (
@AirbyteHQ
) created a survey to keep up with yearly trends.
Check out the exciting results from 886 participants, concluding the largest data engineering survey.
Explore HelloDATA BE on GitHub () and try the docker-compose on your local machine. We're early and value community feedback.
For a deeper understanding, our documentation covers the data stack, components, architecture, and infrastructure.
Some Personal News: I joined
@AirbyteHQ
as Data Engineer & Technical Author! 🎉 So proud and honoured to work with such talented people and a fantastic
#dataintegration
tool. Stay tuned if you are following me for
#datacontent
. I will spend more time writing than I could before.
A fascinating podcast about CRDTs and Automerge with
@martinkl
. I am looking forward to when we have a
@obsdmd
sync with CRDT for collaboration on top of Markdown.
I'm adding paid chapters to . It was a tough decision. But I have to try. I'm working hard to consolidate my decade of experience into one book, introducing new data engineering patterns and insights. I'm confident some will appreciate & pay a small amount.
Understanding semantic vs transformation layer:
> The transformation logic (e.g., dbt) and the logic hosted in metrics are different. A semantic layer transforms/joins data at query time, whereas the transformation layer does so during the transform step (T).
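To make the distinction concrete, here is a toy sketch using sqlite3 as a stand-in warehouse; the table, metric name, and helper function are all hypothetical:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (region TEXT, amount REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?)",
                [("EU", 100.0), ("EU", 50.0), ("US", 70.0)])

# Transformation layer: a dbt-style model, materialized at transform time (T).
con.execute("""CREATE TABLE revenue_by_region AS
               SELECT region, SUM(amount) AS revenue
               FROM orders GROUP BY region""")

# Semantic layer: the metric is defined once and compiled to SQL at query time.
METRICS = {"revenue": "SUM(amount)"}

def query_metric(metric: str, dimension: str) -> list:
    sql = (f"SELECT {dimension}, {METRICS[metric]} FROM orders "
           f"GROUP BY {dimension} ORDER BY {dimension}")
    return con.execute(sql).fetchall()

print(con.execute("SELECT * FROM revenue_by_region ORDER BY region").fetchall())
print(query_metric("revenue", "region"))
```

Both paths return the same numbers; the difference is when the aggregation runs, and that the semantic layer can serve any dimension without a pre-built table.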
Exploring the history of SQL reveals a fascinating evolution of data management. SQL's journey has been groundbreaking since its 1970s inception as SEQUEL to today's advanced Natural Language Queries.
What's your favorite SQL evolution with all these different SQL flavors today?
Built a free, open-source pipeline that fetches data from public APIs, ingests from a Postgres database, reads 17-million-row CSV files, writes Parquet files, reads them into DuckDB, aggregates, joins, sorts, runs dbt, and completes on my laptop in a minute, with materialization and…
🎉 Today, I'm sharing something different - A post where I share my heart, struggles, and triumphs. It's raw, honest, and real.
I took the
@0xFoster
course and read
@p_millerd
's Pathless Path book, which inspired me to dive deep into my personal journey.
I created a «Technical Writers' Collective» for anyone who is passionate about writing and interested in efficiency, PKM, workflow, and tools like Obsidian, Vim motions, etc. An intersection between tech and a love for writing.
@zulip
invite:
I can't stop thinking of CRDTs (Conflict-free Replicated Data Types).
> General-purpose data structures, like hash maps and lists, built for multi-user use from the ground up.
I'm looking forward to seeing how they'll be integrated into our daily tools for local-first collaboration.
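Since the quote describes CRDTs abstractly, here is a minimal grow-only counter (G-Counter) sketched in plain Python: each replica increments only its own slot, and merge takes the element-wise max, so replicas converge no matter the merge order.

```python
class GCounter:
    """Grow-only counter CRDT: one increment slot per replica."""

    def __init__(self, replica_id: str):
        self.replica_id = replica_id
        self.counts: dict = {}

    def increment(self, n: int = 1) -> None:
        # A replica only ever bumps its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative, and idempotent,
        # which is exactly what makes merges conflict-free.
        for rid, c in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), c)

# Two replicas update concurrently, then sync; both converge to 5.
a, b = GCounter("a"), GCounter("b")
a.increment(3)
b.increment(2)
a.merge(b)
b.merge(a)
print(a.value(), b.value())  # 5 5
```

Libraries like Automerge generalize this idea from counters to maps, lists, and text.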
Lots of insights from the Airflow migration day by
@dagster
. It is fantastic to see these customer examples. Here is a short recap of what I found most interesting. Thanks for sharing.
Nine months ago, I posted my latest blog post, but hey, here is a new one! This time it's all about getting your hands dirty with a real-estate
#dataengineering
project, including common challenges explained along the way:
What is a semantic layer?
> A semantic layer is something we use every day. We build dashboards with yearly and monthly aggregations and design dimensions for drilling down reports by region and product. What has changed is that we no longer use a singular BI tool; teams use different visualizations.
#movedata
: There are so many great speakers from the data engineering space! I loved the insights from everyone, all bundled into short lightning talks in one single YouTube playlist.
This is the future of blogging. Notes are updated constantly, evolving, and hosted on plain Markdown files. No lock-in to a platform that will be gone in 2-3 years, as I've experienced many times throughout my career. 👉🏻 Check out the first movers:
@anna__geller
@imrobertyi
@matsonj
@pdrmnvd
@AirbyteHQ
Thanks for sharing, Anna. Exactly, the data glossary is built on top of the digital garden/second brain analogy. Instead of single levels, it lets you go inwards. You can learn and go deeper into each connection with an interactive graph and backlinks. .
Have you missed the
#dbtcoalesce
keynote yesterday?
The most significant update is the dbt Semantic Layer and its robust integrations with other platforms. dbt Python is still early, and there is not much news there.
As Tristan Harris said before: "A handful of tech companies control billions of minds every day" by creating the most sophisticated technology to make us addicted. It's great to see a counterforce at least doing it for a good cause. :)
At age 37, I realized that my most productive days are when I sleep enough and let my brain wander. Instead of checking social media, drinking coffee, and watching yet another YouTube video for research, aka overstimulation, I do nothing.
Later in the day, I will have an insight I wouldn't otherwise have.
Breaking: the pathless path is now free, instantly
i asked myself, what would be the most fun thing to do with my book?
the recent Smart Friends podcast with
@EricJorgenson
convinced me to do this
(i am not tracking downloads)
👉
Don't specialize, hybridize.
> T-shaped hybrid path: engineering and design, or singing and dancing.
> U-shaped: engineering and dancing, or singing and design. Skills that are not often found together.
By becoming a hybrid, you can become greater than the sum of your skills.
As a seasoned computer scientist, I've learned the power of a Personal Knowledge Management (PKM) system for a deeper life. Imagine capturing every fleeting thought, every piece of knowledge, and interlinking them. It's more than productivity; it's crafting a deeper existence.
Quality software deserves your hard‑earned cash
Quality software from independent makers is like quality food from the farmer's market. A jar of handmade organic jam is not the same as mass-produced corn-syrup-laden jam from the supermarket.
Industrial fruit jam is filled with…
Btw, I'm trying out dlt, and the Postgres-to-Postgres sync was so slow for a biggish table that I exported it to DuckDB (leveraging the performance gains of ConnectorX and Parquet) and used the Postgres extension to import it back into Postgres.
Hacky? Yes totally! But so far, brutally fast.
What is a Data Catalog?
> A data catalog is a centralized store where all the metadata about your data is made searchable. Think of a Google search for your internal metadata.
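A toy stdlib sketch of that "search your metadata" idea; the entries and the helper function are all made up:

```python
# Hypothetical catalog entries: metadata about tables, not the data itself.
catalog = [
    {"table": "orders", "column": "amount", "description": "Order total in CHF"},
    {"table": "customers", "column": "region", "description": "Sales region code"},
]

def search(term: str) -> list:
    """Keyword search over table names, column names, and descriptions."""
    term = term.lower()
    return [e for e in catalog
            if term in e["table"]
            or term in e["column"]
            or term in e["description"].lower()]

print([e["table"] for e in search("region")])  # ['customers']
```

Real catalogs (DataHub, Amundsen, OpenMetadata) add lineage, ownership, and ranking on top, but the core is exactly this: an index over metadata.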