We’re super excited to announce that we just raised $150 million in Series B funding (at a $1.5B valuation), led by Altimeter and Coatue. This brings us closer to our goal: to power all companies’ data movement.
#moderndatastack
#funding
#opensource
[1/12]
We at Airbyte are excited to share our journey of optimizing our CI/CD process with the Dagger platform. As an open-source ELT platform managing hundreds of Docker containers, we faced challenges with our original CI system, which was a maze of GitHub Actions, YAML, and shell…
Most surprising result so far from our State of Data Engineering Survey:
🛑 20% of data teams are in a hiring freeze! 🛑
We're conducting the biggest survey of Data Engineers ever. BigQuery vs Snowflake? Monte Carlo vs Great Expectations? Come vote!
move(data) 2022 is going to be the biggest Data Engineering conference yet! You don't wanna miss anything 😉
Join us using the link below to secure your ticket now!
Help us prioritize which connectors to build first with this 1.55 min long survey (that's our average)! Only 4 questions that will help us prioritize the connectors YOU need, so please participate!
🥳🚀 We've reached a massive milestone!
10k stars on GitHub, and we couldn't have done it without our incredible community! 🌟
Your passion, support, and contributions are the true driving force behind our success.
Thank YOU for being on this journey with us! 🙌
There is so much knowledge to gain from top engineering leaders that we decided to curate their best articles and list all the key insights for you in our new blog. Our newsletter "Weekly Bytes" curates the top 5 new articles right in your mailbox.
While software engineering efficiency has become key to every company, it is still a black box and is mostly managed in a gut-driven way. That's the problem
@anaxihq
addresses by bringing visibility to the software development process.
How do you build a user-facing dashboard without impacting your production database?
Here is how
@dunithd
did it using Airbyte,
@apachekafka
, and
@ApachePinot
.
In March, we announced our $5.2M Seed round with Accel. Today, we're announcing our $26M Series A with Benchmark!
So much has happened in the past 2 months. Metrics quadrupled: companies syncing data (500 to 2,000), our Slack community (now 1,300 members), and contributors too!
[1/2]
⏰ NEW TUTORIAL ⏰
Learn how to chat with our SQL data warehouse using
@llama_index
- Write questions in plain English
- Have AI write SQL queries to answer the questions
- No need for embeddings or vectorstore databases.
Check it out now 👇
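To make the idea concrete, here is a minimal sketch of the question-to-SQL flow using only the Python standard library. It is not the tutorial's code and does not use the LlamaIndex API: `fake_llm` is a hypothetical stand-in for a real model call, and the `orders` table is invented for illustration. The point it shows is that the table schema goes straight into the prompt, so no embeddings or vector store are involved.

```python
import sqlite3


def schema_prompt(conn, question):
    """Build a prompt from the table definitions; no embeddings involved."""
    ddl = [row[0] for row in conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    return "Schema:\n" + "\n".join(ddl) + f"\nQuestion: {question}\nSQL:"


def fake_llm(prompt):
    """Hypothetical stand-in for a real model call; returns a canned query."""
    return "SELECT COUNT(*) FROM orders WHERE status = 'paid'"


def ask(conn, question, llm=fake_llm):
    """Turn a plain-English question into SQL, then run it."""
    sql = llm(schema_prompt(conn, question))
    # Only run read-only statements produced by the model.
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("refusing to run non-SELECT SQL")
    return sql, conn.execute(sql).fetchall()


# Invented sample data standing in for a warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, "paid"), (2, "paid"), (3, "refunded")])

sql, rows = ask(conn, "How many orders were paid?")
print(sql, rows)
```

Swapping `fake_llm` for a real model client is the only change needed to try this against live questions; the SELECT-only guard is a cautious default, not a complete safety measure.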
Here's a great big guide on data engineering! 👉 The Data Engineering Cookbook
The repo has almost 10k stars on GitHub; it's well written and contains all the information you need to start your
#dataengineering
journey!
#bigdata
#dataops
Today, we are announcing an important change to Airbyte and how we are going to future-proof the commoditization of data integration with the Airbyte community.
(1/7)
Verified by Twitter? So 2008. Verified by Prefect is the new blue check. Today launches our new Premier Technology Partner Program within our partner ecosystem, read on 👉
🎉 We have officially launched our new Connector Builder!
Creating REST API Connectors has never been easier and did we mention it can now be done with no code?
Here are some resources for you to get started 🧵⬇️
Postgres has established itself as one of the most popular databases in the 🌎.
Our team spent the last quarter developing our Postgres GA Connector for Airbyte Cloud, which is now live! 🥳
New accounts will receive 30 days of unlimited syncs below.
@AirbyteHQ
has nailed community-led growth by leveraging Slack and focusing on two key metrics: time to first response & time to resolution.
Result: their GitHub repo stars chart is not a curve, it's a straight line 🤯
In the past few months, we surveyed the data community in order to build the State of Data 2023.
With 886 participants, this is the largest data engineering survey!
1/8
In this article, we discuss the fundamental architectural difference that makes data warehouses appropriate for analytics, and that makes operational databases appropriate for operational workloads.
Our latest blog post discusses the evolution of the
#dataengineering
role! ✨
This blog post looks at the past and the present of the data engineer role, examining emerging
#trends
to offer you some predictions about the future 🔮
Learn everything you'll need to implement robust AI data pipelines with
@LangChainAI
and
@dagster
Use Dagster to set up a pipeline that processes the data and stores it in a vector store
Combine information in the store with LLMs using the LangChain QA module
Tuesday was a big day for
@dagsterio
at the
#dagsterday
! We wrote a short recap about the announcements and newest features ✨. Huge congrats on the v1.0 release and the General Availability of Dagster Cloud! 👏🏻
#branchdeployments
.
An Airbyte user self-hosting an unsecured instance had their connector credentials stolen.
Transparency is a core value at Airbyte. Here's what happened, what you should know, and what we will do to improve.
We're excited to announce Destinations V2!
It ships with:
- One-to-one table mapping
- Improved per-row error handling with _airbyte_meta
- Internal Airbyte tables in the airbyte_internal schema
- Incremental delivery for large syncs
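As a rough illustration of per-row error handling, the sketch below filters rows whose `_airbyte_meta` value records a problem. It assumes `_airbyte_meta` holds a JSON object with an `errors` list, which this announcement does not spell out; the sample rows and the `rows_with_errors` helper are hypothetical.

```python
import json

# Hypothetical rows from a final table; the _airbyte_meta layout is an assumption.
rows = [
    {"id": 1, "age": 34, "_airbyte_meta": '{"errors": []}'},
    {"id": 2, "age": None, "_airbyte_meta": '{"errors": ["Problem with `age`"]}'},
]


def rows_with_errors(rows):
    """Return (id, errors) for rows whose metadata records at least one error."""
    flagged = []
    for row in rows:
        meta = json.loads(row["_airbyte_meta"] or "{}")
        errors = meta.get("errors", [])
        if errors:
            flagged.append((row["id"], errors))
    return flagged


print(rows_with_errors(rows))
```

The idea is that a bad value in one column no longer has to fail the whole sync: the row still lands, and the metadata column tells you which rows to inspect.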
🎂 It's the 10-year anniversary of
#AmazonRedshift
today!
What do you think were the key highlights of AWS' fastest growing service?
How did it revolutionize the industry, and where is it going in the next decade?
Full retro + 2 insider interviews:
Last week, we got 3/4 of the team in San Francisco. It was amazing to see them in real life. Nothing replaces spending time together in person. We're so looking forward to our next in-person event!!
#remote
Have you heard of
@LangChainAI
?
If you haven't, you're about to learn 🧠
We now support vector databases (Pinecone and DocArray) through LangChain!
Learn more in our article below
Dive into our EaC for data structure tutorial and learn how to:
- Use Airbyte’s Terraform provider to manage data ingestion
- Orchestrate multiple Airbyte syncs in parallel using Kestra
- Add data transformations with dbt and Python
- And more!
🐘 Meet the hottest new Data community: Mastodon!
Your guide to a federated social media future, and how to join the burgeoning new Data Folks on the platform that *cannot be bought*!
It’s now possible to utilize the Airbyte sources for Gong, Hubspot, Salesforce, Shopify, Stripe, Typeform and Zendesk Support directly within your
@llama_index
based application, implemented as data loaders
Learn about using this powerful new tool 👇
Today is a very special day: Airbyte is officially turning 1! 🎉
We're going over everything that happened in the past year in this article.
We also hit a very important milestone as we now officially support 100+ connectors!
Thanks so much for your support! 🙏
🎉 We just hit 10k members on Slack!! 🎉
From all of us here at Airbyte, we can’t thank you enough for the continued support and growth. Here’s to many more and keep on spreadin the word 🗣 ☺️
We're proud to announce our inaugural Data Podcast awards:
🏆 The Best Data Podcasts in 2022!
(Basically we listened to a ton of podcasts and chose our top 5 favorites each, and why you should check them out!)
Did we miss any? What are your faves?
AI is super cool, and here at Airbyte, we think a lot about how engineers can best leverage up-and-coming AI tools.
But we're seeing some worrying patterns!
There are plenty of data-movement best practices to apply here.
No need to reinvent the wheel.
"What is DuckDB?" - top data twitter question the past week/month year 🔥
good thing
@sspaeti
added it to our data glossary recently!
It's open source, additions/contributions welcome!
Airbyte got its first case study. Exciting! Most of the numbers are outdated now, but it's a pretty great review of everything that happened in the first 9 months of the project! Thanks for doing this
@Threado
!
#OpenSource
In this tutorial, you can build an ELT pipeline to discover GitHub users that have interacted with the Prefect, Airbyte, and dbt repositories by leveraging the three tools! 🤩
Have you used
#Airflow
operators for ETL pipelines? There are huge challenges using its transfer + transformation operators. You'll end up with 100s of connections. The best way is to use Airflow as orchestrator, Airbyte as EL, and dbt as T.
#opensource
At Airbyte, we use Gradle as our build tool, jOOQ as the ORM, and Flyway to manage database migrations for our open-source project.
@LirenTu
shared why we write Flyway migration scripts in Java and how we update the jOOQ code with Gradle.
@JavaOOQ
Part 2 of "Data Modeling – The Unsung Hero of Data Engineering" is here. We're focusing on modeling approaches & techniques, including top-down vs bottom-up, dimensional, data vault, and entity-centric methods. Learn more about common modeling challenges👇
Hex? Preset? Mode? Tableau?
Have opinions about BI? Come fill out our STATE OF DATA ENGINEERING survey
And weigh in on:
- best data substack
- best data youtube
- best data podcast
- best data community
Be the first to get results + special🐙 swag!
Dagster can orchestrate data ingestion pipelines with Airbyte, SQL-based transformations with dbt, and any kind of Python transformation.
Learn how to ingest and transform Github and Slack data into Postgres.
We're big fans of Seattle Data Guy, and having an article from him about Airbyte is always a good sign that we must be doing something right! Here's his latest piece:
#opensource
#bigdata
#dataops
#dataengineering
PSA: Special guest for our next Airbyte Community Call:
@vinodhinisd
from
@lakeFS
!
📆 Wednesday Oct 12, 1pm PST
Register:
Come learn about data versioning at *MASSIVE* (100s of PBs) scale, and how
@TheTreeverse
uses Airbyte!
🧵 We're super excited to announce the release of the 0.50 version for Airbyte with 3 significant features, now available on both Airbyte Open Source and Cloud:
🔄 Schema propagation
👉 Column selection
🏁 Checkpointing
Here's why this is significant for Open-Source ⬇️
Have you heard of the most loved language, Rust? Are you curious why and how it might be relevant for data engineering?
@sspaeti
took a first glance and checked how it compares to the language of data engineers, Python.
Hacktoberfest 2022 starts next week! 🎃
From 10/03 to 11/02, we’re having a contest to see who can build the best and most connectors using our low-code CDK (available for Early Access only). We've got cash prizes and more!
All details here 👇
Hey Python enthusiasts! 😊🐍 Are you eager to create a destination using Python? We've created this in-depth tutorial to guide you through the process. It features example code showcasing the implementation of the DuckDB destination. Happy coding, everyone!