Adi Polak

@AdiPolak

14,404
Followers
803
Following
1,183
Media
6,771
Statuses

DevX @ Confluent • Cloud • ML/AI & Data Platforms • Ex Microsoft, Akamai • Keynote Speaker • Author of Scaling ML Systems (O'Reilly) • Opinions are mine

San Francisco
Joined May 2011
Pinned Tweet
@AdiPolak
Adi Polak
3 years
I earned a master's degree following my thesis in machine learning and can tell you right now that some of the BEST coders I've worked with never had formal education. It 💯% helped me land a job interview, but it does not make me a better coder/data scientist than anyone else.
34
116
1K
@AdiPolak
Adi Polak
3 years
The purpose of Code Review is learning, not blaming.
96
1K
7K
@AdiPolak
Adi Polak
7 months
Stanford has released Prof. Manning's NLP course for free (!!!) on YouTube if you are interested in diving deeper into NLP. It starts with the very basic word2vec and expands to: -> Domain adaptation for supervised sentiment -> Retrieval augmented in-context learning ->…
10
355
2K
@AdiPolak
Adi Polak
3 years
While Data Scientists are the most profitable jobs in Tech, there is a huge rise in demand for people who understand the Machine Learning Pipeline and can take responsibility for the whole stack. Yes, that includes extensive work with Data & Analytics stack. Interesting times.
27
194
2K
@AdiPolak
Adi Polak
3 years
Microsoft partnered with PyTorch to provide you with a completely FREE PyTorch fundamentals course. The course includes a built-in sandbox experience where you can code directly from your browser. No need to install/download anything, just schedule the time to learn. Link ⤵️
Tweet media one
12
345
1K
@AdiPolak
Adi Polak
2 years
Personal ANNOUNCEMENT ✨ Happy to share the early release of my book on Machine Learning with Apache Spark. It contains 3 chapters, and new chapters, corrections, and updates will be released every month. Looking forward to your thoughts and how it can better serve your needs!
Tweet media one
63
135
1K
@AdiPolak
Adi Polak
7 months
“ Solid understanding of Spark and ability to write, debug, and optimize Spark code.  “ I was today years old when I discovered that openAI relies heavily on Spark.
Tweet media one
16
83
885
@AdiPolak
Adi Polak
2 years
Entropy is the golden measurement for machine learning. High entropy means data chaos, low information gain, and an inaccurate model. Low entropy means better knowledge in the system, high information gain, and a better model. Noise reduction & clean data are a must for a good model!
16
123
804
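As a rough illustration of the entropy and information-gain idea in the tweet above, here is a minimal sketch of Shannon entropy over class labels and the gain from one split. The function names `entropy` and `information_gain` are hypothetical, chosen for this example only; this is the standard textbook definition, not something taken from the tweet.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # Sum of -p * log2(p) over the observed classes.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(parent, children):
    """Entropy of `parent` minus the size-weighted entropy of its `children`."""
    total = len(parent)
    if total == 0:
        return 0.0
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted
```

A perfectly mixed two-class set has an entropy of one bit, and a split that separates the classes cleanly recovers all of it as information gain.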
@AdiPolak
Adi Polak
3 years
When you LOL and nod your head in agreement at the same time. Thanks, #ProductLead for sharing this!
Tweet media one
9
153
753
@AdiPolak
Adi Polak
2 years
In the Data Engineers community, we're working on a list of fundamental concepts you should know: - Distributed Systems - Load Balancing - Caching - Data Partitioning - Indexes - Redundancy and Replication - SQL vs. NoSQL - CAP Theorem - Consistent Hashing What did we miss?
59
102
730
@AdiPolak
Adi Polak
3 years
Are your BI tools and Data Pipeline sufficient? Think again. Here is a summary of the impact of DataOps: - by Data Engineering weekly
Tweet media one
11
97
718
@AdiPolak
Adi Polak
3 years
A self-taught friend is now officially a Software Engineer at Google. It took her 6 years. How did she do it? 👉 Studied Google open source technologies in depth 👉 Joined a small startup as a dev 👉 Moved to a bigger company to learn more tech stacks 👉 Practiced with LeetCode
23
80
709
@AdiPolak
Adi Polak
4 years
LOL 😂😂😂
Tweet media one
33
134
643
@AdiPolak
Adi Polak
4 years
What is the first programming language you learned? Would you recommend learning it today?
1K
61
622
@AdiPolak
Adi Polak
2 years
We always focus on good code as 1st priority for building software. But Machine Learning is different. With ML, Code takes a secondary role, and "data" becomes the lead actor. If you can understand how to produce, collect, manage and analyze data, you’ll own your future.
10
82
558
@AdiPolak
Adi Polak
2 years
Make a data engineer cry with just 4 words
442
60
521
@AdiPolak
Adi Polak
2 years
If you want to be a data engineer, you have to build data pipelines! Write some code, build some integrations, and break some stuff! It will literally make you better at understanding data!
14
54
482
@AdiPolak
Adi Polak
3 years
Getting into data science/engineering can be overwhelming. Pick just one open-source library and start digging into it. You'll learn design patterns, func, and algorithms all in one open-source code. You’ll also get exposed to how to write code that is more than hello world.
14
60
478
@AdiPolak
Adi Polak
2 years
After fantastic years in Microsoft, it’s time for a new adventure. Excited about the next chapter in my career, where I will continue to push the boundaries of Data and Machine Learning technologies and work towards novel open-source solutions. The Future is Open ✨
57
10
428
@AdiPolak
Adi Polak
3 years
Hello, cutie🤩
Tweet media one
8
3
397
@AdiPolak
Adi Polak
10 months
Data and AI summit, here I come!
Tweet media one
15
6
400
@AdiPolak
Adi Polak
2 years
Machine learning models are not used in production as is. You have to wrap & serve them in a way that supports your application requirements. They can be served as a REST API, streaming, or batch. Yes, you will need to take care of those too. ML workflow is more than building models.
8
70
391
@AdiPolak
Adi Polak
3 years
You must appreciate this infographic. While a couple of improvements/corrections can be made to address staging & production stages, it captures the high-level flow of a machine learning pipeline well. Design by Sebastian Eberstaller. Infographic from -
Tweet media one
6
68
379
@AdiPolak
Adi Polak
3 years
5 things every Data Engineer should know in 2021: 1. Lambda architecture 2. Apache Spark 3. CI/CD 4. Basic SQL 5. Airflow
37
28
341
@AdiPolak
Adi Polak
3 years
Oh look 👀 Delta Lake cheat sheets that cover all the functionality you need! 👍 Thumbs up if you LOVE a well-organized and informative cheat sheet!
Tweet media one
Tweet media two
4
64
325
@AdiPolak
Adi Polak
2 years
Having a machine learning model that performs great locally is just the beginning. To productionize your machine learning and move it through the stages of dev, validate, staging all the way to production requires you to adopt tools and data practices of production scale.
9
39
317
@AdiPolak
Adi Polak
6 months
Having worked with AWS for 2 years now, returning to Azure is a pleasure. My dockerized app was registered in ACR and deployed to AKS inside my VNet in less than 10 minutes. A bit of GitHub Actions is on its way too!
Tweet media one
14
20
323
@AdiPolak
Adi Polak
4 years
Wow this one just arrived ✨ ** #Dev 2019 Distinguished Author ** I am thankful to be one of the 2019 top 500 authors of @ThePracticalDev Thanks so much dear community 🙏❤️ May the next year be full of content and empathy 💜 Happy 2020 🎉
Tweet media one
Tweet media two
16
16
291
@AdiPolak
Adi Polak
2 months
New beginnings are scary. Yet with the right team and a brave, can-do, empathetic culture, everything is possible. Excited to share that I've started a new position as Director @ Confluent. Working with a brilliant team of rock stars ✨ Expect more Kafka, Flink, and Data…
Tweet media one
Tweet media two
35
12
281
@AdiPolak
Adi Polak
2 years
"It works on My Machine" - MLOps edition:
Tweet media one
5
40
270
@AdiPolak
Adi Polak
3 years
Want to be a software developer? Great! Motivated by the money? Great! We need more people and.. guess what! There is enough space for everyone! Find your niche and go all in!
15
24
259
@AdiPolak
Adi Polak
3 years
📚It was such a pleasure collaborating on this book with a long list of talented Data Engineers. If you want to learn about data cleaning, processing, wrangling, storing, ingesting, and much more, check it out!
Tweet media one
14
40
241
@AdiPolak
Adi Polak
2 years
Don't negotiate with your brain. Motivation is an after-effect of experiencing progress and success. The real barrier is you. Identify it and set a goal to overcome it. Think about what you can do today: the small thing that takes 10 min but will develop you greatly.
7
41
224
@AdiPolak
Adi Polak
3 years
Microsoft is a company that demonstrates consistency in giving employees more. Up until now, we've got extra vacation days twice and a special bonus of $1.5K for everyone, equally, wherever you are on the planet. With that, I’m hiring for my team:
8
52
224
@AdiPolak
Adi Polak
3 years
SQL is the base for all data-related jobs in tech: Data Scientists, Data Engineers, Data Analysts. They all need to know SQL at some level.
14
21
220
@AdiPolak
Adi Polak
30 days
True story.
Tweet media one
8
40
216
@AdiPolak
Adi Polak
2 years
Data is eating the world. Happy to share my new adventure as VP of DevEx at lakeFS io. Developing open-source to enable Data best practices is an exciting challenge! What does lakeFS offer to solve our data & machine learning pain? → link in comment.
21
6
203
@AdiPolak
Adi Polak
3 years
Uber's real-time Data Infrastructure journey is a fascinating read for all data engineers:
4
44
207
@AdiPolak
Adi Polak
3 years
Without code, you can’t build machine learning-based systems. Without data, you won’t be able to build machine learning models. But, you are always able to leverage precomputed ML models through open-source, cognitive services, and the cloud.
3
32
195
@AdiPolak
Adi Polak
4 years
This is painfully true
Tweet media one
2
16
192
@AdiPolak
Adi Polak
2 years
SQL is easy, but good data architecture is hard. The challenge is not writing queries but understanding a highly complex data model. How would you go about simplifying data models? Find granularity, define relationships, use indexes, and plan for future schema evolution.
9
33
187
@AdiPolak
Adi Polak
3 years
This morning I officially signed a technical book contract. I equipped myself w/ detailed market research, an Excel sheet, early morning alarm clock, and a decent coffee stash. Setting up my intention for 2021 to be about Curiosity, Learnings, and Education is on track 🛣️
15
4
178
@AdiPolak
Adi Polak
2 years
@svpino Many smart people get stuck in the knowledge-gathering stage but never use it. It’s one thing to learn something but completely different to act on it.
10
20
173
@AdiPolak
Adi Polak
3 years
You want to build your career as a software developer? Take a look at these Extraordinary GitHub repos 🧵👇
3
54
168
@AdiPolak
Adi Polak
2 years
The rise of Analytics Engineer. Check out this interesting table of data responsibilities. The more data continues to dominate our products, the more professions, majors, and jobs will open up in the tech industry for you to join the ride. Keep learning. Table by Claire Carroll.
Tweet media one
7
22
162
@AdiPolak
Adi Polak
3 years
Parent: If all your friends jumped off a bridge, would you follow them? Machine Learning Model: Yes! Parent: ... Model: ...
7
38
162
@AdiPolak
Adi Polak
3 years
A Python programmer has two career lines to choose from. The obvious one is programming, the second one is at the zoo.
8
21
158
@AdiPolak
Adi Polak
2 years
The magic behind software is not data, architecture, algorithms, or programming language. The magic piece is you. It's all about your creativity. Nothing replaces your ability to turn a whole lot of nothing into something that changes the world.
4
31
159
@AdiPolak
Adi Polak
2 years
“Uber adopted Apache Pinot several years ago and today Pinot is a key technology inside Uber Data Platform to power multiple mission-critical real-time analytics applications“ Interesting article from Uber engineers on leveraging Presto, Kafka, Pinot, etc.
1
26
160
@AdiPolak
Adi Polak
10 months
Interesting distribution of data professionals. Curious what one might suggest otherwise
Tweet media one
8
26
145
@AdiPolak
Adi Polak
2 years
The baby bird is not a baby nor a bird anymore. This is a South American Electric Fish, also known as electric eel. Despite the name, it is not an eel but a knifefish. It is considered a freshwater teleost which contains an electrogenic tissue that produces electric discharges.
Tweet media one
17
8
140
@AdiPolak
Adi Polak
3 years
Stanford offers their research students a guide for Data Best Practices. What can we learn from it? 🧵 1/n
3
28
136
@AdiPolak
Adi Polak
3 years
Data science != Data Engineering
7
17
128
@AdiPolak
Adi Polak
2 years
When your friends know you so well. Glad to be part of this awesome community 🌺✨ #birthdaygirl
Tweet media one
14
5
128
@AdiPolak
Adi Polak
4 months
Welcome, DocLLM by JPMorgan (yes, yes). As the name implies, it understands documents (invoices, receipts, reports, contracts, etc.). JPMorgan emphasizes that this model is not just another language model and explains how they built it slightly differently to achieve better…
Tweet media one
2
19
126
@AdiPolak
Adi Polak
2 years
To all my Hindu friends, wishing you a very happy Diwali 🪔 May the festival of light bring lots of peace, good health, and good luck.
13
3
122
@AdiPolak
Adi Polak
4 years
📚 Curious to learn more about the basics of distributed systems? Patterns and paradigms? Have some time to read? Check out Designing Distributed Systems by @brendandburns . Free PDF version -->
Tweet media one
3
57
120
@AdiPolak
Adi Polak
3 years
Data Engineering: ✦ CS Fundamentals ✦ Java / Python / Go ✦ Testing ✦ DB, No/SQL ✦ Scaling, CAP, OLTP vs. OLAP ✦ Data Warehouse ✦ Distributed Computing ✦ Messaging ✦ Monitoring ✦ Data Security & Privacy ✦ Orchestrators ✦ CI/CD It takes a team. It's not a 1 person job.
Tweet media one
3
28
114
@AdiPolak
Adi Polak
6 months
If you've ever built a team, a vision, a community, you'll understand.
Tweet media one
1
5
107
@AdiPolak
Adi Polak
3 years
Thank You for joining Santiago’s Twitter Space on Starting out with Machine Learning. Great insights and questions. Let's start a 🧵. I invite you to fill in your thoughts ⤵️
3
19
103
@AdiPolak
Adi Polak
3 years
Got my first shot. Feeling ok.
Tweet media one
5
0
102
@AdiPolak
Adi Polak
3 years
Traditional career paths are out. It's time for you to develop your own journey based on what you’re passionate about and enjoy doing. Getting inspired by others is great. But focus on yourself and what sparks joy for you. Also, know that a career is a marathon, not a sprint.
1
12
100
@AdiPolak
Adi Polak
5 years
Command+R would save clicks, 95% of the cases 😐
6
30
99
@AdiPolak
Adi Polak
1 year
StackOverflow Survey 2022 is out. 🐳Docker adoption is increasing - 55% to 69% 🦀Rust is still the most loved language 📊PostgreSQL wins over Redis as the most loved 💼💼Data skills are well compensated w/ Apache Spark, Apache Kafka, & Hadoop- the top 3
2
24
99
@AdiPolak
Adi Polak
4 years
JAVA & Visual Studio Code updates! 🧠 Better IntelliSense (a.k.a. Code Complete) performance 🗃️ “java.project.resourceFilters” for Workspace refreshing 📊 Better overall project view @JavaAtMicrosoft @code #Java #vscode Learn more about it here ⏩
Tweet media one
0
16
97
@AdiPolak
Adi Polak
3 years
You are the only one responsible for your career. Don't wait for people to provide you with a syllabus, tell you what to learn or do. Go ask questions. Do the research. Make decisions. It's your life, be the driver, not the passenger - don't let life drive you.
3
12
98
@AdiPolak
Adi Polak
4 years
I am ✨ready✨ See you live in just 15 min. #MSBuild
Tweet media one
2
4
96
@AdiPolak
Adi Polak
2 years
I talk to people in the data industry every day. Most know everything about ETL but can't mention a single tool for validating data products. Not a single one! Or let alone, what a data product is. We are so early.
9
7
96
@AdiPolak
Adi Polak
3 years
Data Science is still VERY hyped. If you want to SECURE yourself a future & a job in tech, take a look at these career paths: a). Building Data Science platforms - ML Engineer b). Manage production ML lifecycle - MLOps c). Making data available for ML&Analytics - Data Engineer
0
15
96
@AdiPolak
Adi Polak
2 years
Many machine learning algorithms cannot handle free text out of the box. We need to marshal the data into a tabular format while removing noise, hashing strings, and building a translation table to explain the model outcomes. Interpretable ML is 🔑 to solve the black box problem.
3
6
93
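One way to picture the "hashing strings plus translation table" step from the tweet above is the hashing trick: each token is mapped to a fixed-size vector slot, and a side table records which tokens landed in which slot so a model's weights can be traced back to words. This is a minimal sketch under assumptions: `bucket` and `hash_features` are hypothetical helper names, and MD5 is used only as a stable, non-cryptographic hash for the illustration.

```python
import hashlib
from collections import defaultdict


def bucket(token, n_buckets):
    """Map a token to a stable bucket index (same result on every run)."""
    digest = hashlib.md5(token.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


def hash_features(text, n_buckets=16, table=None):
    """Turn free text into a fixed-length count vector.

    If `table` is given (a dict of sets), it is updated in place with the
    tokens seen per bucket, acting as the translation table.
    """
    vector = [0] * n_buckets
    for token in text.lower().split():
        idx = bucket(token, n_buckets)
        vector[idx] += 1
        if table is not None:
            table[idx].add(token)
    return vector


# Usage: build one row of tabular features and its translation table.
translation_table = defaultdict(set)
row = hash_features("Spark job failed Spark retry", table=translation_table)
```

Different tokens can collide in the same bucket, which is why the translation table stores a set per index rather than a single token.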
@AdiPolak
Adi Polak
1 year
How do you handle late-arriving data?
56
3
93
@AdiPolak
Adi Polak
3 years
You should speak at tech events: ✅ Your journey is unique ✅ Your voice matters ✅ Extend your professional network ✅ The prepping process will make you dive into the smallest detail ✅ You’ll become better at it What to talk about: —> Success, Failures, Best practices..
5
23
93
@AdiPolak
Adi Polak
3 years
HELP India 🙏
8
8
93
@AdiPolak
Adi Polak
3 years
" To become a data scientist, you could earn a Bachelor's degree in Computer science, Social sciences, Physical sciences, and Statistics. ... The truth is, most data scientists have a Master's degree or Ph.D. " Can you do it with a course?
15
5
90
@AdiPolak
Adi Polak
2 years
Working at Microsoft provided me with a front-row view of how the best leaders in the industry practice humility every day. Admitting mistakes, giving space, allowing ideas to emerge, building a learning culture, and so many more, I'm happy to continue practicing in my role.
3
0
92
@AdiPolak
Adi Polak
3 years
📚Free O’Reilly book introducing MLOps and how to strategize the organizational culture to bring all engineering stakeholders to support ML. I had the pleasure to review it early on, and it's a great read covering essential aspects of productionizing ML. Link in first comment.
Tweet media one
5
17
88
@AdiPolak
Adi Polak
4 years
Don't ask yourself - what can you do for your cat Ask yourself - what your cat can do for you #getIsiACat
Tweet media one
3
16
90
@AdiPolak
Adi Polak
2 years
✅ “Hello Spark fans” Selfie! Found the famous Simon Whiteley - and got awesome Advancing Analytics stickers! 😎🇬🇧
Tweet media one
6
0
88
@AdiPolak
Adi Polak
2 years
Scare a data engineer in 5 words or less 👇
162
13
87
@AdiPolak
Adi Polak
3 years
Experienced software developers, what do you do to keep on track with new technologies and trends? Did it change with the move to remote work?
31
6
85
@AdiPolak
Adi Polak
4 years
I have a GitHub joke, but nobody gives a fork..
@jessehouwing
Jesse Houwing ⭐
4 years
I have a computer programming joke, but it won't compile...
0
0
4
4
6
84
@AdiPolak
Adi Polak
3 years
Elasticsearch and Kibana have changed their license from Apache V2 to SSPL. By continuing to use them in your online services code, you are at risk of being forced to release every supporting piece of software your product is built from. DON'T IGNORE THIS.
2
36
83
@AdiPolak
Adi Polak
2 years
While Scala dominated the distributed data world for a long time, it wasn't as friendly to engineers as Python or Go. Just understanding what a monad is took an experienced engineer a whole month. This choice impacts our productivity and learning curve. Choose wisely.
11
5
80
@AdiPolak
Adi Polak
1 year
Unpopular opinion. $8 in San Francisco is not the same as in Bangalore.
13
2
81
@AdiPolak
Adi Polak
1 year
ChatGPT-driven development. Courtesy of the data engineering community.
Tweet media one
3
8
79
@AdiPolak
Adi Polak
4 years
When your software is being tested, but you still have confidence in it
Tweet media one
1
13
77
@AdiPolak
Adi Polak
4 years
Tweet media one
Tweet media two
4
2
77
@AdiPolak
Adi Polak
3 years
How does it feel when you write 50 lines of code without having to copy-paste from somewhere?
25
2
75
@AdiPolak
Adi Polak
3 years
Being a data scientist is 70% preparing data, and 30% complaining about preparing the same data.
1
7
75
@AdiPolak
Adi Polak
2 years
In software development, articulating your ideas and experience in simple language is the strongest capability of all. Otherwise, you will miss out on people that couldn’t understand your wisdom due to complex language.
6
8
75
@AdiPolak
Adi Polak
3 years
Opening the morning 🌅 with thought leaders such as Shingai Manjengwa at O’Reilly Radar Data and AI event!
Tweet media one
1
2
75
@AdiPolak
Adi Polak
2 years
Decentralized, distributed systems are changing the world!
5
7
73
@AdiPolak
Adi Polak
4 years
After probably too much time, I decided to update it. Here it is, my #NewProfilePic
Tweet media one
6
1
69
@AdiPolak
Adi Polak
1 year
lakeFS ❤️ DuckDB
Tweet media one
1
8
68
@AdiPolak
Adi Polak
4 years
Mmm...
Tweet media one
4
8
69
@AdiPolak
Adi Polak
3 years
Taking a long weekend off work to recharge, relax and come back energized, ready to innovate and continue building great things with the team! 🧘‍♀️ For inspiration, I'm going to read my favorite book once again ;) You can too! It’s free. Link in the first comment 📚
Tweet media one
3
8
69
@AdiPolak
Adi Polak
3 years
3 Reasons to invest in Data Engineering teams: 👉 "If you want to become a software company, you need data to make better decisions as a business”. 👉 "every company is becoming a big data company”. 👉 Data is growing on a massive scale, and big data is here to stay.
7
12
66
@AdiPolak
Adi Polak
2 years
👍 This is what it looks like when @Michal_Wosk creates limited edition t-shirts for @lakeFS . @qconlondon 🇬🇧, first day, here we go! Feeling cute 🥰
Tweet media one
8
1
68