vicki

@vboykis

52,560
Followers
1,151
Following
4,079
Media
37,180
Statuses

Born: USSR. Raised: USA. ML Eng @mozillaai Ex: @duosec, @Tumblr, @automattic Nights: 👦 & 👧 working on some ✨ new vectors ✨

Philadelphia
Joined January 2009
Pinned Tweet
@vboykis
vicki
11 months
Ok, so LLMs are a Thing. How do they work? Embeddings. WTF are embeddings? I spent a year doing a deep dive. But when I was researching, I couldn't find anything that explained them in business, engineering, AND math contexts. So I wrote a thing.🚀
52
434
2K
@vboykis
vicki
1 year
Women: I can change him Him:
Tweet media one
159
2K
19K
@vboykis
vicki
5 years
It's only called a Neural Network if it comes from the Neuralè region of France. Otherwise you have to call it a logistic regression.
51
2K
11K
@vboykis
vicki
5 years
Ten years ago, I didn't know what unit tests, version control, or continuous integration were. Today, my code fails at each of these steps at least once a day. Follow your dreams.
53
1K
8K
@vboykis
vicki
6 years
Me at career day in middle school, brightly: Okay, kids. I'm a 'machine learning engineer', which is just a fancy word for - Kid in the back: Yeah, we know what you do. PyTorch or Tensorflow? Me: Well, I, uh, don't do deep learn- Kids: BOOOO GO HOME I BET YOU ONLY USE ONE CORE
51
1K
6K
@vboykis
vicki
1 year
Intro to machine learning books make so much sense after you have at least five years of experience in machine learning.
77
446
6K
@vboykis
vicki
6 years
Hottest programming skills in 2018: 5. Fixing git merge conflicts 4. Correctly mapping ports in Docker containers to host machines 3. Getting info from AWS documentation 2. Pulling summary stats from a data stream 1. Turning any of the above into a conference talk about AI
68
2K
5K
@vboykis
vicki
2 years
A senior developer is someone who fluently hates more than one programming language.
49
676
5K
@vboykis
vicki
2 years
ML textbooks have titles like “A Gentle Introduction to Linear Algebra for Data Analysis” and then the first sentence hits you out the gate with, “Assume the inverse covariance matrix theta we were imagining in our head before you picked up the book is transposed to reveal— ”
65
505
4K
@vboykis
vicki
5 years
Sorting Hat: Merge sort. Harry Potter: Er, excuse me? Sorting Hat: The white board, do a merge sort. Harry Potter: But I thought Hogwarts was a mag- Dumbledore: My dear boy, castle rent is much too expensive. We've pivoted to SaaS products. The hat does our interviews now.
33
1K
4K
@vboykis
vicki
5 months
LLMs are so weird because one side is people with five PhDs who have been studying neuron activations for the past three decades and on the other side is someone called leetm5n with an anime avatar just casually releasing increasingly better performing fine tunes of mistral
63
396
4K
@vboykis
vicki
6 years
Name a more iconic trio, I'll wait.
Tweet media one
103
869
4K
@vboykis
vicki
2 years
CDC says you can use log4j again.
30
511
4K
@vboykis
vicki
10 months
hundred startups just tanked
Tweet media one
29
119
4K
@vboykis
vicki
5 years
Producer: Pitch me. Me: It's a heartfelt romance about two data scientists who have never met, but leave each other carefully-commented notes in a shared codebase, falling in love in the process. It's called "The Jupyter Notebook." Producer: Get out.
51
631
4K
@vboykis
vicki
6 years
Somewhere in the multiverse, this HN exists.
Tweet media one
30
983
3K
@vboykis
vicki
6 years
Producer: Pitch me. Me: It's an ensemble sitcom about a lovable, goofball DevOps team that works for a startup in New York and investigates outages. It's called Brooklyn Five-Nines. Producer: Get out.
35
755
3K
@vboykis
vicki
1 year
Hey, I’m trying to improve this unsupervised model to correctly label users. Looking for an intern to improve it. Here’s a picture of the current clusters. If anyone has ideas, feel free to respond to this tweet with a gist of your implementation. Winner gets a lollipop.
Tweet media one
150
217
3K
@vboykis
vicki
3 years
If you use your default system Python, don’t worry about what’s in the vaccine.
32
323
3K
@vboykis
vicki
1 year
Here is a way to think about what actually happens when we type inside the ChatGPT textbox. Wonderful paper.
Tweet media one
32
529
3K
@vboykis
vicki
6 years
Unreal. This thing has gone too far.
Tweet media one
28
827
3K
@vboykis
vicki
6 months
He train on test or what?
40
178
2K
@vboykis
vicki
5 years
New blog post: For the past couple years, I've been telling people who ask me for advice not to go into data science. Here's why: The data science job market is way oversaturated. Here's what they should do instead.
119
813
2K
@vboykis
vicki
2 years
By age 30 you should have a group of friends that talk you out of distributed systems
53
191
2K
@vboykis
vicki
5 years
One of my greatest weird little joys in life is stripping the UTM parameters in links before I send them to other people.
74
213
2K
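For anyone who wants to automate that small joy, here is a minimal Python sketch. It assumes "UTM parameters" means any query key starting with `utm_`; the function name `strip_utm` and the example URL are made up for illustration and do not come from the tweet.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_utm(url: str) -> str:
    """Return the URL with every utm_* query parameter removed."""
    parts = urlsplit(url)
    # Keep every query pair whose key is not a utm_* tracking key.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    ]
    # Rebuild the URL; path and fragment are left untouched.
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_utm("https://example.com/post?id=7&utm_source=twitter&utm_medium=social"))
```

Note that `urlencode` re-encodes the surviving parameters, so unusual escaping in the original query string may come back normalized.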
@vboykis
vicki
4 years
Some MIT faculty have put together a course called "The Missing Semester of Your CS Education." I've looked it over a bit, and it looks fantastic and will help data science people from non-dev backgrounds fill in a lot of gaps, too.
20
629
2K
@vboykis
vicki
1 year
I was worried ChatGPT would make me obsolete and then I tried it and it almost got the syntax I wanted, I just had to prompt it seventeen more times, now I’m pair programming with someone with no long-term memory and network timeouts every day, this thing is truly revolutionary
42
168
2K
@vboykis
vicki
3 years
If the machine is the one that’s learning why am I so tired
27
228
2K
@vboykis
vicki
6 years
Ten years ago I was working with malformed data solely in spreadsheets. After ten years of hard work learning engineering and statistics, I finally am blessed to be working with malformed data in queues, matrices, containers, and serialized formats.
26
255
2K
@vboykis
vicki
6 years
Every NPR podcast: Host 1: I'm Natasha Gutierrez-Levine. Host 2: And I'm Kevyn Smith-Witherington Jones. Host 1: And today we're exploring Borks, the biodegradable sporks handmade in Vermont. *olive oil sizzles on a pan* *warm guitar riff* Host 2: *nonchalantly* Stay with us.
26
234
2K
@vboykis
vicki
2 years
*cries in Slavic*
Tweet media one
77
55
2K
@vboykis
vicki
2 years
👑🐐👑
Tweet media one
57
195
2K
@vboykis
vicki
4 years
Suppose you have to choose between a black box AI surgeon that runs on TensorFlow 1.0 on an EC2 instance that hasn't been upgraded to Python 3 but has an 80% cure rate and a black box AI surgeon with an 80% cure rate that runs on Excel vlookups. Do you want to live on this planet?
44
237
2K
@vboykis
vicki
6 years
Dream data science/engineering conference agenda
Tweet media one
25
731
2K
@vboykis
vicki
4 years
This is a really weird and surreal time for happy personal news, but today is my first day as a machine learning engineer @automattic ! 👋🎉 💻
106
29
2K
@vboykis
vicki
4 months
This is actually not dumb! One of the first things I learned year 1 on the job was that executives are often busy and consume content very differently from the rest of us at work, usually via email on the phone. (BlackBerry at the time 🫠)
Tweet media one
93
64
2K
@vboykis
vicki
5 years
December 31 Resolutions: 1. Well-named Jupyter notebooks that run in order 2. No random temporary S3 buckets 3. Clean commit messages 4. Small p-values, good A/B tests, no peeking 5. Lots of READMEs January 16: 1. This model runs. Please, please, don't ask me how it works.
13
237
1K
@vboykis
vicki
19 days
An absolutely fantastic way to increase this is to start a blog. Almost all the cool fun stuff in my professional life for me has come from doing stuff then blogging about it.
@EmilybyNight
Emily
20 days
"The amount of serendipity that will occur in your life is directly proportional to the degree to which you do something you're passionate about combined with the total number of people to whom this is effectively communicated."
Tweet media one
26
444
3K
11
141
1K
@vboykis
vicki
4 years
Hope everyone is enjoying their week-long stint as a data scientist, the most glamorous tech job of the last 10 years, supposed to be spent analyzing sophisticated models, but really spent mostly monitoring those few stray batch import jobs that haven't finished yet.
27
234
1K
@vboykis
vicki
5 years
Some personal news: I had a baby this weekend! 👶 Everyone is healthy, we are thrilled, and big sister is thrilled. ❤️ So I’m looking forward to catching up with everyone on Twitter either at 2:37 am or next year.
112
14
1K
@vboykis
vicki
4 years
Me at the beginning of the decade versus me at the end
Tweet media one
12
153
1K
@vboykis
vicki
1 year
Inject it into my veins
Tweet media one
17
68
1K
@vboykis
vicki
4 years
Me, standing outside Geoffrey Hinton’s office, matted hair, bloodshot eyes, shouting at passers-by: IT’S ALL A SCAM. THERE IS NO AI. NEURAL NETS ARE JUST MATRIX OPERATIONS. *sobbing softly as guards approach* they’re just matrix operations.
36
212
1K
@vboykis
vicki
5 years
This is the most terrifying page I’ve read all week. From Creativity, Inc. by Pixar’s Ed Catmull.
Tweet media one
43
484
1K
@vboykis
vicki
2 years
DALL-E this and Stable Diffusion that, but we still can't copy and paste text directly from PDFs and keep the formatting.
24
96
1K
@vboykis
vicki
3 years
Being vaccinated doesn't mean you can stop writing unit tests
21
147
1K
@vboykis
vicki
5 years
Hot data science trends: 2011 T-tests on laptops 2012 Hadoop 2013 Bayesian inference 2016 Spark 2017 Deep learning 2019 Reinforcement learning 2022 Robot war 2023 Cloud computing outlawed 2024 Computers outlawed 2025 T-tests on pen and paper
33
216
1K
@vboykis
vicki
5 years
Producer: Pitch me. Me: It's a reality competition show featuring two groups of people that can't stop talking about what they do: gym rats and data scientists. We pit them against each other in feats of strength. It's called "CrossFit Validation" Producer: Get out.
29
158
1K
@vboykis
vicki
2 years
I am much less worried about sentient AIs than the fact that we are surrounded and influenced by machine learning systems and are not taught how to reason through how they work.
41
172
1K
@vboykis
vicki
4 years
In order to encourage social distancing, it is now mandated that every matrix is a sparse matrix.
17
185
1K
@vboykis
vicki
1 year
GitHub just wrote an article about how they had to write their own search engine (in Rust, for performance reasons) and a new probabilistic data structure to reduce indexing time because ES and Lucene were blockers for them but sure big data is over 😅
Tweet media one
32
104
1K
@vboykis
vicki
2 years
Gonna start a conference called #NormIPS that’s just presentations of middlebrow ML topics. “how to structure Python packages 2022”, “how many k-folds is too many”, “how to make the browser pop-up come up when the notebook is done running”, “putting features in Postgres”, etc.
64
107
1K
@vboykis
vicki
2 years
Big shoutout to this book. I’ve already recommended it to a lot of people looking for either an intro or refresh to linear algebra and am psyched to recommend it in an official capacity.
Tweet media one
Tweet media two
@thomasnield76
Thomas Nield
2 years
The paperback for my @OReillyMedia book "Essential Math for Data Science" is now available! Thanks to the hard-working folks at O’Reilly for helping make this book as great as possible for readers. It is going to fill a much-needed gap.
4
16
91
10
168
1K
@vboykis
vicki
4 years
How it started. How it’s going.
Tweet media one
Tweet media two
18
170
1K
@vboykis
vicki
6 years
Producer: Pitch me. Me: It’s a gripping action thriller about a detective who switches from Java to Python, hoping to catch a criminal developer, who then switches from Python to Java. Both have extreme syntax difficulties. It’s called Brace/Off. Producer: Get out.
25
193
1K
@vboykis
vicki
1 year
Marginalia, the indie search engine that surfaced non-commercial content first, is currently on the front page of HN and handling the traffic load with one $5k commodity server with 128GB RAM/24 cores at 85% utilization with a single Java app
Tweet media one
Tweet media two
21
118
1K
@vboykis
vicki
2 years
LogJ 2Log2J: 2 Log 2 Furious Log3J: Tokyo Drift Log4J Log5: The Fate of J
27
176
1K
@vboykis
vicki
5 years
Job Req: ------ Years of Experience: 37 PhD: Required Languages: Python, R, Scala, Fortran, and Cantonese Experience with: Machine Learning, DevOps, Agile, Marie Kondo Method Someone who is good at recruiting, help me find a data scientist, my company is dying.
40
144
1K
@vboykis
vicki
4 years
I'm an introvert, so I'm not having as hard of a time as the poor extroverts, but something that I really miss is ambiently being around people. I sometimes like being in cafes, in workspaces, surrounded by conversation and the pulse of busy-ness, feeling like a part of humanity.
24
84
1K
@vboykis
vicki
5 years
And the second part. Tfw maternity leave saves the day.
Tweet media one
21
206
1K
@vboykis
vicki
4 years
Fifty billion years ago in March, before 2020 really hit the ground running, I started working on a fun proof-of-concept ML project to really explain all the things that need to happen for machine learning to work in the wild. I finally wrote it all up.
23
229
1K
@vboykis
vicki
1 year
When you go to a dude's website and it's just plain HTML, not even any CSS, and links to posts like "Some thoughts on prime numbers" and "Efficiently checking tries for fun and profit" and a picture of him in a sweater at a party from 2014, watch the fuck out. This guy codes.
13
56
1K
@vboykis
vicki
2 years
“Give me six hours to run a deep learning model and I will spend the first four tuning Kubernetes.” – Abraham Lincoln
23
92
996
@vboykis
vicki
2 years
Don't touch that, that's my emotional support Sublime Text tab for making sure copy/paste gets formatted in plain text
25
82
1K
@vboykis
vicki
2 years
Have to install Python on a new machine, goodbye forever
75
30
988
@vboykis
vicki
1 year
Neural nets when you ask them to explain even a single output
22
142
970
@vboykis
vicki
1 year
Marked safe from thought leadership
Tweet media one
7
79
969
@vboykis
vicki
3 years
You do realize that if AI autocompletes your code, that means you have more free time for meetings, right
41
120
956
@vboykis
vicki
2 months
i'm just a girl standing in front of the ml research community please begging everyone to type their python method inputs and outputs especially if they are tensors or weird nested dicts of lists of dicts
219
100
673
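A minimal sketch of what that request could look like for a "weird nested dict of lists of dicts", using only the standard library (Python 3.9+). The names `Example`, `Batch`, and `count_labels` are hypothetical and are not taken from any real codebase.

```python
from typing import TypedDict


class Example(TypedDict):
    """One labeled record; the field names are illustrative."""
    text: str
    label: int


# A dict of split name -> list of examples, i.e. a dict of lists of dicts.
Batch = dict[str, list[Example]]


def count_labels(batch: Batch) -> dict[int, int]:
    """Count how often each label appears across all splits."""
    counts: dict[int, int] = {}
    for examples in batch.values():
        for example in examples:
            counts[example["label"]] = counts.get(example["label"], 0) + 1
    return counts


batch: Batch = {
    "train": [{"text": "a", "label": 1}, {"text": "b", "label": 0}],
    "test": [{"text": "c", "label": 1}],
}
print(count_labels(batch))
```

With the alias in place, a reader can see the shape of the input from the signature alone instead of tracing it through the function body.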
@vboykis
vicki
3 years
The more programming languages I learn, the fewer strong opinions I have about any one language other than just as a tool to get something done.
32
98
905
@vboykis
vicki
7 years
As a data person, here's my understanding of the front-end development landscape at the moment:
Tweet media one
29
349
892
@vboykis
vicki
3 years
Yeah, of course, I work 90 hours a week, but 60 of those are writing my own CSV parser
32
41
906
@vboykis
vicki
6 years
Tired of the fight between R and Python? There's another way.
Tweet media one
20
257
888
@vboykis
vicki
4 years
My developer path: Learning how to work with dirty data in Excel @ 19 Learning how to work with dirty data in Access @ 24 Learning how to work with dirty data in pandas @ 25 Learning how to work with dirty data in scikit @ 28 Learning how to work with dirty data in Airflow @ 33
26
63
883
@vboykis
vicki
5 years
Producer: Pitch me. Me: It's a musical comedy about an overeager team of data scientists that does way too many tests on bad data. It's called "Gimme Gimme Gimme a NaN after Midnight." Producer: Get out. Me: *over my shoulder* The team calls themselves A/B/B/A.
14
137
854
@vboykis
vicki
3 years
Producer: Pitch me. Me: It’s a high-stakes Korean thriller where you watch people under enormous stress kill Linux processes. It’s called Pid Game. Producer: Get out.
16
101
847
@vboykis
vicki
4 years
Checking out Effective Python by @haxor . I’m a big fan of the book so far because it takes all the Pythonic best practices you hear about in conferences and on StackOverflow and contextualizes and organizes them. Thanks to the publisher for sending this over.
Tweet media one
Tweet media two
Tweet media three
17
95
824
@vboykis
vicki
3 years
I love how the LinkedIn crowd is like, “Get fifteen hours of sleep, create space for your meditation. Truly focus on your success.” Buddy, just now as I was trying to eat a piece of toast for 2 minutes, the toddler found me and bit me. LMK how that fits into The Strategy.
23
46
823
@vboykis
vicki
7 years
I couldn't find a comprehensive guide for how to go from #Python scripts to a packaged project, so I wrote one. 🐍
20
318
809
@vboykis
vicki
1 year
They don’t tell you this in the paper (well they do but you have to read it like 15 times)
Tweet media one
15
57
793
@vboykis
vicki
2 years
The hardest problem in machine learning is getting off your chair to do the 2-factor auth confirm when your phone is two rooms away.
29
48
781
@vboykis
vicki
1 year
Throwing "idempotent" into all my documentation and watching people bow to my unyielding technical acumen.
39
31
797
@vboykis
vicki
3 years
Spotify Wrapped, but for my command line history
28
49
790
@vboykis
vicki
6 years
CTO: We’re thinking of replacing our on-prem server with a distributed system in the cloud. What kinds of considerations would help us make that decision? Me:
Tweet media one
22
282
782
@vboykis
vicki
2 years
🎉🚨This is real. 🚨🎉 #NormConf is happening. December 15. Online. Free. A day of normcore data takes. It’s gonna be great. Registration is open.
25
211
775
@vboykis
vicki
11 months
When there is a Lewis Carroll quote on the first page of a textbook, you know you are about to have your mind blown wide open
Tweet media one
16
72
772
@vboykis
vicki
2 months
2013 — 2023: you were hired to do machine learning but do data engineering 2023 — : you were hired to do machine learning but do web dev
21
35
773
@vboykis
vicki
4 years
Ok I’m only 14 pages into Thinking in Systems but I’m going to have to go all in on recommending it.
Tweet media one
40
51
772
@vboykis
vicki
5 years
My working theory is that 10% of any data science/adjacent job is actual machine learning. Unless your title is "Machine Learning Engineer", in which case it can be as much as 20%.
24
106
744
@vboykis
vicki
6 months
that’s really what it says
Tweet media one
Tweet media two
14
100
738
@vboykis
vicki
2 years
The NLP community 🤝 My kids at 5 am: Attention is all you need
3
42
746
@vboykis
vicki
2 years
Smoothly segues into the iq bell curve meme.
Tweet media one
3
66
739
@vboykis
vicki
4 months
Everything in ML and engineering is borderline ridiculous and you need to have a sense of humor about the work or ngmi
266
50
446
@vboykis
vicki
2 years
When you split up sentences into smaller strings but in the process make your text sound like a made-up Elvish language, that's tolkienization.
20
66
726
@vboykis
vicki
3 years
Every data science article on Medium is like “How every day I deploy a million-feature deep learning model at scale to millions of users. This is Real machine learning.” Meanwhile I spent a good half hour today figuring out why sbt wouldn’t build and it was because of a typo.
24
36
717
@vboykis
vicki
4 years
2019: + Had my second baby + Read 103 books + Started a newsletter that now has 250+ paid and 3k free subscribers + Changed 1k + diapers + Cancelled a load balancer that was costing me $50/month in AWS 2020 Resolutions: + Sleep more than 5 hours at a time
29
19
704
@vboykis
vicki
1 year
Getting marginally depressed thinking about all those brilliant Joel on Software essays, the level of craftsmanship in the software, those gorgeous offices, the whole of Stack Overflow’s contributions to humanity and for what? To end up as training data for ChatGPT.
31
51
699
@vboykis
vicki
2 years
And these DataFrames…are they in the RAM with us now?
11
55
700
@vboykis
vicki
5 years
VC: Ok, whatchu got? Me: Imagine WeWork, but with clean and stocked bathrooms. VC: That's just WeWork. Me: No it's not. VC: We won't fund it. Me: The bathrooms will be cleaned with AI. VC: Here's $30 million.
17
101
693