Emmanuel Ameisen

@mlpowered

8,279 Followers
214 Following
228 Media
1,914 Statuses

Research Engineer @AnthropicAI Previously: Staff ML Engineer @stripe , Wrote BMLPA by @OReillyMedia , Head of AI at @InsightFellows , ML @Zipcar

San Francisco, CA
Joined June 2017
Pinned Tweet
@mlpowered
Emmanuel Ameisen
5 years
13 months. 250 pages. I wrote an ML book! Want to learn how to ship ML in practice? Check it out! Includes tips from @WWRob , @mrogati , @cdubhland and more! It'll be out in winter & you can preorder it now. Amazon: O'Reilly:
Tweet media one
18
95
556
@mlpowered
Emmanuel Ameisen
4 months
Claude 3 Opus is great at following multiple complex instructions. To test it, @ErikSchluntz and I had it take on @karpathy 's challenge to transform his 2h13m tokenizer video into a blog post, in ONE prompt, and it just... did it Here are some details:
51
264
2K
@mlpowered
Emmanuel Ameisen
29 days
Today, we announced that we've gotten dictionary learning working on Sonnet, extracting millions of features from one of the best models in the world. This is the first time this has been successfully done on a frontier model. I wanted to share some highlights 🧵
Tweet media one
33
258
2K
@mlpowered
Emmanuel Ameisen
5 years
How to ship ML in practice: 1/ Write a simple rule based solution to cover 80% of use cases 2/ Write a simple ML algorithm to cover 95% of cases 3/ Write a filtering algorithm to route inputs to the correct method 4/ Add monitoring 5/ Detect drift ... 24/ Deep Learning
10
275
1K
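The first three steps of that thread translate almost directly into code. Below is a minimal sketch on a toy text-classification task; all data and helper names are illustrative, not from the thread.

```python
# Steps 1-3 on a toy support-ticket task. Data and names are illustrative.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["please refund my order", "love this product", "where is my package"]
labels = ["refund", "praise", "shipping"]

def rule_based_label(text):
    # Step 1: hand-written rules that cover the obvious cases.
    if "refund" in text:
        return "refund"
    return None

# Step 2: a simple ML model as the fallback for everything else.
vectorizer = CountVectorizer().fit(texts)
model = LogisticRegression().fit(vectorizer.transform(texts), labels)

def predict(text):
    # Step 3: route each input to the rule or the model.
    rule_label = rule_based_label(text)
    if rule_label is not None:
        return rule_label
    return model.predict(vectorizer.transform([text]))[0]

print(predict("please refund my order"))    # answered by the rule
print(predict("my package never arrived"))  # answered by the model
```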
@mlpowered
Emmanuel Ameisen
2 years
Most ML folks I know have @AnthropicAI 's Toy Models of Superposition paper on their reading list, but too few have read it. It is one of the most interesting interpretability papers I've read in a while, and it can benefit anyone using deep learning. Here are my takeaways!
Tweet media one
5
62
463
@mlpowered
Emmanuel Ameisen
5 years
When moving from traditional ML (GBDT) to deep learning for categorical data, the vast majority of improvements usually come from learned embeddings of categorical variables. Once you have the embeddings, you can feed them to any model and it will perform noticeably better.
8
62
382
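As a concrete illustration of the idea, here is a minimal PyTorch sketch: fit an embedding table for a categorical column against a toy target, then reuse the learned vectors as features downstream. All sizes and data are illustrative.

```python
# Fit an embedding table for a categorical column, then reuse the learned
# vectors as features for a GBDT or any other model.
import torch
import torch.nn as nn

n_categories, emb_dim = 1000, 16

class EmbedNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(n_categories, emb_dim)
        self.head = nn.Linear(emb_dim, 1)

    def forward(self, cat_ids):
        return self.head(self.emb(cat_ids)).squeeze(-1)

net = EmbedNet()
opt = torch.optim.Adam(net.parameters())
cat_ids = torch.randint(0, n_categories, (256,))  # toy categorical column
target = torch.randn(256)                         # toy regression target

for _ in range(100):  # short training loop to fit the embeddings
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(cat_ids), target)
    loss.backward()
    opt.step()

# Look up the learned vectors and hand them to any downstream model.
embedding_features = net.emb.weight.detach()[cat_ids]
```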
@mlpowered
Emmanuel Ameisen
4 years
This is one of the main tricks I recommend when trying to improve current model performance. Don't start with hyperparameter search, start by looking at individual examples with large losses, and most of the time you'll understand what feature your model is missing.
@karpathy
Andrej Karpathy
4 years
When you sort your dataset descending by loss you are guaranteed to find something unexpected, strange and helpful.
30
227
2K
5
70
374
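A minimal sketch of the trick with scikit-learn, using the negative log-likelihood of the true class as the per-example loss:

```python
# Sort a dataset descending by per-example loss and inspect the worst rows.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression

X, y = load_digits(return_X_y=True)
model = LogisticRegression(max_iter=2000).fit(X, y)

# Negative log-likelihood of the true class, per example.
probs = model.predict_proba(X)
per_example_loss = -np.log(probs[np.arange(len(y)), y] + 1e-12)

worst = np.argsort(per_example_loss)[::-1][:10]
print("Highest-loss examples:", worst)  # inspect these before tuning anything
```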
@mlpowered
Emmanuel Ameisen
5 years
Today, a friend transitioning to ML asked me about a data challenge. Him: I trained a perceptron to predict breast cancer, and reached 99% accuracy on test, what's next? Me: Time to learn about class imbalance, data leakage and metric selection!
8
40
342
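A toy illustration of why that 99% can be hollow: with a 99:1 class imbalance, a model that never predicts the positive class still scores 99% accuracy.

```python
# Always predicting "healthy" scores 99% accuracy on a 99:1 split
# while catching zero positive cases.
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

y_true = np.array([0] * 990 + [1] * 10)  # 1% positive class
y_pred = np.zeros_like(y_true)           # always predict the majority class

print(accuracy_score(y_true, y_pred))  # 0.99
print(recall_score(y_true, y_pred))    # 0.0 -- every positive case is missed
```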
@mlpowered
Emmanuel Ameisen
5 years
One of the biggest tool gaps in ML right now is in building utilities to more easily inspect and understand data. I gave a talk about just this at: It also quotes your great "data in industry vs data in academia" slide in the conclusion @karpathy
@karpathy
Andrej Karpathy
5 years
We see more significant improvements from training data distribution search (data splits + oversampling factor ratios) than neural architecture search. The latter is so overrated :)
41
365
2K
10
59
335
@mlpowered
Emmanuel Ameisen
5 years
Config files are underrated in ML. You start with a simple model, and soon enough you are trying 13 hyperparameters, 7 models, and 9 data augmentation strategies. Use a config file, and experimentation becomes much easier. Python's implementation:
13
55
316
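The tweet's link is truncated, so here is one plausible reading of "Python's implementation": a minimal sketch using the standard library's configparser.

```python
# Write an experiment config once, then read settings instead of
# hard-coding them in the training script.
import configparser

config = configparser.ConfigParser()
config["experiment"] = {
    "model": "random_forest",
    "n_estimators": "200",
    "learning_rate": "0.1",
}
with open("experiment.ini", "w") as f:
    config.write(f)

cfg = configparser.ConfigParser()
cfg.read("experiment.ini")
n_estimators = cfg.getint("experiment", "n_estimators")
learning_rate = cfg.getfloat("experiment", "learning_rate")
```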
@mlpowered
Emmanuel Ameisen
1 year
Do you want to understand how to train models like ChatGPT and stable-diffusion? Good news, I wrote an illustrated notebook which explains different parallelism approaches and gives a functional example for each. I've summarized some takeaways below NB:
4
58
309
@mlpowered
Emmanuel Ameisen
5 years
More and more of my network is transitioning away from looking for deep learning jobs and focusing more on ML infra and platforms. The ML Engineering hype cycle is just starting :)
12
45
247
@mlpowered
Emmanuel Ameisen
1 year
I just finished watching @karpathy 's let's build GPT lecture, and I think it might be the best in the zero-to-hero series so far. Here are eight insights about transformers that the video did a great job explaining. Watch the video for more. (1/9)
4
31
240
@mlpowered
Emmanuel Ameisen
5 years
Currently training a Q&A model, and it is producing crazy impressive results! Q: How do you find a good title? A: See attached None of the samples can be found in the training set that I used. 😱😱😱😱😱
Tweet media one
Tweet media two
3
39
230
@mlpowered
Emmanuel Ameisen
4 years
What do you mean my analysis isn't reproducible? I ran a query five months ago on a redshift table that's now deprecated, wrote a notebook in Python 2 to pre-process the data, and used a DNN implementation from a GitHub repo that's since been deleted. What's not reproducible?
4
35
220
@mlpowered
Emmanuel Ameisen
4 years
Wow, the book went from best new release to best seller!!! Looks like the free first chapter helped some folks decide. If you are still on the fence, feel free to check the free PDF out below, the book is also currently 40% off!
Tweet media one
6
30
202
@mlpowered
Emmanuel Ameisen
5 years
Wow, @lyft built an ML system that automatically: Finds the best potential users to target. Allocates the right budget for each ad. Sets the right amount to bid on each platform to maximize the use of the budget. Thank you @seanjtaylor for the find!
Tweet media one
5
29
200
@mlpowered
Emmanuel Ameisen
29 days
The paper contains *a lot* more experimental results and feature examples, including interactive visualizations of feature neighborhoods. If you've made it this far in the thread, you should give it a read
Tweet media one
7
22
201
@mlpowered
Emmanuel Ameisen
27 days
On Tuesday, we announced our results finding interpretable features in Claude 3 Sonnet. One of the features we identified is about the Golden Gate Bridge. When activated, the model starts being obsessed with the bridge. For a limited time, we've made this available to everyone
Tweet media one
12
14
196
@mlpowered
Emmanuel Ameisen
5 years
The solution I recommend for anyone with a similar problem: Let's say you have 3 classes Label 20 examples, including at least 2 examples of each class Train a simple model on your labels and have it predict the rest Look through predictions and label a few wrong ones Repeat
@alexpghayes
alex hayes
5 years
I have 1,200 unlabelled observations that I want to label. The labels are categorical. What's the most efficient way to do this? Any chance someone's written a Shiny app for this?
18
4
20
6
29
195
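A rough sketch of that loop on toy data; in practice you would eyeball the predictions between rounds, correct a few wrong ones, and add them to the labeled set.

```python
# Label a handful of seeds, train, predict the rest, repeat. Toy data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

topics = ["sports", "politics", "science"]
documents = [f"a document about {topics[i % 3]}" for i in range(1200)]
labels = {0: 0, 1: 1, 2: 2, 3: 0, 4: 1, 5: 2}  # ~2 seed labels per class

X = TfidfVectorizer().fit_transform(documents)

for round_ in range(3):
    ids = sorted(labels)
    model = LogisticRegression().fit(X[ids], [labels[i] for i in ids])
    preds = model.predict(X)
    # In practice: scan `preds`, fix a few wrong ones, add them to `labels`,
    # and run the loop again.
    print(f"round {round_}: {len(labels)} labels")
```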
@mlpowered
Emmanuel Ameisen
2 years
How do you run models that are too big to fit in RAM or in GPU memory? Great @huggingface article explaining their approach to distributing computation
0
33
191
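The article link is truncated; one common way to do this today (not necessarily the article's exact approach) is accelerate-backed offloading in transformers, which shards weights across GPU, CPU RAM, and disk.

```python
# A sketch, assuming transformers + accelerate are installed; the checkpoint
# name is just an example of a model too large for a single GPU.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-13b",
    device_map="auto",         # place layers on GPU, then CPU, as they fit
    offload_folder="offload",  # spill any remaining weights to disk
)
```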
@mlpowered
Emmanuel Ameisen
5 years
When using websites that I suspect have a team of data scientists optimizing them, I make sure to spend some time clicking around randomly. We all have to do our part to keep datasets messy.
9
16
183
@mlpowered
Emmanuel Ameisen
5 years
The Illustrated Word2vec. If you've heard my talks, you've seen my many attempts at giving a visual representation of word vectors. This post from @jalammar takes this to a new level. Really comprehensive and accessible overview!
Tweet media one
0
48
165
@mlpowered
Emmanuel Ameisen
11 months
10 years of ML experience and all I have to show for it is a sharp intuition for when projects will be harder than they sound.
8
7
162
@mlpowered
Emmanuel Ameisen
4 months
This was done in one prompt that @zswitten @ErikSchluntz and I wrote. If you'd like to try to improve it, here is the prompt And the full blog post
11
20
160
@mlpowered
Emmanuel Ameisen
6 years
I wrote a tutorial on leveraging #DeepLearning to build a powerful image search engine quickly. It includes a notebook to walk you through and a codebase to play with. Also comes with a shoutout to @jeremyphoward and his great class on the topic!
7
43
152
@mlpowered
Emmanuel Ameisen
5 years
Most experienced Data Scientists understand why I dedicate an entire section of my book to covering how to deploy models. Others ask: "Can't you just wrap it in a Flask server?" This post from @ravelinhq brilliantly shows why that's not enough.
1
43
150
@mlpowered
Emmanuel Ameisen
4 years
I wrote "How to solve 90% of NLP problems: a step-by-step guide" after seeing dozens of applied NLP projects at @InsightFellows . It has been read by over three hundred thousand people! It presents a cookie cutter NLP approach, along with reference code
2
40
144
@mlpowered
Emmanuel Ameisen
4 years
This article by @a16z has the best ML infra charts I've seen in a very long time. If you'd like to know more about the challenges that come with ML, and the tools to solve them, this is a great start
0
27
140
@mlpowered
Emmanuel Ameisen
4 years
Impressive work! Combine: - 3 dimensional convolutional auto-encoder - @spacy_io embeddings - An RNN encoder - Sprinkle some t-SNE on top Get a model that can generate 3D models from text descriptions! The interactive app is really fun, try it!
1
30
141
@mlpowered
Emmanuel Ameisen
5 years
2018 has been a continuing flurry of exciting work in Machine Learning. If you are interested in being part of the field in 2019, I've written about how some of the most impactful trends of 2018 will impact this year!
Tweet media one
4
38
140
@mlpowered
Emmanuel Ameisen
5 years
Benchmark datasets are frustrating. On one hand, datasets often drive significant innovation initially. On the other hand, they usually become completely overfit, leading everyone to overestimate state of the art performance. There should be a hype cycle for datasets.
9
35
141
@mlpowered
Emmanuel Ameisen
4 years
Wow, this is unexpected... and wonderful!
@alex_gude
Alex Gude
4 years
Tweet media one
0
1
14
2
6
133
@mlpowered
Emmanuel Ameisen
5 years
As always, @fastdotai posts are a pleasure to read, and present results clearly and fairly. This post covers language modeling for low-resource languages and provides useful info on learning rates, loss functions, and model architecture choices.
2
26
132
@mlpowered
Emmanuel Ameisen
15 days
The most broadly applicable prompting technique: 1. Collect a random subset of failing examples from a training set 2. Add the examples and a correct response to your prompt 3. Repeat Doing the above 5 times has solved 90% of prompting challenges I've seen
2
13
123
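A schematic version of that loop; `call_model` is a hypothetical stand-in for whatever LLM API you use.

```python
# Iteratively fold failing examples (with correct answers) into the prompt.
import random

def improve_prompt(prompt, train_set, call_model, n_examples=3, rounds=5):
    for _ in range(rounds):
        # 1. Collect the examples the current prompt gets wrong.
        failures = [ex for ex in train_set
                    if call_model(prompt + ex["input"]) != ex["answer"]]
        if not failures:
            break
        # 2. Add a random subset, with correct responses, to the prompt.
        for ex in random.sample(failures, min(n_examples, len(failures))):
            prompt += f"\nInput: {ex['input']}\nAnswer: {ex['answer']}"
        # 3. Repeat.
    return prompt
```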
@mlpowered
Emmanuel Ameisen
29 days
If you use the same prompt, but force the Golden Gate Bridge feature to be maximally active, Claude starts believing that it is the bridge itself! We repeat this experiment in domains ranging from code to sycophancy, and find similar results.
Tweet media one
8
11
118
@mlpowered
Emmanuel Ameisen
6 years
Continuously impressed by @distillpub publications. This one on feature transformation does such a good job of centralizing and explaining trends around multimodal learning. An informative read that will leave you wanting to try out a lot of ideas.
0
30
115
@mlpowered
Emmanuel Ameisen
3 months
I don't think people have fully internalized yet how cheap and how good Haiku is. Opus gets all the press for good reason, but Haiku pushes the intelligence / cost boundary even more imo
@SullyOmarr
Sully
3 months
Kinda bonkers, but this agent workflow only cost me ~$0.09 (9 cents) It's about ~ 350k tokens, and the outputs are on par with gpt-4, at a fraction of the price. GPT = $10/1m = $3.50 Haiku = $0.25/1m = $0.09 40x cheaper. This wasn't even possible 1 month ago.
Tweet media one
19
33
442
5
9
110
@mlpowered
Emmanuel Ameisen
5 years
To iterate faster on models, here is how I examine results: Summary metrics for an overview Confusion matrix and calibration curve to find challenging data types Model explainers to inspect features Manually inspect top and worst performing examples Decide on next steps!
1
15
110
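A sketch of the first three steps with scikit-learn on toy data (model explainers and manual inspection are left out for brevity):

```python
# Summary metrics, confusion matrix, and calibration curve on toy data.
from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_tr, y_tr)

print(classification_report(y_te, model.predict(X_te)))  # overview
print(confusion_matrix(y_te, model.predict(X_te)))       # error types
prob_true, prob_pred = calibration_curve(
    y_te, model.predict_proba(X_te)[:, 1], n_bins=10
)
print(list(zip(prob_pred, prob_true)))                   # calibration
```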
@mlpowered
Emmanuel Ameisen
5 years
ML is 5% modelling, 95% data cleaning, product thinking and model serving and 999999999% dealing with Python environment errors
5
21
109
@mlpowered
Emmanuel Ameisen
6 years
Bayesian reasoning and Deep Learning can seem very different. One excels at accurately measuring uncertainty, while the other mostly seems concerned with optimization. @yaringal 's writing connects both elegantly, showing how dropout can define uncertainty
Tweet media one
0
28
106
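A minimal MC-dropout sketch in PyTorch, in the spirit of that connection: keep dropout active at inference and read uncertainty off the spread of repeated forward passes. The architecture is illustrative.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 64), nn.ReLU(), nn.Dropout(0.5), nn.Linear(64, 1)
)
model.train()  # keep dropout ON, unlike standard eval-mode inference

x = torch.randn(1, 10)
with torch.no_grad():
    samples = torch.stack([model(x) for _ in range(100)])
mean, std = samples.mean(0), samples.std(0)  # std ~ predictive uncertainty
```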
@mlpowered
Emmanuel Ameisen
29 days
For context, the goal of dictionary learning is to untangle the activations inside the neurons of an LLM into a small set of interpretable features. We can then look at these features to inspect what is happening inside the model as it processes a given context.
Tweet media one
2
5
102
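As a heavily simplified toy sketch of the general idea, one can train a sparse autoencoder to reconstruct recorded activations under a sparsity penalty; this is a simplification, not the paper's actual training setup.

```python
# Hidden units ("features") reconstruct activations under an L1 penalty.
import torch
import torch.nn as nn

d_act, n_features = 512, 4096  # activations expand into many more features
enc = nn.Linear(d_act, n_features)
dec = nn.Linear(n_features, d_act)
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)

activations = torch.randn(1024, d_act)  # stand-in for recorded activations
for _ in range(100):
    features = torch.relu(enc(activations))  # sparse feature activations
    recon = dec(features)
    loss = (recon - activations).pow(2).mean() + 1e-3 * features.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```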
@mlpowered
Emmanuel Ameisen
4 years
I gave a talk sharing tips for teams to ship ML. At most companies I've worked at, the main challenge to getting models out is organizational. The talk focuses on ways to enable product, ML, and infra teams to work together. Slides are available below!
1
22
97
@mlpowered
Emmanuel Ameisen
6 years
An amazing deck from the @netflix team on how they keep their recommendation systems fresh by continuously monitoring them, and incorporating corrupted data at training time to mimic real world conditions
Tweet media one
0
24
102
@mlpowered
Emmanuel Ameisen
5 years
Using a learned variant of SVD to compress embedding size by 90%. For many speech and NLP tasks, the embeddings represent most of the size of a network, so this clever take on SVD is really impactful! Great to see a clear post accompany the paper as well.
Tweet media one
1
18
98
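A plain, non-learned version of the idea with truncated SVD shows where the ~90% saving comes from (all sizes here are illustrative):

```python
# Factor a (vocab x d) embedding table into two thin matrices.
import numpy as np

vocab, d, k = 5000, 300, 30  # k: singular directions kept
E = np.random.randn(vocab, d).astype(np.float32)  # stand-in embedding table

U, S, Vt = np.linalg.svd(E, full_matrices=False)
E_small = U[:, :k] * S[:k]  # (vocab, k): compressed per-word vectors
proj = Vt[:k]               # (k, d): shared projection back to d dims

# Parameters: vocab*d = 1.5M  ->  vocab*k + k*d = ~159K (~90% smaller)
approx = E_small @ proj
```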
@mlpowered
Emmanuel Ameisen
6 years
Oldie but goodie, this writeup about a winning @kaggle submission is one of the first I've seen successfully apply deep learning to tabular data. More examples have started cropping up recently, so I would not be surprised if the trend continues
Tweet media one
1
18
97
@mlpowered
Emmanuel Ameisen
4 years
Experimentation and statistics could really benefit from a beginner-friendly attitude. Not just in the form of textbooks, but an openness to discuss concepts that are often confusing. (Lack of) knowledge of stats causes imposter syndrome for too many folks in my network :(
6
12
92
@mlpowered
Emmanuel Ameisen
1 year
I'm excited to announce I've joined @AnthropicAI as a Product Research Engineer! I loved my time on @Stripe 's Radar team, working to improve fraud detection models, and in time I hope to write more about some lessons I learned there.
9
4
90
@mlpowered
Emmanuel Ameisen
5 years
A hack that many teams use to deploy ML more easily: Create good embeddings for future use. See @facebook using code embeddings to surface best practices Note that they don't really use any deep learning and focus on handcrafted features. @HamelHusain
Tweet media one
2
26
89
@mlpowered
Emmanuel Ameisen
5 years
It is fascinating to me that we spend so much time talking about models and so little time talking about test sets In industry, your test set is supposed to represent your performance in prod Designing a test set that does this well is extremely hard, and not talked about much
5
22
85
@mlpowered
Emmanuel Ameisen
6 years
Many researchers including @fchollet have highlighted the importance of having an explanation (and ideally a tuning mechanism) for content recommendation. @mcinerneyj gives an overview of @Spotify 's category based recommender, a step towards explanations
Tweet media one
0
29
85
@mlpowered
Emmanuel Ameisen
4 months
First, we grabbed the raw transcript of the video and screenshots taken at 5s intervals. Then, we chunked the transcript into 24 parts for efficient processing (the whole transcript fits within the context window, so this is merely a speed optimization).
3
3
78
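The chunking step is simple in code; here is a trivial version (the chunk count comes from the tweet, everything else is illustrative):

```python
# Split a transcript into roughly equal word-count chunks.
def chunk_transcript(transcript: str, n_chunks: int = 24) -> list[str]:
    words = transcript.split()
    size = len(words) // n_chunks + 1
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]
```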
@mlpowered
Emmanuel Ameisen
6 years
Slowly but surely, deep learning is joining the last domain to resist it, tabular data. The networks are still relatively simple which makes me think we need better primitives for structured data. What is the convolution equivalent for spreadsheets?
Tweet media one
3
29
79
@mlpowered
Emmanuel Ameisen
5 years
95% of the work of a Data Scientist is gathering, cleaning and presenting data. Some think that's a bad thing, but I wish every practitioner would embrace it! Data Science is about using data to produce useful things, there is a good reason the title isn't Model Scientist.
2
14
73
@mlpowered
Emmanuel Ameisen
5 years
This work from @mcleavey of @OpenAI has me floored. Train GPT-2 on a large dataset, and you get an amazing composer that lets anyone create music. Here is a riff on Poker Face created in 2 minutes. Notice the coherence of the composition!
1
15
74
@mlpowered
Emmanuel Ameisen
4 years
For the book I built four successive versions of an ML app: First a heuristic with a bunch of rules. Then a simple model. Then, a much more complex model. Finally, a model that simplified the previous approach. Same lifecycle I've seen in industry!
4
7
76
@mlpowered
Emmanuel Ameisen
29 days
We find features for almost everything you can think of: geographical concepts (cities and countries), architecture, sports and science. They combine like you'd expect: "an athlete from California" triggers both the athlete feature and the California feature But there's more!
Tweet media one
1
5
76
@mlpowered
Emmanuel Ameisen
6 years
Pre-training language models to build classifiers is quickly becoming more accessible, and has huge promise. The next step, in my opinion, is to make it as easy to use a pre-trained model on a custom dataset as VGG transfer learning in Keras, for example.
Tweet media one
0
18
73
@mlpowered
Emmanuel Ameisen
5 years
Hey! If this resonates with you, take a look at my book (currently in early release). It has many many more tips about building sane ML products! Amazon: O'Reilly Early Release:
Tweet media one
0
7
73
@mlpowered
Emmanuel Ameisen
5 years
My todo list is a classic example of why ML is hard: Spin up a webserver to serve current prototype of model: 5 hours Add some simple monitoring: 2 hours Add input validation logic and error handling: 3 hours Iterate on the ML and data side to get a better model: ???
1
13
71
@mlpowered
Emmanuel Ameisen
5 years
One of the biggest differences I see between more experienced folks and novices in ML: error analysis If your model shows .94 precision, don't try a new set of hyperparameters Look at the data Then look at it again, using different methods The data always has the answers
4
8
70
@mlpowered
Emmanuel Ameisen
4 months
Here is a subset of some of what we asked the model, in one prompt (full prompt attached) - directly write HTML - filter out irrelevant screenshots - transcribe the code examples in images if they contain a complete example - synthesize transcript and image contents into prose
Tweet media one
1
6
68
@mlpowered
Emmanuel Ameisen
4 years
Building my own prototype of an ML guided editor as an example for my book () has helped me realize how good @Grammarly and @textio are. Using ML to assist and guide a user requires a lot more than just modeling; you need to think deeply about product.
4
3
69
@mlpowered
Emmanuel Ameisen
5 years
Big fan of . I love how all the questions sound reasonable at a glance, but turn out to be completely nuts once you actually dive in. I think I now know where @YouTube comments come from...
Tweet media one
1
22
67
@mlpowered
Emmanuel Ameisen
4 years
Getting into spaced repetition for memory thanks to @michael_nielsen and @andy_matuschak 's work. It feels like unsupervised vs supervised learning Normal reading is unsupervised Spaced repetition provides labels you get tested on at successive epochs, to minimize memory loss
1
2
66
@mlpowered
Emmanuel Ameisen
6 years
Regarding the debate on the rigor of Deep Learning, I recommend this paper by Leo Breiman making the case for focusing less on Data Modeling (understanding how data is generated) in favor of Algorithmic Modeling (measuring predictive accuracy).
Tweet media one
0
23
65
@mlpowered
Emmanuel Ameisen
11 months
The alignment and interpretability work at @AnthropicAI was one of the main reasons I joined The linked thread does a good job of breaking down some of the details, but I wanted to take a stab at explaining a couple findings in even simpler terms in my own words 🧵
@AnthropicAI
Anthropic
11 months
When language models "reason out loud," it's hard to know if their stated reasoning is faithful to the process the model actually used to make its prediction. In two new papers, we measure and improve the faithfulness of language models' stated reasoning.
Tweet media one
12
130
725
1
11
64
@mlpowered
Emmanuel Ameisen
4 months
We gave Opus the transcript, video screenshots, as well as two *additional* screenshots: - One of Andrej's blog to display a visual style to follow - The top of the notebook @karpathy shared with a writing style example On top, we added lots of instructions (prompt in repo)
Tweet media one
Tweet media two
2
5
61
@mlpowered
Emmanuel Ameisen
5 years
Found on the @Yelp app. Named Entity Recognition is hard! At a brewery, an "Amber" might be a popular beer but at this restaurant it's actually the name of the waitress...
Tweet media one
1
13
61
@mlpowered
Emmanuel Ameisen
6 years
Visual search is the most natural way to search for fashion items, and is becoming more and more popular. If you are curious how to build a visual search app yourself, check out and tell me what you think!
@rrhoover
Ryan Hoover
6 years
Snap just announced visual search: Reminds me of Pinterest Lens 🤔
Tweet media one
12
26
177
0
17
60
@mlpowered
Emmanuel Ameisen
4 months
System prompt design is fascinating. It's basically prompt engineering for the broadest use case you can think of. That means you need to be subtle, as each word affects every single use case. We spent quite a bit of time iterating on this version. See 🧵 for more.
@AmandaAskell
Amanda Askell
4 months
Here is Claude 3's system prompt! Let me break it down 🧵
Tweet media one
121
556
3K
2
4
60
@mlpowered
Emmanuel Ameisen
6 years
Ensembles are very powerful in #ML and routinely win @kaggle competitions. This paper is one of the first I saw mention the idea of using #DeepLearning checkpoints to form an ensemble. The idea has a lot of supporters including @jeremyphoward ! #ClassicML
Tweet media one
1
7
59
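A minimal sketch of a checkpoint ensemble in PyTorch: average the predictions of snapshots saved during a single training run. The paths and the softmax head are illustrative.

```python
import torch

def ensemble_predict(model, checkpoint_paths, x):
    preds = []
    for path in checkpoint_paths:
        model.load_state_dict(torch.load(path))
        model.eval()
        with torch.no_grad():
            preds.append(torch.softmax(model(x), dim=-1))
    return torch.stack(preds).mean(0)  # averaged class probabilities
```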
@mlpowered
Emmanuel Ameisen
5 years
Heard about @YouTube 's recommendation algorithm, heavily criticized for promoting radical and hateful content? Take a look through this fascinating paper describing it. Steps are taken to reduce clickbait, but the only target metric is watch time...
Tweet media one
4
21
57
@mlpowered
Emmanuel Ameisen
5 years
This post by @alex_gude should be required reading for Data Scientists. There are two ways you can learn this lesson: You could use a simple model metric, ship a useless model and make people sad. OR You could read Alex's post.
1
15
57
@mlpowered
Emmanuel Ameisen
5 years
Many ML practitioners ignore latency, but most apps try to make every interaction no longer than 100ms. It makes using the app more enjoyable! ML should aim for it too, especially for creative uses. This is why I'm excited for @jeremyphoward and @clattner_llvm 's Swift work!
2
5
58
@mlpowered
Emmanuel Ameisen
4 years
Last week the @feedly team had me over to chat about some of the practical ML tips I've been writing about. The recording is available now. It's a short video about why and how you should look at your data, including a slide copied from @karpathy :)
2
11
57
@mlpowered
Emmanuel Ameisen
5 years
@collision Not TikTok but the YouTube recommendation paper is one of the most detailed and insightful I've seen to this day.
1
5
55
@mlpowered
Emmanuel Ameisen
29 days
We also explore how the model actually uses these features to predict the next word. In other words, we try to separate features that are related to the context from ones that are useful for prediction. Let's look at an example
Tweet media one
1
4
53
@mlpowered
Emmanuel Ameisen
5 years
Re-reading World Models by @hardmaru and Schmidhuber, it feels like such an elegant combination of many great ideas about representation learning and dynamic world representations. Feels like something key to build on, especially with all the code here
Tweet media one
1
12
52
@mlpowered
Emmanuel Ameisen
5 years
This is what practical ML looks like. Notice how this @UberEng article covers building the tooling to train, run, test and update a model. Not much is said about the architecture -> that is not where most of the gains come from in practice
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
11
51
@mlpowered
Emmanuel Ameisen
6 years
A new #ML approach by @jeremyphoward and @seb_ruder , taking transfer learning for #NLP to the next level. Pre-train a general RNN language model on a corpus, fine tune on specific tasks, achieve state of the art results!
0
16
49
@mlpowered
Emmanuel Ameisen
3 months
So excited to finally see tool use out in the world! Improving Claude's ability to use tools was a core focus when building the Claude 3 family. Looking forward to seeing what you all build with it
@AnthropicAI
Anthropic
3 months
Tool use is now available in beta to all customers in the Anthropic Messages API, enabling Claude to interact with external tools using structured outputs.
70
293
2K
4
4
50
@mlpowered
Emmanuel Ameisen
5 years
To decide between Data Science and ML Engineering roles, ask yourself whether you want to focus more on product/analytics questions or on engineering challenges. There is no right answer, but in my network, most DSs transitioned into product, and MLEs into engineering.
1
7
49
@mlpowered
Emmanuel Ameisen
29 days
In the sentence "Fact: The capital of the state where Kobe Bryant played basketball is", a lot of features are active. e.g. features for: - various words (fact, of, etc.) - trivia questions - basketball and geography But only a subset are useful to predict the next token
Tweet media one
1
3
50
@mlpowered
Emmanuel Ameisen
5 years
I literally gave a 90-minute talk yesterday that consisted of over fifty slides repeating this message. The first question I got after the talk? "What's the best NLP model?" ...
2
6
48
@mlpowered
Emmanuel Ameisen
6 years
I wrote a tutorial on #ReinforcementLearning , with the help of many people and based on amazing work by @awjuliani . Includes a shoutout to the awesome World Models paper by @hardmaru . Would love to hear any feedback people have :)
0
13
50
@mlpowered
Emmanuel Ameisen
5 years
Image colorization has really come a long way! Really impressive demo by @citnaj and @jeremyphoward . Most deep learning colorizers I've seen produce videos where colors change a lot between frames. Not the case at all here! Here is an example of one of my favorite French movies
0
9
46
@mlpowered
Emmanuel Ameisen
5 years
Here are the slides for my talk at @QConAI on tips and tricks to make NLP models work in practice. I built this talk to mirror the real life of a Data Scientist, so it is 10% about models, and 90% about data and error inspection. 🔎🔎🔎
Tweet media one
1
13
48
@mlpowered
Emmanuel Ameisen
5 years
@CaseyNewton I hear an upcoming feature allows you to see when your matches have shared your profile to a WhatsApp group to make fun of it.
0
1
48
@mlpowered
Emmanuel Ameisen
6 years
The most confusing concept in speech recognition for me is always CTC loss. I learn it and forget it roughly every 6 months, and this article by @_lab41 is always a good way to remind me of how it works.
Tweet media one
2
15
47
@mlpowered
Emmanuel Ameisen
4 months
It writes code examples, and relates the content of the transcript to the screenshots to provide a coherent narrative. Overall, the tutorial is readable, clear and much better than anything I've previously gotten out of an LLM.
Tweet media one
1
2
46
@mlpowered
Emmanuel Ameisen
4 years
Amazing articles on ML in production, and constraints of applied systems and organizations. I particularly enjoyed reading twelve truths of ML for the real world:
@deliprao
Delip Rao e/σ
4 years
@mlpowered @WWRob @DavidSHolz I believe Rob is referring to one of these?
3
1
17
0
7
47
@mlpowered
Emmanuel Ameisen
5 years
This is true at the scale of infrastructure. Many individuals who have led ML teams ( @mrogati , @chrisemoody ) promote the importance of increasing "experiment velocity", the speed at which projects can launch. It's a huge productivity boost. We saw this firsthand at @Zipcar !
@fchollet
François Chollet
5 years
It matters because trying more ideas (with fewer mistakes) means you will converge faster towards better ideas (thus winning competitions more often or increasing your paper acceptance rate). I'm thinking Kaggle kernels or Colab would be a way to gather hard data on this...
3
5
65
1
10
45
@mlpowered
Emmanuel Ameisen
6 years
In a great episode of @twimlai , @JeffDean mentions that finding embeddings for categorical variables through word2vec-style optimization is extremely common at @Google , but that there are few relevant papers. We see the same at @InsightDataAI , and often have to build it from scratch.
4
17
45
@mlpowered
Emmanuel Ameisen
5 years
If you look at the dataset to understand how your model performs, you'll often see that your model is actually struggling. Here, BERT's accuracy on the test set drops from 77% to 50% (random) after researchers identify and correct data leakage.
1
10
46
@mlpowered
Emmanuel Ameisen
5 years
Data tip of the day: If you are doing anything involving processing more than a hundred rows of data (SQL, Spark, model training, viz), use only a small subset to iterate faster! Writing/using a sampling function takes five minutes and saves hours of "waiting for X to run".
3
4
45
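A five-minute version of such a sampling function for pandas (names are illustrative):

```python
# Iterate on ~1% of rows, then run the full dataset once the code works.
import pandas as pd

def sample_df(df: pd.DataFrame, frac: float = 0.01, seed: int = 0) -> pd.DataFrame:
    return df.sample(frac=frac, random_state=seed)

# df = pd.read_parquet("events.parquet")  # hypothetical large table
# dev_df = sample_df(df)
```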
@mlpowered
Emmanuel Ameisen
5 years
That feeling in programming when you've managed to do the thing you've been trying to do for an hour, but you feel deep shame about the hacks you've had to use to make it happen.
1
3
45
@mlpowered
Emmanuel Ameisen
4 months
@ErikSchluntz and I have read the resulting transcript: Opus manages to incorporate all of these requests and produces a great blog post. The blog post is formatted as asked, with a subset of images selected and captioned
Tweet media one
1
3
43
@mlpowered
Emmanuel Ameisen
5 years
Data leakage is one of the most dangerous ML errors for 2 reasons: You often will catch it too late, once your model is in production. It can happen in many subtle ways, not just obvious ones. That's why I dedicate a full section to it in my book!
0
8
44