13 months. 250 pages. I wrote an ML book!
Want to learn how to ship ML in practice? Check it out!
Includes tips from
@WWRob
,
@mrogati
,
@cdubhland
and more!
It'll be out in winter & you can preorder it now.
Amazon:
O'Reilly:
Claude 3 Opus is great at following multiple complex instructions.
To test it,
@ErikSchluntz
and I had it take on
@karpathy
's challenge to transform his 2h13m tokenizer video into a blog post, in ONE prompt, and it just... did it
Here are some details:
Today, we announced that we've gotten dictionary learning working on Sonnet, extracting millions of features from one of the best models in the world.
This is the first time this has been successfully done on a frontier model.
I wanted to share some highlights 🧵
How to ship ML in practice:
1/ Write a simple rule based solution to cover 80% of use cases
2/ Write a simple ML algorithm to cover 95% of cases
3/ Write a filtering algorithm to route inputs to the correct method
4/ Add monitoring
5/ Detect drift
...
24/ Deep Learning
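Steps 1 through 3 above can be sketched in a few lines. A minimal sketch, assuming a ticket-classification use case; the rules, labels, and function names are all hypothetical:

```python
# Hypothetical sketch of steps 1-3: rules first, a model as fallback,
# and a router in front. Rules and labels are invented for illustration.

def rule_based(ticket: str):
    # Step 1: hand-written rules that cover the easy ~80% of cases
    text = ticket.lower()
    if "refund" in text:
        return "billing"
    if "password" in text:
        return "account"
    return None  # rules don't apply

def simple_model(ticket: str):
    # Step 2: stand-in for a simple trained classifier (e.g. logistic regression)
    return "general"

def route(ticket: str):
    # Step 3: send each input to whichever method can handle it
    return rule_based(ticket) or simple_model(ticket)
```

The router is just a fallback chain here; in practice it can be its own filtering model.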
Most ML folks I know have
@AnthropicAI
's Toy Models of Superposition paper on their reading list, but too few have read it.
It is one of the most interesting interpretability papers I've read in a while, and it can benefit anyone using deep learning.
Here are my takeaways!
When moving from traditional ML (GBDT) to deep learning for categorical data, the vast majority of improvements usually come from learned embeddings of categorical variables.
Once you have the embeddings, you can feed them to any model and it will perform noticeably better.
This is one of the main tricks I recommend when trying to improve current model performance.
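As a toy illustration of the trick: the lookup table below stands in for a trainable embedding layer (e.g. `nn.Embedding`), and the vector values and feature names are made up.

```python
# Toy sketch: replace a raw categorical value with a learned dense vector
# before feeding any downstream model. Vector values here are invented;
# in practice they are learned during training.
embeddings = {
    "red":   [0.9, 0.1],
    "blue":  [0.1, 0.8],
    "green": [0.2, 0.7],
}

def featurize(row):
    # Dense embedding of the category, concatenated with numeric features
    return embeddings[row["color"]] + [row["price"]]

features = featurize({"color": "red", "price": 3.5})
```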
Don't start with hyperparameter search, start by looking at individual examples with large losses, and most of the time you'll understand what feature your model is missing.
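A minimal sketch of that loop; the examples and loss values below are invented for illustration:

```python
# Sort validation examples by loss and read the worst ones first.
examples = ["short text", "ambiguous review", "mislabeled item", "clear case"]
losses   = [0.2, 1.9, 3.4, 0.1]

worst_first = sorted(zip(losses, examples), reverse=True)
for loss, example in worst_first[:2]:
    # The top offenders usually share a pattern (a missing feature,
    # a labeling problem, a preprocessing bug).
    print(f"loss={loss:.1f}  {example}")
```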
Today, a friend transitioning to ML asked me about a data challenge.
Him: I trained a perceptron to predict breast cancer, and reached 99% accuracy on test, what's next?
Me: Time to learn about class imbalance, data leakage and metric selection!
One of the biggest tool gaps in ML right now is in building utilities to more easily inspect and understand data.
I gave a talk about just this at:
It also quotes your great "data in industry vs. data in academia" slide in the conclusion
@karpathy
We see more significant improvements from training data distribution search (data splits + oversampling factor ratios) than neural architecture search. The latter is so overrated :)
Config files are underrated in ML.
You start with a simple model, and soon enough you are trying 13 hyperparameters, 7 models, and 9 data augmentation strategies.
Use a config file, and experimentation becomes much easier.
Python's implementation:
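One possible shape for this, as a sketch (not the implementation linked above; the field names are hypothetical):

```python
# Keep every knob in one config object loaded from a file, instead of
# scattering constants through the code.
import json
from dataclasses import dataclass

@dataclass
class Config:
    model: str = "logreg"
    learning_rate: float = 0.01
    augmentation: str = "none"

def config_from_json(text: str) -> Config:
    # Unknown keys fail loudly, known keys fall back to defaults
    return Config(**json.loads(text))

cfg = config_from_json('{"model": "mlp", "learning_rate": 0.1}')
```

Every experiment then becomes a new config file rather than a code change.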
Do you want to understand how to train models like ChatGPT and stable-diffusion?
Good news, I wrote an illustrated notebook which explains different parallelism approaches and gives a functional example of each.
I've summarized some takeaways below
NB:
More and more of my network is transitioning away from looking for deep learning jobs and focusing more on ML infra and platforms.
The ML Engineering hype cycle is just starting :)
I just finished watching
@karpathy
's let's build GPT lecture, and I think it might be the best in the zero-to-hero series so far.
Here are eight insights about transformers that the video did a great job explaining.
Watch the video for more.
(1/9)
Currently training a Q&A model, and it is producing crazy impressive results!
Q: How do you find a good title?
A: See attached
None of the samples can be found in the training set that I used.
😱😱😱😱😱
What do you mean my analysis isn't reproducible?
I ran a query five months ago on a redshift table that's now deprecated, wrote a notebook in Python 2 to pre-process the data, and used a DNN implementation from a GitHub repo that's since been deleted.
What's not reproducible?
Wow, the book went from best new release to best seller!!!
Looks like the free first chapter helped some folks decide. If you are still on the fence, feel free to check the free PDF out below, the book is also currently 40% off!
Wow,
@lyft
built an ML system that automatically:
Finds the best potential users to target.
Allocates the right budget for each ad.
Sets the right amount to bid on each platform to maximize the use of the budget.
Thank you
@seanjtaylor
for the find!
The paper contains *a lot* more experimental results, and feature examples, including interactive visualizations of feature neighborhoods.
If you've made it this far in the thread, you should give it a read
On Tuesday, we announced our results finding interpretable features in Claude 3 Sonnet.
One of the features we identified is about the Golden Gate Bridge. When activated, the model becomes obsessed with the bridge.
For a limited time, we've made this available to everyone
The solution I recommend for anyone with a similar problem:
Letโs say you have 3 classes
Label 20 examples, including at least 2 examples of each class
Train a simple model on your labels and have it predict the rest
Look through predictions and label a few wrong ones
Repeat
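A toy version of that loop, with a trivial stand-in for the "simple model" (everything below is invented for illustration; a real version would use e.g. logistic regression):

```python
from collections import Counter, defaultdict

def simple_classifier(labeled):
    # Stand-in "training": majority label per coarse feature bucket
    buckets = defaultdict(list)
    for x, y in labeled:
        buckets[x // 10].append(y)
    majority = {b: Counter(ys).most_common(1)[0][0] for b, ys in buckets.items()}
    default = Counter(y for _, y in labeled).most_common(1)[0][0]
    return lambda x: majority.get(x // 10, default)

# Label a handful of points, predict the rest, then go correct the
# predictions that look wrong and repeat.
labeled = [(1, "low"), (3, "low"), (25, "high"), (28, "high")]
predict = simple_classifier(labeled)
predictions = {x: predict(x) for x in [5, 7, 22, 90]}
```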
I have 1,200 unlabelled observations that I want to label. The labels are categorical. What's the most efficient way to do this? Any chance someone's written a Shiny app for this?
How do you run models that are too big to fit in RAM or in GPU memory?
Great
@huggingface
article explaining their approach to distributing computation
When using websites that I suspect have a team of data scientists optimizing them, I make sure to spend some time clicking around randomly.
We all have to do our part to keep datasets messy.
The Illustrated Word2vec. If you've heard my talks, you've seen my many attempts at giving a visual representation of word vectors. This post from
@jalammar
takes this to a new level.
Really comprehensive and accessible overview!
I wrote a tutorial on leveraging
#DeepLearning
to build a powerful image search engine quickly. It includes a notebook to walk you through and a codebase to play with. Also comes with a shoutout to
@jeremyphoward
and his great class on the topic!
Most experienced Data Scientists understand why I dedicate an entire section of my book to cover how to deploy models. Others ask:
"Can't you just wrap it in a Flask server?"
This post from
@ravelinhq
brilliantly shows why that's not enough.
I wrote "How to solve 90% of NLP problems: a step-by-step guide" after seeing dozens of applied NLP projects at
@InsightFellows
.
It has been read by over three hundred thousand people!
It presents a cookie cutter NLP approach, along with reference code
This article by
@a16z
has the best ML infra charts I've seen in a very long time.
If you'd like to know more about the challenges that come with ML, and the tools to solve them, this is a great start
Impressive work!
Combine:
- 3 dimensional convolutional auto-encoder
-
@spacy_io
embeddings
- An RNN encoder
- Sprinkle some t-SNE on top
Get a model that can generate 3D models from text descriptions!
The interactive app is really fun, try it!
2018 has been a continuing flurry of exciting work in Machine Learning. If you are interested in being part of the field in 2019, I've written about how some of the most impactful trends of 2018 will shape this year!
Benchmark datasets are frustrating.
On one hand, datasets often drive significant innovation initially.
On the other hand, they usually become completely overfit, leading everyone to overestimate state of the art performance.
There should be a hype cycle for datasets.
As always,
@fastdotai
posts are a pleasure to read, and present results clearly and fairly.
This post covers language modeling for low-resource languages and provides useful info on learning rates, loss functions, and model architecture choices.
The most broadly applicable prompting technique:
1. Collect a random subset of failing examples from a training set
2. Add the examples and a correct response to your prompt
3. Repeat
Doing the above 5 times has solved 90% of prompting challenges I've seen
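In code, the loop above might look something like this (the prompt format and examples are made up; any prompt template works):

```python
# Fold corrected failures back into the prompt as few-shot demonstrations.
def build_prompt(instructions, corrected_failures, query):
    shots = "\n\n".join(
        f"Input: {x}\nCorrect output: {y}" for x, y in corrected_failures
    )
    return f"{instructions}\n\n{shots}\n\nInput: {query}\nOutput:"

prompt = build_prompt(
    "Classify the sentiment of the input.",
    [("meh, it was fine I guess", "neutral")],  # a case the model got wrong
    "best purchase ever",
)
```

Each pass through the loop appends another corrected failure to the demonstrations list.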
If you use the same prompt, but force the Golden Gate Bridge feature to be maximally active, Claude starts believing that it is the bridge itself!
We repeat this experiment in domains ranging from code to sycophancy, and find similar results.
Continuously impressed by
@distillpub
publications. This one on feature transformation does such a good job of centralizing and explaining trends around multimodal learning. An informative read that will leave you wanting to try out a lot of ideas.
I don't think people have fully internalized yet how cheap and how good Haiku is.
Opus gets all the press for good reason, but Haiku pushes the intelligence / cost boundary even more imo
Kinda bonkers, but this agent workflow only cost me ~$0.09 (9 cents)
It's about ~ 350k tokens, and the outputs are on par with gpt-4, at a fraction of the price.
GPT-4: $10 / 1M tokens → $3.50 for this run
Haiku: $0.25 / 1M tokens → $0.09
40x cheaper.
This wasn't even possible 1 month ago.
To iterate faster on models, here is how I examine results:
Summary metrics for an overview
Confusion matrix and calibration curve to find challenging data types
Model explainers to inspect features
Manually inspect top and worst performing examples
Decide on next steps!
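The first two steps above can start very small. A minimal sketch with toy labels; in practice you'd reach for sklearn's `confusion_matrix` and `calibration_curve`:

```python
from collections import Counter

# Toy predictions; in practice these come from your validation set.
y_true = ["cat", "cat", "dog", "dog", "dog"]
y_pred = ["cat", "dog", "dog", "dog", "cat"]

confusion = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
errors = [i for i, (t, p) in enumerate(zip(y_true, y_pred)) if t != p]
# `errors` indexes the examples worth inspecting by hand
```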
Bayesian reasoning and Deep Learning can seem very different. One excels at accurately measuring uncertainty, while the other mostly seems concerned with optimization.
@yaringal
's writing connects both elegantly, showing how dropout can define uncertainty
For context, the goal of dictionary learning is to untangle the activations inside the neurons of an LLM into a small set of interpretable features.
We can then look at these features to inspect what is happening inside the model as it processes a given context.
I gave a talk sharing tips for teams to ship ML.
At most companies I've worked at, the main challenge in getting models out is organizational.
The talk focuses on ways to enable product, ml, and infra teams to work together.
Slides are available below!
An amazing deck from the
@netflix
team on how they keep their recommendation systems fresh by continuously monitoring them, and incorporating corrupted data at training time to mimic real world conditions
Using a learned variant of SVD to compress embedding size by 90%. For many speech and NLP tasks, the embeddings represent most of the size of a network, so this clever take on SVD is really impactful! Great to see a clear post accompany the paper as well.
Oldie but goodie, this writeup about a winning
@kaggle
submission is one of the first I've seen successfully apply deep learning to tabular data. More examples have started cropping up recently, so I would not be surprised if the trend continues
Experimentation and statistics could really benefit from a beginner friendly attitude.
Not just in the form of textbooks, but an openness to discuss concepts that are often confusing
(Lack of) knowledge of stats causes imposter syndrome for too many folks in my network :(
I'm excited to announce I've joined
@AnthropicAI
as a Product Research Engineer!
I loved my time on
@Stripe
's Radar team, working to improve fraud detection models, and in time I hope to write more about some lessons I learned there.
A hack that many teams use to deploy ML more easily:
Create good embeddings for future use.
See
@facebook
using code embeddings to surface best practices
Note that they don't really use any deep learning and focus on handcrafted features.
@HamelHusain
It is fascinating to me that we spend so much time talking about models and so little time talking about test sets
In industry, your test set is supposed to represent your performance in prod
Designing a test set that does this well is extremely hard, and not talked about much
Many researchers including
@fchollet
have highlighted the importance of having an explanation (and ideally a tuning mechanism) for content recommendation.
@mcinerneyj
gives an overview of
@Spotify
's category based recommender, a step towards explanations
First, we grabbed the raw transcript of the video and screenshots taken at 5s intervals.
Then, we chunked the transcript into 24 parts for efficient processing (the whole transcript fits within the context window, so this is merely a speed optimization).
Slowly but surely, deep learning is joining the last domain to resist it, tabular data. The networks are still relatively simple which makes me think we need better primitives for structured data. What is the convolution equivalent for spreadsheets?
95% of the work of a Data Scientist is gathering, cleaning and presenting data.
Some think that's a bad thing, but I wish every practitioner would embrace it!
Data Science is about using data to produce useful things, there is a good reason the title isn't Model Scientist.
This work from
@mcleavey
of
@OpenAI
has me floored.
Train GPT-2 on a large dataset, and you get an amazing composer that lets anyone create music.
Here is a riff on Poker Face created in 2 minutes.
Notice the coherence of the composition!
For the book I built four successive versions of an ML app:
First a heuristic with a bunch of rules.
Then a simple model.
Then, a much more complex model.
Finally, a model that simplified the previous approach.
Same lifecycle Iโve seen in industry!
We find features for almost everything you can think of: geographical concepts (cities and countries), architecture, sports and science.
They combine like you'd expect: "an athlete from California" triggers both the athlete feature and the California feature
But there's more!
Pre-training language models to build classifiers is quickly becoming more accessible, and has huge promise. The next step, in my opinion, is to make it as easy to use a pre-trained model on a custom dataset as VGG transfer learning in Keras, for example.
Hey!
If this resonates with you, take a look at my book ( currently on early-release).
It has many many more tips about building sane ML products!
Amazon:
O'Reilly Early Release:
My todo list is a classic example of why ML is hard:
Spin up a webserver to serve current prototype of model: 5 hours
Add some simple monitoring: 2 hours
Add input validation logic and error handling: 3 hours
Iterate on the ML and data side to get a better model: ???
One of the biggest differences I see between more experienced folks and novices in ML: error analysis
If your model shows .94 precision, don't try a new set of hyperparameters
Look at the data
Then look at it again, using different methods
The data always has the answers
Here is a subset of some of what we asked the model, in one prompt (full prompt attached)
- directly write HTML
- filter out irrelevant screenshots
- transcribe the code examples in images if they contain a complete example
- synthesize transcript and image contents into prose
Building my own prototype of an ML guided editor as an example for my book () has helped me realize how good
@Grammarly
and
@textio
are.
Using ML to assist and guide a user requires a lot more than just modeling; you need to think deeply about product.
Big fan of . I love how all the questions sound reasonable at a glance, but completely nuts once you actually dive in. I think I now know where
@YouTube
comments come from...
Getting into spaced repetition for memory thanks to
@michael_nielsen
and
@andy_matuschak
โs work.
It feels like unsupervised vs supervised learning
Normal reading is unsupervised
Spaced repetition provides labels you get tested on at successive epochs, to minimize memory loss
Regarding the debate on the rigor of Deep Learning, I recommend this paper by Leo Breiman making the case for focusing less on Data Modeling (understanding how data is generated) in favor of Algorithmic Modeling (measuring predictive accuracy).
The alignment and interpretability work at
@AnthropicAI
was one of the main reasons I joined
The linked thread does a good job of breaking down some of the details, but I wanted to take a stab at explaining a couple of findings in even simpler terms, in my own words 🧵
When language models "reason out loud," it's hard to know if their stated reasoning is faithful to the process the model actually used to make its prediction. In two new papers, we measure and improve the faithfulness of language models' stated reasoning.
We gave Opus the transcript, video screenshots, as well as two *additional* screenshots:
- One of Andrej's blog to display a visual style to follow
- The top of the notebook
@karpathy
shared as a writing-style example. On top, we added lots of instructions (prompt in repo)
Found on the
@Yelp
app. Named Entity Recognition is hard! At a brewery, an "Amber" might be a popular beer, but at this restaurant it's actually the name of the waitress...
Visual search is the most natural way to search for fashion items, and is becoming more and more popular. If you are curious how to build a visual search app yourself, check out and tell me what you think!
System prompt design is fascinating.
Itโs basically prompt engineering for the broadest use case you can think of.
That means you need to be subtle with it, as each word affects every single use case.
We spent quite a bit of time iterating on this version. See 🧵 for more.
Ensembles are very powerful in
#ML
and routinely win
@kaggle
competitions. This paper is one of the first I saw mention the idea of using
#DeepLearning
checkpoints to form an ensemble. The idea has a lot of supporters including
@jeremyphoward
!
#ClassicML
Heard about
@YouTube
's recommendation algorithm, heavily criticized for promoting radical and hateful content? Take a look through this fascinating paper describing it. Steps are taken to reduce clickbait, but the only target metric is watch time...
This post by
@alex_gude
should be required reading for Data Scientists.
There are two ways you can learn this lesson:
You could use a simple model metric, ship a useless model and make people sad.
OR
You could read Alex's post.
Many ML practitioners ignore latency, but most apps aim to keep every interaction under 100ms. It makes using the app more enjoyable!
ML should aim for it too, especially for creative uses.
This is why I'm excited for
@jeremyphoward
and
@clattner_llvm
's Swift work!
Last week the
@feedly
team had me over to chat about some of the practical ML tips I've been writing about.
The recording is available now. It's a short video about why and how you should look at your data, including a slide copied from
@karpathy
:)
We also explore how the model actually uses these features to predict the next word.
In other words, we try to separate features that are related to the context from ones that are useful for prediction.
Let's look at an example
Re-reading World Models by
@hardmaru
and Schmidhuber, it feels like such an elegant combination of many great ideas about representation learning and dynamic world representations. Feels like something key to build on, especially with all the code here
This is what practical ML looks like.
Notice how this
@UberEng
article covers building the tooling to train, run, test and update a model.
Not much is said about the architecture -> that is not where most of the gains come from in practice
Good post by Gustavo Millen on tests for ML.
It covers some of the materials discussed in Building ML Powered Applications including the "ML Test Score" paper by
@GoogleAI
.
A new
#ML
approach by
@jeremyphoward
and
@seb_ruder
, taking transfer learning for
#NLP
to the next level. Pre-train a general RNN language model on a corpus, fine tune on specific tasks, achieve state of the art results!
So excited to finally see tool use out in the world!
Improving Claudeโs ability to use tools was a core focus when building the Claude 3 family.
Looking forward to seeing what you all build with it
Tool use is now available in beta to all customers in the Anthropic Messages API, enabling Claude to interact with external tools using structured outputs.
To decide between Data Science and ML Engineering roles, ask yourself whether you want to focus more on product/analytics questions or on engineering challenges.
There is no right answer, but in my network, most DSs transitioned into product, and MLEs into engineering.
In the sentence "Fact: The capital of the state where Kobe Bryant played basketball is", a lot of features are active. e.g. features for:
- various words (fact, of, etc.)
- trivia questions
- basketball and geography
But only a subset are useful to predict the next token
I literally gave a 90-minute talk yesterday that consisted of over fifty slides repeating this message.
The first question I got after the talk?
"What's the best NLP model?"
...
I wrote a tutorial on
#ReinforcementLearning
, with the help of many people and based on amazing work by
@awjuliani
. Includes a shoutout to the awesome World Models paper by
@hardmaru
. Would love to hear any feedback people have :)
Image colorization has really come a long way! Really impressive demo by
@citnaj
and
@jeremyphoward
.
Most deep learning colorizers I've seen produce videos where colors change a lot between frames. Not the case at all here!
Here is an example of one of my favorite French movies
Here are the slides for my talk at
@QConAI
on tips and tricks to make NLP models work in practice.
I built this talk to mirror the real life of a Data Scientist, so it is 10% about models and 90% about data and error inspection.
The most confusing concept in speech recognition for me is always CTC loss. I learn it and forget it roughly every 6 months, and this article by
@_lab41
is always a good way to remind me of how it works.
It writes code examples, and relates the content of the transcript to the screenshots to provide a coherent narrative.
Overall, the tutorial is readable, clear and much better than anything I've previously gotten out of an LLM.
Amazing articles on ML in production, and constraints of applied systems and organizations.
I particularly enjoyed reading twelve truths of ML for the real world:
This is true at the scale of infrastructure. Many individuals that have led ML teams (
@mrogati
,
@chrisemoody
) promote the importance of increasing "experiment velocity", the speed at which projects can launch. It's a huge productivity boost. We saw this firsthand at
@Zipcar
!
It matters because trying more ideas (with fewer mistakes) means you will converge faster towards better ideas (thus winning competitions more often or increasing your paper acceptance rate).
I'm thinking Kaggle kernels or Colab would be a way to gather hard data on this...
In a great episode of
@twimlai
,
@JeffDean
mentions that finding embeddings for categorical variables through word2vec-style optimization is extremely common at
@Google
, but that there are few relevant papers. We see the same at
@InsightDataAI
, and often have to build it from scratch.
If you look at the dataset to understand how your model performs, you'll often see that your model is actually struggling.
Here, BERT's accuracy on the test set drops from 77% to 50% (random) after researchers identify and correct data leakage.
Data tip of the day:
If you are doing anything involving processing more than a hundred rows of data (SQL, Spark, model training, viz), use only a small subset to iterate faster!
Writing/using a sampling function takes five minutes and saves hours of "waiting for X to run".
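A five-minute version of that sampling helper, as a sketch (the fixed seed is an assumption, chosen so reruns stay comparable):

```python
import random

def sample_rows(rows, n, seed=0):
    # Fixed seed: the subset stays stable across runs, so results are comparable
    rng = random.Random(seed)
    return rng.sample(rows, min(n, len(rows)))

subset = sample_rows(list(range(100_000)), 500)
```

Point your pipeline at `subset` while iterating, then swap in the full data once the code works.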
That feeling in programming when you've managed to do the thing you've been trying to do for an hour, but you feel deep shame about the hacks you've had to use to make it happen.
@ErikSchluntz
and I read the resulting transcript: Opus manages to incorporate all of these requests and produces a great blog post.
The blog post is formatted as asked, with a subset of images selected and captioned
Data leakage is one of the most dangerous ML errors for 2 reasons:
You often catch it too late, once your model is already in production.
It can happen in many subtle ways, not just obvious ones
That's why I dedicate a full section to it in my book!