@arankomatsuzaki
Aran Komatsuzaki
5 months
Scaling Laws for Downstream Task Performance of Large Language Models
Studies how the choice of the pretraining data and its size affect downstream cross-entropy and BLEU score.

Replies

@BerivanISIK
Berivan Isik
4 months