@VikParuchuri
Vik Paruchuri
5 months
Surya () already supports line detection , and I'm excited to have it do full end to end OCR. The final model should support ~90 languages (all major languages in use today).
2
1
31

Replies

@VikParuchuri
Vik Paruchuri
5 months
An update on surya text recognition - I'm happy with the data/architecture, and I'm ready to scale up training. Here are some results from a (very) early checkpoint. Left is original, right is OCR (Malayalam)
Tweet media one
Tweet media two
9
24
247
@VikParuchuri
Vik Paruchuri
5 months
And here is a Japanese example. As you can see, there are some issues with the OCR, since this is an early model.
Tweet media one
Tweet media two
1
1
9
@VikParuchuri
Vik Paruchuri
5 months
Obligatory newspaper
Tweet media one
Tweet media two
1
0
9
@VikParuchuri
Vik Paruchuri
5 months
I want to get the training done in the next week or two. My @LambdaAPI 4x A6000s have been great, but I think I'm going to need more to scale up this training. If you know of any discounted/available GPUs (ideally 8x 80GB {A,H}100), please let me know.
4
0
13
@ArYoMo
Ariel Ekgren
5 months
@VikParuchuri Would love if it supported the Nordic Languages (Swedish. Norwegian, Danish and icelandic). Can people help, do you need data?
1
0
2
@VikParuchuri
Vik Paruchuri
4 months
@ArYoMo Hey, I didn't see this message before, but it does support all 4 - just released it.
0
0
1