Surya () already supports line detection , and I'm excited to have it do full end to end OCR. The final model should support ~90 languages (all major languages in use today). Tweet added by Vik Paruchuri @VikParuchuri

Vik Paruchuri

5 months

Surya () already supports line detection , and I'm excited to have it do full end to end OCR. The final model should support ~90 languages (all major languages in use today).

GitHub - VikParuchuri/surya: OCR, layout analysis, reading order, line detection in 90+ languages

OCR, layout analysis, reading order, line detection in 90+ languages - VikParuchuri/surya

github.com

2

1

31

Vik Paruchuri

@VikParuchuri

5 months

An update on surya text recognition - I'm happy with the data/architecture, and I'm ready to scale up training. Here are some results from a (very) early checkpoint. Left is original, right is OCR (Malayalam)

9

24

247

Vik Paruchuri

@VikParuchuri

5 months

And here is a Japanese example. As you can see, there are some issues with the OCR, since this is an early model.

1

9

Vik Paruchuri

@VikParuchuri

5 months

Obligatory newspaper

1

0

9

Vik Paruchuri

@VikParuchuri

5 months

I want to get the training done in the next week or two. My @LambdaAPI 4x A6000s have been great, but I think I'm going to need more to scale up this training. If you know of any discounted/available GPUs (ideally 8x 80GB {A,H}100), please let me know.

4

0

13

Ariel Ekgren

@ArYoMo

5 months

@VikParuchuri Would love if it supported the Nordic Languages (Swedish. Norwegian, Danish and icelandic). Can people help, do you need data?

1

0

2

Vik Paruchuri

@VikParuchuri

4 months

@ArYoMo Hey, I didn't see this message before, but it does support all 4 - just released it.

0

1

Replies