by maouida on 9/19/14, 7:10 AM with 45 comments
by steeve on 9/19/14, 9:12 AM
You can use our Python bindings for both [3,4], although they might be slightly outdated:
[1] https://code.google.com/p/tesseract-ocr/
[2] http://libccv.org/doc/doc-swt/
by swalsh on 9/19/14, 2:30 PM
I found a few libraries, but they only worked with relatively perfect scans (my goal is to be able to just use a phone). When I get home I'm definitely going to give this a go.
by jamessantiago on 9/19/14, 8:23 AM
by rikkus on 9/19/14, 8:57 AM
by mdaniel on 9/20/14, 4:33 AM
I freely admit that I do not speak Korean, but if one compares "Chinese Simplified" characters (listed as "Very good") with those in the Korean alphabet, I am surprised those two entries aren't transposed.
Is there something that makes recognizing Korean harder than Chinese Simplified, or was that just a product management decision?
by cipher0 on 9/19/14, 9:36 AM
by reallycurious on 9/19/14, 8:36 AM
by jccodez on 9/19/14, 4:25 PM
by Norm-- on 9/19/14, 6:34 PM
Now I would be more interested in an image correction library
".... Blurry images; Handwritten or cursive text; Artistic font styles; Small text size (less than 15 pixels for Western languages, or less than 20 pixels for East Asian languages); Complex backgrounds; Shadows or glare over text; Perspective distortion; Oversized or dropped capital letters at the beginnings of words; Subscript, superscript, or strikethrough text"