Previously, I was using very detailed circular projections (hundreds of directions), which led to false matches because I had to keep the error threshold high…

Using much smaller projections (20 to 30 projection angles) requires a very low rejection threshold, but gives amazingly good results.
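As a rough illustration of the idea (a sketch under assumptions, not the actual implementation), the Python below treats a glyph as a list of `(x, y)` ink pixels and assumes that a "circular projection" is a histogram of ink pixels by direction around the glyph's centroid. The function names, the error measure (sum of absolute differences) and the threshold value are all hypothetical, chosen only for the example.

```python
import math

def circular_projection(pixels, n_angles=24):
    """Fraction of ink pixels falling in each of n_angles directions
    around the centroid (assumed definition of a circular projection)."""
    bins = [0] * n_angles
    if not pixels:
        return [0.0] * n_angles
    cx = sum(x for x, _ in pixels) / len(pixels)
    cy = sum(y for _, y in pixels) / len(pixels)
    counted = 0
    for x, y in pixels:
        dx, dy = x - cx, y - cy
        if dx == 0 and dy == 0:
            continue  # a pixel exactly on the centroid has no direction
        angle = math.atan2(dy, dx) % (2 * math.pi)
        index = int(angle / (2 * math.pi) * n_angles) % n_angles
        bins[index] += 1
        counted += 1
    if counted == 0:
        return [0.0] * n_angles
    return [b / counted for b in bins]

def projection_error(a, b):
    """Sum of absolute differences between two projections."""
    return sum(abs(p - q) for p, q in zip(a, b))

def same_glyph(a, b, reject_threshold=0.1):
    """Accept the match only if the error stays under the (hypothetical) threshold."""
    return projection_error(a, b) <= reject_threshold

# A crude "h" and the same shape found elsewhere on the page.
h = [(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (2, 3), (2, 4)]
h_shifted = [(x + 10, y + 7) for x, y in h]
bar = [(x, 0) for x in range(8)]

print(same_glyph(circular_projection(h), circular_projection(h_shifted)))
print(same_glyph(circular_projection(h), circular_projection(bar)))
```

With few angles each bin collects more pixels, so small pixel-level differences tend to average out, which is one plausible reason a low rejection threshold becomes usable.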

Image 5.png (all the “h” were correctly gathered)

My main concern now is segmenting the page into lines and characters. In many cases, the method I use right now fails to separate letters that touch each other.
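To show why touching letters are a problem, here is a minimal sketch of a common baseline, cutting a text line wherever a pixel column contains no ink. This is an assumption made for illustration and not necessarily the method used here; `split_on_blank_columns` and the `'#'`/`'.'` bitmap format are made up for the example.

```python
def split_on_blank_columns(rows):
    """Split a text-line bitmap into character spans at ink-free columns.

    rows: list of equal-length strings, '#' = ink, anything else = background.
    Returns a list of (start, end) column spans, end exclusive.
    """
    if not rows:
        return []
    width = len(rows[0])
    has_ink = [any(row[c] == '#' for row in rows) for c in range(width)]
    segments = []
    start = None
    for c, ink in enumerate(has_ink):
        if ink and start is None:
            start = c                      # a new segment begins
        elif not ink and start is not None:
            segments.append((start, c))    # a blank column ends the segment
            start = None
    if start is not None:
        segments.append((start, width))    # segment runs to the right edge
    return segments

separated = ["#.#",
             "#.#"]
touching = ["#.#",
            "###"]

print(split_on_blank_columns(separated))  # two spans
print(split_on_blank_columns(touching))   # one span: the letters are merged
```

As soon as a single pixel bridges two letters, there is no blank column left between them and they come out as one segment.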

I need to be able to detect words, and to make it possible to request incremental resegmentation of a given segment.

I’d like to connect my existing code to some new higher-level code that would be vocabulary- and grammar-aware, so that when a word contains an unmatched segment/character, the likelihood of a word match would trigger resegmentation of that segment/character.
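A very rough sketch of how the vocabulary side of such a trigger could look, purely as an illustration of the idea. It assumes a word is a list of recognized letters with `None` for each unmatched segment, and that every unmatched segment hides the same number of letters (a simplification). The names `candidates` and `suggest_split` and the toy vocabulary are hypothetical, and grammar is not handled at all.

```python
import re

def candidates(segments, vocabulary, width):
    """Vocabulary words that fit the recognized letters when each
    unmatched segment (None) stands for `width` unknown letters."""
    pattern = "".join(
        re.escape(s) if s is not None else "." * width for s in segments
    )
    regex = re.compile(pattern)
    return [word for word in vocabulary if regex.fullmatch(word)]

def suggest_split(segments, vocabulary, max_letters=3):
    """Smallest number of letters per unmatched segment that yields a
    vocabulary word, or None if nothing fits. A result greater than 1
    means the segment should be resegmented into that many pieces."""
    for width in range(1, max_letters + 1):
        if candidates(segments, vocabulary, width):
            return width
    return None

vocabulary = ["the", "then", "tune"]

print(suggest_split(["t", None, "e"], vocabulary))  # fits as a single letter
print(suggest_split(["t", None, "n"], vocabulary))  # only fits if split in two
```

The higher-level code would then ask the segmenter to cut the unmatched segment again only when no word fits it as a single character.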

Stay tuned…