Mete,
I think your solution is lacking one thing: we can't tell where the alef maksura is. Other than that, I don't have any problem. BTW, why is it important to have normalization?
Hi again,
Maybe I'm not understanding Mete, but I don't see how this could work at all. Aside from the semantics I've mentioned in another post, Farsi yeh takes dots in initial and medial forms, no? So how can it be the seat of a hamza or a small alef in those contexts?
My recommendation is to convert all yehs - alef maqsuras, yeh seats of hamza, yeh seats of small alef, regular yehs, final dotless yehs - to Farsi yeh. Searching is no problem. Here is the algorithm:
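
(The algorithm itself did not come through with this message. What follows is not that algorithm, only a rough Python sketch of the conversion step described above, under a few assumptions of mine: "regular yeh" is U+064A, alef maksura is U+0649, Farsi yeh is U+06CC, a yeh seat of hamza is the precomposed U+0626, and a small alef is the combining mark U+0670. The function names `normalize_yehs` and `search_key` are made up for illustration.)

```python
# Assumed code points (standard Unicode Arabic block).
FARSI_YEH = "\u06CC"         # ARABIC LETTER FARSI YEH
HAMZA_ABOVE = "\u0654"       # ARABIC HAMZA ABOVE (combining)
SUPERSCRIPT_ALEF = "\u0670"  # ARABIC LETTER SUPERSCRIPT ALEF (combining)

# Every yeh-like base letter is mapped to Farsi yeh.
# Assumption: a hamza carried by the yeh is kept as a combining mark.
_YEH_TABLE = str.maketrans({
    "\u064A": FARSI_YEH,                # ARABIC LETTER YEH
    "\u0649": FARSI_YEH,                # ARABIC LETTER ALEF MAKSURA
    "\u0626": FARSI_YEH + HAMZA_ABOVE,  # ARABIC LETTER YEH WITH HAMZA ABOVE
})


def normalize_yehs(text: str) -> str:
    """Replace all yeh variants with Farsi yeh (hypothetical helper)."""
    return text.translate(_YEH_TABLE)


def search_key(text: str) -> str:
    """Fold text for searching: normalize yehs, then drop the hamza and
    small-alef marks so that all yeh forms compare equal (hypothetical)."""
    folded = normalize_yehs(text)
    return folded.replace(HAMZA_ABOVE, "").replace(SUPERSCRIPT_ALEF, "")
```

Note that this mapping is lossy: after `normalize_yehs`, an original alef maksura can no longer be told apart from a regular yeh, which is the concern raised earlier in the thread.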
-gregg