On Mon, Nov 11, 2002 at 01:20:11AM -0800, sara mraish wrote:
> Thanks a lot,
>
> So to answer my questions you mentioned we can normalize an alef_maksura
> with a yeh and a teh_marbuta with a heh?

Yes. This is not something I came up with; it is what all the papers I have
seen on the subject say. If you like, I can email you privately a couple of
papers that you may find useful.

> I see teh marbuta online as a teh marbuta; I don't see the dots removed and
> it's a heh only if the word ends with a heh, especially when we have a
> masculine word. I see teh marbuta as a teh marbuta. I am just having a hard
> time convincing myself that we can remove the dots from the teh marbuta
> because if the word is feminine then it should have the two dots on top of
> the heh to make the word feminine. Therefore, you are saying that we can
> remove the dots from the teh marbuta and make it a heh for indexing and
> usage in IR applications for Arabic language.
>

Of course, but you have to consider at least two things:

1. You don't want to limit your search too much.
2. The user is less likely to be keen on following such a rule (evident in
   our handwriting: when we write on paper, we often omit the dots).

Then again, you make that choice yourself ;)

-- 
------------------------------------------------------
| Mohammed Elzubeir    | Visit us at:                |
|                      | http://www.arabeyes.org/    |
| Arabeyes Project     | Homepage:                   |
| Unix the 'right' way | http://fakkir.net/~elzubeir/|
------------------------------------------------------
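
For illustration only, the folding discussed in this message (alef maksura to yeh, teh marbuta to heh) could be sketched in Python roughly as below. The function name `normalize_arabic` and the constant names are hypothetical and do not come from the thread or from any particular library; the idea is simply that the same mapping is applied to both indexed text and user queries, so a query typed without the dots still matches.

```python
# Minimal sketch, assuming normalization is applied to both the indexed
# text and the query. All names here are hypothetical, not from the thread.
ALEF_MAKSURA = "\u0649"  # ARABIC LETTER ALEF MAKSURA
YEH = "\u064A"           # ARABIC LETTER YEH
TEH_MARBUTA = "\u0629"   # ARABIC LETTER TEH MARBUTA
HEH = "\u0647"           # ARABIC LETTER HEH

# Translation table that folds each variant letter onto its base form.
_FOLD = str.maketrans({ALEF_MAKSURA: YEH, TEH_MARBUTA: HEH})


def normalize_arabic(text: str) -> str:
    """Fold alef maksura to yeh and teh marbuta to heh; leave the rest alone."""
    return text.translate(_FOLD)
```

A search front end would call `normalize_arabic` on each document term at indexing time and again on the query string. Whether to fold at all remains the trade-off described above: a looser match finds more, at the cost of conflating words that differ only in these letters.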