Fixed to work correctly with Cyrillic symbols. #7
Hi Jake, my workaround is similar to the method () you suggested. But in my opinion, the native library should support both the Latin and Cyrillic alphabets without a workaround.
This is an interesting and good initiative.
I think a better way to do this is to update the PreprocessMode enum to accept languages; the current enum is confusing, and Full vs None does not make much sense.
Flags also make sense in case I'm working with more than one language.
I propose the following:
[Flags]
public enum PreprocessMode
{
    // Flag values must be distinct powers of two so they can be combined.
    NotSet = 0,
    English = 1,
    Russian = 2,
    Gibberish = 4
}
Then, in this method, use the pattern(s) that match the selected flags.
If PreprocessMode == 1, then pattern = "[^ a-zA-Z0-9]"; // English
If PreprocessMode == 2, then pattern = "[^ а-яА-ЯёЁ0-9]"; // Russian
If PreprocessMode == 3, then pattern = "[^ a-zA-Z0-9а-яА-ЯёЁ]"; // Both English & Russian (1 | 2)
Finally, the name PreprocessMode isn't very descriptive; something like LanguageProcessor would be a better name.
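To make the flag-based pattern selection above concrete, here is a minimal sketch. It is only an illustration, not the library's actual code: the `PatternBuilder` class and its `BuildPattern` and `Clean` methods are hypothetical names. It assumes spaces and digits are always kept, and that the Russian range should cover the full alphabet (а-я, А-Я, plus ё and Ё).

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

[Flags]
public enum PreprocessMode
{
    NotSet = 0,
    English = 1,
    Russian = 2
}

// Hypothetical helper, for illustration only.
public static class PatternBuilder
{
    // Builds a negated character class matching every character that is
    // NOT allowed by the selected languages. Spaces and digits are always allowed.
    // With NotSet, only spaces and digits survive.
    public static string BuildPattern(PreprocessMode mode)
    {
        var allowed = new StringBuilder(" 0-9");
        if (mode.HasFlag(PreprocessMode.English)) allowed.Append("a-zA-Z");
        if (mode.HasFlag(PreprocessMode.Russian)) allowed.Append("а-яА-ЯёЁ");
        return "[^" + allowed + "]";
    }

    // Removes every character that the pattern for the given mode rejects.
    public static string Clean(string input, PreprocessMode mode)
    {
        return Regex.Replace(input, BuildPattern(mode), string.Empty);
    }
}
```

With this sketch, `PatternBuilder.Clean("Hello, мир! 42", PreprocessMode.English | PreprocessMode.Russian)` returns `"Hello мир 42"`. Building the character class from the flags avoids a separate hard-coded pattern for every combination of languages.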
