ppleskov/Text-Normalization-Challenge-Russian-Language

  1. Download the Kaggle competition files from https://www.kaggle.com/c/text-normalization-challenge-russian-language/data to the /input folder.
  2. Download the additional data set from https://storage.googleapis.com/text-normalization/ru_with_types.tgz to the /input/ru_with_types folder.
  3. Download the missing files from https://drive.google.com/open?id=1eIWHqhc_HSa6IJsFXMuNsSe1eKMXukpU to the /obj folder.
  4. Run rus_base.ipynb.
  5. To get the final result, it is enough to run only part 0 (imports) and part 5 (main loop).
  6. Parts 1-4 prepare the frequency dictionaries, which are all saved as .pkl files in the /obj folder.
  7. Every cell shows its running time at the beginning.
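
Since the frequency dictionaries are stored as .pkl files in the /obj folder, it can be useful to check that they load before running the main loop. The snippet below is a minimal sketch and is not taken from rus_base.ipynb: the helper name `load_frequency_dicts` and the relative path `obj` are assumptions, and the notebook may load its files differently.

```python
import pickle
from pathlib import Path


def load_frequency_dicts(obj_dir="obj"):
    """Load every .pkl file found in obj_dir.

    Hypothetical helper: returns a dict mapping each file name
    (without the .pkl extension) to the unpickled object.
    """
    loaded = {}
    for path in sorted(Path(obj_dir).glob("*.pkl")):
        with path.open("rb") as fh:
            # pickle.load can execute arbitrary code, so only use it
            # on files from a source you trust.
            loaded[path.stem] = pickle.load(fh)
    return loaded
```

Calling `load_frequency_dicts("obj")` from the repository root should return one entry per .pkl file. An empty result suggests that the files from step 3 are not in place.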
