
How to train it on other languages? #7

Open
TUNA-NOPE opened this issue May 27, 2023 · 2 comments

Comments

@TUNA-NOPE

Hello 👋 I would like to know if it is possible to train a model on other languages, like Hebrew. If you can help me with that, I will be very happy 😊 Thanks 🙏

@X-rayLaser
Owner

Yes, I think it should be possible. Securing a large dataset remains the primary challenge. Your dataset must contain diverse and well-structured handwriting samples represented as stroke sequences, not just pictures of written text. Follow the guidelines in the Readme to train your own model on appropriate data. For details on the required data structure, consult the IAM Online Handwriting Database, and for the precise representation of handwriting samples expected here, refer to the Readme section titled "Implementing Custom Data Provider."
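For illustration only, an "online" handwriting sample in this kind of stroke representation is essentially a list of strokes, each stroke being the sequence of (x, y) points recorded while the pen stays on the surface. The exact field names and structure this project expects are described in the "Implementing Custom Data Provider" section; the snippet below is just a sketch of the general idea:

```python
# Purely illustrative: one handwriting sample as a list of strokes,
# where each stroke is a sequence of (x, y) pen positions.
# The exact structure expected by this repo is documented in the Readme
# ("Implementing Custom Data Provider") and follows the IAM-OnDB style.
sample_strokes = [
    [(10, 52), (12, 50), (15, 47), (19, 45)],  # first stroke
    [(22, 60), (22, 55), (23, 49)],            # second stroke
]
transcription = "שלום"  # the text the strokes spell out (Hebrew example)
```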

@X-rayLaser
Owner

If you already have data in the necessary representation, implementing a custom data provider class should suffice. Essentially, this class must have two methods, get_training_data and get_validation_data. These methods are Python generators that yield (handwriting, transcription) pairs. You are free to design the implementation however you prefer. Once your data provider class is complete, you can use the provided Python scripts for the remaining steps.
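A minimal sketch of such a provider, assuming only what is stated above (two generator methods yielding (handwriting, transcription) pairs). The class name, the JSON file layout, and the train/validation split are hypothetical choices for this example, not part of the repository's API:

```python
import json
import os


class HebrewDataProvider:
    """Hypothetical provider yielding (handwriting, transcription) pairs.

    Assumes each sample is a JSON file containing a list of strokes under
    "strokes" and the ground-truth text under "text". The file layout and
    the 90/10 split are assumptions made for this sketch.
    """

    def __init__(self, data_dir, train_fraction=0.9):
        paths = sorted(
            os.path.join(data_dir, name)
            for name in os.listdir(data_dir)
            if name.endswith(".json")
        )
        split = int(len(paths) * train_fraction)
        self._train_paths = paths[:split]
        self._val_paths = paths[split:]

    def _iterate(self, paths):
        for path in paths:
            with open(path, encoding="utf-8") as f:
                sample = json.load(f)
            # sample["strokes"]: list of strokes, each a list of (x, y) points
            # sample["text"]: the corresponding transcription
            yield sample["strokes"], sample["text"]

    def get_training_data(self):
        yield from self._iterate(self._train_paths)

    def get_validation_data(self):
        yield from self._iterate(self._val_paths)
```

With a provider like this in place, the training scripts shipped with the repository should be able to consume the yielded pairs; check the Readme for how the provider class is registered and passed to those scripts.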
