Ucharan-ai is an open-source project focused on building a Text-to-Speech (TTS) system for the Nepali language. By converting written Nepali text into natural and intelligible speech, Ucharan-ai aims to make the Nepali language more accessible and interactive in the digital world.
- Natural-Sounding Nepali Speech: Generates high-quality speech output for Nepali text.
- Customizable Voices: Multiple voice tones and styles for different use cases.
- Text Preprocessing: Handles Nepali-specific text rules such as sandhi (sound changes where words or morphemes join) and samasa (compound formation).
- User-Friendly API: Easy integration into applications, websites, and devices.
- Open-Source: Built for the community to innovate and contribute.
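As a rough illustration of the text preprocessing step listed above, here is a minimal Python sketch of the kind of normalization a Nepali TTS front end might run before synthesis. It is not taken from the Ucharan-ai codebase: the function names `normalize_nepali` and `split_sentences` are hypothetical, and the choice to map Devanagari digits to ASCII (so that a later number-expansion step sees one digit form) is an assumption. Sandhi and samasa handling is not covered here.

```python
import re
import unicodedata
from typing import List

# Translation table from Devanagari digits to ASCII digits.
# Assumption: a later stage expands numbers to words and expects one digit form.
DEVANAGARI_DIGITS = str.maketrans("०१२३४५६७८९", "0123456789")


def normalize_nepali(text: str) -> str:
    """Hypothetical helper: bring Nepali text into a consistent form."""
    # NFC gives equivalent character sequences a single canonical encoding.
    text = unicodedata.normalize("NFC", text)
    # Unify digits so downstream code handles only ASCII digits.
    text = text.translate(DEVANAGARI_DIGITS)
    # Collapse runs of whitespace and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text: str) -> List[str]:
    """Hypothetical helper: split after the danda (।), '?' or '!'."""
    parts = re.split(r"(?<=[।?!])\s+", text)
    return [p.strip() for p in parts if p.strip()]


if __name__ == "__main__":
    raw = "  नमस्ते।   तपाईंलाई   कस्तो छ?  "
    for sentence in split_sentences(normalize_nepali(raw)):
        print(sentence)
```

Splitting on the danda keeps each synthesis request short, which tends to matter for models trained on sentence-length utterances; a real front end would also need number, date, and abbreviation expansion.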
Languages like Nepali are often underrepresented in modern AI systems. Ucharan-ai seeks to bridge this gap by providing:
- Accessibility: Empowering visually impaired users with Nepali audio content.
- Educational Tools: Enhancing e-learning platforms with text-to-speech capabilities.
- Language Preservation: Promoting the use of Nepali in the digital era.
- Localized Solutions: Enabling Nepali-specific voice assistants and applications.
We use Nepali speech data from the OpenSLR repository as the foundation for building our TTS system. Our work is inspired by the following research:
@inproceedings{dhakal2022automatic,
  title={Automatic speech recognition for the Nepali language using CNN, bidirectional LSTM and ResNet},
  author={Dhakal, Manish and Chhetri, Arman and Gupta, Aman Kumar and Lamichhane, Prabin and Pandey, Suraj and Shakya, Subarna},
  booktitle={2022 International Conference on Inventive Computation Technologies (ICICT)},
  pages={515--521},
  year={2022},
  organization={IEEE}
}
We welcome contributions from the community! Follow these steps to get started:
- Fork the repository.
- Create a feature branch: `git checkout -b feature-name`
- Commit your changes: `git commit -m "Description of changes"`
- Push to your branch: `git push origin feature-name`
- Open a pull request.
Refer to CONTRIBUTING.md for detailed guidelines.
- Initial idea and planning phase.
- Research and define Nepali-specific TTS requirements.
- Develop a basic TTS model for Nepali text.
- Support for multiple voices.
- Web-based interface for real-time TTS.
- Mobile app integration.
If you have any questions or suggestions, feel free to reach out:
- Email: [email protected]
This project is licensed under the MIT License. You are free to use, modify, and distribute this software.