Arabic Noisy Text Correction using T5 Transformers
Corrects noisy Arabic text using a Transformer-based architecture (T5) and a dataset pipeline built from large-scale web scraping.
- Scraped 100,000+ Arabic news articles from Youm7
- Built an Arabic dataset pipeline
- Fine-tuned a T5 Transformer model
- Improved Arabic text normalization and correction
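
One way the dataset step could feed a text-to-text model is sketched below: clean sentences are normalized, then corrupted with synthetic character-level noise to form (noisy, clean) training pairs. This is a minimal illustration, not the project's actual pipeline. The source does not say how noise was produced or which normalizations were applied, so the function names (`normalize_arabic`, `add_noise`, `make_pair`), the noise operations, the noise rate, and the `"correct: "` task prefix are all assumptions.

```python
import random
import re

# Arabic diacritics (tashkeel, U+064B..U+0652) and tatweel/kashida (U+0640).
_DIACRITICS_AND_TATWEEL = re.compile(r"[\u064B-\u0652\u0640]")


def normalize_arabic(text: str) -> str:
    """Apply a few common, lossy normalizations (illustrative choices only)."""
    text = _DIACRITICS_AND_TATWEEL.sub("", text)
    # Unify alef with hamza above/below and alef madda to bare alef.
    text = re.sub("[\u0623\u0625\u0622]", "\u0627", text)
    # Collapse runs of whitespace into single spaces.
    return re.sub(r"\s+", " ", text).strip()


def add_noise(text: str, rate: float, rng: random.Random) -> str:
    """Delete, duplicate, or swap non-space characters with probability `rate`."""
    chars = list(text)
    out = []
    i = 0
    while i < len(chars):
        c = chars[i]
        if c != " " and rng.random() < rate:
            op = rng.choice(("delete", "duplicate", "swap"))
            if op == "delete":
                i += 1
                continue
            if op == "duplicate":
                out.extend((c, c))
                i += 1
                continue
            if op == "swap" and i + 1 < len(chars):
                out.extend((chars[i + 1], c))
                i += 2
                continue
            # A swap at the last character falls through and keeps it unchanged.
        out.append(c)
        i += 1
    return "".join(out)


def make_pair(clean: str, rate: float = 0.1, seed: int = 0) -> dict:
    """Build one training example; the "correct: " prefix is a hypothetical choice."""
    target = normalize_arabic(clean)
    noisy = add_noise(target, rate, random.Random(seed))
    return {"input": "correct: " + noisy, "target": target}


if __name__ == "__main__":
    # Sample sentence with diacritics and a double space.
    sample = "\u0623\u064E\u0647\u0652\u0644\u0627\u064B  \u0628\u0650\u0643\u064F\u0645\u0652"
    print(make_pair(sample, rate=0.2, seed=42))
```

Seeding the random generator per example keeps the generated pairs reproducible across runs. Whether synthetic noise of this kind resembles the errors found in real Arabic text is a separate question that this sketch does not address.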