Handwritten Equation Recognition using CNNs, Vision Transformers, and Seq2Seq Models
One-line summary: Built a multi-stage pipeline to convert handwritten equations into LaTeX using visual recognition and sequence generation models.
Key Results
- ViT-Base reached 72.87% symbol-recognition accuracy.
- Seq2Seq model achieved 66.19% exact match on full LaTeX expressions.
- BLEU score of 0.8428 on generated LaTeX.
What I Built
- Pipeline for equation image preprocessing and symbol recognition.
- Model comparison across CNN and ViT backbones.
- Seq2Seq generation step to produce full LaTeX expressions.
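The symbol-recognition stage depends on first isolating individual symbols. A minimal sketch of that segmentation step is a connected-components pass over a binarized image; the function name, 8-connectivity choice, and toy image below are illustrative assumptions, not the project's exact implementation.

```python
from collections import deque

def segment_symbols(binary_img):
    """Connected-components pass over a binary image (list of rows,
    1 = ink), returning symbol bounding boxes (x0, y0, x1, y1)
    sorted left to right -- one plausible segmentation step."""
    h, w = len(binary_img), len(binary_img[0])
    visited = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary_img[y][x] and not visited[y][x]:
                # BFS flood fill over 8-connected neighbours
                q = deque([(y, x)])
                visited[y][x] = True
                y0 = y1 = y
                x0 = x1 = x
                while q:
                    cy, cx = q.popleft()
                    y0, y1 = min(y0, cy), max(y1, cy)
                    x0, x1 = min(x0, cx), max(x1, cx)
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and binary_img[ny][nx]
                                    and not visited[ny][nx]):
                                visited[ny][nx] = True
                                q.append((ny, nx))
                boxes.append((x0, y0, x1, y1))
    return sorted(boxes)  # left-to-right reading order

# Toy image containing two separated strokes
img = [
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
print(segment_symbols(img))  # [(1, 0, 2, 1), (6, 1, 6, 3)]
```

Each box can then be cropped and size-normalized before being fed to the CNN or ViT classifier.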
Technical Approach
- Segmented symbols and normalized handwritten inputs for robust recognition.
- Benchmarked CNN and ViT model families on symbol-recognition accuracy.
- Integrated sequence modeling for end-to-end transcription.
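The sequence-modeling step above can be illustrated with a greedy decoding loop that emits LaTeX one token at a time. `toy_step` and the target expression are hypothetical stand-ins for a trained decoder; a real Seq2Seq model would condition its scores on encoder features of the equation image.

```python
def decode_greedy(step_fn, max_len=20):
    """Generate a LaTeX token sequence token by token, always taking
    the highest-scoring next token until <eos> (greedy decoding)."""
    tokens = ["<sos>"]
    for _ in range(max_len):
        scores = step_fn(tokens)           # dict: candidate token -> score
        nxt = max(scores, key=scores.get)  # greedy choice
        if nxt == "<eos>":
            break
        tokens.append(nxt)
    return tokens[1:]  # drop the <sos> marker

# Hypothetical stand-in for a trained decoder: deterministically
# spells out one target expression, then signals <eos>.
TARGET = ["\\frac", "{", "x", "}", "{", "2", "}"]
def toy_step(tokens):
    i = len(tokens) - 1
    nxt = TARGET[i] if i < len(TARGET) else "<eos>"
    return {t: (1.0 if t == nxt else 0.0) for t in set(TARGET) | {"<eos>"}}

print("".join(decode_greedy(toy_step)))  # \frac{x}{2}
```

Beam search is a common drop-in replacement for the greedy choice when exact-match accuracy matters.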
Key Insight
Combining visual recognition with sequence modeling outperformed simpler OCR-style approaches for handwritten math transcription.
Tools / Models Used
Python, PyTorch, CNNs, Vision Transformers (ViT), Seq2Seq models, BLEU evaluation.
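For context on the BLEU figure, here is a minimal single-reference sketch of a BLEU-style score over LaTeX token sequences (clipped n-gram precisions combined with a brevity penalty). This is a simplified variant for illustration, not necessarily the exact evaluation script used in the project.

```python
import math
from collections import Counter

def bleu(reference, candidate, max_n=4):
    """Single-reference BLEU on token lists: geometric mean of clipped
    n-gram precisions (n = 1..max_n) times a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
        cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
        overlap = sum(min(c, ref[g]) for g, c in cand.items())  # clipped matches
        precisions.append(overlap / max(sum(cand.values()), 1))
    if min(precisions) == 0:
        return 0.0
    geo = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty: punish candidates shorter than the reference
    bp = 1.0 if len(candidate) > len(reference) else math.exp(1 - len(reference) / max(len(candidate), 1))
    return bp * geo

ref = ["\\frac", "{", "x", "}", "{", "2", "}"]
hyp = ["\\frac", "{", "x", "}", "{", "2", "}"]
print(round(bleu(ref, hyp), 4))  # 1.0
```

Tokenizing LaTeX at the command/brace level (rather than by character) keeps the n-gram statistics meaningful.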