CV
Basics
Name | Thennal D K |
Label | Undergraduate Student, NLP Researcher |
Email | thennal10@gmail.com |
Url | https://thennal10.github.io/ |
Education
-
2021.12 - 2025.05 *CGPA: 9.16*
Work
-
2024.05 - 2024.07 Research Intern
Language Technology Group, University of Hamburg
Conducted research on embedding models and large language model (LLM) representations.
- Developed an optimal few-shot fine-tuning regime for topic modeling.
- Devised a novel pruning procedure for LLM-based embedding models, reducing model size by 21% with negligible performance drop.
- Published in RepL4NLP 2025. Internship was supported by the DAAD WISE scholarship.
-
2023.04 - 2024.04 Machine Learning Intern
Institute of Human Resource Development
Developed and deployed machine learning models for speech recognition, speaker identification, and face recognition.
- Worked with the Kerala Police Intelligence Department, leading a team of 15 to develop comprehensive AI systems for policing.
- Trained and deployed a state-of-the-art automatic speech recognition (ASR) model for Malayalam.
- Designed and implemented a Malayalam news extraction system using web scraping and OCR.
- Developed a scalable face recognition system optimized for fast inference.
-
2022.12 - 2023.03 Data Scientist Intern
Institute of Human Resource Development
Improved web infrastructure and supported data-driven decision-making.
- Rebuilt web stack to streamline employee workflows and replace outdated systems.
- Created visualizations and reports to communicate insights to internal stakeholders and government agencies.
- Collaborated with the procurement team to optimize purchasing decisions.
Projects
- 2024.08 - 2025.04
Sparsification for Model Merging
- Investigated the effects of sparsification on four representative model merging techniques in my Honours thesis, carried out with Dr. Suchithra M S.
- Demonstrated that sparsifying delta parameters significantly improves the performance of the merged multitask model.
- Established a systematic framework for integrating pruning into model merging workflows.
- Submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processing.
- 2024.08 - 2024.10
Advocating for Character Error Rate in Automatic Speech Recognition
- Worked with Dr. Jesin James, Senior Lecturer at the University of Auckland, to document shortcomings of the commonly used word error rate (WER) metric for multilingual evaluation.
- Conducted multilingual surveys collecting human preferences among different ASR models.
- Calculated metric correlations, providing experimental evidence in favor of character error rate (CER).
- Published in NAACL Findings 2025.
- 2023.07 - 2023.10
Fisher Mask Nodes for Model Merging
- Developed a novel and compute-efficient model merging algorithm with Dr. Suchithra M S.
- Evaluated performance on various BERT family models, achieving a performance improvement of +6.5%.
- Achieved speedups between 57.4x and 321.7x.
- Published in LREC-COLING 2024.
- 2022.12 - 2023.12
Whisper Malayalam
- Trained ASR models on Malayalam speech data as part of the Hugging Face Whisper Fine-Tune Community Sprint.
- Took the medium-sized model to the top of the leaderboard, making it the state-of-the-art model for Malayalam ASR; it has over 30k total downloads.
- 2018.03 - 2022.11
ICFOSS Malayalam Speech Corpus
- Collaborated with the International Centre for Free and Open Source Software (ICFOSS) on IMaSC - The ICFOSS Malayalam Speech Corpus, a 50-hour text-to-speech dataset.
- Supervised data collection, speaker recording, and quality control.
- Trained and evaluated multiple models, achieving an average mean opinion score (MOS) of 4.50.
- 2018.05 - 2019.02
Data Augmentation for Automatic Voice Disorder Detection
- Worked with Dr. Vrinda V Nair, Dean (Research) at APJ Abdul Kalam Technological University, to evaluate data augmentation techniques for automatic voice disorder detection, focusing on leukoplakia.
- Developed a custom data augmentation strategy that increased dataset size by 8x and achieved a 46.9% increase in accuracy.
- Published in the peer-reviewed journal Biomedical Engineering: Applications, Basis and Communications.
Awards
- 2024
DAAD WISE Scholarship
German Academic Exchange Service
Publications
-
2025.04.01 Large Language Models Are Overparameterized Text Encoders
Proceedings of the 10th Workshop on Representation Learning for NLP (RepL4NLP-2025)
-
2025.04.01 Advocating Character Error Rate for Multilingual ASR Evaluation
Findings of the Association for Computational Linguistics: NAACL 2025
-
2024.05.01 Fisher Mask Nodes for Language Model Merging
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
-
2022.11.01 IMaSC -- ICFOSS Malayalam Speech Corpus
arXiv Preprint
-
2022.09.01 Performance Enhancement of Deep Neural Network Based Automatic Voice Disorder Detection System with Data Augmentation — A Case Study
Biomedical Engineering: Applications, Basis and Communications, Vol. 35
-
2019.11.01 Memory Based Speech Duration Model using Exemplar Theoretic Approach
International Conference on Artificial Intelligence & Speech Technologies (AIST 2019)
Skills
Programming Languages | Python, HTML/CSS, JavaScript, SQL, C/C++, C#, Java |
Frameworks | PyTorch, TensorFlow, Scikit-learn, Vue.js, React, Unity |
Miscellaneous | Linux, Shell (Bash/Zsh), LaTeX, Git, Docker, PostgreSQL |