Paper 'AfriMed-QA' won the Best Social Impact Award at ACL 2025. New preprint 'FLEXITOKENS: Flexible Tokenization for Evolving Language Models' to be presented at the ICML 2025 tokenization workshop. 'The NaijaVoices Dataset: Cultivating Large-Scale, High-Quality, Culturally-Rich Speech Data for African Languages' accepted to Interspeech 2025. 'AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset' accepted to ACL Main 2025. Served as Publication Chair for the MRL 2024 Workshop at EMNLP.
Research Experience
Formerly a researcher at Intron, building language and speech models for African languages and accents (including MENA). Also involved in collaborative initiatives such as Masakhane and ML Collective.
Education
Ph.D. student in Computer Science and Engineering at The Ohio State University, advised by Prof. Sachin Kumar.
Background
Research interests include efficient multilingual representation learning and understanding how knowledge can be transferred from one language model to another. Passionate about mentoring aspiring AI researchers and building open-science ML communities.
Miscellany
Hobbies include playing pool, lawn tennis, table tennis (still working on improving skills!), reading books, and listening to podcasts. Podcast recommendation: 'How To Take Over The World'.