About Me

I am a Costa Rican PhD researcher at CNRS and Université de Lorraine, focused on generative AI and multilingual natural language processing. My work centers on making language technologies more inclusive, with an emphasis on multilingual text generation and evaluation. I have published at major conferences such as ACL, INLG, and IJCNLP-AACL. Through those publications I have contributed open-source tools and models to the NLP community, including work on Hierarchical QLoRA Training (HQL) and reference-less semantic evaluation for low-resource languages.

My journey started early, from coding as a kid to studying both computer science and languages. I’ve always been drawn to the intersection of language, technology, and communication, and I care deeply about democratizing AI for more people and languages. Alongside research, I enjoy teaching and mentoring, whether guiding students in programming and data science or volunteering to help young people broaden their horizons. When I’m not working, you will likely find me reading, studying new languages, playing board games, or stargazing. I value collaboration, flexibility, and the patient pursuit of meaningful progress, and I’m always eager to join teams driven by good ideas and positive impact.

Publications

  • Semantic Evaluation of Multilingual Data-to-Text Generation via NLI Fine-Tuning: Precision, Recall and F1 scores
    William Soto Martinez, Yannick Parmentier, and Claire Gardent
    (to appear, ACL 2025)

  • Generating from AMRs into High and Low-Resource Languages using Phylogenetic Knowledge and Hierarchical QLoRA Training (HQL)
    William Soto Martinez, Yannick Parmentier, and Claire Gardent
    (INLG 2024)

  • Phylogeny-Inspired Soft Prompts For Data-to-Text Generation in Low-Resource Languages
    William Soto Martinez, Yannick Parmentier, and Claire Gardent
    (IJCNLP-AACL 2023)

  • Language Identification of Guadeloupean Creole
    William Soto Martinez
    (LIFT 2020)

  • X-ParEval: A Multilingual Metric for Paraphrase Evaluation
    William Soto Martinez
    (Master's Thesis)

Collaborations

  • The 2023 WebNLG Shared Task on Low Resource Languages. Overview and Evaluation Results (WebNLG 2023)
    Cripwell et al.
    (MMNLG 2023)

  • NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
    Dhole et al.
    (NEJLT 2023)

Teaching

  • 2017-2019
    Universidad de Costa Rica

    Teaching Assistant
    B.Sc. in Computation and Informatics
    CI 1312 - Databases I
    CI 1322 - Automata and Compilers
    CI 1441 - Computational Paradigms

  • 2021-2025
    Université de Lorraine

    Teaching Mission
    M.Sc. in Natural Language Processing
    UE 701 - Python Programming
    UE 704 - Methods for NLP
    UE 803 - Data Science