Former projects

  • MaCoCu (2021-2023): Massive collection and curation of monolingual and bilingual data for the under-resourced languages of the European Union. Project website.
  • MultitraiNMT (2019-2022): Machine translation training for multilingual citizens. The project aims at developing an innovative syllabus in machine translation, and in particular machine translation based on currently popular deep learning techniques, i.e., neural machine translation. Project website.
  • GoURMET (2019-2022): The aim of GoURMET is to use and improve neural machine translation for low-resource language pairs and domains. It has five objectives: (i) advance low-resource deep learning for natural language applications; (ii) develop high-quality machine translation for low-resource language pairs and domains; (iii) develop tools for media analysts and journalists; (iv) create a sustainable and maintainable platform and services; and (v) inform stakeholders and user groups of project results. It was funded by the EU under grant agreement number 825299. Project website.
  • ParaCrawl (2017-2019; 2018-2019; 2019-2021): ParaCrawl created and released large parallel corpora to and from English for all official EU languages through a broad web-crawling effort. Project website.
  • Effortune (2015-2018): The Effortune project (Optimización de la Traducción Automática Estadística Guiada por el Esfuerzo; effort-driven optimisation of statistical machine translation) explored new evaluation metrics for machine translation that correlate better with post-editing effort. Project website.
  • Abu-MaTran (2012-2016): Abu-MaTran seeks to enhance industry-academia cooperation as a key aspect to tackle one of Europe’s biggest challenges: multilinguality. Project website.
  • Ayutra (2013-2015): Using machine translation and other sources of bilingual information for computer-aided translation. Project website.
  • Succeed (2013-2014): Succeed is a support action funded by the European Union. It promotes the take-up and validation of research results in mass digitisation, with a focus on textual content. Project website.
  • Synglossia (2010-2013): Collaborative creation of linguistic and educational resources on the Web 2.0. Project website.
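  • Planteamiento, Diseño y Ejecución de los Aspectos Tecnológicos de la Fase Inicial del Plan Estratégico de la Biblioteca Virtual Miguel de Cervantes (2008-2010): planning, design and execution of the technological aspects of the initial phase of the strategic plan of the Biblioteca Virtual Miguel de Cervantes.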
  • Profit project (2007-2008): Tools for creating didactic resources in digital libraries.
  • EurOpenTrad (2007): open-source machine translation for the European integration of the languages of the Spanish state.
  • Sistemas de código abierto para la creación, mantenimiento y aprovechamiento de bibliotecas digitales, herramientas lingüísticas y educativas (2006-2009): using open-source systems for creating, maintaining and exploiting digital libraries, linguistic tools and educational tools.
  • MultiMatch (2006-2008): On the web, cultural heritage content is everywhere: in traditional environments such as libraries, museums, galleries and audiovisual archives, but also in popular magazines and newspapers, in multiple languages and multiple media. The aim of the MultiMatch project is to enable users to explore and interact with online cultural heritage content across media types and language boundaries. Project website.
  • Projecte de traducció automàtica de codi obert per al català (2006): creation of an open-source machine translation system for Catalan.
  • Informatización de material digital multimedia para el aprendizaje de español (2005-2006): computerisation of digital multimedia material for learning Spanish.
  • Traductores de estados finitos a partir de bitextos alineados recolectados en internet (2003-2006): finite-state-automata-based machine translation from aligned bitexts harvested from the Internet.
  • Traducción automática de código abierto para las lenguas del Estado español (2004-2005): open-source machine translation for the languages of the Spanish state.
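  • Informatización Nivel Avanzado C) y mejora B) Curso Español del Instituto Cervantes en Internet (2003-2004): computerisation of the advanced level (C) and improvement of level (B) of the Instituto Cervantes online Spanish course.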
  • Desarrollo de un sistema de traducción automática en Internet entre el español y el portugués (2002-2004): development of an online machine translation system between Spanish and Portuguese. Project website.
  • Desenrotllament d’un sistema d’assistència ortoèpica per a la lectura en veu alta del valencià (2002-2003): development of an orthoepic assistance system for reading Valencian aloud.
  • Herramientas para la gestión y explotación de textos estructurados en bibliotecas digitales (2001-2003): tools for managing and exploiting structured texts in digital libraries.
  • SISHITRA (2001-2004): Hybrid systems for Catalan-Spanish translation from voice/text.
  • Tecnología, educación, desarrollo e innovación (2000-2003): technology, education, development and innovation.
  • Desarrollo de un sistema de traducción automática del castellano al balear, catalán y valenciano (1998-2003): development of a machine translation system from Spanish into Balearic, Catalan and Valencian. Project website.
  • Redes neurales en entornos de comunicaciones y en interfaces persona-máquina (1997-2000): application of neural networks in communication environments and in person-machine interfaces.