The main objectives of LiLowLa are: to develop a smart crawling method able to prioritize the most productive websites; to develop data augmentation techniques for training neural machine translation (NMT) systems for low-resource languages; to devise a method for distilling the translation knowledge encoded in large pre-trained models; to enable translation memory-based computer-aided translation tools to exploit target-language monolingual corpora; and to deepen the understanding of how NMT systems behave at prediction time and during training.
The kick-off meeting of the European project GoURMET was held at the Universitat d’Alacant on January 22-23. The Transducens Research Group is one of the partners of the project “GoURMET: Global Under-Resourced MEdia Translation”, which focuses on building machine translation systems to translate global news into under-resourced languages.
The project will last for three years. The consortium consists of the University of Edinburgh (coordinator), the Universitat d’Alacant, the University of Amsterdam, the British Broadcasting Corporation (BBC), and Deutsche Welle (DW). The project is funded in the framework of the European initiative Research & Innovation Action H2020-ICT-2029.
The 18-month ParaCrawl project (Action entitled “Provision of Web-Scale Parallel Corpora for Official European Languages”, Action No 2016-EU-IA-0114) started on September 15, 2017. The Transducens group is one of the members of the consortium, together with the University of Edinburgh (coordinator), TAUS, Prompsit and Johns Hopkins University (subcontractor).
ParaCrawl will create parallel corpora to/from English for all official EU languages through a broad web crawling effort. State-of-the-art methods will be applied to the entire processing chain, from identifying web sites with translated text all the way to collecting, cleaning and delivering parallel corpora that are ready to be used as training data for CEF.AT and as translation memories for DG Translation. The project will also make the consortium partners’ open-source tools available to CEF Automated Translation and all other interested parties.
The free/open-source machine translation platform Apertium, originally created by the Transducens group, has once again been selected as one of the projects supported by Google in their Google Summer of Code program. Students from around the world have applied for one of the 10 projects granted to collaborate with this free/open-source machine translation platform. The complete list of ideas submitted to this edition of Google Summer of Code can be checked here. Chosen projects can be checked at https://summerofcode.withgoogle.com/archive/2017/organizations/6618812501721088/#projects.
The Transducens group is proud to announce that EAMT 2018, the 21st Annual Conference of the European Association for Machine Translation, will take place in Alacant, from May 28 to May 30, 2018. Calls for research and user papers are available at the conference website: http://eamt2018.dlsi.ua.es/
Prof. Mikel L. Forcada, member of the Transducens group, was elected as the next President of the European Association for Machine Translation at the last General Assembly, held in Antalya (Turkey) on May 12, 2015. Prof. Forcada will hold office for the next two years.
The group led by Pedro A. Pernías Peco, one of the members of our group, has received one of the MOOC Focused Research Awards organised by Google. Their work, “Improvement of students’ interaction in MOOCs using participative networks”, is one of the 7 projects awarded, which will be supported by Google for their development. This project will focus on studying the involvement of MOOC students in the platform UniMOOC.
The free/open-source machine translation platform Apertium, originally created by the Transducens group, has been selected again as one of the projects supported by Google in their Google Summer of Code program. Students from around the world can now apply for one of the 5,500 USD grants to work for three months on one of the ideas proposed for the project (check the list of ideas here). Proposals and new ideas can be discussed with the mentors in the IRC channel #apertium on freenode.net or through the mailing list firstname.lastname@example.org. The next steps will be:
From February 24 to March 10: discussion of the ideas with project developers.
From March 10 to March 21: submission of proposals.
Interested students can contact Mikel Forcada.
The Apertium project, a free/open-source rule-based machine translation platform in which the Transducens group has been strongly involved since its inception, is, for the fourth year in a row, one of the 10 free/open-source organizations selected by Google for the Google Code-In.
Google Code-In is a contest to introduce pre-university students (aged 13 to 17) to free/open-source software development. Students from all around the world can participate by tackling small tasks, which may include code writing, debugging, documentation, production of training material, etc. For every three tasks completed, students get a Google Code-In T-shirt. Each participating organization will select two winners of the Grand Prize: a trip to the Google headquarters for the student and a parent or tutor.
The Apertium project has proposed a wide variety of tasks, including the creation of documentation to help users and developers, the development of dictionaries and rules for new or existing languages, the development of programs to transform other existing free/open-source resources into Apertium format, the creation of extensions to ease the use of Apertium from third-party software, etc. There are also debugging and quality assessment tasks, and tasks in which texts are annotated so that they can be used to test and train Apertium modules.
The first students have already come by the Apertium IRC channel (#apertium at irc.freenode.net) to ask about the different tasks, even though the contest does not officially start until November 16, and mentors (approximately 20 for Apertium this year) have already started to guide them.
If you are a pre-university student interested in contributing to the development of our free/open-source machine translation system, come by and participate!