First Shared Task on Multilingual Easy-to-Read Translation

Advancing methods for producing easy-to-read versions of texts, with a focus on Catalan, Italian, and Spanish (plus a surprise language).

  • Languages: CA · IT · ES (+ surprise)
  • Submissions: up to 3 per language
  • Data: iDEM corpus (JSON)

Supported by: Horizon Europe iDEM project

Overview

Why Easy-to-Read?

Accessible language supports participation for people with language comprehension difficulties (e.g., intellectual disabilities, low literacy), aligning with accessibility and inclusion goals.

What’s new here

MER-TRANS is a multilingual shared task targeting Romance languages (Catalan, Italian, Spanish) and introducing multilingual easy-to-read translation at shared-task scale.

Task

Objective

Automatically produce easy-to-read versions of texts or sentences. Inputs are complex excerpts; outputs should be simplified, readable, and meaning-preserving. A minimal baseline sketch follows the list below.

  • Primary languages: Catalan, Italian, Spanish
  • Surprise task language: disclosed closer to test release
  • Max submissions: up to 3 runs per language per team
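As a starting point, the sketch below shows one way a seq2seq baseline could be wired up with the Hugging Face transformers pipeline. The model id is a placeholder, not an official baseline: any multilingual model fine-tuned or prompted for simplification could be substituted.

```python
from transformers import pipeline

# Minimal baseline sketch. "google/mt5-small" is a PLACEHOLDER model id:
# it would need fine-tuning on simplification data before producing
# useful easy-to-read output.
simplifier = pipeline("text2text-generation", model="google/mt5-small")

complex_sentence = (
    "La participación ciudadana en los procesos deliberativos "
    "es un pilar de la democracia."
)
result = simplifier(complex_sentence, max_new_tokens=64)
print(result[0]["generated_text"])
```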

Scope

Texts come from a domain-focused corpus (democratic participation) simplified by experts following easy-to-read recommendations and validation procedures.

Tip: design systems that generalize—avoid overfitting to a single dataset style.

Data & Resources

Corpus: iDEM (E2R)

  • Original + simplified versions aligned at sentence level
  • Not parallel across languages (but each language has original↔simplified pairs)
  • Authentic variation; multiple text types (informative, news, policy, etc.)
  • Format: JSON, one file per language with metadata (see the loading sketch below)
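
As a rough illustration, the snippet below shows how a per-language JSON file might be loaded. The file name and the field names (`original`, `simplified`) are hypothetical; consult the released files for the actual schema.

```python
import json

# Hypothetical loading sketch: the actual keys and layout are defined by
# the released iDEM JSON files; "original"/"simplified" are ASSUMED names
# used here for illustration only.
with open("idem_es.json", encoding="utf-8") as f:
    records = json.load(f)

for record in records[:3]:
    print("complex:   ", record["original"])
    print("simplified:", record["simplified"])
```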

Training data policy

No task-specific training set is released. Teams may use existing simplification/adaptation resources (including cross-lingual augmentation).

Examples of relevant external datasets:

Trial & Test

  • Trial: small set with complex excerpts + reference easy-to-read versions
  • Test: complex excerpts only (systems generate simplified outputs)

Evaluation

Surface similarity

BLEU — compares system output to reference simplifications.
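For example, corpus-level BLEU can be computed with the sacrebleu package (a sketch with toy data; the official scoring setup may differ):

```python
import sacrebleu

# Toy data: one system output and one reference stream.
hypotheses = ["El texto es fácil de leer."]
references = [["Este texto es fácil de leer."]]  # list of reference streams

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")
```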

Simplification-focused

SARI — measures add/keep/delete operations vs input and references.
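SARI needs the source sentence as well as the references. One common implementation is the `sari` metric in Hugging Face `evaluate` (again a sketch; the task's exact references and tokenization may differ):

```python
import evaluate

sari = evaluate.load("sari")
result = sari.compute(
    sources=["La participación ciudadana es un pilar de la democracia."],
    predictions=["La participación de la gente es importante para la democracia."],
    references=[[
        "Participar es importante para la democracia.",
        "La participación de las personas es importante en democracia.",
    ]],
)
print(result["sari"])
```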

Semantic similarity

BERTScore and MeaningBERT — meaning preservation signals.
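BERTScore can be computed with the bert-score package; MeaningBERT is distributed via the Hugging Face Hub and is used in a similar reference-vs-candidate fashion. A sketch with toy data:

```python
from bert_score import score

candidates = ["La participación de la gente es importante para la democracia."]
references = ["Participar es importante para la democracia."]

# lang="es" selects a default multilingual model for Spanish;
# the official evaluation may pin a different model.
P, R, F1 = score(candidates, references, lang="es")
print(f"BERTScore F1 = {F1.mean().item():.4f}")
```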

Readability / Complexity

Readability metrics and complexity classifiers may complement the core metrics to assess accessibility.
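As an illustration of the kind of readability signal involved, the sketch below computes the Fernández-Huerta index, a Flesch adaptation for Spanish (higher scores mean easier text). The syllable counter is a deliberately crude vowel-group approximation, so treat the numbers as indicative only:

```python
import re

def count_syllables_es(word: str) -> int:
    # Crude approximation: each run of vowels counts as one syllable nucleus.
    return max(1, len(re.findall(r"[aeiouáéíóúü]+", word.lower())))

def fernandez_huerta(text: str) -> float:
    # Fernández-Huerta index: 206.84 - 0.60*P - 1.02*F, where
    # P = syllables per 100 words and F = sentences per 100 words.
    sentences = max(1, len(re.findall(r"[.!?…]+", text)))
    words = re.findall(r"[^\W\d_]+", text)
    n_words = max(1, len(words))
    syllables = sum(count_syllables_es(w) for w in words)
    p = 100.0 * syllables / n_words
    f = 100.0 * sentences / n_words
    return 206.84 - 0.60 * p - 1.02 * f

print(fernandez_huerta("La participación es importante. Todos pueden participar."))
```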

Schedule

Timezone: Europe/Madrid

Note: All dates are tentative and may be updated. Please check this page regularly for the latest schedule.

Participation

  • Registration window: Feb 16–28, 2026
  • Submissions: up to 3 per language per team

Papers

  • Paper due: Jun 1, 2026
  • Acceptance: Jun 14, 2026
  • Camera-ready: Jun 21, 2026

Organization team

  • Horacio Saggion — Universitat Pompeu Fabra, Spain
  • Nelson Perez Rojas — Universidad de Costa Rica, Costa Rica
  • Stefan Bott — Universitat Pompeu Fabra, Spain
  • Nouran Khallaf — University of Leeds, United Kingdom
  • Mehrzad Tareh — Universitat Pompeu Fabra, Spain
  • Daniel Adanza — Universitat Pompeu Fabra, Spain
  • Almudena Rascon — Plena Inclusion Madrid, Spain
  • Sandra Szasz — Universitat Pompeu Fabra, Spain

Contact

Primary contact

Horacio Saggion
Universitat Pompeu Fabra (UPF)
Email:

Don’t forget to include the shared task name in the email subject.

Ethics

The dataset was created within the iDEM project under strict ethics protocols and in compliance with European data protection requirements.