MCIF documentation

MCIF is a comprehensive benchmark for evaluating multimodal, crosslingual instruction-following systems. It covers 3 modalities (text, speech, and video), 4 languages (English, German, Italian, and Chinese), and 13 tasks organized into 4 macro-tasks.

See the Installation section for how to install the project and the Usage section for how to use the repository.

GitHub: https://github.com/hlt-mt/mcif.git
PyPI: https://pypi.org/project/mcif

Credits

The library is released as open source under the Apache 2.0 License. If you use this library, please cite:

@misc{papi2025mcifmultimodalcrosslingualinstructionfollowing,
    title={{MCIF: Multimodal Crosslingual Instruction-Following Benchmark from Scientific Talks}},
    author={Sara Papi and Maike Züfle and Marco Gaido and Beatrice Savoldi and Danni Liu and Ioannis Douros and Luisa Bentivogli and Jan Niehues},
    year={2025},
    eprint={2507.19634},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2507.19634},
}