Dataset Card for ITALIC

ITALIC is a benchmark evaluating language models’ understanding of Italian culture, commonsense reasoning and linguistic proficiency in a morphologically rich language.

Above are example questions from ITALIC. Note: every example is a direct translation; the original questions are in Italian. The correct option is marked by (✓).

Dataset Details

Dataset Description

We present ITALIC, a large-scale benchmark dataset of 10,000 multiple-choice questions designed to evaluate the natural language understanding of the Italian language and culture. ITALIC spans 12 domains, exploiting public tests to score domain experts in real-world scenarios. We detail our data collection process, stratification techniques, and selection strategies.

ITALIC provides a comprehensive assessment suite that captures commonsense reasoning and linguistic proficiency in a morphologically rich language. It serves as a benchmark for evaluating existing models and as a roadmap for future research, encouraging the development of more sophisticated and culturally aware natural language systems.

Curated by: CRISP research centre https://crispresearch.it/
Language(s) (NLP): Italian
License: MIT

Dataset Sources

Huggingface: https://huggingface.co/datasets/Crisp-Unimib/ITALIC
Zenodo [ADD THIS LATER]
Paper: [ACCEPTED AT NAACL25]

Citation

BibTeX:

[COMING SOON]

APA:

[COMING SOON]

Tags: dataset card italian culture commonsense reasoning linguistic proficiency