MECORE DATABASES

SIGTYP 2023: The database contains information about ∼50 clause-embedding predicates in 16 languages: Catalan, Dutch, English, French, German, Greek, Hebrew, Hindi, Italian, Japanese, Kîîtharaka, Mandarin, Polish, Spanish, Swedish and Turkish.

  • Database: The database can be accessed here:
  • Accompanying paper: Please refer to our SIGTYP paper (Özyıldız et al. 2023) for details of the database.
  • Contents: The database is distributed as a zip file containing 16 directories, one per language. Each directory contains three files:
    • a csv file summarising ∼15 semantic properties and ∼12 combinatorial properties of predicates;
    • a pdf document that details the example sentences used to annotate the properties; and
    • a README file that contains information about the language as well as language-specific features of the dataset.
  • Questionnaire: The questionnaire used to elicit the properties can be accessed at OSF here.
  • Tools: Tomasz Klochowicz’s MECORE Analysis Tools automatically analyze the data collected in the MECORE Database, allowing users to safely unify, merge and compare data.
  • How to cite the database: Özyıldız, Deniz, Ciyang Qing, Floris Roelofsen, Maribel Romero and Wataru Uegaki. 2023. A Crosslinguistic Database for Combinatorial and Semantic Properties of Attitude Predicates. Proceedings of the 5th Workshop on Research in Computational Linguistic Typology and Multilingual NLP (SIGTYP 2023), pages 65–75, Association for Computational Linguistics.
    • Tomasz Klochowicz’s tools must be credited separately.
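
As a rough illustration of the layout described under Contents (one directory per language, each holding a csv file), the following Python sketch reads every language's csv into memory. It is not part of the database or of the MECORE Analysis Tools. The function name load_mecore_tables, the assumption that each directory holds exactly one csv file with a header row, and the utf-8 encoding are all hypothetical; check the per-language README files for the actual file names and columns.

```python
import csv
from pathlib import Path


def load_mecore_tables(root):
    """Map each language directory name to the rows of its csv file.

    Assumes (hypothetically) that `root` is the unzipped database folder,
    that each language directory holds one csv file with a header row,
    and that the files are utf-8 encoded (with or without a BOM).
    """
    tables = {}
    for lang_dir in sorted(Path(root).iterdir()):
        if not lang_dir.is_dir():
            continue  # ignore stray files next to the language directories
        csv_files = sorted(lang_dir.glob("*.csv"))
        if not csv_files:
            continue  # skip directories that contain no csv file
        # Read the first csv found; each row becomes a dict keyed by header.
        with csv_files[0].open(newline="", encoding="utf-8-sig") as handle:
            tables[lang_dir.name] = list(csv.DictReader(handle))
    return tables
```

Calling load_mecore_tables on the unzipped folder returns a dictionary keyed by directory name, with one list of row dictionaries per language. Values are kept as strings, since the annotation scheme for each property is documented in the paper rather than assumed here.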