The aim of this research center is to combine empirical linguistics and NLP, two approaches that both rely on large bodies of authentic text in language corpora. The project brings together teams working on usage-based approaches in cognitive linguistics (supervisor M. Fried), corpus and quantitative linguistics (supervisor V. Cvrček), and computational linguistics and NLP (supervisor Z. Žabokrtský). A key prerequisite for the project is an existing infrastructure that supports research on large datasets; in this respect, the project will benefit from CNC and LINDAT-CLARIAH, two language-oriented infrastructures that are among the world leaders in producing language resources. The project aims to cover a wide range of languages and linguistic topics that can be analysed using existing language resources (including contrastive approaches based on parallel corpora). It will also extend the range of language resources and empirical linguistic expertise to languages and areas that are not yet covered and that promise excellent results (e.g. research on aphasia, school communication, language acquisition, public discourse, or spontaneous interaction).

Team

Senior researchers

Junior researchers

PhD students

Klára Pivoňková (FF)
Martin Sedláček (FF)
Petra Čechová (FF)
Hana Hledíková (MFF)
Michal Olbrich (MFF)
Václav Horký (FF)
Jan Henyš (FF)
Khatia Buskivadze (FF)
Abishek Stephen (MFF)
Konstantin Sulimenko (FF)
Vojtěch John (MFF)
Federica Gamba (MFF)

Project number: UNCE/24/SSH/009
