PoliModalCorpus

Towards the construction of the first multimodal corpus of the political domain in Italian

Authors

  • Daniela Trotta, Laboratorio di Linguistica Computazionale “M. Gross”, Dipartimento di Scienze Politiche, Sociali e della Comunicazione, Università degli Studi di Salerno, Fisciano (SA), Italy
  • Teresa Albanese, Laboratorio di Linguistica Computazionale “M. Gross”, Dipartimento di Scienze Politiche, Sociali e della Comunicazione, Università degli Studi di Salerno, Fisciano (SA), Italy
  • Annibale Elia, Laboratorio di Linguistica Computazionale “M. Gross”, Dipartimento di Scienze Politiche, Sociali e della Comunicazione, Università degli Studi di Salerno, Fisciano (SA), Italy

Keywords:

Political communication, Corpus Linguistics, Multimodal corpora, XML-TEI annotation, Natural Language Processing

Abstract

This work introduces the PoliModalCorpus, the first multimodal corpus of the political domain in Italian. The corpus was built to address the lack of Italian linguistic resources for political-institutional communication. The data comprise the transcripts of 59 face-to-face interviews from the political talk show “In mezz’ora in più”. This paper describes the data collection methodology, the construction of the corpus, and the annotation scheme proposed to structure the data. A new level of analysis is proposed: the linguistic and terminological analysis is not only quantitative, through textual statistics based on morpho-syntactic analysis, but also includes a level of annotation focusing on the non-verbal aspects (pauses, non-lexical backchannels) that occur during the political interviews. Non-verbal expressions do not simply accompany attempts to interrupt the speakers but are indicators of the success of their intentions, including persuasive strategies.

Published

30-09-2018

Issue

Section

Contributions