dc.contributor.author | Trotta, Daniela |
dc.date.accessioned | 2021-03-25T09:33:53Z |
dc.date.available | 2021-03-25T09:33:53Z |
dc.date.issued | 2019-11-01 |
dc.description | The corpus includes the transcripts of 56 face-to-face TV interviews, totalling 14 hours, taken from several broadcasts of the Italian political talk show Mezz'ora aired on the Rai 3 channel from 24 September 2017 to 14 January 2018. The show follows a fixed format in which a journalist, Lucia Annunziata, interviews a guest, typically a prominent figure in the political or cultural scene (such as Matteo Renzi, Luigi Di Maio, Pierluigi Bersani, Walter Veltroni, Alessandro Di Battista, Angelino Alfano, Matteo Salvini, etc.). The audio signal was transcribed using a semi-supervised speech-to-text methodology (Google API + manual correction). The annotation uses XML as the markup language and follows the TEI standard for Speech Transcripts in terms of utterances. The resource currently contains 100,870 tokens. For each interview, the following information was manually annotated and is included in the XML resource file (every file is named with the broadcast date; the description lists the names of the guests interviewed): 1. metadata: these include information useful for quickly identifying a transcription, for example the tools used for the transcription, a link to the interview, the owner account, the title of the talk show, the date of airing, the guests, etc. 2. pause: this tag marks a pause either between or within utterances. Speakers differ greatly in their rhythm, and in particular in the amount of time they leave between words, so this element marks occasions where the transcriber judges that speech has paused, irrespective of the actual amount of silence. 3. vocal: this tag marks any vocalized but not necessarily lexical phenomenon, for example non-lexical expressions (e.g. burp, click, throat) and semi-lexical expressions (e.g. ah, aha, aw, eh, ehm). 4. del: speech-management phenomena such as false starts, repetitions, and truncated words are included in the transcription but marked, following the TEI Guidelines, as editorially deleted and are therefore indicated with the tag del. 5. overlap: this phenomenon occurs when the speaker signals (verbally or non-verbally) that he/she is about to finish his/her turn and the co-locutor starts speaking, so that the utterances overlap slightly. For interviews longer than 50 turns only, a second level of annotation was added automatically using the ANVIL software (Kipp, 2001), inspired by the MUMIN annotation scheme (Allwood et al., 2007). These files, listed with "name surname", provide an alignment of the transcript with the original audio-video source (accessible from the link in the metadata). The annotated gestures, as described in Allwood et al. (2007), are summarized below: 1. facial displays: these refer to timed changes in eyebrow position, expressions of the mouth, and movements of the head and eyes (Cassell et al., 2000). The coding scheme includes features describing gestures and movements of the various parts of the face, with values that are either semantic categories such as Smile or Scowl or direction indications such as Up or Down. 2. hand gesture: we follow a simplification of the scheme from the McNeill Lab (Duncan, 2004). The features, 7 in total, concern Handedness and Trajectory, so that we distinguish between single-handed and double-handed gestures, and among a number of different simple trajectories, analogous to what is done for gaze movement. The value Complex is intended to capture movements where several trajectories are combined. 3. body posture: this tag comprises trajectory indications for the movement of the trunk. The categories are mutually exclusive to facilitate the annotation work. |
dc.identifier.uri | http://hdl.handle.net/20.500.11752/OPEN-534 |
dc.language.iso | ita |
dc.publisher | Università di Salerno |
dc.publisher | Fondazione Bruno Kessler - Trento |
dc.relation.isreferencedby | http://ceur-ws.org/Vol-2481/paper73.pdf |
dc.relation.isreferencedby | https://www.aclweb.org/anthology/2020.lrec-1.532.pdf |
dc.rights | Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) |
dc.rights.label | PUB |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ |
dc.source.uri | https://github.com/dhfbk/InMezzoraDataset |
dc.subject | multimodal corpus |
dc.subject | face-to-face interviews |
dc.subject | political domain |
dc.title | PoliModal Corpus |
dc.type | corpus |
local.branding | OPEN |
local.contact.person | Daniela Trotta dtrotta@unisa.it Università di Salerno |
local.files.count | 27 |
local.files.size | 2073788 |
local.has.files | yes |
local.language.name | Italian |
local.size.info | 100870 tokens |
metashare.ResourceInfo#ContentInfo.mediaType | text |
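The dc.description field names four transcript-level tags (pause, vocal, del, overlap). As a minimal sketch of how they might be tallied in one transcript, the Python below counts those elements with the standard library. The XML fragment, the `u` utterance element, and the `who` attribute are assumptions made for illustration and were not copied from the corpus files; the real files follow the bundled unicum.dtd, which may nest or name elements differently.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical fragment, invented for illustration only.
# It is NOT taken from the corpus; real files may be structured differently.
SAMPLE = """<text>
  <u who="journalist">buonasera <pause/> benvenuto</u>
  <u who="guest"><vocal>ehm</vocal> grazie <del>io io</del> io penso <pause/> che sì</u>
</text>"""

# Tag names as listed in the record's description.
TAGS = ("pause", "vocal", "del", "overlap")


def local_name(tag):
    """Drop a '{namespace}' prefix, if any, so namespaced TEI files also match."""
    return tag.rsplit("}", 1)[-1]


def count_annotations(root):
    """Count how often each annotation tag occurs anywhere under root."""
    counts = Counter()
    for element in root.iter():
        name = local_name(element.tag)
        if name in TAGS:
            counts[name] += 1
    return counts


root = ET.fromstring(SAMPLE)
counts = count_annotations(root)
for tag in TAGS:
    print(tag, counts[tag])
```

For a downloaded file, `ET.parse(path).getroot()` could replace `ET.fromstring(SAMPLE)`; matching on the local name keeps the count independent of whether the file declares a TEI namespace.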
This item is publicly available and licensed under Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
Files in this item
Name | Size | Format | Description | MD5 |
salvini1.xml | 49 KB | text/xml | XML | 9d16fe60cefbf857c54953d07bb01e56 |
bersani.xml | 70.71 KB | text/xml | XML | 20e178bb97149d8eefe1395f75859abc |
renzi.xml | 142.65 KB | text/xml | XML | 6eac5ef0a6cf21b0b540f139617f46fa |
dibattista.xml | 116.78 KB | text/xml | XML | b531e484469b4d438c0f8c876ed39f66 |
calenda.xml | 106.71 KB | text/xml | XML | a372ead7f181be8a166795beb13d932e |
30+10_12_17.xml | 103.18 KB | text/xml | XML | ffb01281ca24f8b791813d5e7dd87d79 |
padoan.xml | 57.44 KB | text/xml | XML | 4ee9662f972bdbdf921097fefb870dab |
alfano.xml | 84.15 KB | text/xml | XML | f4292ad97caa6a187166a20852878907 |
distefano.xml | 55.03 KB | text/xml | XML | f1ee37e98bffb9c7a0d2563abc6c63c0 |
dimaio.xml | 57.57 KB | text/xml | XML | 596709da8e73c14d0d26bed911742900 |
veltroni.xml | 55.91 KB | text/xml | XML | 4b7f46312f365286ac8f156350572110 |
tremonti-orfini.xml | 63.22 KB | text/xml | XML | bf8d9d32cbd05ad8157c0490bc854de7 |
30+05_11_17.xml | 73.13 KB | text/xml | XML | 190fb239963fd9e5249646d6acea9d09 |
salvini2.xml | 67.51 KB | text/xml | XML | fa60a30325073f736af9e531a7401e60 |
30+15_10_17.xml | 96.7 KB | text/xml | XML | c3c7eb800c4f395f407cba8d2ed89bcc |
30+14_01_18.xml | 89.54 KB | text/xml | XML | 6cfe6db04f9deae9a817da0d661db465 |
30+12_11_17.xml | 81.44 KB | text/xml | XML | ee4998aae9775b3b1572ddaa4c7b5294 |
unicum.dtd | 1.68 KB | application/octet-stream | Unknown | 5e4394ebc94ce3e2ecd3cfe3b0319cc3 |
30+17_12_17.xml | 85.12 KB | text/xml | XML | ab6bf839ebfa7baa84d2197f5e3c6f3c |
30+08_10_17.xml | 107.38 KB | text/xml | XML | 11d7111148b563a5978e05f6928f7d36 |
30+29_10_17.xml | 76.69 KB | text/xml | XML | bdbbd73b93017a0aa86eb66807fd8eb1 |
30+26_11_17.xml | 77.36 KB | text/xml | XML | b6e8760a62923fcfeda55e29c1fd3c34 |
30+24_09_17.xml | 72.01 KB | text/xml | XML | 67599069766d310442e972f428e01029 |
30+22_10_17.xml | 89.07 KB | text/xml | XML | ff6b21ff9369b9b7dbfc8a568ea9e4b5 |
30+19_11_17.xml | 78.83 KB | text/xml | XML | aac0abd7f0bf27a6def7c10297b6a9af |
30+03_12_17.xml | 66.34 KB | text/xml | XML | d19d71a6766b28322a3c90541b5afc0d |
30+01_10_17.xml | 76.22 KB | text/xml | XML | 1a5ebf27d7fda00360453d9552fd52a4 |
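
Each file in this item is listed with an MD5 checksum, so a downloaded copy can be checked for integrity. The sketch below shows one way to do that in Python using only the standard library. The function names `md5_of` and `verify` and the download directory are hypothetical; the two checksums in `EXPECTED` are copied from the listing above, and the remaining files could be added in the same way.

```python
import hashlib
from pathlib import Path

# Two checksums copied from the file listing; extend with the other files as needed.
EXPECTED = {
    "salvini1.xml": "9d16fe60cefbf857c54953d07bb01e56",
    "unicum.dtd": "5e4394ebc94ce3e2ecd3cfe3b0319cc3",
}


def md5_of(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory, expected=EXPECTED):
    """Map each expected file name to True if it exists and its MD5 matches."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of(path) == checksum
    return results


if __name__ == "__main__":
    # "." is a placeholder for wherever the files were downloaded.
    for name, ok in verify(".").items():
        print(name, "OK" if ok else "MISSING OR MISMATCH")
```

MD5 is used here only because it is the checksum the repository publishes; it detects accidental corruption but is not a safeguard against deliberate tampering.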