ADAM Corpus

Citation
Cattoni, Roldano; Danieli, Morena and Soria, Claudia, 2001, ADAM Corpus, CLARIN DSpace, http://hdl.handle.net/20.500.11752/ILC-999
Date issued
2001
Size
10230 turns, 78787 words, 27288.628 seconds
Language(s)
Description
The ADAM spoken corpus is a collection of 450 spoken dialogues: 200 human-human and 250 human-machine. All dialogues are recordings and transcriptions of telephone conversations in the semantic domain of tourism and railway transportation. The audio files use the standard format for telephone signal data recommended by the SPEECHDAT3 project.

The human-human dialogues are simulated telephone conversations between two experimental subjects, one playing the role of a travel agent and the other that of a caller. The human-machine dialogues were collected in the field: they are interactions between callers and the automatic telephone information service of the Italian railway company, recorded during an experimental phase of that service.

Each dialogue is represented by an orthographic transcription (physically an XML file), which is linked to an audio file containing the corresponding recording. The transcription is in turn associated with five XML annotation files, one for each level of linguistic information: prosody, morphosyntax, syntax, semantics and pragmatics. For each level, a corresponding annotation scheme provides annotation instructions, examples and criteria, and each annotation file encodes the content of the dialogue at that level according to its scheme.
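The per-dialogue file organization described above (one transcription, one audio file, five annotation files) could be modelled roughly as follows. This is only a sketch: the directory layout, the file names, the `.wav` extension and the names `DialogueFiles`, `dialogue_files` and `missing_files` are all hypothetical and are not taken from the corpus distribution; only the five level names come from the description.

```python
from dataclasses import dataclass
from pathlib import Path

# The five annotation levels named in the corpus description.
LEVELS = ("prosody", "morphosyntax", "syntax", "semantics", "pragmatics")


@dataclass
class DialogueFiles:
    dialogue_id: str
    transcription: Path           # orthographic transcription (XML)
    audio: Path                   # recording linked from the transcription
    annotations: dict[str, Path]  # one XML annotation file per level


def dialogue_files(root: Path, dialogue_id: str, audio_ext: str = ".wav") -> DialogueFiles:
    """Build the expected paths for one dialogue.

    ASSUMPTION: one directory per dialogue and the file-name patterns
    below are invented for illustration; adapt them to the real layout.
    """
    base = root / dialogue_id
    return DialogueFiles(
        dialogue_id=dialogue_id,
        transcription=base / f"{dialogue_id}.transcription.xml",
        audio=base / f"{dialogue_id}{audio_ext}",
        annotations={level: base / f"{dialogue_id}.{level}.xml" for level in LEVELS},
    )


def missing_files(files: DialogueFiles) -> list[Path]:
    """Return every expected path that does not exist on disk."""
    expected = [files.transcription, files.audio, *files.annotations.values()]
    return [path for path in expected if not path.exists()]
```

Calling `missing_files(dialogue_files(Path("adam"), "d001"))` on a downloaded copy would then list any of the seven expected files that are absent for that dialogue, under the assumed naming.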
Acknowledgement