EvaLatin 2020: data

Sprugnoli, Rachele

Please use the following text to cite this item:

Sprugnoli, Rachele; Pellegrini, Matteo; Cecchini, Flavio Massimiliano and Passarotti, Marco, 2020, EvaLatin 2020: data, CLARIN DSpace, http://hdl.handle.net/20.500.11752/OPEN-526

dc.contributor.author	Sprugnoli, Rachele
dc.contributor.author	Pellegrini, Matteo
dc.contributor.author	Cecchini, Flavio Massimiliano
dc.contributor.author	Passarotti, Marco
dc.date.accessioned	2021-03-09T10:26:37Z
dc.date.available	2021-03-09T10:26:37Z
dc.date.issued	2020
dc.description	Training and gold test data released in EvaLatin 2020, the evaluation campaign of NLP tools for Latin. The two shared tasks proposed in EvaLatin 2020, i.e. Lemmatization and Part-of-Speech tagging, were aimed at fostering research in the field of language technologies for Classical languages. The shared dataset consists of texts taken from the Perseus Digital Library, processed with UDPipe models and then manually corrected by Latin experts. The training set includes only prose texts by Classical authors. The test set, alongside prose texts by the same authors represented in the training set, also includes data from poetry and from the Medieval period.
dc.identifier.uri	http://hdl.handle.net/20.500.11752/OPEN-526
dc.language.iso	lat
dc.publisher	CIRCSE Research Centre, Università Cattolica del Sacro Cuore
dc.relation	info:eu-repo/grantAgreement/EC/H2020/769994
dc.relation.isreferencedby	https://www.aclweb.org/anthology/2020.lt4hala-1.16.pdf
dc.rights	Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
dc.rights.label	PUB
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.source.uri	https://github.com/CIRCSE/LT4HALA/tree/master/data_and_doc
dc.subject	Latin
dc.subject	POS tagging
dc.subject	Lemmatization
dc.title	EvaLatin 2020: data
dc.type	corpus
local.branding	OPEN
local.contact.person	Rachele Sprugnoli rachele.sprugnoli@unicatt.it Università Cattolica del Sacro Cuore
local.demo.uri	https://github.com/CIRCSE/LT4HALA/blob/master/data_and_doc/gold_EvaLatin/Horatius-Carmina_GOLD.conllu
local.files.count	1
local.files.size	0
local.has.files	yes
local.language.name	Latin
local.size.info	341,419 tokens
local.size.info	16 files
local.sponsor	euFunds EC/H2020/769994 European Union LiLa - Linking Latin. Building a Knowledge Base of Linguistic Resources for Latin info:eu-repo/grantAgreement/EC/H2020/769994
metashare.ResourceInfo#ContentInfo.mediaType	text
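
The demo file listed above carries a .conllu extension, so the data can presumably be read as tab-separated CoNLL-U, where the second, third and fourth columns hold the token form, its lemma and its universal part-of-speech tag. The following is a minimal sketch under that assumption, not a description of the official EvaLatin tooling: the function name `read_conllu` and the three-token sample are hypothetical and are not taken from the dataset.

```python
from typing import Iterator, List, Tuple

# Hypothetical three-token sample in CoNLL-U layout (10 tab-separated columns).
# It is illustrative only and does not come from the EvaLatin files.
SAMPLE = (
    "# sent_id = example-1\n"
    "1\tpuella\tpuella\tNOUN\t_\t_\t_\t_\t_\t_\n"
    "2\trosam\trosa\tNOUN\t_\t_\t_\t_\t_\t_\n"
    "3\tamat\tamo\tVERB\t_\t_\t_\t_\t_\t_\n"
    "\n"
)

Token = Tuple[str, str, str]  # (form, lemma, upos)


def read_conllu(text: str) -> Iterator[List[Token]]:
    """Yield one list of (form, lemma, upos) triples per sentence."""
    sentence: List[Token] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Comment lines carry sentence-level metadata; skip them here.
            continue
        cols = line.split("\t")
        if len(cols) < 4:
            # Not a well-formed token line; ignore it in this sketch.
            continue
        if "-" in cols[0] or "." in cols[0]:
            # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
            continue
        sentence.append((cols[1], cols[2], cols[3]))
    if sentence:
        # Handle a final sentence that is not followed by a blank line.
        yield sentence


if __name__ == "__main__":
    for sent in read_conllu(SAMPLE):
        for form, lemma, upos in sent:
            print(f"{form}\t{lemma}\t{upos}")
```

To apply the same function to a downloaded file, pass it the file contents read as UTF-8 text; the gold lemma and tag columns can then be compared token by token against a system's output.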