COLA

Full Name
Corpus Oral de Lenguaje Adolescente
Composer
Coordination: A. Myre Jørgensen (University of Bergen)
Language
Spanish
Iberian Spanish
Latin American Spanish
Register
Spoken
Genre
Conversation
Style
Informal
Period
2000-2100 AD
Number of words
500,000 - 1,000,000
Number of words (details)
600,000 words of transcription as of 2014; 76 hours of recordings
Annotation
Tokenization
Annotation remarks

COLA aims to document the informal conversations of young people (aged 13 to 19) from Madrid and three Latin American capitals (Buenos Aires, Santiago de Chile and Guatemala City). The recordings were made without the knowledge of the participants. The transcription is orthographic and follows the recommendations of the TEI. The website provides access to the audio; transcripts and audio files are synchronized at various points. Sound clips are in MP3 or WAV format and can be imported into other software (such as Praat) for more detailed phonetic analysis. Each transcript is classified by the sociolinguistic parameters of social class, sex, age and education. The corpus browser (in English and Norwegian) supports advanced searches by criteria such as the informant, the informant's age, sex or social status, or the conversation itself. Searches can target whole or partial words, prefixes or suffixes, and individual matches. The search results can be exported to an Excel file. Some conversations are available in full.

Format
Download
Data collection
Spontaneous
Multimedia
Synchronized
Availability
Free subscription