SFB "German in Austria" Corpus

The corpus was developed in the context of the Special Research Project “German in Austria” (FWF F60) and comprises more than 1000 hours of recordings documenting variation in spoken German in Austria. To create the corpus, more than 850 respondents (of different ages and occupational types) from all language areas in Austria were recorded in various survey settings (mainly interviews, conversations among friends, language production experiments, reading and translation tasks, and reading-aloud tasks, among others). A significant part of the data is transcribed (either in standard orthography or with GATII) and automatically enriched with PoS tags.
The corpus is built as a relational PostgreSQL database. spaCy was used for the automatic annotation. All audio files are in the .ogg format.

SFB DiÖ

Data repository

GitHub
