The Vienna Corpus of Arabic Varieties (VICAV) was launched over a decade ago with the aim of collecting digital language resources that document varieties of spoken Arabic. Although it is in principle open to any kind of text, the focus in the early years was on bibliographic data, so-called language profiles, standardised feature lists and lexical data. Current projects (SHAWI, WIBARAB) will also increase the availability of corpus data in the future.

VICAV has a strong methodological component, centred on the development of reusable digital data and tools. Much of this effort has focused on modelling data, providing documentation for development and teaching purposes, developing specialised workflows, and further developing open-access XML editors. VICAV is strongly committed to disseminating the Text Encoding Initiative (TEI) community standard and endeavours to promote TEI Lex-0, which is also being further developed as part of the ELEXIS project.

VICAV was founded as a cooperation between the Institute of Oriental Studies at the University of Vienna and the Austrian Centre for Digital Humanities and Cultural Heritage of the Austrian Academy of Sciences (ACDH-CH). It provides the technical framework for the TUNICO, TUNOCENT, SHAWI and WIBARAB projects.

VICAV Platform

Data repository

GitHub