The most widespread contemporary use of English throughout the world is that of English as a lingua franca (ELF), i.e. English used as a common means of communication among speakers from different first-language backgrounds. While linguistic descriptions before the mid-2000s focused almost entirely on English as spoken and written by its native speakers, the VOICE project, initiated by Barbara Seidlhofer at the English Department of the University of Vienna, has sought to redress the balance by compiling the first general corpus capturing spoken ELF interactions. The original VOICE corpus was released between 2005 and 2011 in two versions: one focussing on highly structured encoding of speaker interaction, and a second, flattened representation of the corpus featuring lemmatization and part-of-speech tagging.

Funded by the CLARIAH-AT consortium and developed under the direction of Marie-Luise Pitzl, VOICE 3.0 XML is the latest version of the dataset. It merges the previously separate representations of the corpus into one version and, for the first time, makes it possible to exploit both layers of annotation in a new NoSketchEngine-based system architecture developed and hosted at the ACDH-CH.