The TUNICO Corpus was created as part of the TUNICO project (Linguistic dynamics in the Greater Tunis Area: a corpus-based approach) between 2013 and 2016. It consists of transcriptions of more than 30 hours of recorded dialogues and narrative interviews, which were collected during a field trip to Tunis in 2013. The digital documents represent the language of speakers under the age of thirty-five, from different social backgrounds, who have grown up and still live in the Greater Tunis area.

The digital corpus was encoded in accordance with the Guidelines of the Text Encoding Initiative (TEI) and annotated with lemma and part-of-speech (POS) information. It was developed in tandem with the TUNICO dictionary, which contains large amounts of data taken from this corpus. In addition to the dictionary, a number of lists were generated from the corpus, giving general statistics on word forms, statistics on foreign loans, and the most frequent verbs, nouns and adjectives. The corpus texts were linked to the dictionary, which allows the user to see the dictionary data in the text interface.
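
As a rough illustration of how frequency lists of this kind can be derived from a lemmatized, POS-tagged TEI document, the following Python sketch counts lemmas, optionally restricted to one part of speech. It is not the project's own tooling. It assumes that each token is a TEI `<w>` element carrying `lemma` and `pos` attributes; the attribute names and tag values actually used in the TUNICO files may differ. The sample document and the function name `lemma_frequencies` are hypothetical, and the sample uses English placeholder words rather than corpus data.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Standard TEI namespace, needed to address elements in a TEI document.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical miniature document with English placeholder tokens.
# The lemma/pos attribute layout is an assumption, not the TUNICO schema.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <u who="#spk1">
      <w lemma="go" pos="verb">went</w>
      <w lemma="to" pos="prep">to</w>
      <w lemma="market" pos="noun">market</w>
    </u>
    <u who="#spk2">
      <w lemma="go" pos="verb">goes</w>
      <w lemma="house" pos="noun">house</w>
      <w lemma="market" pos="noun">markets</w>
    </u>
  </body></text>
</TEI>"""


def lemma_frequencies(xml_text, pos=None):
    """Count lemmas of all <w> tokens, optionally keeping only one POS tag."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for token in root.iterfind(".//tei:w", TEI_NS):
        lemma = token.get("lemma")
        if lemma is None:
            continue  # skip tokens without lemma annotation
        if pos is not None and token.get("pos") != pos:
            continue  # skip tokens of other word classes
        counts[lemma] += 1
    return counts


if __name__ == "__main__":
    # Most frequent nouns in the sample, as (lemma, count) pairs.
    print(lemma_frequencies(SAMPLE, pos="noun").most_common())
```

Calling `lemma_frequencies` without a `pos` argument gives overall lemma counts; counting the text content of each `<w>` element instead of its `lemma` attribute would give word-form statistics.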