go!digital 3.0: Selected Projects


Open Pashto-English Dictionary: A gateway to Afghanistan’s history and dialectology

Jeremy Bradley (University of Vienna, Dept. of European and Comparative Literature and Language Studies)
Veronika Milanova (OeAW, Institute of Iranian Studies)

This project creates an online dictionary of Pashto for both the academic and speaker communities (Pashto speakers in Afghanistan and Pakistan as well as in the global diaspora), including professionals such as translators and interpreters. While core data comes from Aslanov’s Pashto-Russian dictionary (1985), we will add a sizable number of lemmas from other sources including academic papers as well as our own fieldwork with native speakers. This way, we make sure that our dictionary contains the necessary neologisms describing recent technological, political and societal developments which are ubiquitous in contemporary Pashto media. Additionally, in the course of our fieldwork with native speakers from both Afghanistan and Pakistan we have been able to acquire knowledge about semantic nuances, slang and dialectal expressions which were absent from previous dictionaries.

We put emphasis on both technical and colloquial vocabulary which has not been recorded in previous dictionaries (offline or online). “Technical terminology” as we define it here means neologisms (for diverse material and immaterial innovations of the 21st century) which are commonly found in Pashto traditional and social media but are missing from even the newest dictionaries. “Colloquial vocabulary” not only covers dialectal vocabulary but also everyday lexemes and phrases which previous lexicographers failed to include, either because the scholars were unaware of them or because the exclusion was deliberate. For both domains, we will rely on our fieldwork and especially on input from the community, including translators who are often struggling to find Pashto equivalents for legal terms and other technical terminology.

Every entry consists of short grammatical information and dialectal classification for the lemma as well as an English translation. The grammatical information includes the nominal class of a given noun or adjective, which tells the user how a nominal is declined properly. Given the complex nominal declension of Pashto, it is of crucial importance for a user to know how a given noun forms, e.g., its oblique singular or direct (nominative) plural. A table with the nominal classes with example paradigms is provided on the website. German translations and information on a word’s origin are included if they are contained in our sources; both aspects will be strengthened in future continuations of the project. For words subject to unusually large dialectal variation, we include audio recordings from different native speakers.


Kaleidoscopic Patterns of Protest: Qualifying and Quantifying Visual and Textual (Self-)Representations in Eastern European Protest Cultures

Gernot Howanitz (University of Innsbruck, Dept. of Slavic Studies)
Magdalena Kaltseis (University of Innsbruck, Dept. of subject-specific Education)

Political protest movements are on the rise in Eastern Europe, specifically in Russia, Ukraine, and Belarus. In these countries, protesters have occupied public spaces and displayed visual symbols and slogans in order to motivate other people to join them. At the same time, governments have used these self-representations to delegitimize the protest movements. Thus, the mediatized (self-)representations, whether YouTube videos, blog posts, communication via social networks, TV news broadcasts, or feature-length documentaries, form an integral part of the protests themselves. In our research project, we focus on these different (self-)representations by zooming in on three different media. First, we look at the self-representation of protest cultures on YouTube and social networks. Second, we concentrate on the official representation of the protest cultures by analyzing state-run and independent TV news broadcasts. Third, we explore the protest cultures’ cinematic representation in three feature-length documentaries. Based on these different perspectives, our project title metaphorically refers to the “kaleidoscopic patterns of protest”, which we try to make visible through our research. To this end, we will quantitatively and qualitatively investigate both the visual and textual (self-)representations of protest movements. We combine methods from computer science, especially automatic symbol recognition with artificial neural networks, with multimodal discourse analysis from linguistics. On the one hand, our project aims at analyzing large amounts of data to explore the visual and textual (self-)representations of protest cultures in Eastern Europe and to find out how images and texts are (re)used in different media and contexts. On the other hand, we see our project as a best-practice example of how to collect large amounts of data and preserve them for future scientific research.


Beyond the Item. Biographies and Itineraries of Cultural Heritage Objects in Museums and beyond

Viola Winkler (Natural History Museum Vienna, NHM)
Roland Filzwieser (Ludwig Boltzmann Institute for Archaeological Prospection & Virtual Archaeology, LBI ArchPro)

The bITEM project revolves around museum objects of great significance to cultural heritage and their biographies, which are to be digitised from a holistic perspective. The aim is not only to create digital twins (e.g. 3D scans) of the objects, but also holistic representations of the objects in combination with their history.

Today, we have the technical prerequisites to depict physical things in digital form and to document not only their physical characteristics in detail but also their "life story", modelling the actors, events and changes involved in it as networks.

bITEM, based at the Natural History Museum Vienna and the Ludwig Boltzmann Institute for Archaeological Prospection and Virtual Archaeology in cooperation with the Austrian Centre for Digital Humanities and Cultural Heritage, brings together a multidisciplinary team of archaeologists, computer scientists, biologists, earth scientists, and historians, whose focused expertise combines different methodological, conceptual, and technological frameworks. This makes it possible, for the first time, to digitally analyse, document, represent, visualise, and present outstanding museum objects of the NHM of different origins (artefacts, biofacts, and geofacts, e.g. the Venus of Willendorf and the unique skeletons of the extinct moas and dodos, among others) and their biographies.

bITEM will combine "object biographies" and "actor network theory" to digitally map object biographies as networks within the CIDOC CRM, based on the technological framework of the OpenAtlas system.

Numerous detailed virtual representations (3D scans, CT scans) will be created along with scientific analyses of the objects' physical properties to virtually represent the objects within these networks as holistically as possible.

Their "biographies" are explored and embedded in this network, which includes connections to other material things, actors, events, places, and concepts from the beginning of the object's existence to the present.

Through the use of established vocabularies and linked open data, these networks are also embedded in the semantic web.
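To illustrate the idea of an object biography as a network, here is a minimal Python sketch. It is not the bITEM or OpenAtlas data model: the identifiers, events, and years are invented placeholders, the property names are borrowed from CIDOC CRM (P12 occurred in the presence of, P7 took place at, P11 had participant), and dates are simplified to a plain year per event instead of a CRM time-span entity.

```python
from collections import defaultdict

# Illustrative (subject, predicate, object) triples.
# All identifiers are invented placeholders, not project data.
TRIPLES = [
    ("event:discovery", "P12_occurred_in_the_presence_of", "object:figurine"),
    ("event:discovery", "P7_took_place_at", "place:excavation_site"),
    ("event:discovery", "P11_had_participant", "actor:excavator"),
    ("event:acquisition", "P12_occurred_in_the_presence_of", "object:figurine"),
    ("event:acquisition", "P7_took_place_at", "place:museum"),
    ("event:acquisition", "P11_had_participant", "actor:curator"),
    ("event:ct_scan", "P12_occurred_in_the_presence_of", "object:figurine"),
    ("event:ct_scan", "P7_took_place_at", "place:museum"),
]

# Simplification (assumption): one placeholder year per event.
EVENT_YEAR = {
    "event:discovery": 1900,
    "event:acquisition": 1901,
    "event:ct_scan": 2020,
}


def biography(object_id, triples=TRIPLES, years=EVENT_YEAR):
    """Return the events an object was present at, in chronological order."""
    # Group all statements by event: event -> predicate -> list of objects.
    by_event = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in triples:
        by_event[subject][predicate].append(obj)

    # Keep only events in which the requested object was present.
    events = [
        event
        for event, props in by_event.items()
        if object_id in props.get("P12_occurred_in_the_presence_of", [])
    ]
    events.sort(key=lambda event: years[event])

    return [
        {
            "event": event,
            "year": years[event],
            "places": by_event[event].get("P7_took_place_at", []),
            "actors": by_event[event].get("P11_had_participant", []),
        }
        for event in events
    ]
```

Calling `biography("object:figurine")` walks the network from the object to its events and on to the connected places and actors, which is the kind of traversal an interactive visualisation of such a network would rely on.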

In addition to recording and analysing the data and object biographies, the team will develop a public web application that will allow the networks of objects to be interactively represented, visualised, and explored.

It will serve as an Open Data portal to disseminate the objects, their stories, their networks, and their "lives" from antiquity to the present.


Cognitive Plausibility of Deep Learning Language Models

Ivonne Weyers, MA (University of Vienna, Dept. of Linguistics)
Erion Çano (University of Vienna, Dept. of European and Comparative Literature and Language Studies)

Human language processing is a highly complex cognitive operation, which we have yet to understand in its entirety. Psycho- and neurolinguistic research of the past decades has focused on identifying the nature, as well as the order of the underlying computational steps involved in this operation, e.g. lexical activation or syntactic processing. These investigations have mainly been based on testable hypotheses provided by theoretical linguistic models of language.

At the same time, there has been remarkable progress in Natural Language Processing (NLP) research using deep learning language models. An innovative approach to model training (pre-training followed by fine-tuning) has yielded neural language models (LMs) that acquire general linguistic knowledge and can subsequently be fine-tuned easily for various NLP tasks, from structural part-of-speech tagging to semantically oriented natural language understanding.

What remains a largely open question, however, is what kind of linguistic knowledge LMs acquire during pre-training and which aspects of it they use to solve the tasks. Interestingly, while able to perform sophisticated machine translation tasks, these LMs fail at even very basic language tasks, for instance learning of abstract sameness relations, which in linguistics are assumed to be primitives underlying successful (human-like) language processing. In other words, it seems that despite their impressive developments, LMs’ underlying processing is not necessarily comparable to human cognitive operations involved in language processing and much less psycholinguistically plausible. In fact, deep neural language models are only seldom informed by concurrent findings from experimental linguistics.

The present interdisciplinary project aims to bridge this gap with an innovative, integrative research approach that combines the strengths of modern psycholinguistics and deep learning NLP research. Specifically, we aim to identify cognitive mechanisms involved in human language processing and to subsequently integrate these into state-of-the-art deep natural language processing models in order to make these NLP models more human-like. This integration will be achieved either through adapted model training, or through adaptations at the architectural level (or both). We expect that firstly, our approach will yield improved NLP models that show increased performance in common NLP benchmark tests, which we will test directly in a comparison with the original unmodified models. Secondly, we expect that the thus created more cognitively plausible language models will generate new testable hypotheses for linguistics. Based on these, psycholinguistic experiments with human participants will be conducted to investigate our models’ predictive value and prediction accuracy. Accordingly, the integrative approach of this project has the potential to open up new pathways in both domains, natural language processing and linguistics.


Manorial Networks in the medieval Tyrol: Mapping and Visualisation

Tobias Pamer (University of Salzburg, Dept. of History)
Elisabeth Gruber-Tokic (University of Innsbruck, Dept. of Linguistics)

Imagine a digital map of medieval Europe in which you can zoom in on single households and not only retrace where they were located but also assess their economic situation. The proposed project creates an interactive map focusing on the Starkenberg family and Friedrich IV of Habsburg. The map is linked to historical data that can be used to answer a broad variety of questions, for example about the variation of settlements, levies, economic inequality or agricultural production. It provides information on the regional linguistic landscape (various types of places), climate history (changes in agricultural structures and output) and power relations (income, manorial rights). The project aims to take a first step towards such a map, with the respective Knowledge Graph (KG) in the background, by building and testing a prototype of an expandable digital infrastructure for the systematic recording and presentation of landed property and the associated levies. The prototype is built using late medieval data from the Tyrol, covering the period from 1379 (Treaty of Neuberg) to 1426 (expulsion of the Starkenberg family from Tyrol). The following steps are planned: 1) Princely and noble estate registers are selected as primary sources and further digitised. 2) These unpublished sources are transcribed and semantically annotated. 3) The relevant information is transferred into a KG. 4) The KG forms the data basis for the interactive map, which includes charts, diagrams and special map layers. 5) This digital infrastructure serves as a prototype for a spatially and chronologically expandable system that 6) provides information for historical analysis.
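As a rough illustration of how levy records extracted from estate registers might feed a map layer, the following Python sketch aggregates levies per holding and emits a GeoJSON FeatureCollection. It is a sketch under stated assumptions, not the project's pipeline: the record structure, holdings, lords, amounts, and coordinates are all invented placeholders, and only the date range 1379 to 1426 is taken from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Levy:
    holding: str  # hypothetical identifier of a farmstead
    lord: str     # hypothetical identifier of the manorial lord
    year: int
    kind: str     # e.g. "grain" or "money"
    amount: int   # in an unspecified, hypothetical unit


# Placeholder records; none of these values come from a real source.
LEVIES = [
    Levy("holding:a", "lord:x", 1380, "grain", 4),
    Levy("holding:a", "lord:x", 1400, "grain", 6),
    Levy("holding:a", "lord:x", 1390, "money", 12),
    Levy("holding:b", "lord:y", 1410, "grain", 3),
    Levy("holding:b", "lord:y", 1430, "grain", 5),  # outside the study period
]

# Placeholder (longitude, latitude) pairs.
COORDS = {"holding:a": (10.7, 47.2), "holding:b": (10.9, 47.3)}


def map_layer(levies, coords, kind, year_from=1379, year_to=1426):
    """Sum levies of one kind per holding and return a GeoJSON layer."""
    totals = {}
    for levy in levies:
        if levy.kind == kind and year_from <= levy.year <= year_to:
            totals[levy.holding] = totals.get(levy.holding, 0) + levy.amount

    features = [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": list(coords[holding])},
            "properties": {"holding": holding, "kind": kind, "total": total},
        }
        for holding, total in sorted(totals.items())
    ]
    return {"type": "FeatureCollection", "features": features}
```

The result of `map_layer(LEVIES, COORDS, "grain")` can be serialised with `json.dumps` and loaded by common web-mapping libraries; one such layer per levy type would correspond to the "special map layers" mentioned in step 4.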


Opening the Southern Jauntal as a Micro-region for Future Archaeology

Dominik Hagmann (State Museum of Carinthia, LMK)
Franziska Reiner (OeAW, Austrian Archaeological Institute, OeAI)

IUENNA (OpenIng the soUthern JauNtal as a micro-regioN for future Archaeology) is an innovative project to strengthen the application of digital methods in archaeology in Austria by pursuing complex cultural-historical questions, shaping Digital Humanities in Classics, and securing cultural knowledge for the long term. IUENNA is based on the archaeological micro-region of the Jauntal (Carinthia/Austria). It involves the State Museum for Carinthia (LMK), the Austrian Archaeological Institute (ÖAI) at the Austrian Academy of Sciences (ÖAW), the Austrian Center for Digital Humanities and Cultural Heritage (ACDH-CH) at the ÖAW, the Federal Monuments Authority (BDA), and the archaeological company ARDIG.

IUENNA follows an extensive open science approach, using the Late Antique ‘pilgrimage center’ of the Hemmaberg with its decade-long excavations and related sites (Globasnitz/Iuenna, Jaunstein, and St. Stefan) as a case study. IUENNA will provide, for the first time in Austria for Classics, an outstanding model study and a sustainable long-term archive of an elaborated excavation at one of the most critical Late Antique sites of the Southeast Alpine region and its vicinity and integrate all data. All archaeological research data available will be digitized, structured in an all-new inclusive and hierarchically organized file folder system, and enhanced with metadata, which can serve as an example for future projects for Austrian archaeology and beyond. Data will be made available online in full open access using the repository ARCHE (A Resource Centre for the HumanitiEs) of the ACDH-CH and an open-source web-mapping application.

The Hemmaberg is undoubtedly one of the best-researched Late Antique hilltop settlements of the 4th–6th cent. AD and a leading reference site for early Christianity in the Southeast Alpine region. The Hemmaberg and its Late Antique settlement (‘pilgrimage center’) form a world-renowned site, featuring five early Christian churches, auxiliary buildings, and the Gothic pilgrimage church of St. Hemma and Dorothea, as well as the Rosalia Grotto. However, the Hemmaberg should not be viewed in isolation, since it is part of a much more extensive settlement area with sites from prehistoric times to the early Middle Ages, which together form the micro-region of the Jauntal with more than 2000 years of cultural history. Over 100 years of studies reflect a remarkable research history: after first investigations by citizen scientists at the beginning of the 20th cent., continuous activities were carried out by the LMK, primarily from the later 1970s onwards, with a focus on Late Antiquity in the area of the Hemmaberg itself and Globasnitz/Iuenna, a Roman settlement/road station (including remains of the Roman link road Virunum–Celeia), and a massive Late Antique burial ground. Other nearby sites include a recently discovered massive Roman, and possibly Late Antique (?), ‘super-villa’ near St. Stefan. At Jaunstein, archaeological features exemplify the early Middle Ages.


Sigmund of Tyrol's Court: Prosopographical Database

Markus Debertol (University of Innsbruck, Institute for Historical Sciences and European Ethnology)
Nadja Krajicek (Provincial Archive of Tyrol)

SiCProD is a cooperative project between the University of Innsbruck (UIBK), the Provincial Archive of Tyrol (TLA) and the Austrian Center for Digital Humanities & Cultural Heritage of the Austrian Academy of Sciences (ACDH-CH). Its aim is to create a prosopographical database on the court of (Arch-)Duke Sigmund of Tyrol (r. 1439/1446-1490).

Sigmund's court is suitable for such a project for several reasons. The source corpus on which the project is based is clearly delimited, manageable, and for the most part well indexed. It is also mainly available in one place, the TLA. While many studies have been published on Sigmund’s successor Maximilian I, especially in recent years, comparable studies are largely missing for Sigmund. Comprehensive prosopographical research on the Tyrolean court of the 15th century, or even a network analysis of it, is therefore a major desideratum.

Furthermore, the necessary tools are available, primarily APIS (Austrian Prosopographical Information System), developed at ACDH-CH. The adaptations and further developments required for SiCProD will be available for other projects in the future.

The finished database will make it possible to trace the court personnel and office bearers as well as institutional structures and networks of people in detail. It will also offer extensive possibilities for visualising the data. The database, including scans of the underlying archival materials, will be freely accessible online. The biographical information it contains, such as key biographical dates, origin, functions, and career paths of chancellors as well as ordinary court employees, can support historians all over the world in their research on a wide range of issues.
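The kind of question such a database could answer can be sketched in a few lines of Python. This is an illustrative sketch only and does not reflect the APIS data model: the `Tenure` record, the persons, the offices, and the years are hypothetical placeholders chosen to fall within Sigmund's reign.

```python
from itertools import combinations
from typing import NamedTuple


class Tenure(NamedTuple):
    person: str  # hypothetical person identifier
    office: str
    start: int   # first year in office
    end: int     # last year in office (inclusive)


# Placeholder records, not taken from any archival source.
TENURES = [
    Tenure("person:a", "chancellor", 1450, 1460),
    Tenure("person:b", "chamberlain", 1455, 1470),
    Tenure("person:c", "cook", 1465, 1480),
    Tenure("person:a", "councillor", 1475, 1485),
]


def office_holders(tenures, year):
    """Return (person, office) pairs active at court in a given year."""
    return sorted(
        (t.person, t.office) for t in tenures if t.start <= year <= t.end
    )


def co_service_edges(tenures):
    """Link two persons if any of their tenures overlap in time."""
    edges = set()
    for first, second in combinations(tenures, 2):
        if first.person == second.person:
            continue
        # Two inclusive intervals overlap if the later start
        # does not come after the earlier end.
        if max(first.start, second.start) <= min(first.end, second.end):
            edges.add(tuple(sorted((first.person, second.person))))
    return sorted(edges)
```

Here `office_holders(TENURES, 1468)` lists who served in that year, and `co_service_edges(TENURES)` yields a simple person-to-person network of overlapping service that could serve as input for a network visualisation.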

To guarantee the long-term availability and accessibility of the data, they will be serialized in CIDOC CRM and ingested into ARCHE (the digital preservation system hosted at ACDH-CH).


The Affective Construction of National Temporalities in Austrian Postwar Radio (1945–1955)

Elias Berner (University of Music and Performing Arts, Vienna, MDW)
Birgit Haberpeuntner (University of Vienna, Theater, Film and Media Studies)
Stefan Benedik (House of Austrian History, hdgö)

In a collaborative project, researchers and curators from the University of Music and Performing Arts Vienna, the University of Vienna and the Haus der Geschichte Österreich examine the role of Austrian radio in the formation of a national consciousness after WWII. In particular, the project examines how imaginations and projections of the country’s past and future are affectively mediated on an auditory level, and the extent to which the medium contributes to the shaping of such a new national temporality. Important aspects to focus on in this respect are the difficult relation with a tabooed but ever-present Nazi past, as well as Austria’s positioning as ‘belonging to the West’ in the looming Cold War.

It must be noted that, at the time, all radio stations were under Allied administration, yet they mostly employed Austrians. The struggle for ideological influence on the radio programs is well researched, especially by scholars of contemporary history. The programs themselves, however, have hardly been comprehensively analyzed. This is largely due to the fact that the primary sources have either not been available at all or have been extremely difficult to access. It was only in the 1990s that the Österreichische Mediathek was able to acquire an estate that held 215 original tapes of the ‘American’ station, RWR, most of which have since been digitized. 600 tapes of the Russian-controlled RAVAG also became available when the ORF Archives were digitized, and another 100 individual items may be found in the DokuFunk archive. In 2016, the holdings at the ORF Archives and the Österreichische Mediathek were included in the UNESCO World Heritage List.

However, these collections are so fragmented and diverse, both structurally and in terms of content, that they have to be processed in a standardized way before they can be used productively for further research. We aim to do so with the help of the annotation software LAMA (Linking Annotations for Media Analysis), which was developed for the Digital Musicology project Telling Sounds. Using Linked Open Data and Semantic Web methodologies, the documents will thus be indexed and related to each other on the level of production and content. Once the materials are structured in a standardized way, they can be analyzed with regard to the outlined research questions. At the same time, our materials, as well as the newly generated standardized metadata, are indexed and linked with Authority Files, which makes them accessible for future research.

In close cooperation between researchers and curators, the results of our analysis, as well as select sources, will be presented to a broad public at the Haus der Geschichte Österreich. A physical exhibition will be installed at the Alma Rosé Plateau in the Neue Burg imperial palace at Vienna’s Heldenplatz square, and a digital exhibition will go online towards the end of 2023, in time for the anniversary celebrations of 100 Years of Radio in Austria.


Bibliotheca Eugeniana Digital

Simon Mayer (Austrian National Library, ÖNB)
Florian Windhager (Danube University Krems, Dept. for Knowledge & Communication Management)

The goal of the project Bibliotheca Eugeniana Digital (BED) is a digital reconstruction and visual representation of Prince Eugene’s historic book collection (UNESCO “Memory of Austria”), which ranks among the most famous collections of the Baroque era. Since 1738, the collection has been part of the Imperial Library of the Habsburgs, the predecessor of the Austrian National Library (ONB). Thousands of visitors every year are told that Eugene’s famous library can be admired in the central oval of ONB’s baroque State Hall. However, this is not true. To date, neither the library’s composition nor its size nor the locations of the printed books in ONB’s collections have been analysed, as the undertaking is too large and complex for traditional methods. Digitisation of sources combined with novel digital methods allows for new and more effective exploration of huge cultural heritage collections like the Bibliotheca Eugeniana.

The project aims to use tools and methods from the Digital Humanities and Data Sciences for a systematic digital reconstruction and visual exploration of this historic library from different sources to investigate its composition and history.

Most of the printed books of the Bibliotheca Eugeniana have been digitised as part of ONB’s large-scale Austrian Books Online (ABO) project. BED will use machine learning (ML) to identify Eugeniana supralibros bindings in digitised books of the ABO corpus. In addition, the historical manuscript catalogue of the Eugeniana and archival sources on its transformation in the 19th century will be transcribed by ML-based models for handwritten text recognition and published on ONB’s infrastructure for Digital Editions.

All data gained will be merged with metadata from ONB’s open access catalogue. Titles from the digital edition and full texts from ABO will be classified by subject with ML-based and natural language processing algorithms. The attribution to subject classes will offer insights into the library’s internal structure and its correlation with the colour system of the supralibros bindings.

Danube University Krems (UWK) will develop multiple coordinated visualisations from this multi-layered dataset, through which the composition, transformation and localization of the Bibliotheca Eugeniana collection can be further analysed and explored. To communicate these findings to the general public, a complementary narrative visualisation will be developed.

BED will disseminate the results using various formats for domain experts and the general public. All data created by the project will be made available via ONB’s Labs and shared with European Research Infrastructures according to FAIR principles. As a cooperation of a cultural heritage institution with a research institution, BED contributes to the DH Austria 2021 strategy by fostering knowledge transfer between both sectors.

Contact

Austrian Academy of Sciences
Research Funding – National and International Programmes
Dr. Alexander Nagler
alexander.nagler(at)oeaw.ac.at
T +43 1 51581-1272
