“Sharing the Experience: Workflows for the Digital Humanities”
Université de Neuchâtel, December 2019
In December 2019, the University of Neuchâtel hosted a second Swiss DARIAH workshop, organised by DARIAH-EU, the University of Neuchâtel and the SIB Swiss Institute of Bioinformatics within the DESIR project. Young scholars were invited to present their research in depth and to discuss together methodological, data management and research workflow issues. We are happy to present the papers resulting from this meeting.
- Session 1: Voting on Faith: Digital Humanities and the Mapping of Reformation Ballots
- Session 2: Supporting Sustainable Digital Data Workflows in the Art and Humanities
- Session 3: A Conceptual Framework for Multilayer Historical Networks
- Session 4: From Me to You: Peer-to-Peer Collaboration with Linked Data
- Session 5: ArkeoGIS, feedback on the difficulty and interest of sharing digital data from archaeology
- Session 6: The Grammateus Project: from Ancient Greek Papyri to a Web Application
- Session 7: Absorbed in Goodreads. A Computational Approach for the Study of Online Social Reading
- Session 8: Digital Critical Edition of Apocryphal Literature: Sharing the Pipeline
- Session 9: The WoPoss Workflow: the Semantic Annotation of Modality in a Diachronic Corpus
You can annotate and discuss all the material here with the hypothes.is plugin.
Voting on Faith: Digital Humanities and the Mapping of Reformation Ballots
In the 16th century, ballots were held in a significant number of cities and villages under the supervision of the Swiss cantons in order to choose between the reformed ideas and the teachings of the Roman Church. This workflow paper aims to demonstrate how using the symogih.org information system gives new insights into crucial aspects of the voting process. Developed by the LARHRA in Lyon, symogih.org is an open modular platform which stores geohistorical information and allows researchers to share knowledge in a collaborative environment, using a CIDOC CRM compatible, event-centred extensible model. It makes it possible to collect and link data in order to identify and analyse key configurations and processes in these ballots on faith, such as the definition of the questions submitted to the voters, the identification of the actors allowed to cast a vote, and the networks of alliances and overlapping jurisdictions characterising the relations between communes, parishes and magistrates. The paper will therefore show how digital humanities contribute to a better understanding of why a vote was considered the right path to a wise religious choice and under which circumstances such ballots were organized in Reformation times.
Speakers for this session
Marc Aberlé works at the University of Neuchâtel as a member of the SNF-funded Reformation, local assemblies and ballots project. He is currently writing a PhD thesis about the link between Protestantism and Democracy. In order to deliver a better understanding of this relation, he analyses the complex thought patterns and intellectual networks at the core of this idea, which brought many protestant thinkers and political actors to find themselves sharing common values.
Francesco Beretta is a CNRS research fellow. Since 2009 he has been head of the Pôle histoire numérique within the Laboratoire de recherche historique Rhône-Alpes (LARHRA CNRS UMR 5190 – Universités de Lyon et Grenoble). A specialist in the history of the Roman Inquisition, the intellectual history of Catholicism and the history of science, he has taught at various universities in Fribourg, Lausanne, Paris, Lyon and Neuchâtel. In digital humanities, his areas of expertise include data modelling and curation, ontologies, relational databases, GIS and semantic text encoding in XML/TEI. He contributed significantly to the establishment of the symogih.org and dataforhistory.org projects.
Fabrice Flückiger works at the Ludwig-Maximilian-Universität in Munich on “Les miroirs du magistrat”, a project about practices and representations of the Good Government in 16th century reformed cities funded by the SNF. He wrote a PhD on religious disputations held in the Swiss Confederacy during the early years of the Reformation, before joining the “Reformation, local assemblies and ballots” research project at the University of Neuchâtel. He is also an associate researcher at the Centre Européen des Études Républicaines (CEDRE) in Paris.
Supporting Sustainable Digital Data Workflows in the Art and Humanities
The Data and Service Center for the Humanities (DaSCH) operates as a platform for humanities research data. It ensures access to this data and promotes its linking with other databases (linked open data) to add value for further research and the interested public. As a competence center for digital methods and the long-term use of digital data, it supports the hermeneutically oriented humanities in the use of state-of-the-art digital research methods, focusing on qualitative data and associated digital objects (images, sound, video, etc.) in the cultural heritage field.
Long-term archiving and access have become a major topic since the digital turn in the humanities, as many funding agencies such as the Swiss National Science Foundation and the European Commission now require a data management plan to be in place in order to receive research funding. This new imperative raises many questions in the scientific community. This paper points out the contributions of the DaSCH for digital humanities researchers and the advantages of interoperability.
Speaker for this session
Vera Chiquet is a research assistant at the Digital Humanities Lab of the University of Basel. She does research in the fields of photography, visual studies, digital humanities, art history, cultural heritage preservation and computational photography. Besides research, lecturing and academic administration, she works at the DaSCH as a research assistant and consultant for project partners.
A Conceptual Framework for Multilayer Historical Networks
The technicality of network visualization applied to history and its relative novelty often result in a superficial use of software, limited to describing a situation immediately extracted from a data set. This approach is justified in the exploratory phase of an analysis in most cases where the network is very explicitly present in the object studied. But the complexity of the entanglement of historical actors, places, institutions or temporal sequences makes finer modeling necessary if we want to go beyond a simplistic “datafication”.
To encourage curiosity towards other modes of analysis and put data modeling (and therefore the historical sources) at the center of the research process, this article discusses what makes a historical network: its components, its relationships, its layers and its different facets. It offers a kind of visual guide to help historians follow a multilayer framework and think about their research object from another (multidimensional) angle.
Speaker for this session
Martin Grandjean is a Junior Lecturer in Contemporary History at the University of Lausanne, Switzerland. His research focuses on the structuring of scientific and diplomatic circles in the interwar period. He specializes in the analysis of large volumes of archival data and works on the uses of network analysis and data visualization in history.
From Me to You: Peer-to-Peer Collaboration with Linked Data
In recent years, Digital Humanities’ collaborative nature caused a wake of digitally-native research practice, where interdisciplinary workflows commonly feed into centralized data repositories. Connecting these repositories, the W3C’s Web Annotation specification builds upon Linked Data principles for targeting any web resource or Linked Data entity with syntactic and semantic annotation. However, today’s platform-centric infrastructure diminishes the distinction between institutional and individuals’ data. This poses issues of digital ownership, interoperability, and privacy of data stored on centralized services. With Hyperwell, we aim to address these issues by introducing a novel architecture that offers real-time, distributed synchronization of Web Annotations, leveraging contemporary Peer-to-Peer technology. Extending the Peer-to-Peer network, institutions provide Hyperwell gateways that bridge peers’ annotations into the common web. These gateways affirm a researcher’s affiliation while acting as a mere mirror of researchers’ data and maintaining digital ownership.
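To make the kind of data being synchronized more concrete, here is a minimal sketch of a single annotation shaped after the W3C Web Annotation Data Model. It is an illustration only and is not taken from Hyperwell: the function name, the annotation IRI, the creator IRI, the target URL and the quoted text are all hypothetical placeholders.

```python
import json


def make_annotation(anno_id, creator, target_url, exact_text, comment):
    """Build a plain dict shaped after the W3C Web Annotation Data Model.

    All argument values are supplied by the caller; nothing here talks to a
    server or to a Peer-to-Peer network.
    """
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,                 # IRI identifying the annotation
        "type": "Annotation",
        "creator": creator,            # IRI of the annotating researcher
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        "target": {
            "source": target_url,      # the annotated web resource
            "selector": {
                "type": "TextQuoteSelector",
                "exact": exact_text,   # the passage being annotated
            },
        },
    }


# Hypothetical example values, chosen only for illustration.
annotation = make_annotation(
    anno_id="https://example.org/annotations/1",
    creator="https://example.org/people/researcher-1",
    target_url="https://example.org/editions/text-1",
    exact_text="a passage of interest",
    comment="Compare with the parallel passage in the other witness.",
)

print(json.dumps(annotation, indent=2))
```

Because the annotation is an ordinary JSON-LD document that points at its target by URL, it can in principle be stored on a researcher's own machine and mirrored elsewhere without changing its content, which is the property the architecture described above relies on.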
Speakers for this session
Jan Kaßel is pursuing his Master’s degree in Computer Science at Leipzig University, Germany. Jan’s work on Hyperwell, his thesis project, concerns digital ownership questions by introducing a local-first, Peer-to-Peer system supporting W3C standards.
Dr. Thomas Koentges is an Assistant Professor in Digital Humanities at Leipzig University, Germany. Thomas is a driven researcher and teacher who likes to advance digital tools in Classics and the Arts.
ArkeoGIS, feedback on the difficulty and interest of sharing digital data from archaeology
After more than a decade online, the ArkeoGIS project illustrates the benefits of data sharing. Thanks to free software bricks, and with the precious help of the TGIR HUMA-NUM of the CNRS, this spreadsheet sharing platform has shown its efficiency. Each user can freely choose his language, his chronology and the data he wishes to share. With more than 100 database extracts from professionals, research grants and advanced students, the tool now offers more than 100,000 items of spatialized information about the past, in the Upper Rhine and also worldwide according to users’ needs. In this contribution, the good practices, obstacles and accelerators of data sharing by archaeologists and (paleo-)environmentalists within the ArkeoGIS platform will be discussed, with the hope of generating more sharing in the digital humanities.
Speaker for this session
Loup BERNARD has been teaching at the University of Strasbourg since 2007. His work as an archaeologist on the Celtic settlements and territories in Provence and Southern Germany led him to develop an online GIS, ArkeoGIS. It is a free, multilingual, multichronological online tool. The platform allows the aggregation and querying of database extracts in order to share existing data between archaeologists and paleoenvironmentalists. After over a decade of use, ArkeoGIS has proved helpful for data sharing and for linking open spatialized data, and now offers over 100 different datasets to professional researchers.
The Grammateus Project: from Ancient Greek Papyri to a Web Application
This paper describes the workflow of the Grammateus project, from gathering data on Greek documentary papyri to the creation of a web application. The first stage is the selection of a corpus and the choice of metadata to record: papyrology specialists gather data from printed editions, existing online resources and digital facsimiles. In the next step, this data is transformed into the EpiDoc standard of XML TEI encoding, to facilitate its reuse by others, and processed for HTML display. We also reuse existing text transcriptions available on http://papyri.info/. Since these transcriptions may be regularly updated by the scholarly community, we aim to access them dynamically. Although the transcriptions follow the EpiDoc guidelines, the wide diversity of the papyri as well as small inconsistencies in encodings make data reuse challenging. Currently our data is available on an institutional GitLab repository, and we will archive our final dataset according to the FAIR principles.
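As a rough illustration of the step from TEI-encoded transcription to HTML display, the sketch below reads the edition lines of a tiny EpiDoc-style fragment and renders them as an HTML list. It is not the Grammateus code: the sample fragment is invented for the example, the function names are hypothetical, and only the simplest case (plain text following each line-break element) is handled, whereas real transcriptions contain nested markup that would need dedicated treatment.

```python
import html
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented two-line sample, not taken from any actual papyrus edition.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div type="edition">
      <ab><lb n="1"/>χαίρειν<lb n="2"/>ἔρρωσο</ab>
    </div>
  </body></text>
</TEI>"""


def edition_lines(tei_xml):
    """Return (line number, text) pairs from the edition division.

    Only the text directly following each <lb/> is collected; text inside
    nested elements is ignored in this simplified sketch.
    """
    root = ET.fromstring(tei_xml)
    lines = []
    for ab in root.iterfind(".//tei:div[@type='edition']//tei:ab", TEI_NS):
        for lb in ab.iterfind("tei:lb", TEI_NS):
            lines.append((lb.get("n"), (lb.tail or "").strip()))
    return lines


def to_html(lines):
    """Render the lines as an ordered list, escaping text for HTML."""
    items = "\n".join(
        f'  <li value="{html.escape(n or "")}">{html.escape(text)}</li>'
        for n, text in lines
    )
    return f"<ol>\n{items}\n</ol>"


print(to_html(edition_lines(SAMPLE)))
```

The small inconsistencies in encoding mentioned above are exactly what makes such a conversion harder in practice: every variation in how lines, gaps or supplied text are marked up has to be anticipated by the processing code.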
Speaker for this session
Elisa Nury is a postdoctoral researcher at the University of Geneva for the Grammateus project on Greek documentary papyri. In 2018, she completed a Ph.D. in Digital Humanities at King’s College London, UK, on the topic of automated collation tools and digital critical editions. She graduated from the University of Lausanne, Switzerland, with a specialisation in History of the Book and Critical Edition. Her research interests include Latin literature, digital humanities and digital scholarly editing.
Absorbed in Goodreads. A Computational Approach for the Study of Online Social Reading
We present our method and interim results of the “Mining Goodreads” project, aimed at developing a computational approach to measure reading absorption in user-generated book reviews in English. Annotation of 600 texts showed how difficult it is to reach agreement in the tagging of sentences. However, the parallel work of five annotators offered the opportunity to distant read the language used by reviewers when talking about absorption. Machine learning approaches were applied to the annotated corpus, producing promising results.
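One common way to quantify agreement among several annotators is Fleiss' kappa. The abstract does not say which measure the project used, so the sketch below is an assumption offered only to illustrate the idea; the function name and the toy counts (five annotators, two hypothetical labels) are invented for the example.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table of per-item category counts.

    counts[i][j] is the number of annotators who assigned item i to
    category j; every row must sum to the same number of annotators.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    # Observed agreement: share of agreeing annotator pairs, averaged over items.
    p_observed = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items

    # Chance agreement, from the overall proportion of each category.
    n_categories = len(counts[0])
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in proportions)
    if p_expected == 1:
        raise ValueError("kappa is undefined when only one category is used")

    return (p_observed - p_expected) / (1 - p_expected)


# Toy data: 4 sentences, 5 annotators, columns = (label A, label B).
toy_counts = [[5, 0], [0, 5], [3, 2], [2, 3]]
print(round(fleiss_kappa(toy_counts), 3))
```

Values near 1 indicate agreement well above chance and values near 0 indicate agreement no better than chance, so split rows such as the last two in the toy table pull the score down.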
Speakers for this session
Simone Rebora holds a PhD in Foreign Literatures and Literary Studies (University of Verona) and a BSc in Electronic Engineering (Polytechnic University of Torino). Currently, he works as a research fellow at the Digital Humanities Lab of the University of Basel and he teaches comparative literature at the University of Verona. His main research interests are theory and history of literary historiography and reader response studies. In the field of digital humanities, he focused on tools and methods like OCR, stylometry, and sentiment analysis.
Piroska Lendvai studied Anglo-Saxon and Slavic philology (Pécs, Hungary). Being inspired by seminal science fiction literature such as 'Blade Runner' and 'Neuromancer', she decided to do a PhD in a field that combines languages and AI - natural language processing (Tilburg, Netherlands). After spending several years on training algorithms to understand what people do when they must talk to a machine, for example about train connections ('Computer, I never told you I wanted to go to Mügli am See! Restart again!'), or when they must sort zoology metadata into museum databases ('Computer, 'dead humid leaves' should be the finding place of this animal, not the cause of its death!'), she decided to go back to her roots, and has taken up a position at the Research Institute for Linguistics (Budapest, Hungary). Currently, she is affiliated to the Digital Humanities Lab of the University of Basel (Switzerland), where she supports research in Humanities and Social Sciences via tools and approaches from language technology.
Moniek’s research focuses mainly on absorbing reading experiences. She did her PhD in empirical literary studies at Utrecht University in the Netherlands, investigating the textual features that can lead to absorption during reading. Her first post-doc at the Max Planck Institute for Empirical Aesthetics in Frankfurt, Germany, focused on the personality traits that predict absorbed reading and the eye movement correlates of absorbed reading. She is currently a post-doctoral researcher at the Digital Humanities Lab in Basel, Switzerland, where she is a PI on an SNSF-funded “Digital Lives” research project teaching machine learning algorithms to detect instances of absorption in online reader reviews. She is also a board member of IGEL (the International Society for the Empirical Study of Literature) in charge of their training school program, and a member of PALA (Poetics and Linguistics Association) and E-READ (Evolution of Reading in the Age of Digitization). Some of her other research interests involve (absorbed) reading and well-being, bibliotherapy, story literacy, reading habits, and psychometrics.
Digital Critical Edition of Apocryphal Literature: Sharing the Pipeline
The emerging field of Digital Scholarly Editing, concerned with the application of the digital paradigm to textual criticism, offers a range of software solutions for assisting critical editors in their task of documenting textual variation. But how to go from a set of disparate tools and resources to an integrated pipeline, where the data travels seamlessly from one format to another while meeting the requirements of each component? In this paper, we explain how we build and share an integrated processing pipeline that takes us from manuscript transcriptions in TEI XML format to a graph representation of textual variation, which constitutes the basis for the editorial work and the creation of the actual edition. With Docker Compose as the only technical prerequisite, running the pipeline is only one command away: the environments needed to run each software component are set up automatically, the processing begins, and at the end, a web server is launched which displays the automatically‑built variant graphs ready for manual analysis using a dedicated online tool, Stemmaweb. This is an example of how technological advances are exploited to alleviate the technical burden put on editors.
Speaker for this session
Violeta Seretan is a Senior Researcher in Digital Humanities at the University of Lausanne and a member of the ENLAC SNSF-funded project on digital critical editing of apocryphal literature (2017-2021). She earned an M.Sc. in Computer Science from the University of Iasi and a Ph.D. in Computational Linguistics from the University of Geneva. Between 2002 and 2010, she conducted research in Natural Language Processing at the Department of Linguistics of the University of Geneva, with a focus on the automatic identification of phraseological units and their treatment in syntactic parsing and machine translation. Her monograph “Syntax‑Based Collocation Extraction” was awarded the Latsis Prize, the most prestigious academic distinction in Switzerland. After a post-doctoral fellowship at the Institute for Language, Cognition and Computation of the University of Edinburgh, in 2011 she joined the Department of Translation Technology of the University of Geneva, where she conducted research on pre-editing and post-editing for machine translation in the framework of the ACCEPT European project, of which she was a coordinator. She has taught extensively and authored more than 50 publications on topics related to human language technology and digital humanities, including syntactic parsing, machine translation, lexical acquisition, text simplification, and digital scholarly editing.
The WoPoss Workflow: the Semantic Annotation of Modality in a Diachronic Corpus
The FNS project A world of possibilities (WoPoss) studies the evolution of modal meanings in the Latin language. Passages expressing modal notions such as ‘possibility’ and ‘necessity’ are annotated following a pipeline that combines both automatic and manual annotation. This paper discusses the creation, annotation and processing of the WoPoss corpus. Texts are first gathered from different online open access resources to create the initial dataset. Due to the heterogeneity of formats and encodings, these texts are regularized before the application of an automatic linguistic annotation. The annotated files are then uploaded to the annotation platform INCEpTION. Through this platform, annotators add the relevant linguistic and semantic information following the WoPoss guidelines. The results of the automatic annotation are also curated. The fine-grained semantic annotation is the core activity of the WoPoss workflow; this paper therefore focuses on the preparation of files and on how the semantic annotation task is tackled.
Speakers for this session
Francesca Dell’Oro is SNSF assistant professor of diachronic linguistics at the University of Lausanne and the PI of the WoPoss SNSF project on modalisation paths in the Latin language. She is a historical linguist with a strong interest in semantics. Her research and expertise range from the linguistic analysis of ancient documents in their material context and the epistemological analysis of key concepts in the history of linguistics to the development of complementary methods which combine computational tools and philological expertise.
Helena Bermúdez Sabel holds a PhD in Medieval Studies from the Universidade de Santiago de Compostela (2019). Her doctoral research focused on the exploration of digital scholarly editing models for the study of linguistic variation. She is currently a postdoctoral researcher in the SNSF project A World of Possibilities (WoPoss) hosted at the University of Lausanne. Before taking up this position, she participated in projects at the intersection of computational methods and philology. Helena is proficient in Semantic Web and XML technologies. She is especially interested in data modeling and the formalization of annotation schemes.
Paola Marongiu is a PhD student in Linguistics at the University of Lausanne, Section des Sciences du Langage et de l’Information, and Assistant in the FNS project A world of Possibilities (WoPoss). Her research interests include Theoretical Linguistics, with a focus on Latin and Modality, Digital Humanities and Computational Linguistics. In her PhD thesis she performs a quantitative and qualitative analysis of the co-occurrence of modal markers in Latin, employing computational methods and resources. In 2018 she obtained a Master’s degree in Theoretical and Applied Linguistics from the University of Pavia, with a thesis focused on the conversion of the Index Thomisticus Treebank into the Universal Dependencies annotation style. She has a Bachelor’s degree in Modern Studies, obtained from the University of Bologna in 2016.