Digitale Edition in Österreich. Digital Scholarly Edition in Austria.

Publisher: Books on Demand
Category: Specialist literature
Series: Schriften des Instituts für Dokumentologie und Editorik
Language: German

Between 2016 and 2020 the federally funded project "KONDE - Kompetenznetzwerk Digitale Edition" created a network of collaboration between Austrian institutions and researchers working on digital scholarly editions. With the present volume the editors provide a space where researchers and editors from Austrian institutions could theorize on their work and present their editing projects. The collection creates a snapshot of the interests and main research areas regarding digital scholarly editing in Austria at the time of the project.

Pages: 362

Year of publication: 2023

Edited by:

Bernhard Assmann

Roman Bleier

Alexander Czmiel

Stefan Dumont

Oliver Duntze

Franz Fischer

Christiane Fritze

Ulrike Henny-Krahmer

Frederike Neuber

Christopher Pollin

Malte Rehbein

Torsten Roeder

Patrick Sahle

Torsten Schaßan

Gerlinde Schneider

Markus Schnöpf

Martina Scholger

Philipp Steinkrüger

Nadine Sutor

Georg Vogeler

Preface

Einleitung – Introduction

Georg Vogeler Einleitung: Gibt es eine österreichische Editionskultur?

Methodische Aufsätze – Methodological essays

Tara L. Andrews Where are the Tools? The Landscape of Semi-Automated Text Edition

Peter Hinkelmanns Editionen und Graphentechnologie: Vorteile und Hürden digitaler Editionstechniken abseits von TEI-XML

Claudia Resch Digitale Editionen aus der Perspektive von Expert:innen und User:innen — Rezensionen der Zeitschrift RIDE im Meta-Review

Markus Ender, Joseph Wang Experience with a Workflow using MS Word and a DOCX to TEI Converter

Bernhard Oberreither A Linked Data Vocabulary for Intertextuality in Literary Studies, with some Considerations Regarding Digital Editions

Daniel Schopper, Thomas Wallnig, Victor Wang “Don’t Worry, We’re also Doing a Book!” — A Hybrid Edition of the Correspondence of Bernhard and Hieronymus Pez OSB

Sarah Lang Digital Scholarly Editions of Alchemical Texts as Tools for Interpretation

Martina Bürgermeister Versionierung von digitalen Editionen in der Praxis

Projektberichte – Project reports

Tara L. Andrews Die Chronik des Matthäus von Edessa (Matt'ēos Urhayec'i) Online

Roman Bleier, Eva Ortlieb, Florian Zeilinger Der Regensburger Reichstag 1576 — digital

Astrid Böhm, Julia Eibinger, Helmut W. Klug, Christian Steiner CoReMA — Cooking Recipes of the Middle Ages

Artur R. Boelderl MUSIL ONLINE – interdiskursiver Kommentar

Ingo Börner, Vanessa Hannesschläger, Isabel Langkabel, Katharina Prager Intertextuality in the Legal Papers of Karl Kraus: A Scholarly Digital Edition

Ulrike Czeitschner travel!digital

Ursula Doleschal, Lisa Rieger Zweisprachiger Spracherwerb: Longitudinalstudie anhand schriftlicher Texte der Hermagoras-Volksschule — Digitalisierung, Auszeichnung, Auswertung

Claudia Dürr, Wolfgang Straub Kommentierte Werkausgabe Werner Kofler (Prosa, Lyrik, Radio, Film, Theater)

Markus Ender Kommentierte Online-Edition des Gesamtbriefwechsels Ludwig von Ficker

Konstanze Fliedl, Ingo Börner, Anna Lindner, Marina Rauchenbacher, Isabella Schwentner Arthur Schnitzler — Kritische Edition (Frühwerk) III

Desiree Hebenstreit, Laura Tezarek, Christiane Fritze, Christoph Steindl Andreas Okopenko: Tagebücher aus dem Nachlass (Hybridedition)

Peter Hinkelmanns, Katharina Zeppezauer-Wachauer Mittelhochdeutsche Begriffsdatenbank (MHDBDB)

Mario Klarer, Aaron Tratter, Hubert Alisade Ambraser Heldenbuch: Transcription and Scientific Dataset

Carina Koch, Lisa Brunner, Anna Huemer, Christoph Würflinger Digitale Edition und Analyse der Medialität diplomatischer Kommunikation: Kaiserliche Gesandte in Konstantinopel in der Mitte des 17. Jahrhunderts

Philipp Koncar, Roman Bleier InCritApp — Interactive Critical Apparatus

Sarah Lang, Ursula Gärtner Grazer Repositorium antiker Fabeln (GRaF, 2017–2019)

Verena Lorber, Joseph Wang-Kathrein Franz und Franziska Jägerstätter Edition

Oliver Matuschek, Christopher Pollin, Lina Maria Zangerl Stefan Zweig digital

Frederike Neuber Stefan George Digital

Helmut Neundlinger, Selina Galka Karl Wiesinger: Digitale Edition der Tagebücher (1961–1973)

Werner Petermandl, Elisabeth Steiner Celtic Divine Names in the Inscriptions of the Roman Province Germania Inferior

Claudia Resch ABaC:us — Austrian Baroque Corpus

Claudia Resch, Nora Fischer, Dario Kampkaspar, Daniel Schopper DIGITARIUM — Das Wien[n]erische Diarium digital

Sabine Seelbach Virtuelle Benediktinerbibliothek Millstatt

Thomas Wallnig Die gelehrte Korrespondenz der Brüder Pez (Hybridedition)

Appendix

Biographical Notes

Publications of the Institute for Documentology and Scholarly Editing / Schriftenreihe des Instituts für Dokumentologie und Editorik

Preface

The present volume, Digital Scholarly Edition in Austria, is a collection of essays that originated in the context of the federally funded project “KONDE - Kompetenznetzwerk Digitale Edition”, which was conducted at the University of Graz from 2016 to 2020 and included partners from 8 Austrian organisations ranging from literary archives to universities. The overall focus of the project was on digital scholarly editing, but its goals were manifold, reflecting the diverse backgrounds of the partner organisations. One major goal was the development of best-practice solutions for publication platforms in which digital scholarly editions are closely integrated into repositories for long-term preservation. Another was the development of a strategic concept for bundling competences in order to establish a national infrastructure for digital scholarly editions that meets the constantly changing requirements of modern scholarly editing and research. This includes the development and maintenance of tools for processing digital material and their systematic evaluation. One major output of the project was the Weißbuch Digitale Edition, which can be seen as an introductory tool for scholars and students who want to venture into the world of digital scholarly editing.

With the present volume the editors provided a space where researchers and editors from Austrian institutions could theorize on their work and present their editing projects. The collection creates a snapshot of the interests and main research areas regarding digital scholarly editing in Austria at the time of the project.

We are grateful to our colleagues in the scholarly network KONDE. Their interests and expertise provide the context of this publication. Special thanks are due to the staff of the Centre for Information Modelling at the University of Graz and the members of the IDE who supported the publication and, in many cases, contributed to it as peer reviewers or authors. Furthermore, we want to thank all authors and peer reviewers for their professional cooperation during the publication process; the editors alone are to blame for any delay. We also want to thank the many people involved in creating the present volume: Karin Kranich and Nicholas Martin for language corrections and formal suggestions, Bernhard Assmann and Patrick Sahle for support and advice during the typesetting and publication process, Stefan Dumont for creating the cover, and Elisabeth Raunig for verifying this volume’s referenced URLs and, where possible, archiving them in December 2022 on the Internet Archive Wayback Machine (archive.org).

Graz, January 2023, the editors

Einleitung

Introduction

Einleitung: Gibt es eine österreichische Editionskultur? (Introduction: Is There an Austrian Editing Culture?)

Georg Vogeler

International research on the “digital edition” is lively. Only recently, Christopher Ohge (2021) published a volume that attempts to describe the topic as a problem of publication technologies. Andreas Oberhoff (2021) proposes ways in which the specific relationship between change and referenceability that is inherent in a digital edition could be realised technologically. A Swiss survey of the field has been published by the Schweizerische Akademie der Geistes- und Sozialwissenschaften (2021). The role of XML/TEI as the standard of digital editing has been challenged by various contributions on integrating digital editions into the Semantic Web and on using graph technologies as tools for digital editions (Spadini, Tomasi and Vogeler 2021). Various technical solutions are spreading as standardised tools for publishing digital editions (e.g. EVT [1] or TEI-Publisher [2]). Automatic transcription methods (HTR) are increasingly entering the planning and execution of digital editions, although their methodological place has not yet been conclusively determined (e.g. Beloborodova, Dillen and Schäuble 2018, 9). Research always discusses the topic in dialogue with the debates on scholarly editing in the philologies, for which the introduction to digital editing by Mancinelli and Pierazzo (2020) is a good example, whereas other disciplines, such as history, have taken up the problem only recently (Vogeler, Pollin and Bleier 2022). Mancinelli and Pierazzo explicitly discuss the question of a national editing culture in contrast to international research on digital editions (17). Is there, then, a national Austrian editing culture?

This volume brings together research results from a project funded by the Austrian Federal Ministry of Education, Science and Research. The “Kompetenznetzwerk ‘Digitale Edition’” (KONDE) had set itself the goal of connecting the Austrian actors in the field of digital editing in such a way that synergies become visible, the partners stimulate one another through exchange, and national infrastructures emerge. The present volume is one result of the project. Others are available at https://digitale-edition.at, in particular the “Weißbuch ‘Digitale Edition’” (Klug 2021), which introduces 219 terms relevant to digital editing, links them to one another and illustrates them with 25 project descriptions. In its methodological part, the present volume collects scholarly contributions by colleagues from the partner projects, and it carries the project descriptions of the digital resource over into the Gutenberg Galaxy.

The methodological part addresses fundamental questions. In 2017 I attempted to give an overview of the components that make up digital editing and of how far technical development has progressed in each of these fields (Vogeler 2019). In a talk given that year I tried to condense this into the following diagram (Figure 1):

Figure 1: Schematic representation of the components of a digital edition.

Following Patrick Sahle, a digital edition is understood here as the “critical representation of historical documents under a digital paradigm” (Sahle 2016). The historical documents are the source of the digital edition, which is prepared, archived and published so that it is then available for further analysis and use. The blue boxes of the diagram thus stand for technical systems that represent particular stages in the creation of the digital edition. The data produced in the process are transferred from one of these components to the next, unless the components coincide in an integrated system. The orange blocks denote activities in which data (green) are created or modified: the digital documentation of the sources in digitised images and object metadata; the results of transcription and textual criticism in the text data with relevant metadata; and the annotation and the results of textual criticism in annotation data with relevant metadata. In the data capture environment the actual editorial work is documented, and in a “staging” environment first views and user interfaces are generated that are meant to support the editorial work. These need not be identical to the form in which the digital edition is made available to the public, for instance when the preview contains organisational functionality such as status notes or validation hints. The diagram could be exemplified by many edition projects and technical solutions: ediarum [3], for example, supports transcription and annotation and includes a preview that can also be used as a publication environment. The integrated system GAMS (Stigler and Steiner 2018) [4] combines publication and archiving. Workflow management systems such as Kitodo focus on documenting the source. Publication frameworks such as the Edition Visualisation Toolkit (EVT) (Rosselli del Turco and di Pietro 2019) or the TEI-Publisher concentrate on supporting use through flexible and powerful graphical interfaces. The contributions to this volume discuss various aspects of the diagram.

The components of the schema are realised in various technical tools that are under continuous development but that, taken together, appear to cover the entire field (Vogeler 2019). [5] In her contribution, Tara Andrews takes up this practical approach to the digital edition and, based on the current state of the art, tries to offer orientation on which tools could be used in a typical process of editorial work, thereby supporting the planning of a digital edition. She concentrates on tools developed in Austria that are good examples of general problems, in particular the dependence of the choice of tool on the concepts applied in the edition at hand. Tools such as the Classical Text Editor, Transkribus, Recogito and the DSE-Baseapp originated in Austria and support editors in different tasks. All of them, however, come with properties that force decisions. These decisions rest with the editors and follow from the specifics of the individual edition.

In recent years, the use of graph technologies in digital editions has increasingly been tested. Only recently, Elena Spadini, Francesca Tomasi and Georg Vogeler (2021) published a collection of contributions on the topic. Peter Hinkelmanns gives an overview of these technologies in the field of digital editions, which offers a first guide to their use. Graph technologies are used in digital editing particularly for overlapping structures, variant graphs and semantic annotations, and, according to Hinkelmanns, can be a good choice for digital editions in combination with XML/TEI.
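To make the notion of a variant graph more concrete, the following minimal sketch builds one in Python. It is only an illustration under strong assumptions, not a method taken from Hinkelmanns's contribution: the witnesses are assumed to be tokenised and already aligned column by column (real collation software computes this alignment), gaps are marked with None, and the function name build_variant_graph and the sample readings are invented for this example.

```python
from collections import defaultdict

START = ("#START", -1)  # artificial entry node shared by all witnesses
END = ("#END", -1)      # artificial exit node shared by all witnesses


def build_variant_graph(witnesses):
    """Build a variant graph from pre-aligned witnesses.

    witnesses maps a siglum to a list of tokens of equal length,
    where None marks a gap in that witness (assumed input format).
    A node is a (token, column) pair, so witnesses that share a
    reading in the same column share the node. The result maps each
    edge (source node, target node) to the set of sigla that follow it.
    """
    edges = defaultdict(set)
    for siglum, tokens in witnesses.items():
        previous = START
        for column, token in enumerate(tokens):
            if token is None:  # this witness has no reading here
                continue
            node = (token, column)
            edges[(previous, node)].add(siglum)
            previous = node
        edges[(previous, END)].add(siglum)
    return dict(edges)


# Three hypothetical witnesses of the same short passage.
graph = build_variant_graph({
    "A": ["the", "black", "cat"],
    "B": ["the", "white", "cat"],
    "C": ["the", None, "cat"],
})
```

Shared readings collapse into single nodes, while the edge labels record which witnesses take which path. This is the property that makes the structure awkward to express in a strict XML hierarchy and natural to express in a graph.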

Claudia Resch approaches digital editions from the user’s perspective through reviews, which she analyses with corpus-linguistic methods. In this way she is able to identify patterns of dissatisfaction and suggestions for improvement in the 35 texts she examined from the Review Journal of the IDE (RIDE). Six points of criticism recur: a lack of theoretical positioning, too little transparency about editorial procedures, weaknesses in the interface, inadequate search functions, too small a choice of display formats, and missing access to the raw data. Claudia Resch proposes involving users early in decisions about the presentation of the edition, but also points to a slow change of awareness among editors regarding their willingness to make source data accessible. She considers it likely that this already reflects the effect of six years of RIDE reviews.

In an Austrian project, a pipeline of XSLT transformations has been developed as an editing environment, with which Joseph Wang and Markus Ender extract XML/TEI from DOCX documents: “DOCX2TEI”. The method differs from the approach common in the TEI community, OxGarage [6], in that it can process project-specific text structures and semantic annotations in the form of Word comments. With DOCX2TEI they thus turn MS Word into a project-specific XML editor that builds a bridge between the subject specialists on the project team and the technical infrastructure of a digital edition.
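As a rough illustration of reading semantic annotations out of Word comments, the following sketch uses Python rather than the XSLT pipeline described above; it is not the DOCX2TEI implementation. It assumes a hypothetical project convention in which the text of each Word comment is simply the name of the TEI element to wrap around the commented span (for example persName). All function names are invented for this example, and nested or overlapping comments are not handled.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used in word/document.xml and word/comments.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def comments_by_id(comments_xml):
    """Map each comment id to its text (assumed to be a TEI element name)."""
    root = ET.fromstring(comments_xml)
    result = {}
    for comment in root.findall(f"{{{W}}}comment"):
        text = "".join(t.text or "" for t in comment.iter(f"{{{W}}}t"))
        result[comment.get(f"{{{W}}}id")] = text.strip()
    return result


def append_text(parent, target, text):
    """Append text to target, respecting ElementTree's text/tail model."""
    if target is not parent:
        target.text = (target.text or "") + text
    elif len(parent):
        parent[-1].tail = (parent[-1].tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def paragraph_to_tei(paragraph, comments):
    """Turn one w:p element into a <p>, wrapping commented spans."""
    p = ET.Element("p")
    target = p  # element that currently receives text
    for node in paragraph.iter():
        if node.tag == f"{{{W}}}commentRangeStart":
            # The comment text must be a valid XML name under this convention.
            target = ET.SubElement(p, comments[node.get(f"{{{W}}}id")])
        elif node.tag == f"{{{W}}}commentRangeEnd":
            target = p
        elif node.tag == f"{{{W}}}t":
            append_text(p, target, node.text or "")
    return p


def docx_to_tei_paragraphs(path):
    """Read a DOCX file that contains comments and convert its paragraphs."""
    with zipfile.ZipFile(path) as docx:
        document = ET.fromstring(docx.read("word/document.xml"))
        comments = comments_by_id(docx.read("word/comments.xml"))
    return [paragraph_to_tei(p, comments) for p in document.iter(f"{{{W}}}p")]
```

The point of such a design is that editors keep working in a familiar word processor, while the conversion step decides how their comments map onto project-specific markup.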

The fact that digital editing practice depends on the data models used lends particular significance to Bernhard Oberreither’s proposal for a formal description of intertextual relations. Intertextuality is at the core of the work of commentary that seeks to identify the “sources” of a text. INTRO, the Intertextual Relationships Ontology, tries to strike a pragmatic balance between simple basic concepts and complex abstractions, and in particular to take into account that establishing intertextual relations is an act of interpretation on the part of the editor.

Patrick Sahle’s definition of the digital edition (2013, 2016) distinguishes editions in the digital paradigm, among other things, by the fact that they have properties that could not be realised in print. Daniel Schopper, Thomas Wallnig and Victor Wang propose a way of combining both media worlds in hybrid editions. Using the example of the cooperation between the edition of the correspondence of the Pez brothers and the Viennese publisher Böhlau, they show that a printed edition as a derivative of a digital edition makes good sense. They start from usage scenarios in which the physical object “book” is superior to the digital medium and propose a scenario in which these properties are integrated into a complex infrastructure of digital archive and graphical user interface. To make the infrastructure reusable, the Pez edition commits to one data model and derives from it a strategy for normalising future data sets. In this way, the tasks of publishing an edition are distributed between publisher and research data repository.

Digital editions can be part of the automatic processing of text. Sarah Lang discusses which text-mining methods could be used to study the texts of the hieroalchemist Michael Maier. These methods have to enrich the texts semantically, since pure bag-of-words methods are not sufficient to see through the alchemists’ strategies of concealment. By linking to formal knowledge resources, the so-called “Decknamen” (cover names) can be contextualised and thus made comprehensible. This shows once again how important the critical, interpretive work named in Patrick Sahle’s definition quoted above is for the understanding of digital editing.

Martina Bürgermeister discusses the consequences of the mutability of digital editions. Her evaluation of existing versioning practices identifies four strategies, which can also interlock: summary documentation in a change description, documentation of individual changes in the revision descriptions of the individual documents, technical version control during data creation, and versioning of the system as a whole. The strategies she observes, however, are applied in only a quarter of the digital editions examined. This demonstrates the reach of Martina Bürgermeister’s findings: as much as documenting changes supports the traceability of research and the citability of digital editions, and thereby strengthens trust in digital forms of edition, it is just as necessary to establish versioning practices in the first place.
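The second of these strategies, documenting individual changes in a document’s revision description, can be sketched as follows. This is a minimal illustration in Python, assuming a TEI document whose header may or may not already contain a revisionDesc element; the function name record_change, the sample dates and the responsibility pointer #editor are invented for this example and do not come from Bürgermeister’s study.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI)  # serialise TEI elements without a prefix


def record_change(tei_root, when, who, description):
    """Add a <change> entry to the revisionDesc of a TEI document.

    The revisionDesc is created at the end of the teiHeader if it is
    missing. New entries are inserted first, so that the most recent
    change is listed at the top (a common convention, assumed here).
    """
    header = tei_root.find(f"{{{TEI}}}teiHeader")
    revision = header.find(f"{{{TEI}}}revisionDesc")
    if revision is None:
        revision = ET.SubElement(header, f"{{{TEI}}}revisionDesc")
    change = ET.Element(f"{{{TEI}}}change", {"when": when, "who": who})
    change.text = description
    revision.insert(0, change)
    return change


# A deliberately minimal document skeleton, used only for this sketch.
root = ET.fromstring(
    f'<TEI xmlns="{TEI}"><teiHeader><fileDesc/></teiHeader><text/></TEI>'
)
record_change(root, "2023-01-15", "#editor", "Corrected transcription.")
record_change(root, "2023-02-01", "#editor", "Added commentary notes.")
```

A log of this kind lives inside each document, so it complements rather than replaces technical version control or a versioned release of the whole edition.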

The landscape of digital editing in Austria is mapped by the short descriptions of 25 projects gathered in the second part of this volume. They include editions of “Austrian” authors such as Arthur Schnitzler (Fliedl, Börner, Lindner, Rauchenbacher and Schwentner), Karl Kraus (Börner, Hannesschläger, Langkabel and Prager), Robert Musil (Boelderl), Werner Kofler (Dürr and Straub), Ludwig von Ficker (Ender), Karl Wiesinger (Neundlinger and Galka), Andreas Okopenko (Hebenstreit, Tezarek and Fritze), Franz and Franziska Jägerstätter (Lorber and Wang-Kathrein), or of the Ambraser Heldenbuch (Klarer, Tratter and Alisade). The Pez brothers (Wallnig) are Austrians, but their international connections, like the correspondence of Habsburg diplomats in Constantinople (Koch, Brunner, Huemer and Würflinger), show that the concept of “Austrian authors” does not go very far in describing an Austrian editing landscape. Editions produced in Austria also cover the chronicle of Matthew of Edessa (Andrews), the records of the Imperial Diet of 1576 (Bleier, Ortlieb and Zeilinger), medieval cooking recipes (Böhm, Eibinger and Klug), Stefan George (Neuber) and inscriptions from the Roman province of Germania Inferior (Petermandl and Steiner), whose texts have hardly any explicit connection to Austria.

Finally, the projects also include activities at the margins of what can be understood as a digital edition under Patrick Sahle’s definition: linguistically annotated corpora such as the Austrian Baroque Corpus ABaC:us (Resch) or the corpus of Baedeker travel guides (Czeitschner) are literary and cultural-historical sources of the first rank. The same holds for retro-digitised text collections such as the DIGITARIUM, the digital representation of the oldest issues of the Wiener Zeitung (Resch, Fischer and Kampkaspar). In the Mittelhochdeutsche Begriffsdatenbank the texts are compiled as evidence for lexicographical work (Hinkelmanns and Zeppezauer-Wachauer), and in the Grazer Repositorium antiker Fabeln they are annotated for teaching purposes (Lang and Gärtner). The corpus of written texts from the Hermagoras-Volksschule (Doleschal and Rieger) serves as preparation for linguistic work. Also at the margins of digital editing are the documentation and digitisation of originals, as in the Virtuelle Benediktinerbibliothek Millstatt (Seelbach) or in the documentation of the literary estate “Stefan Zweig digital” (Matuschek, Pollin and Zangerl). The tool for visualising a text-critical apparatus (Koncar and Bleier) is not itself a digital edition but can well become part of one. All these projects represent parts of what I tried to summarise above in the diagram of the components of digital editing, even if they do not always aim at what Sahle’s definition of a digital edition describes. The distinction between “digitisation” and “digital edition”, as discussed by Kenneth Price (2009), Patrick Sahle (2007) and most recently Wout Dillen (2019), remains difficult. It has led, for example, the review journal RIDE, which specialises in the critical presentation of digital editions, [7] to review “text collections” in 2017 and 2018 on the basis of a dedicated catalogue of criteria [8] modelled on the one developed by the IDE for digital editions (Henny-Krahmer and Neuber 2017). The projects also show, however, that problems of demarcation arise in particular because the need for a scholarly reliable digital text representation can also be driven by research interests in which texts are understood differently than in classical editorial philology.

Is there, then, an Austrian editing culture? The methodological considerations discussed in Austria, the techniques and methods applied in the course of the project, the Weißbuch and the code examples in the project repository [9] are by no means limited to Austria. The digital editions and edition-like data sets created in the course of the project have a slight “national” slant thematically, since editions of Austrian authors and textual transmission in Austria naturally have a national component. This component does not yet produce a national discourse, however, because the digital methods are contributions to international research. Yet “national” scholarship is always more a social than an epistemological phenomenon. A national editing culture may therefore well develop out of the collaboration in the project, one that, as in the project itself, stands in fruitful exchange with colleagues outside Austria. The contributions gathered here form a good starting point for thinking about what such an Austrian culture of digital editing could concentrate on: workflows, infrastructures, tools, methods, research interests and the boundaries of digital editing are all up for debate.

References

Beloborodova, Olga, Wout Dillen and Joshua Schäuble. 2018. “CATCH 2020 project, manuscript genetics, digital scholarly editing, Samuel Beckett (1906–1989)”. Slides for the presentation at the Transkribus User Conference, 2018. Accessed 12 December 2021. https://readcoop.eu/wp-content/uploads/2018/11/BELOBORODOVA-DILLEN-SCHAUABLE.pdf.

Dillen, Wout. 2019. “On Edited Archives and Archived Editions.” International Journal of Digital Humanities 1 (2): 263–77. doi:10.1007/s42803-019-00018-4.

Klug, Helmut W., ed. 2021. KONDE Weißbuch. GAMS. 562.50. Graz: Zentrum für Informationsmodellierung. https://hdl.handle.net/11471/562.50.

Mancinelli, Tiziana and Elena Pierazzo. 2020. Che cos’è un’edizione scientifica digitale. Roma: Carocci.

Oberhoff, Andreas. 2021. Digitale Editionen im Spannungsfeld des Medienwechsels: Analysen und Lösungsstrategien aus Sicht der Informatik. Bielefeld: transcript. doi:10.14361/9783839459058.

Ohge, Christopher. 2021. Publishing Scholarly Editions: Archives, Computing, and Experience. Cambridge: Cambridge University Press. doi:10.1017/9781108766739.

Price, Kenneth M. 2009. “Edition, Project, Database, Archive, Thematic Research Collection: What’s in a Name?” Digital Humanities Quarterly 3 (3). Accessed 12 December 2021. http://digitalhumanities.org/dhq/vol/3/3/000053/000053.html.

Rosselli del Turco, Roberto. 2019. “La visualizzazione di edizioni digitali con EVT: una soluzione per edizioni diplomatiche e critiche”. Ecdotica 16: 148–73. doi:10.7385/99301.

Sahle, Patrick. 2016. “What Is a Scholarly Digital Edition?” In Digital Scholarly Editing: Theories and Practices, ed. by Matthew James Driscoll and Elena Pierazzo, 19–40. Cambridge: Open Book Publishers. doi:10.11647/OBP.0095.02.

Sahle, Patrick. 2013. Digitale Editionsformen. Zum Umgang mit der Überlieferung unter den Bedingungen des Medienwandels. Norderstedt: Books on Demand.

Sahle, Patrick. 2007. “Digitales Archiv – Digitale Edition. Anmerkungen zur Begriffsklärung”. In Literatur und Literaturwissenschaft auf dem Weg zu den neuen Medien. Eine Standortbestimmung, ed. by Michael Stolz et al., 64–84. Zürich: germanistik.ch.

Schweizerische Akademie der Geistes- und Sozialwissenschaften, ed. 2021. “Edieren: Geisteswissenschaften im digitalen Wandel | Éditer: les sciences humaines en mutation”. Bulletin der Schweizerischen Akademie der Geistes- und Sozialwissenschaften (SAGW-Bulletin) 27 (3). Bern: SAGW. doi:10.5281/zenodo.5716099.

Spadini, Elena, Francesca Tomasi and Georg Vogeler, eds. 2021. Graph Data-Models and Semantic Web Technologies in Scholarly Digital Editing. Norderstedt: Books on Demand.

Stigler, Johannes Hubert and Elisabeth Steiner. 2018. “GAMS – An Infrastructure for the Long-Term Preservation and Publication of Research Data from the Humanities”. Mitteilungen der Vereinigung Österreichischer Bibliothekarinnen und Bibliothekare 71 (1): 207–16. doi:10.31263/voebm.v71i1.1992.

Vogeler, Georg, Christopher Pollin and Roman Bleier. 2022. “‘Ich glaube, Fakt ist…’: der geschichtswissenschaftliche Zugang zum digitalen Edieren”. In Digital History. Konzepte, Methoden und Kritiken Digitaler Geschichtswissenschaft, ed. by Karoline Dominika Döring, Stefan Haas, Mareike König and Jörg Wettlaufer. Berlin, Boston: De Gruyter.

Vogeler, Georg. 2019. “Digitale Editionspraxis. Vom pluralistischen Textbegriff zur pluralistischen Softwarelösung”. In Textgenese in der digitalen Edition, ed. by Anke Bosse and Walter Fanta, 117–36. Berlin, Boston: De Gruyter. doi:10.1515/9783110575996-008.

[1] http://evt.labcd.unipi.it.

[2] https://teipublisher.com.

[3] https://www.ediarum.org.

[4] https://gams.uni-graz.at/doku.

[5] See also the tools section of the KONDE Weißbuch, or the overview of text-processing tools in the teaching resource forText (Literatur digital erforschen, Hamburg and Darmstadt 2016-, https://fortext.net).

[6] https://github.com/TEIC/oxgarage.

[7] https://ride.i-d-e.de.

[8] https://ride.i-d-e.de/issues/issue-6/, https://ride.i-d-e.de/issues/issue-8 and https://ride.i-d-e.de/issues/issue-9.

[9] https://github.com/KONDE-AT.

Methodische Aufsätze

Methodological essays

Where are the Tools? The Landscape of Semi-Automated Text Edition

Tara L. Andrews

Abstract

The aim of this article is to answer the question: given that there have been so many tools and methods developed to help prepare scholarly critical editions of texts, why do so many scholars have trouble knowing where to start? The article walks the reader through the typical process of creating an edition, mentioning along the way a variety of tools that have been developed or used in the Austrian landscape in particular, and aims thereby to illustrate many of the considerations that the scholar setting out on an edition project must account for.

Zusammenfassung

Ziel dieses Artikels ist es, eine Antwort auf die Frage anzubieten: Wenn so viele Werkzeuge und Methoden entwickelt worden sind, um kritische Editionen von Texten vorzubereiten, warum finden es dann so viele WissenschafterInnen schwierig zu wissen, wo sie anfangen sollen? Der Artikel führt den Leser durch den typischen Prozess der Erstellung einer Edition, erwähnt dabei eine Vielzahl von Werkzeugen, die insbesondere in der österreichischen Landschaft entwickelt wurden oder verwendet werden, und versucht damit viele der Überlegungen sichtbar zu machen, die zu Beginn eines Editionsprojekts berücksichtigt werden müssen.

As more or less any professor, teaching fellow, or research assistant in the field of Digital Humanities can attest, there is a great deal of interest from scholars in the literary and historical fields about digital editions of texts and how they might be feasibly done. Yet many of these scholars have little idea where to start or what tools exist that are relevant for the particular work they wish to do, despite the fact that textual criticism has been on an increasingly digital trajectory since well before the availability of the World Wide Web. After so many years of development of the digital edition, this seems like a rather odd state of affairs—there are, after all, a plethora of tools available for use (Klug, Galka and Steiner 2021; see also the discussion of several of these tools and methods in Vogeler 2019). Why, then, can it be so difficult to advise scholars about a way forward?

The purpose of this article is to walk the reader through the process of creating a digital edition, from initial transcription to publication and including various methods of source analysis. Along the way we will cover a range of tools that have emerged, especially in Austria, to assist with the creation of digital editions. We must, however, stress the following. Although many textual scholars hope for a single full-featured software package designed to take their editions all the way from conception to publication without the need either to learn their way around computer programming or to hire someone who does, this hope is sadly misplaced. There are almost as many possible forms of digital edition as there are texts to be edited. Every editor will have a different set of priorities for her edition, not to mention a text (or a corpus) that differs from other texts in ways that are perhaps small, but certainly crucial, for the purpose of preparing that edition. While the argument has been made elsewhere that there can be no “monolithic” general-purpose tool (van Zundert and Boot 2011), we hope with this walk-through of digital editing processes to illuminate why this is the case.

Given the high degree of specialization that any edition project must reach, a scholar who wishes to produce a digital edition should expect to exercise a significant amount of control over the process. As such, the scholar will need to know the principles, and the limitations, of the data modelling system that is used to render texts into the digital medium, and will need to understand how and where the choices made for her particular project might differ from the assumptions built into the tools that are available for analysis and publication of the result. In many cases, in fact, different tools take a different set of assumptions as their starting point, and so the editor will eventually need to understand the data models and their associated technologies well enough to assess how, or indeed whether, these differences can be bridged. We contend here that, in order to create a digital edition on time and within budget, a scholar cannot hope to rely entirely on “IT experts” hired for the purpose. She will need to gain enough knowledge, not only about how text encoding is done, but also about what is done with the result of that encoding and the parameters of the technologies that are used to do it, to be able to make informed decisions.

1 From scholarly work to digital model

Texts can be published into the digital medium by a variety of means. The basic requirement for any online publication is that it must be expressed as one or more documents rendered in HTML (the standard format for web documents) and hosted on a server connected to the Internet, under a publicly reachable URL. In order for this publication to be a critical edition, it is only necessary that those documents, in one way or another, contain a faithful representation of the critical text and any apparatuses or other commentary that the editor felt necessary to include.

There are many possible pathways to this end state. For example, it is easy to envision the preparation of the edition in a word processor, where the document is then saved into HTML format and given to a hosting provider. That would result in an online publication that may be “digital enough” for many purposes, but could not be considered “a digital edition” in the sense proposed by Sahle (2016).

A middle ground between the print and the digital can be found with software such as the Classical Text Editor (CTE), developed at the Austrian Academy of Sciences (ÖAW) by Stefan Hagel (1997-). This is a package intended specifically for the creation of critical editions from multiple witness copies. Its user interface is intended to hew as closely as possible to the familiar interface of a word processor, while extending the functionality to provide for the things that an editor will need, such as the definition of sigla to correspond to witnesses, the possibility to record variant readings to the edited text based on those witnesses, and the possibility to add other sorts of scholarly apparatus according to the conventions of classical philology.

Although the primary output of CTE is a print-ready document suitable for submission to a book publisher, it also offers the option of export to a TEI-XML format, which could then be transformed to HTML and published online (see below). This feature, added by CTE’s author in the hope of encouraging the proliferation of companion digital publications by CTE users alongside the more usual print publications, has not had widespread use (Hagel 2007, 78). The author attributes this to a lack of institutional interest in digital publications; while this may still have been true in 2007, it is much less true today, and yet those who call themselves digital philologists still do not usually recommend CTE as a means to produce a digital edition. The reasons for this, we would argue, go deeper than institutional interest.

One of the major design decisions of CTE is to allow the scholar to create a text edition that conforms to her exacting scholarly specifications, but without troubling her with “any sort of surfacing tags or other sorts of ‘code’ [which] can be detrimental” to the ability of the editor to “remain devoted to scholarly questions” (Hagel 2007, 79). While this is an admirable goal, it may actually be part of the problem. CTE encourages the scholar to concentrate primarily on how the edition will look once it is printed. Although the software has a reasonably complex conceptual model, the scholar not inclined to study this model or to familiarise herself with the advanced features of CTE will quickly find that she can produce an edition that “looks right” on the page even when the model is violated internally.

For example, CTE provides a mechanism for recording additions, omissions, and transpositions, and for specifying the abbreviations that should appear in the apparatus criticus when one of these situations arises. The user can, on the other hand, achieve the same outward effect simply by inputting these abbreviations as though the abbreviation were itself the text of a variant. The distinction is invisible in the resulting print proof, but in the companion XML output, incorrect use of the data model will instantly undermine any automated attempts to parse the document for the edition text and the variant witness texts. Several potential inconsistencies of this type were encountered in the attempt to write a parser for CTE’s flavour of TEI-XML, for use with the Stemmaweb service (Andrews n.d.).
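The consequence for an automated parser can be made concrete with a minimal Python sketch. The two apparatus fragments below are hypothetical, simplified TEI-like markup written for this illustration (they are not actual CTE export, and the type attribute value is an assumption); the function name is likewise our own. The point is only that a program reconstructing a witness text cannot tell an editorial abbreviation from a genuine reading once both are stored as text.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified apparatus entries (not real CTE output).
# In MODELLED the omission is recorded structurally: an empty reading.
# In TYPED_AS_TEXT the editor typed the abbreviation "om." as if it were
# the wording of the variant itself.
MODELLED = "<app><lem>in principio</lem><rdg wit='#B' type='omission'/></app>"
TYPED_AS_TEXT = "<app><lem>in principio</lem><rdg wit='#B'>om.</rdg></app>"


def witness_text(app_xml: str, siglum: str) -> str:
    """Return the text that the witness `siglum` carries at this entry."""
    app = ET.fromstring(app_xml)
    for rdg in app.findall("rdg"):
        if siglum in rdg.get("wit", "").split():
            return (rdg.text or "").strip()
    # A witness that is not listed is taken to agree with the lemma.
    return (app.findtext("lem") or "").strip()


print(repr(witness_text(MODELLED, "#B")))       # ''
print(repr(witness_text(TYPED_AS_TEXT, "#B")))  # 'om.'
```

Both entries would print identically in an apparatus criticus, yet in the second case the reconstructed text of witness B contains the spurious word “om.”.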

It is thus clear that, if a user of CTE wishes to produce a useful digital output alongside the print output of an edition, she must take care to understand not only the scholarly features of CTE, but also the data model that underlies what is displayed on the screen. She will soon find that there is a set of technical assumptions embedded in the software about how to represent the data that has been entered by the scholar, and more assumptions about how to translate the text from the CTE model into the TEI double-endpoint-attachment means of encoding variant text. She will then need to draw upon her understanding of the data model as it was expressed in the TEI output, in order to use the tool(s) that she eventually chooses for further analysis of the edited text or for its publication.

2 The typical digital edition

Although the XML export functionality of CTE offers a route to HTML-based online presentation, this might not necessarily be considered a true digital edition. Many scholars follow the definition of “digital edition” offered by Sahle (2016), who argues that “scholarly digital editions are scholarly editions that are guided by a digital paradigm in their theory, method and practice”—for Sahle this also means that a digital edition must offer some significant content and functionality that would be lost in an analogous print edition.

What does this mean in practice? Although Bordalejo (2018) argues, with some justification, that textual scholarship has not actually progressed beyond traditional paradigms on a methodological level, a more or less mainstream set of steps has emerged toward creating something that is commonly acknowledged as a “digital scholarly edition”. It begins with the transcription of sources, usually from digitized images, in which not only the sequence of interpreted text but also the features of the textual expression (such as decorated or otherwise highlighted text, authorial or scribal corrections, marginal notes, and the like) are recorded. The ability to pay close attention to these features of the text brings the scholar immediately beyond the capabilities of word processors, or even the CTE.

Thus, when we speak of a “typical digital edition”, we usually envision the transcription of source text from digitized images, the comparison (if necessary) and analysis (as desired) of these source texts, the production of a commentary usually including information that is linked to specific portions of the text or specific words therein, and the presentation of the whole in an online format, so that a reader (or user) of the edition can view the text and its commentary in whatever form best suits the purpose that the editor had in doing the work.

Several institutions within Austria—the Austrian Centre for Digital Humanities and Cultural Heritage (ACDH-CH) at the Austrian Academy of Sciences, the Centre for Information Modelling—Austrian Centre for Digital Humanities (ZIM-ACDH) at the University of Graz, and the Digitalisierung und elektronische Archivierung (DEA) group at the University of Innsbruck foremost among them—have taken on the task of supporting Austrian researchers by providing training for scholars new to digital methods, offering archiving of the digital sources, producing systems for transcription of digitized images, and providing solutions for publication of the finished result on the Web. Edition projects and the expertise that goes with them are settled at many other Austrian universities, and these are often an early port of call for the scholar who would begin a new digital edition.

The plethora of services around digital editions in Austria brings us to a first important point: although there exist a great variety of tools for various steps in the process of creating a digital edition, there will almost certainly not be a complete suite of tools that will take the scholar from the beginning of her project all the way through to publication, without the intervention of a programmer hired for the purpose or a specialist consultant, unless the scholar is herself conversant enough with the various relevant technologies to provide her own technical support. We can illustrate this point by considering the steps toward the production of such a “typical” edition, reviewing what tools are most helpful for each of these steps, and observing what is still missing.

3 Transcription as an act of data modelling

The first stage of a digital edition is to express the content of each source document in some sort of digital model. This expression usually takes the form of embedded markup, in which semantic information about the text is entered directly alongside the text itself. Although there are a number of options for embedded markup, such as LMNL (Piez 2014) or a system currently in development known as TagML (Haentjens Dekker et al. 2018), by far the predominant toolkit for the expression of this model is the Text Encoding Initiative (TEI). The first step for the vast majority of trainee digital philologists is to become familiar with the TEI Guidelines, and how they are expressed in XML. We must stop short of calling the TEI itself a model; although it defines a vast number of scholarly concepts related to text and its features, many of these concepts are intentionally flexible in their definitions (e.g. text block definitions such as <ab> and <div>), and there are often multiple possible ways to express the same textual feature (e.g. lines of text, which can either be enclosed in a <line> element or separated by an <lb/> [line beginning] milestone element). This is the reason that the authors of the TEI Guidelines advise all editors to produce a custom schema, specific to the needs of the project itself, at its outset.
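To illustrate why such flexibility matters for anyone writing software against TEI documents, the following Python sketch reads the same two lines of text from both encodings. The sample fragments and function names are assumptions made for this illustration, and the milestone reader handles only the flat case in which no other markup intervenes between the <lb/> elements.

```python
import xml.etree.ElementTree as ET

# The same two lines of text, encoded in two ways (illustrative samples).
CONTAINER = "<ab><line>first line of text</line><line>second line of text</line></ab>"
MILESTONE = "<ab><lb/>first line of text<lb/>second line of text</ab>"


def lines_from_containers(xml: str) -> list[str]:
    """Each <line> element encloses the text of one line."""
    root = ET.fromstring(xml)
    return ["".join(line.itertext()).strip() for line in root.iter("line")]


def lines_from_milestones(xml: str) -> list[str]:
    """Each <lb/> marks where a line begins; the text follows the marker.

    Flat case only: the text of the line is assumed to sit directly
    after the milestone, with no nested elements in between.
    """
    root = ET.fromstring(xml)
    return [(lb.tail or "").strip() for lb in root.iter("lb")]


print(lines_from_containers(CONTAINER) == lines_from_milestones(MILESTONE))
```

The two functions share no logic, although they recover the same information. A tool built for one convention will silently return nothing when given the other, which is one practical reason for fixing such choices in a project schema.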

Since the Text Encoding Initiative has based its work almost entirely on the framework of XML and its related technologies, the first consequence for the newly-digital philologist is that, from the very outset of the transcription work, she must learn a great deal about XML. This includes not only the basics of its grammar, but also the subtleties of how that grammar is employed via the TEI Guidelines to create a document that would be considered “valid”. Here too, the editor must understand the technical mechanism by which validity is ensured: this involves the creation and configuration of a custom XML schema using a tool such as Roma (Mittelbach, Rahtz, and Bernevig 2018), for which the editor will need to understand the technical contents of the TEI modules that she wishes to use. If the editor wishes to extend the TEI to deal with features of her text that are not adequately covered in the Guidelines, she will need to have an even better understanding of the principles of schema description in order to add or modify the required elements, attributes, or dependencies.
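The difference between a well-formed document and a valid one can be sketched in a few lines of Python. The rule table below is a deliberately crude stand-in for a real schema: it is our own hypothetical construction, not a TEI customization and not the output of Roma, and an actual project would validate against a RELAX NG or similar schema in its XML editor.

```python
import xml.etree.ElementTree as ET

# Hypothetical project rules: which child elements each element may contain.
# A toy substitute for a real schema, for illustration only.
ALLOWED_CHILDREN = {
    "ab": {"lb", "term", "choice"},
    "choice": {"abbr", "expan"},
    "term": set(),
    "abbr": set(),
    "expan": set(),
    "lb": set(),
}


def violations(xml: str) -> list[str]:
    """List the ways in which a well-formed document breaks the project rules."""
    root = ET.fromstring(xml)  # raises ParseError if not even well-formed
    problems = []
    for parent in root.iter():
        allowed = ALLOWED_CHILDREN.get(parent.tag)
        if allowed is None:
            problems.append(f"unknown element <{parent.tag}>")
            continue
        for child in parent:
            if child.tag not in allowed:
                problems.append(f"<{child.tag}> not allowed in <{parent.tag}>")
    return problems


print(violations("<ab><lb/>otherwise called <term>Quasi Trial</term>.</ab>"))
print(violations("<ab><abbrev>Art.</abbrev> 4.</ab>"))
```

The second document parses without error, so it is well-formed, but it is rejected by the project rules. Deciding what belongs in such a rule set is exactly the knowledge that the editor, rather than the software, has to supply.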

In many cases, the editor will do the transcription using an XML editor (the most commonly used editor is oXygen, though there are open-source alternatives) configured to incorporate her custom schema for validation checking. The very fact that XML editors are the primary tool of choice for creating TEI-XML transcriptions of source texts constitutes strong evidence of the impossibility of providing the comprehensive and user-friendly software tools that scholars so often wish for. Industry programmers, as well as colleagues from the field of computer science, are (in our experience) almost invariably stunned to discover that digital philologists write XML directly in an editor: although XML is a text format that is comprehensible by humans, its syntax is exacting enough that software developers almost never write it directly if they can avoid it. Rather, they expect that data in XML format is generated by some sort of intermediate software, and only edited by hand in extremis. This manual process is, however, almost inevitable when the very schema against which the transcription is checked varies from project to project. It is not uncommon for user-friendly alternatives to be developed within the bounds of individual projects; both the Transcribe Bentham crowdsourcing project (Causer and Wallace 2012) and the New Testament Virtual Manuscript Room (Institut für neutestamentliche Textforschung n.d.) implement WYSIWYG-style transcription interfaces, for example. None of these interfaces, however, makes a claim to widespread utility.

We say it is “almost inevitable” that the editor will write XML by hand—for the scholar who has facsimile images of her sources and wishes to transcribe them line by line with a view to publication of both facsimile and transcription, there are tools available that handle transcription of the text, together with line-by-line linking of transcriptions to facsimile images. One of the best-known tools for this, particularly in Austria, is Transkribus (Kahle et al. 2017). Transkribus is offered as a desktop application, backed by a central data storage and processing service, for transcription of manuscripts and printed texts directly from images. It offers a plethora of useful tools; these include automated detection of regions of text in an image and lines within those regions (which allows for the association of segments of transcribed text with their corresponding places on the facsimile image), handwritten text recognition (if enough of a particular document has been manually transcribed), and output of the transcription data into several formats, including TEI-XML.

Given the claim above that it is more or less impossible to write a graphical user interface for production of TEI-XML encoded texts, we should look more closely into what Transkribus does offer. Their data model, tuned as it is for recognition of text blocks on page facsimiles and association of text with these blocks, saves information using a standard known as PAGE XML (Pletschacher and Antonacopoulos 2010) that was developed for automated document analysis and text recognition. Any conversion to TEI must therefore involve a transformation of the model of a text as conceived by PAGE into some TEI-compatible model. Indeed, in their introductory How-To guide, the developers note that “Transkribus is […] more than a TEI editor, but also less (we will not support all peculiarities of TEI but just those which are necessary to create a good, standardized transcription)” (“How to Use Transkribus—in 10 Steps (or Less)” 2015).
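What a transformation from the PAGE model into a TEI-compatible one involves can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the conversion that Transkribus performs: the sample document is heavily simplified (a real PAGE file also records coordinates and metadata), the namespace URI is the 2013 version of the PAGE schema as we understand it, and the function name and the form of the facs values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed PAGE namespace (2013-07-15 schema version).
PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"
NS = {"pc": PAGE_NS}

# Simplified sample: regions contain lines, lines contain their text.
PAGE_SAMPLE = f"""<PcGts xmlns="{PAGE_NS}">
  <Page imageFilename="page1.jpg" imageWidth="1000" imageHeight="1500">
    <TextRegion id="r1">
      <TextLine id="r1l2">
        <TextEquiv><Unicode>otherwise called Quasi Trial.</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="r1l3">
        <TextEquiv><Unicode>Art. 4. The original examination performed</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""


def page_to_tei_lines(page_xml: str) -> str:
    """Flatten PAGE text lines into a TEI-like <ab> with <lb/> milestones."""
    root = ET.fromstring(page_xml)
    ab = ET.Element("ab")
    for line in root.iterfind(".//pc:TextLine", NS):
        # One milestone per PAGE line, pointing back at the line's identifier.
        lb = ET.SubElement(ab, "lb", {"facs": f"#{line.get('id')}"})
        lb.tail = line.findtext("pc:TextEquiv/pc:Unicode", default="", namespaces=NS)
    return ET.tostring(ab, encoding="unicode")


print(page_to_tei_lines(PAGE_SAMPLE))
```

Even this small sketch has to take decisions that the source model does not dictate, such as whether lines become containers or milestones and how the link to the facsimile is expressed.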

What this means in practice is that the commonly recommended option of creating a custom XML schema for each individual project is not an option here. The Transkribus developers have had to make certain decisions of their own about how the PAGE XML model can be expressed in terms of TEI-XML concepts, and the user has little choice but to learn and understand the respective models if she wishes to use the TEI-XML output of Transkribus. For instance, the PAGE XML concept of a TextRegion (that is, the area on the facsimile that comprises a text block, or a line inside a block) demands that the content of the line be nested inside this element. This in turn disrupts the usual use of text-structural tags such as <p> (paragraph), <q> (quotation), or <head> (heading), which now cannot be used for text that spans multiple lines without violating the strict hierarchical principle of XML. Even if the user chooses to export a version of TEI-XML that employs <lb/> (line beginning) milestone tags instead of <line> tags containing the line content, she will find that no markup has been allowed to cross a line boundary.
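The hierarchical constraint at issue can be demonstrated with a short Python sketch. Both fragments are invented for this illustration. The first attempts to let a quotation run across two line containers and is rejected by any XML parser; the second shows that milestone markup could in principle accommodate such a span, which is why the restriction in the milestone-based export stems from the underlying line-based model rather than from XML itself.

```python
import xml.etree.ElementTree as ET

# A quotation that starts on one line and ends on the next (invented sample).
# With line containers, the <q> element would have to overlap two <line>s.
OVERLAPPING = (
    "<ab><line>He said <q>this is</line>"
    "<line>a quotation</q> and left.</line></ab>"
)
# With milestones, the same span nests without difficulty.
MILESTONES = "<ab><lb/>He said <q>this is<lb/>a quotation</q> and left.</ab>"


def is_well_formed(xml: str) -> bool:
    """Report whether an XML parser accepts the fragment at all."""
    try:
        ET.fromstring(xml)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(OVERLAPPING))  # False
print(is_well_formed(MILESTONES))   # True
```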

The developers, perhaps sharing the ideals of Stefan Hagel in wanting to ease the complexity of the data model and its expression as far as possible, have preconfigured a few markup tags for the convenience of the user. In some cases these are direct analogues of TEI elements, but in other cases they are simplified pseudo-TEI elements that are converted to their more complex equivalents, sometimes with a loss of explicit information. For example, the user might find the <abbrev> element useful for marking up a line of text.

Figure 1: Adding XML-style annotations to Transkribus text.

In Figure 1 the transcriber has provided two tags. Line 1–2 contains a technical term tagged with the <term> element, and line 1–3 contains an abbreviated word “Art.” (for “Article”) tagged as an abbreviation, with the entire word noted using the “expansion” property. The tagging interface would lead the XML-aware user to expect output such as

<lb n='2'/>otherwise called <term rend='underline'>Quasi Trial</term>.

<lb n='3'/><abbrev expansion='Article'>Art.</abbrev> 4. The original examination performed

This, however, is not actually how abbreviations are expressed according to the TEI Guidelines. In order to bridge the gap, Transkribus uses an internal XSLT stylesheet to convert the PAGE XML-based transcription data into valid TEI. The conversion results in this:

<lb facs='#facs_1_r1l2' n='N002'/>otherwise called <term rend='underline'>Quasi Trial</term>.

<choice>

<abbr>Art.</abbr>

<expan>Article</expan>

</choice> 4. The original examination performed

Now that the abbreviation is represented as a choice, the implicit distinction between the abbreviation (which actually appears in the text) and the expansion (which does not appear as such in the text, but is an interpretation provided by the transcriber by means of a property on the element) has been lost. The distinction, and the information about which of the variants in the <choice> element actually appeared in the document, can be arrived at only by understanding the process by which <abbrev> became <choice>.
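The step from <abbrev> to <choice> can be approximated in Python as follows. This is a sketch of the kind of rewriting described above, not the XSLT stylesheet that Transkribus actually uses; the function name is our own, and the input is the simplified tag form shown earlier, wrapped in an <ab> element so that it can be parsed.

```python
import xml.etree.ElementTree as ET


def abbrev_to_choice(xml: str) -> str:
    """Rewrite <abbrev expansion='X'>Y</abbrev> as a TEI-style <choice>."""
    root = ET.fromstring(xml)
    # Snapshot the elements first so the tree is not modified while iterating.
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag != "abbrev":
                continue
            choice = ET.Element("choice")
            # What stood in the source becomes <abbr>...
            ET.SubElement(choice, "abbr").text = child.text
            # ...and what was a property of the tag becomes sibling content.
            ET.SubElement(choice, "expan").text = child.get("expansion", "")
            choice.tail = child.tail  # keep the text that followed the tag
            parent[index] = choice
    return ET.tostring(root, encoding="unicode")


SOURCE = (
    "<ab><lb n='3'/><abbrev expansion='Article'>Art.</abbrev>"
    " 4. The original examination performed</ab>"
)
print(abbrev_to_choice(SOURCE))
```

In the input, the transcribed word is element content and the editor's interpretation is an attribute; in the output, both are element content side by side. A program consuming the result must therefore know the convention in order to decide which of the two to display as the text of the document.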