
LEX3: Publishing Legacy Dictionaries with Publex

This course introduces the legacy dictionary viewer Publex, a generic, modular dictionary publication tool for retrodigitized dictionaries.

Learning Outcomes

After completing this resource, you should be able to:

  • decide whether a dictionary fulfills the requirements to be published with Publex
  • configure the display of your dictionary in Publex
  • use Publex to make a dictionary available in open access

Introduction

Publish your dictionary data with Publex!

Publex is a software tool that allows anyone to publish dictionary data annotated in XML. You can define the display of your dictionary individually, and you do not need to be a professional to do so. The application is simple to use: the program and the user instructions guide you through the entire process. No installation is required. Publex is accessed and operated via the web browser, and the data is stored and published on the Elexis server.

Three steps to the publication of your dictionary

1. Upload + Metadata

Upload your XML dictionary data and provide the associated metadata.

Upload

Start here.

2. Configuration

Specify the layout of your dictionary by defining formatting rules for the XML elements.

Configuration

Learn about it here.

3. Publication

Publish the dictionary on the Elexis server. It will have its own URL.

Publication

Learn about it here.

Let’s get started

Creating an account

Register

To start using Publex, you first need to create a profile. Click the register button under the login form on the Publex page. A form opens where you can enter your login credentials. Before completing the registration, you have to accept the terms of use and the privacy policy. Click the register button to complete the process.

Login

Login

If you already have a user account, you can log in to the home page with your registered e-mail address and your password.

Logout

Logout

Within Publex, you can log out of your profile at any time using the logout icon in the top menu bar.

Forgot your password?

Forgot password

Click “Forgot Password” in the login form and you will be asked to enter the e-mail address you are registered with. Then you will receive an e-mail with instructions on how to create a new password.

Changing your password and e-mail address

Profile

When you are logged in to your account, the profile button in the upper menu bar takes you to the profile management, where you can change account-related data such as your e-mail address and your password.

Creating a new dictionary and importing data

First: Preparing the data

1. Data requirements

Dictionary entries
  • The files are required to be provided in valid XML.
  • Please make sure that every dictionary text you want to be printed is annotated as a text node. Dictionary texts should not be encoded as an attribute value.
  • In case of large dictionaries, the files for the dictionary entries should be split into several files. The individual files should not be larger than 16 MB.
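Two of these requirements can be checked automatically before the upload. The following Python sketch is not part of Publex. It assumes that "16 MB" means 16 × 1024 × 1024 bytes, and it only tests whether a file is well-formed XML, not whether it is valid against a schema. The function name check_xml_file is made up for this illustration. Files that use KompLett entities such as &plusaboveu; will be reported as not well-formed by this simple check, because the entity declarations are kept in a separate file.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumption: "16 MB" is read as 16 * 1024 * 1024 bytes.
MAX_BYTES = 16 * 1024 * 1024


def check_xml_file(path):
    """Return a list of problems found in one dictionary file (empty if none)."""
    path = Path(path)
    problems = []
    if path.stat().st_size > MAX_BYTES:
        problems.append("file is larger than 16 MB")
    try:
        # Parsing succeeds only if the file is well-formed XML.
        ET.parse(str(path))
    except ET.ParseError as err:
        problems.append(f"not well-formed: {err}")
    return problems
```

Calling check_xml_file on every file in your data folder and printing the returned lists gives a quick overview of the files that need attention.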

Use of non-Unicode characters

If your dictionary contains special characters which are not available in the Unicode standard, it is possible to use our self-defined entities of the KompLett font. To do this, proceed as follows:

  1. Check whether your special characters are contained in the allEntities file. To do so, you can explore the KompLett font in the Glyphr Studio web tool:
  • Load the file KompLettR.ttf you find here into Glyphr Studio and make sure you choose to import all glyphs:

Import Komplett Font file

  • After the file has been loaded, you can view the font characters it contains. The special characters defined in the allEntities file are located in the Private Use Area of the font. Therefore, select this area for display:

  2. When you have found your desired character in the overview, find out its position. To do this, move the mouse over the character. The position is displayed in a mouseover after the string “Private Use Area”. In the following example it is “EA4C”.

  3. Using the position number, look up the character’s name in the allEntities file.
  • Every entry in the allEntities file is structured as follows: the declaration of a new entry <!ENTITY followed by the name, the Unicode position (or the position in the Private Use Area) and a closing >.

Here you can see an example with the special character “u with superscripted plus”:

u with superscripted plus (entity file)

  4. In the XML file of your dictionary text, replace the corresponding special character with &entity name; (entity name = “plusaboveu” in the following example; used as “&plusaboveu;”).

u with superscripted plus (xml)

And this is how the character is finally displayed using the KomplettFont:

u with superscripted plus (displayed)
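The look-up from a position to an entity name can also be scripted. The following Python sketch is only an illustration: the exact syntax of a declaration (name followed by a quoted hexadecimal character reference) is an assumption, because the entity file itself is shown here only as an image, and the sample declaration and the function name are made up.

```python
import re

# Assumed declaration syntax: <!ENTITY name "&#xHHHH;">
ENTITY_RE = re.compile(r'<!ENTITY\s+(\S+)\s+"&#x([0-9A-Fa-f]+);"\s*>')


def entity_names_by_position(entity_text):
    """Map each hexadecimal position (upper case) to its entity name."""
    return {pos.upper(): name for name, pos in ENTITY_RE.findall(entity_text)}


# Hypothetical sample line, not copied from the real allEntities file.
sample = '<!ENTITY plusaboveu "&#xEA4C;">'
names = entity_names_by_position(sample)
print(names.get("EA4C"))  # prints: plusaboveu
```

If the real file uses a different notation for the position, the regular expression has to be adapted accordingly.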

2. Store the data in a Bitbucket repository

The data is imported into Publex from a Git repository hosted on Bitbucket.

  1. First of all, to upload data to Bitbucket, you need to install Git on your computer. If you don’t already have it, please download and install it from here.
  2. Log in to your Bitbucket account. If you don’t have an account yet, register here.
  3. Create a new repository. To do this, click on “create repository” or the plus button next to “Recent repositories” on the welcome page of your account.

Welcome page Bitbucket

Create new repository

  • Select a project to which the repository is to be assigned. If you do not have any projects in your Bitbucket account yet, you can also create a new project at this point.
  • Enter a name for the new repository, e.g. the name of your dictionary.
  • Choose the access level. If you make your repository public, all the data you store in it will be accessible online to everyone. If you do not want this, select private.
  • You can include a README with brief information about the content of your repository, but this is not necessary for your upload to Publex.
  • We recommend choosing ‘main’ as the default branch name, as suggested by Bitbucket.
  • For the field “Include .gitignore?” you can simply use the default setting.
  • Now you are ready to hit the “Create repository” button.

This is how it looks now:

New repository

  4. Now you can upload your dictionary data to the repository.
  • To do this, you first have to clone the repository, which means that you create a copy of it on your local system. When you click the “clone” button in the right corner, a window opens. Copy the clone command that appears.

Copy clone command

If you prefer to use Bitbucket through a graphical interface, we recommend downloading Sourcetree. In the following, we describe how to perform the steps with the command line:

First open the terminal. This works differently depending on the platform you use:

a) using Windows: Hold down the Windows key on your keyboard and then press the “R” key. Now the “Run” tool will open in a new pop-up window. Type in “cmd” and hit “OK”.

b) using Mac: Click your desktop to bring “Finder” into focus. In the menu bar, click “Go” and select “Applications”. Then open the “Utilities” folder and double-click “Terminal”.

c) using Linux: Press Ctrl+Alt+T.

  • Once you have opened the terminal window, change to the local directory where you want to clone your repository: cd <path_to_directory>

  • Paste the command you copied from Bitbucket, e.g.: git clone https://Trifoglio@bitbucket.org/Trifoglio/my_dictionary.git

  • Now a new sub-directory should appear on your local drive with the same name as the repository.

  • In this directory, create a new folder and name it, for example, “data”.

  • Put all the files that belong to your dictionary into this folder. Note: If you want to use entities from the KompLett font, you must also place the file allEntities.xml in this folder.

  • Now you are almost finished. The final step is to transfer the locally added files to the remote Bitbucket Cloud repository. In your terminal window:

  • enter cd <path_to_local_repo>

  • enter git add --all

  • enter git commit -m '<commit_message>' with a commit message that describes your changes, e.g. git commit -m 'upload dictionary data'

  • enter git push

If everything worked fine, you should now see the files in your repository online in your Bitbucket account. Now we are prepared to create a new dictionary in Publex!
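Before you push, it can be useful to check whether the data folder needs the file allEntities.xml at all, as mentioned in the note above. The following Python sketch is an illustration only, not a Publex feature. It treats every named entity reference other than the five predefined XML entities as a custom entity, and it assumes that the files are UTF-8 encoded and have the extension .xml. The function names are made up.

```python
import re
from pathlib import Path

# The five entities that every XML parser knows without a declaration.
PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
# Named references like &plusaboveu; (numeric ones such as &#xEA4C; do not match).
ENTITY_REF_RE = re.compile(r"&([A-Za-z_][\w.-]*);")


def custom_entities_in(text):
    """Return the set of non-predefined entity names referenced in the text."""
    return {name for name in ENTITY_REF_RE.findall(text) if name not in PREDEFINED}


def missing_entity_file(data_dir, entity_file="allEntities.xml"):
    """True if custom entities are used but the entity file is absent."""
    data_dir = Path(data_dir)
    used = set()
    for xml_path in sorted(data_dir.glob("*.xml")):
        if xml_path.name == entity_file:
            continue
        used |= custom_entities_in(xml_path.read_text(encoding="utf-8"))
    return bool(used) and not (data_dir / entity_file).exists()
```

If missing_entity_file("data") returns True, copy allEntities.xml into the folder before running git add.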

Creating a new dictionary

  1. To create a new dictionary, start on the Dictionary Overview page and click on “New Dictionary”.

Dictionary Overview

  2. A new window opens. Type in the name of your new dictionary and optionally a description.

Add new dictionary

  3. Now you have two different options:

    a) You can create an empty dictionary first and import the data at a later time. Then click on “Create empty dictionary”. The new dictionary will appear in the Dictionary Overview afterwards.

    b) You can import your dictionary data directly. Hit the button “Add Dictionary and go to XML file import”. The dictionary is created and the import page opens.

Import

  1. There are two possible ways to get to the import page: a) You create a new dictionary and immediately go to the XML file import (see here). b) In the Dictionary Overview, select a dictionary that has already been created and click on “Edit Dictionary”.

    Edit dictionary

In the menu bar on the left, under “Dictionary Management”, select “XML File Import”.

Left menu bar

  2. Now you have to tell Publex where to find your dictionary data.

Define import

a) The URL of the Bitbucket repository where you stored the data. You get the URL by clicking on the “clone” button in your Bitbucket repository and choosing “SSH” in the dropdown menu in the upper right corner. The last part of the displayed string is the URL you need.

Clone Button

Repo URL

b) The branch in which the data is stored. It’s the one you chose as default branch name when creating your Bitbucket repository (see here).

c) Tell Publex whether your repository is public or private.

  1. If your repository is public, no further settings are necessary for the import. If it is a private repository, please mark the checkbox “This is a private Git repository!”.

    Private Git

  2. When you activate the checkbox, an SSH key becomes visible. Please add this key to your Bitbucket account.

  • Copy the key.
  • In your Bitbucket account, go to your personal settings (see the icon for your account in the left corner on the bottom).

Personal settings

  • In personal settings, go to “SSH keys” and click on “Add key”. A window opens where you can paste the key you copied from Publex.

    Add key

d) The name of the directory you stored your dictionary files in, e.g. “data” as chosen in our example.

e) The file extension of the files containing the dictionary data, which should be xml.

f) The entity file name: If you would like to use our self-defined entities to display non-Unicode characters (see here), enter the file name “allEntities.xml” here. If not, the field should remain empty.
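For field a), the repository URL is the last part of the clone string that Bitbucket displays. As a small illustration, the following Python sketch extracts it. The function name is made up, and the sample string assumes the usual form of an SSH clone command for the example repository used above.

```python
def repo_url_from_clone_command(clone_command):
    """Return the last part of a 'git clone ...' string, i.e. the repository URL."""
    parts = clone_command.split()
    if len(parts) < 3 or parts[:2] != ["git", "clone"]:
        raise ValueError("expected a string of the form 'git clone <url>'")
    return parts[-1]


# Assumed SSH form of the clone command for the example repository.
command = "git clone git@bitbucket.org:Trifoglio/my_dictionary.git"
print(repo_url_from_clone_command(command))
# prints: git@bitbucket.org:Trifoglio/my_dictionary.git
```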

Now you are ready to hit the button “Import Data”. The import starts immediately. It may take some time, depending on how large your data is. In the output window, you can follow the progress of the import and see whether problems or errors occur. If “Everything successfully imported.” is displayed, the import was successful. Click “Go to Styling Rules” to continue with the configuration of your dictionary.

Publex Test-Git

To help you get started, we offer a small dictionary test data set. The button “Use Publex Test-Git” enters the corresponding data into the form so that the test dictionary can be imported.

  • Test_Git

Deleting and locking a dictionary

A dictionary can be deleted by clicking on the delete button (bin symbol). Make sure that the status of the dictionary is “unpublished”.

Delete dictionary

You can lock a dictionary to protect it from unwanted changes. The lock icon changes the status to “not editable”. Click the icon again to unlock.

Lock a dictionary

Locked dictionary

Metadata Management

The metadata for the dictionaries will appear later on the page of the published dictionaries.

To add and edit the metadata of your dictionary, choose your dictionary in the Dictionary Overview, hit the “Edit Dictionary” button and select “Metadata” from the left side menu.

Metadata Management

On this page, you can change the title and edit the dictionary description. You can also add a project website and classify your dictionary into one of the following categories:

  • General language dictionary
  • Multilingual dictionary
  • Historical language dictionary
  • Etymological dictionary
  • Regional or dialect dictionary
  • Foreign learners’ dictionary
  • Technical and terminological dictionary
  • Authors’ dictionary
  • Unspecified

Dictionary Categories

Define your styling rules

The configuration module is the heart of Publex. Here you can define how each XML element should appear in the online dictionary. The following is a general explanation of how styling with Publex works. It becomes clearer and more concrete in the example below.

Go to the configuration page:

a) You will get here directly after importing your dictionary data by clicking “Next” after the import has been completed.

b) In the Dictionary Overview, select a dictionary that has already been added and click on “Edit Dictionary”. Then, in the menu bar on the left, under “Dictionary Management”, select “Styling Rules”.

There are a few basic things to keep in mind when defining the styling rules:

  • The requirement for the publication of a dictionary is the presence of a lemma rule (i.e. for one of the rules, the field “Add to lemma list” must be selected).

  • The order of the rules is relevant: rules further down the list override the preceding ones. For example: I want the font size of my entries to be 12 pt, so I define this feature in a rule for all <entry> elements. Thus, all the texts contained in <entry> will be formatted accordingly. However, the lemma (e.g. <form type="lemma">) should be displayed in bold and in a larger font. If I define this in a subsequent rule, the definition from the first rule is overridden specifically for these elements. We therefore recommend placing more general rules at the beginning of the list.

  • The order can be modified at any time using the arrow buttons.

    Change Order
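The effect of the rule order can be illustrated with a small model. The following Python sketch is not Publex’s implementation. It simply assumes that a rule applies to an element if it matches the element itself or one of its ancestors, and that rules are applied in list order so that later rules override earlier ones. All names and the dictionary-based rule format are made up for this illustration.

```python
def rule_applies(rule, tag, attributes):
    """True if the rule's tag and all of its attribute values match."""
    if rule["tag"] != tag:
        return False
    wanted = rule.get("attributes", {})
    return all(attributes.get(name) == value for name, value in wanted.items())


def resolve_style(rules, path):
    """Compute the style of an element.

    path is a list of (tag, attributes) pairs from the entry root down to
    the element itself. Rules are applied in list order, so a later rule
    overrides the properties set by an earlier one.
    """
    style = {}
    for rule in rules:
        if any(rule_applies(rule, tag, attrs) for tag, attrs in path):
            style.update(rule["style"])
    return style


rules = [
    # General rule first: applies to everything inside <entry>.
    {"tag": "entry", "style": {"font_size_pt": 12, "bold": False}},
    # More specific rule later: overrides the entry rule for lemmas.
    {"tag": "form", "attributes": {"type": "lemma"},
     "style": {"font_size_pt": 16, "bold": True}},
]

lemma_path = [("entry", {}), ("form", {"type": "lemma"})]
def_path = [("entry", {}), ("def", {})]
print(resolve_style(rules, lemma_path))  # {'font_size_pt': 16, 'bold': True}
print(resolve_style(rules, def_path))    # {'font_size_pt': 12, 'bold': False}
```

If the two rules were swapped, the general entry rule would come last and reset the lemma to 12 pt, which is why general rules belong at the top of the list.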

Adding a new rule

A new styling rule is created by using the “Add styling rule” button.

Add new styling rule

A window for defining a new styling rule is displayed.

Naming the rule

Type in the name for the rule.

Name rule

Choice of the element tag

The first step is to select the element tag the rule should apply to. When importing the data, Publex captures all the different tags, attributes and associated attribute values your dictionary is annotated with. When creating a styling rule, you are shown all the options to choose in a drop-down menu:

Dropdown menu tags

Now you have different options:

  1. You select only one element tag. The rule then applies to all elements with this tag name, regardless of the attributes:
    Only tag
  2. You specify the selected tag by one or more attributes, without prescribing which attribute values are present. To do this, click on the plus symbol to the right of the selected element and select “Add attribute to tag”.

add attribute

attributes

  3. Additionally, you define which values an attribute should have. To do this, click on the plus symbol on the right of the selected attribute and select “Add value to attribute”. Please note: You can only define one value per attribute.

add value

tag, attribute, value

Attributes and attribute values that have already been selected can be removed with the minus symbol on the right.

delete attribute
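The three levels of selection (tag only, tag with attribute, tag with attribute and value) can be summarised in a short model. The following Python sketch is only an illustration of the logic described above, not Publex code. The selector format is made up; a value of None stands for “the attribute must be present, with any value”.

```python
def selector_matches(selector, tag, attributes):
    """Check whether an element (tag plus attribute dict) fits a selector."""
    if selector["tag"] != tag:
        return False
    for name, required in selector.get("attributes", {}).items():
        if name not in attributes:
            return False  # the attribute must be present
        if required is not None and attributes[name] != required:
            return False  # a specific value was requested and differs
    return True


tag_only = {"tag": "form"}
with_attribute = {"tag": "form", "attributes": {"type": None}}
with_value = {"tag": "form", "attributes": {"type": "lemma"}}

element = ("form", {"type": "variant"})
print(selector_matches(tag_only, *element))        # True
print(selector_matches(with_attribute, *element))  # True
print(selector_matches(with_value, *element))      # False
```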

Search and lemma field

For each rule, there are different options you can select in the settings:

settings

  • Display text on website: This checkbox is activated by default. You should only deactivate it if you do NOT want the text of the element to be displayed.
  • Set as searchable field: If you want the selected element to appear as a search field in the dictionary look-up, activate this field.
  • Add to lemma list: By activating this field, you define a so-called lemma rule. This means that the text contained in the element appears in the lemma list and makes the associated dictionary entries accessible. Note: At the same time, “Set as searchable field” is activated. Lemmas always form a search field.

Define the styling

Finally, you can define the concrete layout. A wide variety of text formatting functions are available to you for this purpose:

Styling functions I

Styling functions II

Planned additional formatting functions:

Styling functions II

Save and discard

When you have defined a new rule or made changes, save them with the save button at the top right.

Save

Single rules can be deleted via the bin button.

Delete rule

If you want to discard all changes since the last save, you can do this with the “Discard changes” button. This is hidden behind the “more” button next to the save button.

Discard changes

Lock a styling rule

To help you when creating the styling rules, individual rules can be locked. This prevents them from being accidentally deleted and can help you keep track of which rules have already been created.

Clicking on the lock symbol locks a rule, clicking again unlocks it for editing.

Lock rule

Export and import a configuration

You can export the styling rules as a JSON file in order to save your results and re-import them later if required. This is particularly important if you want to update the dictionary data at a later time. We therefore strongly recommend exporting and saving the configuration locally after making changes to the styling rules.

The buttons for the export and the import are hidden under the “more” button next to the save button at the top right.

Export and Import Styling Rules

Export

To export your styling rules, click on “Export configuration file”. A window opens asking you to specify the local storage location of the file.

Import

To import exported styling rules, click on “Import configuration file”. A window opens into which you paste the contents of the exported JSON file. To do this, you can open the JSON file in any text editor and select and copy the text it contains.

Import Styling Rules

After clicking on “OK”, the rules are imported and displayed.
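Before pasting a saved configuration, it can be worth checking that the local file is still intact JSON. The following Python sketch does only that. It makes no assumptions about the internal structure of the exported file, the function name is made up, and it assumes that the file is UTF-8 encoded.

```python
import json
from pathlib import Path


def read_config_for_import(path):
    """Return the file content as text after checking that it parses as JSON.

    Raises json.JSONDecodeError if the file is damaged or incomplete.
    """
    text = Path(path).read_text(encoding="utf-8")
    json.loads(text)  # only a syntax check; the parsed result is not used
    return text
```

The returned text is what you paste into the import window.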

Styling a Dictionary: An Example

For a concrete demonstration of the different functions and the procedure, we use the dictionary entry bluome from the Mittelhochdeutsches Handwörterbuch by Matthias Lexer as an example in the following:

Lexer bluome Foto

This is what the XML-annotated entry looks like:

Lexer bluome XML

We first examine which information units have a special typographical representation and which XML elements correspond to them. In this example, we follow the print image, but use the additional advantages of the digital publication form: for example, we additionally display the citations and the authors in colour, which improves the visual structuring of the cited examples.

information types

typographical types

After these pre-considerations, the styling rules can now be defined.

Since the rules should be sorted from general to specific, we first define the basic layout for our entries.

The basic layout

The first rule, therefore, refers to the entry element. If not defined more specifically by further rules, the text of the dictionary entries shall be displayed in KomplettFont, font size 12 pt, font colour black and with no other special features:

entry rule

Now we create more specific rules for individual entry parts that override the entry rule:

The lemma

Each dictionary entry should have a lemma. The tag for this is <form type="lemma">.

In contrast to the regular entry text, the lemma should appear in bold type in our example. To make it more distinctive, we also choose a larger font size of 16 pt. In addition, we have to activate the box “add to lemma list” so that the lemmas appear in the lemma list and are searchable in the dictionary look-up.

Defining lemma rule

Grammatical information

The grammatical information contained in the gram tag is to be displayed in italics. We also want to enable a search for the grammatical indication in the dictionary and therefore activate the checkbox “Set as searchable field”.

Defining grammar rule

Definition

The text contained in <def> should also be set in italics. We additionally activate the checkbox “Set as searchable field”.

Defining definition rule

Citation

Quoted examples are coded with <cit type="example"> and should be displayed in green.

Defining citation rule

Source authors

The cited examples are followed by the source, which is annotated with <author> and which we want to display in blue font colour. The sources should also be a search field in the look-up.

Defining author rule

Italics, recte and superscript

Finally, we define the rules for the elements that mark typographic features. In our example, these are italics, recte (used to mark punctuation marks as recte in a paragraph displayed in italics) and superscript.

Defining italics rule

Defining recte rule

Defining superscript rule

Now the styling for our example entry is defined. We save the rules with the “Save” button and create a backup copy by exporting the configuration as a JSON file.

In the dictionary preview, our example entry bluome now looks like this:

Preview entry bluome

The Dictionary Viewer: Preview your dictionary

In the Dictionary Viewer, you can see how your published dictionary will look later. While defining the styling rules, you can check directly how they affect the display of the dictionary entries. After each saved update in the Styling Rules, the changes become visible when you reload the page (e.g. with F5).

You can get there by clicking on “Preview Website” in the left side menu:

Preview Website Left Menu

Or by going to your dictionary in the Dictionary Overview and clicking on “Preview Website”:

Preview Website Overview

A new tab opens in the browser with the preview:

Dictionary Viewer

The viewer as well as the later published dictionary are divided into three parts:

  1. the display of the entries
  2. the lemma list
  3. the dictionary look-up

Display of the entries

Here, the dictionary entries are displayed as defined in the Styling Rules. Only one entry is visible at a time. The user can navigate to other entries using the lemma list.

Lemma list

All fields for which a lemma rule has been defined in the styling rules appear in the lemma list (see here). Clicking on an item in the list displays the corresponding dictionary entry. The entries in the lemma list can be searched using the search field. A truncated search is possible.

Search lemma list
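How a lemma rule for <form type="lemma"> yields the items of the lemma list can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Publex code: the sample XML is invented (only the lemma bluome comes from the example entry), it uses no XML namespace, and the function name is made up.

```python
import xml.etree.ElementTree as ET

# Invented sample data; real dictionary files are richer and may use namespaces.
SAMPLE = """<dictionary>
  <entry><form type="lemma">bluome</form><def>...</def></entry>
  <entry><form type="lemma">bach</form><form type="variant">...</form></entry>
</dictionary>"""


def lemma_list(xml_text):
    """Collect the text of every <form type="lemma"> element in document order."""
    root = ET.fromstring(xml_text)
    return [
        "".join(element.itertext()).strip()
        for element in root.iter("form")
        if element.get("type") == "lemma"
    ]


print(lemma_list(SAMPLE))  # ['bluome', 'bach']
```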

The dictionary look-up

In the dictionary look-up, the user has the choice between a general full-text search

simple search

and an advanced search.

advanced search

The advanced search offers a full-text search as well as search options for all information fields for which the checkbox “Set as searchable field” has been activated in the styling rules (see here). In our example configuration, these are the lemma, the grammatical information, the definition and the source information.

Further search fields can be added via the plus button. They are all linked with the logical operator AND.

The following applies to the input for all search variants:

Upper and lower case are irrelevant and blank characters are ignored. Special characters can be replaced by their basic characters in the search, e.g. diacritics do not have to be entered (ë = e).

A truncated search is possible. For example, bach* returns all words beginning with this character string and *bach returns the words ending with this string.

When entering multi-word units, it is necessary to enclose the search string in inverted commas to search for the entered word sequence (e.g. “laut und buchstabe”), otherwise the word order will not be taken into account.

The search is started by pressing the enter key.

The search result appears as a list below the search mask. From here, the dictionary entries can be called up directly.

Search result
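The matching behaviour described above (case-insensitive, diacritics optional, truncation with *) can be modelled in a few lines. The following Python sketch is not the search engine used by Publex. It assumes that “blank characters are ignored” means that all whitespace is dropped before comparison, and it uses the standard library function fnmatchcase, which also treats ? and [ ] as special characters. The function names are made up.

```python
import unicodedata
from fnmatch import fnmatchcase


def normalize(text):
    """Lower-case the text, strip diacritics and remove all whitespace."""
    decomposed = unicodedata.normalize("NFD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return "".join(without_marks.casefold().split())


def search_matches(pattern, word):
    """Compare a search pattern (with optional *) against a word."""
    return fnmatchcase(normalize(word), normalize(pattern))


print(normalize("ë"))                          # e
print(search_matches("bach*", "Bachstelze"))   # True
print(search_matches("*bach", "Mühlbach"))     # True
print(search_matches("bach*", "Mühlbach"))     # False
```

The sample words are chosen freely for illustration and are not taken from a particular dictionary.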

Publication

Requirements

In order to publish a dictionary, the following requirements must be fulfilled:

  1. The corresponding data must have been imported into Publex.
  2. The styling rules must have been defined. At least one lemma rule has to be defined.
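The two requirements can be expressed as a simple check. The following Python sketch only models the conditions stated in this course; the dictionary keys and function names are made up and do not reflect the format of an exported configuration. It also models the note that a lemma rule is always a searchable field.

```python
def normalise_rule(rule):
    """Return a copy of the rule in which a lemma rule is also searchable."""
    result = dict(rule)
    if result.get("add_to_lemma_list"):
        result["searchable"] = True
    return result


def can_publish(data_imported, rules):
    """True if data has been imported and at least one lemma rule exists."""
    has_lemma_rule = any(rule.get("add_to_lemma_list") for rule in rules)
    return bool(data_imported) and has_lemma_rule


rules = [
    {"name": "entry", "add_to_lemma_list": False},
    {"name": "lemma", "add_to_lemma_list": True},
]
print(can_publish(True, rules))      # True
print(can_publish(True, rules[:1]))  # False
print(normalise_rule(rules[1])["searchable"])  # True
```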

How to publish?

  • To publish your dictionary, go to the Dictionary Overview page and change the button with the label “Publishing status of your dictionary” from “unpublished” to “published”.

Publication

  • Your dictionary is available online now. It appears on the overview page of all dictionaries published with Publex. You can also access this page via the “Published Dictionaries” button in the top menu bar:

Publex Button

The published dictionaries are displayed with basic metadata and can be accessed by clicking on the dictionary name in the blue box.

Published Dictionaries

  • Each published dictionary is also given its own persistent address through which it can be accessed.

Reverse a publication

You can reverse a publication of your dictionary by changing the toggle button from “published” to “unpublished”.

Reverse a publication

File Management: Update a dictionary

You would like to update a dictionary that has already been published?

Case 1) You want to change the display of your dictionary, but the content data remains unchanged.

To do this, log in to Publex and call up the corresponding dictionary in the Dictionary Overview. Select “Edit dictionary” and change the styling rules of the dictionary. Click on “Save” to apply your changes to the published dictionary.

Case 2) You want to update the dictionary data.

If you want to publish your dictionary in a new edition, we recommend creating a new dictionary and publishing the data in a new version under a different URL.

But you can also reimport the data for the existing dictionary. To do this, follow the steps below:

  1. First, save the styling rules defined for the dictionary by exporting them and saving them locally (see Export of styling rules).
  2. Go to the Dictionary Overview and change the publication status to “Unpublished” (see Reverse a publication).
  3. Go to the import page (see Import), fill in the information where you stored the updated data and hit the “Reimport” button.
  4. After the data import is done, you can reimport the locally stored styling rules (see Import of styling rules). Please check if the rules still match the XML encoding.
  5. Finally publish the dictionary (see Publication).

Further information

For further information see the Publex homepage.

Cite as

Anne Klee, Thomas Burch, Claudia Bamberg, Julia Hennemann, Henrike Sievers and Sandra Weyand (2022). LEX3: Publishing Legacy Dictionaries with Publex. Version 1.0.0. Edited by Anne Klee, Thomas Burch and Claudia Bamberg. DARIAH-Campus. [Training module]. https://campus.dariah.eu/id/IbZ4arshkA06y5Uem0UsT

Reuse conditions

Resources hosted on DARIAH-Campus are subject to the DARIAH-Campus Training Materials Reuse Charter.

Full metadata

Title:
LEX3: Publishing Legacy Dictionaries with Publex
Authors:
Anne Klee
Domain:
Social Sciences and Humanities
Language:
en
Published:
4/25/2022
Content type:
Training module
Licence:
CC BY 4.0
Sources:
DARIAH
Topics:
Lexicography
Version:
1.0.0