New Unicode Texts

Hi folks. As you know, our focus is on Unicode. Our first blog posts were about converting reusable digital data of Egyptian texts to Unicode and publishing recommendations for using Unicode hieroglyphs. Among other things, we took the great repository of JSesh texts and converted it to Unicode. In the meantime, a lot has happened in this repository: new texts have been added and existing texts have been released under a free license. So we decided to update our converted version. In this post we explain the results:

The repository formerly-mdc-now_unicode now contains 16 new texts.

We would like to thank the authors Kaan Eraslan, Émil Joubert, R. Monfort, and S. Rosmorduc for encoding the new data, and Serge Rosmorduc for publishing it in the reusable repository.

As expected, the MdC data of these texts contain a number of codes that we had not yet encountered in the transformation to Unicode, such as O42C or W17D. In most cases, these are variants. However, according to Unicode principles and our recommendations, the actual characters should be encoded. Accordingly, we have extended our mapping so that, for example, O42C is converted to 𓊏.
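To make the idea concrete, here is a minimal sketch of how such a mapping extension might look. The names `BASE_MAPPING`, `VARIANT_MAPPING` and `to_unicode` are hypothetical and not taken from our converter; only the O42C and W17D pairs come from this post, and the two base entries are just well-known signs used for illustration.

```python
# Hypothetical excerpt of a code-to-character table; names are illustrative only.
BASE_MAPPING = {
    "N35": "𓈖",
    "X1": "𓏏",
}

# Variant codes met in the new texts, each mapped to an actual Unicode character.
VARIANT_MAPPING = {
    "O42C": "𓊏",
    "W17D": "𓏃",
}

# Later entries win, so the variants simply extend the base table.
MAPPING = {**BASE_MAPPING, **VARIANT_MAPPING}


def to_unicode(code: str) -> str:
    """Return the character for an MdC sign code, or raise KeyError if unknown."""
    return MAPPING[code]
```

Keeping the variants in their own table makes it easy to review which codes were added for the new texts.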

In addition, some codes represent characters that cannot yet be represented in Unicode. We had already compiled some such characters during our earlier transformation. Here are the cases of missing Unicode characters that result from the current one:

Missing Unicode characters

A82

semantic value: to throw
source: KRI IV, 14.8.

B75

semantic value: female captive
source: KRI II, 93.8.

It seems promising to go through the work of Cauville and Leitz regarding missing Unicode hieroglyphs.

In any case, we have also added the new variants and the placeholders for the missing characters to our MdC converter so that you can correctly convert a “W17D-n:t*y-iwn-nfr” to 𓏃𓈖𓏏𓏭𓉺𓄤 on the web.
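The conversion above can be sketched roughly as follows. This is not the code of our web converter: the function name `mdc_to_unicode`, the tiny sign table (read off the example above) and the bracketed placeholder format for missing characters are all assumptions made for illustration, and the MdC layout operators are simply dropped.

```python
import re

# Token-to-character pairs read off the example in this post; not a full sign list.
SIGNS = {
    "W17D": "𓏃",
    "n": "𓈖",
    "t": "𓏏",
    "y": "𓏭",
    "iwn": "𓉺",
    "nfr": "𓄤",
}

# Two codes from the list of missing characters; they have no Unicode character yet.
MISSING = {"A82", "B75"}


def mdc_to_unicode(mdc: str) -> str:
    """Convert a simple MdC string to a linear sequence of Unicode hieroglyphs."""
    # "-", ":" and "*" are MdC layout separators; this sketch drops the layout.
    tokens = [t for t in re.split(r"[-:*]", mdc) if t]
    out = []
    for token in tokens:
        if token in MISSING:
            # Assumed placeholder format: keep the code visible in brackets.
            out.append(f"[{token}]")
        else:
            # Unknown tokens raise KeyError rather than being silently skipped.
            out.append(SIGNS[token])
    return "".join(out)


print(mdc_to_unicode("W17D-n:t*y-iwn-nfr"))
```

Raising an error on unknown codes, rather than skipping them, is one way to notice new variants like W17D as soon as they show up in the data.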

Back to the new texts! These are also accessible via our metasearch engine for hieroglyphs. So if you search for the phrase 𓌞𓋴𓂻𓎟𓀀𓆑, you will see that it is used in Sinuhe.

See you next time!

This work is marked with CC0 1.0 Universal

Missing Unicode characters: A82, B75, D283, E80, E100, G106, N6B, O137, R31A, R42, Aa56