We all live in a digital world. The digital has permeated all aspects of life, and research is no exception: we use digital services and draw on digital data. However, the fluidity of the digital makes research difficult in some ways, because research depends on referenceability. Digital services and digital data must still be findable years from now if they are to serve as citable sources. How to deal with this in Egyptology is the topic of today’s blog.

How do you cite websites? We all know the request to give the date of the visit. That’s all well and good, but, to be honest, pretty useless. For one thing, there is no guarantee that the content of a website will remain stable. For another, a website can go offline, and then its content is no longer available for viewing. In this case, it makes no difference whether you visited the site two or three years ago. Offline is offline. Then a website is as inaccessible as the information in the most famous footnote of all time:

This was once revealed to me in a dream. https://twitter.com/ticiaverveer/status/964967221440204801

Are there any relevant Egyptological websites that are offline today? Yes, of course there are. In the noughties there was the website of the Centre for Computer-aided Egyptological Research (CCER): http://www.ccer.nl/. It had a lot of relevant content: the Hieroglyphica sign list, the Prosopographia Aegypti name database, the Multilingual Egyptological Thesaurus, and an overview of Egyptological institutions and museums. The site has been offline for about 10 years, see https://ancientworldonline.blogspot.com/2010/09/news-from-centre-for-computer-aided.html. If you want to access this information, you have to use the Wayback Machine, which fortunately has archived the pages of the CCER. As is often the case with the Wayback Machine, the static information is easily accessible, while dynamic content from databases is more difficult. We cannot access the content of the Prosopographia Aegypti as it was found on the CCER site. Fortunately, Erhart Graefe now hosts this database and keeps it up to date. Despite the problem with dynamic web pages, archive.org’s Wayback Machine is an enormously important service for finding an older version of a web page. We benefited greatly from it in a previous blog. So if you want to donate money just before Christmas, archive.org is certainly an address to consider.
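Looking up an archived copy can also be done programmatically. The following is a minimal sketch, assuming the public availability endpoint of the Wayback Machine at https://archive.org/wayback/available and the JSON layout it returns (`archived_snapshots` → `closest`). The function names are our own illustration, not part of any library.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the public availability endpoint of the Wayback Machine.
AVAILABILITY_API = "https://archive.org/wayback/available"


def availability_query(url, timestamp=None):
    """Build the query URL that asks for the archived snapshot of `url`."""
    params = {"url": url}
    if timestamp:
        # Optional: prefer the snapshot closest to this date (e.g. "20050101").
        params["timestamp"] = timestamp
    return AVAILABILITY_API + "?" + urlencode(params)


def closest_snapshot(payload):
    """Return the snapshot URL from a decoded API response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(url, timestamp=None):
    """Query the live service (needs network access)."""
    with urlopen(availability_query(url, timestamp), timeout=30) as response:
        return closest_snapshot(json.load(response))
```

Calling `lookup("http://www.ccer.nl/")` would return the address of an archived copy if one exists, and `None` otherwise. Keeping the parsing separate from the network call makes the sketch easy to check without going online.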

The next problem is that sometimes a website stays the same but its address changes, because server architectures are not always stable. The Handle System solves this problem. There is a stable identifier and the address of the desired target, and the Handle System makes the connection between the two. Even if a website has to move, it remains citable because the identifier remains stable; only the link stored in the Handle System has to be adjusted. In this way, the task is divided into two areas: the site owner keeps the content, and an external provider takes care of the identifiers and the links. A well-known example, also based on the Handle System, is DOI. A DOI is an identifier assigned to a digital object, usually a scientific article, so that the object remains citable. In Egyptology, as far as we know, this system is only used for articles. Entries in databases or “normal” websites do not have a DOI. Or did we miss something?
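The principle of separating identifier and address can be illustrated with a toy model. This is only a sketch of the idea, not the real Handle System or its API; the class name, the identifier and the example.org addresses are all hypothetical.

```python
class HandleRegistry:
    """Toy model of identifier-to-address indirection (not the real Handle System)."""

    def __init__(self):
        self._targets = {}  # stable identifier -> current address

    def register(self, identifier, url):
        """Assign an identifier once; it must never be reassigned to another object."""
        if identifier in self._targets:
            raise ValueError(f"{identifier} is already registered")
        self._targets[identifier] = url

    def move(self, identifier, new_url):
        """The site has moved: only the stored address changes, not the identifier."""
        if identifier not in self._targets:
            raise KeyError(identifier)
        self._targets[identifier] = new_url

    def resolve(self, identifier):
        """Return the current address behind a stable identifier."""
        return self._targets[identifier]


# Hypothetical identifier and addresses, for illustration only.
registry = HandleRegistry()
registry.register("example/obj-1", "https://old.example.org/objects/1")
registry.move("example/obj-1", "https://new.example.org/objects/1")
print(registry.resolve("example/obj-1"))
```

A citation that uses `example/obj-1` stays valid after the move, because readers resolve the identifier instead of remembering the address.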

Permalinks are another model. As with the Handle System, there are stable identifiers. However, no external provider is used. The site owner is responsible for the stability and longevity of the service. In other words, the owner guarantees that something will be available at a particular address in the future. Many museums use permalinks. Here are two examples: https://collections.louvre.fr/en/ark:/53355/cl010009553, https://id.smb.museum/object/606189. Egyptological databases also use such persistent links. The Cachette database provides such links: https://www.ifao.egnet.net/bases/cachette/ck24. Trismegistos has several types, see https://www.trismegistos.org/about_how_to_cite.php#si-citing. The TLA calls its stable links “Persistent URL”. Such a PURL is similar to a permalink, but has technical differences that do not concern us here.

If you want to keep your site stable yourself, the technical effort involved is not to be underestimated. It is therefore not surprising that only large projects or large institutions choose this route. The usability of stable links depends on the providers’ ability to master this major technical task. As users, we can only trust them. But that trust must be prepared for turbulence, of which we highlight two kinds:

First, even if a permalink is promised to be stable, there are quite a few cases where a permalink is offline today. To give a recent example: The new TLA has had stable links for the individual sentences for a year now. Six months ago, we scraped the sentences of the new TLA to make the content accessible for a hieroglyphic search. Have a look at our metasearch engine. You can search for hieroglyphs and hieroglyphic spellings not only in the TLA, but also in other digital projects. If you search for 𓊹𓍛𓌐𓈖𓇋𓏠𓈖 in our metasearch engine, there are three hits in the new TLA. However, the third hit https://thesaurus-linguae-aegyptiae.de/sentence/IBcBMTWY5pKI2kRBmwwGBj2lOk4 is not available!

Missing permalink

Unfortunately, the Wayback Machine does not have a snapshot. This is not a mistake on our part: this page did exist as an individual sentence in the TLA, and the sentence ID IBcBMTWY5pKI2kRBmwwGBj2lOk4 was actually used, as a look at the AED proves. This sentence is the first evidence for the title ḥm-nṯr-tp.j-n-Jmn, and it corresponds to the sentence oraec45-12 in our text corpus. This sentence belongs to the text oraec45, which can be found on a Ramesside statue. In the TLA this is https://thesaurus-linguae-aegyptiae.de/text/E2NKDG4B2RGYNCI3PY32IA5HOE. Meanwhile, the data of the new TLA have been updated, as can be seen at https://thesaurus-linguae-aegyptiae.de/info/tla-development: a year ago they started with version 17, now version 18 is available. Our text has been changed, as the last revision dates back to November last year:

Text E2NKDG4B2RGYNCI3PY32IA5HOE

There is now also the file protocol, which states that Elizabeth Frood and Peter Dils made changes on November 16. However, the last change on November 27 is not mentioned. The change we are interested in here is not mentioned in the file protocol either. In this respect, the file protocol does not seem to be a very trustworthy source of information. If you look at the sentences of the text in the new TLA, you can see that IBcBMTWY5pKI2kRBmwwGBj2lOk4, i.e. oraec45-12, has been split into several sentences:

Sentence split

The German translation, on the other hand, assumes that the separate sentences are subordinate clauses, since they are introduced by “indem” and “während”. Whether they are really independent clauses is certainly debatable. More problematic, however, is that the original ID seems to have been lost in the splitting. If you promise stable links or, as in the case of the TLA, even PURLs, you should technically be able to keep your promise. Since this is not an isolated case with the new TLA, the project seems to be technically overstretched. So this is the first problem: permalinks can suddenly go offline, whether through technical inability or for some other reason.
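As users, we can at least notice such breakage early by checking the permalinks we cite from time to time. The following is a minimal sketch under stated assumptions: it assumes that a server answers a missing page with a status code other than 200, which is not guaranteed, and that it accepts HEAD requests, which some servers refuse. The function names and the User-Agent string are our own illustration.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def http_status(url, timeout=30):
    """Return the HTTP status code for `url`, or None if the server is unreachable."""
    # HEAD asks only for the headers, not for the page itself.
    request = Request(url, method="HEAD",
                      headers={"User-Agent": "permalink-check/0.1"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        # The server answered, but with an error code such as 404.
        return error.code
    except OSError:
        # No answer at all: DNS failure, refused connection, timeout.
        return None


def broken_links(urls, fetch_status=http_status):
    """Return (url, status) pairs for every link that does not answer with 200."""
    report = []
    for url in urls:
        status = fetch_status(url)
        if status != 200:
            report.append((url, status))
    return report
```

Run against a list of cited permalinks, `broken_links` returns only the ones that need attention. The `fetch_status` parameter exists so that the logic can be tried out with a stand-in function instead of real network requests.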

Second, just as the large-scale Centre for Computer-aided Egyptological Research project is offline today, it is quite conceivable that current large-scale projects could go offline. One reads https://www.trismegistos.org/keeptrismegistosalive.php with great concern. Trismegistos can only be kept alive if there are enough subscriptions. But can we really be sure that there will be enough money to maintain the Trismegistos infrastructure in ten years? Trismegistos is an important digital service on which many others depend. You don’t even want to think about the possibility of Trismegistos being offline. But the possibility is real, though fortunately not urgent!

Finally, there is the golden path followed by PNM. PNM not only provides stable links, but also publishes the research data on which its digital offering is based, see https://pnm.uni-mainz.de/info/Database+versions+and+citing+rules. This means that even if stable links are no longer available or PNM is completely offline, you still have the publication of the research data in a repository as a reference. This is the best solution! A big compliment to PNM!

What can we learn from this? Well, if you are a data producer or have a digital offering, publish your raw data. This is the only way to ensure long-term citability and reuse. If you don’t, the worst case scenario is that all your data is gone! Sorry! That has to be said! As a user, however, you have no influence on the data providers. If no raw data is published, there is a risk that the data will not be available in the future. What Egyptology really needs is a general archiving strategy for Egyptological digital data. Since this does not exist at the moment, we have to try to build private archives on a decentralized basis.

We don’t have a ready-made solution, but we think Kiwix is worth a look. Kiwix is designed to make Wikipedia knowledge available offline. So Kiwix is an offline browser. In order to view websites like Wikipedia in an offline browser, they must first be downloaded and converted into a special format. Kiwix uses the ZIM format. This works not only for Wikipedia, but also for other more static pages. We once used https://youzim.it/ to convert part of our ORAEC corpus into a ZIM file. It is easy to view in Kiwix. This option is potentially available for all of the Egyptological resources mentioned here. In other words, anything that can be accessed with a stable link can also be packed into a ZIM file so that it can be viewed in Kiwix. So this is an option for self-archiving.

Do you have other ideas or suggestions? Write to us! We won’t be able to answer you quickly as we are going on a long Christmas vacation. We’ll probably be back by the end of January! Merry Christmas and a blessed 2024!

This work is marked with CC0 1.0 Universal

