Lost Scripts, Found Voices

*Before we unravel the tapestries of human deduction and insight that gave voice to the now silent ages – the stuff of intellectual adventure novels! – it’s worth looking at the digital oracle that is sure to come. I’m sure one day soon, AI will be shown an unknown text, perhaps even one whispered across cosmic distances, and calmly generate a Rosetta Stone for the digital age. Maybe, just maybe, it’ll be created by a model we’ve called ‘Deep Thought,’ and the output will be called ‘Babel Fish’. But until that possibly unsettling day, let’s marvel at the gloriously messy, brilliant human endeavour of cracking codes the old-fashioned way…*

I find it fascinating, the silence of a lost language. Imagine holding a tablet, a stone, or a piece of papyrus covered in characters that once spoke volumes – narrating histories, singing praises, or simply recording the mundane transactions of daily life – but which now stare back at us, mute. This encounter with the unreadable past is a profound one. It’s not just a jumble of forgotten symbols; it’s a locked room, and we, centuries or even millennia later, are trying to find the key. The decipherment of these lost ancient languages and scripts is more than an academic pursuit; it’s a deeply human endeavour to reconnect with our ancestors, to understand the intricate tapestry of our collective journey. It’s an intellectual puzzle of the highest order, demanding logic, intuition, and an almost forensic attention to detail – qualities I’ve come to appreciate deeply through years spent navigating the equally intricate, if considerably more modern, worlds of computer systems and information technology.

To embark on this journey, it’s helpful first to draw a distinction. A lost language is one that is no longer spoken or understood in its original form, though it might have evolved into modern descendants. A lost script, however, is a writing system that can no longer be read, even if the language it represents is known or suspected. Sometimes, of course, we face both: an unknown language written in an unknown script, a truly formidable challenge. Ancient civilisations employed a variety of methods to encode their languages visually. Some used pictograms, where a symbol directly represents an object or idea. Others developed logographic systems, where symbols represent whole words, as in early Chinese. Then there are syllabaries, where each symbol stands for a syllable (like Japanese kana), and finally alphabets, where symbols, or letters, correspond to individual sounds, or phonemes. Understanding which type of system one is dealing with is often the very first hurdle in the decipherment process. It’s a bit like trying to understand a new data format without its schema; you see the raw information, but its structure and meaning remain elusive until you can discern the underlying rules.
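For the computationally inclined, that first hurdle can even be sketched in code. A long-standing heuristic is the sheer size of the sign inventory: alphabets tend to use roughly 20–40 distinct signs, syllabaries roughly 40–90, and logographic or mixed (logosyllabic) systems hundreds or more. Here’s a minimal Python sketch of that idea – the corpus and sign names are entirely hypothetical, and a real analysis would need vastly more text before the count meant anything:

```python
# Guess the type of writing system from the size of its sign inventory.
# Rough heuristic: alphabets ~20-40 signs, syllabaries ~40-90,
# logographic or mixed (logosyllabic) systems hundreds or more.
from collections import Counter

def guess_system(corpus: list[list[str]]) -> str:
    """Count distinct signs across a corpus and apply the heuristic."""
    inventory = Counter(sign for text in corpus for sign in text)
    n = len(inventory)
    if n <= 40:
        kind = "possibly alphabetic"
    elif n <= 90:
        kind = "possibly syllabic"
    else:
        kind = "possibly logographic or mixed (logosyllabic)"
    return f"{n} distinct signs: {kind}"

# Hypothetical corpus: each inner list is one inscription's sign sequence.
corpus = [["da", "ko", "so"], ["ko", "no", "so"], ["a", "mi", "ni", "so"]]
print(guess_system(corpus))  # far too small a sample, of course --
                             # corpus size is itself a major obstacle
```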

The process of decipherment often feels like detective work, a slow, meticulous assembly of clues. One of the most significant aids, and often the starting pistol for a successful decipherment, is the discovery of a bilingual or, even better, a trilingual inscription. The Rosetta Stone is, of course, the quintessential example. Discovered in 1799 during Napoleon’s Egyptian campaign, it presented the same decree in three scripts: Egyptian hieroglyphs, Demotic (a cursive Egyptian script), and ancient Greek. Since ancient Greek was well understood, the Stone provided a crucial point of entry. It wasn’t an instant magic key, mind you. Scholars like Silvestre de Sacy and Johan Åkerblad made initial progress, particularly with Demotic and identifying proper names in the hieroglyphic cartouches. However, it was Jean-François Champollion who, in 1822, made the decisive breakthrough. He realised that hieroglyphs were not purely symbolic, nor purely phonetic, but a complex mix of logographic signs, phonetic signs, and determinatives (unpronounced signs that clarified a word’s meaning). As he famously declared after cracking a key sequence, “Je tiens l’affaire!” (“I’ve got it!”). His understanding, that a single system could incorporate multiple encoding strategies, was pivotal. It demonstrated a flexibility in ancient representational systems that we sometimes underestimate.

Champollion’s success with Egyptian hieroglyphs was built upon the vital clue of proper names – Ptolemy, Cleopatra – enclosed in those distinctive oval shapes called cartouches. Names, particularly foreign ones, are often transliterated phonetically, providing a way to tentatively assign sound values to symbols. This technique, leveraging known entities, is a recurring theme in decipherment. It’s akin to finding a known plaintext segment in a cryptographic puzzle; it offers an invaluable foothold.
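That “known plaintext” idea is concrete enough to sketch. If two sign sequences are hypothesised to spell two known names, then every sound the names share must map to a consistent sign; a contradiction falsifies the hypothesis. Below is a toy consistency check in Python – the sign IDs are invented placeholders, not real hieroglyphs, and genuine cartouche spellings are rarely a neat one-sign-per-sound affair:

```python
# Cross-check hypothesised name readings: every sign must map to
# exactly one sound across all hypotheses, or the guess is falsified.

def check_names(hypotheses: list[tuple[list[str], str]]) -> dict[str, str]:
    """Each hypothesis pairs a sign sequence with a phonetic spelling of
    equal length. Returns the sign-to-sound map, or raises on conflict."""
    mapping: dict[str, str] = {}
    for signs, spelling in hypotheses:
        for sign, sound in zip(signs, spelling):
            if mapping.setdefault(sign, sound) != sound:
                raise ValueError(f"{sign}: {mapping[sign]!r} vs {sound!r}")
    return mapping

# Hypothetical readings: 'ptolmys' and 'kleopatra' share p, t, o, l,
# so signs S1 (p), S2 (t), S3 (o) and S4 (l) must recur consistently.
hypotheses = [
    (["S1", "S2", "S3", "S4", "S5", "S6", "S7"], "ptolmys"),
    (["S8", "S4", "S9", "S3", "S1", "S10", "S2", "S11", "S10"], "kleopatra"),
]
print(check_names(hypotheses))  # no conflict: the hypothesis survives
```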

A rather different, yet equally brilliant, story of decipherment is that of Linear B. Discovered by Sir Arthur Evans at Knossos in Crete at the beginning of the 20th century, these clay tablets, inscribed with a distinctive linear script, were initially thought to represent an unknown “Minoan” language. Evans himself, despite his immense contributions to Minoan archaeology, held rather fixed ideas about this language, which perhaps hindered progress for a time. The script clearly wasn’t an alphabet, nor purely logographic. The breakthrough came from an unlikely source: Michael Ventris, an architect with a lifelong passion for ancient scripts, who meticulously analysed the patterns and frequencies of the Linear B symbols and was later joined by the philologist John Chadwick, who helped verify and extend the results. A critical earlier contribution came from Alice Kober, an American classicist who, through painstaking analysis, identified inflectional patterns within the script, suggesting changes in word endings that could indicate grammatical relationships [1]. She created systematic grids, grouping signs that appeared to share consonants or vowels – a kind of manual database construction that allowed her to deduce structural properties of the unknown language without yet knowing what it was.
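Kober’s grids lend themselves rather naturally to a computational restatement. The sketch below is a much-simplified version of her idea: group words that share a fixed-length opening sequence (a stem) and collect the signs that alternate immediately after it – those alternating signs are candidates for rows of a grid, grouped long before any sound value is known. The sign IDs are invented placeholders, and real Linear B stems are not all the same length:

```python
# A toy version of Kober's "triplets": find word families sharing a
# stem but differing in the sign straight after it.
from collections import defaultdict

def kober_families(words: list[tuple[str, ...]], stem_len: int = 2):
    """Group words by a fixed-length stem; report alternating next signs."""
    families = defaultdict(set)
    for w in words:
        if len(w) > stem_len:
            families[w[:stem_len]].add(w[stem_len])
    # Keep only stems that show genuine alternation (2+ different signs).
    return {stem: alts for stem, alts in families.items() if len(alts) > 1}

# Hypothetical sign sequences, one tuple per word.
words = [
    ("s01", "s02", "s03"), ("s01", "s02", "s04"), ("s01", "s02", "s05"),
    ("s06", "s07", "s03"), ("s06", "s07", "s04"),
]
for stem, alts in kober_families(words).items():
    print(stem, "->", sorted(alts))  # proto-grid rows: signs that
                                     # alternate in the same slot
```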

Ventris, building on Kober’s groundwork and his own systematic approach, hypothesised that Linear B might, against Evans’s conviction, represent an archaic form of Greek. This was a bold leap. In 1952, he tested this hypothesis by assigning phonetic values derived from Cypriot script (a known syllabary used for Greek) to Linear B symbols appearing in plausible positions for place names known from Crete. The results were startlingly coherent. Words emerged that were recognisably Greek. As Ventris himself put it, rather understatedly, in one of his work notes, “Have I sorted out the Kober ‘triplets’ by some new phonetic pattern? YES: the syllabic grid is now a FACT” [2]. The decipherment of Linear B pushed back the history of the Greek language by several centuries and provided invaluable, if prosaic, insights into the administration, economy, and society of Mycenaean Greece – mostly inventories and records, but historical gold nonetheless. It showed that even a seemingly mundane administrative system could, once unlocked, rewrite chapters of history.
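Stripped to its bones, Ventris’s test was a lookup-and-compare exercise: assign trial values, transliterate, and see whether independently known names emerge. Here is a toy version in Python – the sign IDs are placeholders, though the target readings ko-no-so (Knossos) and a-mi-ni-so (Amnisos) reflect the actual Linear B identifications:

```python
# Apply a trial sign-to-syllable assignment and check the resulting
# transliterations against independently known Cretan place names.

TRIAL_VALUES = {"s1": "ko", "s2": "no", "s3": "so",
                "s4": "a", "s5": "mi", "s6": "ni"}

KNOWN_PLACES = {"ko-no-so": "Knossos", "a-mi-ni-so": "Amnisos"}

def read(signs: list[str]) -> str:
    """Transliterate a sign sequence under the trial assignment."""
    return "-".join(TRIAL_VALUES.get(s, "?") for s in signs)

for inscription in [["s1", "s2", "s3"], ["s4", "s5", "s6", "s3"]]:
    reading = read(inscription)
    print(reading, "->", KNOWN_PLACES.get(reading, "no match"))
# Enough independent hits like these, and the grid is doing far more
# than chance allows -- Ventris's moment of certainty, in miniature.
```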

The path to decipherment isn’t always smooth, nor is it always greeted with universal acclaim, as the story of the Mayan glyphs illustrates. For decades, the prevailing view, championed by the influential Mayanist Sir J. Eric S. Thompson, was that Mayan writing was largely ideographic, concerned mainly with calendrical and astronomical information, with only a few phonetic elements, primarily for names. He dismissed early attempts to find phoneticism, notably those of the Russian linguist Yuri Knorozov. Knorozov, working in relative isolation in Leningrad and using published facsimiles of Mayan codices, along with a controversial 16th-century manuscript by Bishop Diego de Landa which purported to provide a Mayan “alphabet” (in reality, a misunderstood syllabary), proposed in the 1950s that Mayan glyphs were largely syllabic. He argued that Landa’s “alphabet” was a list of signs representing Spanish letter-sounds as Mayan syllables. For instance, when Landa asked for the Mayan sign for “b” (be in Spanish), his informant drew the glyph for the Mayan syllable *be*. Knorozov famously stated, “I am a firm believer in the ‘alphabet’ of Fray Diego de Landa. If it turned out to be an error of genius, this is no concern of mine” [3].

Thompson, however, fiercely rejected Knorozov’s ideas, partly due to Cold War politics and partly due to his own established theories. His dominance in the field meant that Knorozov’s phonetic approach was largely ignored in the West for many years. It took a new generation of scholars, including Tatiana Proskouriakoff, who demonstrated that Mayan inscriptions recorded historical events and royal lineages, and later Linda Schele, David Stuart, and Peter Mathews, among others, who embraced and expanded upon Knorozov’s phonetic principles, to fully crack the code. This collaborative, iterative process, building on earlier insights and correcting missteps, eventually revealed the richness of Mayan literature, history, and mythology, transforming our understanding of this complex civilisation. It serves as a reminder that progress in any complex system, be it deciphering a language or developing a new technology, often involves challenging established paradigms and integrating diverse perspectives.

Of course, not all ancient scripts have yielded their secrets. The Indus Valley Script, with its thousands of short inscriptions on seals, remains an enigma, partly due to the lack of a bilingual text and uncertainty about the underlying language family. Linear A, the precursor to Linear B, is also undeciphered; although the sound values of many of its signs can be inferred from Linear B, the language it records is clearly not Greek and remains unknown. Rongorongo from Easter Island, written in reverse boustrophedon (alternate lines run in opposite directions, each rotated 180 degrees), also continues to mystify researchers. These “cold cases” highlight the immense difficulty when key elements – a sufficiently large corpus of texts, a link to a known language, or a Rosetta Stone equivalent – are missing. It’s like trying to debug a complex system with only fragments of the source code and no compiler.

In more recent times, computational methods have begun to play an increasingly important role. Think about it: decipherment is fundamentally about pattern recognition, statistical analysis, and hypothesis testing on a grand scale. Computers are exceptionally good at these tasks when guided by linguistic expertise. Researchers are now using machine learning algorithms to identify patterns in undeciphered scripts, to compare them with known language structures, and to test various decipherment hypotheses far more rapidly than would be possible manually. For example, Professor Regina Barzilay at MIT and her team have used machine learning to reproduce the decipherment of Ugaritic – a cuneiform script cracked by hand in the 1930s, which makes it an ideal test case – and have explored approaches for deciphering lost languages by leveraging regularities in how languages and scripts tend to evolve [4]. This doesn’t mean the machines are “thinking” like Champollion or Ventris. Rather, they are powerful tools that can process vast amounts of data, identify subtle correlations, and present human experts with promising avenues for investigation, significantly augmenting our analytical capabilities. It’s a symbiotic relationship; the computational power handles the brute-force analysis, whilst the human scholar provides the linguistic intuition, historical context, and critical judgment.
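To give a flavour of the statistical groundwork – nothing like the neural models in the cited research, just a hand-rolled illustration – here is a sketch comparing the shape of the symbol-frequency distribution in an unknown corpus against candidate known-language corpora. Because the symbol sets differ, it compares rank-ordered frequency profiles rather than the symbols themselves; all the corpora are hypothetical stand-ins:

```python
# Compare the *shape* of symbol-frequency distributions: rank-ordered
# relative frequencies are comparable even across different sign sets.
import math
from collections import Counter

def rank_profile(corpus: str, top: int = 30) -> list[float]:
    """Rank-ordered relative symbol frequencies, padded to a fixed length."""
    counts = Counter(corpus.replace(" ", ""))
    total = sum(counts.values())
    freqs = sorted((c / total for c in counts.values()), reverse=True)
    return (freqs + [0.0] * top)[:top]

def cosine(p: list[float], q: list[float]) -> float:
    """Cosine similarity between two equal-length profiles."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

# Hypothetical corpora standing in for real transliterated texts.
unknown = rank_profile("sign sequences of the unknown script would go here")
candidates = {
    "candidate A": "a known corpus in the first candidate language",
    "candidate B": "ein bekanntes korpus in einer zweiten sprache",
}
for name, text in candidates.items():
    print(name, round(cosine(unknown, rank_profile(text)), 3))
```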

The implications of decipherment are profound. Each script that yields its secrets opens a direct window onto the minds, lives, and societies of people long gone. It allows us to hear their voices, to read their stories in their own words, rather than through the often-biased lens of later cultures or archaeological interpretation alone. It transforms abstract artefacts into historical documents, revealing the intricacies of governance, religion, science, and daily life. When the Mayan script was deciphered, for instance, it shattered the romantic notion of the Maya as peaceful, stargazing philosophers, revealing instead a world of city-states, dynastic struggles, and complex political Machiavellianism – a much more human, and in many ways more interesting, picture.

There are, of course, nuances and ongoing debates. Interpretation is rarely straightforward even with a deciphered text. The cultural context can be elusive, words can have multiple meanings, and the very act of translation involves choices that can subtly alter the original intent. Moreover, the surviving texts themselves are often a biased sample – official inscriptions, religious texts, administrative records – not always the everyday conversations or personal reflections that might paint a fuller picture. But even with these limitations, the knowledge gained is invaluable. It allows for a dialogue across millennia, a connection to the shared human experience of trying to make sense of the world and our place within it.

The journey of deciphering lost languages and scripts is a testament to human ingenuity, persistence, and our insatiable curiosity about the past. It’s a complex, multi-faceted process, demanding analytical rigour, creative thinking, and often, a healthy dose of luck. From the painstaking manual comparisons of early scholars to the sophisticated computational tools of today, the core objective remains the same: to break the code, to unlock the silent narratives, and to allow ancient voices to speak once more. As we continue to uncover new inscriptions and refine our methods, who knows what further secrets the past still holds, waiting for the right combination of insight and methodology to bring them into the light? It’s an ongoing intellectual adventure, and the potential for discovery remains as compelling as ever.

References and Further Reading:

1. Robinson, A. (2002). *Lost Languages: The Enigma of the World’s Undeciphered Scripts*. McGraw-Hill. (Provides an excellent overview of various decipherments and undeciphered scripts, including Kober’s contribution to Linear B).

2. Chadwick, J. (1990). *The Decipherment of Linear B*. Cambridge University Press. (The definitive account by Ventris’s collaborator).

3. Coe, M. D. (1992). *Breaking the Maya Code*. Thames & Hudson. (A comprehensive and engaging history of the Mayan decipherment, highlighting Knorozov’s key role and Thompson’s resistance).

4. Luo, J., Cao, Y., & Barzilay, R. (2019). Neural Decipherment via Minimum-Cost Flow: From Ugaritic to Linear B. *Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics*. (While this paper focuses on specific computational techniques, it exemplifies the application of modern machine learning to decipherment problems. Earlier work by Barzilay’s group also dealt with Ugaritic and other ancient languages).

If this has piqued your interest, you might enjoy exploring the works cited above directly. Andrew Robinson’s “Lost Languages” is particularly accessible for a general audience wanting a broad overview, whilst Coe’s “Breaking the Maya Code” reads almost like a detective novel. For those with a more computational leaning, looking into recent papers from conferences like ACL (Association for Computational Linguistics) can reveal cutting-edge research at the intersection of AI and ancient scripts.


