Can We Use AI for Khipu Decipherment?
This is our most frequently asked question. On paper, artificial intelligence (AI) and khipus, the intricate knotted-cord recording devices of the Inka Empire, would seem to be a perfect match. After all, AI has made headlines for cracking ancient languages and recognizing patterns in massive datasets. So why hasn’t AI unlocked the secrets of these enigmatic textile artifacts that have puzzled researchers for more than a century?
The AI Expectation Gap
Recent years have seen remarkable achievements in computational approaches to ancient writing systems. Machine learning algorithms have assisted in identifying patterns in undeciphered scripts, helped reconstruct damaged texts[1], and contributed to understanding various ancient languages[2]. Projects in the last five years have successfully reconstructed lost scripts from limited data, but only when the language has a known descendant or relative (like Ugaritic’s relationship to Hebrew, or Linear B’s connection to Greek) and when there is enough text for statistical inference, which typically requires thousands of words rather than dozens.
These successes have created an expectation that AI might be a universal key for unlocking any ancient communication system. However, this assumption overlooks critical requirements that successful AI applications share: substantial amounts of data to learn from and, ideally, a “Rosetta Stone” — a bilingual text or known reference point that can anchor the analysis.
The Five Pillars of Decipherment
Despite differences in culture and script, the decipherment of unknown scripts has followed a common strategy. From Egyptian hieroglyphs to Mayan writing, successful decipherments have all had five things in common. Now known as the Five Pillars of Decipherment, they were first described by Michael Coe in his thoroughly delightful book Breaking the Maya Code[3] and restated by the grammatologist Marc Zender (grammatology being the scientific study of writing systems or scripts) in his article Theory and Method in Maya Decipherment[4]. The five pillars are:
- Script Typology: understanding the type of signs and having a sign inventory.
- Corpus: a database of documents.
- Language: knowledge of the language being encoded.
- Cultural Context: understanding of rulers, places, and historical events.
- Biscript: a bilingual text or Rosetta Stone.
This framework provides a valuable structure for understanding exactly where khipu decipherment stands, and why AI has not yet been able to “crack the code.”
Pillar 1: Script Typology and Sign Inventory
After more than a century of research, scholars have developed a solid understanding of a khipu’s basic elements. Typically, each khipu consists of a primary cord with pendant cords hanging from it, potentially encoding information in multiple variables: knot types, knot numbers and positions, cord colors, final cord twist direction, materials, spacing, length, and grouping patterns.
The numerical aspect of Inka khipus is perhaps the best understood: the decimal positional system often encoded in their knots was famously deciphered in 1912 by L. Leland Locke[5]. Since then, multiple arithmetic operations, such as sums, have also been clearly identified.
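As a minimal sketch of the decimal reading, assume each cord has already been transcribed as a list of knot counts, one per decimal position, highest power of ten first, with 0 for an empty position. The function names and the toy values are hypothetical and are not drawn from any khipu database; real transcription also has to distinguish knot types, which this sketch ignores.

```python
from typing import List


def cord_value(knot_counts: List[int]) -> int:
    """Read one cord as a base-10 number.

    knot_counts lists the number of knots in each decimal position,
    highest power of ten first; 0 marks an empty position.
    """
    value = 0
    for count in knot_counts:
        if not 0 <= count <= 9:
            raise ValueError(f"a decimal position holds 0-9 knots, got {count}")
        value = value * 10 + count
    return value


def is_sum_cord(candidate: List[int], others: List[List[int]]) -> bool:
    """Check whether one cord's value equals the sum of a group of cords."""
    return cord_value(candidate) == sum(cord_value(c) for c in others)


# Toy data (hypothetical): three pendant cords and a possible sum cord.
pendants = [[1, 2, 3], [0, 4, 5], [2, 0, 1]]  # 123, 45, 201
total = [3, 6, 9]                              # 369
print([cord_value(c) for c in pendants])
print(is_sum_cord(total, pendants))
```

A check like `is_sum_cord` is the kind of arithmetic relationship that can be tested mechanically once the knots are read as numbers.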
Consequently, this pillar is reasonably solid for khipus.
Pillar 2: Corpus
Recent surveys put the total number of surviving khipus today at around 1,300 to 1,600[6][7]. This might sound like a lot, but it’s actually a relatively small number for machine learning purposes. When AI systems learn to translate between languages, they are typically trained on millions of sentence pairs. The khipu corpus is orders of magnitude smaller than what would currently be considered adequate for most machine learning applications.
Moreover, our existing khipu database is riddled with incorrect or incomplete data. Signs are sometimes only partially recorded, digitization errors are frequent, and in some cases khipus have been entered into the database backward. Making things even more difficult, many khipus are themselves incomplete or damaged.
One of the main difficulties caused by this messy, incomplete data is that decipherment hypotheses are hard to prove statistically.
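To see why a small or noisy sample weakens a statistical argument, here is a minimal sketch using an exact one-sided binomial test. It assumes a hypothetical hypothesis that would hold by chance in half of all khipus; the counts are invented for illustration and do not describe any real study.

```python
from math import comb


def binomial_tail(successes: int, trials: int, chance: float) -> float:
    """Probability of at least `successes` hits in `trials` by chance alone."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Same 70% hit rate, different amounts of usable data (hypothetical numbers).
small = binomial_tail(7, 10, 0.5)    # only 10 clean khipus survive filtering
large = binomial_tail(70, 100, 0.5)  # 100 clean khipus
print(f"n=10:  p = {small:.4f}")
print(f"n=100: p = {large:.6f}")
```

With ten usable khipus, a 70% hit rate gives a p-value of roughly 0.17, which would convince no one; the same rate over a hundred khipus gives a p-value far below 0.001. Every record lost to damage or transcription error pushes an analysis toward the first case.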
This pillar represents a significant weakness for khipus.
Pillar 3: Language
Quechua, the principal language of the Inka Empire, is still spoken by millions today. Linguists have extensive knowledge of both modern Quechua and historical forms documented in colonial-era texts.
However, this strength is tempered by fundamental uncertainty: we are not entirely certain what kind of information khipus encode or whether they function like a conventional writing system — i.e., directly encoding sound and language. Colonial sources mention khipus being used to record histories, genealogies, poetry, and laws, but some researchers argue khipus may simply have been mnemonic devices rather than encoding narratives directly. Khipus may encode pattern-based or numeric information rather than phonetic language — a scenario where AI’s typical linguistic analysis tools simply do not apply.
Therefore, while this pillar is in relatively good shape, it is difficult to know if we can even leverage it in the case of khipus.
Pillar 4: Cultural Context
Researchers have substantial knowledge about Inka history, social organization, and administrative systems. This context is invaluable for forming hypotheses about what specific khipus might contain. In the Santa Valley, for example, researchers have worked to match six archaeological khipus with colonial census data from the same region[8].
However, even with this contextual knowledge, researchers lack definitive confirmation. Cultural context can suggest what a khipu might say, but cannot yet prove what it actually says. Moreover, AI systems do not have cultural knowledge or contextual understanding the way humans do. They can identify statistical patterns but cannot intuit cultural significance without explicit training on examples.
Thus, similar to the previous case, while this pillar appears to be in relatively good shape, it has not yet proved to be the linchpin for khipu decipherment.
Pillar 5: Biscript — The Missing Rosetta Stone
The fifth pillar represents the most critical gap, and the primary reason AI cannot simply “crack the code.” There is no confirmed bilingual text that definitively tells researchers what a specific khipu says. The closest approximation comes from around one hundred Spanish colonial court documents in which khipus were used as evidence and then translated into Spanish or Quechua for the court record. However, the original khipus referenced in these documents have not been identified or matched with surviving examples. Without the physical khipus that correspond to these translations, researchers cannot establish the confirmed correspondences needed for decipherment.
Without confirmed examples of what specific khipus actually say, there is no way to train a supervised learning algorithm — the type of AI most successful at decipherment tasks. This is why AI systems, like Google Translate, cannot simply be applied to khipus.
This is our missing pillar.
Understanding AI: Supervised versus Unsupervised Learning
When most people ask whether AI can decipher khipus, they are thinking of systems like Google Translate: neural translation models trained on vast quantities of human-generated text. These rely on supervised learning, where the AI is trained on paired examples: an input and its output, such as text in one language and its translation in another.
With enough data, modern AI systems (such as deep neural networks) can detect recurring symbols, word boundaries, and syntax patterns, and compute co-occurrence statistics. They can infer relationships by clustering words or symbols that behave alike (such as verbs or nouns), compare to known languages to detect likely relatives, and, in rare cases, build translation mappings that align an unknown script to a known one when bilingual corpora exist.
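The idea of grouping symbols that "behave alike" can be sketched with context vectors and cosine similarity. This is a toy illustration on invented symbol sequences, offered under the assumption that neighboring symbols are a reasonable notion of context; it is not a method taken from any of the cited projects.

```python
from collections import Counter, defaultdict
from math import sqrt
from typing import Dict, List


def context_vectors(sequences: List[List[str]]) -> Dict[str, Counter]:
    """Count, for each symbol, which symbols appear directly before or after it."""
    vectors: Dict[str, Counter] = defaultdict(Counter)
    for seq in sequences:
        for i, symbol in enumerate(seq):
            if i > 0:
                vectors[symbol]["L:" + seq[i - 1]] += 1
            if i < len(seq) - 1:
                vectors[symbol]["R:" + seq[i + 1]] += 1
    return vectors


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Invented sequences: "X" and "Y" occur in the same slots, "Z" does not.
sequences = [["A", "X", "B"], ["A", "Y", "B"], ["Z", "A", "X"], ["Z", "A", "Y"]]
vecs = context_vectors(sequences)
print(cosine(vecs["X"], vecs["Y"]))  # X and Y share all their contexts
print(cosine(vecs["X"], vecs["Z"]))  # X and Z share none
```

Symbols with high similarity are candidates for the same category, but the score alone does not say what that category is.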
The fundamental problem with khipus and AI is straightforward: while researchers have the input (the physical khipus themselves), they lack the output, namely what these khipus actually say. Without confirmed translations, there is no way to train a supervised learning system.
The alternative is unsupervised learning, where AI identifies patterns without being told what they mean. Some neural network architectures can identify which elements frequently co-occur and may therefore be conceptually related. However, identifying that a pattern exists is very different from understanding what that pattern means. Without confirmed examples of what a khipu actually says, these patterns remain unexplained.
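A minimal sketch of unsupervised co-occurrence analysis follows, assuming each khipu has been reduced to the set of cord colors it contains. The color labels and the data are invented, and the score used here, pointwise mutual information (PMI), is one common choice among many rather than the method of any cited study.

```python
from collections import Counter
from itertools import combinations
from math import log2
from typing import Dict, List, Set, Tuple


def pmi_scores(khipus: List[Set[str]]) -> Dict[Tuple[str, str], float]:
    """PMI for every color pair that co-occurs in at least one khipu."""
    n = len(khipus)
    single = Counter(color for k in khipus for color in k)
    pairs = Counter(pair for k in khipus for pair in combinations(sorted(k), 2))
    return {
        (a, b): log2((count / n) / ((single[a] / n) * (single[b] / n)))
        for (a, b), count in pairs.items()
    }


# Invented data: "red" and "white" appear together; "brown" is everywhere.
khipus = [
    {"red", "white", "brown"},
    {"red", "white", "brown"},
    {"blue", "brown"},
    {"blue", "brown"},
]
scores = pmi_scores(khipus)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

In this toy data the red and white pair scores above zero, meaning the two colors travel together more often than chance predicts, while pairs involving the ubiquitous brown score zero. The score detects the association; it says nothing about what either color meant.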
Current common AI approaches fail under several conditions:
- When the corpus is too small, as with small-data cases like Elamite, Linear A, or the Voynich Manuscript.
- When there’s no bilingual text.
- When the script may encode non-linguistic information.
Khipu research faces all of these challenges simultaneously. In such cases, AI can produce guesses, symbol clusters, or patterns, but so far they are unverifiable. It’s pattern detection, not decipherment.
What Computational Approaches Have Revealed
Despite the absence of a Rosetta Stone, researchers have applied various computational techniques to khipu data. Researcher Jon Clindaniel, for instance, applied a transformer neural network architecture, BERT, to identify semantic relationships, revealing sets of closely related colors[9]. I have previously used statistical approaches, including topic modeling, with roughly similar results[10].
The critical question is: what do these patterns mean? Without confirmed examples of khipu content, researchers cannot answer definitively. The patterns exist, AI reliably detects them, but their meaning remains opaque.
These computational approaches work best as hybrid methods that combine AI pattern recognition with human expertise: researchers who can evaluate plausibility and context, material and archaeological clues, and cross-disciplinary models that link symbols to quantities, directions, or actions. Researchers are exploring how khipu data might map to counts or narrative threads, integrating computational pattern detection with cultural and historical knowledge.
The Broader Lesson About AI
The khipu case offers an important lesson about AI’s capabilities and limitations. AI is a powerful tool, but it cannot create information that does not exist in the data or overcome fundamental gaps in knowledge through computational power alone. The most successful applications of AI to ancient writing systems have involved researchers who already understood the system well and needed help with specific tasks — filling in damaged text, identifying scribal hands, or processing large document collections quickly.
When people imagine AI deciphering khipus, they envision feeding data into a neural network and receiving translations — but AI systems learn from examples. Without confirmed examples, without that missing fifth pillar, supervised learning has nothing to learn from. Unsupervised learning can identify patterns but cannot explain their meaning. It’s like asking an AI system to decipher a document when you cannot tell it whether it’s written in a phonetic alphabet, logographic system, musical notation, or mathematical formula.
The Answer
So why are we not yet able to use AI for khipu decipherment? Examining Coe’s Five Pillars makes it clear:
- Pillar 1 (Script Typology): Reasonably solid
- Pillar 2 (Corpus): Small, incomplete, and error-prone
- Pillar 3 (Language): Strong linguistic foundation, but uncertainty about whether or how khipus encode language
- Pillar 4 (Cultural Context): Substantial knowledge, but gaps remain
- Pillar 5 (Biscript): Missing, no confirmed Rosetta Stone
Without the crucial fifth pillar, AI cannot function as people imagine it can. Supervised learning requires verified examples that do not yet exist. Unsupervised learning can identify patterns but cannot interpret them.
This does not mean computational approaches are useless. As our dataset grows and as our decipherment becomes more complete, AI may serve as a valuable laboratory for experiments and understanding. Currently, hybrid methods that couple statistical inference with cross-disciplinary approaches, such as those of the Khipu Field Guide team, offer the most promising directions.
Ultimately, decipherment is not purely a technical problem. It’s a profoundly human endeavor that requires creativity, cultural insight, historical knowledge, and intuitive leaps that current AI systems do not yet possess. The scholars who deciphered scripts like Egyptian hieroglyphs and Mayan glyphs succeeded through a combination of linguistic knowledge, historical context, cultural understanding, and inspired guesswork, as much art as science.
The story of khipu and AI is not one of technological failure. It’s a reminder that some mysteries require more than algorithms to solve. They require bridging not just linguistic gaps but cultural and temporal ones — understanding not just what symbols are, but what they meant to the people who created them to record, to remember, and to communicate across time. That is a challenge that will no doubt keep us, as researchers, engaged for years to come.
[1] Assael, Yannis, Thea Sommerschield, Brendan Shillingford, et al. 2022. “Restoring and Attributing Ancient Texts Using Deep Neural Networks.” Nature 603 (7900): 280–83. https://doi.org/10.1038/s41586-022-04448-z.
[2] Sommerschield, Thea, Yannis Assael, John Pavlopoulos, et al. 2023. “Machine Learning for Ancient Languages: A Survey.” Computational Linguistics 49 (3): 703–47. https://doi.org/10.1162/coli_a_00481.
[3] Coe, Michael D. 2012. Breaking the Maya Code. Thames & Hudson.
[4] Zender, Marc. 2017. “Theory and Method in Maya Decipherment.” The PARI Journal 18 (2).
[5] Locke, L. Leland. 1912. “The Ancient Quipu, a Peruvian Knot Record.” American Anthropologist 14 (2): 325–32. http://www.jstor.org/stable/659935.
[6] Medrano, Manuel. 2021. Quipus: Mil Años de Historia Anudada En Los Andes y Su Futuro Digital. Planeta.
[7] Thompson, Karen M. 2025. “Connecting Objects and Literature: A Case Study with Khipus, the ‘Khipu-Biblio Cross-Reference.’” Advances in Archaeological Practice, August 26, 1–18. https://doi.org/10.1017/aap.2025.3.
[8] FitzPatrick, Mackinley. 2024. “New Insights on Cord Attachment and Social Hierarchy in Six Khipus from the Santa Valley, Peru.” Ethnohistory 71 (4): 443–69. https://doi.org/10.1215/00141801-11266328.
[9] Clindaniel, Jon. 2024. “Colorful Insights from an AI Khipukamayuq.” Preprint, SocArXiv, May 22. https://doi.org/10.31235/osf.io/4p7s5.
[10] Khosla, Ashok. 2022. “Cord Color ‘Topic Modeling.’” www.khipufieldguide.com. https://www.khipufieldguide.com/notebook/analyses/ascher_color_count.html#cord-color-topic-modeling.