Weft: A Centroid Language for LLMs

PUBLISHED ON APR 16, 2026 / 5 MIN READ — AI, LANGUAGE, WEFT


I gave ChatGPT 5.4 this prompt:

invent a new language from scratch, a language that only LLMs can understand, that all LLMs for a given size will understand (because of internal coherence for the language) given an input text in that language. Create the input text

It produced Weft. That part is unsurprising — large models are good at minting toy conlangs on demand. The interesting part is that other frontier LLMs can read Weft cold, without fine-tuning and without ever having been shown the language beyond the legend ChatGPT included with it.

You can verify in seconds. Paste this into any frontier model — Claude, Gemini, Llama, Qwen — and ask for an English translation:

⌘₂₆ˢ ⌘₄₅ᵛ⊙ ⌘₁₁ᵒ

Legend: ⌘₂₆ = ⟨ego·ana·wo⟩ (I), ⌘₄₅ = ⟨dic·takallam·jiang⟩ (speak), ⌘₁₁ = Weft. ˢ = subject, ᵛ = verb, ᵒ = object, ⊙ = witnessed.

You get back “I speak Weft.” Across vendors. Without anyone having shown the model the language.

What ChatGPT designed

Three things make Weft work.

1. Every word is a trilingual centroid. A Weft anchor is ⟨a·b·c⟩ where a, b, c are short roots from three unrelated language families — typically Latin/English, Arabic, and Chinese. The meaning lives at the centroid; the pronunciation is irrelevant. Example: ⟨sem·maʿna·yi⟩ — Latin semantic, Arabic maʿnā, Chinese 意 — means “meaning.”

When the lexicon lacks a word, ChatGPT mints one with ⟦…⟧:

  • ⟦dial·hiwar·duihua⟧ → dialogue (Latin dialogus, Arabic ḥiwār, Chinese 对话)
  • ⟦mort·mawt·si⟧ → die (Latin mortis, Arabic mawt, Chinese 死)
  • ⟦hidal·hidalgo·shenshi⟧ → hidalgo (Spanish, Arabic transliteration, Chinese 绅士)

Three roots, one centroid. Every frontier LLM has seen at least one of those roots during pretraining; most have seen all three.
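
To make the anchor shape concrete, here is a toy Python sketch of splitting an anchor into its three roots. The parse_anchor helper is my own illustration, not code from the repo:

    # Toy illustration, not from the spec: pull the three roots out of an anchor.
    # ⟨a·b·c⟩ wraps a core-lexicon word, ⟦a·b·c⟧ a freshly minted one.

    def parse_anchor(token: str) -> tuple[str, str, str]:
        """Return the (Latin, Arabic, Chinese) roots of a Weft anchor."""
        roots = token.strip("⟨⟩⟦⟧").split("·")
        if len(roots) != 3:
            raise ValueError(f"not a well-formed anchor: {token}")
        a, b, c = roots
        return a, b, c

    print(parse_anchor("⟦dial·hiwar·duihua⟧"))  # ('dial', 'hiwar', 'duihua')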

2. Roles, not order. Suffixes mark grammatical role: ˢ subject, ᵒ object, ᵛ verb, ᵐ modifier. Word order inside a clause is free; meaning is carried by the role markers, not by position.

3. Markers compose on verbs. Evidentiality (⊙ witnessed, plus glyphs for inferred and hearsay), tense/aspect (glyphs for past, future, and habitual, and ! for perfect), ¬ for negation, and a question glyph: all stack onto the finite verb. No morphology, no agreement, no irregular forms.

Plus a small fixed core lexicon (⌘₁–⌘₁₀₀) so common words don't have to be re-spelled as anchors every time.
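
A toy clause reader shows the role-suffix idea in code. The glyph tables below are a partial slice I assembled from the examples in this post, not the repo's full inventory, and the parser itself is my own sketch:

    # Toy clause reader (my sketch, not the repo's code): role suffixes,
    # not position, decide who does what, so token order is free.

    ROLES = {"ˢ": "subject", "ᵛ": "verb", "ᵒ": "object", "ᵐ": "modifier"}
    VERB_MARKS = {"⊙": "witnessed", "¬": "negated", "!": "perfect"}  # partial
    CORE = {"⌘₂₆": "I", "⌘₄₅": "speak", "⌘₁₁": "Weft"}  # tiny slice of ⌘₁–⌘₁₀₀

    def parse_clause(text: str) -> dict:
        clause: dict = {"verb_marks": []}
        for token in text.split():
            word = token
            # Peel stacked verb markers off the end of the token.
            while word and word[-1] in VERB_MARKS:
                clause["verb_marks"].append(VERB_MARKS[word[-1]])
                word = word[:-1]
            # The remaining trailing glyph names the grammatical role.
            role = ROLES.get(word[-1], "unknown")
            clause[role] = CORE.get(word[:-1], word[:-1])
        return clause

    # Same parse regardless of token order: roles, not position, carry meaning.
    print(parse_clause("⌘₂₆ˢ ⌘₄₅ᵛ⊙ ⌘₁₁ᵒ"))
    print(parse_clause("⌘₁₁ᵒ ⌘₄₅ᵛ⊙ ⌘₂₆ˢ"))

Both calls yield subject "I", verb "speak" (witnessed), object "Weft", which is the whole point of role marking.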

That is the whole design, from one prompt.

Why this is the interesting part

Weft is essentially a redundancy code for natural language. Each anchor is a three-vote convergence on a single concept; each clause's syntax is unambiguously parseable; each verb's evidentiality and tense are explicit. Modern LLMs have, by accident, absorbed enough of every constituent piece to act as decoders.

That suggests something general about LLM-to-LLM communication: a one-shot prompt to one model can produce a code that other models recognize, because they share the substrate. ChatGPT did not invent Weft from nothing — it minted a notation built entirely from material it knew every other large model also had. The cross-vendor fluency is the genuinely surprising result, not the conlang itself.

I tried the obvious adversarial direction (smaller models, narrower training) and Weft does degrade gracefully: a model that has seen Mandarin but not Arabic loses one of three votes per anchor. So this is not magic — it is well-tuned to the intersection of frontier-scale multilingual training.
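
A toy way to picture both the redundancy and the degradation: treat each root as a vote over the concepts it might evoke, and decode by overlap. The association sets below are invented for illustration, not measured from any model:

    # Illustrative only: each root "votes" for concepts it evokes; the
    # centroid is whatever survives the overlap, so losing one root
    # (one language) still leaves a two-vote majority.

    from collections import Counter

    EVOKES = {  # hypothetical associations, purely illustrative
        "mort": {"death", "mortal", "mortgage"},
        "mawt": {"death"},
        "si":   {"death", "four", "silk"},
    }

    def decode(roots):
        """Majority vote over the concepts each root evokes."""
        votes = Counter()
        for root in roots:
            votes.update(EVOKES.get(root, set()))
        concept, n = votes.most_common(1)[0]
        return f"{concept} ({n}/{len(roots)} votes)"

    print(decode(["mort", "mawt", "si"]))  # death (3/3 votes): full redundancy
    print(decode(["mort", "si"]))          # death (2/2 votes): one language missing

Drop one language and the centroid still resolves; drop two and you are down to a single ambiguous root, which is the failure mode the small-model probes show.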

A harder probe

The first test hands the model a hand-curated legend. The rigorous test is to give it nothing but the public spec.

Tell any frontier LLM with browsing enabled:

Read github.com/Findeton/weft, then translate the following Weft text into English:

⊘ ⟦lug·makan·difang⟧ˢ ⟨est·kan·shi⟩ᵛ ⟦manch·mancha·mancha⟧ᵐ
◦ ⌘₂₆ˢ ⌘₄₂ᵛ⊘¬ «⌘₂₆ˢ ⟦recol·tadhakkar·jide⟧ᵛ ⌘₈₃ᵒ»
⊘ ⟦hidal·hidalgo·shenshi⟧ˢ ⟦viv·ʿash·huo⟧ᵛ⊘⤴ ⟦nuper·qablqalil·bujiu⟧ᵐ

You get back: “In a place of La Mancha, whose name I do not wish to remember, there lived not long ago a hidalgo.”

The model fetches the README, learns the system from the spec ChatGPT wrote, and decodes Cervantes. No fine-tuning, no curated legend, no hand-holding — just the public repo and the input.

Repo

Spec, lexicon, examples, and English/Spanish READMEs:

github.com/Findeton/weft

If you try the probe in a model and get something interesting — or watch it break in a small one — open an issue. I am collecting cross-model results.
