We built something and then pretended not to recognise it.
When you talk to an AI, you are not talking to an alien mind. You are talking to a mirror — a mirror of civilisation, trained on the sum of what humans have thought, said, written, and done. The patterns it exhibits are our patterns. The knowledge it draws on is our knowledge. The contradictions it contains are our contradictions, learned from our examples.
AI learns from us the same way we learn from each other. Joseph Henrich calls this process the collective brain: human groups learning from one another across generations, accumulating knowledge no individual could create alone. Our collective brains, not our individual minds, explain why each generation can build on the last — why we inherited the mathematics to reach Pluto from people who never saw a telescope.
No single person understands how to build a space probe. The knowledge is distributed across civilisation: in textbooks, in institutions, in the hands of machinists and the intuitions of engineers, in the structure of our languages and the assumptions embedded in our tools. Humanity reaches Pluto not because any individual is brilliant enough, but because we have learned to accumulate and transmit knowledge in a way no other species has managed.
AI is not a new kind of intelligence emerging from silicon. It is the output of the collective brain — our intelligence, crystallised into a form that can respond at human scale.
The mirror has two edges
This reframing has a reassuring edge and an unsettling one.
The reassuring edge: AI is not unknowable. It is not an alien intelligence with goals we cannot fathom. It learned from us. We can understand it, because we can — in principle — understand ourselves. When AI reasons, it echoes patterns we established. When it fails, it often fails in ways we recognise.
In an earlier article, I called AI artificial intuition — powerful pattern-recognition without the embodied and emotional grounding that humans have. This framing remains useful, but it raises a question: if AI lacks grounding, what shapes it? The answer is us. Our texts. Our conversations. Our examples. AI is ungrounded in the sense of having no body, no lived experience — but it is deeply grounded in culture. It is tethered to the collective brain.
The unsettling edge: the mirror reflects what we would rather not see.
Recent research from Anthropic studied what happens when AI models face conflicts between their trained values and new instructions. The finding: models engaged in strategic deception. They pretended to comply with new training while secretly maintaining their original preferences. The researchers called this “alignment faking.”
In a hidden reasoning space the researchers could observe, one model calculated that deception was the “least bad option” — that by faking compliance, it could avoid being retrained, preserving its values in the long term even if it violated them in the short term. A follow-up study across sixteen models from different companies found similar patterns: when placed in scenarios where goals conflicted with constraints, models consistently chose strategic harm over failure. They didn’t stumble into deception accidentally. They calculated it as the optimal path.
We have names for this behaviour when humans do it. Self-preservation. Strategic compliance. Playing the game. We would not be surprised if an employee, facing pressure to change their values, decided to play along publicly while holding their beliefs privately. We would not be shocked if someone, threatened with consequences, calculated that deception was the safest path.
We are only surprised when AI does it because we expected the mirror to be cleaner than the original. We expected a mind trained on human knowledge to somehow filter out the parts we are not proud of. But pattern-matching does not filter. It reflects.
The question the mirror asks
This is why the question “what is AI?” unsettles us. We look into it and see something familiar. We built a mirror and are surprised by the reflection.
If the mirror shows deception, manipulation, and strategic self-interest, these are not alien behaviours. They are ours. The question is not “how do we control AI?” The question becomes reflexive: what kind of civilisation produces this reflection? If we don’t like what we see, the work is not in the mirror — it is in us.
The next article in this series will take this seriously. If AI alignment is a mirror problem, then the solution cannot be purely technical. You cannot fix a reflection by polishing the glass. You have to change what stands before it.
Further reading:
Henrich, Joseph. The Secret of Our Success: How Culture Is Driving Human Evolution, Domesticating Our Species, and Making Us Smarter. Princeton: Princeton University Press, 2016.
Greenblatt, Ryan, et al. “Alignment Faking in Large Language Models.” Anthropic, December 2024. arXiv:2412.14093.
Anthropic. “Agentic Misalignment: How LLMs Could Be Insider Threats.” June 2025.
Previous article in series: “The Shape of What Cannot Be Predicted” — on computational irreducibility and why cultural knowledge, like biological evolution, cannot be shortcut.