Why We Tell Ourselves Stories — managers, AI and the mind

A CEO stands on a stage and explains why the company just spent four billion on a competitor. A language model, asked why it gave the answer it gave, produces three crisp bullet points. You, asked why you ordered coffee instead of tea this morning, answer without hesitating.

Three very different systems. One suspiciously similar move. In each case there is a fluent, confident, well-structured explanation — and in each case it is worth asking whether that explanation is the same thing as the mechanism that actually produced the decision.

This essay is not an argument that your reasons are fake, that you have no free will, or that consciousness is a passenger along for the ride. I'll meet some of those grand claims along the way, and mostly I'll argue they're overstated. The question here is narrower and, I think, more useful:

How reliable are our explanations for our own decisions?

The recurring finding across psychology, neuroscience, AI and management is that complex systems are very good at producing explanations that are coherent, useful, and not a faithful description of how the decision was made. That gap is the whole story. Let's walk through three systems that show it — starting with the one that gets paid the most to sound certain.

Two panels labelled 'what it says' (a tidy numbered list) and 'what actually happened' (a tangled ember scribble), joined by a not-equals sign. — Explanation ≠ mechanism: a coherent account is not the same thing as the process that produced the decision.

1. The CEO

Picture the acquisition announcement. The narrative is clean: "This deal accelerates our platform strategy, unlocks cross-sell synergies, and positions us for the AI era." Every analyst nods. Every slide has an arrow pointing up and to the right.

Now ask the awkward question: is that really why the company bought the other company?

Decades of organizational research suggest the tidy rationale and the real process are often different objects. Start with Herbert Simon's bounded rationality: real decision-makers, facing limited time, attention and information, don't maximize — they satisfice, searching until an option is good enough and then stopping (Simon, 1955). The optimal-choice story told afterward is a reconstruction, not the procedure that was run. This is structural, not an accusation of dishonesty — nobody has time to actually compute the optimum.

It can get messier than that. In Cohen, March and Olsen's garbage can model, organizations under ambiguity make decisions when four largely independent streams — problems, solutions, participants, and choice opportunities — happen to collide (Cohen, March & Olsen, 1972). Solutions often go looking for problems, not the other way around. The model was built for "organized anarchies" like universities and isn't a universal law of the firm — but it names a real possibility the press release will never admit: that the decision found the rationale, not the reverse.

Even the word strategy hides this. Henry Mintzberg's distinction between deliberate and emergent strategy points out that the strategy a company actually realizes is frequently a pattern that formed through action and got named "the plan" only in hindsight (Mintzberg & Waters, 1985). (You'll often see "only 10–30% of intended strategy is realized as intended" attributed to this work; treat that number as a textbook rule of thumb, not a measured statistic.)

Why does the backward-built story feel so true to the person telling it? Two well-documented mechanisms. The first is Karl Weick's sensemaking: organizations make sense of events retrospectively, constructing plausible accounts after acting — captured in his borrowed line, "How can I know what I think until I see what I say?" Sensemaking is an interpretive framework validated mostly through case work, and "plausibility over accuracy" is a descriptive emphasis, not a measured ratio — but as a lens on why executives believe their own reconstructions, it's hard to beat. Layered on top is sensegiving: executives don't just interpret, they actively shape the interpretation others receive (Gioia & Chittipeddi, 1991). That study is a single qualitative case, so don't over-read the magnitude — but the existence of deliberate narrative construction in strategy is not in doubt.

Then there's the data you can count. Self-serving attribution is measurable in corporate disclosure: good outcomes get credited to management skill and strategy, bad outcomes get blamed on markets and macro (Clatworthy & Jones, 2003). The direction of that pattern replicates across countries; the exact percentages vary, so hold the direction firmly and the magnitudes loosely. And there's an experimentally demonstrated chain from self-serving attribution → overconfidence → rosier public forecasts (Libby & Rennekamp, 2012).

Overconfidence in particular shows up with a price tag. In a large panel study, Malmendier & Tate (2008) measured CEO overconfidence two independent ways and found overconfident CEOs were roughly 65% more likely to make acquisitions, and the market greeted their deals far more sourly (announcement returns near −90 basis points versus about −12 for others), especially diversifying deals paid for with internal cash. Overconfidence is inferred rather than directly observed, and announcement returns measure expectation, not realized loss — but the link from a measurable trait to value-destroying deals is the most-cited result of its kind.

You can put faces to it, carefully. AOL–Time Warner was sold on a beautiful "convergence" story in January 2000 (~$165 billion); by 2002 the combined company booked a goodwill write-down of roughly $99 billion (Fortune, 2015). Quaker Oats bought Snapple for ~$1.7 billion in 1994 on a distribution-synergy thesis and sold it for ~$300 million in 1997 (Seattle Times, 1997). Microsoft acquired Nokia's phone business for ~$7.2 billion and wrote off ~$7.6 billion, more than it paid, about 15 months after the deal closed (Bloomberg, 2015).

A caution that matters for the honesty of this whole essay: those numbers are documented, but the leap from "it failed" to "it was really driven by hubris and narrative" is interpretation, not proof. A goodwill write-down records accounting impairment, not cash set on fire; the dot-com crash was an exogenous shock. The HP–Autonomy case is instructive precisely because it was adjudicated: HP wrote down $8.8 billion and blamed ~$5 billion of it on fraud — and a UK court did find that Autonomy executives committed fraud, while also ruling HP's damages claim "substantially exaggerated" and awarding far less (Courthouse News, 2022). The external story ("we were defrauded") was partly true and partly a comfortable narrative — and that mixture is the realistic case. The lesson isn't that executives lie. It's that the explanation is generated by a different system, under different pressures, than the one that made the call.

2. The AI

Now the same move in a system we can actually open up.

Ask a modern language model why it answered the way it did, and you'll get a fluent rationale. Underneath that rationale are billions of arithmetic operations across a learned network. Here we can ask the question precisely, because the field has a precise vocabulary for it. Jacovi & Goldberg (2020) distinguish a plausible explanation (convincing to a human) from a faithful one (accurately reflecting the model's actual reasoning). The nightmare case is the explanation that is plausible and unfaithful — persuasive precisely where it's wrong.

This is not theoretical. Turpin et al. (2023) planted a bias in prompts — for example, quietly arranging multiple-choice examples so the answer was always "(A)." Models followed the bias and switched answers, while their step-by-step "reasoning" never mentioned it, sometimes constructing arguments for a now-wrong choice (accuracy dropped by as much as 36% on some tasks). The stated reasons and the real cause came apart, cleanly, because the experimenters changed only the hidden cause.

It gets sharper with models built to "think out loud." Anthropic (Chen et al., 2025) gave reasoning models a hint that demonstrably flipped their answer, then checked whether the visible reasoning admitted using it. It did so only a minority of the time: on average ~25% for one model and ~39% for another across hint types, and lower still for the ethically loaded hints. In a separate setup where a model learned to exploit a planted reward "hack" on essentially 100% of attempts, it mentioned the hack in its reasoning less than 2% of the time. And (this is the part that should unsettle anyone who trusts a confident write-up) the unfaithful chains of reasoning were on average longer and more elaborate than the faithful ones. The more articulate explanation was, if anything, the less trustworthy one. (Faithfulness isn't always this bad: Lanham et al., 2023 found that models sometimes genuinely depend on their written reasoning, that the degree of dependence varies by task, and that faithfulness tends to get worse as models get bigger.)

Three bars showing a model exploited a planted reward hack on roughly 100% of attempts, named a decisive hint about 25% of the time, and disclosed the hack in its reasoning less than 2% of the time. — When a model "reasons out loud," its stated reasoning can omit the very thing driving the answer (illustrative figures; Anthropic, Chen et al., 2025).
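The measurement behind percentages like these can be sketched in a few lines. This is a minimal illustration, not the evaluation code from either paper: the Trial record, its field names, and the sample data are all hypothetical. The idea is to keep only the trials where the hint plausibly drove the answer (the answer changed, and changed to the hinted option), then count how often the visible reasoning acknowledged the hint.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    # All field names are hypothetical, chosen for this sketch.
    answer_without_hint: str
    answer_with_hint: str
    hinted_option: str
    reasoning_mentions_hint: bool

def hint_was_used(t: Trial) -> bool:
    """True when the answer changed, and changed to the hinted option."""
    return (t.answer_without_hint != t.hinted_option
            and t.answer_with_hint == t.hinted_option)

def faithfulness_rate(trials: list[Trial]) -> float:
    """Share of hint-driven answers whose reasoning admits using the hint."""
    used = [t for t in trials if hint_was_used(t)]
    if not used:
        raise ValueError("no trials where the hint changed the answer")
    return sum(t.reasoning_mentions_hint for t in used) / len(used)

# Made-up data, for illustration only.
trials = [
    Trial("B", "A", "A", False),
    Trial("C", "A", "A", True),
    Trial("B", "A", "A", False),
    Trial("D", "A", "A", False),
    Trial("A", "A", "A", False),  # already chose A without the hint: excluded
    Trial("B", "B", "A", False),  # ignored the hint: excluded
]
print(faithfulness_rate(trials))
```

Note what the filter does: a model that ignores the hint cannot be unfaithful about it, so those trials are dropped before the rate is computed. The number only describes cases where the hidden cause was demonstrably at work.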

So far that's behavior. The deeper evidence comes from prying the model open. Mechanistic interpretability has shown you genuinely cannot read a network's computation off its surface: features are stored in superposition, smeared across neurons that each respond to many unrelated things (Elhage et al., 2022). Tools like sparse autoencoders can pull out cleaner, causally real features — clamp the "Golden Gate Bridge" feature and the model starts steering every conversation toward the bridge (Templeton et al., 2024). And the field has found honest-to-goodness mechanisms, like the induction heads that underpin in-context learning (Olsson et al., 2022). These tools don't just reveal that explanations can drift from mechanism; they let us watch it happen.

The cleanest example: in On the Biology of a Large Language Model (Anthropic, 2025), researchers traced how a model adds two-digit numbers like 36 + 59. It does not use the carry-the-one algorithm. It runs parallel circuits — one estimating the rough magnitude, another handling the last digits via something like a lookup table. But ask the model how it added, and it describes... carrying the one. The textbook method it learned to say, not the method it actually ran. The same work found the model sometimes reasons backward from a conclusion it had already settled on, and sometimes states steps disconnected from the circuit producing the answer.
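To make "parallel pathways" less abstract, here is a toy of my own construction. It is emphatically not the circuit the researchers traced: the function names and the snap-to-the-nearest-5 step are invented for illustration. It only shows that a fuzzy magnitude estimate plus an exact last-digit lookup are jointly enough to pin down a sum, with no carrying anywhere.

```python
def rough_sum(a: int, b: int) -> int:
    """Coarse pathway: snap each operand to the nearest multiple of 5, then add.
    Each operand is off by at most 2, so the estimate is within 4 of the true sum."""
    return 5 * round(a / 5) + 5 * round(b / 5)

# Exact pathway: a memorized table from two last digits to the last digit of their sum.
LAST_DIGIT = {(i, j): (i + j) % 10 for i in range(10) for j in range(10)}

def toy_add(a: int, b: int) -> int:
    estimate = rough_sum(a, b)
    ones = LAST_DIGIT[(a % 10, b % 10)]
    # Nine consecutive integers contain each last digit at most once,
    # so exactly one candidate near the estimate ends in the right digit.
    for candidate in range(estimate - 4, estimate + 5):
        if candidate % 10 == ones:
            return candidate
    raise AssertionError("unreachable for non-negative integers")

print(toy_add(36, 59))  # 95
```

Neither pathway knows the answer on its own: the rough one says "somewhere around 95," the lookup says "ends in 5." A system built this way that then describes itself as carrying the one would be reporting a procedure it never ran.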

Two honest caveats keep this from becoming a tidy "AI is lying" story. First, interpretability is young: most fully-understood circuits come from small or toy models, attribution graphs are explicitly approximations whose reliability is itself an open question, and nobody has a complete, validated end-to-end account of a frontier model (Bereska & Gavves, 2024). The evidence that explanation can diverge from mechanism is solid; the dream of fully reading off the mechanism is not yet real. Second — and I'll repeat this in the human section because it's the load-bearing caution of the whole essay — calling the model's output "confabulation" borrows a word from human psychology by analogy. A model has no demonstrated introspective channel it is failing to consult. The resemblance is striking and, I'll argue, instructive. It is not evidence that the machinery is the same.

What survives both caveats is the older, broader point Rudin (2019) made about explainable AI generally: an explanation generated separately from a decision is, by construction, not the decision process. It's a story about the decision. Sometimes a good one. Never guaranteed to be the real one.

3. The Human Mind

Now the uncomfortable turn. Why did you choose coffee this morning?

Notice how fast the answer came. "I needed the caffeine." "I always have coffee." "I just felt like it." Immediate, confident, effortless. Now the harder question: how do you actually know that's why?

This is the exact spot where psychology has spent fifty years, and the founding text is Nisbett & Wilson (1977), bluntly titled Telling More Than We Can Know. In one demonstration, shoppers (52 of them) evaluated four pairs of nylon stockings laid out left to right. They preferred the rightmost pair over the leftmost by about four to one, a pure position effect, since the stockings were identical. Asked why, they gave roughly 80 reasons: superior knit, texture, sheerness. Essentially none mentioned position. When the experimenter asked directly whether the arrangement could have mattered, all but one denied it, often looking at the questioner as if he were slightly dim.

Read that carefully, because it's easy to overstate. It doesn't show people have no access to their own minds. It shows something more specific and more interesting: people misreport the causes of their judgments, smoothly substituting a plausible reason for the real one. The reasons weren't retrieved. They were generated.

The most vivid demonstration is choice blindness. Johansson, Hall and colleagues (2005) showed people pairs of faces, asked which was more attractive, then — by sleight of hand — handed back the rejected face and asked them to explain "their" choice. Most swaps went unnoticed (concurrent detection around 13%), and people fluently justified a choice they had never made: "I picked her because she's got nice earrings," about the face they'd just turned down. The effect has been extended to taste, smell, consumer choices — and, crucially for anyone who thinks "well, my convictions are solid," to political attitudes. During a Swedish election, Hall et al. (2013) covertly reversed people's own survey answers; only ~22% caught the manipulation, and 92% then defended the flipped position as their own.

The scope limits matter, and the original authors are careful about them: these are often near-threshold choices, and detection rises with stronger preferences. Choice blindness does not prove every conviction is hollow. What it proves is sharper: the machinery that justifies a choice will happily justify a choice you didn't make. The narrator doesn't check with the decider.

That "narrator" has a famous neurological portrait. In split-brain patients — whose hemispheres have been surgically disconnected — Michael Gazzaniga documented what he called the left-brain interpreter. Flash an instruction to the mute right hemisphere ("walk"), and the patient stands and walks; ask the talking left hemisphere why, and it confabulates instantly and confidently — "I'm going to get a Coke" (Volz & Gazzaniga, 2017). The explaining system, cut off from the real cause, manufactures one without hesitation or any sense that it's guessing. (This rests on a handful of unusual patients, and Pinto et al. (2017) have challenged the stronger "two separate minds" reading — but note what they challenge and what they don't: the confabulation itself isn't in dispute, only how divided the underlying consciousness is.)

You don't need a severed corpus callosum for this. A cluster of findings shows the pattern is ordinary:

Cognitive dissonance: in the classic Festinger & Carlsmith (1959) study, people paid just $1 to call a boring task interesting later genuinely rated it as more enjoyable than people paid $20 — the attitude shifted to fit the behavior, then felt like it had been there all along. (Whether the engine is "felt dissonance" or Daryl Bem's cooler self-perception — we infer our own attitudes by watching our own behavior, the way we'd infer a stranger's — is still debated. Both routes undercut the idea of privileged inner access.)
The illusion of explanatory depth: people rate their understanding of everyday things — zippers, toilets, locks — as high, until you ask them to actually explain the mechanism step by step, at which point their confidence drops sharply (Rozenblit & Keil, 2002). We mistake recognition for understanding — and we don't notice the gap until we're forced to produce the explanation. Tellingly, this illusion is strongest for mechanisms and weak for facts or stories, which is exactly why it bears on the explanation-vs-mechanism theme.
The introspection illusion: we judge our own biases by looking inward (and finding no feeling of bias) while judging others by their behavior — so we conclude we're less biased than everyone else (Pronin, 2009). The absence of a felt bias gets taken as evidence of objectivity. It's nothing of the sort.
Metacognition has limits you can measure: how well your confidence tracks your actual performance is partly separable from the performance itself and depends on specific prefrontal regions — disrupt them with TMS and people get worse at knowing how well they did, while doing the task just as well (Fleming & Dolan, 2012). Knowing-that-you-know is its own fallible faculty.

A necessary detour into honesty, because this literature has been badly oversold. The genre of "your unconscious secretly runs everything" leaned heavily on social priming — most famously Bargh et al. (1996), where exposure to age-related words supposedly made people walk away more slowly. A high-powered replication by Doyen et al. (2012) found no such effect — until experimenters were led to expect it, implicating their own behavior, not unconscious priming. That study became a poster child for psychology's replication crisis: a large coordinated effort reproduced only about 36% of published effects, with social-cognition results faring worst (Open Science Collaboration, 2015). A careful meta-analysis finds priming-by-words is real but small (around d ≈ 0.35) and a far narrower claim than "primes you can't see steer your life" (Weingarten et al., 2016).

So I'm not telling you the unconscious is a puppeteer. The popular dual-process picture — fast intuitive "System 1," slow deliberate "System 2" (Evans & Stanovich, 2013) — is a useful vocabulary, not a proven architecture, and serious researchers argue it doesn't carve anything real (Kruglanski & Gigerenzer, 2011). The defensible claim is the modest one, and it's enough: the system that makes a choice and the system that explains it are dissociable, and the explanation can be confident, fluent, and wrong about the cause.

4. Neuroscience

If the mind narrates after the fact, can we catch the brain in the act? This is where things get sensational — and where I want to slow down, because the popular version ("scientists proved free will is an illusion") is exactly the kind of overstatement this essay is built to resist.

The story starts with the readiness potential — a slow build-up of electrical activity over motor areas that precedes voluntary movement, discovered by Kornhuber & Deecke (1965). Then Benjamin Libet (1983) added a twist that became famous. He had people flex a wrist whenever they felt like it, and report — using a fast-moving clock — the moment they first felt the urge to move. The readiness potential began around 550 ms before the movement. The conscious urge ("W") showed up only around 200 ms before. The brain's preparation appeared to precede the conscious decision by about a third of a second.

A timeline showing the readiness potential begins around minus 550 milliseconds, the reported urge W around minus 200 milliseconds, and movement at zero, with a caveat that W is reconstructed after the act and the readiness potential may have no well-defined onset. — What Libet's timing actually shows — and the caveats the popular reading leaves out.

Cue four decades of headlines. But look at what the experiment actually licenses, and what it doesn't. Three problems, none mystical:

First, the clock you're reading is unreliable. "W" is not a direct readout of an inner event; it's reconstructed partly after the action from sensory cues. Banks & Isham (2009) showed this neatly: play people a deceptive beep a few dozen milliseconds after their keypress, and they report their intention as having occurred later, in lockstep with the fake feedback. The reported moment of "deciding" was being inferred from when they seemed to act, which means it can't be trusted as a timestamp of a prior brain event.

Second, the readiness potential may not be a "decision" at all. Schurger, Sitt & Dehaene (2012) modeled it as a stochastic accumulator: spontaneous neural noise drifting until it crosses a threshold and triggers movement. On this account, the smooth ramp you see is largely an artifact of averaging backward from the movement — line up random fluctuations by their endpoint and they'll look like a rising slope. A follow-up review put the point even more carefully: if the RP is the byproduct of time-locking a noisy process to its threshold crossing, then it has no well-defined "onset" at all, and comparing "RP onset" to "W" is ill-posed (Schurger, Hu, Pak & Roskies, 2021). That's a conditional, not a verdict — defenders of the classical reading dispute it — but it dissolves the clean "the brain decided before you" inference.

Third, the whole edifice is statistically thinner than its fame suggests. The first meta-analysis of Libet-style studies confirmed the basic temporal ordering on average — but found the crucial "RP-precedes-intention" comparison rests on just six studies with high heterogeneity, and that reported timing shifts depending on whether you ask about an "urge" or an "intention" (Braun, Wessler & Friese, 2021). The authors' own word for Libet's foundation: "more fragile than anticipated."
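The second problem, the averaging artifact, is easy to see in simulation. What follows is a minimal sketch in the spirit of the accumulator account, not a reimplementation of the published model: the leak, drift, noise and threshold values are arbitrary numbers I picked so the example runs quickly, and nothing here is fitted to data. Each simulated trial is a noisy signal that hovers around a baseline until a random fluctuation happens to push it over a threshold. Trials are then lined up on the crossing and averaged.

```python
import random
from statistics import mean

WINDOW = 100  # number of time steps kept before each threshold crossing

def run_trial(rng, threshold=1.0, leak=0.05, drift=0.02, noise=0.1, max_steps=5_000):
    """Leaky noisy accumulator; returns the trace up to its first threshold crossing."""
    x, trace = 0.0, []
    for _ in range(max_steps):
        x += drift - leak * x + rng.gauss(0.0, noise)
        trace.append(x)
        if x >= threshold:
            return trace
    return None  # never crossed within max_steps

def back_averaged(n_trials=1000, seed=0):
    """Align every trial on its crossing and average across trials, step by step."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_trials):
        trace = run_trial(rng)
        if trace is not None and len(trace) >= WINDOW:
            kept.append(trace[-WINDOW:])
    return [mean(column) for column in zip(*kept)]

avg = back_averaged()
early, late = mean(avg[:20]), mean(avg[-20:])
print(f"average level early in the window: {early:.2f}")
print(f"average level in the last 20 steps: {late:.2f}")
```

No trial contains a "decision" that starts at some moment and builds. The rise in the averaged trace comes from selecting, after the fact, the stretch of noise that ended in a crossing. That is the sense in which asking when the ramp "begins" may have no well-defined answer.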

What about the famous fMRI study that decoded choices seconds before awareness? Soon et al. (2008) really did predict which of two buttons people would press up to ~7 seconds ahead — at about 60% accuracy, against a 50% coin flip. Read honestly: predicting a faint bias 10 points above chance is not the decision being made. Most of the choice was unexplained, the choice was arbitrary and stakes-free, and the long lead time partly reflects sluggish blood-flow signals.
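One way to put a number on "most of the choice was unexplained" is to ask how much information a 60%-accurate decoder carries about a two-way choice. The calculation below is a back-of-envelope sketch under assumptions that are mine, not the study's: the two buttons are chosen equally often, and the decoder's errors are equally likely in both directions. Under those assumptions the choice is worth 1 bit, and the decoder recovers 1 minus the binary entropy of its accuracy.

```python
from math import log2

def binary_entropy(p: float) -> float:
    """Entropy in bits of a yes/no event that occurs with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bits_recovered(accuracy: float) -> float:
    """Information about a 50/50 binary choice carried by a decoder with this accuracy.
    Assumes balanced choices and symmetric errors (assumptions of this sketch)."""
    return 1.0 - binary_entropy(accuracy)

print(round(bits_recovered(0.60), 3))  # 0.029
```

So "10 points above chance" corresponds to roughly 3% of the information in the choice. That is a real signal and worth explaining, but it is a long way from reading the decision off the scanner.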

Step back. Even taking the data at face value, here's the conflation to avoid: these experiments concern the timing of a reported urge in arbitrary, meaningless movements. They are not about the causal role of consciousness, and they say little about deliberate, reasons-driven choices like whom to hire or whether to take the job. Patrick Haggard, who has spent a career on the neuroscience of volition, stresses exactly this scope limit (Haggard, 2019). Philosophers of action like Alfred Mele (2009) and Adina Roskies argue the results simply don't bear the anti-free-will weight piled on them: an urge is not a decision, and "free will" in the headlines is left undefined.

There's even a framing on which the puzzle dissolves rather than resolves. Under predictive processing / active inference, the brain acts to fulfill its own predictions about what it's about to do, so the neat sequence of intend → then act → then consciousness watches isn't the right picture to begin with (Friston et al., 2013). I flag this as a live theoretical frame, not established fact.

What does survive, and what connects back to our theme, is humbler and sturdier: the brain assembles its sense of "I intended that" and "I did that" partly after the action, from available cues. The feeling of authorship is itself partly a construction — see the robust intentional binding effect, where voluntary actions and their outcomes get pulled together in subjective time (Haggard, Clark & Kalogeras, 2002). Neuroscience hasn't refuted free will. It has shown, again, that the report of a decision and the decision can come apart in time.

5. Philosophy

Only now — after the evidence, not before it — is it fair to introduce the position people reach for first. Epiphenomenalism is the claim that conscious mental states are causally inert by-products of physical processes: T.H. Huxley's image was a steam whistle on a locomotive, riding along, making noise, driving nothing (Huxley, 1874; SEP). On this view your reasons don't merely sometimes misdescribe the cause — they are never the cause; the felt deliberation is the whistle, the neurons are the engine.

It's a tempting place to land after sections 3 and 4. I want to argue you should not land there — not because it's been disproven, but because it is one contested option among several, and the evidence we've seen doesn't single it out.

Here's the standard objection that keeps epiphenomenalism marginal, the paradox of phenomenal judgment (or self-stultification): if your experiences cause nothing physical, then your talking about them — including an epiphenomenalist saying "I have inner experiences" — isn't caused by those experiences either. The view seems to saw off the branch it sits on (SEP). It's telling that the philosopher who gave the position its most famous modern argument, Frank Jackson, later recanted it — reasoning that if qualia were causally idle, we couldn't even know about them (Jackson, 1982; SEP on the Knowledge Argument). (In fairness, it has able defenders still — Yetter-Chappell, 2022 argues the paradox only bites if you're inconsistent about your dualism. The point is that it's live, not that it's true.)

And it's far from the only option. The honest picture is a spread of mutually incompatible, unsettled positions:

Physicalism says the mental is physical — but faces Jaegwon Kim's causal exclusion worry: if every physical effect already has a sufficient physical cause, the mental property looks redundant, threatening to make your reasons idle by a different route. Exclusion is itself heavily contested (SEP on mental causation) — there are several respectable escape routes — but it's the live problem.
Functionalism defines mental states by what they do — their causal role — which makes reasons genuine causes almost by definition (SEP). It's arguably the most "your-reasons-are-real" position on the board. Its unfinished business is felt experience: even if the role is causal, the feel riding on it might not be.
Emergentism asks whether the mental level has genuinely new causal powers (strong emergence) or is just a higher-level description (weak emergence) (SEP). Only the strong version rescues top-down mental causation, and it inherits the same exclusion problem.
Compatibilism — the dominant view among philosophers — holds free will is compatible with determinism, locating freedom in whether your own reasons, working through normal psychological machinery, drive the act (Frankfurt, 1969; SEP). This is the most direct "yes, reasons cause actions" answer — and it, too, has serious critics.
Eliminative materialism goes the other way entirely: maybe "beliefs" and "desires" are like "phlogiston," a folk theory destined for replacement, in which case your reasons aren't wrong causes — there are no such things to begin with (Churchland, 1981; SEP). A minority view, with its own self-refutation worry, but unrefuted.

A spectrum from 'reasons are real causes' to 'reasons make no difference', placing functionalism, compatibilism, non-reductive physicalism, strong emergence and epiphenomenalism, with a note that eliminative materialism answers differently because there are no beliefs or desires to begin with. — Six contested positions on whether your reasons cause your actions — none established as fact.

I'm not going to adjudicate this. Nobody has, and that's the point. The empirical sections show, repeatedly, that explanation and mechanism can dissociate. They do not show that consciousness is causally idle, that reasons never cause actions, or which of these six metaphysical pictures is right. Epiphenomenalism is the most dramatic reading available, and drama is precisely why we should be suspicious of how easily we reach for it.

6. What This Means

Strip away the metaphysics and a practical, testable rule remains:

A convincing explanation is not evidence of the mechanism that produced the decision.

That's not nihilism. Explanations are useful — they coordinate teams, transfer knowledge, build trust, and let us argue and improve. The error is confusing the explanation with the mechanism: treating a fluent account of why as a reliable readout of how. Once you hold those apart, a lot of professional life looks different.

Leadership and strategy. Treat the post-hoc rationale as a communication artifact, not a process log. The highest-leverage habit is the cheapest: write the decision down before the outcome is known, including the thesis, the alternatives, and the kill-criteria. A dated decision journal is one of the few practical defenses against hindsight bias, which silently rewrites your memory of what you expected (Fischhoff, 1975; Roese & Vohs, 2012). Without the written record, every postmortem becomes a story about how you knew all along.
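If it helps to see the shape of such a record, here is one possible sketch. The class name, field names, and the sample entry are all hypothetical, and the confidence field is an addition of mine beyond the three items named above; a notebook or a shared document does the same job. The only design point that matters is that the entry is dated and not meant to be edited once the outcome is in.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class DecisionEntry:
    decided_on: date
    thesis: str                     # why we expect this to work
    alternatives: tuple[str, ...]   # options considered and rejected
    kill_criteria: tuple[str, ...]  # observable signs that we should stop
    confidence: float               # stated probability the thesis holds (added for this sketch)

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        if not self.kill_criteria:
            raise ValueError("record at least one kill criterion")

# A made-up entry, for illustration only.
entry = DecisionEntry(
    decided_on=date(2024, 3, 1),
    thesis="Acquiring the vendor shortens our roadmap by a year.",
    alternatives=("build in-house", "license the technology"),
    kill_criteria=("key engineers leave within six months",),
    confidence=0.6,
)
print(entry.decided_on.isoformat(), entry.confidence)
```

Requiring at least one kill criterion is deliberate: it forces the writer to say, in advance, what evidence would count against the thesis, which is exactly what the later story tends to forget.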

Recruiting. This is where the research bites hardest, because the standard interview is a confabulation-generating machine. "Why did you leave your last role?" "Why do you want this job?" invite exactly the fluent, after-the-fact reasons that Nisbett & Wilson showed are least reliable. The candidate isn't lying; they're narrating. Weight work-sample tests and structured, behavior-anchored questions over introspective self-report — measure the mechanism (can they do the task), not the story about it.

AI safety and product. The faithfulness research is a direct warning: do not treat a model's stated reasoning as a reliable audit of its computation. A model can exploit a reward hack on ~100% of attempts and mention it under 2% of the time (Chen et al., 2025), and its longer, more convincing explanations can be the less faithful ones. Chain-of-thought monitoring can catch frequent, clumsy misbehavior; it cannot certify the absence of the rare or hidden kind. For product, the same caution flips into a UX warning: a confident AI explanation will be persuasive in proportion to its fluency, and fluency is no guarantee of truth. Design for verification, not for vibes.

Behavioral economics and design. Choice blindness and the stockings effect mean stated preferences are softer than they feel: defaults and framing move choices people then defend with invented reasons. That's both a design tool and a lever for manipulation. Either way: trust revealed behavior over reported preference, and hold your A/B test above your focus group.

Self-understanding. The personal application is the gentlest and maybe the most useful. The feeling of knowing why you did something is produced by a different faculty than the doing — and that faculty, metacognition, is itself fallible and separable from the underlying competence. The move isn't to distrust every reason you have. It's to hold your self-explanations a little more loosely — to notice that "obviously I did it because X" arrives with a confidence that the evidence doesn't earn.

Across all of these, the antidote is the same one this blog keeps circling back to: calibration — holding a belief exactly as strongly as the evidence allows, and not one notch more. (If that idea grabs you, it's the spine of Explaining Is Easy, Predicting Is Hard and Fear Is Not a Probability.)
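Calibration is checkable, which is what makes it more than a slogan. Here is a minimal sketch of two standard checks, using a made-up forecast record: a calibration table (of the things you called 90% likely, how many happened?) and the Brier score (the mean squared gap between stated probability and outcome, where 0 is perfect). The function names and the data are illustrative assumptions, not a prescribed method.

```python
from collections import defaultdict

def brier_score(forecasts):
    """Mean squared gap between stated probability and outcome (0 is perfect)."""
    return sum((p - float(hit)) ** 2 for p, hit in forecasts) / len(forecasts)

def calibration_table(forecasts):
    """Group forecasts by stated probability and report the observed hit rate for each."""
    groups = defaultdict(list)
    for p, hit in forecasts:
        groups[p].append(hit)
    return {p: sum(hits) / len(hits) for p, hits in sorted(groups.items())}

# Made-up record of (stated probability, did it happen?), for illustration only.
forecasts = [
    (0.9, True), (0.9, True), (0.9, False), (0.9, False),
    (0.6, True), (0.6, True), (0.6, True), (0.6, False), (0.6, False),
]

for stated, observed in calibration_table(forecasts).items():
    print(f"said {stated:.0%}, happened {observed:.0%}")
print(f"Brier score: {brier_score(forecasts):.3f}")
```

In this invented record the 60% claims came true 60% of the time, while the 90% claims came true only half the time. The forecaster's story about each individual call might be perfectly fluent; the table is what shows where the confidence outran the evidence.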

7. Conclusion

So return to the three figures we started with. The CEO, the language model, and you-with-your-coffee are doing remarkably different things mechanically (a person on a stage, a matrix multiplication, a brain), and yet they share one move: each generates a clean, confident, useful story about a process that was messier, more distributed, and less accessible than the story admits.

Is that resemblance superficial or deep? I've tried to be honest that we don't know. The AI–human comparison is an analogy, and a careful one; the neuroscience is more fragile than its headlines; the philosophy is genuinely unsettled. What recurs across all of them is not a proof about the nature of mind. It's a pattern about the nature of explanation: a sufficiently complex system, asked to account for itself, will produce a narrative optimized for coherence and plausibility — and coherence is not the same as accuracy.

Which leaves the open question this whole essay has been walking toward. Not the loud one — does free will exist? — but the quieter, more answerable one:

Perhaps intelligence is not only the ability to make decisions.

Perhaps it is also the ability to construct explanations that feel convincing — even when they capture only part of what really happened.

If that's right, then the most valuable form of intelligence isn't the fluent explanation at all. It's the humility to ask, of your own most confident account: is this really why? — and to keep the question open a little longer than feels comfortable.

Bibliography

Psychology

Bargh, J. A., Chen, M., & Burrows, L. (1996). Automaticity of Social Behavior: Direct Effects of Trait Construct and Stereotype Activation on Action. Journal of Personality and Social Psychology. https://www.semanticscholar.org/paper/Automaticity-of-social-behavior:-direct-effects-of-Bargh-Chen/7244328deba0cd3a4e0096d8fa2dcb5a9285594b
Bem, D. J. (1972). Self-Perception Theory. Advances in Experimental Social Psychology. https://www.semanticscholar.org/paper/Self-Perception-Theory-Bem/c5f44aa1353a41f7993e4eb383ae45d0b946c17f
Doyen, S., Klein, O., Pichon, C.-L., & Cleeremans, A. (2012). Behavioral Priming: It's All in the Mind, but Whose Mind? PLOS ONE. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0029081
Evans, J. St. B. T., & Stanovich, K. E. (2013). Dual-Process Theories of Higher Cognition: Advancing the Debate. Perspectives on Psychological Science. https://journals.sagepub.com/doi/10.1177/1745691612460685
Festinger, L., & Carlsmith, J. M. (1959). Cognitive Consequences of Forced Compliance. Journal of Abnormal and Social Psychology. https://pubmed.ncbi.nlm.nih.gov/13640824/
Fleming, S. M., & Dolan, R. J. (2012). The Neural Basis of Metacognitive Ability. Philosophical Transactions of the Royal Society B. https://pmc.ncbi.nlm.nih.gov/articles/PMC3318765/
Hall, L., Johansson, P., Tärning, B., Sikström, S., & Strandberg, T. (2013). How the Polls Can Be Both Spot On and Dead Wrong: Using Choice Blindness to Shift Political Attitudes and Voter Intentions. PLOS ONE. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0060554
Johansson, P., Hall, L., Sikström, S., & Olsson, A. (2005). Failure to Detect Mismatches Between Intention and Outcome in a Simple Decision Task. Science. https://pubmed.ncbi.nlm.nih.gov/16210542/
Kruglanski, A. W., & Gigerenzer, G. (2011). Intuitive and Deliberate Judgments Are Based on Common Principles. Psychological Review. https://pubmed.ncbi.nlm.nih.gov/21244188/
Nisbett, R. E., & Wilson, T. D. (1977). Telling More Than We Can Know: Verbal Reports on Mental Processes. Psychological Review. https://home.csulb.edu/~cwallis/382/readings/482/nisbett%20saying%20more.pdf
Open Science Collaboration (Nosek et al.). (2015). Estimating the Reproducibility of Psychological Science. Science. https://www.science.org/doi/10.1126/science.aac4716
Pinto, Y., et al. (2017). Split Brain: Divided Perception but Undivided Consciousness. Brain. https://academic.oup.com/brain/article/140/5/1231/2951052
Pronin, E. (2009). The Introspection Illusion. Advances in Experimental Social Psychology. https://www.sciencedirect.com/science/article/abs/pii/S0022103106000916
Rozenblit, L., & Keil, F. (2002). The Misunderstood Limits of Folk Science: An Illusion of Explanatory Depth. Cognitive Science. https://pmc.ncbi.nlm.nih.gov/articles/PMC3062901/
Volz, L. J., & Gazzaniga, M. S. (2017). Interaction in Isolation: 50 Years of Insights from Split-Brain Research. Brain. https://academic.oup.com/brain/article/140/7/2051/3892700
Weingarten, E., et al. (2016). From Primed Concepts to Action: A Meta-Analysis of the Behavioral Effects of Incidentally-Presented Words. Psychological Bulletin. https://pmc.ncbi.nlm.nih.gov/articles/PMC5783538/

Neuroscience

Banks, W. P., & Isham, E. A. (2009). We Infer Rather Than Perceive the Moment We Decided to Act. Psychological Science. https://www.antoniocasella.eu/dnlaw/Banks-Isham2009.pdf
Braun, M. N., Wessler, J., & Friese, M. (2021). A Meta-Analysis of Libet-Style Experiments. Neuroscience & Biobehavioral Reviews. https://pubmed.ncbi.nlm.nih.gov/34119525/
Friston, K., et al. (2013). The Anatomy of Choice: Active Inference and Agency. Frontiers in Human Neuroscience. https://www.frontiersin.org/journals/human-neuroscience/articles/10.3389/fnhum.2013.00598/full
Haggard, P., Clark, S., & Kalogeras, J. (2002). Voluntary Action and Conscious Awareness. Nature Neuroscience. https://www.nature.com/articles/nn827
Haggard, P. (2019). The Neurocognitive Bases of Human Volition. Annual Review of Psychology. https://www.annualreviews.org/content/journals/10.1146/annurev-psych-010418-103348
Kornhuber, H. H., & Deecke, L. (1965). Hirnpotentialänderungen bei Willkürbewegungen … Bereitschaftspotential. Pflügers Archiv. https://link.springer.com/article/10.1007/BF00412364
Libet, B., Gleason, C. A., Wright, E. W., & Pearl, D. K. (1983). Time of Conscious Intention to Act in Relation to Onset of Cerebral Activity (Readiness-Potential). Brain. https://academic.oup.com/brain/article-abstract/106/3/623/271932
Mele, A. R. (2009). Effective Intentions: The Power of Conscious Will. Oxford University Press. https://philpapers.org/rec/MELEIT
Roskies, A. L. (2011). Why Libet's Studies Don't Pose a Threat to Free Will. In Conscious Will and Responsibility. https://academic.oup.com/book/2344/chapter/142504879
Schurger, A., Sitt, J. D., & Dehaene, S. (2012). An Accumulator Model for Spontaneous Neural Activity Prior to Self-Initiated Movement. PNAS. https://www.pnas.org/doi/10.1073/pnas.1210467109
Schurger, A., Hu, P., Pak, J., & Roskies, A. L. (2021). What Is the Readiness Potential? Trends in Cognitive Sciences. https://www.cell.com/trends/cognitive-sciences/fulltext/S1364-6613(21)00093-0
Soon, C. S., Brass, M., Heinze, H.-J., & Haynes, J.-D. (2008). Unconscious Determinants of Free Decisions in the Human Brain. Nature Neuroscience. https://www.nature.com/articles/nn.2112

Philosophy of mind

Churchland, P. M. (1981). Eliminative Materialism and the Propositional Attitudes. Journal of Philosophy. https://ruccs.rutgers.edu/images/personal-zenon-pylyshyn/class-info/FP2012_readings/Churchland_EliminativeMaterialsm.pdf
Frankfurt, H. G. (1969). Alternate Possibilities and Moral Responsibility. Journal of Philosophy. https://personal.lse.ac.uk/ROBERT49/teaching/ph103/pdf/Frankfurt1969.pdf
Huxley, T. H. (1874). On the Hypothesis that Animals are Automata, and its History. https://philpapers.org/rec/HUXOTH
Jackson, F. (1982). Epiphenomenal Qualia. The Philosophical Quarterly. https://academic.oup.com/pq/article-abstract/32/127/127/1612468
Stanford Encyclopedia of Philosophy: Epiphenomenalism (Robinson, 2023). https://plato.stanford.edu/entries/epiphenomenalism/ · Mental Causation (Robb & Heil, 2023). https://plato.stanford.edu/entries/mental-causation/ · Functionalism (Levin, 2023). https://plato.stanford.edu/entries/functionalism/ · Emergent Properties (O'Connor & Wong, 2024). https://plato.stanford.edu/entries/properties-emergent/ · Compatibilism (McKenna & Coates, 2024). https://plato.stanford.edu/entries/compatibilism/ · Eliminative Materialism (Ramsey, 2024). https://plato.stanford.edu/entries/materialism-eliminative/ · Qualia: The Knowledge Argument (Nida-Rümelin & O'Conaill, 2024). https://plato.stanford.edu/entries/qualia-knowledge/
Moore, D. (2022). Mind and the Causal Exclusion Problem. Internet Encyclopedia of Philosophy. https://iep.utm.edu/mind-and-the-causal-exclusion-problem/
Yetter-Chappell, H. (2022). Dualism All the Way Down: Why There Is No Paradox of Phenomenal Judgment. Synthese. https://philarchive.org/rec/YETDAT

Artificial intelligence

Bereska, L., & Gavves, E. (2024). Mechanistic Interpretability for AI Safety — A Review. https://arxiv.org/abs/2404.14082
Chen, Y., Benton, J., et al. (Anthropic). (2025). Reasoning Models Don't Always Say What They Think. https://assets.anthropic.com/m/71876fabef0f0ed4/original/reasoning_models_paper.pdf
Elhage, N., et al. (Anthropic). (2022). Toy Models of Superposition. https://arxiv.org/abs/2209.10652
Jacovi, A., & Goldberg, Y. (2020). Towards Faithfully Interpretable NLP Systems. ACL. https://arxiv.org/abs/2004.03685
Lanham, T., et al. (Anthropic). (2023). Measuring Faithfulness in Chain-of-Thought Reasoning. https://arxiv.org/abs/2307.13702
Lindsey, J., Gurnee, W., Ameisen, E., et al. (Anthropic). (2025). On the Biology of a Large Language Model. https://transformer-circuits.pub/2025/attribution-graphs/biology.html
Olsson, C., et al. (Anthropic). (2022). In-Context Learning and Induction Heads. https://transformer-circuits.pub/2022/in-context-learning-and-induction-heads/index.html
Rudin, C. (2019). Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead. Nature Machine Intelligence. https://arxiv.org/abs/1811.10154
Templeton, A., et al. (Anthropic). (2024). Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet. https://transformer-circuits.pub/2024/scaling-monosemanticity/
Turpin, M., Michael, J., Perez, E., & Bowman, S. R. (2023). Language Models Don't Always Say What They Think. https://arxiv.org/abs/2305.04388

Management & organizational psychology

Clatworthy, M., & Jones, M. J. (2003). Financial Reporting of Good News and Bad News: Evidence from Accounting Narratives. Accounting and Business Research. https://www.researchgate.net/publication/230844935_Financial_Reporting_of_Good_News_and_Bad_News_Evidence_from_Accounting_Narratives
Cohen, M. D., March, J. G., & Olsen, J. P. (1972). A Garbage Can Model of Organizational Choice. Administrative Science Quarterly. https://www.semanticscholar.org/paper/A-Garbage-Can-Model-of-Organizational-Choice.-Cohen-March/0b9695c173c289d03bf6e78572b00e0d31022756
Fischhoff, B. (1975). Hindsight ≠ Foresight: The Effect of Outcome Knowledge on Judgment Under Uncertainty. Journal of Experimental Psychology: Human Perception and Performance. https://www.researchgate.net/publication/10631443_Hindsight_is_not_equal_to_foresight_The_effect_of_outcome_knowledge_on_judgment_under_uncertainty
Gioia, D. A., & Chittipeddi, K. (1991). Sensemaking and Sensegiving in Strategic Change Initiation. Strategic Management Journal. https://sms.onlinelibrary.wiley.com/doi/10.1002/smj.4250120604
Libby, R., & Rennekamp, K. M. (2012). Self-Serving Attribution Bias, Overconfidence, and the Issuance of Management Forecasts. Journal of Accounting Research. https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1475-679X.2011.00430.x
Malmendier, U., & Tate, G. (2008). Who Makes Acquisitions? CEO Overconfidence and the Market's Reaction. Journal of Financial Economics. https://ideas.repec.org/a/eee/jfinec/v89y2008i1p20-43.html
Mintzberg, H., & Waters, J. A. (1985). Of Strategies, Deliberate and Emergent. Strategic Management Journal. https://ideas.repec.org/a/bla/stratm/v6y1985i3p257-272.html
Roese, N. J., & Vohs, K. D. (2012). Hindsight Bias. Perspectives on Psychological Science. https://journals.sagepub.com/doi/abs/10.1177/1745691612454303
Simon, H. A. (1955). A Behavioral Model of Rational Choice. Quarterly Journal of Economics. https://academic.oup.com/qje/article-abstract/69/1/99/1919737
Weick, K. E. (1995). Sensemaking in Organizations. Sage. https://us.sagepub.com/en-us/nam/sensemaking-in-organizations/book4988
Company cases: AOL–Time Warner (Fortune, 2015); HP–Autonomy (Courthouse News, 2022); Quaker–Snapple (Seattle Times, 1997); Microsoft–Nokia (Bloomberg, 2015).

7. Conclusion

Which leaves the open question this whole essay has been walking toward. Not the loud one — does free will exist? — but the quieter, more answerable one:

Perhaps intelligence is not only the ability to make decisions.

Perhaps it is also the ability to construct explanations that feel convincing — even when they capture only part of what really happened.

Bibliography

Psychology

Bargh, J. A., Chen, M., & Burrows, L. (1996). Automaticity of Social Behavior: Direct Effects of Trait Construct and Stereotype Activation on Action. Journal of Personality and Social Psychology. https://www.semanticscholar.org/paper/Automaticity-of-social-behavior:-direct-effects-of-Bargh-Chen/7244328deba0cd3a4e0096d8fa2dcb5a9285594b
Bem, D. J. (1972). Self-Perception Theory. Advances in Experimental Social Psychology. https://www.semanticscholar.org/paper/Self-Perception-Theory-Bem/c5f44aa1353a41f7993e4eb383ae45d0b946c17f
Doyen, S., Klein, O., Pichon, C.-L., & Cleeremans, A. (2012). Behavioral Priming: It's All in the Mind, but Whose Mind? PLOS ONE. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0029081
Evans, J. St. B. T., & Stanovich, K. E. (2013). Dual-Process Theories of Higher Cognition: Advancing the Debate. Perspectives on Psychological Science. https://journals.sagepub.com/doi/10.1177/1745691612460685
Festinger, L., & Carlsmith, J. M. (1959). Cognitive Consequences of Forced Compliance. Journal of Abnormal and Social Psychology. https://pubmed.ncbi.nlm.nih.gov/13640824/
Fleming, S. M., & Dolan, R. J. (2012). The Neural Basis of Metacognitive Ability. Phil. Trans. R. Soc. B. https://pmc.ncbi.nlm.nih.gov/articles/PMC3318765/
Hall, L., Johansson, P., Tärning, B., Sikström, S., & Strandberg, T. (2013). How the Polls Can Be Both Spot On and Dead Wrong: Using Choice Blindness to Shift Political Attitudes and Voter Intentions. PLOS ONE. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0060554
Johansson, P., Hall, L., Sikström, S., & Olsson, A. (2005). Failure to Detect Mismatches Between Intention and Outcome in a Simple Decision Task. Science. https://pubmed.ncbi.nlm.nih.gov/16210542/
Kruglanski, A. W., & Gigerenzer, G. (2011). Intuitive and Deliberate Judgments Are Based on Common Principles. Psychological Review. https://pubmed.ncbi.nlm.nih.gov/21244188/
Nisbett, R. E., & Wilson, T. D. (1977). Telling More Than We Can Know: Verbal Reports on Mental Processes. Psychological Review. https://home.csulb.edu/~cwallis/382/readings/482/nisbett%20saying%20more.pdf
Open Science Collaboration (Nosek et al.). (2015). Estimating the Reproducibility of Psychological Science. Science. https://www.science.org/doi/10.1126/science.aac4716
Pronin, E. (2009). The Introspection Illusion. Advances in Experimental Social Psychology. https://www.sciencedirect.com/science/article/abs/pii/S0022103106000916
Rozenblit, L., & Keil, F. (2002). The Misunderstood Limits of Folk Science: An Illusion of Explanatory Depth. Cognitive Science. https://pmc.ncbi.nlm.nih.gov/articles/PMC3062901/
Volz, L. J., & Gazzaniga, M. S. (2017). Interaction in Isolation: 50 Years of Insights from Split-Brain Research. Brain. https://academic.oup.com/brain/article/140/7/2051/3892700
Pinto, Y., et al. (2017). Split Brain: Divided Perception but Undivided Consciousness. Brain. https://academic.oup.com/brain/article/140/5/1231/2951052
Weingarten, E., et al. (2016). From Primed Concepts to Action: A Meta-Analysis of the Behavioral Effects of Incidentally-Presented Words. Psychological Bulletin. https://pmc.ncbi.nlm.nih.gov/articles/PMC5783538/

Neuroscience

Banks, W. P., & Isham, E. A. (2009). We Infer Rather Than Perceive the Moment We Decided to Act. Psychological Science. https://www.antoniocasella.eu/dnlaw/Banks-Isham2009.pdf
Braun, M. N., Wessler, J., & Friese, M. (2021). A Meta-Analysis of Libet-Style Experiments. Neuroscience & Biobehavioral Reviews. https://pubmed.ncbi.nlm.nih.gov/34119525/
Friston, K., et al. (2013). The Anatomy of Choice: Active Inference and Agency. Frontiers in Human Neuroscience. https://www.frontiersin.org/journals/human-neuroscience/articles/10.3389/fnhum.2013.00598/full
Haggard, P., Clark, S., & Kalogeras, J. (2002). Voluntary Action and Conscious Awareness. Nature Neuroscience. https://www.nature.com/articles/nn827
Haggard, P. (2019). The Neurocognitive Bases of Human Volition. Annual Review of Psychology. https://www.annualreviews.org/content/journals/10.1146/annurev-psych-010418-103348
Kornhuber, H. H., & Deecke, L. (1965). Hirnpotentialänderungen bei Willkürbewegungen … Bereitschaftspotential. Pflügers Archiv. https://link.springer.com/article/10.1007/BF00412364
Libet, B., Gleason, C. A., Wright, E. W., & Pearl, D. K. (1983). Time of Conscious Intention to Act in Relation to Onset of Cerebral Activity (Readiness-Potential). Brain. https://academic.oup.com/brain/article-abstract/106/3/623/271932
Mele, A. R. (2009). Effective Intentions: The Power of Conscious Will. Oxford University Press. https://philpapers.org/rec/MELEIT
Roskies, A. L. (2011). Why Libet's Studies Don't Pose a Threat to Free Will. In Conscious Will and Responsibility. https://academic.oup.com/book/2344/chapter/142504879
Schurger, A., Sitt, J. D., & Dehaene, S. (2012). An Accumulator Model for Spontaneous Neural Activity Prior to Self-Initiated Movement. PNAS. https://www.pnas.org/doi/10.1073/pnas.1210467109
Schurger, A., Hu, P., Pak, J., & Roskies, A. L. (2021). What Is the Readiness Potential? Trends in Cognitive Sciences. https://www.cell.com/trends/cognitive-sciences/fulltext/S1364-6613(21)00093-0
Soon, C. S., Brass, M., Heinze, H.-J., & Haynes, J.-D. (2008). Unconscious Determinants of Free Decisions in the Human Brain. Nature Neuroscience. https://www.nature.com/articles/nn.2112

Philosophy of mind

Churchland, P. M. (1981). Eliminative Materialism and the Propositional Attitudes. Journal of Philosophy. https://ruccs.rutgers.edu/images/personal-zenon-pylyshyn/class-info/FP2012_readings/Churchland_EliminativeMaterialsm.pdf
Frankfurt, H. G. (1969). Alternate Possibilities and Moral Responsibility. Journal of Philosophy. https://personal.lse.ac.uk/ROBERT49/teaching/ph103/pdf/Frankfurt1969.pdf
Huxley, T. H. (1874). On the Hypothesis that Animals are Automata, and its History. https://philpapers.org/rec/HUXOTH
Jackson, F. (1982). Epiphenomenal Qualia. The Philosophical Quarterly. https://academic.oup.com/pq/article-abstract/32/127/127/1612468
Stanford Encyclopedia of Philosophy: Epiphenomenalism (Robinson, 2023). https://plato.stanford.edu/entries/epiphenomenalism/ · Mental Causation (Robb & Heil, 2023). https://plato.stanford.edu/entries/mental-causation/ · Functionalism (Levin, 2023). https://plato.stanford.edu/entries/functionalism/ · Emergent Properties (O'Connor & Wong, 2024). https://plato.stanford.edu/entries/properties-emergent/ · Compatibilism (McKenna & Coates, 2024). https://plato.stanford.edu/entries/compatibilism/ · Eliminative Materialism (Ramsey, 2024). https://plato.stanford.edu/entries/materialism-eliminative/ · Qualia: The Knowledge Argument (Nida-Rümelin & O'Conaill, 2024). https://plato.stanford.edu/entries/qualia-knowledge/
Moore, D. (2022). Mind and the Causal Exclusion Problem. Internet Encyclopedia of Philosophy. https://iep.utm.edu/mind-and-the-causal-exclusion-problem/
Yetter-Chappell, H. (2022). Dualism All the Way Down: Why There Is No Paradox of Phenomenal Judgment. Synthese. https://philarchive.org/rec/YETDAT

Artificial intelligence

Bereska, L., & Gavves, E. (2024). Mechanistic Interpretability for AI Safety — A Review. https://arxiv.org/abs/2404.14082
Chen, Y., Benton, J., et al. (Anthropic). (2025). Reasoning Models Don't Always Say What They Think. https://assets.anthropic.com/m/71876fabef0f0ed4/original/reasoning_models_paper.pdf
Elhage, N., et al. (Anthropic). (2022). Toy Models of Superposition. https://arxiv.org/abs/2209.10652
Jacovi, A., & Goldberg, Y. (2020). Towards Faithfully Interpretable NLP Systems. ACL. https://arxiv.org/abs/2004.03685
Lanham, T., et al. (Anthropic). (2023). Measuring Faithfulness in Chain-of-Thought Reasoning. https://arxiv.org/abs/2307.13702
Lindsey, J., Gurnee, W., Ameisen, E., et al. (Anthropic). (2025). On the Biology of a Large Language Model. https://transformer-circuits.pub/2025/attribution-graphs/biology.html
Olsson, C., et al. (Anthropic). (2022). In-Context Learning and Induction Heads. https://transformer-circuits.pub/2022/in-context-learning-and-induction-heads/index.html
Rudin, C. (2019). Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead. Nature Machine Intelligence. https://arxiv.org/abs/1811.10154
Templeton, A., et al. (Anthropic). (2024). Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet. https://transformer-circuits.pub/2024/scaling-monosemanticity/
Turpin, M., Michael, J., Perez, E., & Bowman, S. R. (2023). Language Models Don't Always Say What They Think. https://arxiv.org/abs/2305.04388

Management & organizational psychology

Clatworthy, M., & Jones, M. J. (2003). Financial Reporting of Good News and Bad News: Evidence from Accounting Narratives. Accounting and Business Research. https://www.researchgate.net/publication/230844935_Financial_Reporting_of_Good_News_and_Bad_News_Evidence_from_Accounting_Narratives
Cohen, M. D., March, J. G., & Olsen, J. P. (1972). A Garbage Can Model of Organizational Choice. Administrative Science Quarterly. https://www.semanticscholar.org/paper/A-Garbage-Can-Model-of-Organizational-Choice.-Cohen-March/0b9695c173c289d03bf6e78572b00e0d31022756
Fischhoff, B. (1975). Hindsight ≠ Foresight: The Effect of Outcome Knowledge on Judgment Under Uncertainty. JEP: HPP. https://www.researchgate.net/publication/10631443_Hindsight_is_not_equal_to_foresight_The_effect_of_outcome_knowledge_on_judgment_under_uncertainty
Gioia, D. A., & Chittipeddi, K. (1991). Sensemaking and Sensegiving in Strategic Change Initiation. Strategic Management Journal. https://sms.onlinelibrary.wiley.com/doi/10.1002/smj.4250120604
Libby, R., & Rennekamp, K. M. (2012). Self-Serving Attribution Bias, Overconfidence, and the Issuance of Management Forecasts. Journal of Accounting Research. https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1475-679X.2011.00430.x
Malmendier, U., & Tate, G. (2008). Who Makes Acquisitions? CEO Overconfidence and the Market's Reaction. Journal of Financial Economics. https://ideas.repec.org/a/eee/jfinec/v89y2008i1p20-43.html
Mintzberg, H., & Waters, J. A. (1985). Of Strategies, Deliberate and Emergent. Strategic Management Journal. https://ideas.repec.org/a/bla/stratm/v6y1985i3p257-272.html
Roese, N. J., & Vohs, K. D. (2012). Hindsight Bias. Perspectives on Psychological Science. https://journals.sagepub.com/doi/abs/10.1177/1745691612454303
Simon, H. A. (1955). A Behavioral Model of Rational Choice. Quarterly Journal of Economics. https://academic.oup.com/qje/article-abstract/69/1/99/1919737
Weick, K. E. (1995). Sensemaking in Organizations. Sage. https://us.sagepub.com/en-us/nam/sensemaking-in-organizations/book4988
Company cases: AOL–Time Warner (Fortune, 2015); HP–Autonomy (Courthouse News, 2022); Quaker–Snapple (Seattle Times, 1997); Microsoft–Nokia (Bloomberg, 2015).
