The Intelligence Spectrum
What makes something smart isn't memory or reasoning alone — it's the balance between them. That's true for people, and it's true for AI.
March 2026
The question
Perfect memory or perfect reasoning?
Imagine two minds. The first remembers everything but cannot reason about any of it. It can tell you what happened, but not what it means. The second can reason flawlessly but retains nothing. Every problem is solved from scratch, with no context from what came before.
Which one is smarter?
Neither, because both are fundamentally incomplete. Intelligence that actually gets things done requires both working together, and the interesting question isn't which one matters more but how they interact.
Raymond Cattell formalized this in 1963. He identified two components of human intelligence: fluid intelligence (reasoning, pattern recognition, novel problem-solving) and crystallized intelligence (accumulated knowledge, skills, learned procedures). They're not a trade-off. You can be strong or weak at either one independently, and fluid intelligence is what allows you to build crystallized intelligence in the first place. Over time, they compound.
A surgeon can draw on twenty years of case knowledge while reasoning through a novel complication in real time. Long-term memory can store far more than any individual will use in a lifetime (one Stanford estimate puts the cerebral cortex alone at roughly 2.5 petabytes), while working memory is limited to a handful of active items. In human cognition, these two systems operate in parallel and neither crowds out the other.
The spectrum
Where expertise lives
Historian
Deep recall of dates, events, sources. Reasoning applied to interpretation, but the foundation is memory.
Lawyer
Case law, statutes, precedent — enormous memory load. Reasoning applied to argumentation, but knowing what exists is the baseline.
Accountant
Tax codes, regulations, standards — heavily memory-based. Reasoning applied to optimization and compliance.
Doctor
Thousands of conditions, drug interactions, protocols. Pattern matching from memory to symptoms. Reasoning kicks in for differential diagnosis.
Entrepreneur
Right in the middle — must remember market context, people, history. Must reason about strategy, risk, opportunity under uncertainty.
Architect
Balanced — must remember building codes, materials, precedent. Must reason about space, structure, aesthetics.
Software Engineer
Some memorization (APIs, patterns), but primarily reasoning about systems, debugging, architecture.
Mechanical Engineer
Formulas are looked up. The skill is reasoning about forces, materials, constraints, trade-offs.
Physicist
A handful of fundamental equations. The rest is pure reasoning — deriving, modeling, predicting from first principles.
Mathematician
Axioms fit on a napkin. Everything else is derived through reasoning. The ultimate reasoning-dominant discipline.
Every one of these professionals is considered ‘smart’ — but they're smart in fundamentally different ways. A mathematician who memorized every theorem but couldn't derive a proof would be useless. A lawyer who could reason brilliantly but couldn't recall case law would lose every case.
The point: intelligence isn't a single dimension. It's a balance, optimized for a specific function.
The constraint
Same challenge, harder constraint
AI doesn't get that luxury.
Models do carry built-in knowledge from training, called parametric memory — the patterns and facts encoded into the model's weights. This is the closest thing AI has to crystallized intelligence, and it doesn't compete with reasoning for space. But unlike a surgeon who learns from every new case, updating that knowledge requires retraining the entire model. It can't learn on the job.
That means everything dynamic — new information, conversation history, retrieved documents, reasoning, planning, output — shares a single fixed resource called the context window. In practice, memory and reasoning compete for the same token budget: every piece of information you load is space that can't be used for thinking, and every step of reasoning is space that can't hold information.
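The zero-sum nature of the window is just arithmetic. A minimal sketch, with illustrative numbers (real window sizes and token costs vary by model):

```python
# Illustrative sketch: memory-like content and reasoning draw from one token budget.
# All numbers are made up for demonstration, not taken from any specific model.

CONTEXT_WINDOW = 128_000  # total tokens the model can attend to at once

def remaining_for_reasoning(system_prompt: int, history: int,
                            retrieved_docs: int, reserved_output: int) -> int:
    """Tokens left for step-by-step thinking after information is loaded."""
    used = system_prompt + history + retrieved_docs + reserved_output
    return max(CONTEXT_WINDOW - used, 0)

# Load more retrieved context, and room to think shrinks one-for-one.
light = remaining_for_reasoning(2_000, 10_000, 20_000, 4_000)   # 92_000
heavy = remaining_for_reasoning(2_000, 10_000, 100_000, 4_000)  # 12_000
print(light, heavy)
```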
The more you put in, the worse it gets. Research consistently shows that as context grows, models lose track of material in the middle, increasingly favor whatever was added most recently, and deprioritize the instructions loaded at the start — which are usually the most important parts. For simple tasks like summarizing a long document, modern models handle this well enough. For complex agentic workflows that accumulate context over many steps, the effective usable window is much smaller than the number on the spec sheet.
Two kinds of AI, one budget
Where current approaches fall
RAG Systems
Retrieval-augmented generation. Massive external memory, limited reasoning depth. Good for lookup, weak on synthesis.
Long-Context Models
Models like Gemini with 1M+ token windows. Can hold everything, but reasoning degrades in the middle. Memory-rich, reasoning-constrained.
Fine-Tuned Models
Domain knowledge baked into weights. Strong recall for trained domains, but rigid — can't reason about new situations well.
Standard LLMs
General-purpose models (GPT-4, Claude). Balanced but not optimized for any specific function. Jack of all trades.
Agent Frameworks
AutoGPT, CrewAI, etc. Add tool use and planning on top of LLMs. More reasoning capability, but memory management is often naive.
Reasoning Models
Models like o1, DeepSeek R1. Extended thinking, chain-of-thought. Powerful reasoning but context-hungry — less room for memory.
Managed AI Agents
Dynamically positioned — optimized per task
Purpose-built systems that dynamically balance memory and reasoning per task. Not locked to one position on the spectrum.
Most AI products today are locked into one side of this balance.
Some prioritize having the right information. Retrieval systems search knowledge bases and inject relevant content into the prompt so the model has what it needs to answer. The intelligence comes from finding the right context. The trade-off is that all that retrieved content fills the window, leaving less room for the model to actually think through complex problems.
Others prioritize thinking deeply. Reasoning-specialized models spend their token budget on step-by-step logic, self-correction, and planning. They produce stronger analysis on hard problems, but consume the window from the generation side, leaving less room for the source material they're reasoning about.
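The two fixed postures above can be caricatured as one dial set to opposite extremes. A hedged sketch, with made-up percentages (real systems expose no such dial):

```python
# Hypothetical sketch of the two fixed allocation styles described above.
# Percentages are illustrative only.

def split_budget(window: int, memory_pct: int) -> dict:
    """Divide one context window between loaded information and reasoning."""
    memory = window * memory_pct // 100
    return {"memory_tokens": memory, "reasoning_tokens": window - memory}

rag_style = split_budget(32_000, 85)        # retrieval-heavy: little room to think
reasoning_style = split_budget(32_000, 15)  # thinking-heavy: little room for sources
print(rag_style, reasoning_style)
```

The point of the caricature: each style is a single hard-coded split, applied to every task regardless of what the task needs.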
Neither approach adapts to what the task actually needs, and that's the core problem.
The insight
The balance changes with the task
Preparing a meeting briefing
Heavy on input — calendar data, email threads, attendee context — and very little reasoning. Quality depends almost entirely on whether the right information was loaded.
Triaging an inbox
Needs a balance of both. Enough context to understand each thread, enough reasoning to classify priority and draft responses.
Analyzing a market
Some market data recall, but primarily reasoning — pattern recognition, strategic inference, synthesis.
Planning a roadmap
Needs a modest amount of goals, KPIs, and constraints, but deep reasoning — scenario analysis, trade-off evaluation, milestone sequencing. Quality depends on depth of thinking, not volume of input.
Handling a crisis
Perfectly balanced — needs full context of the situation AND sharp reasoning about what to do next, fast.
Reviewing a contract
Must recall legal standards, prior terms, risk thresholds. Reasoning applied to flagging deviations.
A system locked at one position handles some of these well and the rest poorly. The right approach is shifting the allocation dynamically based on what the task actually demands.
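That shift can be sketched as a lookup from task type to budget split. The task names and percentages below are hypothetical illustrations, not any real product's configuration:

```python
# Sketch of task-aware budget allocation, the "dynamic" approach argued for above.
# Task names and shares are hypothetical examples.

MEMORY_PCT = {
    "meeting_prep": 90,    # quality hinges on loading the right information
    "inbox_triage": 50,    # balanced: context per thread plus prioritization
    "strategic_plan": 25,  # modest inputs, deep scenario reasoning
}

def allocate(window: int, task: str) -> dict:
    """Split one context window based on what the task demands."""
    pct = MEMORY_PCT.get(task, 50)  # unknown tasks default to a balanced split
    memory = window * pct // 100
    return {"task": task, "memory": memory, "reasoning": window - memory}

for task in MEMORY_PCT:
    print(allocate(16_000, task))
```

A static system is this table collapsed to a single row; the claim in the text is that choosing the row per task is where the leverage is.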
What this means
Three things to remember
Intelligence is a spectrum, not a score
Both in human expertise and AI systems, there's no single definition of ‘smart.’ A historian and a physicist are both brilliant — in fundamentally different ways. The same applies to AI models.
Most AI products are stuck in one position
RAG systems can't reason deeply. Reasoning models can't hold context. Long-context models lose information in the middle. Each approach makes a fixed trade-off that works for some tasks and fails for others.
The competitive advantage is dynamic optimization
The businesses that get the most from AI won't be the ones using the ‘smartest’ model. They'll be the ones using systems that know when to remember and when to reason — and shift between the two depending on what the task demands.
Sources & further reading
- Cattell, R. B. (1963). Theory of fluid and crystallized intelligence: A critical experiment. Journal of Educational Psychology, 54(1), 1–22.
- Liu, N. F., et al. (2023). Lost in the Middle: How Language Models Use Long Contexts. Transactions of the Association for Computational Linguistics.
- Horn, J. L., & Cattell, R. B. (1966). Refinement and test of the theory of fluid and crystallized general intelligences. Journal of Educational Psychology, 57(5), 253–270.
This is what we do.
Millie doesn't use one model or one approach. We build agent systems that adapt their memory-reasoning balance to every task — because that's what real intelligence looks like.