Why Kin's memory needs more than RAG

This article is part of a series deep-diving into the nitty-gritty tech behind Kin. You can find the original article here.
Every week, we see new papers and approaches to Retrieval-Augmented Generation (RAG). RAG architectures are everywhere: graph RAG, GraphRAG, HybridRAG, HippoRAG, and countless other variations. The AI community has embraced RAG as a potential solution to many of the limitations of Large Language Models (LLMs).
However, as we build more sophisticated AI systems, particularly conversational agents that interact with users in complex ways, we’re discovering that RAG alone is insufficient.
This article explores why RAG, despite its utility, is fundamentally different from true memory systems, and why we need to look beyond RAG to make AI memory more human-like.
As others at Kin have discussed before, memory isn't just about retrieving information - it's about understanding context, building associations, and, perhaps most importantly, knowing what to forget. The more human we can get something like Kin to feel, the easier it will be to interact with it.
I'm going to jump into this below, but you can also learn more from the related video.
The current state of RAG
Retrieval-Augmented Generation has become an essential component in modern AI systems. At its core, RAG works by:
- Receiving a query or prompt
- Searching through a knowledge base (either internal or external) to retrieve relevant documents or passages
- Incorporating the retrieved information into the context window of a language model
- Generating a response based on both the input and the retrieved information
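To make that loop concrete, here is a minimal sketch of the pipeline in Python. The embedding model, the knowledge base, and the `generate` call are hypothetical placeholders; a real system would use a vector database and an LLM API, but the shape of the loop is the same.

```python
from typing import Callable, List, Tuple
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def rag_answer(
    query: str,
    documents: List[str],
    embed: Callable[[str], np.ndarray],   # hypothetical embedding model
    generate: Callable[[str], str],       # hypothetical LLM call
    top_k: int = 3,
) -> str:
    # 1. Embed the query and every document in the knowledge base.
    q_vec = embed(query)
    scored: List[Tuple[float, str]] = [
        (cosine(q_vec, embed(doc)), doc) for doc in documents
    ]
    # 2. Retrieve the top-k most similar passages.
    context = [doc for _, doc in sorted(scored, reverse=True)[:top_k]]
    # 3. Put the retrieved passages into the prompt and generate.
    prompt = (
        "Answer using the context below.\n\n"
        + "\n\n".join(context)
        + f"\n\nQuestion: {query}\nAnswer:"
    )
    return generate(prompt)
```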
Consider a simple example: A user asks, “What were the major outcomes of the Paris Climate Agreement?”
A standard LLM might provide a general answer based on its training data, which could be outdated or incomplete. A RAG-enhanced system, however, would first search through its knowledge base to find specific documents about the Paris Climate Agreement, extract relevant passages about its outcomes, and then use this retrieved information to generate a more accurate and up-to-date response.
RAG has proven highly effective for many applications, particularly when building AI-powered search capabilities or working with large amounts of data that need to be made available to an LLM. It helps ground AI responses in factual information and reduces hallucinations - instances where models generate plausible-sounding but incorrect information.
For example, companies implementing RAG architectures for customer service applications have seen significant improvements in the accuracy of responses to specific product or policy questions.
As one RAG developer put it: “If you’re building AI-powered search, maybe RAG is a good approach, but if you’re building agents or conversational agents or something more complex that interacts with the user, RAG is not enough.”
This limitation becomes apparent when we move beyond simple question-answering to more complex scenarios requiring ongoing interactions, personalization, and adaptation to changing contexts.
Why RAG is not 'true' AI memory
RAG differs from human memory in several fundamental ways, which stop it from feeling natural in complex situations:
1. Lack of episodic context
In human memory, information doesn’t exist in isolation.
Facts are always surrounded by and connected to the events they relate to, our life experiences, and other contextual elements. This episodic context gives facts richer meaning and helps us interpret them more fully.
Consider how differently you might understand a historical fact learned from a textbook versus one experienced through a powerful documentary or museum visit. The emotional impact, visual context, and narrative structure all contribute to how you store, recall, and understand that information. This rich tapestry of contextual elements is what makes human memory so powerful and nuanced.
For example, if you learned about the fall of the Berlin Wall through a documentary that included interviews with families who were reunited after decades of separation, your memory of this historical event would be intertwined with the emotional stories you heard. When recalling facts about the Berlin Wall later, these emotional elements would likely be activated as well, providing a richer, more contextual understanding.
As explained in the video above: “Information does not live in isolation - it’s not encyclopedic knowledge. It’s always surrounded by events that it’s related to, life experiences, and other things. You need to have episodical memories and context to have a richer interpretation of facts.”
RAG systems typically retrieve information based purely on semantic similarity or relevance rankings, stripping away this crucial episodic context. A RAG system might retrieve a passage stating when the Berlin Wall fell, but it wouldn’t necessarily have access to the emotional or experiential context that shapes how humans (or maybe even the user) understand and relate to this information.
Current implementations of RAG treat documents as isolated units of information, failing to capture how they connect to specific experiences, conversations, or interactions. This means that while RAG can provide factual information, it lacks the emotional resonance and personal relevance that make human memory so effective for learning, conversation, and decision-making.
2. Limited association building
Our brains organize memories through complex associative networks. When your brain retrieves information about the color red, it might automatically retrieve related concepts like orange or other semantically related items. These association mechanisms are fundamental to how we think and reason.
Consider what happens when you think about “beach.” Your mind likely activates a whole network of associated concepts: sand, ocean, sunscreen, vacations, specific memories of beaches you’ve visited, the feeling of sun on your skin, the sound of waves. These associations aren’t just semantic - they span sensory modalities, emotions, personal experiences, and abstract concepts.
“Your semantic memory has packages of memories that you retrieve together because they’re semantically close. If your brain retrieves something about red, it will retrieve something about orange and other related things — that’s how our brain is organized.”
This associative structure allows for creative connections and insights that go far beyond simple information retrieval. You might make an unexpected connection between the rhythm of ocean waves and a musical piece you’re composing, or between beach erosion patterns and a business problem you’re trying to solve.
Current RAG systems struggle to replicate this rich associative structure. Even graph-based approaches that attempt to capture relationships between documents or concepts typically rely on predefined connection types or simple co-occurrence statistics. They lack the multidimensional, cross-modal associations that characterize human memory. That's fine if you're not trying to build something human-like; otherwise, it's a problem.
For instance, a RAG system might associate documents containing the words “beach” and “ocean” based on their semantic similarity, but it wouldn’t automatically make connections to related sensory experiences, emotional states, or abstract concepts unless explicitly programmed to do so.
While graph-based RAG approaches attempt to simulate such associative structures, they still fall short of the rich, multidimensional associations in human memory. Building effective associativity in AI systems remains a significant challenge.
3. Retrieval without understanding
Perhaps the most significant limitation of RAG is that it retrieves without understanding. As noted in the video: “You could retrieve some documents, you could build BM25 ranking or other methods that rank information by relevance, but you still do not understand what you retrieve.”
Consider a RAG system answering questions about climate change. It might retrieve and combine information from multiple scientific papers, policy documents, and news articles. The system can find documents containing relevant keywords and even rank them by relevance using sophisticated algorithms like BM25 or neural embeddings. However, it doesn’t truly comprehend the scientific concepts, causal relationships, or policy implications discussed in those documents.
This lack of understanding becomes particularly apparent when dealing with nuanced topics. For example, if asked about the relationship between climate change and extreme weather events, a RAG system might retrieve passages stating statistical correlations without grasping the underlying causal mechanisms or the scientific debate around attribution. It might juxtapose contradictory information from different sources without recognizing the contradiction.
Let’s examine a more concrete example: If a RAG system is asked, “How does carbon pricing affect industrial competitiveness?”, it might retrieve documents mentioning both carbon pricing and industrial competitiveness. However, without true understanding, it can’t independently evaluate the methodological quality of different studies, recognize when industry-funded research might be biased, or grasp the nuanced economic mechanisms at play. It’s simply matching patterns without comprehension.
RAG systems can find and rank information based on various metrics, but they don’t necessarily comprehend what they’ve retrieved in the way humans do. Natural language understanding remains a massive challenge, and without it, RAG is merely shuffling symbols without grasping their meaning.
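For illustration, here is a simplified version of the BM25 scoring mentioned above (the k1 and b parameters use their conventional defaults). It makes clear how purely lexical the ranking is: the score rises when query terms appear in a document, with no notion of what either the query or the document actually means.

```python
import math
from collections import Counter
from typing import List

def bm25_scores(query: str, docs: List[str],
                k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Simplified BM25: rank documents by term overlap with the query."""
    tokenized = [doc.lower().split() for doc in docs]
    avg_len = sum(len(d) for d in tokenized) / len(tokenized)
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for doc in tokenized for term in set(doc))

    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            )
        scores.append(score)
    return scores

# A document about "carbon pricing" scores highly for that query even if it
# misrepresents the economics - the ranking never models meaning.
```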
4. No forgetting mechanism
Counterintuitively, one of the most crucial aspects of human memory is our ability to forget.
“Our brain is not about remembering things - our brain is more about forgetting things. We are forgetting machines.”
This forgetting mechanism is essential for mental health and cognitive function. When it breaks down, we can experience issues like PTSD and other mental health challenges.
Current RAG systems typically don’t incorporate principled forgetting mechanisms that prioritize important information while discarding the irrelevant. They try to remember everything.
Consider what happens when you move to a new city. Over time, you gradually forget the detailed layout of your old neighborhood - the location of every store, the names of minor streets - while retaining important information and emotional memories.
This selective forgetting is crucial; without it, navigating your new environment would be cluttered with irrelevant information from your past - like those first few weeks when you turn left for the train station rather than right, because that's where it was in your old neighbourhood.
Similarly, when you change jobs, you gradually forget the specific details of daily processes at your old workplace while retaining the valuable skills and knowledge you gained there. This forgetting is adaptive, allowing you to focus on your new role without being overwhelmed by outdated procedures.
In contrast, current RAG systems typically retain all information indefinitely. They might assign lower relevance scores to certain documents over time, but they don’t have mechanisms for truly “forgetting” information that has become outdated or irrelevant. This can lead to several problems:
- Information overload: As the knowledge base grows, retrieval becomes increasingly challenging and computationally expensive.
- Outdated information: Without active forgetting, outdated information persists in the system. For example, a RAG system might retrieve old product specifications or deprecated API documentation mixed with current information.
- Contextual confusion: For conversational systems, the inability to forget can lead to confusion as the system tries to maintain consistency with everything it has ever been told, even when contexts change.
“Imagine having a friend who never forgets anything - building a relationship with them would be quite complex.” Similarly, conversational agents without forgetting mechanisms can become unwieldy over time as they retain every piece of information regardless of its current relevance. You'd constantly be fighting their confusion of the past with the present - not because they can't differentiate, but because true change is hard for them to grasp.
For example, imagine a customer service AI that remembers every interaction a customer has ever had with a company, going back years. Without proper forgetting mechanisms, it might continuously bring up resolved issues from the distant past or apply outdated policies. This would create a frustrating experience for the customer, and reduce the effectiveness of the AI system.
Why RAG is not enough for advanced AI systems like Kin
Given all of the above, RAG faces several practical challenges that limit its effectiveness for building truly advanced AI:
1. Context window constraints
Even with retrieval, LLMs are limited by their context windows, or the amount of information an LLM can process at once. Complex reasoning that requires synthesizing information from many sources can exceed these limitations.
Consider a legal assistant AI helping to prepare a case that involves hundreds of previous cases, statutes, and legal opinions. Even if a RAG system can identify the relevant documents, the language model’s context window might only accommodate a small fraction of this information at once. This forces artificial chunking of information that can break important connections, and limits the system’s ability to reason across the full scope of relevant material. The LLM simply has no way to understand connections and relations between those chunks.
Similarly, imagine a medical diagnosis system that needs to consider a patient’s entire medical history, relevant medical literature, similar case studies, and current symptoms. The fragmentation imposed by context window limits can prevent the system from making connections between distant but related pieces of information.
These limitations become increasingly problematic as we move from simple fact retrieval to complex reasoning tasks. While context windows have grown substantially - from 2,048 tokens in early GPT models to 32,000 or more in recent systems - they still impose artificial constraints that human memory doesn’t face. Humans can seamlessly integrate information from experiences decades apart when needed, without limitations on how much context we can consider at once.
Some researchers have proposed sliding window approaches or recursive summarization techniques to address these limitations, but these workarounds introduce their own problems, including potential loss of detail and increased computational overhead.
2. Retrieval quality bottlenecks
RAG is only as good as its retrieval component, or the way it collects supporting information for its generation. Poor retrieval - whether due to inadequate indexing, imprecise queries, or insufficient content in the knowledge base - inevitably leads to poor generation.
This becomes particularly evident when dealing with nuanced queries. For example, if a user asks, “What are the ethical implications of using predictive algorithms in criminal sentencing?”, the quality of the response depends entirely on whether the system can retrieve documents that specifically address ethical dimensions, rather than just technical aspects of predictive algorithms or general information about criminal sentencing.
The retrieval challenge intensifies when dealing with:
- Implicit information needs: When a user’s query doesn’t explicitly mention all relevant aspects of their information need. For instance, “Is this investment a good idea?” implies the need for information about risk, return, market conditions, and the user’s financial goals, none of which are explicitly mentioned. Understanding that these topics are related would help.
- Evolving topics: For newly emerging subjects where terminology isn’t yet standardized or where the relevant information is scattered across documents that use different terminology.
- Conceptual queries: Questions that require conceptual understanding rather than fact retrieval, such as “How does confirmation bias affect scientific research?”
Current RAG systems often struggle with these challenges. Vector similarity searches, while powerful, can miss relevant information that uses different terminology or approaches a topic from a different angle. Hybrid retrieval systems that combine semantic search with keyword matching help address some of these issues, but still fall short of human-like understanding of interconnected informational needs.
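A common hybrid trick is to blend a lexical score with an embedding similarity. The sketch below shows one way to do that, reusing a BM25-style scorer (like the earlier sketch) and a hypothetical embedding model as inputs; the blending weight is an assumption you would tune per application.

```python
from typing import Callable, List
import numpy as np

def hybrid_rank(query: str, docs: List[str],
                lexical_scores: Callable[[str, List[str]], List[float]],  # e.g. a BM25 scorer
                embed: Callable[[str], np.ndarray],                       # hypothetical embedding model
                alpha: float = 0.5) -> List[int]:
    """Blend keyword and semantic scores; returns document indices, best first."""
    lex = np.array(lexical_scores(query, docs))
    lex = lex / (lex.max() + 1e-9)  # normalise lexical scores to 0..1
    q = embed(query)
    sem = np.array([
        float(q @ embed(d) / (np.linalg.norm(q) * np.linalg.norm(embed(d)) + 1e-9))
        for d in docs
    ])
    combined = alpha * lex + (1 - alpha) * sem
    return list(np.argsort(-combined))
```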
3. Static knowledge representation
Most RAG implementations use fixed vector representations that don’t evolve based on new understandings or connections between pieces of information.
Consider how human understanding of concepts evolves over time. When you first learn about a complex topic like quantum physics, you might form a basic mental model. As you learn more, this model becomes more nuanced and interconnected with other knowledge. Your understanding of terms like “superposition” or “entanglement” evolves as you encounter them in different contexts and applications.
In contrast, most RAG systems represent documents as static vectors that don’t change once created. If new information emerges that changes the meaning or importance of existing documents, the system doesn’t automatically update its representations to reflect these changes.
For example, a RAG system with medical information might have documents about a specific treatment approach. If new research emerges showing that this approach is ineffective or harmful, the system doesn’t automatically update the vector representations of the existing documents to reflect their new status as outdated or contested information.
Some advanced RAG systems attempt to address this by periodically re-indexing content or using incremental updates, but they still lack the dynamic, adaptive nature of human conceptual understanding.
4. Limited self-reflection
Without the ability to evaluate the quality and relevance of retrieved information, RAG systems cannot iteratively improve their own performance.
Human memory and cognition involve constant self-monitoring and evaluation. When trying to recall information, we have a sense of whether our recollection is accurate or complete. We can recognize gaps in our knowledge or identify when we need to seek additional information. This metacognitive awareness is crucial for effective learning and problem-solving.
For example, when a doctor is diagnosing a patient with unusual symptoms, they might recognize that the pattern doesn’t fully match any condition they’re familiar with. This awareness prompts them to consult additional resources, seek a specialist’s opinion, or consider rare conditions they wouldn’t normally include in their differential diagnosis.
Current RAG systems lack this kind of self-awareness. When they retrieve information that’s only tangentially related to a query, they typically have no mechanism to recognize this mismatch or to adjust their retrieval strategy accordingly. They can’t identify gaps in their knowledge base or recognize when the retrieved information is insufficient to answer a question.
Some advanced RAG implementations incorporate relevance feedback or uncertainty estimation, but these approaches still fall far short of human-like metacognitive capabilities. Without this self-reflection, RAG systems can’t effectively learn from their mistakes or adapt their strategies based on past performance.
What makes memory truly memory?
I believe, at least for systems like Kin, we need to move beyond RAG toward more human-like AI memory to properly address the issues above.
To reach that kind of 'true' memory, we need systems that incorporate:
1. Multi-modal memory structures
Human memory operates across different modalities, languages, and types of information. Advanced AI memory systems need “rich structural representation with understanding of things,” potentially including multi-language structures where different languages can represent the same concepts.
Our memories aren’t confined to a single format or modality. We remember faces, voices, smells, emotions, facts, procedures, and narratives - all through interconnected but distinct memory systems. Each of these modalities contributes to a rich, multidimensional representation of our experiences and knowledge.
Consider how you remember a childhood birthday party. You might recall:
- Visual elements: the decorations, the cake, people’s faces
- Sounds: laughter, birthday songs, specific conversations
- Emotions: excitement, joy, perhaps some moments of disappointment
- Procedural memories: how to play the games you played
- Semantic information: who was there, how old you turned, what gifts you received
These different types of information are stored and accessed through different but interconnected memory systems. When one element is activated, it often triggers related memories across modalities.
Current AI systems, including RAG, typically operate within a single modality - usually text. Even when they process multiple modalities like images and text, they often convert everything into a single representational format (like embedding vectors) that loses the unique characteristics of different types of information.
For example, a memory of a beach sunset has visual components (the colors of the sky, the texture of the sand), auditory components (the sound of waves), emotional components (feelings of peace or awe), and possibly semantic components (knowledge about why sunsets have particular colors). A truly multimodal memory system would preserve these distinct aspects while maintaining their interconnections.
Advanced AI memory systems should support:
- Distinct but interconnected representations for different types of information
- Cross-modal associations that allow activation to spread across modalities
- Modality-specific processing that respects the unique characteristics of different types of information
- Unified access that can retrieve relevant information regardless of its original modality
Some promising research in this direction includes multimodal transformers, cross-modal retrieval systems, and neuro-symbolic architectures that combine neural representations with symbolic reasoning - but we still have a way to go.
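As a rough sketch of what "distinct but interconnected" could look like in code, the record below keeps a separate embedding per modality instead of collapsing everything into one vector, and links memories to each other so a match in one modality brings its associated memories along. The modality names and structure are illustrative assumptions, not a description of Kin's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List
import numpy as np

@dataclass
class MultiModalMemory:
    """One memory with separate representations per modality (illustrative)."""
    memory_id: str
    # e.g. {"text": vec, "image": vec, "audio": vec, "emotion": vec}
    embeddings: Dict[str, np.ndarray] = field(default_factory=dict)
    links: List[str] = field(default_factory=list)  # ids of associated memories

def cross_modal_match(query_vec: np.ndarray, modality: str,
                      store: List[MultiModalMemory], top_k: int = 5):
    """Query one modality, but return whole memories so the other
    modalities (and their links) come along with the match."""
    def sim(m: MultiModalMemory) -> float:
        v = m.embeddings.get(modality)
        if v is None:
            return -1.0
        return float(query_vec @ v /
                     (np.linalg.norm(query_vec) * np.linalg.norm(v) + 1e-9))
    return sorted(store, key=sim, reverse=True)[:top_k]
```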
2. Active reconstruction
“Recall is active construction - we extract memories and construct them on the fly.”
The hippocampus, for example, is responsible for timeline and time reconstructions. This active construction process allows us to model the future and think about possibilities.
Human memory isn’t like a video recording that plays back exactly what was stored. Instead, it’s a reconstructive process. When we remember, we don’t simply retrieve a complete memory; we reconstruct it from fragments, filling in gaps based on schemas, expectations, and related memories.
Consider what happens when you recall a conversation from last week. You don’t remember every word verbatim. Instead, you reconstruct the gist of what was said, perhaps remembering a few key phrases exactly. You fill in gaps based on your understanding of the person you were talking to, the context of the conversation, and your knowledge of the topic discussed.
This reconstructive nature of memory has several important implications:
- Memory is creative: We don’t just retrieve; we create memories anew each time we recall them. This allows us to adapt memories to current needs and contexts.
- Memory is malleable: Our memories can change over time as we reconstruct them in slightly different ways, incorporating new information or perspectives.
- Memory supports imagination: The same mechanisms that help us reconstruct past events also allow us to imagine future scenarios or counterfactual situations.
Current RAG systems lack this reconstructive aspect. They retrieve existing text passages but don’t actively reconstruct information to fit the current context or to fill gaps between retrieved fragments. They can combine or summarize retrieved passages, but this falls short of true reconstructive memory.
For example, if asked about a specific aspect of climate change that isn’t directly addressed in any single document in its knowledge base, a RAG system might retrieve several tangentially related documents but struggle to reconstruct the specific information needed. A human expert, in contrast, might reconstruct an answer by combining fragments of knowledge from different sources, filling in gaps based on general understanding of the domain.
While this can sometimes lead to false memories, this active reconstruction is essential for truly advanced memory systems. Without it, we have mere retrieval, not true memory.
That's not to say perfect retrieval isn't useful or desirable - just that adding active reconstruction on top lets an AI system make much more extensive use of its data, adapting it to current needs with confidence estimates and filling in any gaps.
Advanced AI memory systems should therefore incorporate:
- Schema-based reconstruction that uses general knowledge structures to fill gaps in specific memories
- Context-sensitive recall that adapts reconstructed memories to current needs and contexts
- Constructive simulation capabilities that extend beyond retrieval to support imagination and counterfactual reasoning
- Confidence estimation for reconstructed elements to distinguish between directly retrieved information and inferred or reconstructed components
Recent work on generative retrieval models and neural schema networks shows promise in this direction, but much work remains to achieve truly human-like reconstructive memory.
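The last point in the list above, confidence estimation for reconstructed elements, can be sketched very simply: tag every element of a recalled answer with whether it was retrieved directly or filled in from a schema, and at what confidence. The field names and schema dictionary below are hypothetical illustrations of the idea, not an existing API.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class MemoryElement:
    content: str
    source: str        # "retrieved" or "reconstructed"
    confidence: float  # 1.0 for verbatim retrieval, lower for inferred parts

def reconstruct(fragments: Dict[str, str], schema: Dict[str, str]) -> List[MemoryElement]:
    """Fill gaps in retrieved fragments with schema defaults, tracking provenance."""
    answer = []
    for slot, default in schema.items():
        if slot in fragments:
            answer.append(MemoryElement(fragments[slot], "retrieved", 1.0))
        else:
            # Gap: fall back on general knowledge, but say so.
            answer.append(MemoryElement(default, "reconstructed", 0.4))
    return answer

# Example: recalling a conversation where only the topic and one quote were stored.
recalled = reconstruct(
    fragments={"topic": "holiday plans", "quote": "'let's go in June'"},
    schema={"topic": "unknown topic", "quote": "no exact wording", "mood": "probably relaxed"},
)
```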
3. Associative memory networks
Memory should be associative, building rich semantic connections between related concepts. These associations allow for more flexible and human-like reasoning, which ultimately feels more natural and intelligent to interact with - something important for advanced AI chatbots like Kin.
Human memory is fundamentally associative. Concepts, experiences, and facts are linked together in complex networks of associations that allow for flexible retrieval and creative connections. Unlike the rigid, hierarchical organization of traditional computer storage systems, associative memory enables us to access information through multiple pathways and to make unexpected connections that can lead to 'human' insights and innovations.
Consider how you might retrieve information about “Paris.” Depending on the context, you might access this information through:
- Geographic associations (it’s in France, it’s a European capital)
- Cultural associations (the Eiffel Tower, French cuisine, art museums)
- Personal associations (a vacation you took there, a friend who lives there)
- Historical associations (the French Revolution, World War II events)
- Literary or artistic associations (Hemingway’s “A Moveable Feast,” Impressionist paintings)
These rich, multidimensional associations allow for flexible and context-appropriate retrieval. If you’re planning a trip, your brain might activate travel-related associations; if you’re discussing art history, it might activate artistic associations instead. The two will often combine in unexpected and helpful ways, based on tangential connections you didn't even remember making until you recalled them.
Current RAG systems typically rely on similarity-based retrieval that lacks this rich associative structure. Documents might be retrieved based on overall similarity to a query, but the system doesn’t maintain explicit associations between different pieces of information that could enable more flexible and creative retrieval paths.
For example, if a user asks about “innovative urban transportation solutions,” a traditional RAG system might retrieve documents containing similar terms. An associative memory system could activate a network of related concepts - bicycle sharing programs, congestion pricing, autonomous vehicles, urban planning principles - even if these aren’t all explicitly mentioned in the query. The search would be much more useful as a result.
Advanced AI memory systems should incorporate:
- Explicit representation of associations between concepts, facts, and experiences
- Multiple types of associations (semantic, temporal, causal, etc.) that capture different ways information can be related
- Association strength that reflects how strongly connected different pieces of information are
- Contextual activation of associations that depends on the current focus and goals
- Spreading activation mechanisms that allow retrieval to follow chains of associations
Research on graph neural networks, associative memory models, and knowledge graphs offers promising directions for building such systems, but creating truly human-like associative memory remains a significant challenge.
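A minimal sketch of spreading activation over such an associative graph might look like the following; the graph contents and decay factor are assumptions chosen for illustration. Activation starts at the queried concepts and decays as it spreads along weighted links, so loosely related ideas surface even though they never appeared in the query.

```python
from typing import Dict, List, Tuple

# Weighted associative graph: concept -> [(neighbour, association strength)]
Graph = Dict[str, List[Tuple[str, float]]]

def spread_activation(graph: Graph, seeds: List[str],
                      decay: float = 0.5, steps: int = 2) -> Dict[str, float]:
    """Spread activation from seed concepts through weighted associations."""
    activation = {concept: 1.0 for concept in seeds}
    frontier = dict(activation)
    for _ in range(steps):
        next_frontier: Dict[str, float] = {}
        for concept, act in frontier.items():
            for neighbour, weight in graph.get(concept, []):
                boost = act * weight * decay
                if boost > activation.get(neighbour, 0.0):
                    next_frontier[neighbour] = boost
                    activation[neighbour] = boost
        frontier = next_frontier
    return activation

graph: Graph = {
    "urban transport": [("bike sharing", 0.9), ("congestion pricing", 0.7)],
    "bike sharing":    [("urban planning", 0.6)],
}
print(spread_activation(graph, ["urban transport"]))
# bike sharing and congestion pricing light up even though the query never named them
```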
4. Adaptive forgetting
“Without forgetting, memory simply doesn’t work.” Forgetting goes hand-in-hand with attention - deciding what information is worth focusing on and what can be discarded.
Contrary to popular belief, forgetting isn’t always a failure of memory - it’s also an adaptive feature that helps us function effectively in a complex, changing world. By selectively retaining important information while discarding the irrelevant, our memory systems optimize for utility rather than perfect recall.
Consider what would happen if you remembered every detail of your daily commute for the past five years - every car you passed, every pedestrian you saw, every slight variation in your route. This information would overwhelm your memory system and interfere with your ability to recall the truly important information - like the way to the train station. Instead, your brain intelligently retains the stable, important features (the general route, notable landmarks) while discarding the irrelevant details.
Forgetting serves several crucial functions:
- Reducing interference: By discarding outdated or irrelevant information, forgetting helps prevent interference with new learning and recall of important information.
- Promoting generalization: Forgetting specific details can help extract general patterns and principles from experiences.
- Adapting to changing environments: As our circumstances change, forgetting allows us to update our memories and behaviors accordingly.
- Emotional regulation: The ability to forget painful experiences (or at least reduce their emotional impact) is crucial for psychological well-being.
Current RAG systems typically lack principled forgetting mechanisms. They might implement simple time-based decay or relevance thresholds, but these fall far short of the sophisticated, context-sensitive forgetting processes in human memory.
For example, a conversational AI using RAG might retain every detail of past conversations indefinitely, leading to increasingly irrelevant retrievals as the knowledge base grows. Without adaptive forgetting, the system might continue to retrieve information about a user’s past interests, situation, or needs even after these have changed significantly. Bluntly, an AI that routinely asks about a user's now-dead friend as if they're alive is likely to have that user drop off.
Advanced AI memory systems should incorporate:
- Importance-weighted retention that prioritizes retention of information based on its utility, emotional significance, and relevance to current goals
- Context-sensitive forgetting that adapts forgetting rates based on changes in the environment or the user’s needs
- Interference-based forgetting that considers how different memories might interfere with each other
- Strategic consolidation processes that strengthen important memories while allowing less important ones to fade
Recent work on neural networks with controlled forgetting, adaptive memory networks, and reinforcement learning approaches to memory optimization shows promise in this direction.
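As a toy illustration of importance-weighted retention, the score below combines recency, usage, and an importance weight, and lets memories whose score falls under a threshold fade away. The specific weights, half-life, and threshold are arbitrary assumptions, shown only to make the idea concrete.

```python
import math
import time
from dataclasses import dataclass, field
from typing import List

@dataclass
class MemoryItem:
    content: str
    importance: float          # 0..1, e.g. emotional or goal relevance
    created_at: float = field(default_factory=time.time)
    access_count: int = 0

def retention_score(item: MemoryItem, now: float, half_life_days: float = 30.0) -> float:
    """Higher score = keep; combines recency decay, usage, and importance."""
    age_days = (now - item.created_at) / 86_400
    recency = math.exp(-math.log(2) * age_days / half_life_days)
    usage = min(math.log1p(item.access_count), 1.0)
    return 0.5 * item.importance + 0.3 * recency + 0.2 * usage

def forget(store: List[MemoryItem], threshold: float = 0.25) -> List[MemoryItem]:
    now = time.time()
    return [m for m in store if retention_score(m, now) >= threshold]
```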
5. Hierarchical organization
Memory systems should be hierarchical, mirroring the semantic hierarchies of human knowledge and allowing for different levels of abstraction and generalization.
Human memory operates at multiple levels of abstraction simultaneously. We can zoom in to recall specific details or zoom out to access general principles and categories. This hierarchical organization allows for efficient storage, flexible retrieval, and powerful generalization capabilities.
Consider how your knowledge about animals is organized:
- At the highest level, you have the general concept of “animal”
- Below that, you might have categories like “mammals,” “birds,” “reptiles,” etc.
- Each of these categories contains subcategories (e.g., “mammals” includes “primates,” “carnivores,” “rodents,” etc.)
- At lower levels, you have specific species (e.g., “tigers,” “elephants”)
- At the most detailed level, you might have specific instances (e.g., “the tiger I saw at the zoo last year”)
This hierarchical structure allows you to intuitively:
- Make inferences about new instances based on their category membership
- Access information at the appropriate level of detail for a given task
- Generalize knowledge from specific instances to broader categories
- Navigate efficiently between different levels of abstraction
Current RAG systems typically lack this rich hierarchical organization. While they might use clustering or categorization schemes to organize documents, these approaches typically don’t capture the nested, multilevel hierarchies that characterize human conceptual knowledge.
For example, if a user asks a question about “transportation options in European cities,” a traditional RAG system might retrieve documents containing these terms. A hierarchically organized memory system could recognize the relationships between specific transportation modes (buses, trams, metros), specific cities (Paris, Berlin, Barcelona), and the general concepts they exemplify, allowing for more nuanced and comprehensive responses.
Advanced AI memory systems should incorporate:
- Explicit representation of hierarchical relationships between concepts at different levels of abstraction
- Flexible navigation between different levels of the hierarchy based on current needs
- Inheritance mechanisms that allow properties and relationships to propagate through the hierarchy
- Hierarchical inference capabilities that can reason across different levels of abstraction
Research on hierarchical neural networks, concept lattices, and ontology-based knowledge representation offers promising directions for building such systems.
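A bare-bones sketch of hierarchical representation with property inheritance might look like this (the taxonomy and properties are illustrative). Querying a specific instance walks up the hierarchy, so knowledge attached at any level of abstraction is available when recalling a particular memory.

```python
from typing import Dict, Optional

class Concept:
    def __init__(self, name: str, parent: Optional["Concept"] = None,
                 properties: Optional[Dict[str, str]] = None):
        self.name = name
        self.parent = parent
        self.properties = properties or {}

    def lookup(self, key: str) -> Optional[str]:
        """Inheritance: if this level doesn't know, ask the level above."""
        if key in self.properties:
            return self.properties[key]
        return self.parent.lookup(key) if self.parent else None

animal = Concept("animal", properties={"alive": "yes"})
mammal = Concept("mammal", animal, {"warm_blooded": "yes"})
tiger = Concept("tiger", mammal, {"habitat": "forests of Asia"})
zoo_tiger = Concept("the tiger I saw at the zoo", tiger)

print(zoo_tiger.lookup("warm_blooded"))  # inherited from "mammal"
```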
Promising directions for AI memory systems
Several approaches show promise for developing more sophisticated AI memory systems:
1. Hierarchical memory architectures
Specifically, hierarchical memory architectures that operate at different timescales and levels of abstraction.
Recent research has explored multilevel memory architectures that combine fast-changing working memory components with more stable long-term memory systems. These approaches often use different encoding mechanisms and update rules for different types of information and different timescales.
For example, systems like Hierarchical Transformer Memory (HTM) use nested attention mechanisms to capture information at different levels of abstraction and different temporal scales. Some systems employ fast-changing memory buffers for recent context alongside slower-changing components for stable knowledge, mimicking the complementary learning systems theory of human memory.
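One way to picture the "fast buffer plus slow store" idea is the two-timescale sketch below, loosely inspired by complementary learning systems theory. The consolidation rule here (promote anything observed more than once) is a deliberately crude assumption, not how any published system actually works.

```python
from collections import Counter, deque
from typing import Deque, Dict, List

class TwoTimescaleMemory:
    """Fast-changing working buffer feeding a slower long-term store (toy model)."""
    def __init__(self, buffer_size: int = 20):
        self.working: Deque[str] = deque(maxlen=buffer_size)  # recent context, volatile
        self.long_term: Dict[str, int] = {}                   # stable, consolidated facts
        self._seen: Counter = Counter()

    def observe(self, fact: str) -> None:
        self.working.append(fact)
        self._seen[fact] += 1
        # Crude consolidation rule: repeated facts get promoted to long-term memory.
        if self._seen[fact] >= 2:
            self.long_term[fact] = self._seen[fact]

    def recall(self, cue: str) -> List[str]:
        recent = [f for f in self.working if cue in f]
        stable = [f for f in self.long_term if cue in f]
        return recent + [f for f in stable if f not in recent]
```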
2. Event-based memory systems utilizing temporal graphs
Event-based approaches organize memory around discrete events and their temporal and causal relationships, showing the connection between events over time.
These systems can capture narrative structure and episodic context that simple document retrieval systems miss.
For instance, research on Event and Entity-centric Knowledge Graphs constructs explicit representations of events, their participants, and their temporal and causal relationships. These structured representations can support more sophisticated reasoning about sequences of events, their causes and consequences, and their relationships to broader historical or personal narratives.
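A minimal data structure for an event-centric memory might look like the following; the fields and relation names ("before", "caused", "part_of") are assumptions chosen to show temporal and causal links, not a specific published schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Tuple

@dataclass
class Event:
    event_id: str
    description: str
    timestamp: datetime
    participants: List[str] = field(default_factory=list)

@dataclass
class EventGraph:
    events: Dict[str, Event] = field(default_factory=dict)
    # edges[(source_id, target_id)] = relation, e.g. "before", "caused", "part_of"
    edges: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def add(self, event: Event) -> None:
        self.events[event.event_id] = event

    def relate(self, source: str, target: str, relation: str) -> None:
        self.edges[(source, target)] = relation

    def timeline(self) -> List[Event]:
        """Reconstruct episodic order from timestamps."""
        return sorted(self.events.values(), key=lambda e: e.timestamp)
```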
3. Associative memory networks
Associative memory networks build rich semantic connections between related concepts. They explicitly represent the links between different pieces of information, enabling flexible, multipath retrieval and creative connections between seemingly disparate concepts.
Recent approaches include Associative Memory-based Knowledge Graphs, which combine semantic network structures with neural embeddings to capture both explicit and implicit relationships between concepts. Some systems implement spreading activation mechanisms that allow retrieval to follow chains of associations, mimicking the associative nature of human recall.
4. Neuro-symbolic approaches combining neural networks with symbolic reasoning
Neuro-symbolic systems integrate neural network approaches (good at pattern recognition and generalization) with symbolic reasoning (good at explicit representation and logical inference) to better represent and manipulate information. These hybrid approaches can capture both the statistical patterns in data and the structured, compositional nature of human knowledge.
For example, systems like Neural Logic Machines and Neuro-Symbolic Concept Learners use neural networks to learn representations of concepts and relationships, combined with symbolic reasoning mechanisms to manipulate these representations according to logical rules. This approach can support both flexible pattern recognition and precise logical reasoning.
5. Forgetting mechanisms that strategically determine what information to retain
Adaptive forgetting approaches implement principled mechanisms for determining what information to retain and what to discard, based on factors like relevance, importance, and potential for interference.
Recent work includes Adaptive Forgetting Neural Networks, which implement learnable forgetting rates that depend on the importance and utility of different pieces of information. Some approaches use reinforcement learning to optimize forgetting policies, maximizing the utility of retained information while minimizing memory and computational costs.
Conclusion - why does Kin's memory need more than RAG?
RAG represents an important step toward more capable AI systems with access to external knowledge. However, equating RAG with memory falls short of the rich, dynamic, and integrated nature of human memory systems.
As we work to develop more advanced AI, particularly conversational agents and systems that interact with humans in complex ways, we need to move beyond simple retrieval. We need to develop systems that can truly remember, reflect, and learn from experience - systems that understand the importance of context, association, and even forgetting.
Those will be the systems people will actually want to spend extended time interacting with.
Consider the difference between:
- A search engine that can retrieve documents containing the term “climate change adaptation”
- A colleague who can discuss climate adaptation strategies based on their understanding of the field, drawing connections to related concepts, recalling relevant examples, and constructing new insights by combining information from different sources
The first is retrieval; the second is true memory. As AI systems become more integrated into our lives and work, we need them to function more like a colleague than a search engine.
This doesn’t mean abandoning RAG - it remains a valuable tool for grounding AI systems in factual information. But it does mean recognizing its limitations and working to complement it with more sophisticated memory mechanisms that capture the active, constructive, associative, and adaptive nature of human memory. A combination of both is needed to take everyday AI further.
The message is clear: if you want to build something truly advanced, think about memory systems, not just retrieval systems. RAG is not enough - it’s merely the beginning of our journey toward AI systems with truly human-like memory capabilities.
This article combines insights from a technical presentation on AI memory systems with additional research on the limitations of RAG and the requirements for more sophisticated AI memory. The field continues to evolve rapidly, and new approaches to AI memory may emerge to address the limitations discussed.