Ortem Technologies
    AI Tooling

    How to Handle Memory in Your AI Coding Setup

    Praveen Jha | May 7, 2026 | 12 min read
    Quick Answer

    Use CLAUDE.md for project context, add an MCP memory server for structured recall, and enable a plugin like Supermemory for session auto-capture. This hybrid approach keeps your conventions and decisions available across sessions.

    Persistent Coding Memory

    Every session with Claude Code, Cursor, or Copilot starts from zero.

    That means no memory of your codebase conventions, and none of the architectural decision you spent two hours debating last Tuesday.

    You re-explain it, it listens. But then you close the session and it’s gone.

    There are real, implementable solutions to this. But first:

    Tools of the Week

    1. Supermemory.ai/claude-supermemory
    • Installs as a Claude Code plugin.
    • Injects your profile at session start.
    • Auto-captures decisions when a session ends.
    • Has a team memory namespace so one engineer's fix becomes everyone's context next session.
    2. Memory-bank MCP
    • A lightweight MCP server that gives Claude a per-project file store for architectural decisions and conventions.
    • Free and local, with no third-party service; setup takes a single 'npx' command.
    • Claude reads/writes to ~/.claude/memory-bank.
    3. Claude Code memory docs
    • Anthropic's official guide to CLAUDE.md, auto memory, and the full file hierarchy.
    • The right place to start before reaching for any third-party tool.

    Why This Happens

    LLMs are stateless.

    Every conversation is a new context window with no knowledge of what came before. Claude Code, Cursor, and Copilot all share this limitation.

    The model powering your session is not the same continuous entity you talked to yesterday.

    It's a fresh instantiation reading only the context you give it at startup. The only memory that survives is what you explicitly provide.

    The Four Approaches

    Approach 1: CLAUDE.md (native, free, manual)

    Claude Code reads a CLAUDE.md file at the start of every session. Whatever is in that file becomes part of Claude's initial context automatically.

    • ~/.claude/CLAUDE.md: global, applies to all projects.
    • <project-root>/CLAUDE.md: project-specific, shared via git.
    • <project-root>/CLAUDE.local.md: personal overrides, gitignored.
    • <project-root>/.claude/rules/*.md: scoped rules by file type or directory.
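
    To see which of these files a given checkout actually has, the locations above can be walked in a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the order simply mirrors the list above rather than any documented precedence rule in Claude Code.

```python
from pathlib import Path


def candidate_memory_files(project_root: Path, home: Path) -> list[Path]:
    """Return the memory-file locations listed above, broadest scope first."""
    candidates = [
        home / ".claude" / "CLAUDE.md",    # global, applies to all projects
        project_root / "CLAUDE.md",        # project-specific, shared via git
        project_root / "CLAUDE.local.md",  # personal overrides, gitignored
    ]
    rules_dir = project_root / ".claude" / "rules"
    if rules_dir.is_dir():
        # Scoped rule files, sorted so the result is deterministic.
        candidates.extend(sorted(rules_dir.glob("*.md")))
    return candidates


def existing_memory_files(project_root: Path, home: Path) -> list[Path]:
    """Keep only the candidates that are present on disk."""
    return [p for p in candidate_memory_files(project_root, home) if p.is_file()]
```

    Calling existing_memory_files(Path.cwd(), Path.home()) from a project root gives a quick audit of what context a new session could pick up.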

    A practical example for ~/.claude/CLAUDE.md:

    • Stack: React + TypeScript + Vite + Tailwind
    • Code style: Prettier + ESLint with 2-space indent
    • Architecture pattern: feature folders + shared UI library
    • Deployment: S3 + CloudFront + GitHub Actions
    • Important: Avoid 'any', prefer explicit types, no classes

    Project-level example:

    • apiPrefix = /api/v1
    • auth = JWT scope openid profile email
    • css = Tailwind JIT + custom spacing scale
    • tests = Cypress integration suites

    Two rules for CLAUDE.md: keep it under 200 lines, and put the most important things first.

    Claude reads sequentially and weighs the top more heavily. A 500-line file consumes ~3,800 tokens and hurts adherence; a clean 80-line file reduces wrong-scope corrections by 40%.
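
    The line budget is easy to check mechanically. The sketch below counts lines and estimates tokens using the ratio implied by the figures above (about 3,800 tokens for 500 lines, or roughly 7.6 tokens per line). That ratio is an assumption carried over from this article, not a real tokenizer, and the function name is hypothetical.

```python
MAX_LINES = 200                # the line budget suggested above
TOKENS_PER_LINE = 3800 / 500   # ~7.6, taken from the 500-line figure above


def claude_md_report(text: str) -> dict:
    """Summarize a CLAUDE.md body: line count, rough token cost, budget status."""
    line_count = len(text.splitlines())
    return {
        "lines": line_count,
        # Rough estimate only; a real tokenizer will give different numbers.
        "estimated_tokens": round(line_count * TOKENS_PER_LINE),
        "over_budget": line_count > MAX_LINES,
    }
```

    Running claude_md_report(Path("CLAUDE.md").read_text()) in a pre-commit hook is one way to catch a file that has quietly grown past the budget.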

    Where CLAUDE.md falls short: it only carries what you wrote, and it doesn't capture session learnings automatically unless auto memory is on. It requires discipline, and a stale file will contradict the real codebase.

    [Figure: Supermemory plugin flow]

    Approach 2: Memory MCP servers (structured, cross-project, composable)

    MCP (Model Context Protocol) gives Claude Code external tools. A memory MCP server stores facts, decisions, and patterns in a persistent store and lets Claude query them mid-session.

    Three open-source options:

    • memory-bank (@allpepper/memory-bank-mcp): per-project file-based memory. Good for architecture decisions.
    • mcp-knowledge-graph: entity + relationship store (e.g., "use express-rate-limit" connected to "rate limiting" and "your security policy").
    • mcp-memory-service: local server with vector search and session harvest tool.
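
    The entity + relationship idea behind the knowledge-graph option can be illustrated with a small in-memory structure. This is a sketch of the general technique only; the class, method names, and relation labels are hypothetical and are not mcp-knowledge-graph's actual API or storage format.

```python
from collections import defaultdict


class KnowledgeGraph:
    """A tiny entity-relationship store: each entity maps to labeled edges."""

    def __init__(self) -> None:
        self._edges: dict[str, set[tuple[str, str]]] = defaultdict(set)

    def relate(self, source: str, relation: str, target: str) -> None:
        # Store the edge in both directions so either entity can be the query.
        self._edges[source].add((relation, target))
        self._edges[target].add((f"inverse:{relation}", source))

    def neighbors(self, entity: str) -> list[tuple[str, str]]:
        """Return (relation, other_entity) pairs, sorted for stable output."""
        return sorted(self._edges.get(entity, set()))


graph = KnowledgeGraph()
graph.relate("use express-rate-limit", "implements", "rate limiting")
graph.relate("use express-rate-limit", "required_by", "your security policy")
```

    Querying graph.neighbors("rate limiting") then leads back to the stored decision, which is what lets a later session recover why a library was chosen.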

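    Register the server in your Claude MCP config (for example, a project-level .mcp.json). Servers are declared under an mcpServers map, each with the command that launches it. The entry below is a sketch for memory-bank; the npx invocation and the MEMORY_BANK_ROOT variable follow that package's README, so verify them there, and note that mcp-memory-service needs its own entry with its own launch command:

    {
      "mcpServers": {
        "memory-bank": {
          "command": "npx",
          "args": ["-y", "@allpepper/memory-bank-mcp"],
          "env": { "MEMORY_BANK_ROOT": "<path-to-memory-bank>" }
        }
      }
    }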
    

    Once installed, the memory tools are available in every session, but Claude chooses when to query them. If it doesn't check memory for a task, no lookup happens.

    Approach 3: Supermemory Claude Code plugin (automated capture + injection)

    Dhravya Shah built a plugin that addresses the MCP limitation.

    Two key behaviors:

    1. Context injection on session start – user profile and recent context preloaded.
    2. Auto capture on session end – turns are saved without manual /remember.

    Configure triggers at ~/.supermemory-claude/settings.json:

    {
      "captureKeywords": ["architecture","refactor","security","performance"],
      "saveOnExit": true
    }
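
    One plausible reading of these two settings is a keyword filter applied when the session closes. The sketch below illustrates that reading only; the function names are hypothetical and this is not the plugin's actual capture logic, which may work quite differently.

```python
import json


def should_capture(turn: str, keywords: list[str]) -> bool:
    """True if the turn mentions any keyword (case-insensitive substring match)."""
    lowered = turn.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


def turns_to_save(turns: list[str], settings: dict) -> list[str]:
    """Select the session turns worth persisting under the given settings."""
    if not settings.get("saveOnExit", False):
        return []  # capture on exit is switched off
    keywords = settings.get("captureKeywords", [])
    return [turn for turn in turns if should_capture(turn, keywords)]


settings = json.loads(
    '{"captureKeywords": ["architecture", "refactor", "security", "performance"],'
    ' "saveOnExit": true}'
)
```

    A substring match is deliberately crude: it will also fire on passing mentions, so a real capture step would likely need something more selective.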
    

    For teams, add repo-specific config at <project-root>/.claude/supermemory.json.

    Setting repoContainerTag creates a shared namespace per repo, so one engineer's migration pattern becomes available to everyone.

    Where Supermemory falls short: it requires a Pro plan and hands your data to a third party. Benchmark claims (~99% LongMemEval) come from controlled experiments; real, messy sessions will have lower recall.

    Approach 4: memory-mcp (self-hosted, tiered, hooks-based)

    A two-tier memory architecture built on Claude Code's hook system:

    • Tier 1: CLAUDE.md (top 150 lines, auto-generated and updated).
    • Tier 2: .memory/state.json full store, queryable mid-session.
    • Hooks: Stop, PreCompact, SessionEnd.
    • Extracts learnings from session transcripts with Claude (low cost).

    Self-hosted, local-only, offline-capable. The extraction step uses Haiku API calls, at a typical daily cost of $0.05–$0.10.
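
    The tiering itself is simple to sketch: rank the stored memories, write everything to the full store, and copy only the top entries into CLAUDE.md. The code below is an illustration under stated assumptions: the Memory type, its score field, and the one-bullet-per-memory layout are hypothetical, and the real tool's schema and ranking are not described here.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path

TIER1_MAX_LINES = 150  # the Tier 1 line cap described above


@dataclass
class Memory:
    text: str
    score: float  # hypothetical importance score, higher means more important


def write_tiers(memories: list[Memory], claude_md: Path, state_json: Path) -> None:
    """Write every memory to the Tier 2 store and the top-ranked ones to Tier 1."""
    ranked = sorted(memories, key=lambda m: m.score, reverse=True)

    # Tier 2: the full store, queryable mid-session.
    state_json.parent.mkdir(parents=True, exist_ok=True)
    state_json.write_text(json.dumps([asdict(m) for m in ranked], indent=2))

    # Tier 1: one bullet per memory, collapsed to a single line, capped in length.
    top = ranked[:TIER1_MAX_LINES]
    bullets = [f"- {' '.join(m.text.split())}" for m in top]
    claude_md.write_text("\n".join(bullets) + "\n")
```

    Note that this sketch overwrites CLAUDE.md wholesale; a setup that also keeps hand-written rules in that file would need to regenerate only a marked section.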

    The Common Pattern Underneath

    All solutions do the same core thing:

    • write information that would otherwise vanish at session end
    • persist it
    • re-inject at startup
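
    Stripped of tooling, the three steps above fit in a few functions. This is a minimal sketch using a JSON file as the store; the file layout and function names are hypothetical and do not correspond to any of the tools discussed.

```python
import json
from pathlib import Path


def load_learnings(store: Path) -> list[str]:
    """Read persisted notes, or an empty list if nothing has been saved yet."""
    if not store.exists():
        return []
    return json.loads(store.read_text())


def save_learnings(store: Path, learnings: list[str]) -> None:
    """Write step: merge new notes into the store, skipping exact duplicates."""
    merged = load_learnings(store)
    for note in learnings:
        if note not in merged:
            merged.append(note)
    store.write_text(json.dumps(merged, indent=2))


def startup_context(store: Path) -> str:
    """Re-inject step: render the persisted notes as text for a new session."""
    return "\n".join(f"- {note}" for note in load_learnings(store))
```

    Everything the tools above add (ranking, search, team namespaces, hooks) is refinement on top of this loop.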

    The default pattern for most teams: a clean, opinionated CLAUDE.md plus one memory tool (an MCP server or a plugin).

    My Take

    The problem is not new. Developer complaints about Claude Code's "goldfish memory" date back to 2025.

    A proper CLAUDE.md solves 80% of the problem if you invest 15–30 minutes a week in keeping it current.

    A responsible setup:

    • Tier 1: CLAUDE.md for static rules/architecture
    • Tier 2: searchable memory store (MCP or plugin) for learnings

    Large context windows help but do not persist across session boundaries. Structured persistent memory does.

    Until next time,

    Praveen Jha, Director – AI Product Strategy, Development, Sales & Business Development

    What Actually Works: Lessons from Production AI Coding Setups

    The AI coding setups that work best in production teams share a few characteristics: they use a combination of techniques rather than a single approach, they invest time in quality CLAUDE.md and context files upfront rather than expecting the AI to figure out the codebase implicitly, and they use the AI for what it is genuinely good at (boilerplate, pattern repetition, test generation, documentation) rather than for complex architectural reasoning that requires full system context.

    The most common mistake: treating AI coding tools as all-knowing software engineers rather than as very capable pattern matchers and code generators that work best when given explicit, detailed context about what they should produce. The time invested in writing clear context files and prompts pays dividends in the quality of AI-generated output.

    At Ortem Technologies, we use AI coding tools alongside traditional development practices: they accelerate development of well-defined components while human engineers handle architecture, complex business logic, and system integration decisions.

    About Ortem Technologies

    Ortem Technologies is a premier custom software, mobile app, and AI development company. We serve enterprise and startup clients across the USA, UK, Australia, Canada, and the Middle East. Our cross-industry expertise spans fintech, healthcare, and logistics, enabling us to deliver scalable, secure, and innovative digital solutions worldwide.



    About the Author

    Praveen Jha

    Director – AI Product Strategy, Development, Sales & Business Development, Ortem Technologies

    Praveen Jha is the Director of AI Product Strategy, Development, Sales & Business Development at Ortem Technologies. With deep expertise in technology consulting and enterprise sales, he helps businesses identify the right digital transformation strategies - from mobile and AI solutions to cloud-native platforms. He writes about technology adoption, business growth, and building software partnerships that deliver real ROI.
