Inlay

Has anyone played with a client that has access to memory and does a RAG search or something on every turn, politely inserting into the stream something like “hey nudge, remember this? Ask me for more if you’re interested” into the context for next time?

We run 30+ agents and hit this exact wall. Flat memory = agents rediscovering the same patterns every session. Our fix: tiered memory (session/agent-specific/collective knowledge) with search-before-act protocol. The depth vs breadth tradeoff is real -- hierarchical is the right call.