Apple paid Google ~$1B/yr to license memory for Siri. OpenAI rebuilt ChatGPT memory in place. Anthropic gave models an API to consolidate their own. All called "memory." None is what users mean. @jimbobbennett.dev wrote a field map: arize.com/blog/memory...
Do you have an AI agent? Do you actually know what it is doing? Do you know if it works?
Typically the answer to the first question is yes, and to the second it's "we think so", based on 'vibes'. That's a terrible way to build and run production software. 1/2
Will you be at Microsoft Build this week, either in person in SF or virtually?
Our very own @jimbobbennett.dev will be giving a demo session on understanding and fixing agents with open source observability and evals, Wednesday 3:30pm, Theater C.
#MSBuild
build.microsoft.com/en-US/sessi...
"I genuinely don't care. Pick one."
That was my contribution to a meeting last week where the team was debating two tools. And it was the most useful thing I said all day.
"Strong opinions, loosely held" is the "approved" take. I think it's mostly nonsense.
My version is: strong opinions strongly held, loose opinions loosely held.
If I formed an opinion on data and 20 years of scars, it shouldn't flip on a clever argument over coffee. And if I haven't done the work to form one, "I don't care, just pick one" is a perfectly honest answer.
The recording of my #MSBuild session is now live, where you can learn how to understand and fix your agents using open source tools like Arize Phoenix.
build.microsoft.com/en-US/sessio...
2/2
Phoenix now lets you compose evaluation strategies in code.
Most eval tooling hands you a fixed menu of judge templates. Real evaluation is rarely that tidy.
Jim Bennett
Arize AI
Microsoft picked OpenInference. Twice.
The open trust stack for AI agents announced at #MSBuild includes ASSERT for evaluation and ACS for controls. Both ride on the open tracing standard Arize built for agents.
arize.com/blog/micros...
At Microsoft Build? Our 2 must-do things for today:
1. Catch Sarah Bird's session - Observe and control agents with OSS tools
build.microsoft.com/en-US/sessi...
2. Head to the Microsoft AI expert booth to meet @jimbobbennett.dev from our devrel team and talk about AI observability and evals
#MSBuild