Apple paid Google ~$1B/yr to license memory for Siri. OpenAI rebuilt ChatGPT memory in place. Anthropic gave models an API to consolidate their own. All called "memory." None is what users mean. @jimbobbennett.dev wrote a field map: arize.com/blog/memory...

Do you have an AI agent? Do you actually know what it is doing? Do you know if it works? Typically the answer to the first question is yes, and to the second it's "we think so", based on 'vibes'. That is a terrible way to build and run production software. 1/2

Will you be at Microsoft Build this week, either in person in SF or virtually? Our very own @jimbobbennett.dev will be giving a demo session on understanding and fixing agents with open source observability and evals, Wednesday 3:30pm, Theater C. #MSBuild build.microsoft.com/en-US/sessi...

16d

"I genuinely don't care. Pick one." That was my contribution to a meeting last week where the team was debating two tools. And it was the most useful thing I said all day. "Strong opinions, loosely held" is the "approved" take. I think it's mostly nonsense.

New post on:
→ Why "I don't care" is an opinion (and a useful one)
→ Why the cliché needs to die
→ Why my C# opinion is built on a foundation, but my framework opinion is built on a Tuesday
→ The most important scene in the original Star Wars trilogy (fight me)
www.linkedin.com/pulse/strong...

My version is: strong opinions strongly held, loose opinions loosely held. If I formed an opinion on data and 20 years of scars, it shouldn't flip on a clever argument over coffee. And if I haven't done the work to form one, "I don't care, just pick one" is a perfectly honest answer.

The recording of my #MSBuild session is now live, where you can learn how to understand and fix your agents using open source tools like Arize Phoenix. build.microsoft.com/en-US/sessio... 2/2

Phoenix now lets you compose evaluation strategies in code. Most eval tooling hands you a fixed menu of judge templates. Real evaluation is rarely that tidy.

27d

Jim Bennett

Arize AI

14d

Microsoft picked OpenInference. Twice. The open trust stack for AI agents announced at #MSBuild has two parts, ASSERT for evaluation and ACS for controls, and both ride on the open tracing standard Arize built for agents. arize.com/blog/micros...

At Microsoft Build? Our 2 must-do things for today:
1. Catch Sarah Bird's session, "Observe and control agents with OSS tools": build.microsoft.com/en-US/sessi...
2. Head to the Microsoft AI expert booth to meet @jimbobbennett.dev from our devrel team and talk about AI observability and evals.
#MSBuild

15d