If you're doing any performance work on GPUs, this is huge. PC sampling is a powerful tool for GPU optimization, but it forces kernels to execute serially.
We've released production-ready continuous PC sampling so you can get GPU performance insights on your production workloads.
Alfonso Subiotto
🚀 Launching today: GPU PC sampling in production. Instruction-level profiling that tells you where and why warps stall, whether on memory latency, scoreboard waits, or barriers. 🧊
Previously Nsight-only territory. Now running in prod with low overhead.
www.polarsignals.com/blog/posts/2...