clamtech.org?dest=intel_u... - Sharing some thoughts on Intel's possible unified core project. Basically, I think the easiest route is a Zen 4c/5c style shrink of their P-Core. But of course, Intel has more options than that
Intel's desktop Arrow Lake always keeps the SNCU (die-to-die interface and some other parts of the uncore) at 2.6 GHz. On Meteor Lake, it goes up to 2.4 GHz but varies a lot, probably to save power.
So far I've used a simple a=a[a] pattern to test GPU memory latency, but that indexed addressing penalty always bothered me. I finally got around to making the compiler spit out a chain of dependent loads and nothing else. Good start on AMD: I save ~4 ns on scalar accesses and ~12 ns on vector accesses.
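For readers unfamiliar with the two patterns, here is a minimal CPU-side sketch in plain C, not the OpenCL kernels from the post. It assumes a random single-cycle permutation as the access pattern, and all function names are hypothetical. The indexed form has to compute base + index * element size before every load, while the pointer form stores the next address directly, so each step is only a load. Whether a GPU compiler actually emits address-math-free loads for the second form is not guaranteed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Fill idx[0..n-1] with one random cycle (Sattolo's algorithm), so a chase
   starting anywhere visits every element before it repeats. */
static void build_cycle(uint32_t *idx, size_t n, unsigned seed) {
    assert(n > 1);
    for (size_t i = 0; i < n; i++) idx[i] = (uint32_t)i;
    srand(seed);
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;      /* j is in [0, i-1] */
        uint32_t t = idx[i];
        idx[i] = idx[j];
        idx[j] = t;
    }
}

/* Indexed form, a = a[a]: every step does address math, then a load. */
static uint32_t chase_indexed(const uint32_t *idx, size_t steps) {
    uint32_t a = 0;
    for (size_t i = 0; i < steps; i++) a = idx[a];
    return a;
}

/* Pointer form: slot i holds the address of slot idx[i]. */
static void build_pointer_chain(void **slots, const uint32_t *idx, size_t n) {
    for (size_t i = 0; i < n; i++) slots[i] = &slots[idx[i]];
}

/* Each step is a dependent load and nothing else. */
static void **chase_pointers(void **start, size_t steps) {
    void **p = start;
    for (size_t i = 0; i < steps; i++) p = (void **)*p;
    return p;
}
```

A latency measurement would time one of the chase loops over a large step count and divide by the number of steps; the timing harness is left out here.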
Intel's Arrow Lake is impressively efficient when running throughput-bound stuff on 16 underclocked E-Cores. I'm running all Geekbench 6 workloads in parallel through Intel SDE (so emulated, to get exact instruction counts), and this chip is getting over 3.7G instructions/watt.
Sharing a piece I wrote a while ago on Zen 1. I mostly did this to test my site design, with plenty of pagination, tables, and captioned images. It's a pretty complete article by itself though, and I hope y'all find it a fun read! clamtech.org?dest=zen1
I wanted to bring in RDNA4 since I have an example of that card now, but never found the time. That's stuck on the back of a long todo list :/
Sharing another piece I wrote last year, comparing hardware AV1 encoding on Intel's Arc B580 and AMD's Hawk Point: clamtech.org?dest=av1hwenc It should be a good test of image handling on the site, with sliders for quality comparisons.
Time for a little site with some multi-page support! I plan to write random thoughts on hardware there. To start, here's some commentary on drilling down GPU cache latency using very funny OpenCL kernels: clamtech.org?dest=gpudire...
clamtech.org?dest=gpuwrite Here's a look at GPU cache/memory write bandwidth across a variety of hardware
Looking good on Intel too, improving measured latency by ~6.8 ns, or 19-20 cycles
Clamchowder