Professor ETHZ, head of SPCL, Chief Architect ML at CSCS researching large-scale #HPC and #AI systems and #Climate computing - youtube: http://bit.ly/3h1VgIU
Torsten Hoefler 🇨🇭
Just arrived at SciFM in Chicago. First thing I hear from the panel is great: "Agents are the new programming. They form their own language and system stack. And we computer scientists should contribute the algorithms."
💯, we also need to develop the best architectures! #HPC
Jack Dongarra's opening keynote at HACI 2026!
#HPC in transition - the investments in compute infrastructure are unprecedented but focus mostly on #AI workloads. Scientific numerical simulations need to adjust!
Many interesting challenges ahead.
Mateo Valero discusses the history of vector architectures: Back to the Future in #HPC and #AI.
He also mentions the three quantum computers connected to an NVIDIA Blackwell system in BSC's chapel ⛪️.
Ending his talk looking for a future site 😀.
After a nice early morning run, #SciFM26 in Chicago continues.
Rafael shows how many physics models actually learn the same representations in their latent space if you compare proximity of embeddings.
Next: my panel on the disappointment of physical biases and priors in climate foundation models.
Yutong Lu reveals the LineShine exascale architecture for #HPC - #AI convergence: 2 exaflops sustained fp64 at 40 MW with 1.34 million ARM CPUs using SME and 4 TB/s HBM, connected with 800G.
From "offline to online acceleration with SME".
The future is here on my way back from Salishan. My air office is a bit narrow but with both faster Internet than at home and a better view.
Very well done Starlink and United! Now, Swiss and Lufthansa - time to catch up :-).
Read how we designed and deployed Multipath Reliable Connection (MRC), a multi-company step towards Ultra Ethernet, at 800G in Microsoft's datacenters.
Demonstrated with a 75k GPU pretraining job running stably through multiple faults.
OpenAI blog: buff.ly/X0KL4Xy
Paper: buff.ly/I3OcXWn
Taisuke Boku follows on another interesting note: Japan is "changing lanes" from CPU to GPU and from "Made in Japan" to a collaborative "Made with Japan", partnering Fujitsu with NVIDIA.
"We don't chase fp64 performance but focus on #AI flops".
Onward!
Big announcements at MS Build 💪: Maia 200 now online in Iowa and Arizona, going international soon - at 30% better TCO than the best GPUs! Big success for our SDLA architecture. 🚀
Running GPT-5.5/copilot and MAI's thinking model achieves 1.4x perf/W vs. GB200! 👏
buff.ly/WuGgZm4 (22:40 and 1:55:17)
OpenAI introduces MRC (Multipath Reliable Connection), a new supercomputer networking protocol released via OCP to improve resilience and performance in large-scale AI training clusters.
The SciFM conversation is shifting from models to systems. The interesting questions this year were increasingly about how to connect agents, instruments, data, simulations, and compute into effective discovery ecosystems.
cs.uchicago.edu/news/scifm-2... @ramanathanlab.bsky.social
Walking onto the University of Chicago campus this May, visitors to SciFM 2026 could sense the electric anticipation. As the third installment of the Scientific Foundation Models conference series unf...