Modern AI tackles different questions than data science applied to industry or scientific problems.
Statistical thinking, front and center in data science, is often hidden in AI. But it’s just as crucial. And too often, we treat data science or AI as merely a programming exercise.
In this high-level yet detailed blog post, I expand on the statistics that hold together AI and data science:
blog.probabl.ai/data-science...
- It is now possible to pass arguments to the scorers in Data Ops, such as sample weights.
- Diagrams for the Learner and parameter searches now include the full DataOp graph in their notebook repr.
- It is now possible to find nodes by name in the DataOp graph.
- The TableReport now uses plot_distributions and compute_associations to control the distribution and association tabs respectively.
- The Cleaner now lets you control whether numeric-looking strings (e.g. ['1', '2', '3']) should be parsed to float.
- fuzzy_join and Joiner now let you choose the metric used for matching.
- ApplyToCols now has the exclude_cols parameter, to define which columns should not be transformed.
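To make the first item in the list above concrete, here is a minimal plain-Python sketch of what a sample weight does inside a scoring function. It does not use skrub or show its API: the function name `weighted_accuracy` and the toy data are hypothetical, chosen only for illustration.

```python
# Illustrative only: a hand-written weighted accuracy, NOT skrub's API.
# `weighted_accuracy` is a hypothetical helper name.

def weighted_accuracy(y_true, y_pred, sample_weight=None):
    """Fraction of total weight carried by correctly predicted samples."""
    if sample_weight is None:
        # Without weights, every sample counts equally.
        sample_weight = [1.0] * len(y_true)
    total = sum(sample_weight)
    correct = sum(
        w for t, p, w in zip(y_true, y_pred, sample_weight) if t == p
    )
    return correct / total


y_true = [1, 0, 1, 1]
y_pred = [1, 0, 0, 1]  # the third prediction is wrong

unweighted = weighted_accuracy(y_true, y_pred)
# Doubling the weight of the misclassified sample lowers the score.
weighted = weighted_accuracy(y_true, y_pred, sample_weight=[1, 1, 2, 1])
```

With equal weights, three of four samples are correct; giving the misclassified sample twice the weight means it accounts for two of five weight units, so the score drops.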
Gaël Varoquaux
The CfP deadline for Compute! Paris 2026 was extended to Sunday, June 7! Just a few days left to submit a proposal on Open Source scientific compute, data science, ML & AI topics.
Conference dates and venue: November 25–26, 2026, Sorbonne Université · Paris
compute.events/paris2026/cf...
✨ Skrub version 0.9.0 has been released ✨
This release adds some advanced features to the Data Ops, the has_dtype() selector, as well as some clarity improvements for the Cleaner and TableReport.
Release post:
github.com/skrub-data/s...
FAQ on NeurIPS Europe:
NeurIPS Europe is an official NeurIPS 2026 satellite event taking place in Paris, France, alongside the main conference in Sydney and the other satellite event in Atlanta.
NeurIPS authors can present their papers at any of the three locations, subject to space availability.
Skrub
Olivier Grisel
Gaël Varoquaux explains why and how statistical thinking is the heart of AI.