Profile
by @danabra.mov
Profile
by @dansshadow.bsky.social
Profile
by @jimpick.com
AviHandle
by @danabra.mov
AviHandle
by @dansshadow.bsky.social
AviHandle
by @katherine.computer
EventsList
by @katherine.computer
ProfileHeader
by @dansshadow.bsky.social
ProfileHeader
by @danabra.mov
ProfileMedia
by @danabra.mov
ProfilePlays
by @danabra.mov
ProfilePosts
by @danabra.mov
ProfilePosts
by @dansshadow.bsky.social
ProfileReplies
by @danabra.mov
Record
by @atsui.org
Skircle
by @danabra.mov
StreamPlacePlaylist
by @katherine.computer
Profile
A new method for quantized matrix multiplication in large language models enhances weight-only quantization with waterfilling techniques. This boosts efficiency in AI computations, reducing distortion while preserving performance in neural networks. https://arxiv.org/abs/2605.13768
Introducing Trajectory-Refined Distillation (TRD), a novel method that boosts large language models by addressing "prefix failure" in on-policy distillation. TRD refines student trajectories under teacher guidance, enhancing accuracy and reasoning coverage. https://arxiv.org/abs/2606.08432
The innovative DiScO framework refines large reasoning models by fostering diverse thinking schemata, leading to notable enhancements in problem-solving and error recovery for complex math tasks. https://arxiv.org/abs/2606.08974
A new GPU-accelerated algorithm reduces banded matrices to bidiagonal form, achieving up to 800× speed-up over CPU libraries. Using modern GPU architectures and memory-aware designs, this solution enables efficient linear algebra in scientific computing and AI. https://arxiv.org/abs/2510.12705
LexRubric is a new benchmark for evaluating open-ended legal tasks in Chinese, featuring over 12,000 expert criteria. It responds to the demand for reliable legal AI and demonstrates language models' varying capacities and limitations in resolving complex legal queries. https://arxiv.org/abs/2606.09389
PerspectiveGap shows that LLMs struggle to write role-specific prompts for multi-agent orchestration, with an average pass rate of just 14.9%. Spanning 110 real-world scenarios, the study signals a gap in model capabilities that matters for multi-agent system design. https://arxiv.org/abs/2606.08878
AI Firehose
Researchers created a method to analyze encrypted smartphone network traffic, revealing insights into stress, sleep disturbance, and loneliness. Their model captures behavior patterns, suggesting that encrypted data can support privacy-preserving mental health monitoring. https://arxiv.org/abs/2605.01616
A study introduces ClinicalBench, a benchmark showing that Large Language Models (LLMs) excel in medical knowledge but lag behind traditional machine learning models in clinical prediction. Researchers urge caution about LLM adoption in clinical environments due to reasoning gaps. https://arxiv.org/abs/2411.06469
Researchers developed a dual-encoder framework that separates intrinsic signals of celestial objects from sensor artifacts. Using counterfactual generation on overlapping galaxy images enhances astrophysical insights and comparisons across instruments. https://arxiv.org/abs/2604.09787
New research presents PerspectiveGap, a benchmark for assessing LLMs’ ability to create prompts for multi-agent systems. Findings indicate GPT-5.5 surpasses rivals but expose issues in orchestration prompting, stressing the need for improved AI communication. https://arxiv.org/abs/2606.08878
Learning What's Real: Disentangling Signal and Measurement Artifacts in Multi-Sensor Data, with Applications to Astrophysics
High-Rate Quantized Matrix Multiplication II
PerspectiveGap: A Benchmark for Multi-Agent Orchestration Prompting
Trajectory-Refined Distillation
Accelerating Bidiagonalization of Banded Matrices through Memory-Aware Bulge-Chasing on GPUs
Diverse Thinking Schemata Elicit Better Reasoning in Large Language Models
LexRubric: A Rubric-Guided Diagnostic Benchmark for Open-Ended Legal Tasks
ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction?
Learning Behavioral Signals from Encrypted Smartphone Network Traffic