Had an AWESOME conversation with the @allthingsopen.bsky.social community about local LLMs on small hardware: model compression can quantize a model from 220 GB → 55 GB with <1% accuracy loss, and inference engines like vLLM help run them fast and efficiently. 🎥 www.youtube.com/watch?v=xGqV...
www.youtube.com
YouTube video by All Things Open
Local LLMs are about to change everything – here's why quantization matters
1mo
Cedric Clyburn
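
The 220 GB → 55 GB figure in the post is a 4x reduction. The post does not say which precisions are involved; one combination that would produce exactly that ratio is going from 16-bit to 4-bit weights, and that is purely an assumption here. A minimal sketch of the arithmetic, with a hypothetical helper `quantized_size_gb` that counts weights only and ignores any per-block scale or metadata overhead:

```python
def quantized_size_gb(size_gb: float, bits_before: int, bits_after: int) -> float:
    """Scale a weights-only model size by the ratio of bits per weight.

    Hypothetical helper for illustration: real quantized checkpoints also
    store scales and other metadata, so actual sizes come out a bit larger.
    """
    return size_gb * bits_after / bits_before


original_gb = 220.0  # size quoted in the post
# Assumption (not stated in the post): 16-bit weights quantized to 4-bit.
compressed_gb = quantized_size_gb(original_gb, bits_before=16, bits_after=4)
print(f"{original_gb:.0f} GB -> {compressed_gb:.0f} GB")  # prints "220 GB -> 55 GB"
```

Other precision pairs with the same 4:1 ratio would give the same result, so this only shows that the quoted numbers are consistent with a 4x cut in bits per weight.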