GPU LLM inference on Apple Silicon: a single Java 25 executable JAR, zero dependencies. Binds Metal-enabled libllama.dylib through the Foreign Function & Memory API. Runs Mistral and Gemma GGUF models locally.
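To illustrate the binding style described above, here is a minimal sketch of a Foreign Function & Memory API downcall, assuming Java 22 or later. It calls the C standard library's `strlen` as a stand-in, because the actual libllama symbols and signatures that lightmetal binds are not listed here. The class name `FfmSketch` and method `nativeStrlen` are hypothetical and not taken from the project.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

// Hypothetical illustration class; not part of lightmetal.
public class FfmSketch {

    // Calls the C function `size_t strlen(const char*)` through an FFM downcall handle.
    static long nativeStrlen(String text) throws Throwable {
        Linker linker = Linker.nativeLinker();

        // defaultLookup() exposes common C library symbols. A native library such as
        // libllama.dylib would instead be opened with
        // SymbolLookup.libraryLookup(pathToLibrary, arena); the path is an assumption.
        MemorySegment strlenAddress = linker.defaultLookup().find("strlen").orElseThrow();

        // Describe the native signature: returns a 64-bit integer, takes one pointer.
        MethodHandle strlen = linker.downcallHandle(
                strlenAddress,
                FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));

        // The confined arena frees the native string when the block exits.
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment cString = arena.allocateFrom(text); // NUL-terminated UTF-8 copy
            return (long) strlen.invokeExact(cString);
        }
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(nativeStrlen("lightmetal"));
    }
}
```

The same three steps (look up a symbol, describe its signature, invoke the handle) would apply to functions exported by libllama.dylib. Recent JDKs may print a warning about restricted native access unless the program is started with `--enable-native-access`.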
Anthropic-compatible /v1/messages
OpenAI-compatible /v1/chat/completions
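A client could target either endpoint with the JDK's built-in `java.net.http` types, as sketched below. The base URL `http://localhost:8080`, the model name `local-model`, and the class and method names are assumptions for illustration; the request bodies follow the general shape of the public Anthropic Messages and OpenAI Chat Completions formats, and which fields lightmetal actually honors is not stated here.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical illustration class; not part of lightmetal.
public class EndpointSketch {

    // Assumption: host and port are placeholders, not documented values.
    static final String BASE_URL = "http://localhost:8080";

    // Minimal JSON string escaping: handles backslashes and double quotes only.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Body in the OpenAI chat-completions shape: a model plus a list of messages.
    static String openAiBody(String model, String prompt) {
        return "{\"model\":\"" + escape(model) + "\","
                + "\"messages\":[{\"role\":\"user\",\"content\":\"" + escape(prompt) + "\"}]}";
    }

    // Body in the Anthropic messages shape, which additionally carries max_tokens.
    static String anthropicBody(String model, String prompt, int maxTokens) {
        return "{\"model\":\"" + escape(model) + "\","
                + "\"max_tokens\":" + maxTokens + ","
                + "\"messages\":[{\"role\":\"user\",\"content\":\"" + escape(prompt) + "\"}]}";
    }

    // Builds a JSON POST request for the given path; nothing is sent here.
    static HttpRequest post(String path, String json) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + path))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest openAi = post("/v1/chat/completions", openAiBody("local-model", "Hello"));
        HttpRequest anthropic = post("/v1/messages", anthropicBody("local-model", "Hello", 64));
        System.out.println(openAi.method() + " " + openAi.uri());
        System.out.println(anthropic.method() + " " + anthropic.uri());
    }
}
```

With a server running, either request could be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.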
Apple Silicon MLX with zero-dependency Java. Source: AdamBien/lightmetal on GitHub.