Hacker News | crowwork's comments

The goal of the project is to bring open ABI and FFI for machine learning systems.

- Stable, minimal C ABI designed for kernels, DSLs, and runtime extensibility.
- Zero-copy interop across PyTorch, JAX, and CuPy using the DLPack protocol.
- Compact value and calling convention covering common data types for ultra-low-overhead ML applications.
- Multi-language support out of the box: Python, C++, and Rust (with a path toward more languages).
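The zero-copy DLPack interop mentioned above can be sketched with NumPy alone, since NumPy implements both the producer side (`__dlpack__`) and the consumer side (`from_dlpack`) of the protocol; the same pattern carries over to PyTorch, JAX, and CuPy, which expose equivalent entry points:

```python
import numpy as np

# Producer: a NumPy array that exposes the DLPack protocol via __dlpack__.
a = np.arange(6, dtype=np.float32)

# Consumer: reconstruct a view through DLPack. No copy is made;
# both arrays share the same underlying buffer.
b = np.from_dlpack(a)

a[0] = 42.0
assert b[0] == 42.0  # the mutation is visible through the DLPack view
```

Because only a small capsule describing the buffer (pointer, shape, strides, dtype, device) crosses the boundary, the exchange cost stays constant regardless of tensor size.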


Scale LLM serving with programmable cross-engine serving patterns, all in a few lines of Python.


XGrammar is an open-source library for efficient, flexible, and portable structured generation. It brings a 2x-10x speedup to grammar-guided (JSON and CFG) LLM serving.


It comes with the ability to do full structured generation with JSON Schema.
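The core idea behind grammar-guided generation can be sketched independently of XGrammar's actual API (the toy vocabulary and helper names below are illustrative, not the library's interface): at each decoding step, tokens the grammar disallows have their logits masked to negative infinity, so the sampler can only pick grammar-valid continuations:

```python
import math

def mask_logits(logits, allowed_ids):
    # Tokens outside the grammar's allowed set get -inf,
    # so softmax assigns them zero probability.
    return [x if i in allowed_ids else -math.inf
            for i, x in enumerate(logits)]

def greedy_pick(logits):
    # Greedy decoding: choose the highest-scoring token.
    return max(range(len(logits)), key=lambda i: logits[i])

# Toy vocabulary: 0='{', 1='}', 2='"key"', 3=':', 4='"val"'
# A JSON-object grammar permits only '{' as the first token.
logits = [0.1, 2.0, 1.5, 0.3, 0.7]
masked = mask_logits(logits, allowed_ids={0})
assert greedy_pick(masked) == 0  # forced to open the object
```

The speedup claims come from computing these masks efficiently; a naive per-step scan over the whole vocabulary, as above, is exactly the overhead such libraries are designed to avoid.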

There is also an in-browser demo: https://chat.webllm.ai/


It runs Qwen2 on iPhone at 26 tok/sec, with an OpenAI-style Swift API.


A 2B model running at 20 tok/sec on iPhone; nice potential for future applications.


It runs Phi-2 on a Samsung S23 at pretty decent speed in the Google Chrome browser.

An LLM in the browser, on a phone.


You can also try out the Vulkan backend, which we know should work on Windows, although speed might be slower than ROCm.


Yes, it works out of the box, and the blog contains a prebuilt Python package that you can try out.


There is also Vulkan support, which should be more universal (also covered in the post); for example, the post shows running an LLM on a Steam Deck APU.

