Concurrency is missing as a concept from both the language syntax and this particular library. That is by design, so that it does not distract from beautiful code. While it waits for the LLM answer, your request makes zero progress and still ties up a thread and its memory. Other threads might make progress on other requests, but in real-world deployments there will only be a handful of them (<10). This server will handle tens of requests per second where something written in JS or Go would handle many thousands.
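To put rough numbers on that, here is a back-of-the-envelope sketch in Ruby. The 8 threads, the 0.5 s LLM latency and the 10,000 in-flight requests are figures I assumed for illustration, not measurements, and `max_requests_per_second` is a made-up helper, not part of any library.

```ruby
# Rough upper bound on throughput when each in-flight request spends
# its whole lifetime waiting on the LLM call.
# Hypothetical helper, for illustration only.
def max_requests_per_second(in_flight:, seconds_per_request:)
  in_flight / seconds_per_request.to_f
end

# Assumed: 8 blocking threads, each stuck for 0.5 s per LLM call.
blocking = max_requests_per_second(in_flight: 8, seconds_per_request: 0.5)

# Assumed: a runtime that can keep 10,000 requests waiting at once.
evented = max_requests_per_second(in_flight: 10_000, seconds_per_request: 0.5)

puts "blocking threads: #{blocking} req/s"
puts "cheap waiting:    #{evented} req/s"
```

The only thing that changes between the two lines is how many requests can sit waiting at the same time, which is exactly the part a small thread pool caps.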
It’s amazing how the Ruby community argues against its own docs and doesn’t acknowledge the design choices the language’s creators have made.