8 Test it: raw curl (two modes)
Non-streaming: wait for the full answer (thinking off = short & fast):
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
"messages":[{"role":"user","content":"Explain Nvidia Jetson in 2 sentences."}],
"max_tokens":150, "chat_template_kwargs":{"enable_thinking":false} }'
Streaming: tokens appear live (add -N and "stream":true):
curl -N http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
"messages":[{"role":"user","content":"Explain Nvidia Jetson in 2 sentences."}],
"max_tokens":150, "stream":true, "chat_template_kwargs":{"enable_thinking":false} }'
Full curl guide (SSE format, options): HTTP API docs
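The streaming call above prints raw SSE events rather than plain text. As a minimal sketch, the helper below reduces such a stream to one JSON chunk per line. It assumes the server uses OpenAI-style framing, where each event is a line of the form data: <json> and the stream ends with data: [DONE]; that framing is an assumption here, not something this section confirms. The name sse_to_json is hypothetical and not part of the server or of curl.

```shell
# Hypothetical helper (not part of the server or curl): reads an SSE stream
# on stdin and prints one JSON chunk per line.
# Assumption: OpenAI-style framing, i.e. "data: <json>" events terminated
# by a final "data: [DONE]" event.
sse_to_json() {
  # Drop carriage returns first, in case the server sends CRLF line endings.
  tr -d '\r' | while IFS= read -r line; do
    case "$line" in
      "data: [DONE]") break ;;                           # end-of-stream sentinel
      "data: "*)      printf '%s\n' "${line#data: }" ;;  # strip the SSE field name
      *)              ;;                                 # skip blank separators and comments
    esac
  done
}

# Usage sketch (needs the server from this section to be running):
#   curl -sN http://localhost:8080/v1/chat/completions \
#     -H "Content-Type: application/json" \
#     -d '{"messages":[{"role":"user","content":"Hi"}],"stream":true}' | sse_to_json
```

Each printed line can then be passed to whatever JSON tool is available to pull out the token text. Stopping at the [DONE] sentinel keeps the helper from waiting on a connection the server has already finished with.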