Hello all,
TL;DR: If you have a deployment of an Azure OpenAI model, which path do you call to get chat completions?
We have a deployment visible in AI Launchpad with status Running. Its ID is dd5….b5.
It is tied to a configuration that uses an Azure OpenAI model (we intend to use gpt-3.5-turbo).
In Postman, we can make requests to endpoints such as /lm/deployments and get information about the deployment, which shows that authentication to the API is working. Example:
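For completeness, the working call corresponds roughly to the following (sketched in Python rather than Postman; the base URL, token and resource group are placeholders from our setup, not real values):

```python
import requests

# Placeholders taken from our AI Core service key / setup (not real values)
AI_API_URL = "https://api.ai.<region>.ml.hana.ondemand.com"
TOKEN = "<OAuth token obtained via the service key's uaa url>"
RESOURCE_GROUP = "default"

headers = {
    "Authorization": f"Bearer {TOKEN}",
    "AI-Resource-Group": RESOURCE_GROUP,
}

# This request succeeds and lists our deployment with status RUNNING
resp = requests.get(f"{AI_API_URL}/v2/lm/deployments", headers=headers)
print(resp.status_code, resp.json())
```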
Now, we’d like to call the endpoint for chat completions, i.e. pass a prompt and get a completion back.
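The body we intend to send is the standard Azure OpenAI chat-completions payload (assuming that is what the proxy expects; the prompt text is just an example):

```python
# Assumed request body: standard Azure OpenAI chat-completions format.
# The model is fixed by the deployment, so we don't pass a model name here.
payload = {
    "messages": [
        {"role": "user", "content": "Hello, can you hear me?"}
    ],
    "max_tokens": 100,
}
```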
Examples online suggest that completions are available under /v2/inference/deployments/<deploymentid>/v2/completions but this fails with a 404:
We found conflicting information online and decided to try variations of the different paths; they always return either a 404 or a 403. (A representative request is sketched after the lists below.)
Paths tested (all return 404):
/v2/inference/deployments/dd5…b5/
/v2/inference/deployments/dd5…b5/v2/completions
/v2/inference/deployments/dd5…b5/v2/completion
/v2/inference/deployments/dd5…b5/v2/chat
/v2/inference/deployments/dd5…b5/v2/chat-completion
/v2/inference/deployments/dd5…b5/v2/query
/v2/inference/deployments/dd5…b5/chat-completion
/v2/inference/deployments/dd5…b5/chat
/v2/inference/deployments/dd5…b5/completions
/v2/inference/deployments/dd5…b5/query
These paths also fail, but with 403 Forbidden ("RBAC: access denied"):
/v2/lm/deployments/dd5…b5/v2/query
/v2/lm/deployments/dd5…b5/v2/completions
/v2/lm/deployments/dd5…b5/v2/chat-completions
…/chat/completions?api-version=v2
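This is roughly how each variation is tested (same placeholder headers as in the first sketch, and the payload shown above):

```python
import requests

# Same placeholders and payload as in the sketches above
AI_API_URL = "https://api.ai.<region>.ml.hana.ondemand.com"
headers = {
    "Authorization": "Bearer <token>",
    "AI-Resource-Group": "default",
}
payload = {"messages": [{"role": "user", "content": "Hello, can you hear me?"}]}

# A few of the variations listed above (deployment ID shortened as in this post)
candidate_paths = [
    "/v2/inference/deployments/dd5…b5/completions",
    "/v2/inference/deployments/dd5…b5/v2/chat-completion",
    "/v2/lm/deployments/dd5…b5/v2/chat-completions",
]

for path in candidate_paths:
    resp = requests.post(f"{AI_API_URL}{path}", headers=headers, json=payload)
    print(path, resp.status_code)  # always 404 or 403, never 200
```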
References and examples from the web:
This example suggests it should be /v2/inference/deployments/<YOUR_AICORE_DEPLOYMENT_ID>/v2/chat-completion.
This example suggests it’s either /v2/completion or /v2/chat-completion at the end.
The CAP LLM plugin sample suggests it’s /chat/completions with an api-version parameter whose value is undocumented.
This person is similarly confused about which path to use.
Question
Does anyone know the correct path? Any help appreciated.