Service Certificate – STACKIT Model Serving
Service Name
STACKIT Model Serving
High level service description
STACKIT Model Serving (“Model Serving”) provides open-source Large-Language-Models (“LLM”) and other GenAI-Models as shared instances. Customers can use shared instances via an OpenAI-compatible REST API. Chat and embedding models are provided. An API key is used for authentication. When using the Model Serving Service, STACKIT does not collect or evaluate any customer data other than billing-relevant data.
Key Features
- State-of-the-art open-source LLMs
- Chat & embedding-models
- GDPR-compliant service
- Usage-based billing according to tokens used
- OpenAI-compatible interface
- Easy to use via API Key
Service Plans
Each model provided is assigned to a service plan. The service plans are assigned to the categories Base, Plus or Premium according to ascending model size. The assignment is described in the STACKIT portal and in the STACKIT documentation.
Metric
Billing for Model Serving is token-based based on the type of model:
- For chat models, according to the number of tokens used (both the input tokens [sum of the tokens in the request] and the output tokens [sum of the tokens generated by the LLM]) of a service plan, whereby each model is assigned to a service plan. Information on estimating the number of tokens in a request can be found in the respective model descriptions (model cards) within the STACKIT documentation. The price stated in the general STACKIT price list applies per up to 1 million tokens used.
- For embedding models, only input tokens are charged. The price stated in the general STACKIT price list applies per up to 1 million tokens used.
- The respective model type is shown in the STACKIT documentation and in the STACKIT Cloud Portal. The customer determines which model type is used as part of the API selection to their application.
SLA Specifics
In deviation from the availability specifications in the general STACKIT Service Description, an availability of 99.5% per calendar month is agreed (measured by the external availability of the LLM API).
Backup
Customer requests are not backed up.
Additional Terms
- When using the model selected by the customer, the customer undertakes to comply with the license conditions applicable to the respective model, which can be viewed in the STACKIT documentation.
- Model deprecation process In addition to the general STACKIT Cloud Terms of Use and the general STACKIT service description, models can be terminated by STACKIT with a notice period of 6 months. If a deprecated model version is followed by the release of a direct successor model, deprecated model versions can be discontinued by STACKIT with a lead time of 3 months and replaced by the successor model.
- STACKIT additionally points out that the customer must comply with the relevant legal terms for AI applications created by the customer.
Version and start of validity
Version 1.0, valid from 04.02.2025