> ## Documentation Index
> Fetch the complete documentation index at: https://docs.literalai.com/llms.txt
> Use this file to discover all available pages before exploring further.

# LLM Inference servers (vLLM, etc.)

By integrating the OpenAI SDK with Literal AI's instrumentation, you can also effectively monitor **message-based inference servers such as LMStudio, vLLM or HuggingFace**, ensuring that you have full visibility into the performance and usage of your AI models.

<CodeGroup>
  ```python lmstudio.py theme={null}
  from literalai import LiteralClient
  lc = LiteralClient()
  lc.instrument_openai()
  # Example: reuse your existing OpenAI setup
  from openai import OpenAI

  # Point to the local server
  client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

  completion = literalai_client.chat.completions.create(
    model="TheBloke/Mistral-7B-Instruct-v0.2-GGUF/mistral-7b-instruct-v0.2.Q4_K_S.gguf",
    messages=[
      {"role": "system", "content": "Always answer in rhymes."},
      {"role": "user", "content": "Introduce yourself."}
    ],
    temperature=0.7,
  )

  print(completion.choices[0].message)
  ```

  ```typescript example.ts theme={null}
  import { LiteralClient } from '@literalai/client';
  import OpenAI from 'openai';

  const literalAiClient = new LiteralClient();

  const openai = new OpenAI({
    baseURL: "http://localhost:1234/v1"
  });

  literalAiClient.instrumentation.openai();

  async function createAndInstrumentOpenAIStream() {
    const stream = await openai.chat.completions.create({
      model: "TheBloke/Mistral-7B-Instruct-v0.2-GGUF/mistral-7b-instruct-v0.2.Q4_K_S.gguf",
      stream: true,
      messages: [
        { role: 'system', content: 'Always answer in rhymes.' },
        { role: 'user', content: 'Introduce yourself.' }
      ],
      temperature: 0.7,
    });
  }

  createAndInstrumentOpenAIStream();
  ```
</CodeGroup>

The same works for HuggingFace [messages API](https://huggingface.co/docs/text-generation-inference/en/messages_api) with

```py theme={null}
base_url="https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.2/v1"
```