Llama Index
The Llama Index integration enables you to monitor your RAG pipelines with a single line of code:
literalai_client.instrument_llamaindex()
The Llama Index integration in the Python SDK is compatible with Llama Index starting with version 0.10.58.
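For completeness, a minimal setup might look like the following sketch, assuming your API key is exposed through the `LITERAL_API_KEY` environment variable:

```python
from literalai import LiteralClient

# The client picks up LITERAL_API_KEY from the environment
literalai_client = LiteralClient()

# Instrument LlamaIndex so threads, runs and generations are logged automatically
literalai_client.instrument_llamaindex()
```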
LlamaIndex offers a variety of concepts to interact with LLMs:
- Query Engines
- LLMs
- Agents
We explain what each concept leads to in terms of `Thread`, `Run` and `Generation` logs, and show you a visual of what you can expect on Literal AI.
Query Engines
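For instance, a simple RAG pipeline built with a query engine might look like the sketch below; the documents directory, thread name and prompt are illustrative assumptions:

```python
from literalai import LiteralClient
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

literalai_client = LiteralClient()
literalai_client.instrument_llamaindex()

# Build a small index over local documents (path is illustrative)
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

# Queries executed inside a thread context are grouped under a single Thread
with literalai_client.thread(name="RAG question"):
    response = query_engine.query("What does this integration log?")
    print(response)
```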
Each `Thread` will result in the following tree on Literal AI:
A LlamaIndex RAG thread on Literal AI
LLMs
LlamaIndex offers wrappers around LLM providers to interact with their APIs.
llm.chat
The methods `llm.chat` and `llm.stream_chat` both generate a standalone `Generation`:
A LlamaIndex LLM call - Standalone Generation
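As a sketch, a direct chat call might look like this; the model name is illustrative:

```python
from llama_index.core.llms import ChatMessage
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o-mini")  # model name is illustrative

# Logged as a standalone Generation on Literal AI
response = llm.chat([ChatMessage(role="user", content="Hello!")])
print(response.message.content)
```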
Please note that LlamaIndex token usage is not available for streaming methods, due to limitations in the event data present on the `LLMChatEndEvent` for chunk completions. However, the Literal AI platform defaults token count computation to the `cl100k_base` tokenizer, which is a fair approximation of the expected token usage.
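For reference, a streaming call, where that approximation applies, might look like this sketch; the model name is again illustrative:

```python
from llama_index.core.llms import ChatMessage
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o-mini")  # model name is illustrative

# Streamed generations are logged without provider token usage;
# Literal AI falls back to the cl100k_base approximation instead.
for chunk in llm.stream_chat([ChatMessage(role="user", content="Tell me a joke.")]):
    print(chunk.delta, end="")
```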
llm.predict_and_call
The `llm.predict_and_call` method also results in a standalone `Generation` on the Literal AI platform.
Specifically, LlamaIndex does not trigger events related to tool calls, so we recommend decorating your tools’ function definitions with `@literalai_client.step(type="tool", name="My Tool")` to view the calls performed.
Note that a `Step` of type `tool` cannot be standalone on the Literal AI platform, so we recommend adding a contextual `Step` wrapper around your `llm.predict_and_call` call, as such:
with literalai_client.step(type="run", name="Predict & Call"):
    llm.predict_and_call(...)
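Putting both recommendations together, a minimal sketch could look like the following; the tool, model name and prompt are illustrative assumptions:

```python
from literalai import LiteralClient
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI

literalai_client = LiteralClient()
literalai_client.instrument_llamaindex()

# Decorated so each call appears as a tool step under the run
@literalai_client.step(type="tool", name="Get Weather")
def get_weather(city: str) -> str:
    """Return a canned weather report for a city."""
    return f"It is sunny in {city}."

llm = OpenAI(model="gpt-4o-mini")  # model name is illustrative
weather_tool = FunctionTool.from_defaults(fn=get_weather)

# The contextual run step groups the Generation and the tool step together
with literalai_client.step(type="run", name="Predict & Call"):
    response = llm.predict_and_call([weather_tool], "What is the weather in Paris?")
    print(response)
```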
Agents
LlamaIndex offers the concept of an agent through its `FunctionCallingAgent`, from which it derives an `OpenAIAgent` tailored to the OpenAI model offerings.
Function calling agents can be tuned in a variety of ways, but the general idea is that they iteratively perform the configured LLM calls with tool options until the LLM deems it unnecessary to call a tool.
When calling `agent.chat`, you can expect to obtain a “run” `Step` of the following form:
A LlamaIndex agent chat - Agent Run with multiple intermediate steps
The tool calls in the stack above only show up because the functions themselves are decorated with `@literalai_client.step(type="tool", name="My Tool")`.
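A minimal sketch of such an agent follows, assuming a recent LlamaIndex version where `FunctionCallingAgent` is exposed from `llama_index.core.agent`; the tool, model name and prompt are illustrative:

```python
from literalai import LiteralClient
from llama_index.core.agent import FunctionCallingAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI

literalai_client = LiteralClient()
literalai_client.instrument_llamaindex()

# Decorated so the tool call shows up as a tool step under the agent run
@literalai_client.step(type="tool", name="Multiply")
def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

agent = FunctionCallingAgent.from_tools(
    tools=[FunctionTool.from_defaults(fn=multiply)],
    llm=OpenAI(model="gpt-4o-mini"),  # model name is illustrative
)

# Logged as a "run" Step with its intermediate generations and tool steps
response = agent.chat("What is 7 times 6?")
print(response)
```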