All changes and improvements to Literal AI are listed here. For changes in the SDKs, go to the Python SDK or TypeScript SDK.

Literal AI cloud is currently compatible with:

0.0.623-beta (September 16th, 2024)

AI Evaluation rules should be re-configured to account for the switch to custom Evaluator prompts. Check out Score-based Rules to get started!

New features

  • Introduced “Run Experiment” feature on datasets (see the sketch after this list)
  • Enabled custom prompt for LLM-as-a-Judge evaluators
  • Enabled structured output in Playground and in prompts
  • Added Total Cost chart to dashboard (input + output tokens)
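
The same flow can be scripted from the SDKs. Below is a minimal Python sketch, assuming the experiment helpers (client.api.create_experiment, experiment.log) behave as in recent SDK versions; the dataset id, application function and score name are placeholders:

```python
from literalai import LiteralClient

client = LiteralClient(api_key="lsk_...")  # your Literal AI API key

# Assumes an existing dataset; the id is a placeholder.
dataset = client.api.get_dataset(id="dataset-uuid")
experiment = client.api.create_experiment(
    name="prompt-v2-eval", dataset_id=dataset.id
)

def my_app(item_input: dict) -> dict:
    # Hypothetical application under test.
    return {"answer": "..."}

for item in dataset.items:
    output = my_app(item.input)
    experiment.log({
        "datasetItemId": item.id,
        "input": item.input,
        "output": output,
        # "exact-match" is a made-up score name for illustration.
        "scores": [{"name": "exact-match", "type": "AI", "value": 1.0}],
    })
```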

Improvements

  • Dataset table was enhanced with a side-panel Item View
  • Prompt Playground - Added keyboard shortcuts
  • Added ability to navigate from Generation to its root Run
  • Improved Step query efficiency for token count and environment filters
  • Improved queries for retrieving steps and threads
  • Optimized score management for faster edits and fewer queries
  • Enhanced worker performance with multi-threaded asynchronous step ingestion

Bug fixes

  • Prompt Playground:
    • Fixed template messages editing issues
    • Improved scroll behavior for streamed LLM generation
  • Enabled Google as a Provider in Playground
  • Updated prompt cache invalidation strategy

0.0.621-beta (September 2nd, 2024)

New features

  • We are introducing the concept of “Model Costs”, which allows you to monitor the actual costs associated with your logged generations. You can now set up the costs for the various LLM models you use in production, including negotiated prices (see the cost sketch after this list).
  • When you exceed your monthly allowance on the free tier, you will now be notified by a pop-up and a persistent message in the sidebar.
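
The cost math itself is simple. A minimal sketch with hypothetical per-million-token prices; Literal AI applies whatever rates you configure under Model Costs, so the numbers below are placeholders:

```python
# (input_price, output_price) in USD per 1M tokens -- placeholder rates,
# including e.g. a negotiated price for a fine-tuned model.
PRICES = {
    "gpt-4o": (2.50, 10.00),
    "my-fine-tune": (1.00, 3.00),
}

def generation_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

print(generation_cost("gpt-4o", 1_200, 350))  # -> 0.0065
```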

Improvements

  • Add a settings column to the prompt versions table.
  • In the Settings page, you can now see details about a Score Schema by hovering over it.
  • Various performance improvements across the platform.

Bug fixes

  • Fixes an ingestion error when a step’s data includes Unicode NULL characters.
  • Fixes a race condition which could slow down step ingestion.

0.0.620-beta (August 27th, 2024)

New features

  • Logs now include Scores so that you can browse Human / AI evaluations of your application
  • The Dashboard includes two new tiles for recently ingested Runs & Scores, and you can jump to that data in one click.
  • You can now add descriptions to individual values in score schemas, to help both with human annotation and AI scoring
  • When setting up prompt A/B testing, you can now search prompt versions by their number

Improvements

  • API keys are now hidden by default with *** in Settings / LLM. Enjoy screen-sharing with your friends!
  • Improved Step ingestion speed, which means faster troubleshooting of your application.
  • Identify the origin of Generations at a glance by checking out the Prompt name & version column in the Logs / Generations table. We have also added the much-needed Output column!
  • Domain Experts can now browse Settings to better understand their restricted permissions.
  • Thread details now show a more discoverable Scores section, right underneath Tags.

Bug fixes

  • Fixed a bug where you couldn’t select a tag when searching for it in the Filters UI
  • Fixed a bug with query refreshing
  • Solved an issue with the registration of clients and products on Stripe

0.0.617-beta (August 12th, 2024)

New Features

  • Prompt A/B testing (replaces the champion system)

Improvements

  • The UI now supports any number of tags (previously capped at 100)
  • Annotation Queues, Datasets and Logs are now linked and directly traversable from the UI.
  • You can now edit a step before adding it to a Dataset from an Annotation Queue

Bug Fixes

  • Fixed multiple bugs in the Annotation Queues
  • Domain Experts are now able to add items to a dataset
  • Redis connection should no longer hang

0.0.615-beta (July 31st, 2024)

New features

  • Annotation Queues: you can now collaborate as a team and assign steps for review
  • Environments allow you to silo experiment, development, staging and production logs
  • Experiments:
    • Faster bootstrap to launch experiments without the need to link to a Dataset
    • Experiment items are now sortable by score in a leaderboard fashion for easier comparison
    • You can now troubleshoot your experiment items by visualizing the experiment runs as full traces
  • Generations: you can now enrich your logged LLM calls with metadata
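
A minimal Python sketch of the metadata enrichment, assuming client.get_current_step and the step metadata attribute behave as in recent SDK versions; the metadata keys are made up:

```python
from literalai import LiteralClient

client = LiteralClient(api_key="lsk_...")
client.instrument_openai()  # auto-logs OpenAI calls as Generations

@client.step(type="run", name="answer")
def answer(question: str) -> str:
    # Attach metadata to the wrapping step; logged LLM calls can be
    # enriched the same way. The keys below are hypothetical.
    step = client.get_current_step()
    step.metadata = {"customer_tier": "pro", "ab_bucket": "v2-ranking"}
    # ... call your LLM here ...
    return "..."
```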

Improvements

  • Work with your own LLM endpoint by configuring a Custom LLM provider in your settings, based on OpenAI’s messages API (see the endpoint sketch after this list)
  • Quickly identify your champion version in the Prompt versions table with a star icon
  • Swiftly re-use your Playground prompts in code by using the new Copy button on Template messages
  • If created via a Score Template, a Score gets linked to its template for traceability
  • Improved Generations table readability by displaying Input as the last message in OpenAI’s messages API
  • Unified & enhanced the user experience on score, tag and credentials creation
  • Added the latest Anthropic model, Claude 3.5 Sonnet
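
For the custom provider above, any endpoint that speaks OpenAI’s messages (chat completions) format will do. A minimal sketch using the OpenAI Python client; the URL, key and model name are placeholders:

```python
from openai import OpenAI

# Point the standard OpenAI client at your own endpoint.
client = OpenAI(
    base_url="https://llm.example.com/v1",  # your OpenAI-compatible server
    api_key="my-custom-key",
)

response = client.chat.completions.create(
    model="my-local-model",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```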

Bug fixes

  • Project deletion now completes successfully when experiments exist in the project
  • Rule invocation now works with Azure OpenAI credentials
  • Improved Markdown rendering of Thread Chat view

0.0.613-beta (July 23rd, 2024)

New features

  • Datasets: you can now create a dataset from a CSV file (see the sketch after this list)
  • Onboarding: empty pages in a new project now include code snippets and instructions to start sending data to Literal AI
  • Navigation: the sidebar has been revamped for flatter navigation between platform modules
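
The CSV import happens in the UI; the programmatic equivalent looks roughly like this, assuming client.api.create_dataset and client.api.create_dataset_item as exposed by recent Python SDK versions, with placeholder column names:

```python
import csv
from literalai import LiteralClient

client = LiteralClient(api_key="lsk_...")
dataset = client.api.create_dataset(name="support-questions")

with open("items.csv", newline="") as f:  # columns: input, expected_output
    for row in csv.DictReader(f):
        client.api.create_dataset_item(
            dataset_id=dataset.id,
            input={"question": row["input"]},
            expected_output={"answer": row["expected_output"]},
        )
```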

Improvements

  • A new, tighter tree view that better displays Chain of Thought reasoning
  • Add new Run/Generation filters when creating or updating an evaluation rule: model, duration, prompt lineage, prompt version
  • Improve editing of Azure OpenAI credentials
  • Various improvements on platform deployment, both for cloud and self-hosting

Bug fixes

  • Fix a bug with Azure OpenAI in the prompt playground and other LLM calls
  • The credentials table will now refresh correctly after creating or updating an item
  • Local LLMs are now correctly handled by the prompt playground and other LLM calls
  • Fix a bug on the prompt playground where changing settings could reset the message list

0.0.612-beta (July 10th, 2024)

New features

  • When scoring a step through the platform, we now track the user who created the score (see the sketch after this list)
  • We are preparing the platform for the upcoming release of the Annotation Queue
  • A new chart on the dashboard shows the number of runs per day per run name
  • Upon signup, a new account will now contain a default project populated with Threads, Steps, Datasets, etc.
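
For reference, scores can also be created programmatically. A minimal sketch, assuming client.api.create_score as in recent Python SDK versions; the step id and score name are placeholders:

```python
from literalai import LiteralClient

client = LiteralClient(api_key="lsk_...")

# Scores created through the platform UI now also record their author.
client.api.create_score(
    step_id="step-uuid",      # placeholder step id
    name="helpfulness",       # made-up score name
    type="HUMAN",
    value=1.0,
)
```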

Improvements

  • We are rolling out a new Role system. Possible roles are now as follows: Admin, AI Engineer, Domain Expert
  • We have revamped the creation, editing and deletion of Rules for Online Evaluation
  • We have improved screen space management in tables, notably when displaying code previews
  • Some design tweaks on the dashboard and dark mode

Fixes

  • Fixed a bug where the generation was not correctly displayed in a run
  • Fixed a bug where some logos would not display correctly in dark mode
  • Fixed a bug where the scores API could break if no generationId was provided

0.0.611-beta (July 1st, 2024)

New features

  • Easily navigate the Runs view with arrow keys
  • You can now filter Runs/Generations by score presence
  • You can now bulk add Generations to Datasets from the UI
  • Dark theme for the diff editor, box plots and toasters
  • Added a new “Run” chart to the dashboard

Improvements

  • This version ships the first iteration of our UI revamp
  • Images are now zoomable in the Prompt Playground

0.0.610-beta (June 24, 2024)

Improvements

  • Update the feedback button
  • Extend “Rules” table with filters and pagination
  • Update “is null” and “is not null” filters with a more explicit behavior
  • Improve the score element UI

Fixes

  • Fixed an issue where annotators could not access content
  • Fixed an issue when double-clicking on a date-picker
  • Fixed an issue related to “Generation” links

0.0.609-beta (June 17, 2024)

New features

  • Added a “maintainer” role for the project, which allows write access while preventing the user from managing the project

Improvements

  • Simplify the Generation and Step data handling
  • Rules can now be updated directly
  • In “Experiment” you can now see the diff between inputs and outputs columns
  • The navigation is improved
  • Score templates can now be accessed in the “Evaluate” section
  • When scoring with a “categorical” score, the displayed value is now the category name rather than the raw value
  • In the dashboards we no longer display nullish values as 0
  • Rules now have their own detail page

Fixes

  • Fixed an issue where prompt playground settings were not correctly persisted
  • Fixed an issue where some step rows were duplicated
  • Fixed an issue with the “dataset link”
  • Fixed an issue where it was not possible to select custom models in the playground
  • Fixed an issue with the “Generations” page pagination

0.0.608-beta (June 4, 2024)

New features

  • The Compare feature is now available! It allows you to compare Generations.
  • Self-service distribution of the Literal AI platform for self-hosting will debut in the coming week
  • A new user role has been added: “Annotator”, a user who can add tags and scores to the observability entities (e.g. Thread, Step…) and has no access outside of those.
  • Project administrators can now pick a user’s role when inviting them.

Improvements

  • Literal AI API keys are now shared within the project. Previously, admins could create “personal” API keys. Access remains restricted to admins.
  • We’re continuing our push towards a more consistent - and prettier - User Experience:
    • We’ve switched to a more vibrant color scheme
    • Made some visual tweaks on the Dashboard page
    • Observability items such as Threads, Runs, and others will now display as full pages rather than side-panes
    • And lots of other improvements across the platform
  • Some changes to the way the platform is deployed, both on our end and for our on-premise users:
    • Improved and centralized environment management
    • The Portkey AI Gateway is now handled directly inside the Node process
    • The BUCKET_NAME environment variable is no longer mandatory. Trying to store objects will log errors but not disrupt the rest of the operations

Fixes

  • This week’s release sees a big focus on performance, especially on the Users and individual Thread pages
  • We’ve also chased and squashed a few bugs related to:
    • Signing Attachment URLs (for object storage like S3)
    • Conflicts on unique userId
    • A visual bug on initialization of “continuous” score templates
  • Audio attachments now resolve correctly.
  • Links on prompt versions are now directed to the correct prompt.
  • Show the correct projects when accessing the prompt playground.

0.0.607-beta (May 27, 2024)

New features

  • Added online evaluations to score LLM generations on the fly.
  • Created the endpoint /api/my-project to quickly access a project ID with an API key (see the request sketch after this list)
  • Brushed up the Dashboard page with:
    • Browser-level customizable layouts of charts
    • Filters on each chart to select relevant data - also saved at the browser level
  • For token usage specifically, we offer multiple visualizations
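
A minimal sketch of calling the new endpoint; it assumes the API key goes in an x-api-key header and the standard cloud base URL, so adjust both for self-hosted deployments:

```python
import requests

resp = requests.get(
    "https://cloud.getliteral.ai/api/my-project",  # assumed base URL
    headers={"x-api-key": "lsk_..."},              # assumed header name
)
print(resp.json())  # contains your project id
```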

Improvements

  • Improved the look of Step badges, specifically colors.
  • Sharing threads now requires an additional privilege
  • Improved UX on text, audio, image and video attachments in Step details
  • Prompt versions show a visual “Open” button to jump to the Prompt Playground
  • Revamped the UI look of the side navigation
  • Stop sequences on Prompt Playground now show visual cues
  • Removed UUID columns across tables to improve readability
  • JSON & Text previews come with full screen & copy/paste options

Fixes

  • Newly created API keys do not contain special characters

0.0.606-beta (May 20, 2024)

Breaking Changes

  • Dataset: Renamed intermediary steps’ expectedOutput to output. In a Dataset, in the Intermediary Steps field, expectedOutput is renamed to just output, because this is the actual output of the LLM. This breaks backward compatibility for code relying on DatasetItem.intermediarySteps.expectedOutput; DatasetItem.expectedOutput remains unchanged. A short migration sketch follows.
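
A short migration sketch for code that reads the raw item payload; the payload shape here is illustrative:

```python
def migrate_intermediary_steps(item: dict) -> dict:
    """Rename the old expectedOutput key to output inside intermediary steps."""
    for step in item.get("intermediarySteps", []):
        if "expectedOutput" in step:
            step["output"] = step.pop("expectedOutput")
    return item
```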

New Features

  • Attachments now come with preview widgets (multi-modality)

Improvements

  • A page change on tables now scrolls back to top of table
  • We removed the name fallback to ID for threads; missing names are now shown as N/A
  • We reduced the indent of navigation sub-menus
  • The prompt playground now persists the credentials for your session
  • Improved user feedback options from the UI
  • JSONs in tables now display on multiple lines with syntax highlighting
  • Improved dashboard performance with data fetch in separate requests

Fixes

  • Fixed creation of attachments and scores when step doesn’t exist
  • Fixed thread duplication when filtering on errors
  • Fixed the upserts of step input/output to prevent exceeding the size limit

0.0.605-beta (May 13, 2024)

New Features

  • Support GPT-4o as LLM model provider
  • We now display a diff of the prompt settings when saving a prompt version
  • Steps now support tags

Improvements

  • We now populate the dataset item intermediarySteps when adding a step with children steps
  • The API credentials in the prompt playground have been moved
  • Generation details view now has a link to prompt
  • Support temperature settings higher than 1 for compatible LLMs

Fixes

  • Fix display bugs in prompt playground
  • Fix a bug where we allowed very large JSON inputs (see the guard sketch after this list):
    • metadata is now limited to 1 MB
    • Step input and expectedOutput are limited to 3 MB
  • Fix a bug where full-text searching threads would lead to a spike in CPU usage
  • Fix a rare bug that could occur when ingesting multiple steps with a new tag
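
A client-side guard mirroring the new limits might look like this; the exact byte thresholds are assumptions based on the 1 MB / 3 MB figures above:

```python
import json

MAX_METADATA_BYTES = 1_000_000   # ~1 MB limit on metadata
MAX_FIELD_BYTES = 3_000_000      # ~3 MB limit on input / expectedOutput

def check_payload(metadata: dict, input_: dict, expected_output: dict) -> None:
    """Raise before sending a payload the platform would reject (sketch)."""
    if len(json.dumps(metadata).encode()) > MAX_METADATA_BYTES:
        raise ValueError("metadata exceeds 1 MB")
    for name, field in [("input", input_), ("expectedOutput", expected_output)]:
        if len(json.dumps(field).encode()) > MAX_FIELD_BYTES:
            raise ValueError(f"{name} exceeds 3 MB")
```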

0.0.604-beta (May 6, 2024)

New Features

  • UI/UX: Page headers on the Literal AI platform now include a button that links to the documentation.
  • Release status page: Literal AI now has a status page at https://literalai.betteruptime.com/, where you can see service uptime.
  • Experiments/Score: You can now create Scores directly in Experiments.
  • Threads: There is now a search bar in the Threads table.

Improvements

  • Minor UI updates to:
    • Sidebar navigation
    • Scores table
    • Table filters
  • Tags: Pressing enter will now create a Tag
  • Warn on dataset deletion
  • Persist playground settings
  • Warn when creating a prompt

Fixes

  • Fixed the dashboard evolution badge tooltip showing the wrong period

0.0.603-beta (April 29, 2024)

New Features

  • Credentials: You can now share your LLM credentials to better collaborate through the prompt playground.

Improvements

  • Dashboard: New comparison badge on the dashboard displays data evolution.
  • UI of Thread and Dataset: At the top of a Thread or Dataset page, the location is now shown as a breadcrumb. This prevents getting lost in side sheets and improves navigation.
  • Settings UI: Split Settings menu in the UI into sub-menus for General, LLM and Team.
  • Prompts: New Created by column on the Prompt Version table, which improves the table display.

Fixes

  • Prompt Playground: Fix model select overflow (a minimal change: long model names in the select are now ellipsized when space is reduced)
  • Experiments: In comparison mode, parameters are made more explicit. In addition, the charts were inverted; this is now fixed.
  • Filters: Added handling for edge cases of the is null and not in filters on tags. This fixes the tag filters in the table, which were not working as intended before.
  • Tags: Newly created Tags are now visible in the UI when a Thread, Step or Generation page is refreshed. Tags are now refetched on page refresh.
  • Tags: Tags can now be added on generations being created.