LLMs

TypedAI provides a simple LLM interface which wraps the Vercel ai npm package to:

  • Provide simple overloads of generateMessage that take a single user message and an optional system prompt.
  • Add OpenTelemetry tracing.
  • Add cost tracking.
  • Provide common thinking levels (low, medium, high) for LLMs that support configurable thinking budgets.
  • Save the LLM request/response (LlmCall) to the database.
  • Look up API keys from environment variables or the user profile, and provide an isConfigured() check.
  • Provide the convenience methods generateTextWithJson and generateTextWithResult, which let an LLM generate reasoning/chain-of-thought before the answer, which is then extracted from <json> or <result> tags.
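
As a rough sketch, the interface described above reduces to something like the following. The method names come from this page, but the exact TypedAI signatures and option parameters are assumptions:

```typescript
// Illustrative reduction of the LLM interface; not the verbatim TypedAI code.
interface LLM {
  /** Generate a response to a single user message, with an optional system prompt. */
  generateText(userPrompt: string, systemPrompt?: string): Promise<string>;
  /** Generate reasoning followed by an answer inside <json> tags, parsed to type T. */
  generateTextWithJson<T>(userPrompt: string, systemPrompt?: string): Promise<T>;
  /** Generate reasoning followed by an answer inside <result> tags. */
  generateTextWithResult(userPrompt: string, systemPrompt?: string): Promise<string>;
  /** True when an API key is available from the environment or the user profile. */
  isConfigured(): boolean;
}

// Example: let the model think first, then extract only the final answer.
async function review(llm: LLM, diff: string): Promise<string> {
  return llm.generateTextWithResult(
    `Review this diff, reasoning step by step, then place your verdict inside <result></result> tags:\n${diff}`,
  );
}
```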

Composite Implementations

The LLM interface also allows creating composite implementations, for example:

  • Implementations with fallbacks to handle quota exceeded or other errors, e.g. using multiple providers for DeepSeek R1 (see the sketch after this list).
  • Mixture-of-Agents/multi-agent debate implementations that enhance reasoning by having multiple LLMs generate and review answers.
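
A minimal sketch of the fallback pattern, assuming the simplified LLM interface above; the real composite classes implement all of the interface methods:

```typescript
// Fallback composite: try each configured provider in order, moving on when
// one fails (e.g. with a quota-exceeded error). A sketch, not the actual
// TypedAI implementation.
class FallbackLLM {
  constructor(private readonly providers: LLM[]) {}

  isConfigured(): boolean {
    return this.providers.some((p) => p.isConfigured());
  }

  async generateText(userPrompt: string, systemPrompt?: string): Promise<string> {
    let lastError: unknown;
    for (const provider of this.providers.filter((p) => p.isConfigured())) {
      try {
        return await provider.generateText(userPrompt, systemPrompt);
      } catch (e) {
        lastError = e; // quota exceeded or other error: try the next provider
      }
    }
    throw lastError;
  }
  // generateTextWithJson/generateTextWithResult would delegate the same way.
}
```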

CePO

An implementation of the Cerebras CePO multi-agent debate technique is provided.

Fast fallbacks

The FastMedium implementation prefers Cerebras Qwen3 235b when the input token count is within its limit and the prompt doesn't contain any images or files; otherwise it falls back to Gemini 2.5 Flash.

The FastEasy implementation prefers Cerebras Qwen3 32b when the input token count is within its limit and the prompt doesn't contain any images or files; otherwise it falls back to Gemini 2.5 Flash Lite.
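
Both reduce to the same routing shape. A sketch, with an illustrative token estimate; the real implementations use the fast model's actual input token limit and inspect the prompt for image and file parts:

```typescript
// Fast-fallback routing: use the fast model when the prompt fits its input
// limit and is text-only, otherwise use the fallback model.
class FastFallbackLLM {
  constructor(
    private readonly fast: LLM, // e.g. Cerebras Qwen3 235b for FastMedium
    private readonly fallback: LLM, // e.g. Gemini 2.5 Flash for FastMedium
    private readonly fastInputTokenLimit: number,
  ) {}

  async generateText(userPrompt: string, systemPrompt?: string): Promise<string> {
    const promptIsTextOnly = true; // plain-string prompts here; real prompts may carry image/file parts
    const inputTokens = estimateTokens(userPrompt) + estimateTokens(systemPrompt ?? '');
    const useFast = this.fast.isConfigured() && promptIsTextOnly && inputTokens <= this.fastInputTokenLimit;
    return (useFast ? this.fast : this.fallback).generateText(userPrompt, systemPrompt);
  }
}

// Crude token estimate (~4 characters per token); a stand-in for a real tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}
```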

ReasonerDebate

The ReasonerDebate implementation is based on the Google DeepMind sparse multi-agent debate paper, in which debating agents each see only a subset of the other agents' answers between rounds.
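
A rough sketch of one sparse debate configuration, assuming the simplified interface above; the round count, the one-neighbour topology, and the prompts are illustrative, not the actual ReasonerDebate implementation:

```typescript
// Sparse multi-agent debate: each debater revises its answer after seeing
// only one neighbour's answer per round, rather than all answers.
async function sparseDebate(debaters: LLM[], question: string, rounds = 2): Promise<string[]> {
  let answers = await Promise.all(debaters.map((llm) => llm.generateText(question)));
  for (let round = 0; round < rounds; round++) {
    answers = await Promise.all(
      debaters.map((llm, i) => {
        const neighbour = answers[(i + 1) % answers.length]; // sparse: a single peer
        return llm.generateText(
          `${question}\n\nAnother agent answered:\n${neighbour}\n\nCritique it and give your revised answer.`,
        );
      }),
    );
  }
  return answers; // the final answer could be chosen by majority vote or a judge LLM
}
```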

API key rotation

Some LLM provider implementations support rotating through a set of API keys to reduce quota exceeded errors. This is currently only supported via environment variables.
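
A sketch of the rotation pattern; the numbered environment-variable naming scheme shown here is hypothetical:

```typescript
// Collect numbered API keys from the environment, e.g. MYSERVICE_API_KEY,
// MYSERVICE_API_KEY_2, MYSERVICE_API_KEY_3 ... (hypothetical naming scheme).
function loadApiKeys(prefix: string): string[] {
  const keys: string[] = [];
  if (process.env[prefix]) keys.push(process.env[prefix]!);
  for (let i = 2; process.env[`${prefix}_${i}`]; i++) keys.push(process.env[`${prefix}_${i}`]!);
  return keys;
}

// Round-robin rotation: each request uses the next key, spreading load across
// quotas so a single key's rate limit is hit less often.
class KeyRotator {
  private index = 0;
  constructor(private readonly keys: string[]) {}

  next(): string {
    const key = this.keys[this.index];
    this.index = (this.index + 1) % this.keys.length;
    return key;
  }
}
```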

Adding LLM services

New LLM services need to be registered in llmFactory.ts.
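
A hypothetical sketch of the factory/registry shape; the identifiers are illustrative, not the actual llmFactory.ts code:

```typescript
type LlmFactory = () => LLM;

// Central registry mapping a service:model identifier to a factory function.
const llmRegistry = new Map<string, LlmFactory>();

// A new LLM service registers its models here so they can be looked up by id.
function registerLLM(id: string, factory: LlmFactory): void {
  llmRegistry.set(id, factory);
}

// Resolve an id such as 'myservice:my-model' (illustrative) to an LLM instance.
function getLLM(id: string): LLM {
  const factory = llmRegistry.get(id);
  if (!factory) throw new Error(`No LLM registered for id ${id}`);
  return factory();
}
```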

The key source abstractions are:

  • The LLM interface
  • The BaseLLM class
  • The AiLLM class
  • The LLM service implementations