SDK: Inference

The Inference client exposes provider management, model sync, request execution, and usage reporting.

Access it via client.inference.

Provider Management

import { createClient } from "@the-shift/sdk";

// Create a client that talks to the gateway
const client = createClient({
  gatewayUrl: "https://app.the-shift.dev",
  apiKey: process.env.SHIFT_API_KEY,
});

// Register a provider
const provider = await client.inference.providers.create({
  name: "openai-prod",
  type: "openai",
  defaultModel: "gpt-4o-mini",
});

// List registered providers
const providers = await client.inference.providers.list();

Model Sync

const result = await client.inference.models.sync(provider.id);
console.log(result.synced);
console.log(result.models.map((m) => m.modelId));

Chat, Embeddings, Images, and Transcription

// Chat completion
const chat = await client.inference.chat({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Summarize this spreadsheet" }],
});

// Embeddings
const embeddings = await client.inference.embed({
  model: "text-embedding-3-small",
  input: "Quarterly revenue forecast",
});

// Image generation
const image = await client.inference.images({
  model: "gpt-image-1",
  prompt: "An architectural diagram of a workflow runner",
});

// Transcription of base64-encoded audio
const transcript = await client.inference.transcribe({
  model: "whisper-1",
  audio: "<base64-audio>",
});
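
The transcribe call above takes its audio as a base64 string. A minimal sketch of producing one from raw bytes follows. The helper name `bytesToBase64` is hypothetical and not part of the SDK, and it assumes the `audio` field expects plain base64 rather than a data URL, which the reference should confirm.

```typescript
// Hypothetical helper (not part of the SDK): encode raw audio bytes as base64.
// Assumption: the `audio` field expects plain base64 with no data-URL prefix.
function bytesToBase64(bytes: Uint8Array): string {
  let binary = "";
  // Build a binary string one byte at a time; btoa then encodes it as base64.
  bytes.forEach((b) => {
    binary += String.fromCharCode(b);
  });
  return btoa(binary);
}
```

In Node.js, `Buffer.from(bytes).toString("base64")` gives the same result. The `btoa` version is used here only because it also runs in browsers.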

Usage Reporting

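// Fetch recent usage for one provider
const usage = await client.inference.usage({
  provider: "openai-prod",
  since: "2026-03-01",
  limit: 25,
});

// Look up the first reported request, if the report is not empty
const first = usage.requests[0];
const request = first ? await client.inference.requests.get(first.id) : undefined;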
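
The `since` value above is a calendar date. For rolling windows such as "the last 7 days", it can be computed rather than hard-coded. The sketch below assumes `since` accepts a `YYYY-MM-DD` string interpreted in UTC, as in the example. The helper `isoDateDaysAgo` is hypothetical and not part of the SDK.

```typescript
// Hypothetical helper (not part of the SDK): a YYYY-MM-DD date `days` days before `now`, in UTC.
// Assumption: `since` accepts this format, as in the usage example above.
function isoDateDaysAgo(days: number, now: Date = new Date()): string {
  const msPerDay = 24 * 60 * 60 * 1000;
  const past = new Date(now.getTime() - days * msPerDay);
  // toISOString() always returns UTC; the first 10 characters are the date part.
  return past.toISOString().slice(0, 10);
}
```

It could then be passed as `since: isoDateDaysAgo(7)` in the usage call.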

Transcription Jobs

For long-running audio transcription, use the async job API:

// Submit a transcription job
const job = await client.inference.transcriptions.create({
  model: "whisper-1",
  source: { kind: "url", url: "https://example.com/recording.mp3" },
  language: "en",
  maxAttempts: 3,
});

// Check job status
const status = await client.inference.transcriptions.get(job.id);
if (status.status === "completed") {
  console.log(status.result.text);
  console.log(status.artifacts); // normalizedAudio, transcriptText, transcriptJson
}

// List all jobs
const jobs = await client.inference.transcriptions.list();

// Retry a failed job
await client.inference.transcriptions.retry(job.id);

See the Transcription Jobs reference for the full API.
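
Because jobs finish asynchronously, callers usually poll the status call until the job reaches a terminal state. The sketch below shows one way to do that. `waitForTranscription` and its options are hypothetical and not part of the SDK. Only the `"completed"` status appears on this page; `"failed"` is assumed as the other terminal status because the retry call refers to failed jobs.

```typescript
// Hypothetical job shape: only `id`, `status`, and `result.text` appear in the docs above.
interface TranscriptionJob {
  id: string;
  status: string;
  result?: { text: string };
}

// Assumption: "completed" and "failed" are the terminal statuses.
const TERMINAL_STATUSES = ["completed", "failed"];

async function waitForTranscription(
  getJob: (id: string) => Promise<TranscriptionJob>,
  jobId: string,
  opts: { intervalMs?: number; timeoutMs?: number } = {},
): Promise<TranscriptionJob> {
  const intervalMs = opts.intervalMs ?? 2000;
  const timeoutMs = opts.timeoutMs ?? 10 * 60 * 1000;
  const deadline = Date.now() + timeoutMs;

  while (true) {
    const job = await getJob(jobId);
    if (TERMINAL_STATUSES.includes(job.status)) return job;
    if (Date.now() >= deadline) {
      throw new Error(`Timed out waiting for transcription job ${jobId}`);
    }
    // Wait before the next status check.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Usage with the SDK client from the examples above:
// const done = await waitForTranscription(
//   (id) => client.inference.transcriptions.get(id),
//   job.id,
// );
```

Passing the getter as a function keeps the helper independent of the client, so it can be exercised with a stub in tests.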

Notes

  • The gateway can resolve a provider from the model name when providerId is omitted.
  • Registered providers can also point at custom baseUrl endpoints.
  • Deterministic mock and replay-style flows belong with service-level integration tests and are not part of the public production contract.
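
To make the first note concrete, the sketch below shows one plausible way to resolve a provider from a model name. It is an illustration only and not the gateway's actual algorithm. The `ProviderEntry` shape, its `models` field, and the first-match rule are all assumptions.

```typescript
// Hypothetical registry entry; the `models` field (IDs recorded by a model sync) is an assumption.
interface ProviderEntry {
  id: string;
  name: string;
  models: string[];
}

// Illustrative only: an explicit providerId wins; otherwise the first provider listing the model.
function resolveProvider(
  model: string,
  providers: ProviderEntry[],
  providerId?: string,
): ProviderEntry {
  if (providerId !== undefined) {
    const explicit = providers.find((p) => p.id === providerId);
    if (!explicit) throw new Error(`Unknown provider: ${providerId}`);
    return explicit;
  }
  const match = providers.find((p) => p.models.includes(model));
  if (!match) throw new Error(`No registered provider serves model: ${model}`);
  return match;
}
```

If two providers list the same model, this sketch picks whichever was registered first. Passing `providerId` explicitly avoids that ambiguity.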