SDK: Inference
The Inference client exposes provider management, model sync, request execution, and usage reporting.
Access it via client.inference.
Provider Management
import { createClient } from "@the-shift/sdk";
const client = createClient({
  gatewayUrl: "https://app.the-shift.dev",
  apiKey: process.env.SHIFT_API_KEY,
});
const provider = await client.inference.providers.create({
  name: "openai-prod",
  type: "openai",
  defaultModel: "gpt-4o-mini",
});
const providers = await client.inference.providers.list();
Model Sync
const result = await client.inference.models.sync(provider.id);
console.log(result.synced);
console.log(result.models.map((m) => m.modelId));
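After a sync it can be useful to confirm that the models you plan to call were actually registered. The sketch below is not part of the SDK: `missingModels` is a hypothetical helper, and the `SyncResult` shape is an assumption based only on the `result.models[].modelId` field used above.

```typescript
// Assumed shape: only `models[].modelId` is taken from the example above.
interface SyncedModel {
  modelId: string;
}
interface SyncResult {
  models: SyncedModel[];
}

// Hypothetical helper: returns the wanted model IDs that the sync did not report.
function missingModels(result: SyncResult, wanted: string[]): string[] {
  const synced = new Set(result.models.map((m) => m.modelId));
  return wanted.filter((id) => !synced.has(id));
}
```

Calling `missingModels(result, ["gpt-4o-mini"])` on the sync result yields an empty array when the provider's default model was picked up.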
Chat, Embeddings, Images, and Transcription
const chat = await client.inference.chat({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Summarize this spreadsheet" }],
});
const embeddings = await client.inference.embed({
  model: "text-embedding-3-small",
  input: "Quarterly revenue forecast",
});
const image = await client.inference.images({
  model: "gpt-image-1",
  prompt: "An architectural diagram of a workflow runner",
});
const transcript = await client.inference.transcribe({
  model: "whisper-1",
  audio: "<base64-audio>",
});
Usage Reporting
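const usage = await client.inference.usage({
  provider: "openai-prod",
  since: "2026-03-01",
  limit: 25,
});
// Look up the first returned request, guarding against an empty result set
const first = usage.requests[0];
const request = first ? await client.inference.requests.get(first.id) : undefined;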
Transcription Jobs
For long-running audio transcription, use the async job API:
// Submit a transcription job
const job = await client.inference.transcriptions.create({
  model: "whisper-1",
  source: { kind: "url", url: "https://example.com/recording.mp3" },
  language: "en",
  maxAttempts: 3,
});
// Check job status
const status = await client.inference.transcriptions.get(job.id);
if (status.status === "completed") {
  console.log(status.result.text);
  console.log(status.artifacts); // normalizedAudio, transcriptText, transcriptJson
}
// List all jobs
const jobs = await client.inference.transcriptions.list();
// Retry a failed job
await client.inference.transcriptions.retry(job.id);
See the Transcription Jobs reference for the full API.
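Because jobs run asynchronously, callers typically poll the status endpoint until the job finishes. The following is a minimal polling sketch, not an SDK feature: `waitForTranscription`, its options, and the `TranscriptionJob` shape are hypothetical. Only the "completed" status appears above; treating "failed" as the other terminal status is an assumption, so check the Transcription Jobs reference for the actual status values.

```typescript
// Assumed job shape; "failed" as a terminal status is an assumption.
interface TranscriptionJob {
  id: string;
  status: string;
  result?: { text: string };
}

type GetJob = (id: string) => Promise<TranscriptionJob>;

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Poll until the job reaches a terminal status or the poll budget runs out.
async function waitForTranscription(
  getJob: GetJob,
  jobId: string,
  opts: { intervalMs?: number; maxPolls?: number } = {},
): Promise<TranscriptionJob> {
  const intervalMs = opts.intervalMs ?? 5000;
  const maxPolls = opts.maxPolls ?? 60;
  for (let poll = 0; poll < maxPolls; poll++) {
    const job = await getJob(jobId);
    if (job.status === "completed" || job.status === "failed") {
      return job;
    }
    // Skip the delay after the final poll so the timeout error is not held up.
    if (poll < maxPolls - 1) await sleep(intervalMs);
  }
  throw new Error(`Transcription job ${jobId} did not finish after ${maxPolls} polls`);
}
```

With the client from the examples above, this could be called as `await waitForTranscription((id) => client.inference.transcriptions.get(id), job.id)`. Passing the getter as a function keeps the helper independent of the SDK, which also makes it easy to exercise with a stub in tests.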
Notes
- The gateway can resolve a provider from the model name when providerId is omitted.
- Registered providers can also point at custom baseUrl endpoints.
- Deterministic mock and replay-style flows are best documented alongside service-level integration tests, not as a public production contract.
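To illustrate the first two notes, the sketch below builds request payloads with and without an explicit provider, and a provider definition that targets a custom endpoint. The notes name `providerId` and `baseUrl` but do not say where they go, so their placement on the chat and provider payloads is an assumption. The helper names, the interfaces, and the example URL are hypothetical.

```typescript
// Assumed payload shapes; only the field names come from the notes above.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}
interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  providerId?: string; // assumed to be a top-level chat field
}
interface ProviderInput {
  name: string;
  type: string;
  baseUrl?: string; // assumed to be a top-level provider field
}

// Hypothetical helper: omit providerId to let the gateway resolve it from the model name.
function buildChatRequest(model: string, prompt: string, providerId?: string): ChatRequest {
  const request: ChatRequest = { model, messages: [{ role: "user", content: prompt }] };
  if (providerId !== undefined) request.providerId = providerId;
  return request;
}

// Hypothetical helper: a provider definition pointing at a custom endpoint.
function customProvider(name: string, type: string, baseUrl: string): ProviderInput {
  return { name, type, baseUrl };
}

const routedByModel = buildChatRequest("gpt-4o-mini", "Summarize this spreadsheet");
const pinned = buildChatRequest("gpt-4o-mini", "Summarize this spreadsheet", "provider-id-placeholder");
const selfHosted = customProvider("internal-llm", "openai", "https://llm.example.internal/v1");
```

Under these assumptions, `routedByModel` would be passed to `client.inference.chat` as-is, `pinned` would bypass model-name resolution, and `selfHosted` would be passed to `client.inference.providers.create`.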