Animate portrait images in real time with audio. The Avatar Live Realtime API turns any static portrait into a lifelike talking avatar: given a portrait image and an audio input, it generates an animated video stream in which the avatar speaks with synchronized lip movements and natural expressions.

Model Specifications

The model generates video and accepts images according to these specifications (a quick input-format check is sketched after the list):
  • Video Frame Rate: 25 FPS
  • Video Resolution: 1280x720 (16:9)
  • Supported Image Formats: JPEG, PNG, WebP
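
Since only JPEG, PNG, and WebP portraits are accepted, it can be worth validating the file before connecting. A minimal client-side sketch (the helper name and error handling are illustrative, not part of the SDK):

// Illustrative pre-flight check for the portrait file; not part of the SDK
const SUPPORTED_IMAGE_TYPES = ["image/jpeg", "image/png", "image/webp"];

function assertSupportedImage(file) {
  if (!SUPPORTED_IMAGE_TYPES.includes(file.type)) {
    throw new Error(`Unsupported image format: ${file.type || "unknown"}`);
  }
}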

Realtime API

The realtime API uses WebRTC for low-latency avatar animation. Clients establish a connection with a portrait image, then send audio and receive an animated video stream in return. The connection stays open until you disconnect, so you can send multiple audio clips and change avatar behavior prompts on the fly.

Installation

npm install @decartai/sdk

Basic Usage

import { createDecartClient, models } from "@decartai/sdk";

const model = models.realtime("live_avatar");

// Create a client
const client = createDecartClient({
  apiKey: "your-api-key-here"
});

// Load avatar image (can be a Blob, File, or URL string)
const imageResponse = await fetch("/path/to/portrait.jpg");
const avatarImage = await imageResponse.blob();

// Connect with avatar image and initial prompt
const realtimeClient = await client.realtime.connect(null, {
  model,
  avatar: { avatarImage },
  initialState: {
    prompt: { text: "Smile warmly and nod occasionally", enhance: true }
  },
  onRemoteStream: (animatedStream) => {
    // Display the animated avatar
    const videoElement = document.querySelector("#avatar-output");
    videoElement.srcObject = animatedStream;
  }
});

// Play audio to animate the avatar
const audioFile = document.querySelector("input[type=file]").files[0];
await realtimeClient.playAudio(audioFile);

// Disconnect when done
realtimeClient.disconnect();
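
For the stream to start playing without an explicit play() call, give the target <video> element the autoplay and playsinline attributes (the vanilla example later on this page uses the same setup):

<video id="avatar-output" autoplay playsinline></video>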

Advanced Features

Avatar Image Options

You can provide the avatar image in multiple formats:
// Option 1: Blob from fetch
const response = await fetch("/path/to/portrait.jpg");
const avatarImage = await response.blob();

// Option 2: File from input
const fileInput = document.querySelector("input[type=file]");
const avatarImage = fileInput.files[0];

// Option 3: URL string (SDK will fetch automatically)
const avatarImage = "https://example.com/portrait.jpg";

const realtimeClient = await client.realtime.connect(null, {
  model,
  avatar: { avatarImage },
  onRemoteStream: (stream) => {
    videoElement.srcObject = stream;
  }
});

Audio Input Methods

Send audio to animate the avatar using file uploads or microphone recordings:
// Option 1: Audio file upload
const audioFile = document.querySelector("input[type=file]").files[0];
await realtimeClient.playAudio(audioFile);

// Option 2: Microphone recording
const micStream = await navigator.mediaDevices.getUserMedia({
  video: false,
  audio: {
    echoCancellation: true,
    noiseSuppression: true,
    sampleRate: 16000
  }
});

const mediaRecorder = new MediaRecorder(micStream);
const chunks = [];

mediaRecorder.ondataavailable = (e) => chunks.push(e.data);
mediaRecorder.onstop = async () => {
  // Use the recorder's actual container type (typically audio/webm); MediaRecorder does not output WAV
  const audioBlob = new Blob(chunks, { type: mediaRecorder.mimeType });
  await realtimeClient.playAudio(audioBlob);
};

// Start and stop recording
mediaRecorder.start();
setTimeout(() => mediaRecorder.stop(), 5000); // Record for 5 seconds

// Option 3: ArrayBuffer
const audioBuffer = await audioFile.arrayBuffer();
await realtimeClient.playAudio(audioBuffer);
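
Since playAudio returns a promise that resolves when the clip finishes playing (see the API Reference below), you can queue several clips on the same connection by awaiting each in turn; the clip variables here are placeholders for any Blob, File, or ArrayBuffer:

// Play clips back to back on one connection; each await waits for playback to finish
for (const clip of [introClip, answerClip, outroClip]) {
  await realtimeClient.playAudio(clip);
}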

Prompt Management

Control avatar behavior and expressions with prompts. You can set the prompt at connection time or change it after connection:
// Option 1: Set prompt at connection time (recommended)
const realtimeClient = await client.realtime.connect(null, {
  model,
  avatar: { avatarImage },
  initialState: {
    prompt: { text: "Speak enthusiastically with hand gestures", enhance: true }
  },
  onRemoteStream: (stream) => {
    videoElement.srcObject = stream;
  }
});

// Option 2: Change prompt after connection
await realtimeClient.setPrompt("Be friendly and nod occasionally", { enhance: true });
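
The examples pass enhance: true, which presumably lets the service expand a short prompt into a fuller behavioral description. If that reading is right, passing enhance: false should keep your text verbatim; treat this as an assumption, since the default behavior is not documented here:

// Assumption: enhance: false uses the prompt text as-is rather than expanding it
await realtimeClient.setPrompt("Tilt head slightly and blink naturally", { enhance: false });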

Connection State Management

import { createDecartClient, type DecartSDKError } from "@decartai/sdk";

const realtimeClient = await client.realtime.connect(null, {
  model,
  avatar: { avatarImage },
  onRemoteStream: (stream) => {
    videoElement.srcObject = stream;
  }
});

// Monitor connection state
realtimeClient.on("connectionChange", (state) => {
  console.log(`Connection: ${state}`); // "connecting" | "connected" | "disconnected"

  if (state === "connected") {
    document.getElementById("status").textContent = "Ready";
  } else if (state === "disconnected") {
    document.getElementById("status").textContent = "Disconnected";
  }
});

// Handle errors
realtimeClient.on("error", (error: DecartSDKError) => {
  console.error("SDK error:", error.code, error.message);
});

// Check connection synchronously
const isConnected = realtimeClient.isConnected();
const currentState = realtimeClient.getConnectionState();
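
The SDK is not documented to reconnect on its own, so for resilience you can rebuild the session when the state drops to "disconnected". A minimal sketch using only the calls shown above; connectOptions stands in for the same options object passed to connect(), and the backoff delay is arbitrary:

// Manual reconnect sketch; `connectOptions` is the options object used with connect() above
async function connectWithRetry() {
  const rc = await client.realtime.connect(null, connectOptions);
  rc.on("connectionChange", async (state) => {
    if (state === "disconnected") {
      await new Promise((r) => setTimeout(r, 2000)); // arbitrary fixed backoff
      liveClient = await connectWithRetry();
    }
  });
  return rc;
}

let liveClient = await connectWithRetry();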

Complete Example with Error Handling

import { createDecartClient, models, type DecartSDKError } from "@decartai/sdk";

async function setupAvatarLive() {
  try {
    const model = models.realtime("live_avatar");

    const client = createDecartClient({
      apiKey: process.env.DECART_API_KEY
    });

    // Load avatar image
    const avatarImage = await fetch("/portraits/avatar.jpg").then(r => r.blob());

    const realtimeClient = await client.realtime.connect(null, {
      model,
      avatar: { avatarImage },
      onRemoteStream: (stream) => {
        const videoElement = document.getElementById("avatar-video");
        videoElement.srcObject = stream;
      }
    });

    // Set up event handlers (updateUIStatus and showErrorMessage are your app's UI helpers)
    realtimeClient.on("connectionChange", (state) => {
      updateUIStatus(state);
    });

    realtimeClient.on("error", (error) => {
      console.error("Error:", error);
      showErrorMessage(error.message);
    });

    // Audio file input handler
    document.getElementById("audio-input").addEventListener("change", async (e) => {
      const file = e.target.files[0];
      if (file) {
        await realtimeClient.playAudio(file);
      }
    });

    // Prompt input handler
    document.getElementById("prompt-input").addEventListener("change", async (e) => {
      await realtimeClient.setPrompt(e.target.value);
    });

    // Cleanup on page unload
    window.addEventListener("beforeunload", () => {
      realtimeClient.disconnect();
    });

    return realtimeClient;
  } catch (error) {
    console.error("Setup failed:", error);
    throw error;
  }
}
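
A typical way to run the setup once the page's elements exist:

// Initialize once the DOM is ready
document.addEventListener("DOMContentLoaded", () => {
  setupAvatarLive().catch((error) => {
    console.error("Could not start avatar session:", error);
  });
});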

WebSocket + WebRTC (Vanilla)

For direct WebSocket communication without the SDK:
<!DOCTYPE html>
<html>
<head>
  <title>Decart Avatar Live - WebRTC</title>
</head>
<body>
  <video id="avatarVideo" autoplay playsinline width="640"></video>
  <br>
  <input type="file" id="imageInput" accept="image/jpeg,image/png,image/webp">
  <input type="file" id="audioInput" accept="audio/*">
  <button id="connectBtn">Connect</button>
  <input id="promptInput" placeholder="Enter avatar behavior...">
  <button id="promptBtn">Set Prompt</button>

  <script>
    let ws;
    let peerConnection;

    // Convert file to base64
    async function fileToBase64(file) {
      return new Promise((resolve, reject) => {
        const reader = new FileReader();
        reader.onload = () => resolve(reader.result.split(',')[1]);
        reader.onerror = reject;
        reader.readAsDataURL(file);
      });
    }

    // Connect with avatar image
    async function connect() {
      const imageFile = document.getElementById('imageInput').files[0];
      if (!imageFile) {
        alert('Please select an avatar image first');
        return;
      }

      // Connect to WebSocket
      ws = new WebSocket('wss://api3.decart.ai/v1/live_avatar/stream?api_key=YOUR_API_KEY');

      ws.onopen = async () => {
        // Send avatar image first
        const imageBase64 = await fileToBase64(imageFile);
        ws.send(JSON.stringify({
          type: 'set_image',
          image_data: imageBase64
        }));
      };

      ws.onmessage = async (event) => {
        const message = JSON.parse(event.data);

        if (message.type === 'set_image_ack') {
          // Image accepted, now set up WebRTC
          await setupWebRTC();
        } else if (message.type === 'answer' && peerConnection) {
          await peerConnection.setRemoteDescription({
            type: 'answer',
            sdp: message.sdp
          });
        }
      };
    }

    async function setupWebRTC() {
      peerConnection = new RTCPeerConnection({
        iceServers: [{ urls: 'stun:stun.l.google.com:19302' }]
      });

      // Send ICE candidates
      peerConnection.onicecandidate = (event) => {
        if (event.candidate) {
          ws.send(JSON.stringify({
            type: 'ice-candidate',
            candidate: event.candidate
          }));
        }
      };

      // Receive animated video stream
      peerConnection.ontrack = (event) => {
        document.getElementById('avatarVideo').srcObject = event.streams[0];
      };

      // Add receive-only video transceiver
      peerConnection.addTransceiver('video', { direction: 'recvonly' });

      // Create silent audio stream for connection
      const audioContext = new AudioContext({ sampleRate: 16000 });
      const oscillator = audioContext.createOscillator();
      const gain = audioContext.createGain();
      const destination = audioContext.createMediaStreamDestination();

      gain.gain.value = 0;
      oscillator.connect(gain);
      gain.connect(destination);
      oscillator.start();

      destination.stream.getTracks().forEach(track => {
        peerConnection.addTrack(track, destination.stream);
      });

      // Create and send offer
      const offer = await peerConnection.createOffer();
      await peerConnection.setLocalDescription(offer);

      ws.send(JSON.stringify({
        type: 'offer',
        sdp: offer.sdp
      }));
    }

    // Send audio to animate avatar
    async function sendAudio() {
      const audioFile = document.getElementById('audioInput').files[0];
      if (!audioFile || !ws || ws.readyState !== WebSocket.OPEN) return;

      const audioBase64 = await fileToBase64(audioFile);
      ws.send(JSON.stringify({
        type: 'audio',
        audio_data: audioBase64
      }));
    }

    // Send behavior prompt
    function sendPrompt() {
      const prompt = document.getElementById('promptInput').value.trim();
      if (prompt && ws && ws.readyState === WebSocket.OPEN) {
        ws.send(JSON.stringify({
          type: 'prompt',
          prompt: prompt
        }));
      }
    }

    // Event listeners
    document.getElementById('connectBtn').onclick = connect;
    document.getElementById('audioInput').onchange = sendAudio;
    document.getElementById('promptBtn').onclick = sendPrompt;
  </script>
</body>
</html>
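
For reference, the example above exchanges these WebSocket message types:
  • set_image / set_image_ack: upload the portrait and confirm it was accepted
  • offer / answer: WebRTC SDP negotiation
  • ice-candidate: ICE candidates from the client
  • audio: base64-encoded audio to animate the avatar
  • prompt: avatar behavior text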

API Reference

Connection Options

  • model (Model): The Avatar Live model from models.realtime("live_avatar")
  • avatar ({ avatarImage: Blob | File | string }): Avatar options with the portrait image (Blob, File, or URL string)
  • initialState ({ prompt?: { text: string, enhance?: boolean } }): Optional initial state, including the avatar behavior prompt
  • onRemoteStream ((stream: MediaStream) => void): Callback invoked when the animated video stream is ready

Methods

  • playAudio(audio: Blob | File | ArrayBuffer): Sends audio to animate the avatar; returns a Promise that resolves when playback finishes
  • setPrompt(prompt: string, options?: { enhance?: boolean }): Sets the avatar behavior prompt
  • isConnected(): Returns a boolean connection status
  • getConnectionState(): Returns "connecting" | "connected" | "disconnected"
  • disconnect(): Closes the connection

Events

  • connectionChange ("connecting" | "connected" | "disconnected"): Fired when the connection state changes
  • error (DecartSDKError): Fired when an error occurs

Next Steps