# Realtime API

> Transform video streams in realtime with WebRTC on Android

The Realtime API transforms live video streams with minimal latency over WebRTC. It is well suited to Android camera effects, video restyling apps, AR experiences, and interactive live streaming.

## Quick Start

```kotlin theme={null}
import ai.decart.sdk.DecartClient
import ai.decart.sdk.DecartClientConfig
import ai.decart.sdk.RealtimeModels
import ai.decart.sdk.realtime.ConnectOptions
import ai.decart.sdk.realtime.FacingMode
import ai.decart.sdk.realtime.InitialPrompt
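import ai.decart.sdk.realtime.MirrorMode
import org.webrtc.EglBase

// Use an ephemeral key in production (see Client-Side Authentication)
val client = DecartClient(context, DecartClientConfig(apiKey = "your-api-key"))
val realtime = client.realtime

// Shared EGL context; also use it to initialize remoteRenderer (a SurfaceViewRenderer)
val eglBase = EglBase.create()
realtime.initialize(eglBase)

val cameraTrack = realtime.createCameraVideoTrack(
    facing = FacingMode.FRONT,
    mirror = MirrorMode.AUTO,
)

// connect() is a suspend function: call it from a coroutine
realtime.connect(
    localVideoTrack = cameraTrack.track,
    localAudioTrack = null,
    options = ConnectOptions(
        model = RealtimeModels.LUCY_RESTYLE_2,
        onRemoteVideoTrack = { track ->
            // Render the transformed video
            track.addSink(remoteRenderer)
        },
        initialPrompt = InitialPrompt("a cyberpunk cityscape")
    )
)

// Change the style without reconnecting
realtime.setPrompt("Anime world", enhance = true)

// Clean up when done: camera first, then the session, then the client
cameraTrack.stop()
realtime.disconnect()
client.release()
eglBase.release()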
```

## Client-Side Authentication

For production Android apps, use ephemeral keys instead of embedding your permanent API key in the APK. Ephemeral keys are short-lived tokens safe to include in client applications.

<Info>
  Learn more about [client tokens](/getting-started/client-tokens) and why they're important for security.
</Info>

### Fetching an Ephemeral Key

Your app should fetch an ephemeral key from your backend server before connecting:

```kotlin theme={null}
import java.io.IOException
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext
import okhttp3.OkHttpClient
import okhttp3.Request
import okhttp3.RequestBody.Companion.toRequestBody
import org.json.JSONObject

// Reuse one OkHttpClient; it owns the connection pool
private val httpClient = OkHttpClient()

suspend fun fetchEphemeralKey(): String = withContext(Dispatchers.IO) {
    val request = Request.Builder()
        .url("https://your-backend.com/api/realtime-token")
        .post("".toRequestBody())
        // Add any auth headers your backend requires
        // .addHeader("Authorization", "Bearer $userToken")
        .build()

    // use {} closes the response even if parsing throws
    httpClient.newCall(request).execute().use { response ->
        if (!response.isSuccessful) {
            throw IOException("Failed to fetch token: ${response.code}")
        }

        val body = response.body?.string()
            ?: throw IOException("Empty response body")
        // The "apiKey" field name depends on your backend's response format
        JSONObject(body).getString("apiKey")
    }
}
```

### Connecting with an Ephemeral Key

```kotlin theme={null}
// fetchEphemeralKey() is a suspend function: call it from a coroutine
val ephemeralKey = fetchEphemeralKey()

val client = DecartClient(
    context = applicationContext,
    config = DecartClientConfig(apiKey = ephemeralKey)
)

// Connect as usual
client.realtime.connect(...)
```

<Warning>
  Never hardcode your permanent API key in Android apps. APKs can be decompiled, exposing embedded secrets. Always use ephemeral keys from your backend.
</Warning>

## Connecting

### Initializing WebRTC

To set up WebRTC explicitly, initialize the peer connection factory before connecting. This configures hardware-accelerated video encoding and decoding:

```kotlin theme={null}
import org.webrtc.EglBase

// Create an EGL context (shared with SurfaceViewRenderers)
val eglBase = EglBase.create()

// Initialize the SDK's WebRTC internals
client.realtime.initialize(eglBase)
```

<Tip>
  `initialize()` is optional — if you skip it, `connect()` calls it automatically with a default `EglBase`. Call it explicitly when you need access to the EGL context for setting up `SurfaceViewRenderer` instances.
</Tip>

### Creating a Camera Track

Use `createCameraVideoTrack` for a one-call pipeline. Pass the model's dimensions:

```kotlin theme={null}
import ai.decart.sdk.realtime.FacingMode
import ai.decart.sdk.realtime.MirrorMode

val model = RealtimeModels.LUCY_RESTYLE_2

val cameraTrack = client.realtime.createCameraVideoTrack(
    facing = FacingMode.FRONT,
    mirror = MirrorMode.AUTO,
    width = model.width,
    height = model.height,
    fps = model.fps,
)
```

The returned `CameraVideoTrack` exposes `track`, `source`, `capturer`, and `surfaceTextureHelper`; `stop()` releases all of them in order.

#### Manual capture pipeline

If you already manage your own camera, use the lower-level helpers and attach `MirrorVideoProcessor` to mirror the input:

```kotlin theme={null}
import ai.decart.sdk.realtime.MirrorVideoProcessor
import org.webrtc.*

val videoSource = client.realtime.createVideoSource(isScreencast = false)!!
videoSource.setVideoProcessor(MirrorVideoProcessor()) // optional, for selfie
val videoTrack = client.realtime.createVideoTrack("camera", videoSource)!!

val enumerator = Camera2Enumerator(context)
val cameraName = enumerator.deviceNames.first { enumerator.isFrontFacing(it) }
val capturer = enumerator.createCapturer(cameraName, null)

capturer.initialize(
    SurfaceTextureHelper.create("CaptureThread", client.realtime.getEglBaseContext()),
    context,
    videoSource.capturerObserver
)
// `model` is the RealtimeModel you connect with, e.g. RealtimeModels.LUCY_RESTYLE_2
capturer.startCapture(model.width, model.height, model.fps)
```

<Warning>Camera capture requires a **real Android device**. The emulator does not support WebRTC camera features.</Warning>

### Front-camera mirroring

Mirroring flips the camera frames horizontally before they are sent. Because the flip is applied to the input, server-baked pixels (watermarks, overlays) stay correctly oriented when you render the remote stream as-is. Do **not** call `setMirror(true)` on the remote `SurfaceViewRenderer`: that would flip the baked pixels and render them backwards.

`MirrorMode` values:

* `OFF` — never mirror.
* `ON` — always mirror.
* `AUTO` — mirror when `facing == FacingMode.FRONT`.
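
The three values reduce to one yes/no decision about flipping the input. The sketch below restates the list above as a pure function. It is an illustration, not the SDK's implementation: `shouldMirror` is a hypothetical helper, and the two enums are local stand-ins so the snippet compiles on its own (in an app you would import the SDK's `FacingMode` and `MirrorMode` instead).

```kotlin
// Local stand-ins for ai.decart.sdk.realtime.FacingMode / MirrorMode,
// declared here only so the sketch is self-contained.
enum class FacingMode { FRONT, BACK }
enum class MirrorMode { OFF, ON, AUTO }

/** Hypothetical helper: true when captured frames should be flipped horizontally. */
fun shouldMirror(mode: MirrorMode, facing: FacingMode): Boolean = when (mode) {
    MirrorMode.OFF -> false                          // never mirror
    MirrorMode.ON -> true                            // always mirror
    MirrorMode.AUTO -> facing == FacingMode.FRONT    // mirror selfie input only
}
```

For example, `shouldMirror(MirrorMode.AUTO, FacingMode.BACK)` is `false`, so a rear-camera feed is sent unflipped.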

### Establishing Connection

Connect to the Realtime API with your camera track:

```kotlin theme={null}
import android.util.Base64

// characterBytes: the raw bytes of your reference image
val characterBase64 = Base64.encodeToString(characterBytes, Base64.NO_WRAP)

client.realtime.connect(
    localVideoTrack = videoTrack,  // or cameraTrack.track from createCameraVideoTrack
    localAudioTrack = null,  // or a microphone AudioTrack
    options = ConnectOptions(
        model = RealtimeModels.LUCY_2,
        onRemoteVideoTrack = { track ->
            // Display the transformed video
            track.addSink(remoteRenderer)
        },
        onRemoteAudioTrack = { track ->
            // Handle remote audio (optional)
        },
        initialPrompt = InitialPrompt(
            text = "Substitute the character in the video with the person in the reference image.",
            enhance = true  // Let Decart enhance the prompt (recommended)
        ),
        initialImage = characterBase64
    )
)
```

**Parameters:**

* `localVideoTrack` (optional) - WebRTC `VideoTrack` from camera (null for Live Avatar or subscribe mode)
* `localAudioTrack` (optional) - WebRTC `AudioTrack` from microphone
* `options.model` (required) - Realtime model from `RealtimeModels`
* `options.onRemoteVideoTrack` (required) - Callback that receives the transformed video track
* `options.onRemoteAudioTrack` (optional) - Callback for remote audio track
* `options.initialPrompt` (optional) - Starting prompt
  * `text` - Style description
  * `enhance` - Whether to auto-enhance the prompt (default: `true`)
* `options.initialImage` (optional) - Base64-encoded reference image

<Tip>
  Set `initialImage` and/or `initialPrompt` so the first frame is already transformed — otherwise viewers briefly see the raw camera feed.
</Tip>

## Managing Prompts

Change the transformation style dynamically without reconnecting:

```kotlin theme={null}
// Simple prompt with automatic enhancement
client.realtime.setPrompt("Anime style")

// Detailed prompt without enhancement
client.realtime.setPrompt(
    "A detailed artistic style with vibrant colors and dramatic lighting",
    enhance = false
)
```

**Parameters:**

* `prompt: String` (required) - Text description of desired style
* `enhance: Boolean` (optional) - Whether to enhance the prompt (default: `true`)

<Note>Prompt enhancement uses Decart's AI to expand simple prompts for better results. Disable it if you want full control over the exact prompt.</Note>

## Reference Images

Send a reference image (and optionally a prompt) for image-guided models:

```kotlin theme={null}
// Set image with prompt
client.realtime.setImage(
    imageBase64 = base64EncodedImage,
    prompt = "Transform into this character",
    enhance = true,
    timeout = 30_000L  // 30 second ack timeout
)

// Clear image
client.realtime.setImage(imageBase64 = null)
```

**Parameters:**

* `imageBase64: String?` (required) - Base64-encoded image, or `null` to clear
* `prompt: String?` (optional) - Text prompt to send with the image
* `enhance: Boolean?` (optional) - Whether to enhance the prompt
* `timeout: Long` (optional) - Timeout in milliseconds for the server ack (default: 30,000)
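
Both `setImage` and `initialImage` take the image as a Base64 string. Below is a minimal sketch of the encoding step, assuming you already hold the image file's bytes. `encodeReferenceImage` is a hypothetical helper, not an SDK function. It uses `java.util.Base64`, which on Android requires API level 26 or higher; on older devices, `android.util.Base64.encodeToString(bytes, Base64.NO_WRAP)` is the counterpart.

```kotlin
import java.util.Base64

/** Hypothetical helper: encodes raw image bytes as a single-line Base64 string. */
fun encodeReferenceImage(imageBytes: ByteArray): String {
    require(imageBytes.isNotEmpty()) { "Reference image is empty" }
    // The basic encoder emits no line breaks, like android.util.Base64.NO_WRAP.
    return Base64.getEncoder().encodeToString(imageBytes)
}
```

The result can then be passed as `imageBase64` to `setImage` or as `initialImage` in `ConnectOptions`.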

## Connection State

Monitor and react to connection state changes using Kotlin Flow:

```kotlin theme={null}
import ai.decart.sdk.ConnectionState

// Observe state changes
lifecycleScope.launch {
    client.realtime.connectionState.collect { state ->
        when (state) {
            ConnectionState.DISCONNECTED -> showReconnectButton()
            ConnectionState.CONNECTING -> showLoadingIndicator()
            ConnectionState.CONNECTED -> hideLoadingIndicator()
            ConnectionState.GENERATING -> showGeneratingUI()
            ConnectionState.RECONNECTING -> showReconnectingBanner()
        }
    }
}

// Check state synchronously
val isConnected = client.realtime.isConnected()
```

**Connection States:**

* `DISCONNECTED` - Not connected (initial state, after `disconnect()`, or after reconnect failure)
* `CONNECTING` - Initial connection in progress
* `CONNECTED` - Connected and ready to send prompts
* `GENERATING` - Actively generating transformed video (sticky until disconnected)
* `RECONNECTING` - Connection lost unexpectedly; the SDK is automatically retrying

<Info>
  The SDK automatically reconnects when an unexpected disconnection occurs (e.g., network interruption). During auto-reconnect, the state transitions to `RECONNECTING` while the SDK retries. If all retries fail, the state moves to `DISCONNECTED` and an error is emitted.
</Info>
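
One way to keep UI logic in step with these states is to derive every flag from the state in a single place. The sketch below is an assumption about how an app might do that, not SDK behavior: `ConnectionUi` and `toUi` are hypothetical names, and the enum is a local stand-in for `ai.decart.sdk.ConnectionState` so the snippet compiles on its own.

```kotlin
// Local stand-in mirroring the documented states.
enum class ConnectionState { DISCONNECTED, CONNECTING, CONNECTED, GENERATING, RECONNECTING }

// Hypothetical UI model, not part of the SDK.
data class ConnectionUi(
    val showSpinner: Boolean,         // a connection attempt is in flight
    val canSendPrompt: Boolean,       // safe to call setPrompt / setImage
    val canStartConnection: Boolean,  // safe to offer a "Connect" button
)

fun ConnectionState.toUi(): ConnectionUi = when (this) {
    ConnectionState.DISCONNECTED ->
        ConnectionUi(showSpinner = false, canSendPrompt = false, canStartConnection = true)
    ConnectionState.CONNECTING, ConnectionState.RECONNECTING ->
        ConnectionUi(showSpinner = true, canSendPrompt = false, canStartConnection = false)
    ConnectionState.CONNECTED, ConnectionState.GENERATING ->
        ConnectionUi(showSpinner = false, canSendPrompt = true, canStartConnection = false)
}
```

Treating `CONNECTING` and `RECONNECTING` as "busy" keeps users from starting a second connection while the SDK is still working on the first.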

## Error Handling

Handle errors using the `errors` SharedFlow:

```kotlin theme={null}
import ai.decart.sdk.DecartError
import ai.decart.sdk.ErrorCodes

lifecycleScope.launch {
    client.realtime.errors.collect { error ->
        when (error.code) {
            ErrorCodes.INVALID_API_KEY ->
                showError("Invalid API key. Check your credentials.")
            ErrorCodes.WEBRTC_TIMEOUT_ERROR ->
                showError("Connection timed out. Check your network.")
            ErrorCodes.WEBRTC_ICE_ERROR ->
                showError("ICE negotiation failed.")
            ErrorCodes.WEBRTC_WEBSOCKET_ERROR ->
                showError("WebSocket connection error.")
            ErrorCodes.WEBRTC_SERVER_ERROR ->
                showError("Server error. Try again later.")
            ErrorCodes.WEBRTC_SIGNALING_ERROR ->
                showError("Signaling error.")
            else ->
                showError("Error: ${error.message}")
        }
    }
}
```

**Error Codes:**

* `INVALID_API_KEY` - API key is invalid or missing
* `WEBRTC_TIMEOUT_ERROR` - Connection timed out
* `WEBRTC_ICE_ERROR` - ICE negotiation failed
* `WEBRTC_WEBSOCKET_ERROR` - WebSocket connection error
* `WEBRTC_SERVER_ERROR` - Server-side error
* `WEBRTC_SIGNALING_ERROR` - Signaling protocol error

## WebRTC Stats

Monitor real-time performance metrics:

```kotlin theme={null}
lifecycleScope.launch {
    client.realtime.stats.collect { stats ->
        stats.video?.let { video ->
            println("FPS: ${video.framesPerSecond}")
            println("Resolution: ${video.frameWidth}x${video.frameHeight}")
            println("Bitrate: ${video.bitrate} bps")
            println("Packets lost: ${video.packetsLost}")
            println("Jitter: ${video.jitter}")
        }
        stats.connection.let { conn ->
            println("RTT: ${conn.currentRoundTripTime}s")
        }
    }
}
```

**Available stats:**

* `video` - Inbound video: FPS, resolution, bitrate, packets lost, jitter, freeze count
* `audio` - Inbound audio: bitrate, packets lost, jitter
* `outboundVideo` - Outbound video: FPS, resolution, bitrate, quality limitation reason
* `connection` - Round-trip time, available outgoing bitrate
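
Individual samples such as FPS can jump between emissions, so a display often reads better with a short moving average. The sketch below is a generic smoothing helper, not part of the SDK: `RollingAverage` is a hypothetical class, and it assumes the stat you feed it can be converted to `Double`.

```kotlin
/** Hypothetical helper: averages the most recent `windowSize` samples. */
class RollingAverage(private val windowSize: Int) {
    init {
        require(windowSize > 0) { "windowSize must be positive" }
    }

    private val samples = ArrayDeque<Double>()

    /** Adds a sample, drops the oldest one if the window is full, and returns the new average. */
    fun add(sample: Double): Double {
        samples.addLast(sample)
        if (samples.size > windowSize) samples.removeFirst()
        return samples.average()
    }
}
```

Inside the `stats` collector you could keep one instance per metric and call `add(...)` on each emission, showing the returned value instead of the raw sample.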

## Generation Ticks

Track session duration for billing and usage display:

```kotlin theme={null}
lifecycleScope.launch {
    client.realtime.generationTicks.collect { tick ->
        println("Generation running for ${tick.seconds} seconds")
        updateBillingUI(tick.seconds)
    }
}
```
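
For a usage display, the elapsed seconds usually need formatting. A minimal sketch, assuming the tick's `seconds` value is (or can be converted to) a whole number of seconds; `formatElapsed` is a hypothetical helper, not an SDK function.

```kotlin
/** Hypothetical helper: formats a non-negative second count as MM:SS. */
fun formatElapsed(totalSeconds: Long): String {
    require(totalSeconds >= 0) { "totalSeconds must not be negative" }
    val minutes = (totalSeconds / 60).toString().padStart(2, '0')
    val seconds = (totalSeconds % 60).toString().padStart(2, '0')
    // Minutes are not capped at 59, so long sessions render as e.g. 75:03.
    return "$minutes:$seconds"
}
```

For example, `formatElapsed(75)` returns `"01:15"`.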

## Diagnostics

Monitor detailed connection diagnostics for debugging:

```kotlin theme={null}
lifecycleScope.launch {
    client.realtime.diagnostics.collect { event ->
        when (event) {
            is DiagnosticEvent.PhaseTiming ->
                println("${event.data.phase}: ${event.data.durationMs}ms")
            is DiagnosticEvent.Reconnect ->
                println("Reconnect attempt ${event.data.attempt}/${event.data.maxAttempts}")
            is DiagnosticEvent.VideoStall ->
                println("Video stall: ${if (event.data.stalled) "detected" else "recovered"}")
            else -> { /* other diagnostic events */ }
        }
    }
}
```

## Session Management

Access the current session ID for logging, analytics, or debugging:

```kotlin theme={null}
val sessionId = client.realtime.sessionId
println("Current session: $sessionId")
```

## Cleanup

Always disconnect and release resources when done:

```kotlin theme={null}
cameraTrack.stop() // or capturer.stopCapture() for a manual pipeline
client.realtime.disconnect()
client.release()
eglBase.release()
```

<Warning>Call `cameraTrack.stop()` before `client.release()` — the camera resources are tied to the client's `PeerConnectionFactory`.</Warning>
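
If one cleanup call throws, the calls after it are skipped and native resources may stay allocated. The sketch below shows one way to run every step in order regardless of failures. It is a hypothetical helper (`CleanupStep` and `runCleanup` are not SDK names), and whether the SDK's cleanup methods can throw is not something this page states.

```kotlin
/** Hypothetical wrapper: a named cleanup action. */
class CleanupStep(val name: String, val action: () -> Unit)

/** Runs all steps in order and returns a description of each step that failed. */
fun runCleanup(steps: List<CleanupStep>): List<String> {
    val failures = mutableListOf<String>()
    for (step in steps) {
        try {
            step.action()
        } catch (e: Exception) {
            // Record the failure and keep going so later steps still run.
            failures += "${step.name} failed: ${e.message}"
        }
    }
    return failures
}

// Usage with the calls from this section (camera first, client last):
// runCleanup(listOf(
//     CleanupStep("camera") { cameraTrack.stop() },
//     CleanupStep("session") { client.realtime.disconnect() },
//     CleanupStep("client") { client.release() },
//     CleanupStep("egl") { eglBase.release() },
// ))
```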

## Complete Jetpack Compose Example

A full Jetpack Compose application with camera capture, video transformation, and prompt input:

```kotlin theme={null}
import ai.decart.sdk.*
import ai.decart.sdk.realtime.*
import android.Manifest
import androidx.activity.compose.rememberLauncherForActivityResult
import androidx.activity.result.contract.ActivityResultContracts
import androidx.compose.foundation.layout.*
import androidx.compose.material3.*
import androidx.compose.runtime.*
import androidx.compose.ui.Modifier
import androidx.compose.ui.platform.LocalContext
import androidx.compose.ui.unit.dp
import androidx.compose.ui.viewinterop.AndroidView
import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.flow.*
import kotlinx.coroutines.launch
import org.webrtc.*

class RealtimeViewModel : ViewModel() {
    private val _connectionState = MutableStateFlow(ConnectionState.DISCONNECTED)
    val connectionState = _connectionState.asStateFlow()

    private val _error = MutableStateFlow<String?>(null)
    val error = _error.asStateFlow()

    var promptText by mutableStateOf("Turn into a fantasy figure")

    private var client: DecartClient? = null
    private var cameraTrack: CameraVideoTrack? = null

    // Shared EGL context for WebRTC and SurfaceViewRenderer
    val eglBase: EglBase = EglBase.create()

    val remoteRenderer = mutableStateOf<SurfaceViewRenderer?>(null)

    fun connect(context: android.content.Context) {
        viewModelScope.launch {
            try {
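                val decartClient = DecartClient(
                    // Application context avoids holding on to the Activity
                    context = context.applicationContext,
                    config = DecartClientConfig(
                        // Use an ephemeral key from your backend in production
                        apiKey = "your-api-key"
                    )
                )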
                client = decartClient

                val realtime = decartClient.realtime
                realtime.initialize(eglBase)

                val model = RealtimeModels.LUCY_RESTYLE_2

                val track = realtime.createCameraVideoTrack(
                    facing = FacingMode.FRONT,
                    mirror = MirrorMode.AUTO,
                    width = model.width,
                    height = model.height,
                    fps = model.fps,
                )
                cameraTrack = track

                // Observe state
                viewModelScope.launch {
                    realtime.connectionState.collect { _connectionState.value = it }
                }
                viewModelScope.launch {
                    realtime.errors.collect { _error.value = it.message }
                }

                // Connect
                realtime.connect(
                    localVideoTrack = track.track,
                    localAudioTrack = null,
                    options = ConnectOptions(
                        model = model,
                        onRemoteVideoTrack = { remote ->
                            remoteRenderer.value?.let { renderer ->
                                remote.addSink(renderer)
                            }
                        },
                        initialPrompt = InitialPrompt(
                            text = promptText,
                            enhance = true
                        )
                    )
                )
            } catch (e: Exception) {
                _error.value = e.message
            }
        }
    }

    fun setPrompt() {
        try {
            client?.realtime?.setPrompt(promptText, enhance = true)
        } catch (e: Exception) {
            _error.value = e.message
        }
    }

    fun disconnect() {
        // Stop the camera before releasing the client
        cameraTrack?.stop()
        cameraTrack = null
        client?.realtime?.disconnect()
        client?.release()
        client = null
        _connectionState.value = ConnectionState.DISCONNECTED
    }

    override fun onCleared() {
        disconnect()
        eglBase.release()
        super.onCleared()
    }
}

@Composable
fun RealtimeScreen(viewModel: RealtimeViewModel) {
    val context = LocalContext.current
    val connectionState by viewModel.connectionState.collectAsState()
    val error by viewModel.error.collectAsState()
    val isConnected = connectionState == ConnectionState.CONNECTED
        || connectionState == ConnectionState.GENERATING

    val permissionLauncher = rememberLauncherForActivityResult(
        ActivityResultContracts.RequestPermission()
    ) { granted ->
        if (granted) viewModel.connect(context)
    }

    Column(
        modifier = Modifier.fillMaxSize().padding(16.dp)
    ) {
        // Remote video
        AndroidView(
            factory = { ctx ->
                SurfaceViewRenderer(ctx).also { renderer ->
                    renderer.init(viewModel.eglBase.eglBaseContext, null)
                    renderer.setScalingType(RendererCommon.ScalingType.SCALE_ASPECT_FIT)
                    viewModel.remoteRenderer.value = renderer
                }
            },
            modifier = Modifier.weight(1f).fillMaxWidth()
        )

        Spacer(modifier = Modifier.height(8.dp))

        // Status
        Text("Status: $connectionState")

        error?.let {
            Text(it, color = MaterialTheme.colorScheme.error)
        }

        Spacer(modifier = Modifier.height(8.dp))

        // Prompt input
        Row(modifier = Modifier.fillMaxWidth()) {
            OutlinedTextField(
                value = viewModel.promptText,
                onValueChange = { viewModel.promptText = it },
                label = { Text("Style prompt") },
                modifier = Modifier.weight(1f)
            )
            Spacer(modifier = Modifier.width(8.dp))
            Button(
                onClick = { viewModel.setPrompt() },
                enabled = isConnected
            ) {
                Text("Send")
            }
        }

        Spacer(modifier = Modifier.height(8.dp))

        // Connect / Disconnect
        Button(
            onClick = {
                if (isConnected) {
                    viewModel.disconnect()
                } else {
                    permissionLauncher.launch(Manifest.permission.CAMERA)
                }
            },
            modifier = Modifier.fillMaxWidth()
        ) {
            Text(if (isConnected) "Disconnect" else "Connect")
        }
    }
}
```

## Best Practices

<AccordionGroup>
  <Accordion title="Use model properties for camera constraints">
    Always use the model's `fps`, `width`, and `height` properties when configuring camera capture to ensure optimal performance.

    ```kotlin theme={null}
    val model = RealtimeModels.LUCY_RESTYLE_2
    capturer.startCapture(model.width, model.height, model.fps)
    ```
  </Accordion>

  <Accordion title="Enable prompt enhancement">
    For best results, keep `enhance = true` (default) to let Decart's AI enhance your prompts. Only disable it if you need exact prompt control.
  </Accordion>

  <Accordion title="Collect flows in lifecycle-aware scopes">
    Always collect `connectionState`, `errors`, and other flows in `lifecycleScope` or `viewModelScope` to avoid leaks and ensure proper cancellation.
  </Accordion>

  <Accordion title="Clean up properly">
    Stop camera capture first (`cameraTrack.stop()`), then call `disconnect()` and `release()`. The camera resources are tied to the client's `PeerConnectionFactory`, and skipping `release()` can leak native WebRTC resources.
  </Accordion>

  <Accordion title="Test on real devices">
    Always test camera features on real Android devices. The emulator does not support WebRTC camera access.
  </Accordion>

  <Accordion title="Request permissions properly">
    Request the `CAMERA` permission at runtime before attempting to connect, and `RECORD_AUDIO` as well if you send a microphone track. Handle permission denials gracefully in your UI.
  </Accordion>
</AccordionGroup>

## API Reference

### `realtime.initialize(eglBase?)`

Initialize the WebRTC peer connection factory. Optional - called automatically by `connect()` if not called explicitly.

**Parameters:**

* `eglBase: EglBase?` (optional) - EGL context for hardware-accelerated video. A default is created if omitted.

### `realtime.connect(localVideoTrack, localAudioTrack, options)`

Connect to a realtime model and start streaming. Suspend function.

**Parameters:**

* `localVideoTrack: VideoTrack?` - Camera video track (null for Live Avatar or subscribe mode)
* `localAudioTrack: AudioTrack?` - Microphone audio track (null to omit)
* `options: ConnectOptions` - Connection configuration
  * `model: RealtimeModel` - Model from `RealtimeModels`
  * `onRemoteVideoTrack: (VideoTrack) -> Unit` - Callback for transformed video
  * `onRemoteAudioTrack: ((AudioTrack) -> Unit)?` - Callback for remote audio
  * `initialPrompt: InitialPrompt?` - Starting prompt (text + enhance)
  * `initialImage: String?` - Base64-encoded reference image

**Throws:** `IllegalStateException` if factory initialization fails

### `realtime.setPrompt(prompt, enhance)`

Change the transformation style.

**Parameters:**

* `prompt: String` - Text description of desired style
* `enhance: Boolean` - Whether to enhance the prompt (default: `true`)

**Throws:** `IllegalStateException` if not connected

### `realtime.setImage(imageBase64, prompt, enhance, timeout)`

Send a reference image. Suspend function.

**Parameters:**

* `imageBase64: String?` - Base64-encoded image, or null to clear
* `prompt: String?` - Optional text prompt
* `enhance: Boolean?` - Whether to enhance the prompt
* `timeout: Long` - Ack timeout in ms (default: 30,000)

**Throws:** `IllegalStateException` if not connected

### `realtime.disconnect()`

End the current session and clean up WebRTC resources.

### `realtime.release()`

Release all resources including the peer connection factory and EGL context. Call when done with the client entirely.

### `realtime.isConnected()`

**Returns:** `Boolean` - Whether currently connected.

### `realtime.isPlayAudioAvailable()`

**Returns:** `Boolean` - Whether `playAudio()` is available (Live Avatar mode).

### `realtime.getEglBaseContext()`

**Returns:** `EglBase.Context?` - EGL context for initializing `SurfaceViewRenderer`.

### `realtime.createVideoSource(isScreencast)`

**Returns:** `VideoSource?` - A new video source for camera capture.

### `realtime.createVideoTrack(id, source)`

**Returns:** `VideoTrack?` - A new video track from a video source.

### `realtime.createCameraVideoTrack(facing, mirror, width, height, fps, trackId)`

One-call camera setup. Returns a `CameraVideoTrack` with `track`, `source`, `capturer`, `surfaceTextureHelper`, and `stop()`.

**Parameters:**

* `facing: FacingMode` - `FRONT` (default) or `BACK`.
* `mirror: MirrorMode` - `AUTO` (default), `OFF`, or `ON`.
* `width: Int`, `height: Int`, `fps: Int` - capture parameters (defaults: `1280`, `720`, `30`).
* `trackId: String` - track id (default: `"local_video"`).

Requires `initialize()` and `android.permission.CAMERA`. Throws `IllegalStateException` if no camera is available.

### `MirrorVideoProcessor`

Public `VideoProcessor` that horizontally flips frames. Attach via `source.setVideoProcessor(MirrorVideoProcessor())` when you manage your own capture pipeline.

### Observable State

| Property          | Type                                | Description                  |
| ----------------- | ----------------------------------- | ---------------------------- |
| `connectionState` | `StateFlow<ConnectionState>`        | Current connection state     |
| `errors`          | `SharedFlow<DecartError>`           | Error events                 |
| `stats`           | `SharedFlow<WebRTCStats>`           | WebRTC performance stats     |
| `generationTicks` | `SharedFlow<GenerationTickMessage>` | Billing/usage tick events    |
| `diagnostics`     | `SharedFlow<DiagnosticEvent>`       | Connection diagnostic events |
| `sessionId`       | `String?`                           | Current session ID           |

## Next Steps

<CardGroup cols={2}>
  <Card title="Queue API" icon="layer-group" href="/sdks/android-queue">
    Generate and transform videos with batch processing and Flow-based progress
  </Card>

  <Card title="SDK Overview" icon="android" href="/sdks/android">
    Installation, setup, and Android SDK fundamentals
  </Card>
</CardGroup>
