The Realtime API enables you to transform live video streams with minimal latency using WebRTC. It is well suited to Android camera effects, video restyling apps, AR experiences, and interactive live streaming.

Quick Start

import ai.decart.sdk.DecartClient
import ai.decart.sdk.DecartClientConfig
import ai.decart.sdk.RealtimeModels
import ai.decart.sdk.realtime.ConnectOptions
import ai.decart.sdk.realtime.InitialPrompt

val client = DecartClient(context, DecartClientConfig(apiKey = "your-api-key"))
val realtime = client.realtime

// Initialize WebRTC
realtime.initialize(eglBase)

// Connect with camera
realtime.connect(
    localVideoTrack = cameraTrack,
    localAudioTrack = null,
    options = ConnectOptions(
        model = RealtimeModels.MIRAGE_V2,
        onRemoteVideoTrack = { track ->
            remoteRenderer.addSink(track)
        },
        initialPrompt = InitialPrompt("a cyberpunk cityscape")
    )
)

// Change style on the fly
realtime.setPrompt("Anime world", enhance = true)

// Disconnect when done
realtime.disconnect()
client.release()

Client-Side Authentication

For production Android apps, use ephemeral keys instead of embedding your permanent API key in the APK. Ephemeral keys are short-lived tokens safe to include in client applications.
Learn more about client tokens and why they’re important for security.

Fetching an Ephemeral Key

Your app should fetch an ephemeral key from your backend server before connecting:
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext
import okhttp3.OkHttpClient
import okhttp3.Request
import okhttp3.RequestBody.Companion.toRequestBody
import org.json.JSONObject

suspend fun fetchEphemeralKey(): String = withContext(Dispatchers.IO) {
    val client = OkHttpClient()
    val request = Request.Builder()
        .url("https://your-backend.com/api/realtime-token")
        .post("".toRequestBody())
        // Add any auth headers your backend requires
        // .addHeader("Authorization", "Bearer $userToken")
        .build()

    // Use .use {} so the response body is always closed
    client.newCall(request).execute().use { response ->
        if (!response.isSuccessful) {
            throw Exception("Failed to fetch token: ${response.code}")
        }

        val body = response.body?.string()
            ?: throw Exception("Empty response body")
        JSONObject(body).getString("apiKey")
    }
}
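
Mobile network fetches fail transiently, so it is worth retrying the token request with backoff before surfacing an error. Below is a minimal blocking sketch of a generic retry helper; the attempt count and delays are illustrative choices, not SDK defaults, and in a coroutine context you would use kotlinx.coroutines `delay()` instead of `Thread.sleep()`:

```kotlin
// Retry a fallible block with exponential backoff. Generic: works for
// any operation, e.g. wrapping a token fetch run on a worker thread.
fun <T> retryWithBackoff(
    maxAttempts: Int = 3,
    initialDelayMs: Long = 500,
    block: () -> T
): T {
    var delayMs = initialDelayMs
    var lastError: Exception? = null
    repeat(maxAttempts) { attempt ->
        try {
            return block()
        } catch (e: Exception) {
            lastError = e
            if (attempt < maxAttempts - 1) {
                Thread.sleep(delayMs)
                delayMs *= 2  // double the wait between attempts
            }
        }
    }
    throw lastError ?: IllegalStateException("retry failed")
}
```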

Connecting with an Ephemeral Key

val ephemeralKey = fetchEphemeralKey()

val client = DecartClient(
    context = applicationContext,
    config = DecartClientConfig(apiKey = ephemeralKey)
)

// Connect as usual
client.realtime.connect(...)
Never hardcode your permanent API key in Android apps. APKs can be decompiled, exposing embedded secrets. Always use ephemeral keys from your backend.

Connecting

Initializing WebRTC

Before connecting, initialize the WebRTC peer connection factory. This sets up video encoding/decoding with hardware acceleration:
import org.webrtc.EglBase

// Create an EGL context (shared with SurfaceViewRenderers)
val eglBase = EglBase.create()

// Initialize the SDK's WebRTC internals
client.realtime.initialize(eglBase)
initialize() is optional — if you skip it, connect() calls it automatically with a default EglBase. Call it explicitly when you need access to the EGL context for setting up SurfaceViewRenderer instances.

Creating a Camera Track

Use the SDK’s helper methods to create a video track from the device camera:
import org.webrtc.*

val model = RealtimeModels.MIRAGE_V2

// Create video source and track via the SDK
val videoSource = client.realtime.createVideoSource(isScreencast = false)!!
val videoTrack = client.realtime.createVideoTrack("camera", videoSource)!!

// Set up camera capturer (prefer the front camera, fall back to any)
val enumerator = Camera2Enumerator(context)
val cameraName = enumerator.deviceNames.firstOrNull { enumerator.isFrontFacing(it) }
    ?: enumerator.deviceNames.first()
val capturer = enumerator.createCapturer(cameraName, null)

// Start capture using model dimensions for optimal performance
capturer.initialize(
    SurfaceTextureHelper.create("CaptureThread", client.realtime.getEglBaseContext()),
    context,
    videoSource.capturerObserver
)
capturer.startCapture(model.width, model.height, model.fps)
Use the model’s fps, width, and height properties when configuring camera capture to ensure optimal performance and compatibility.
Camera capture requires a real Android device. The emulator does not support WebRTC camera features.
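
Real devices rarely support the model's exact resolution; Camera2 exposes a discrete list of capture formats, and the capturer scales from the nearest one. A hedged sketch of picking the supported size closest to the model's dimensions (the `CaptureSize` type here is a plain data class for illustration, not the WebRTC capture format type):

```kotlin
data class CaptureSize(val width: Int, val height: Int)

// Pick the supported capture size whose pixel dimensions are closest
// to the model's requested width/height (squared distance in pixels).
fun closestCaptureSize(
    supported: List<CaptureSize>,
    targetWidth: Int,
    targetHeight: Int
): CaptureSize =
    supported.minByOrNull { size ->
        val dw = size.width - targetWidth
        val dh = size.height - targetHeight
        dw * dw + dh * dh
    } ?: error("no supported capture sizes")
```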

Establishing Connection

Connect to the Realtime API with your camera track:
client.realtime.connect(
    localVideoTrack = videoTrack,
    localAudioTrack = null,  // or a microphone AudioTrack
    options = ConnectOptions(
        model = RealtimeModels.MIRAGE_V2,
        onRemoteVideoTrack = { track ->
            // Display the transformed video
            remoteRenderer.addSink(track)
        },
        onRemoteAudioTrack = { track ->
            // Handle remote audio (optional)
        },
        initialPrompt = InitialPrompt(
            text = "Lego World",
            enhance = true  // Let Decart enhance the prompt (recommended)
        ),
        initialImage = null  // Optional base64 image for avatar models
    )
)
Parameters:
  • localVideoTrack (optional) - WebRTC VideoTrack from camera (null for Live Avatar or subscribe mode)
  • localAudioTrack (optional) - WebRTC AudioTrack from microphone
  • options.model (required) - Realtime model from RealtimeModels
  • options.onRemoteVideoTrack (required) - Callback that receives the transformed video track
  • options.onRemoteAudioTrack (optional) - Callback for remote audio track
  • options.initialPrompt (optional) - Starting prompt
    • text - Style description
    • enhance - Whether to auto-enhance the prompt (default: true)
  • options.initialImage (optional) - Base64-encoded reference image

Managing Prompts

Change the transformation style dynamically without reconnecting:
// Simple prompt with automatic enhancement
client.realtime.setPrompt("Anime style")

// Detailed prompt without enhancement
client.realtime.setPrompt(
    "A detailed artistic style with vibrant colors and dramatic lighting",
    enhance = false
)
Parameters:
  • prompt: String (required) - Text description of desired style
  • enhance: Boolean (optional) - Whether to enhance the prompt (default: true)
Prompt enhancement uses Decart’s AI to expand simple prompts for better results. Disable it if you want full control over the exact prompt.
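
If you wire setPrompt() to a text field, avoid re-sending on every keystroke. A small sketch of a deduplicating rate limiter; the one-second interval is an illustrative choice, not an SDK requirement, and the clock is injected so the logic is testable:

```kotlin
// Decides whether a prompt should actually be sent: skip if it is
// unchanged, or if the minimum interval since the last send hasn't passed.
class PromptGate(
    private val minIntervalMs: Long = 1_000,
    private val now: () -> Long = System::currentTimeMillis
) {
    private var lastPrompt: String? = null
    private var lastSentAt: Long? = null

    fun shouldSend(prompt: String): Boolean {
        val t = now()
        if (prompt == lastPrompt) return false
        val last = lastSentAt
        if (last != null && t - last < minIntervalMs) return false
        lastPrompt = prompt
        lastSentAt = t
        return true
    }
}
```

Call `shouldSend()` before `client.realtime.setPrompt(...)` and skip the call when it returns false.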

Reference Images

Send a reference image (and optionally a prompt) for image-guided models:
// Set image with prompt
client.realtime.setImage(
    imageBase64 = base64EncodedImage,
    prompt = "Transform into this character",
    enhance = true,
    timeout = 30_000L  // 30 second ack timeout
)

// Clear image
client.realtime.setImage(imageBase64 = null)
Parameters:
  • imageBase64: String? (required) - Base64-encoded image, or null to clear
  • prompt: String? (optional) - Text prompt to send with the image
  • enhance: Boolean? (optional) - Whether to enhance the prompt
  • timeout: Long (optional) - Timeout in milliseconds for the server ack (default: 30,000)
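
setImage() expects a base64 string. A minimal sketch of encoding raw image bytes with the JDK's `java.util.Base64`; on Android you could equally use `android.util.Base64` with the NO_WRAP flag:

```kotlin
import java.util.Base64

// Encode raw image bytes (e.g. a JPEG read from disk) into the
// base64 string setImage() expects.
fun encodeImageBase64(imageBytes: ByteArray): String =
    Base64.getEncoder().encodeToString(imageBytes)

// Decode back, e.g. for a local preview or a round-trip check.
fun decodeImageBase64(encoded: String): ByteArray =
    Base64.getDecoder().decode(encoded)
```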

Audio (Live Avatar)

When connected to the LIVE_AVATAR model without a user-provided audio track, you can play audio through the stream:
// Check if audio playback is available
if (client.realtime.isPlayAudioAvailable()) {
    // Play audio data (WAV, MP3, etc.)
    client.realtime.playAudio(audioBytes)
}
playAudio() is only available when connected to the LIVE_AVATAR model without providing your own audio track. The SDK manages an internal audio stream for this mode.
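
If your audio source produces raw PCM (for example from a TTS engine), it needs a container before being passed as WAV bytes. A hedged sketch wrapping 16-bit PCM samples in a standard 44-byte RIFF/WAVE header — whether playAudio() accepts raw PCM without a container is an assumption to check against the model's audio format requirements:

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Wrap raw 16-bit little-endian PCM bytes in a 44-byte RIFF/WAVE header.
fun pcmToWav(pcm: ByteArray, sampleRate: Int, channels: Int): ByteArray {
    val byteRate = sampleRate * channels * 2     // 2 bytes per 16-bit sample
    val buf = ByteBuffer.allocate(44 + pcm.size).order(ByteOrder.LITTLE_ENDIAN)
    buf.put("RIFF".toByteArray())
    buf.putInt(36 + pcm.size)                    // total chunk size minus 8
    buf.put("WAVE".toByteArray())
    buf.put("fmt ".toByteArray())
    buf.putInt(16)                               // fmt chunk size
    buf.putShort(1)                              // audio format: PCM
    buf.putShort(channels.toShort())
    buf.putInt(sampleRate)
    buf.putInt(byteRate)
    buf.putShort((channels * 2).toShort())       // block align
    buf.putShort(16)                             // bits per sample
    buf.put("data".toByteArray())
    buf.putInt(pcm.size)
    buf.put(pcm)
    return buf.array()
}
```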

Connection State

Monitor and react to connection state changes using Kotlin Flow:
import ai.decart.sdk.ConnectionState

// Observe state changes
lifecycleScope.launch {
    client.realtime.connectionState.collect { state ->
        when (state) {
            ConnectionState.DISCONNECTED -> showReconnectButton()
            ConnectionState.CONNECTING -> showLoadingIndicator()
            ConnectionState.CONNECTED -> hideLoadingIndicator()
            ConnectionState.GENERATING -> showGeneratingUI()
            ConnectionState.RECONNECTING -> showReconnectingBanner()
        }
    }
}

// Check state synchronously
val isConnected = client.realtime.isConnected()
Connection States:
  • DISCONNECTED - Not connected (initial state, after disconnect(), or after reconnect failure)
  • CONNECTING - Initial connection in progress
  • CONNECTED - Connected and ready to send prompts
  • GENERATING - Actively generating transformed video (sticky until disconnected)
  • RECONNECTING - Connection lost unexpectedly; the SDK is automatically retrying
The SDK automatically reconnects when an unexpected disconnection occurs (e.g., network interruption). During auto-reconnect, the state transitions to RECONNECTING while the SDK retries. If all retries fail, the state moves to DISCONNECTED and an error is emitted.
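
A common UI question is simply "is the stream usable right now?". The enum below mirrors the SDK's ConnectionState so the example is self-contained; against the real SDK you would write the same functions on `ai.decart.sdk.ConnectionState`:

```kotlin
// Mirrors the SDK's ConnectionState for a self-contained example.
enum class ConnectionState { DISCONNECTED, CONNECTING, CONNECTED, GENERATING, RECONNECTING }

// The stream is usable for prompts once connected or generating.
fun isStreamActive(state: ConnectionState): Boolean =
    state == ConnectionState.CONNECTED || state == ConnectionState.GENERATING

// States where the UI should show progress rather than controls.
fun isTransitioning(state: ConnectionState): Boolean =
    state == ConnectionState.CONNECTING || state == ConnectionState.RECONNECTING
```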

Error Handling

Handle errors using the errors SharedFlow:
import ai.decart.sdk.DecartError
import ai.decart.sdk.ErrorCodes

lifecycleScope.launch {
    client.realtime.errors.collect { error ->
        when (error.code) {
            ErrorCodes.INVALID_API_KEY ->
                showError("Invalid API key. Check your credentials.")
            ErrorCodes.WEBRTC_TIMEOUT_ERROR ->
                showError("Connection timed out. Check your network.")
            ErrorCodes.WEBRTC_ICE_ERROR ->
                showError("ICE negotiation failed.")
            ErrorCodes.WEBRTC_WEBSOCKET_ERROR ->
                showError("WebSocket connection error.")
            ErrorCodes.WEBRTC_SERVER_ERROR ->
                showError("Server error. Try again later.")
            ErrorCodes.WEBRTC_SIGNALING_ERROR ->
                showError("Signaling error.")
            else ->
                showError("Error: ${error.message}")
        }
    }
}
Error Codes:
  • INVALID_API_KEY - API key is invalid or missing
  • WEBRTC_TIMEOUT_ERROR - Connection timed out
  • WEBRTC_ICE_ERROR - ICE negotiation failed
  • WEBRTC_WEBSOCKET_ERROR - WebSocket connection error
  • WEBRTC_SERVER_ERROR - Server-side error
  • WEBRTC_SIGNALING_ERROR - Signaling protocol error
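
When reacting to errors, it helps to separate failures the user must fix (credentials) from transient ones worth retrying. A sketch classifying by code string — which codes are truly transient is an assumption to validate against your own failure data:

```kotlin
// Hypothetical classification: credential errors need user action,
// network-flavored errors are often worth an automatic retry.
fun isRetryableError(code: String): Boolean = when (code) {
    "INVALID_API_KEY" -> false          // fix credentials, don't retry
    "WEBRTC_TIMEOUT_ERROR",
    "WEBRTC_ICE_ERROR",
    "WEBRTC_WEBSOCKET_ERROR" -> true    // often transient network issues
    "WEBRTC_SERVER_ERROR",
    "WEBRTC_SIGNALING_ERROR" -> true    // may clear up on reconnect
    else -> false
}
```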

WebRTC Stats

Monitor real-time performance metrics:
lifecycleScope.launch {
    client.realtime.stats.collect { stats ->
        stats.video?.let { video ->
            println("FPS: ${video.framesPerSecond}")
            println("Resolution: ${video.frameWidth}x${video.frameHeight}")
            println("Bitrate: ${video.bitrate} bps")
            println("Packets lost: ${video.packetsLost}")
            println("Jitter: ${video.jitter}")
        }
        stats.connection.let { conn ->
            println("RTT: ${conn.currentRoundTripTime}s")
        }
    }
}
Available stats:
  • video - Inbound video: FPS, resolution, bitrate, packets lost, jitter, freeze count
  • audio - Inbound audio: bitrate, packets lost, jitter
  • outboundVideo - Outbound video: FPS, resolution, bitrate, quality limitation reason
  • connection - Round-trip time, available outgoing bitrate
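
Raw per-interval stats are noisy; smoothing before display keeps the numbers readable. A small exponential-moving-average sketch (the default alpha is an illustrative choice):

```kotlin
// Exponential moving average: smooths jittery per-second readings
// (FPS, bitrate) for display. Higher alpha reacts faster to changes.
class Ema(private val alpha: Double = 0.2) {
    private var value: Double? = null

    fun update(sample: Double): Double {
        val v = value
        val next = if (v == null) sample else alpha * sample + (1 - alpha) * v
        value = next
        return next
    }
}
```

Feed `stats.video?.framesPerSecond` through an `Ema` instance before rendering it.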

Generation Ticks

Track session duration for billing and usage display:
lifecycleScope.launch {
    client.realtime.generationTicks.collect { tick ->
        println("Generation running for ${tick.seconds} seconds")
        updateBillingUI(tick.seconds)
    }
}
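
For a usage display you will typically render the tick as elapsed time. A small formatting sketch:

```kotlin
// Format elapsed generation seconds as "m:ss", or "h:mm:ss" past an hour.
fun formatElapsed(totalSeconds: Long): String {
    val h = totalSeconds / 3600
    val m = (totalSeconds % 3600) / 60
    val s = totalSeconds % 60
    return if (h > 0) "%d:%02d:%02d".format(h, m, s) else "%d:%02d".format(m, s)
}
```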

Diagnostics

Monitor detailed connection diagnostics for debugging:
lifecycleScope.launch {
    client.realtime.diagnostics.collect { event ->
        when (event) {
            is DiagnosticEvent.PhaseTiming ->
                println("${event.data.phase}: ${event.data.durationMs}ms")
            is DiagnosticEvent.Reconnect ->
                println("Reconnect attempt ${event.data.attempt}/${event.data.maxAttempts}")
            is DiagnosticEvent.VideoStall ->
                println("Video stall: ${if (event.data.stalled) "detected" else "recovered"}")
            else -> { /* other diagnostic events */ }
        }
    }
}
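
Phase timings are most useful aggregated across sessions, e.g. for answering "how long does connection setup take on average?". A sketch of a simple accumulator; the phase names in the test are hypothetical examples, not SDK constants:

```kotlin
// Accumulates per-phase durations from diagnostic events and
// reports the average duration for each phase.
class PhaseTimings {
    private val totals = mutableMapOf<String, Long>()
    private val counts = mutableMapOf<String, Int>()

    fun record(phase: String, durationMs: Long) {
        totals[phase] = (totals[phase] ?: 0L) + durationMs
        counts[phase] = (counts[phase] ?: 0) + 1
    }

    fun averageMs(phase: String): Double? {
        val count = counts[phase] ?: return null
        return totals.getValue(phase).toDouble() / count
    }
}
```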

Session Management

Access the current session ID for logging, analytics, or debugging:
val sessionId = client.realtime.sessionId
println("Current session: $sessionId")

Cleanup

Always disconnect and release resources when done:
// Stop camera capture
capturer.stopCapture()

// Disconnect from the service
client.realtime.disconnect()

// Release all SDK resources
client.release()

// Release EGL context
eglBase.release()
Failing to call release() can leave WebRTC connections open and leak native resources.

Complete Jetpack Compose Example

A full Jetpack Compose application with camera capture, video transformation, and prompt input:
import ai.decart.sdk.*
import ai.decart.sdk.realtime.*
import android.Manifest
import androidx.activity.compose.rememberLauncherForActivityResult
import androidx.activity.result.contract.ActivityResultContracts
import androidx.compose.foundation.layout.*
import androidx.compose.material3.*
import androidx.compose.runtime.*
import androidx.compose.ui.Modifier
import androidx.compose.ui.platform.LocalContext
import androidx.compose.ui.unit.dp
import androidx.compose.ui.viewinterop.AndroidView
import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.flow.*
import kotlinx.coroutines.launch
import org.webrtc.*

class RealtimeViewModel : ViewModel() {
    private val _connectionState = MutableStateFlow(ConnectionState.DISCONNECTED)
    val connectionState = _connectionState.asStateFlow()

    private val _error = MutableStateFlow<String?>(null)
    val error = _error.asStateFlow()

    var promptText by mutableStateOf("Turn into a fantasy figure")

    private var client: DecartClient? = null
    private var capturer: CameraVideoCapturer? = null

    // Shared EGL context for WebRTC and SurfaceViewRenderer
    val eglBase: EglBase = EglBase.create()

    val remoteRenderer = mutableStateOf<SurfaceViewRenderer?>(null)

    fun connect(context: android.content.Context) {
        viewModelScope.launch {
            try {
                val decartClient = DecartClient(
                    context = context,
                    config = DecartClientConfig(
                        apiKey = "your-api-key"
                    )
                )
                client = decartClient

                val realtime = decartClient.realtime
                realtime.initialize(eglBase)

                val model = RealtimeModels.MIRAGE_V2

                // Create camera track
                val videoSource = realtime.createVideoSource(false)!!
                val videoTrack = realtime.createVideoTrack("camera", videoSource)!!

                val enumerator = Camera2Enumerator(context)
                // Prefer the front camera, fall back to any available camera
                val cameraName = enumerator.deviceNames.firstOrNull {
                    enumerator.isFrontFacing(it)
                } ?: enumerator.deviceNames.first()
                val cameraCapturer = enumerator.createCapturer(cameraName, null)
                capturer = cameraCapturer

                cameraCapturer.initialize(
                    SurfaceTextureHelper.create(
                        "CaptureThread",
                        realtime.getEglBaseContext()
                    ),
                    context,
                    videoSource.capturerObserver
                )
                cameraCapturer.startCapture(model.width, model.height, model.fps)

                // Observe state
                viewModelScope.launch {
                    realtime.connectionState.collect { _connectionState.value = it }
                }
                viewModelScope.launch {
                    realtime.errors.collect { _error.value = it.message }
                }

                // Connect
                realtime.connect(
                    localVideoTrack = videoTrack,
                    options = ConnectOptions(
                        model = model,
                        onRemoteVideoTrack = { track ->
                            remoteRenderer.value?.let { renderer ->
                                track.addSink(renderer)
                            }
                        },
                        initialPrompt = InitialPrompt(
                            text = promptText,
                            enhance = true
                        )
                    )
                )
            } catch (e: Exception) {
                _error.value = e.message
            }
        }
    }

    fun setPrompt() {
        try {
            client?.realtime?.setPrompt(promptText, enhance = true)
        } catch (e: Exception) {
            _error.value = e.message
        }
    }

    fun disconnect() {
        capturer?.stopCapture()
        capturer = null
        client?.realtime?.disconnect()
        client?.release()
        client = null
        _connectionState.value = ConnectionState.DISCONNECTED
    }

    override fun onCleared() {
        disconnect()
        eglBase.release()
    }
}

@Composable
fun RealtimeScreen(viewModel: RealtimeViewModel) {
    val context = LocalContext.current
    val connectionState by viewModel.connectionState.collectAsState()
    val error by viewModel.error.collectAsState()
    val isConnected = connectionState == ConnectionState.CONNECTED
        || connectionState == ConnectionState.GENERATING

    val permissionLauncher = rememberLauncherForActivityResult(
        ActivityResultContracts.RequestPermission()
    ) { granted ->
        if (granted) viewModel.connect(context)
    }

    Column(
        modifier = Modifier.fillMaxSize().padding(16.dp)
    ) {
        // Remote video
        AndroidView(
            factory = { ctx ->
                SurfaceViewRenderer(ctx).also { renderer ->
                    renderer.init(viewModel.eglBase.eglBaseContext, null)
                    renderer.setScalingType(RendererCommon.ScalingType.SCALE_ASPECT_FIT)
                    viewModel.remoteRenderer.value = renderer
                }
            },
            modifier = Modifier.weight(1f).fillMaxWidth()
        )

        Spacer(modifier = Modifier.height(8.dp))

        // Status
        Text("Status: $connectionState")

        error?.let {
            Text(it, color = MaterialTheme.colorScheme.error)
        }

        Spacer(modifier = Modifier.height(8.dp))

        // Prompt input
        Row(modifier = Modifier.fillMaxWidth()) {
            OutlinedTextField(
                value = viewModel.promptText,
                onValueChange = { viewModel.promptText = it },
                label = { Text("Style prompt") },
                modifier = Modifier.weight(1f)
            )
            Spacer(modifier = Modifier.width(8.dp))
            Button(
                onClick = { viewModel.setPrompt() },
                enabled = isConnected
            ) {
                Text("Send")
            }
        }

        Spacer(modifier = Modifier.height(8.dp))

        // Connect / Disconnect
        Button(
            onClick = {
                if (isConnected) {
                    viewModel.disconnect()
                } else {
                    permissionLauncher.launch(Manifest.permission.CAMERA)
                }
            },
            modifier = Modifier.fillMaxWidth()
        ) {
            Text(if (isConnected) "Disconnect" else "Connect")
        }
    }
}

Best Practices

Always use the model’s fps, width, and height properties when configuring camera capture to ensure optimal performance.
val model = RealtimeModels.MIRAGE_V2
capturer.startCapture(model.width, model.height, model.fps)
For best results, keep enhance = true (default) to let Decart’s AI enhance your prompts. Only disable it if you need exact prompt control.
Always collect connectionState, errors, and other flows in lifecycleScope or viewModelScope to avoid leaks and ensure proper cancellation.
Always call disconnect() then release() when done. Stop camera capture separately. Failing to release can leak native WebRTC resources.
Always test camera features on real Android devices. The emulator does not support WebRTC camera access.
Request CAMERA and RECORD_AUDIO permissions at runtime before attempting to connect. Handle permission denials gracefully in your UI.

API Reference

realtime.initialize(eglBase?)

Initialize the WebRTC peer connection factory. Optional - called automatically by connect() if not called explicitly. Parameters:
  • eglBase: EglBase? (optional) - EGL context for hardware-accelerated video. A default is created if omitted.

realtime.connect(localVideoTrack, localAudioTrack, options)

Connect to a realtime model and start streaming. Suspend function. Parameters:
  • localVideoTrack: VideoTrack? - Camera video track (null for Live Avatar)
  • localAudioTrack: AudioTrack? - Microphone audio track (null to omit)
  • options: ConnectOptions - Connection configuration
    • model: RealtimeModel - Model from RealtimeModels
    • onRemoteVideoTrack: (VideoTrack) -> Unit - Callback for transformed video
    • onRemoteAudioTrack: ((AudioTrack) -> Unit)? - Callback for remote audio
    • initialPrompt: InitialPrompt? - Starting prompt (text + enhance)
    • initialImage: String? - Base64-encoded reference image
Throws: IllegalStateException if factory initialization fails

realtime.setPrompt(prompt, enhance)

Change the transformation style. Parameters:
  • prompt: String - Text description of desired style
  • enhance: Boolean - Whether to enhance the prompt (default: true)
Throws: IllegalStateException if not connected

realtime.setImage(imageBase64, prompt, enhance, timeout)

Send a reference image. Suspend function. Parameters:
  • imageBase64: String? - Base64-encoded image, or null to clear
  • prompt: String? - Optional text prompt
  • enhance: Boolean? - Whether to enhance the prompt
  • timeout: Long - Ack timeout in ms (default: 30,000)
Throws: IllegalStateException if not connected

realtime.playAudio(audioData)

Play audio through the Live Avatar stream. Suspend function. Parameters:
  • audioData: ByteArray - Raw audio data (WAV, MP3, etc.)
Throws: IllegalStateException if not in Live Avatar mode

realtime.disconnect()

End the current session and clean up WebRTC resources.

realtime.release()

Release all resources including the peer connection factory and EGL context. Call when done with the client entirely.

realtime.isConnected()

Returns: Boolean - Whether currently connected.

realtime.isPlayAudioAvailable()

Returns: Boolean - Whether playAudio() is available (Live Avatar mode).

realtime.getEglBaseContext()

Returns: EglBase.Context? - EGL context for initializing SurfaceViewRenderer.

realtime.createVideoSource(isScreencast)

Returns: VideoSource? - A new video source for camera capture.

realtime.createVideoTrack(id, source)

Returns: VideoTrack? - A new video track from a video source.

Observable State

  • connectionState: StateFlow<ConnectionState> - Current connection state
  • errors: SharedFlow<DecartError> - Error events
  • stats: SharedFlow<WebRTCStats> - WebRTC performance stats
  • generationTicks: SharedFlow<GenerationTickMessage> - Billing/usage tick events
  • diagnostics: SharedFlow<DiagnosticEvent> - Connection diagnostic events
  • sessionId: String? - Current session ID

Next Steps