The Realtime API enables you to transform live video streams with minimal latency using WebRTC. It is ideal for building Android camera effects, video restyling apps, AR experiences, and interactive live streaming.
## Quick Start

```kotlin
import ai.decart.sdk.DecartClient
import ai.decart.sdk.DecartClientConfig
import ai.decart.sdk.RealtimeModels
import ai.decart.sdk.realtime.ConnectOptions
import ai.decart.sdk.realtime.InitialPrompt

val client = DecartClient(context, DecartClientConfig(apiKey = "your-api-key"))
val realtime = client.realtime

// Initialize WebRTC
realtime.initialize(eglBase)

// Connect with camera
realtime.connect(
    localVideoTrack = cameraTrack,
    localAudioTrack = null,
    options = ConnectOptions(
        model = RealtimeModels.MIRAGE_V2,
        onRemoteVideoTrack = { track ->
            remoteRenderer.addSink(track)
        },
        initialPrompt = InitialPrompt("a cyberpunk cityscape")
    )
)

// Change style on the fly
realtime.setPrompt("Anime world", enhance = true)

// Disconnect when done
realtime.disconnect()
client.release()
```
## Client-Side Authentication
For production Android apps, use ephemeral keys instead of embedding your permanent API key in the APK. Ephemeral keys are short-lived tokens safe to include in client applications.
Learn more about client tokens and why they’re important for security.
### Fetching an Ephemeral Key

Your app should fetch an ephemeral key from your backend server before connecting:
```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext
import okhttp3.OkHttpClient
import okhttp3.Request
import okhttp3.RequestBody.Companion.toRequestBody
import org.json.JSONObject

suspend fun fetchEphemeralKey(): String = withContext(Dispatchers.IO) {
    val client = OkHttpClient()
    val request = Request.Builder()
        .url("https://your-backend.com/api/realtime-token")
        .post("".toRequestBody())
        // Add any auth headers your backend requires
        // .addHeader("Authorization", "Bearer $userToken")
        .build()

    val response = client.newCall(request).execute()
    if (!response.isSuccessful) {
        throw Exception("Failed to fetch token: ${response.code}")
    }
    val body = response.body?.string()
        ?: throw Exception("Empty response body")
    JSONObject(body).getString("apiKey")
}
```
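Ephemeral keys are short-lived, so a cached key should be checked for freshness before reuse. As a sketch, assuming your backend also returns a hypothetical `expiresAt` field (epoch milliseconds) alongside `apiKey`, the check might look like:

```kotlin
// Sketch of a freshness check for a cached ephemeral key. The `expiresAt`
// field (epoch millis) is a hypothetical addition to your backend's response;
// the safety margin avoids connecting with a key that is about to expire.
fun isKeyFresh(expiresAtMs: Long, nowMs: Long, marginMs: Long = 30_000L): Boolean =
    nowMs < expiresAtMs - marginMs
```

If the check fails, call fetchEphemeralKey() again before constructing the client.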
### Connecting with an Ephemeral Key

```kotlin
val ephemeralKey = fetchEphemeralKey()

val client = DecartClient(
    context = applicationContext,
    config = DecartClientConfig(apiKey = ephemeralKey)
)

// Connect as usual
client.realtime.connect(...)
```
Never hardcode your permanent API key in Android apps. APKs can be decompiled, exposing embedded secrets. Always use ephemeral keys from your backend.
## Connecting

### Initializing WebRTC

Before connecting, initialize the WebRTC peer connection factory. This sets up video encoding/decoding with hardware acceleration:
```kotlin
import org.webrtc.EglBase

// Create an EGL context (shared with SurfaceViewRenderers)
val eglBase = EglBase.create()

// Initialize the SDK's WebRTC internals
client.realtime.initialize(eglBase)
```
initialize() is optional — if you skip it, connect() calls it automatically with a default EglBase. Call it explicitly when you need access to the EGL context for setting up SurfaceViewRenderer instances.
### Creating a Camera Track

Use the SDK's helper methods to create a video track from the device camera:
```kotlin
import org.webrtc.*

val model = RealtimeModels.MIRAGE_V2

// Create video source and track via the SDK
val videoSource = client.realtime.createVideoSource(isScreencast = false)!!
val videoTrack = client.realtime.createVideoTrack("camera", videoSource)!!

// Set up camera capturer
val enumerator = Camera2Enumerator(context)
val cameraName = enumerator.deviceNames.first { enumerator.isFrontFacing(it) }
val capturer = enumerator.createCapturer(cameraName, null)

// Start capture using model dimensions for optimal performance
capturer.initialize(
    SurfaceTextureHelper.create("CaptureThread", client.realtime.getEglBaseContext()),
    context,
    videoSource.capturerObserver
)
capturer.startCapture(model.width, model.height, model.fps)
```
Use the model’s fps, width, and height properties when configuring camera capture to ensure optimal performance and compatibility.
Camera capture requires a real Android device . The emulator does not support WebRTC camera features.
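Real camera hardware rarely advertises the model's exact dimensions; WebRTC's camera capturer will pick a nearby supported format, and libwebrtc ships a similar utility in CameraEnumerationAndroid. The underlying idea is just "pick the supported size closest to the target". A minimal sketch (the `CaptureSize` type and `closestSize` helper are illustrative, not part of the SDK):

```kotlin
import kotlin.math.abs

// Hypothetical helper: choose the supported capture size whose pixel area
// is closest to the model's target dimensions.
data class CaptureSize(val width: Int, val height: Int)

fun closestSize(supported: List<CaptureSize>, targetW: Int, targetH: Int): CaptureSize =
    supported.minByOrNull { abs(it.width * it.height - targetW * targetH) }
        ?: error("no supported capture sizes")
```

You would feed the chosen size into startCapture() instead of the raw model dimensions when the two differ.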
### Establishing Connection

Connect to the Realtime API with your camera track:
```kotlin
client.realtime.connect(
    localVideoTrack = videoTrack,
    localAudioTrack = null, // or a microphone AudioTrack
    options = ConnectOptions(
        model = RealtimeModels.MIRAGE_V2,
        onRemoteVideoTrack = { track ->
            // Display the transformed video
            remoteRenderer.addSink(track)
        },
        onRemoteAudioTrack = { track ->
            // Handle remote audio (optional)
        },
        initialPrompt = InitialPrompt(
            text = "Lego World",
            enhance = true // Let Decart enhance the prompt (recommended)
        ),
        initialImage = null // Optional base64 image for avatar models
    )
)
```
Parameters:

- `localVideoTrack` (optional) - WebRTC `VideoTrack` from the camera (`null` for Live Avatar or subscribe mode)
- `localAudioTrack` (optional) - WebRTC `AudioTrack` from the microphone
- `options.model` (required) - Realtime model from `RealtimeModels`
- `options.onRemoteVideoTrack` (required) - Callback that receives the transformed video track
- `options.onRemoteAudioTrack` (optional) - Callback for the remote audio track
- `options.initialPrompt` (optional) - Starting prompt
  - `text` - Style description
  - `enhance` - Whether to auto-enhance the prompt (default: `true`)
- `options.initialImage` (optional) - Base64-encoded reference image
## Managing Prompts

Change the transformation style dynamically without reconnecting:
```kotlin
// Simple prompt with automatic enhancement
client.realtime.setPrompt("Anime style")

// Detailed prompt without enhancement
client.realtime.setPrompt(
    "A detailed artistic style with vibrant colors and dramatic lighting",
    enhance = false
)
```
Parameters:

- `prompt: String` (required) - Text description of the desired style
- `enhance: Boolean` (optional) - Whether to enhance the prompt (default: `true`)
Prompt enhancement uses Decart’s AI to expand simple prompts for better results. Disable it if you want full control over the exact prompt.
## Reference Images

Send a reference image (and optionally a prompt) for image-guided models:
```kotlin
// Set image with prompt
client.realtime.setImage(
    imageBase64 = base64EncodedImage,
    prompt = "Transform into this character",
    enhance = true,
    timeout = 30_000L // 30-second ack timeout
)

// Clear image
client.realtime.setImage(imageBase64 = null)
```
Parameters:

- `imageBase64: String?` (required) - Base64-encoded image, or `null` to clear
- `prompt: String?` (optional) - Text prompt to send with the image
- `enhance: Boolean?` (optional) - Whether to enhance the prompt
- `timeout: Long` (optional) - Timeout in milliseconds for the server ack (default: 30,000)
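setImage() expects a base64 string. A minimal sketch of producing one from raw image bytes (for example, a JPEG you compressed from a Bitmap) using the JDK's Base64 encoder:

```kotlin
import java.util.Base64

// Encode raw image bytes into a base64 string for setImage().
// This produces the bare base64 form; whether the API also accepts a
// data-URI prefix is not covered here, so we assume bare base64.
fun encodeImageToBase64(imageBytes: ByteArray): String =
    Base64.getEncoder().encodeToString(imageBytes)
```

On Android you would typically get the bytes via `Bitmap.compress()` into a `ByteArrayOutputStream` first.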
## Audio (Live Avatar)

When connected to the `LIVE_AVATAR` model without a user-provided audio track, you can play audio through the stream:
```kotlin
// Check if audio playback is available
if (client.realtime.isPlayAudioAvailable()) {
    // Play audio data (WAV, MP3, etc.)
    client.realtime.playAudio(audioBytes)
}
```
playAudio() is only available when connected to the LIVE_AVATAR model without providing your own audio track. The SDK manages an internal audio stream for this mode.
## Connection State

Monitor and react to connection state changes using Kotlin Flow:
```kotlin
import ai.decart.sdk.ConnectionState

// Observe state changes
lifecycleScope.launch {
    client.realtime.connectionState.collect { state ->
        when (state) {
            ConnectionState.DISCONNECTED -> showReconnectButton()
            ConnectionState.CONNECTING -> showLoadingIndicator()
            ConnectionState.CONNECTED -> hideLoadingIndicator()
            ConnectionState.GENERATING -> showGeneratingUI()
            ConnectionState.RECONNECTING -> showReconnectingBanner()
        }
    }
}

// Check state synchronously
val isConnected = client.realtime.isConnected()
```
Connection States:

- `DISCONNECTED` - Not connected (initial state, after disconnect(), or after reconnect failure)
- `CONNECTING` - Initial connection in progress
- `CONNECTED` - Connected and ready to send prompts
- `GENERATING` - Actively generating transformed video (sticky until disconnected)
- `RECONNECTING` - Connection lost unexpectedly; the SDK is automatically retrying
The SDK automatically reconnects when an unexpected disconnection occurs (e.g., network interruption). During auto-reconnect, the state transitions to RECONNECTING while the SDK retries. If all retries fail, the state moves to DISCONNECTED and an error is emitted.
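The SDK's internal retry pacing is not specified here, but if you layer your own retry on top (for example, after the state settles at DISCONNECTED), capped exponential backoff is the usual pattern. A sketch, with arbitrary base and cap values:

```kotlin
// Sketch of a capped exponential-backoff schedule for manual reconnects.
// The SDK retries on its own first; this only illustrates pacing for a
// retry loop you might run after the SDK gives up. Base/cap are assumptions.
fun backoffDelayMs(attempt: Int, baseMs: Long = 500L, maxMs: Long = 10_000L): Long {
    require(attempt >= 1) { "attempt is 1-based" }
    val exponential = baseMs * (1L shl (attempt - 1)) // 500, 1000, 2000, ...
    return minOf(exponential, maxMs)
}
```

Adding random jitter to each delay is also common to avoid synchronized reconnect storms.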
## Error Handling

Handle errors using the `errors` SharedFlow:
```kotlin
import ai.decart.sdk.DecartError
import ai.decart.sdk.ErrorCodes

lifecycleScope.launch {
    client.realtime.errors.collect { error ->
        when (error.code) {
            ErrorCodes.INVALID_API_KEY ->
                showError("Invalid API key. Check your credentials.")
            ErrorCodes.WEBRTC_TIMEOUT_ERROR ->
                showError("Connection timed out. Check your network.")
            ErrorCodes.WEBRTC_ICE_ERROR ->
                showError("ICE negotiation failed.")
            ErrorCodes.WEBRTC_WEBSOCKET_ERROR ->
                showError("WebSocket connection error.")
            ErrorCodes.WEBRTC_SERVER_ERROR ->
                showError("Server error. Try again later.")
            ErrorCodes.WEBRTC_SIGNALING_ERROR ->
                showError("Signaling error.")
            else ->
                showError("Error: ${error.message}")
        }
    }
}
```
Error Codes:

- `INVALID_API_KEY` - API key is invalid or missing
- `WEBRTC_TIMEOUT_ERROR` - Connection timed out
- `WEBRTC_ICE_ERROR` - ICE negotiation failed
- `WEBRTC_WEBSOCKET_ERROR` - WebSocket connection error
- `WEBRTC_SERVER_ERROR` - Server-side error
- `WEBRTC_SIGNALING_ERROR` - Signaling protocol error
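When deciding whether to show a "Retry" button, it can help to triage codes into transient vs. fatal. A sketch using plain strings that mirror the code names above; which codes are truly transient is an assumption you should verify against real traffic:

```kotlin
// Hypothetical triage: transient failures are worth retrying, configuration
// failures are not. The classification below is an assumption, not SDK policy.
fun isRetryable(code: String): Boolean = when (code) {
    "WEBRTC_TIMEOUT_ERROR",
    "WEBRTC_ICE_ERROR",
    "WEBRTC_WEBSOCKET_ERROR",
    "WEBRTC_SERVER_ERROR" -> true
    "INVALID_API_KEY",
    "WEBRTC_SIGNALING_ERROR" -> false
    else -> false
}
```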
## WebRTC Stats

Monitor real-time performance metrics:
```kotlin
lifecycleScope.launch {
    client.realtime.stats.collect { stats ->
        stats.video?.let { video ->
            println("FPS: ${video.framesPerSecond}")
            println("Resolution: ${video.frameWidth}x${video.frameHeight}")
            println("Bitrate: ${video.bitrate} bps")
            println("Packets lost: ${video.packetsLost}")
            println("Jitter: ${video.jitter}")
        }
        stats.connection.let { conn ->
            println("RTT: ${conn.currentRoundTripTime} s")
        }
    }
}
```
Available stats:

- `video` - Inbound video: FPS, resolution, bitrate, packets lost, jitter, freeze count
- `audio` - Inbound audio: bitrate, packets lost, jitter
- `outboundVideo` - Outbound video: FPS, resolution, bitrate, quality limitation reason
- `connection` - Round-trip time, available outgoing bitrate
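If you chart these metrics yourself, note that raw WebRTC byte and packet counters are cumulative over the session; a per-interval bitrate comes from differencing successive samples (the SDK's `bitrate` field may already do this for you). A sketch of the calculation:

```kotlin
// Derive bits-per-second from two cumulative byte-counter samples taken
// intervalMs apart. This mirrors how dashboards turn RTCStats counters
// into a live bitrate graph.
fun bitrateBps(prevBytes: Long, currBytes: Long, intervalMs: Long): Long {
    require(intervalMs > 0) { "interval must be positive" }
    return (currBytes - prevBytes) * 8 * 1000 / intervalMs
}
```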
## Generation Ticks

Track session duration for billing and usage display:
```kotlin
lifecycleScope.launch {
    client.realtime.generationTicks.collect { tick ->
        println("Generation running for ${tick.seconds} seconds")
        updateBillingUI(tick.seconds)
    }
}
```
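For the usage label itself, a small formatter turns the tick's elapsed seconds into a readable duration. A sketch, assuming `tick.seconds` is a whole-second count:

```kotlin
// Format elapsed seconds for a usage/billing label, e.g. 65 -> "1:05",
// 3909 -> "1:05:09". Hours are only shown once the session passes an hour.
fun formatDuration(totalSeconds: Long): String {
    val h = totalSeconds / 3600
    val m = (totalSeconds % 3600) / 60
    val s = totalSeconds % 60
    return if (h > 0) "%d:%02d:%02d".format(h, m, s) else "%d:%02d".format(m, s)
}
```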
## Diagnostics

Monitor detailed connection diagnostics for debugging:
```kotlin
lifecycleScope.launch {
    client.realtime.diagnostics.collect { event ->
        when (event) {
            is DiagnosticEvent.PhaseTiming ->
                println("${event.data.phase}: ${event.data.durationMs} ms")
            is DiagnosticEvent.Reconnect ->
                println("Reconnect attempt ${event.data.attempt}/${event.data.maxAttempts}")
            is DiagnosticEvent.VideoStall ->
                println("Video stall: ${if (event.data.stalled) "detected" else "recovered"}")
            else -> { /* other diagnostic events */ }
        }
    }
}
```
## Session Management

Access the current session ID for logging, analytics, or debugging:

```kotlin
val sessionId = client.realtime.sessionId
println("Current session: $sessionId")
```
## Cleanup

Always disconnect and release resources when done:
```kotlin
// Stop camera capture
capturer.stopCapture()

// Disconnect from the service
client.realtime.disconnect()

// Release all SDK resources
client.release()

// Release EGL context
eglBase.release()
```
Failing to call release() can leave WebRTC connections open and leak native resources.
## Complete Jetpack Compose Example

A full Jetpack Compose application with camera capture, video transformation, and prompt input:
```kotlin
import ai.decart.sdk.*
import ai.decart.sdk.realtime.*
import android.Manifest
import androidx.activity.compose.rememberLauncherForActivityResult
import androidx.activity.result.contract.ActivityResultContracts
import androidx.compose.foundation.layout.*
import androidx.compose.material3.*
import androidx.compose.runtime.*
import androidx.compose.ui.Modifier
import androidx.compose.ui.platform.LocalContext
import androidx.compose.ui.unit.dp
import androidx.compose.ui.viewinterop.AndroidView
import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.flow.*
import kotlinx.coroutines.launch
import org.webrtc.*

class RealtimeViewModel : ViewModel() {
    private val _connectionState = MutableStateFlow(ConnectionState.DISCONNECTED)
    val connectionState = _connectionState.asStateFlow()

    private val _error = MutableStateFlow<String?>(null)
    val error = _error.asStateFlow()

    var promptText by mutableStateOf("Turn into a fantasy figure")

    private var client: DecartClient? = null
    private var capturer: CameraVideoCapturer? = null

    // Shared EGL context for WebRTC and SurfaceViewRenderer
    val eglBase: EglBase = EglBase.create()
    val remoteRenderer = mutableStateOf<SurfaceViewRenderer?>(null)

    fun connect(context: android.content.Context) {
        viewModelScope.launch {
            try {
                val decartClient = DecartClient(
                    context = context,
                    config = DecartClientConfig(
                        apiKey = "your-api-key"
                    )
                )
                client = decartClient
                val realtime = decartClient.realtime
                realtime.initialize(eglBase)

                val model = RealtimeModels.MIRAGE_V2

                // Create camera track
                val videoSource = realtime.createVideoSource(false)!!
                val videoTrack = realtime.createVideoTrack("camera", videoSource)!!

                val enumerator = Camera2Enumerator(context)
                val cameraName = enumerator.deviceNames.first {
                    enumerator.isFrontFacing(it)
                }
                val cameraCapturer = enumerator.createCapturer(cameraName, null)
                capturer = cameraCapturer
                cameraCapturer.initialize(
                    SurfaceTextureHelper.create(
                        "CaptureThread",
                        realtime.getEglBaseContext()
                    ),
                    context,
                    videoSource.capturerObserver
                )
                cameraCapturer.startCapture(model.width, model.height, model.fps)

                // Observe state
                viewModelScope.launch {
                    realtime.connectionState.collect { _connectionState.value = it }
                }
                viewModelScope.launch {
                    realtime.errors.collect { _error.value = it.message }
                }

                // Connect
                realtime.connect(
                    localVideoTrack = videoTrack,
                    options = ConnectOptions(
                        model = model,
                        onRemoteVideoTrack = { track ->
                            remoteRenderer.value?.let { renderer ->
                                track.addSink(renderer)
                            }
                        },
                        initialPrompt = InitialPrompt(
                            text = promptText,
                            enhance = true
                        )
                    )
                )
            } catch (e: Exception) {
                _error.value = e.message
            }
        }
    }

    fun setPrompt() {
        try {
            client?.realtime?.setPrompt(promptText, enhance = true)
        } catch (e: Exception) {
            _error.value = e.message
        }
    }

    fun disconnect() {
        capturer?.stopCapture()
        capturer = null
        client?.release()
        client = null
        _connectionState.value = ConnectionState.DISCONNECTED
    }

    override fun onCleared() {
        disconnect()
        eglBase.release()
    }
}

@Composable
fun RealtimeScreen(viewModel: RealtimeViewModel) {
    val context = LocalContext.current
    val connectionState by viewModel.connectionState.collectAsState()
    val error by viewModel.error.collectAsState()
    val isConnected = connectionState == ConnectionState.CONNECTED ||
        connectionState == ConnectionState.GENERATING

    val permissionLauncher = rememberLauncherForActivityResult(
        ActivityResultContracts.RequestPermission()
    ) { granted ->
        if (granted) viewModel.connect(context)
    }

    Column(
        modifier = Modifier.fillMaxSize().padding(16.dp)
    ) {
        // Remote video
        AndroidView(
            factory = { ctx ->
                SurfaceViewRenderer(ctx).also { renderer ->
                    renderer.init(viewModel.eglBase.eglBaseContext, null)
                    renderer.setScalingType(RendererCommon.ScalingType.SCALE_ASPECT_FIT)
                    viewModel.remoteRenderer.value = renderer
                }
            },
            modifier = Modifier.weight(1f).fillMaxWidth()
        )

        Spacer(modifier = Modifier.height(8.dp))

        // Status
        Text("Status: $connectionState")
        error?.let {
            Text(it, color = MaterialTheme.colorScheme.error)
        }

        Spacer(modifier = Modifier.height(8.dp))

        // Prompt input
        Row(modifier = Modifier.fillMaxWidth()) {
            OutlinedTextField(
                value = viewModel.promptText,
                onValueChange = { viewModel.promptText = it },
                label = { Text("Style prompt") },
                modifier = Modifier.weight(1f)
            )
            Spacer(modifier = Modifier.width(8.dp))
            Button(
                onClick = { viewModel.setPrompt() },
                enabled = isConnected
            ) {
                Text("Send")
            }
        }

        Spacer(modifier = Modifier.height(8.dp))

        // Connect / Disconnect
        Button(
            onClick = {
                if (isConnected) {
                    viewModel.disconnect()
                } else {
                    permissionLauncher.launch(Manifest.permission.CAMERA)
                }
            },
            modifier = Modifier.fillMaxWidth()
        ) {
            Text(if (isConnected) "Disconnect" else "Connect")
        }
    }
}
```
## Best Practices

**Use model properties for camera constraints**

Always use the model's fps, width, and height properties when configuring camera capture to ensure optimal performance.

```kotlin
val model = RealtimeModels.MIRAGE_V2
capturer.startCapture(model.width, model.height, model.fps)
```
For best results, keep enhance = true (default) to let Decart’s AI enhance your prompts. Only disable it if you need exact prompt control.
**Collect flows in lifecycle-aware scopes**

Always collect connectionState, errors, and other flows in lifecycleScope or viewModelScope to avoid leaks and ensure proper cancellation.
Always call disconnect() then release() when done. Stop camera capture separately. Failing to release can leak native WebRTC resources.
Always test camera features on real Android devices. The emulator does not support WebRTC camera access.
**Request permissions properly**

Request CAMERA and RECORD_AUDIO permissions at runtime before attempting to connect. Handle permission denials gracefully in your UI.
## API Reference

### `realtime.initialize(eglBase?)`

Initialize the WebRTC peer connection factory. Optional: called automatically by connect() if not called explicitly.

Parameters:

- `eglBase: EglBase?` (optional) - EGL context for hardware-accelerated video. A default is created if omitted.

### `realtime.connect(localVideoTrack, localAudioTrack, options)`

Connect to a realtime model and start streaming. Suspend function.

Parameters:

- `localVideoTrack: VideoTrack?` - Camera video track (`null` for Live Avatar)
- `localAudioTrack: AudioTrack?` - Microphone audio track (`null` to omit)
- `options: ConnectOptions` - Connection configuration
  - `model: RealtimeModel` - Model from `RealtimeModels`
  - `onRemoteVideoTrack: (VideoTrack) -> Unit` - Callback for transformed video
  - `onRemoteAudioTrack: ((AudioTrack) -> Unit)?` - Callback for remote audio
  - `initialPrompt: InitialPrompt?` - Starting prompt (text + enhance)
  - `initialImage: String?` - Base64-encoded reference image

Throws: `IllegalStateException` if factory initialization fails

### `realtime.setPrompt(prompt, enhance)`

Change the transformation style.

Parameters:

- `prompt: String` - Text description of the desired style
- `enhance: Boolean` - Whether to enhance the prompt (default: `true`)

Throws: `IllegalStateException` if not connected

### `realtime.setImage(imageBase64, prompt, enhance, timeout)`

Send a reference image. Suspend function.

Parameters:

- `imageBase64: String?` - Base64-encoded image, or `null` to clear
- `prompt: String?` - Optional text prompt
- `enhance: Boolean?` - Whether to enhance the prompt
- `timeout: Long` - Ack timeout in ms (default: 30,000)

Throws: `IllegalStateException` if not connected

### `realtime.playAudio(audioData)`

Play audio through the Live Avatar stream. Suspend function.

Parameters:

- `audioData: ByteArray` - Raw audio data (WAV, MP3, etc.)

Throws: `IllegalStateException` if not in Live Avatar mode

### `realtime.disconnect()`

End the current session and clean up WebRTC resources.

### `realtime.release()`

Release all resources, including the peer connection factory and EGL context. Call when you are done with the client entirely.

### `realtime.isConnected()`

Returns: `Boolean` - Whether currently connected.

### `realtime.isPlayAudioAvailable()`

Returns: `Boolean` - Whether playAudio() is available (Live Avatar mode).

### `realtime.getEglBaseContext()`

Returns: `EglBase.Context?` - EGL context for initializing SurfaceViewRenderer.

### `realtime.createVideoSource(isScreencast)`

Returns: `VideoSource?` - A new video source for camera capture.

### `realtime.createVideoTrack(id, source)`

Returns: `VideoTrack?` - A new video track from a video source.
### Observable State

| Property | Type | Description |
|---|---|---|
| `connectionState` | `StateFlow<ConnectionState>` | Current connection state |
| `errors` | `SharedFlow<DecartError>` | Error events |
| `stats` | `SharedFlow<WebRTCStats>` | WebRTC performance stats |
| `generationTicks` | `SharedFlow<GenerationTickMessage>` | Billing/usage tick events |
| `diagnostics` | `SharedFlow<DiagnosticEvent>` | Connection diagnostic events |
| `sessionId` | `String?` | Current session ID |
## Next Steps