Case Study

AI Avatar Preservation Platform

A privacy-first AI system that transforms video recordings into interactive digital avatars — processing locally on-device, then enriching with cloud AI for personality, voice, and face profiles that family members can converse with.

Next.js, React, TypeScript, Firebase, Ollama (LLaVA + Llama 3.1), FFmpeg WASM, Google Cloud AI, Vertex AI

Before: Static Memories

When someone passes away or becomes unavailable, all that remains are static photos and videos — no way to interact, ask questions, or hear their perspective on new events.

📸

Static Media Only

Photos and videos can't respond, can't adapt, can't hold a conversation.

🧠

Personality Lost

Mannerisms, communication style, humour, and emotional tone aren't captured by photos.

🔒

Privacy Concerns

Uploading personal video to cloud AI services raises significant privacy risks.

Challenge: Preserve a person's identity as an interactive AI — without compromising their privacy.

Stage 1: Video Recording

The user records themselves via the browser. The app captures video and audio, showing a live webcam preview with recording controls.

🎥

Browser-Based Capture

WebRTC webcam + microphone recording with live preview, timer, and recording indicator.

🔴

Recording Controls

Start/stop controls with a pulsing visual indicator. When recording stops, the video blob is ready for local processing.

💾

Local Storage

Raw video stays on the user's machine — never uploaded to the cloud.
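
As a rough illustration of the capture step, the sketch below picks a recording format and formats the on-screen timer. It assumes the recorder is the browser's MediaRecorder API, which the source does not state explicitly. The candidate list, `pickRecordingMimeType`, and `formatElapsed` are hypothetical names introduced here. The support check is injected so the logic can run outside a browser.

```typescript
// Predicate deciding whether the runtime can record a given MIME type.
// In a browser this would typically be (t) => MediaRecorder.isTypeSupported(t).
type SupportCheck = (mimeType: string) => boolean;

// Hypothetical preference order; the source does not name a container or codec.
const CANDIDATE_MIME_TYPES: readonly string[] = [
  "video/webm;codecs=vp9,opus",
  "video/webm;codecs=vp8,opus",
  "video/webm",
  "video/mp4",
];

// Returns the first candidate the runtime supports, or undefined if none match.
function pickRecordingMimeType(
  isSupported: SupportCheck,
  candidates: readonly string[] = CANDIDATE_MIME_TYPES,
): string | undefined {
  for (const candidate of candidates) {
    if (isSupported(candidate)) {
      return candidate;
    }
  }
  return undefined;
}

// Formats elapsed recording time for the timer display as mm:ss.
function formatElapsed(totalSeconds: number): string {
  const whole = Math.max(0, Math.floor(totalSeconds));
  const minutes = Math.floor(whole / 60);
  const seconds = whole % 60;
  const pad = (n: number): string => (n < 10 ? "0" : "") + String(n);
  return pad(minutes) + ":" + pad(seconds);
}
```

In the browser, the chosen type would be passed as the `mimeType` option when constructing the recorder, and the recorded chunks combined into the video blob that Stage 2 consumes.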

Stage 2: On-Device AI Processing

The key differentiator: all initial AI processing runs locally via Ollama. No raw video ever leaves the user's machine.

Input

Video Blob

Raw recording

Compression

FFmpeg WASM

Extract a key frame, resize it to 512×512, and encode it as JPEG at 70% quality

↓ processed locally ↓

Vision

LLaVA Model

Scene, objects, emotions, people — structured JSON

Personality

Llama 3.1 (8B)

Big Five traits, communication style, emotional state

↓

100% On-Device — No raw data leaves the machine
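
The local pipeline above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code. It assumes Ollama's default local endpoint and its `/api/generate` request shape (`model`, `prompt`, `images`, `format`, `stream`). The function names, the prompts, the one-second seek offset, and the `-q:v 5` quality setting are all hypothetical. The source also does not say which text feeds the personality model, so `analysisText` is a placeholder.

```typescript
// Default address of a local Ollama server; adjust if yours listens elsewhere.
const OLLAMA_GENERATE_URL = "http://localhost:11434/api/generate";

// FFmpeg arguments that extract one frame and scale it to 512x512.
// "-q:v 5" is an assumed stand-in for "JPEG 70%": FFmpeg's MJPEG encoder
// takes a quantizer scale (lower is better), not a percentage.
function buildKeyFrameArgs(
  inputName: string,
  outputName: string,
  atSeconds: number = 1,
): string[] {
  return [
    "-ss", String(atSeconds), // seek to the frame to extract (assumed offset)
    "-i", inputName,
    "-frames:v", "1",         // emit a single video frame
    "-vf", "scale=512:512",   // resize as described in the pipeline
    "-q:v", "5",
    outputName,
  ];
}

// Subset of the Ollama /api/generate request body used in this sketch.
interface OllamaGenerateRequest {
  model: string;
  prompt: string;
  images?: string[]; // base64-encoded images, for multimodal models
  format?: "json";   // ask the model to answer with JSON
  stream: boolean;
}

// Request for LLaVA: describe the key frame as structured JSON.
function buildVisionRequest(frameBase64: string): OllamaGenerateRequest {
  return {
    model: "llava",
    prompt:
      "Describe this image. Respond with a JSON object with the keys " +
      "scene, objects, emotions, and people.",
    images: [frameBase64],
    format: "json",
    stream: false,
  };
}

// Request for Llama 3.1 (8B): estimate personality from prior analysis text.
function buildPersonalityRequest(analysisText: string): OllamaGenerateRequest {
  return {
    model: "llama3.1:8b",
    prompt:
      "Estimate Big Five traits (0 to 1), communication style, and emotional " +
      "state from the following analysis. Respond with one JSON object.\n\n" +
      analysisText,
    format: "json",
    stream: false,
  };
}

// Parses the model's reply text; returns undefined unless it is a JSON object.
function parseModelJson(responseText: string): Record<string, unknown> | undefined {
  try {
    const parsed: unknown = JSON.parse(responseText);
    if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
      return parsed as Record<string, unknown>;
    }
    return undefined;
  } catch {
    return undefined;
  }
}
```

Each request body would be sent as JSON in a POST to `OLLAMA_GENERATE_URL`, and the `response` string from the reply passed through `parseModelJson`. Because both the FFmpeg step and the model calls run on the user's machine, only the parsed JSON would need to leave it.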

Stage 3: Cloud AI Enrichment

Only lightweight JSON analysis results go to the cloud. Google Cloud APIs add deeper analysis: transcription, facial landmarks, sentiment, and voice synthesis.

🗣️

Speech-to-Text

Transcription with speaker diarization, word timestamps, and punctuation.

👁️

Cloud Vision

Face detection, landmarks, emotion analysis with confidence scores.

📊

Natural Language

Sentiment analysis, entity extraction, syntax analysis of speech.

🔊

Text-to-Speech

Synthesised voice profile matching the user's speaking patterns.
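
One way the word timestamps from transcription could feed the speaking-pattern measurements is sketched below. It assumes the timestamps have already been normalised to seconds. The `TimedWord` shape, the function name, and the 0.5-second pause threshold are hypothetical choices for illustration, not values from the project.

```typescript
// One transcribed word with its start and end offsets, in seconds.
interface TimedWord {
  word: string;
  startSec: number;
  endSec: number;
}

interface SpeakingStats {
  wordsPerMinute: number;
  pauseCount: number;
  meanPauseSec: number;
}

// Derives speaking rate and pause statistics from word-level timestamps.
// Gaps between consecutive words at or above the threshold count as pauses.
function computeSpeakingStats(
  words: TimedWord[],
  pauseThresholdSec: number = 0.5,
): SpeakingStats {
  if (words.length === 0) {
    return { wordsPerMinute: 0, pauseCount: 0, meanPauseSec: 0 };
  }

  let firstStart = Infinity;
  let lastEnd = -Infinity;
  let previousEnd: number | undefined;
  let pauseCount = 0;
  let totalPauseSec = 0;

  for (const current of words) {
    firstStart = Math.min(firstStart, current.startSec);
    lastEnd = Math.max(lastEnd, current.endSec);
    if (previousEnd !== undefined) {
      const gap = current.startSec - previousEnd;
      if (gap >= pauseThresholdSec) {
        pauseCount += 1;
        totalPauseSec += gap;
      }
    }
    previousEnd = current.endSec;
  }

  const durationSec = lastEnd - firstStart;
  return {
    wordsPerMinute: durationSec > 0 ? (words.length / durationSec) * 60 : 0,
    pauseCount,
    meanPauseSec: pauseCount > 0 ? totalPauseSec / pauseCount : 0,
  };
}
```

The resulting numbers are small enough to store alongside the other analysis results, and could later guide the speaking rate of synthesised audio.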

Stage 4: Avatar Assembly

Local and cloud analysis merge into a comprehensive avatar profile — personality, voice, face, and memory — stored in Firestore.

🧠

Personality Profile

Big Five traits, communication style, emotional tone, topic interests

✓

🔊

Voice Profile

Words per minute, pause patterns, tonal variation, vocabulary complexity

✓

👤

Face Analysis

Facial landmarks, emotion likelihoods, detection confidence scores

✓

💭

Memory Bank

Company-scoped learnings from prior sessions for contextual responses

✓
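
The merge step might look something like the following sketch. Every type and field name here is an assumption made for illustration, since the source does not publish its schema. The idea is simply to normalise the local and cloud results into one plain object that a document store such as Firestore can hold.

```typescript
interface BigFiveTraits {
  openness: number;
  conscientiousness: number;
  extraversion: number;
  agreeableness: number;
  neuroticism: number;
}

// Result of the on-device analysis (hypothetical shape).
interface LocalAnalysis {
  traits: BigFiveTraits;
  communicationStyle: string;
  emotionalTone: string;
}

// Result of the cloud enrichment (hypothetical shape).
interface CloudAnalysis {
  wordsPerMinute: number;
  faceDetectionConfidence: number;
  sentimentScore: number;
}

interface AvatarProfile {
  personality: LocalAnalysis;
  voice: { wordsPerMinute: number };
  face: { detectionConfidence: number };
  sentimentScore: number;
  memories: string[];
  updatedAt: string; // ISO 8601 timestamp
}

// Restricts a value to the inclusive range [low, high].
function clampRange(value: number, low: number, high: number): number {
  return Math.min(high, Math.max(low, value));
}

// Merges local and cloud analysis into one profile, clamping out-of-range scores.
function assembleAvatarProfile(
  local: LocalAnalysis,
  cloud: CloudAnalysis,
  memories: string[] = [],
  now: Date = new Date(),
): AvatarProfile {
  const t = local.traits;
  return {
    personality: {
      traits: {
        openness: clampRange(t.openness, 0, 1),
        conscientiousness: clampRange(t.conscientiousness, 0, 1),
        extraversion: clampRange(t.extraversion, 0, 1),
        agreeableness: clampRange(t.agreeableness, 0, 1),
        neuroticism: clampRange(t.neuroticism, 0, 1),
      },
      communicationStyle: local.communicationStyle.trim(),
      emotionalTone: local.emotionalTone.trim(),
    },
    voice: { wordsPerMinute: Math.max(0, cloud.wordsPerMinute) },
    face: { detectionConfidence: clampRange(cloud.faceDetectionConfidence, 0, 1) },
    sentimentScore: clampRange(cloud.sentimentScore, -1, 1),
    memories: memories.slice(), // copy so later edits do not mutate the caller's array
    updatedAt: now.toISOString(),
  };
}
```

Because the result contains only strings, numbers, and arrays, it could be written to Firestore as a single document. The collection and document naming would be a separate design decision not covered here.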

Stage 5: Conversational AI

Family members chat with the avatar. Responses match the person's authentic style and tone, with multi-provider LLM fallback and optional voice synthesis.

💬

Authentic Responses

LLM generates text in the person's communication style using the personality profile.

🔄

Multi-Provider Fallback

Ollama (local, privacy-first) → Vertex AI Gemini 1.5 Pro (cloud fallback).

🔊

Voice Synthesis

Optional audio responses using Google TTS matched to the person's speaking patterns.
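
The local-first fallback described above can be sketched as a loop over providers. This is a minimal illustration, assuming each provider is wrapped behind the same hypothetical `generate` function. The real Ollama and Vertex AI client calls are deliberately left out, and `buildPersonaPrompt` is an invented helper showing how profile fields might shape the prompt.

```typescript
// A chat backend wrapped behind a common interface (hypothetical shape).
interface ChatProvider {
  name: string;
  generate: (prompt: string) => Promise<string>;
}

interface ChatReply {
  provider: string;
  text: string;
}

// Tries providers in order and returns the first non-empty reply.
// Errors and empty replies are collected so the final failure is explainable.
async function generateWithFallback(
  providers: ChatProvider[],
  prompt: string,
): Promise<ChatReply> {
  const failures: string[] = [];
  for (const provider of providers) {
    try {
      const text = await provider.generate(prompt);
      if (text.trim().length > 0) {
        return { provider: provider.name, text };
      }
      failures.push(provider.name + ": empty response");
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      failures.push(provider.name + ": " + reason);
    }
  }
  throw new Error("All providers failed: " + failures.join("; "));
}

// Profile fields used to steer the reply style (hypothetical shape).
interface PersonaStyle {
  displayName: string;
  communicationStyle: string;
  emotionalTone: string;
}

// Builds a prompt asking the model to answer in the person's voice.
function buildPersonaPrompt(persona: PersonaStyle, userMessage: string): string {
  return [
    "You are speaking as " + persona.displayName + ".",
    "Communication style: " + persona.communicationStyle + ".",
    "Emotional tone: " + persona.emotionalTone + ".",
    "Stay in this voice and answer the family member's message.",
    "",
    "Message: " + userMessage,
  ].join("\n");
}
```

Listing the local provider first keeps the conversation on-device whenever Ollama is reachable, and the cloud model is only called after the local attempt fails or returns nothing.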

Results: Architecture & Impact

A privacy-first AI platform combining on-device processing with cloud enrichment to create interactive digital avatars.

Next.js + React

Dashboard with recording, chat, and monitoring

Ollama (Local AI)

LLaVA vision + Llama 3.1 personality analysis

FFmpeg WASM

In-browser video compression and frame extraction

Google Cloud AI

Speech, Vision, NLP, TTS enrichment layer

Firebase

Cloud Functions, Firestore, Storage

Vertex AI

Gemini 1.5 Pro for conversational fallback

Local

On-device AI — raw video never leaves the machine

4

Google Cloud AI services integrated

Multi

LLM provider fallback chain (local → cloud)

∞

Conversational digital legacy preserved

Privacy-First Digital Immortality

On-device AI processing meets cloud enrichment — creating interactive avatars that preserve personality, voice, and memory.