System Layers

🖥️

User Interface

Modern reactive frontend with server-side rendering

SvelteKit 2, TypeScript, CSS3

API Layer

Edge-deployed API endpoints with sub-100ms latency

SvelteKit Routes, REST APIs, JSON
🧠

AI Processing

LLM inference and embeddings at the edge

Workers AI, Llama 3.1 8B, BGE Embeddings
💾

Data Layer

Cloudflare-native storage solutions

D1 Database, Vectorize, R2 Storage
☁️

Infrastructure

300+ edge locations worldwide

Cloudflare Pages, Edge Network, Global CDN

Data Flows

Document Analysis Flow

  1. User inputs text or uploads a document
  2. Frontend sends it to the /api/analyze endpoint
  3. Workers AI extracts entities, sentiment, and keywords
  4. JSON response is rendered in real time
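
The analysis steps above can be sketched in TypeScript roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual handler: the `Analysis` field names, the prompt wording, the `RunModel` type (a stand-in for whatever the AI binding exposes), and the model identifier are all assumed for the example.

```typescript
// Assumed response shape for /api/analyze; the field names are illustrative.
interface Analysis {
  entities: string[];
  sentiment: "positive" | "neutral" | "negative";
  keywords: string[];
}

type ChatMessage = { role: "system" | "user"; content: string };

// Hypothetical stand-in for the model binding: model name and messages in, text out.
type RunModel = (
  model: string,
  input: { messages: ChatMessage[] }
) => Promise<{ response?: string }>;

// Assumed identifier for the Llama 3.1 8B model named in the AI Processing layer.
const MODEL = "@cf/meta/llama-3.1-8b-instruct";

// Step 3, part one: ask the model for a strict JSON answer.
function buildMessages(text: string): ChatMessage[] {
  return [
    {
      role: "system",
      content:
        'Reply with JSON only: {"entities": string[], ' +
        '"sentiment": "positive"|"neutral"|"negative", "keywords": string[]}',
    },
    { role: "user", content: text },
  ];
}

// Step 3, part two: models sometimes wrap JSON in prose, so take the outermost {...} span
// and fall back to safe defaults for any field that is missing or malformed.
function parseAnalysis(raw: string): Analysis {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("no JSON object in model output");
  }
  const data = JSON.parse(raw.slice(start, end + 1)) as Partial<Analysis>;
  const sentiments = ["positive", "neutral", "negative"] as const;
  return {
    entities: Array.isArray(data.entities) ? data.entities.map(String) : [],
    sentiment: sentiments.find((s) => s === data.sentiment) ?? "neutral",
    keywords: Array.isArray(data.keywords) ? data.keywords.map(String) : [],
  };
}

// Steps 2 to 4 combined: the endpoint would call this and return the result as JSON.
async function analyze(text: string, run: RunModel): Promise<Analysis> {
  const result = await run(MODEL, { messages: buildMessages(text) });
  return parseAnalysis(result.response ?? "");
}
```

Passing the model runner in as a parameter keeps the parsing logic testable without a live binding; in a deployed endpoint the runner would wrap the platform's AI binding.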

RAG Query Flow

  1. User uploads documents for indexing
  2. BGE embeddings generated and stored
  3. User query converted to embedding
  4. Cosine similarity finds relevant chunks
  5. Llama 3.1 generates contextual answer
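
Steps 4 and 5 of the RAG flow can be illustrated with a small in-memory sketch. It assumes embeddings are plain `number[]` vectors that have already been computed; the `Chunk` type, `topK`, and `buildRagPrompt` are hypothetical names introduced for the example, and a deployed system may well delegate the similarity search to a vector index rather than scanning in memory.

```typescript
// Hypothetical stored unit: a slice of an uploaded document plus its embedding.
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimensions do not match");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Step 4: score every chunk against the query embedding and keep the k best.
function topK(
  query: number[],
  chunks: Chunk[],
  k: number
): Array<{ chunk: Chunk; score: number }> {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Step 5 input: put the retrieved chunks in front of the question for the model.
function buildRagPrompt(question: string, hits: Array<{ chunk: Chunk }>): string {
  const context = hits.map((h, i) => `[${i + 1}] ${h.chunk.text}`).join("\n");
  return `Answer using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`;
}
```

A linear scan like this is fine for a handful of documents; it costs one similarity computation per stored chunk, which is the main reason larger collections usually move to an approximate index.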

Generation Flow

  1. User selects template and inputs
  2. Template merged with context
  3. Llama 3.1 generates content
  4. Formatted output delivered
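
Step 2 of the generation flow, merging a template with user-supplied context, might look like the sketch below. The `{{name}}` placeholder syntax and the `mergeTemplate` helper are assumptions made for illustration; the source does not specify how templates are written.

```typescript
// Replace each {{key}} placeholder with the matching value from the context.
// Throws on a missing key so an incomplete prompt never reaches the model.
function mergeTemplate(template: string, context: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match, key: string) => {
    const value = context[key];
    if (value === undefined) {
      throw new Error(`missing template value: ${key}`);
    }
    return value;
  });
}

// Example: the merged string is what would be sent to the model in step 3.
const prompt = mergeTemplate("Write a {{tone}} email about {{ topic }}.", {
  tone: "friendly",
  topic: "the product launch",
});
console.log(prompt);
```

Failing loudly on a missing value is a design choice for the sketch; substituting an empty string would be the more lenient alternative.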

Technical Decisions

Why we chose this stack over alternatives.

SvelteKit over Next.js

Native Cloudflare adapter, smaller bundles, faster builds. Personal expertise enables rapid iteration.

Impact: 40% smaller bundle size

Workers AI over OpenAI

Data stays in Europe, no API key management, integrated billing. GDPR compliant by design.

Impact: Zero external dependencies

Edge over Origin

Processing happens at 300+ edge locations. No cold starts, no scaling concerns.

Impact: <100ms latency globally

Serverless over Containers

No infrastructure management. Scales automatically from 0 to millions of requests.

Impact: $0 idle cost

Performance

<100ms API Latency
300+ Edge Locations
0ms Cold Start
Auto-Scale

See It In Action

Experience the architecture through our live demos.