Architecture | NLP Studio

System Layers

🖥️

User Interface

Modern reactive frontend with server-side rendering

SvelteKit 2, TypeScript, CSS3

↓

⚡

API Layer

Edge-deployed API endpoints with sub-100ms latency

SvelteKit Routes, REST APIs, JSON

↓

🧠

AI Processing

LLM inference and embeddings at the edge

Workers AI, Llama 3.1 8B, BGE Embeddings

↓

💾

Data Layer

Cloudflare-native storage solutions

D1 Database, Vectorize, R2 Storage

↓

☁️

Infrastructure

300+ edge locations worldwide

Cloudflare Pages, Edge Network, Global CDN

Data Flows

Document Analysis Flow

1. The user enters text or uploads a document
2. The frontend sends the text to the /api/analyze endpoint
3. Workers AI extracts entities, sentiment, and keywords
4. The JSON response is rendered in real time
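
The analysis steps above could be sketched roughly as follows. This is a minimal illustration, not the actual /api/analyze handler: the `AiBinding` interface, the `parseAnalysis` and `analyzeText` helpers, the system prompt, and the model identifier string are all assumptions made for the example.

```typescript
// Hypothetical, minimal shape of a Workers AI-style binding.
// Declared locally so the sketch is self-contained.
interface AiBinding {
  run(
    model: string,
    input: { messages: { role: string; content: string }[] }
  ): Promise<{ response?: string }>;
}

interface Analysis {
  entities: string[];
  sentiment: string;
  keywords: string[];
}

// Assumed model identifier; the page only names "Llama 3.1 8B".
const ANALYSIS_MODEL = "@cf/meta/llama-3.1-8b-instruct";

// Pull the JSON object out of the model's text and normalise missing fields.
function parseAnalysis(raw: string): Analysis {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("no JSON object found in model output");
  }
  const data = JSON.parse(raw.slice(start, end + 1)) as Partial<Analysis>;
  return {
    entities: Array.isArray(data.entities) ? data.entities.map(String) : [],
    sentiment: typeof data.sentiment === "string" ? data.sentiment : "unknown",
    keywords: Array.isArray(data.keywords) ? data.keywords.map(String) : [],
  };
}

// Step 3 of the flow: ask the model for entities, sentiment, and keywords.
async function analyzeText(ai: AiBinding, text: string): Promise<Analysis> {
  const result = await ai.run(ANALYSIS_MODEL, {
    messages: [
      {
        role: "system",
        content: "Return only JSON with keys entities, sentiment, keywords.",
      },
      { role: "user", content: text },
    ],
  });
  return parseAnalysis(result.response ?? "");
}
```

In a SvelteKit route, `analyzeText` would presumably be called from a POST handler and its result returned as the JSON body that the frontend renders.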

RAG Query Flow

1. The user uploads documents for indexing
2. BGE embeddings are generated for each chunk and stored
3. The user's query is converted to an embedding
4. Cosine similarity finds the most relevant chunks
5. Llama 3.1 generates a contextual answer from those chunks
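
The retrieval step (ranking chunks by cosine similarity, then building the prompt) can be illustrated with a small in-memory sketch. The `Chunk` type and the `cosineSimilarity`, `topK`, and `buildPrompt` helpers are hypothetical names for this example; in a deployment that uses Vectorize, the similarity search would likely be performed by the index rather than in application code.

```typescript
// Hypothetical chunk record: text plus its precomputed embedding.
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Rank all chunks against the query embedding and keep the k best.
function topK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Assemble the retrieved chunks into a prompt for the generation step.
function buildPrompt(question: string, context: Chunk[]): string {
  const numbered = context.map((c, i) => `[${i + 1}] ${c.text}`).join("\n");
  return `Answer using only the context below.\n\nContext:\n${numbered}\n\nQuestion: ${question}`;
}
```

The string returned by `buildPrompt` would then be passed to the language model as the user message.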

Generation Flow

1. The user selects a template and provides inputs
2. The template is merged with the context
3. Llama 3.1 generates the content
4. The formatted output is delivered
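
The template merge in step 2 might look something like the sketch below. The `{{name}}` placeholder syntax and the `mergeTemplate` helper are assumptions for illustration; the page does not specify how templates are written.

```typescript
// Replace {{key}} placeholders with values from the user's inputs.
// Throws on a missing key so that an incomplete prompt never reaches the model.
function mergeTemplate(template: string, context: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match: string, key: string) => {
    if (!Object.prototype.hasOwnProperty.call(context, key)) {
      throw new Error(`missing template value: ${key}`);
    }
    return context[key];
  });
}

// Example: a summary template filled from two user inputs.
const prompt = mergeTemplate("Write a {{tone}} summary of {{ topic }}.", {
  tone: "formal",
  topic: "edge computing",
});
```

Failing loudly on a missing key is one possible design choice; substituting an empty string would be the more lenient alternative.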

Technical Decisions

Why we chose this stack over alternatives.

SvelteKit over Next.js

Native Cloudflare adapter, smaller bundles, faster builds. Personal expertise enables rapid iteration.

Impact: 40% smaller bundle size

Workers AI over OpenAI

Data stays in Europe, there are no API keys to manage, and billing is integrated. GDPR-compliant by design.

Impact: Zero external dependencies

Edge over Origin

Processing happens at 300+ edge locations. No cold starts, no scaling concerns.

Impact: <100ms latency globally

Serverless over Containers

No infrastructure management. Scales automatically from 0 to millions of requests.

Impact: $0 idle cost

Performance

<100ms API Latency

300+ Edge Locations

0ms Cold Start

∞ Auto-Scale

See It In Action

Experience the architecture through our live demos.
