RowSpeak Private Deployment: Performance Benchmarks

This document provides reference performance data for RowSpeak Private Deployment across different hardware configurations and usage scenarios. Use it to set expectations, plan infrastructure, and validate your deployment.

Summary

Metric	Value
Inference latency (first token)	< 100 ms
Average full response time	3–8 seconds
Uptime SLA	99.9%
Concurrent users (standard config)	50+
Data leaks	0 (by architecture)

Test Environment Reference

All benchmarks below were run on the following standard configuration unless otherwise noted.

Component	Specification
CPU	16-core Intel Xeon
RAM	64 GB DDR4
GPU	NVIDIA A10 (24 GB VRAM)
Storage	1 TB NVMe SSD
OS	Ubuntu 22.04 LTS
Model	DeepSeek-V2 (local)
Network	1 Gbps internal

Response Time by Task Type

Concurrency Benchmarks

How RowSpeak performs as simultaneous users increase.
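
To reproduce this kind of measurement against your own deployment, a minimal sketch follows. It fires a fixed number of simultaneous requests and reports the average and P95 latency for each concurrency level. The `fake_send_request` function is a hypothetical stand-in, not a RowSpeak API; replace it with whatever call reaches your deployment. The P95 here uses the nearest-rank method, which is an assumption about how the percentile is defined.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    # Integer form of ceil(0.95 * n), giving a 1-based rank.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def timed_call(send_request, prompt):
    """Return the wall-clock seconds one request takes."""
    start = time.perf_counter()
    send_request(prompt)
    return time.perf_counter() - start


def run_level(send_request, prompt, users):
    """Fire `users` simultaneous requests and summarize their latencies."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        latencies = list(
            pool.map(lambda _: timed_call(send_request, prompt), range(users))
        )
    return {
        "users": users,
        "avg_s": sum(latencies) / len(latencies),
        "p95_s": p95(latencies),
    }


def fake_send_request(prompt):
    """Hypothetical stand-in; replace with a real call to your deployment."""
    time.sleep(0.01)


if __name__ == "__main__":
    for users in (1, 5, 10):
        print(run_level(fake_send_request, "Summarize sales by region", users))
```

Running each level several times and comparing the trend, rather than a single run, gives a steadier picture of how latency grows with load.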

Performance by Hardware Configuration

Configuration	Concurrent Users	Avg Response	P95 Response	Recommended For
Minimum (8-core, 32 GB RAM, 16 GB VRAM)	10–20	4.5s	9s	Small teams, pilot
Standard (16-core, 64 GB RAM, 24 GB VRAM)	50	3.5s	7s	Departments, 50–100 users
Enterprise (32-core, 128 GB RAM, 80 GB VRAM)	100–200	2.8s	6s	Large orgs, high concurrency
Enterprise cluster (multi-node)	500+	2.5s	5s	Enterprise-wide rollout
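
As a rough illustration of how the table above can be used for sizing, the sketch below picks the smallest configuration whose listed concurrent-user range covers an expected peak. The upper bounds are copied from the table; the `pick_tier` function and its rule for peaks that fall between listed ranges (round up to the next tier) are assumptions for illustration, not an official sizing method.

```python
# Upper bound of the concurrent-user range for each configuration, copied
# from the table above. The cluster row is listed as "500+", so it is
# treated as having no upper bound.
TIERS = [
    ("Minimum", 20),
    ("Standard", 50),
    ("Enterprise", 200),
    ("Enterprise cluster", None),
]


def pick_tier(concurrent_users):
    """Return the smallest configuration that covers the expected peak.

    Assumption: a peak that falls between two listed ranges (for example
    300 users) is rounded up to the next larger configuration.
    """
    if concurrent_users < 1:
        raise ValueError("concurrent_users must be at least 1")
    for name, upper in TIERS:
        if upper is None or concurrent_users <= upper:
            return name
    raise AssertionError("unreachable: the last tier has no upper bound")


if __name__ == "__main__":
    for peak in (15, 50, 120, 800):
        print(peak, "->", pick_tier(peak))
```

Treat the result as a starting point only; file sizes and task mix also affect the load a configuration can carry.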

Model Performance Comparison

Different models have different speed/quality tradeoffs. Here is how they compare on standard spreadsheet analysis tasks.

Model	Type	Avg Response	Quality	Best For
DeepSeek-V2	Open-source	3.5s	High	General analysis, Chinese language
Qwen2.5-72B	Open-source	4.1s	High	Multilingual, structured data
GPT-4o	Closed-source (API)	2.8s	Very High	Complex reasoning, English
Claude 3.5 Sonnet	Closed-source (API)	3.2s	Very High	Long documents, nuanced output
Gemini 1.5 Pro	Closed-source (API)	3.0s	High	Mixed media, large context

Closed-source model response times depend on the provider's API latency and your network connection to their endpoints.

Stability and Uptime

RowSpeak Private Deployment is designed for continuous operation.

Target uptime: 99.9% (about 8.8 hours of downtime per year)
Graceful degradation: if the model layer is temporarily unavailable, the application layer continues to serve cached results
Restart recovery: full service recovery in under 60 seconds after a planned restart
Memory stability: no memory leaks observed in 30-day continuous operation tests
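
The downtime allowance implied by an uptime target is simple arithmetic. The sketch below converts a percentage into hours over a given period. The function name and the 365-day year are assumptions for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_budget_hours(uptime_percent, period_hours=HOURS_PER_YEAR):
    """Hours of downtime allowed over a period at a given uptime target."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_hours * (100 - uptime_percent) / 100


if __name__ == "__main__":
    yearly = downtime_budget_hours(99.9)
    monthly = downtime_budget_hours(99.9, period_hours=30 * 24)
    print(f"Per year: {yearly:.2f} hours")              # about 8.76 hours
    print(f"Per 30 days: {monthly * 60:.1f} minutes")   # about 43.2 minutes
```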

File Processing Performance

File Type	File Size	Processing Time
Single-sheet CSV	< 1 MB	< 1s
Multi-sheet Excel	5 MB	2–4s
Large Excel workbook	50 MB	8–15s
PDF with tables	10 MB	5–10s
Batch (10 files)	50 MB total	20–40s

Planning Your Deployment

Use the hardware sizing table above as a starting point. For a more precise recommendation based on your team size, file types, and usage patterns, request the Deployment Pack, which includes a sizing worksheet.

For a live performance demonstration using your own file types, book a demo.
