RowSpeak Private Deployment: Performance Benchmarks
This document provides reference performance data for RowSpeak Private Deployment across different hardware configurations and usage scenarios. Use it to set expectations, plan infrastructure, and validate your deployment.
Summary
| Metric | Value |
|---|---|
| Inference latency (first token) | < 100 ms |
| Average full response time | 3–8 seconds |
| Uptime SLA | 99.9% |
| Concurrent users (standard config) | 50+ |
| Data leaks | 0 (by architecture) |
Test Environment Reference
All benchmarks below were run on the following standard configuration unless otherwise noted.
| Component | Specification |
|---|---|
| CPU | 16-core Intel Xeon |
| RAM | 64 GB DDR4 |
| GPU | NVIDIA A10 (24 GB VRAM) |
| Storage | 1 TB NVMe SSD |
| OS | Ubuntu 22.04 LTS |
| Model | DeepSeek-V2 (local) |
| Network | 1 Gbps internal |
Response Time by Task Type
Concurrency Benchmarks
How RowSpeak performs as simultaneous users increase.
Performance by Hardware Configuration
| Configuration | Concurrent Users | Avg Response | P95 Response | Recommended For |
|---|---|---|---|---|
| Minimum (8-core, 32GB, 16GB VRAM) | 10–20 | 4.5s | 9s | Small teams, pilot |
| Standard (16-core, 64GB, 24GB VRAM) | 50 | 3.5s | 7s | Departments, 50–100 users |
| Enterprise (32-core, 128GB, 80GB VRAM) | 100–200 | 2.8s | 6s | Large orgs, high concurrency |
| Enterprise cluster (multi-node) | 500+ | 2.5s | 5s | Enterprise-wide rollout |
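To compare your own deployment against the Avg Response and P95 Response columns, you need both statistics computed from measured response times. The following is a minimal sketch, assuming you have already collected per-request response times in seconds. The function name `summarize_latencies` and the sample values are hypothetical and are not RowSpeak benchmark data. It uses the nearest-rank definition of P95, and other percentile definitions can give slightly different values on small samples.

```python
import statistics


def summarize_latencies(samples_s):
    """Return (mean, p95) in seconds for a list of response times."""
    if not samples_s:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_s)
    # Nearest-rank P95: the smallest sample with at least 95% of all
    # samples at or below it. Integer arithmetic avoids float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return statistics.fmean(ordered), ordered[rank - 1]


# Hypothetical measurements for illustration only.
samples = [2.9, 3.1, 3.4, 3.6, 3.3, 4.0, 3.2, 6.8, 3.5, 3.0]
mean_s, p95_s = summarize_latencies(samples)
print(f"avg={mean_s:.2f}s p95={p95_s:.2f}s")
```

With only ten samples, the P95 is simply the slowest request, so collect a few hundred requests per concurrency level before comparing against the table.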
Model Performance Comparison
Different models have different speed/quality tradeoffs. Here is how they compare on standard spreadsheet analysis tasks.
| Model | Type | Avg Response | Quality | Best For |
|---|---|---|---|---|
| DeepSeek-V2 | Open-source | 3.5s | High | General analysis, Chinese language |
| Qwen2.5-72B | Open-source | 4.1s | High | Multilingual, structured data |
| GPT-4o | Closed-source (API) | 2.8s | Very High | Complex reasoning, English |
| Claude 3.5 Sonnet | Closed-source (API) | 3.2s | Very High | Long documents, nuanced output |
| Gemini 1.5 Pro | Closed-source (API) | 3.0s | High | Mixed media, large context |
Closed-source model response times depend on the provider's API latency and your network connection to their endpoints.
Stability and Uptime
RowSpeak Private Deployment is designed for continuous operation.
- Target uptime: 99.9% (at most about 8.8 hours of downtime per year)
- Graceful degradation: if the model layer is temporarily unavailable, the application layer continues to serve cached results
- Restart recovery: full service recovery in under 60 seconds after a planned restart
- Memory stability: no memory leaks observed in 30-day continuous operation tests
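The downtime allowance implied by an uptime target is simple arithmetic, and working it out helps when planning maintenance windows. This sketch assumes a 365-day year and a 30-day month. The function name `downtime_budget_hours` is illustrative and is not part of RowSpeak.

```python
HOURS_PER_YEAR = 365 * 24   # 8,760 hours, ignoring leap years
HOURS_PER_MONTH = 30 * 24   # 720 hours, assuming a 30-day month


def downtime_budget_hours(uptime_pct, period_hours=HOURS_PER_YEAR):
    """Hours of downtime allowed in a period at the given uptime percentage."""
    return period_hours * (1 - uptime_pct / 100)


yearly = downtime_budget_hours(99.9)
monthly_minutes = downtime_budget_hours(99.9, HOURS_PER_MONTH) * 60
print(f"{yearly:.2f} hours per year")          # 8.76 hours per year
print(f"{monthly_minutes:.1f} minutes per month")  # 43.2 minutes per month
```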
File Processing Performance
| File Type | File Size | Processing Time |
|---|---|---|
| Single-sheet CSV | < 1 MB | < 1s |
| Multi-sheet Excel | 5 MB | 2–4s |
| Large Excel workbook | 50 MB | 8–15s |
| PDF with tables | 10 MB | 5–10s |
| Batch (10 files) | 50 MB total | 20–40s |
Planning Your Deployment
Use the hardware sizing table above as a starting point. For a more precise recommendation based on your team size, file types, and usage patterns, request the Deployment Pack, which includes a sizing worksheet.
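As a first pass, the hardware sizing table can be turned into a simple lookup by expected concurrent users. This is a rough sketch, not the sizing worksheet. It assumes the upper bound of each published range is the cutoff for that tier and that any load above 200 concurrent users needs the multi-node cluster. Both cutoffs are assumptions, since the table does not cover the ranges between tiers. The names `Tier`, `TIERS`, and `recommend_tier` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    max_concurrent: Optional[int]  # None means no upper bound
    avg_response_s: float
    p95_response_s: float


# Values copied from the hardware configuration table; the cutoffs
# between tiers are an assumption (upper bound of each range).
TIERS = [
    Tier("Minimum", 20, 4.5, 9.0),
    Tier("Standard", 50, 3.5, 7.0),
    Tier("Enterprise", 200, 2.8, 6.0),
    Tier("Enterprise cluster", None, 2.5, 5.0),
]


def recommend_tier(concurrent_users):
    """Return the smallest tier whose cutoff covers the expected load."""
    if concurrent_users < 1:
        raise ValueError("concurrent_users must be at least 1")
    for tier in TIERS:
        if tier.max_concurrent is None or concurrent_users <= tier.max_concurrent:
            return tier
    raise RuntimeError("unreachable: the last tier has no upper bound")


tier = recommend_tier(120)
print(f"{tier.name}: avg {tier.avg_response_s}s, P95 {tier.p95_response_s}s")
```

Treat the result as a starting tier to benchmark, since file sizes and task mix also affect response times.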
For a live performance demonstration using your own file types, book a demo.