Edge AI Benchmarks
913 benchmarks across 101 real devices: phones, tablets, MCUs, automotive platforms, and XR headsets. 10 edge models across 8 domains, profiled on Snapdragon NPUs, Apple ANE, LiteRT, and bare-metal Cortex-M. Every model runs in under 60 μs on the best Snapdragon NPU. The numbers below aren't projections. They're 19 hours of measurements on real silicon.
Edge Models
10 models, all sub-1 MB, all sub-60 μs on the best Snapdragon NPU. Every model profiled across 72 phone/tablet/IoT devices. Fastest per model highlighted.
| Model | Domain | Accuracy | Snapdragon NPU | ANE | LiteRT | Energy per inference | Size |
|---|---|---|---|---|---|---|---|
| Wake Word | Voice | 98.1% | 32 μs | 118 μs | 349 μs | 25.7 μJ | 229 KB |
| ECG Arrhythmia | Cardiac | ~100% | 36 μs | 41 μs | 15 μs | 2.1 μJ | 101 KB |
| TS Anomaly Detector | Time Series | 100% | 39 μs | 38 μs | 7 μs | 1.0 μJ | 36 KB |
| NanoVision CIFAR-10 | Vision | 84.9% | 41 μs | 83 μs | 36 μs | 5.4 μJ | 83 KB |
| FinSense EN | Finance | 83.9% | 42 μs | 60 μs | n/a | 7.5 μJ | 632 KB |
| Speech Emotion | Audio | 62.2% | 45 μs | 109 μs | 83 μs | 12.5 μJ | 229 KB |
| Bearing Anomaly | Industrial | 100% | 46 μs | 51 μs | 32 μs | 4.3 μJ | 40 KB |
| HAR Activity | IMU | 95.6% | 54 μs | 49 μs | 35 μs | 5.1 μJ | 29 KB |
| Audio Event | Audio | 87.5% | 55 μs | 142 μs | 127 μs | 16.6 μJ | 230 KB |
| Fall Detection | IMU | 97.5% | 58 μs | 60 μs | 57 μs | 7.4 μJ | 29 KB |
Honest note: Speech Emotion (62.2%) and FinSense edge (83.9%) are the weakest performers — audio emotion is a hard problem at sub-1 MB, and the FinSense edge variant trades accuracy for 180× size reduction. Distillation continues. We publish the real numbers, not just the good ones.
Domain Models
Purpose-built models for legal and tax work. They understand your documents, classify them accurately, and run on your infrastructure — so client data never leaves your network.
LawSense
Legal
Drop 10,000 documents from discovery. LawSense sorts them (contract, pleading, correspondence, regulation, opinion) at 99.3% accuracy, on the firm’s own hardware. Client files never leave the building.
TaxSense
Tax
Reads tax rulings and classifies them: deductions, credits, penalties, procedure. Then shows you exactly which phrases drove the decision. Not a black box: it explains itself. That matters when the IRS asks.
Multi-Platform Latency
Side-by-side across Snapdragon NPU, Apple ANE, and Google LiteRT. Plus 29 MCU/SBC boards via Edge Impulse. LiteRT dominates signal models. Snapdragon leads audio.
Energy per Inference
Measured via IOReport on M4 Max. Microjoules per inference — real power, not estimates.
CoreML/ANE vs LiteRT/XNNPACK (μJ)
Best: TS Anomaly at 1.01 μJ (LiteRT)
Inferences per AA Battery
AA = 9,360 J. Best energy path per model.
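As a rough sketch of the arithmetic behind this chart, the battery budget can be divided by the per-inference energy. The function name and the selection of models are illustrative choices; the energy values are taken from the Edge Models table, and the calculation assumes the battery powers inference only (no idle draw, no conversion losses).

```python
AA_BATTERY_JOULES = 9_360  # figure quoted in the text above


def inferences_per_battery(energy_uj: float, battery_j: float = AA_BATTERY_JOULES) -> float:
    """Inferences one battery could fund, ignoring idle draw and conversion losses."""
    if energy_uj <= 0:
        raise ValueError("energy per inference must be positive")
    return battery_j / (energy_uj * 1e-6)  # convert microjoules to joules


# Energy per inference (μJ) from the Edge Models table.
energy_uj = {"TS Anomaly": 1.0, "ECG Arrhythmia": 2.1, "Wake Word": 25.7}

for name, uj in energy_uj.items():
    print(f"{name}: {inferences_per_battery(uj):.3g} inferences per AA")
```

At roughly 1 μJ per inference, a single AA cell covers on the order of nine billion inferences under these idealized assumptions.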
Sustained Throughput
Inferences per second over 3-second burst. Peak: 134,775 inf/s (TS Anomaly, LiteRT).
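A fixed-duration burst measurement can be sketched as follows. This is not the harness used for the published numbers (those were collected in Swift and with the LiteRT runtime); `burst_throughput` and the stand-in workload are hypothetical names used only to show the counting logic.

```python
import time


def burst_throughput(run_once, duration_s: float = 3.0) -> float:
    """Call run_once repeatedly for about duration_s seconds; return calls per second."""
    count = 0
    start = time.perf_counter()
    deadline = start + duration_s
    while time.perf_counter() < deadline:
        run_once()
        count += 1
    elapsed = time.perf_counter() - start  # actual elapsed time, slightly over duration_s
    return count / elapsed


def fake_inference():
    # Stand-in workload; a real benchmark would invoke the model here.
    sum(range(100))


# Short burst so the example finishes quickly; the page reports 3-second bursts.
print(f"{burst_throughput(fake_inference, duration_s=0.1):,.0f} calls/s")
```

Dividing by the measured elapsed time rather than the requested duration avoids overstating throughput when the final call overruns the deadline.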
Distillation: 420 MB → 1.46 MB
288× compression from BERT teacher to NanoCNN student. The sub-2 MB Pareto frontier.
Platform Strengths
Four platform families, distinct strengths.
Google LiteRT
XNNPACK on M4 Max CPU
134K inf/s · 1.01 μJ
Fastest on most models. Dominates 1D signals (TS, ECG, HAR, Bearing) and also leads on Vision. Best throughput and energy. Runs anywhere with a CPU.
Apple M4 Max ANE
CoreML Neural Engine
26K inf/s · 3.91 μJ
Dedicated neural hardware for sustained always-on inference. Best on wake word and audio. Real power via IOReport.
Snapdragon NPU
72 devices · Pixel 3 to X2 Elite
Up to 30× faster than ResNet-50
All 10 models sub-60 μs on latest phones. 72 real devices profiled via QAI Hub. Thermally invisible: 0.0°C delta at max throughput.
vs. Standard Models
Same hardware, same NPU, same measurement. ResNet-50, EfficientNet-B0, MobileNetV2 — all three baselines profiled on 19 Snapdragon devices. Our models are 5-30× faster and 20-3,350× smaller.
Snapdragon NPU Latency Comparison
All on the same Snapdragon NPU via Qualcomm AI Hub. 57/57 baseline jobs completed, zero failures. Our models: 5-30× faster and 20-3,350× smaller than every standard baseline.
Thermal Impact
At maximum sustained throughput (up to 10,000+ inferences/sec, held for 60 seconds), our models produce zero measurable temperature increase on the host device.
Bearing Anomaly
603,654 inferences at 10,061/s
TS Anomaly
226,172 inferences at 3,770/s
60-second sustained burst on MacBook Pro M4 Max. No throttling detected.
FinSense Live
Real market headlines scored for sentiment. Demo — production model runs on-device at 42 μs.
Device Compatibility
913 benchmarks across 101 real devices. Phones, tablets, MCUs, automotive, XR. Search your device — see the exact latency. We publish every result, including the 67 failures.
Wake Word (Voice · 229 KB)
ECG Arrhythmia (Cardiac · 101 KB)
TS Anomaly (Time Series · 36 KB)
NanoVision (Vision · 83 KB)
FinSense EN (Finance · 632 KB)
Speech Emotion (Audio · 229 KB)
Bearing Anomaly (Industrial · 40 KB)
HAR Activity (IMU · 29 KB)
Audio Event (Audio · 230 KB)
Fall Detection (IMU · 29 KB)
923 results
| Device | Chip | Model | Latency | Status | Source |
|---|---|---|---|---|---|
| Snapdragon X2 Elite CRD | Snapdragon X2 Elite | Wake Word | 32μs | PASS | QAI |
| Snapdragon X2 Elite CRD | Snapdragon X2 Elite | ECG Arrhythmia | 36μs | PASS | QAI |
| Google Pixel 10 Pro XL | Snapdragon 8s Elite | TS Anomaly | 39μs | PASS | QAI |
| Snapdragon X2 Elite CRD | Snapdragon X2 Elite | TS Anomaly | 40μs | PASS | QAI |
| Snapdragon X2 Elite CRD | Snapdragon X2 Elite | NanoVision | 41μs | PASS | QAI |
| Snapdragon X Plus 8-Core CRD | Snapdragon X Plus | Wake Word | 42μs | PASS | QAI |
| Snapdragon X2 Elite CRD | Snapdragon X2 Elite | FinSense EN | 42μs | PASS | QAI |
| Google Pixel 9 Pro | Tensor G4 | TS Anomaly | 43μs | PASS | QAI |
| Samsung Galaxy S25+ | Snapdragon 8 Elite | TS Anomaly | 43μs | PASS | QAI |
| Snapdragon X2 Elite CRD | Snapdragon X2 Elite | Speech Emotion | 45μs | PASS | QAI |
| Snapdragon 8 Elite QRD | Snapdragon 8 Elite | Wake Word | 46μs | PASS | QAI |
| Samsung Galaxy S25 (Family) | Unknown | NanoVision | 46μs | PASS | QAI |
| Snapdragon 8 Elite Gen 5 QRD | Snapdragon 8 Elite | NanoVision | 46μs | PASS | QAI |
| Google Pixel 10 | Snapdragon 8s Elite | Bearing Anomaly | 46μs | PASS | QAI |
| Google Pixel 10 Pro XL | Snapdragon 8s Elite | Bearing Anomaly | 46μs | PASS | QAI |
| Samsung Galaxy S25+ | Snapdragon 8 Elite | Wake Word | 47μs | PASS | QAI |
| Samsung Galaxy S25+ | Snapdragon 8 Elite | NanoVision | 47μs | PASS | QAI |
| Snapdragon X Plus 8-Core CRD | Snapdragon X Plus | NanoVision | 47μs | PASS | QAI |
| Samsung Galaxy S25 | Unknown | Wake Word | 48μs | PASS | QAI |
| Snapdragon 8 Elite QRD | Snapdragon 8 Elite | TS Anomaly | 48μs | PASS | QAI |
| Samsung Galaxy S25 | Unknown | NanoVision | 48μs | PASS | QAI |
| Samsung Galaxy S25+ | Snapdragon 8 Elite | FinSense EN | 48μs | PASS | QAI |
| Samsung Galaxy S24 | Snapdragon 8 Gen 3 | Wake Word | 49μs | PASS | QAI |
| Samsung Galaxy S25 (Family) | Unknown | ECG Arrhythmia | 49μs | PASS | QAI |
| Google Pixel 6 (Family) | Tensor G1 | TS Anomaly | 49μs | PASS | QAI |
| Google Pixel 7 (Family) | Tensor G2 | TS Anomaly | 49μs | PASS | QAI |
| Snapdragon X Plus 8-Core CRD | Snapdragon X Plus | TS Anomaly | 49μs | PASS | QAI |
| Samsung Galaxy S25 | Unknown | FinSense EN | 49μs | PASS | QAI |
| Samsung Galaxy S25 (Family) | Unknown | Wake Word | 50μs | PASS | QAI |
| Snapdragon 8 Elite Gen 5 QRD | Snapdragon 8 Elite | Wake Word | 50μs | PASS | QAI |
| Samsung Galaxy S25 | Unknown | TS Anomaly | 50μs | PASS | QAI |
| Samsung Galaxy S25 Ultra | Snapdragon 8 Elite | NanoVision | 50μs | PASS | QAI |
| Samsung Galaxy S25 Ultra | Snapdragon 8 Elite | Wake Word | 51μs | PASS | QAI |
| Samsung Galaxy S25 Ultra | Snapdragon 8 Elite | ECG Arrhythmia | 51μs | PASS | QAI |
| Google Pixel 10 | Snapdragon 8s Elite | TS Anomaly | 51μs | PASS | QAI |
| Snapdragon 7 Gen 4 QRD | Snapdragon 7 Gen 4 | Bearing Anomaly | 51μs | PASS | QAI |
| Samsung Galaxy S25+ | Snapdragon 8 Elite | ECG Arrhythmia | 52μs | PASS | QAI |
| Snapdragon 8 Elite Gen 5 QRD | Snapdragon 8 Elite | ECG Arrhythmia | 52μs | PASS | QAI |
| Google Pixel 6 | Tensor G1 | TS Anomaly | 52μs | PASS | QAI |
| Samsung Galaxy S25 Ultra | Snapdragon 8 Elite | TS Anomaly | 52μs | PASS | QAI |
| Snapdragon X Plus 8-Core CRD | Snapdragon X Plus | FinSense EN | 52μs | PASS | QAI |
| Samsung Galaxy S24 (Family) | Snapdragon 8 Gen 3 | Wake Word | 53μs | PASS | QAI |
| Samsung Galaxy S25 | Unknown | ECG Arrhythmia | 53μs | PASS | QAI |
| Google Pixel 7 | Tensor G2 | TS Anomaly | 53μs | PASS | QAI |
| Samsung Galaxy S25 (Family) | Unknown | TS Anomaly | 53μs | PASS | QAI |
| Snapdragon 8 Elite QRD | Snapdragon 8 Elite | NanoVision | 53μs | PASS | QAI |
| Samsung Galaxy S25 (Family) | Unknown | FinSense EN | 53μs | PASS | QAI |
| Samsung Galaxy S25 (Family) | Unknown | Speech Emotion | 53μs | PASS | QAI |
| Samsung Galaxy S25+ | Snapdragon 8 Elite | Speech Emotion | 53μs | PASS | QAI |
| Google Pixel 8 | Tensor G3 | Bearing Anomaly | 53μs | PASS | QAI |
| Snapdragon 8 Elite Gen 5 QRD | Snapdragon 8 Elite | TS Anomaly | 54μs | PASS | QAI |
| Snapdragon 8 Elite Gen 5 QRD | Snapdragon 8 Elite | Speech Emotion | 54μs | PASS | QAI |
| Snapdragon X2 Elite CRD | Snapdragon X2 Elite | HAR Activity | 54μs | PASS | QAI |
| Samsung Galaxy S24+ | Snapdragon 8 Gen 3 | Wake Word | 55μs | PASS | QAI |
| Google Pixel 10 Pro XL | Snapdragon 8s Elite | ECG Arrhythmia | 55μs | PASS | QAI |
| Snapdragon 8 Elite Gen 5 QRD | Snapdragon 8 Elite | FinSense EN | 55μs | PASS | QAI |
| Samsung Galaxy S25 Ultra | Snapdragon 8 Elite | Speech Emotion | 55μs | PASS | QAI |
| Samsung Galaxy S25 | Unknown | Audio Event | 55μs | PASS | QAI |
| Google Pixel 10 | Snapdragon 8s Elite | ECG Arrhythmia | 56μs | PASS | QAI |
| Google Pixel 7 Pro | Tensor G2 | ECG Arrhythmia | 56μs | PASS | QAI |
| Snapdragon X Plus 8-Core CRD | Snapdragon X Plus | ECG Arrhythmia | 56μs | PASS | QAI |
| Samsung Galaxy S25 Ultra | Snapdragon 8 Elite | FinSense EN | 56μs | PASS | QAI |
| Samsung Galaxy S25 | Unknown | Speech Emotion | 56μs | PASS | QAI |
| Snapdragon X Plus 8-Core CRD | Snapdragon X Plus | Speech Emotion | 56μs | PASS | QAI |
| Samsung Galaxy S25 (Family) | Unknown | Audio Event | 56μs | PASS | QAI |
| Samsung Galaxy S24 Ultra | Snapdragon 8 Gen 3 | Wake Word | 57μs | PASS | QAI |
| Snapdragon 8 Elite QRD | Snapdragon 8 Elite | ECG Arrhythmia | 57μs | PASS | QAI |
| Google Pixel 7 Pro | Tensor G2 | TS Anomaly | 57μs | PASS | QAI |
| Samsung Galaxy S24 | Snapdragon 8 Gen 3 | TS Anomaly | 57μs | PASS | QAI |
| Samsung Galaxy S24 Ultra | Snapdragon 8 Gen 3 | TS Anomaly | 57μs | PASS | QAI |
| Google Pixel 8 (Family) | Tensor G3 | Bearing Anomaly | 57μs | PASS | QAI |
| Samsung Galaxy S24 (Family) | Snapdragon 8 Gen 3 | ECG Arrhythmia | 58μs | PASS | QAI |
| Samsung Galaxy S24 Ultra | Snapdragon 8 Gen 3 | ECG Arrhythmia | 58μs | PASS | QAI |
| Samsung Galaxy S24 Ultra | Snapdragon 8 Gen 3 | NanoVision | 58μs | PASS | QAI |
| Snapdragon X2 Elite CRD | Snapdragon X2 Elite | Fall Detection | 58μs | PASS | QAI |
| Samsung Galaxy S24 (Family) | Snapdragon 8 Gen 3 | TS Anomaly | 59μs | PASS | QAI |
| Samsung Galaxy S24+ | Snapdragon 8 Gen 3 | TS Anomaly | 59μs | PASS | QAI |
| Samsung Galaxy S24 | Snapdragon 8 Gen 3 | NanoVision | 59μs | PASS | QAI |
| Samsung Galaxy S24 (Family) | Snapdragon 8 Gen 3 | NanoVision | 59μs | PASS | QAI |
| Google Pixel 8 Pro | Tensor G3 | Bearing Anomaly | 59μs | PASS | QAI |
| Samsung Galaxy S24+ | Snapdragon 8 Gen 3 | Audio Event | 59μs | PASS | QAI |
| Google Pixel 7 (Family) | Tensor G2 | ECG Arrhythmia | 60μs | PASS | QAI |
| Samsung Galaxy S24+ | Snapdragon 8 Gen 3 | ECG Arrhythmia | 60μs | PASS | QAI |
| Google Pixel 9 | Tensor G4 | TS Anomaly | 60μs | PASS | QAI |
| Samsung Galaxy S24+ | Snapdragon 8 Gen 3 | Speech Emotion | 60μs | PASS | QAI |
| Samsung Galaxy S25 (Family) | Unknown | HAR Activity | 60μs | PASS | QAI |
| Samsung Galaxy S25+ | Snapdragon 8 Elite | HAR Activity | 60μs | PASS | QAI |
| Samsung Galaxy S24 | Snapdragon 8 Gen 3 | Audio Event | 60μs | PASS | QAI |
| Samsung Galaxy S24 (Family) | Snapdragon 8 Gen 3 | Speech Emotion | 61μs | PASS | QAI |
| Samsung Galaxy S25 Ultra | Snapdragon 8 Elite | HAR Activity | 61μs | PASS | QAI |
| Samsung Galaxy S23 | Snapdragon 8 Gen 2 | NanoVision | 62μs | PASS | QAI |
| Samsung Galaxy S24+ | Snapdragon 8 Gen 3 | NanoVision | 62μs | PASS | QAI |
| Samsung Galaxy S24+ | Snapdragon 8 Gen 3 | FinSense EN | 62μs | PASS | QAI |
| Samsung Galaxy S24 | Snapdragon 8 Gen 3 | Speech Emotion | 62μs | PASS | QAI |
| Snapdragon 8 Elite QRD | Snapdragon 8 Elite | Speech Emotion | 62μs | PASS | QAI |
| Google Pixel 6 (Family) | Tensor G1 | Bearing Anomaly | 62μs | PASS | QAI |
| Samsung Galaxy S24 (Family) | Snapdragon 8 Gen 3 | FinSense EN | 63μs | PASS | QAI |
| Snapdragon 8 Elite QRD | Snapdragon 8 Elite | FinSense EN | 63μs | PASS | QAI |
| Samsung Galaxy S21 | Snapdragon 888 | Bearing Anomaly | 63μs | PASS | QAI |
| Samsung Galaxy S21 Ultra | Snapdragon 888 | Bearing Anomaly | 63μs | PASS | QAI |
Methodology
Qualcomm AI Hub
ONNX models profiled on real devices via cloud-hosted NPU profiling across 72 phones, tablets, XR headsets, and automotive platforms. Median latency reported.
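To illustrate why a median is a reasonable summary for latency, here is a small sketch with made-up samples. The values and the `summarize` helper are hypothetical and are not output from Qualcomm AI Hub; they only show how one stalled run distorts the mean but not the median.

```python
from statistics import mean, median

# Hypothetical per-run latencies in microseconds; one run hit a scheduling stall.
samples_us = [41, 39, 40, 250, 38]


def summarize(latencies_us):
    """Return the median and mean of the samples, plus the number of runs."""
    return {
        "median_us": median(latencies_us),
        "mean_us": mean(latencies_us),
        "runs": len(latencies_us),
    }


print(summarize(samples_us))
```

With these samples the median stays at 40 μs while the mean is pulled above 80 μs by the single outlier.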
Edge Impulse
TFLite models deployed and profiled on 29 MCU/MPU boards from Cortex-M4F to Jetson Orin. Includes BrainChip neuromorphic, NXP NPU, and Synaptics edge accelerators.
Apple CoreML / ANE
ONNX → PyTorch trace → CoreML via coremltools. Benchmarked natively in Swift on M4 Max Neural Engine. 3-second sustained burst.
Google AI Edge LiteRT
ONNX → TFLite via onnx2tf. Benchmarked with XNNPACK delegate on M4 Max CPU. 3-second sustained burst.
Power Measurement
Real power via Apple IOReport (no sudo). CPU+GPU channels sampled during inference. Energy = (power × time) / count. Idle baseline subtracted.
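The energy formula above, with the idle baseline subtracted, can be sketched like this. The function name and the input numbers are illustrative assumptions, not measurements from the benchmark, and the sketch takes average power as given rather than reading IOReport.

```python
def energy_per_inference_uj(
    active_power_w: float,
    idle_power_w: float,
    duration_s: float,
    inference_count: int,
) -> float:
    """Energy = (power × time) / count, using power net of the idle baseline."""
    if inference_count <= 0:
        raise ValueError("inference_count must be positive")
    net_power_w = active_power_w - idle_power_w   # subtract idle baseline
    energy_j = net_power_w * duration_s           # joules spent on inference
    return energy_j / inference_count * 1e6       # joules to microjoules


# Hypothetical burst: 2.5 W average while running, 0.5 W idle, 3 s, 300,000 inferences.
print(f"{energy_per_inference_uj(2.5, 0.5, 3.0, 300_000):.1f} μJ per inference")
```

Subtracting the idle baseline attributes only the incremental power to the model, which is why the per-inference figures can be far smaller than total device power would suggest.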
Failures
We publish every failure. Galaxy Tab A8 (2021) fails on QAI (device farm issue). Memryx MX3 and Qualcomm RB3 Gen 2 DK fail on Edge Impulse (platform bugs, not model issues).