March 2026 — 10 edge models, ~100 devices, 913 benchmarks on real silicon

Edge AI Benchmarks

913 benchmarks across ~100 real devices — phones, tablets, MCUs, automotive, XR headsets. 10 edge models across 8 domains, profiled on Snapdragon NPUs, Apple ANE, LiteRT, and bare-metal Cortex-M. Every model sub-60 μs on the best phone. The numbers below aren't projections. They're 19 hours of real silicon.

Edge Models

10 models, all sub-1 MB, all sub-60μs on the best Snapdragon NPU. Every model profiled across 72 phone/tablet/IoT devices. Fastest per model highlighted.

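| Model | Domain | Accuracy | Snapdragon | ANE | LiteRT | Energy | Size |
|---|---|---|---|---|---|---|---|
| Wake Word | Voice | 98.1% | 32 μs | 118 μs | 349 μs | 25.7 μJ | 229 KB |
| ECG Arrhythmia | Cardiac | ~100% | 36 μs | 41 μs | 15 μs | 2.1 μJ | 101 KB |
| TS Anomaly Detector | Time Series | 100% | 39 μs | 38 μs | 7 μs | 1.0 μJ | 36 KB |
| NanoVision CIFAR-10 | Vision | 84.9% | 41 μs | 83 μs | 36 μs | 5.4 μJ | 83 KB |
| FinSense EN | Finance | 83.9% | 42 μs | 60 μs* | * | 7.5 μJ | 632 KB |
| Speech Emotion | Audio | 62.2% | 45 μs | 109 μs | 83 μs | 12.5 μJ | 229 KB |
| Bearing Anomaly | Industrial | 100% | 46 μs | 51 μs | 32 μs | 4.3 μJ | 40 KB |
| HAR Activity | IMU | 95.6% | 54 μs | 49 μs | 35 μs | 5.1 μJ | 29 KB |
| Audio Event | Audio | 87.5% | 55 μs | 142 μs | 127 μs | 16.6 μJ | 230 KB |
| Fall Detection | IMU | 97.5% | 58 μs | 60 μs | 57 μs | 7.4 μJ | 29 KB |

* FinSense EN lists only one latency after the Snapdragon figure; whether 60 μs is the ANE or the LiteRT value is not stated.

Sources: Snapdragon = Qualcomm AI Hub · 72 devices; ANE = CoreML · M4 Max ANE; LiteRT = Google LiteRT · XNNPACK.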

Honest note: Speech Emotion (62.2%) and FinSense edge (83.9%) are the weakest performers — audio emotion is a hard problem at sub-1 MB, and the FinSense edge variant trades accuracy for 180× size reduction. Distillation continues. We publish the real numbers, not just the good ones.

Domain Models

Purpose-built models for legal and tax work. They understand your documents, classify them accurately, and run on your infrastructure — so client data never leaves your network.

LawSense

Legal
Accuracy: 99.3%
F1 Score: 0.992 macro F1
Model Size: 704 MB

Drop 10,000 documents from discovery. LawSense sorts them — contract, pleading, correspondence, regulation, opinion — at 99.3% accuracy. On the firm’s own hardware. Client files never leave the building.

DeBERTa-v3 · GPU Inference

TaxSense

Tax
Accuracy: 86%
F1 Score: 0.86 macro F1
Model Size: <5 MB

Reads tax rulings and classifies them: deductions, credits, penalties, procedure. Then shows you exactly which phrases drove the decision. Not a black box — it explains itself. That matters when the IRS asks.

DeBERTa-v3 · GPU Inference

Multi-Platform Latency

Side-by-side across Snapdragon NPU, Apple ANE, and Google LiteRT. Plus 29 MCU/SBC boards via Edge Impulse. LiteRT dominates signal models. Snapdragon leads audio.

Energy per Inference

Measured via IOReport on M4 Max. Microjoules per inference — real power, not estimates.

CoreML/ANE vs LiteRT/XNNPACK (μJ)

Best: TS Anomaly at 1.01 μJ (LiteRT)

Inferences per AA Battery

AA = 9,360 J. Best energy path per model.

TS Anomaly: 9.3B
ECG: 4.5B
Bearing: 2.2B
HAR: 1.8B
NanoVision: 1.7B
Fall: 1.3B
FinSense: 1.3B
Speech Emotion: 746M
Audio Event: 563M
Wake Word: 365M
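The battery figures above follow from dividing the AA energy budget by the energy per inference. A minimal sketch of that arithmetic, assuming the per-model energies are the rounded values from the Edge Models table (1.01 μJ for TS Anomaly, as quoted above). Because those inputs are rounded, the results land within a few percent of the listed counts rather than matching them digit for digit. The function and variable names are illustrative, not part of any published tooling.

```python
AA_ENERGY_J = 9_360  # AA cell energy budget stated in the text

# Energy per inference in microjoules (rounded values from the Edge Models table).
ENERGY_UJ = {
    "TS Anomaly": 1.01,
    "ECG": 2.1,
    "Bearing": 4.3,
    "HAR": 5.1,
    "NanoVision": 5.4,
    "Fall": 7.4,
    "FinSense": 7.5,
    "Speech Emotion": 12.5,
    "Audio Event": 16.6,
    "Wake Word": 25.7,
}


def inferences_per_battery(energy_uj, battery_j=AA_ENERGY_J):
    """Number of inferences one battery could power, ignoring all other loads."""
    return battery_j / (energy_uj * 1e-6)


def humanize(n):
    """Format a count as billions (B) or millions (M)."""
    return f"{n / 1e9:.1f}B" if n >= 1e9 else f"{n / 1e6:.0f}M"


if __name__ == "__main__":
    for name, uj in ENERGY_UJ.items():
        print(f"{name}: {humanize(inferences_per_battery(uj))}")
```

This is an upper bound: it ignores sensor power, wake-up cost, and battery self-discharge.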

Sustained Throughput

Inferences per second over 3-second burst. Peak: 134,775 inf/s (TS Anomaly, LiteRT).
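A sustained-throughput number of this kind can be obtained by calling the model in a tight loop for a fixed wall-clock window and dividing the call count by the elapsed time. The sketch below shows that pattern in plain Python. It is an assumption about the general approach, not the harness used for these results (the ANE numbers were benchmarked in Swift), and `dummy_infer` is a hypothetical stand-in for a real interpreter call.

```python
import time


def sustained_throughput(infer, duration_s=3.0):
    """Call infer() repeatedly for duration_s seconds.

    Returns (count, inferences_per_second).
    """
    count = 0
    start = time.perf_counter()
    deadline = start + duration_s
    while time.perf_counter() < deadline:
        infer()
        count += 1
    elapsed = time.perf_counter() - start
    return count, count / elapsed


def dummy_infer():
    # Hypothetical stand-in workload; a real run would invoke the model here.
    sum(range(100))


if __name__ == "__main__":
    count, rate = sustained_throughput(dummy_infer, duration_s=0.2)
    print(f"{count} calls, {rate:,.0f} calls/s")
```

At single-digit microsecond latencies the per-iteration clock read is no longer negligible, so a native harness would typically check the clock only every N calls.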

Distillation: 420 MB → 1.46 MB

288× compression from BERT teacher to NanoCNN student. The sub-2 MB Pareto frontier.

BERT Teacher · DistilBERT · NanoCNN FP32 · NanoCNN INT8

Platform Strengths

Four platform families, distinct strengths.

Google LiteRT

XNNPACK on M4 Max CPU

7 μs

134K inf/s · 1.01 μJ

Fastest on most models. Dominates 1D signal models (TS, ECG, HAR, Bearing) and also leads on vision. Best throughput and energy. Runs anywhere with a CPU.

Apple M4 Max ANE

CoreML Neural Engine

38 μs

26K inf/s · 3.91 μJ

Dedicated neural hardware for sustained always-on inference. Best on wake word and audio. Real power via IOReport.

Snapdragon NPU

72 devices · Pixel 3 to X2 Elite

32 μs

30× faster than ResNet-50

All 10 models sub-60μs on latest phones. 72 real devices profiled via QAI Hub. Thermally invisible — 0.0°C delta at max throughput.

vs. Standard Models

Same hardware, same NPU, same measurement. ResNet-50, EfficientNet-B0, MobileNetV2 — all three baselines profiled on 19 Snapdragon devices. Our models are 5-30× faster and 20-3,350× smaller.

Snapdragon NPU Latency Comparison

| Model | Size | Latency | Note |
|---|---|---|---|
| ResNet-50 | 97.4 MB | 964 μs | Standard baseline |
| EfficientNet-B0 | 20.2 MB | 655 μs | "Efficient" baseline |
| MobileNetV2 | 13.3 MB | 294 μs | Mobile baseline |
| Fall Detection | 29 KB | 58 μs | 5-17× faster |
| TS Anomaly | 36 KB | 39 μs | 8-25× faster |
| Wake Word | 229 KB | 32 μs | 9-30× faster |

All on the same Snapdragon NPU via Qualcomm AI Hub. 57/57 baseline jobs completed, zero failures. Our models: 5-30× faster and 20-3,350× smaller than every standard baseline.
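The speedup and size ranges quoted in this comparison can be reproduced from the latencies and sizes listed above. A minimal sketch, assuming decimal units (1 MB = 1,000 KB) and taking 29 KB and 632 KB as the smallest and largest edge models from the Edge Models table. The helper names are illustrative.

```python
# Snapdragon NPU latencies in microseconds, as listed in the comparison.
BASELINES_US = {"ResNet-50": 964, "EfficientNet-B0": 655, "MobileNetV2": 294}
OURS_US = {"Fall Detection": 58, "TS Anomaly": 39, "Wake Word": 32}

# Model sizes in KB (assumption: decimal units, 1 MB = 1,000 KB).
BASELINE_KB = {"ResNet-50": 97_400, "EfficientNet-B0": 20_200, "MobileNetV2": 13_300}
SMALLEST_EDGE_KB, LARGEST_EDGE_KB = 29, 632


def speedup_range(latency_us):
    """Return (min, max) speedup of one edge model over the three baselines."""
    ratios = [base / latency_us for base in BASELINES_US.values()]
    return min(ratios), max(ratios)


def size_ratio_range():
    """Return (min, max) size ratio across all baseline/edge-model pairs."""
    smallest = min(BASELINE_KB.values()) / LARGEST_EDGE_KB
    largest = max(BASELINE_KB.values()) / SMALLEST_EDGE_KB
    return smallest, largest


if __name__ == "__main__":
    for name, latency in OURS_US.items():
        lo, hi = speedup_range(latency)
        print(f"{name}: {lo:.1f}x to {hi:.1f}x faster")
    lo, hi = size_ratio_range()
    print(f"Size: {lo:.0f}x to {hi:,.0f}x smaller")
```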

Thermal Impact

At maximum sustained throughput (up to roughly 10,000 inferences/sec, held for 60 seconds), our models produce zero measurable temperature increase on the host device.

Bearing Anomaly

603,654 inferences at 10,061/s

0.0°C

TS Anomaly

226,172 inferences at 3,770/s

0.0°C

60-second sustained burst on MacBook Pro M4 Max. No throttling detected.

FinSense Live

Real market headlines scored for sentiment. Demo — production model runs on-device at 42 μs.

Live headlines via Finnhub · Demo sentiment

Device Compatibility

913 benchmarks across 101 real devices. Phones, tablets, MCUs, automotive, XR. Search your device — see the exact latency. We publish every result, including the 67 failures.

Benchmarks: 913
Devices: 101
Pass Rate: 93.8%
Fastest (NPU): 39 μs
Fastest (MCU): <1 ms
Failures: 67

Wake Word

Voice · 229 KB
Passed: 97/101 devices
NPU best: 32 μs
Throughput: 31,250/s
MCU RAM: 322 KB
CR2032: 2 days

ECG Arrhythmia

Cardiac · 101 KB
Passed: 71/72 devices
NPU best: 36 μs
Throughput: 27,777/s

TS Anomaly

Time Series · 36 KB
Passed: 97/101 devices
NPU best: 39 μs
Throughput: 25,641/s
MCU RAM: 16 KB
CR2032: 2 days

NanoVision

Vision · 83 KB
Passed: 98/101 devices
NPU best: 41 μs
Throughput: 24,390/s
MCU RAM: 258 KB
CR2032: 2 days

FinSense EN

Finance · 632 KB
Passed: 71/101 devices
NPU best: 42 μs
Throughput: 23,809/s

Speech Emotion

Audio · 229 KB
Passed: 71/72 devices
NPU best: 45 μs
Throughput: 22,222/s

Bearing Anomaly

Industrial · 40 KB
Passed: 98/101 devices
NPU best: 46 μs
Throughput: 21,739/s
MCU RAM: 82 KB
CR2032: 2 days

HAR Activity

IMU · 29 KB
Passed: 98/101 devices
NPU best: 54 μs
Throughput: 18,518/s
MCU RAM: 259 KB
CR2032: 2 days

Audio Event

Audio · 230 KB
Passed: 57/72 devices
NPU best: 55 μs
Throughput: 18,181/s

Fall Detection

IMU · 29 KB
Passed: 98/101 devices
NPU best: 58 μs
Throughput: 17,241/s
MCU RAM: 403 KB
CR2032: 2 days
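The throughput figures on the cards above are consistent with taking the reciprocal of the best NPU latency and truncating to a whole number, that is, back-to-back inferences with no overhead between calls. The sketch below reproduces them under that assumption. It is a reading of the published numbers, not a documented formula.

```python
# Best NPU latency per model in microseconds, from the cards above.
NPU_BEST_US = {
    "Wake Word": 32,
    "ECG Arrhythmia": 36,
    "TS Anomaly": 39,
    "NanoVision": 41,
    "FinSense EN": 42,
    "Speech Emotion": 45,
    "Bearing Anomaly": 46,
    "HAR Activity": 54,
    "Audio Event": 55,
    "Fall Detection": 58,
}


def throughput_per_second(latency_us):
    """Inferences per second for back-to-back calls, truncated to an integer."""
    return 1_000_000 // latency_us


if __name__ == "__main__":
    for name, latency in NPU_BEST_US.items():
        print(f"{name}: {throughput_per_second(latency):,}/s")
```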

923 results
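The per-model pass counts in this section can be tallied to cross-check the summary figures. The sketch below only sums the numbers as published. Under that tally the cards add up to 923 runs (the results count shown here), while the 93.8% pass rate is reproduced when the passes are divided by the 913-benchmark headline figure. It does not settle which total is canonical.

```python
# (passed, attempted) per model, copied from the device compatibility cards.
PASS_COUNTS = {
    "Wake Word": (97, 101),
    "ECG Arrhythmia": (71, 72),
    "TS Anomaly": (97, 101),
    "NanoVision": (98, 101),
    "FinSense EN": (71, 101),
    "Speech Emotion": (71, 72),
    "Bearing Anomaly": (98, 101),
    "HAR Activity": (98, 101),
    "Audio Event": (57, 72),
    "Fall Detection": (98, 101),
}


def tally(counts):
    """Return (total_passed, total_attempted, total_failed)."""
    passed = sum(p for p, _ in counts.values())
    attempted = sum(a for _, a in counts.values())
    return passed, attempted, attempted - passed


if __name__ == "__main__":
    passed, attempted, failed = tally(PASS_COUNTS)
    print(f"passed={passed} attempted={attempted} failed={failed}")
    print(f"pass rate vs 913 benchmarks: {passed / 913:.1%}")
    print(f"pass rate vs {attempted} results: {passed / attempted:.1%}")
```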

| Device | Chip | Model | Latency | Status | Source |
|---|---|---|---|---|---|
| Snapdragon X2 Elite CRD | Snapdragon X2 Elite | Wake Word | 32 μs | PASS | QAI |
| Snapdragon X2 Elite CRD | Snapdragon X2 Elite | ECG Arrhythmia | 36 μs | PASS | QAI |
| Google Pixel 10 Pro XL | Snapdragon 8s Elite | TS Anomaly | 39 μs | PASS | QAI |
| Snapdragon X2 Elite CRD | Snapdragon X2 Elite | TS Anomaly | 40 μs | PASS | QAI |
| Snapdragon X2 Elite CRD | Snapdragon X2 Elite | NanoVision | 41 μs | PASS | QAI |
| Snapdragon X Plus 8-Core CRD | Snapdragon X Plus | Wake Word | 42 μs | PASS | QAI |
| Snapdragon X2 Elite CRD | Snapdragon X2 Elite | FinSense EN | 42 μs | PASS | QAI |
| Google Pixel 9 Pro | Tensor G4 | TS Anomaly | 43 μs | PASS | QAI |
| Samsung Galaxy S25+ | Snapdragon 8 Elite | TS Anomaly | 43 μs | PASS | QAI |
| Snapdragon X2 Elite CRD | Snapdragon X2 Elite | Speech Emotion | 45 μs | PASS | QAI |
| Snapdragon 8 Elite QRD | Snapdragon 8 Elite | Wake Word | 46 μs | PASS | QAI |
| Samsung Galaxy S25 (Family) | Unknown | NanoVision | 46 μs | PASS | QAI |
| Snapdragon 8 Elite Gen 5 QRD | Snapdragon 8 Elite | NanoVision | 46 μs | PASS | QAI |
| Google Pixel 10 | Snapdragon 8s Elite | Bearing Anomaly | 46 μs | PASS | QAI |
| Google Pixel 10 Pro XL | Snapdragon 8s Elite | Bearing Anomaly | 46 μs | PASS | QAI |
| Samsung Galaxy S25+ | Snapdragon 8 Elite | Wake Word | 47 μs | PASS | QAI |
| Samsung Galaxy S25+ | Snapdragon 8 Elite | NanoVision | 47 μs | PASS | QAI |
| Snapdragon X Plus 8-Core CRD | Snapdragon X Plus | NanoVision | 47 μs | PASS | QAI |
| Samsung Galaxy S25 | Unknown | Wake Word | 48 μs | PASS | QAI |
| Snapdragon 8 Elite QRD | Snapdragon 8 Elite | TS Anomaly | 48 μs | PASS | QAI |
| Samsung Galaxy S25 | Unknown | NanoVision | 48 μs | PASS | QAI |
| Samsung Galaxy S25+ | Snapdragon 8 Elite | FinSense EN | 48 μs | PASS | QAI |
| Samsung Galaxy S24 | Snapdragon 8 Gen 3 | Wake Word | 49 μs | PASS | QAI |
| Samsung Galaxy S25 (Family) | Unknown | ECG Arrhythmia | 49 μs | PASS | QAI |
| Google Pixel 6 (Family) | Tensor G1 | TS Anomaly | 49 μs | PASS | QAI |
| Google Pixel 7 (Family) | Tensor G2 | TS Anomaly | 49 μs | PASS | QAI |
| Snapdragon X Plus 8-Core CRD | Snapdragon X Plus | TS Anomaly | 49 μs | PASS | QAI |
| Samsung Galaxy S25 | Unknown | FinSense EN | 49 μs | PASS | QAI |
| Samsung Galaxy S25 (Family) | Unknown | Wake Word | 50 μs | PASS | QAI |
| Snapdragon 8 Elite Gen 5 QRD | Snapdragon 8 Elite | Wake Word | 50 μs | PASS | QAI |
| Samsung Galaxy S25 | Unknown | TS Anomaly | 50 μs | PASS | QAI |
| Samsung Galaxy S25 Ultra | Snapdragon 8 Elite | NanoVision | 50 μs | PASS | QAI |
| Samsung Galaxy S25 Ultra | Snapdragon 8 Elite | Wake Word | 51 μs | PASS | QAI |
| Samsung Galaxy S25 Ultra | Snapdragon 8 Elite | ECG Arrhythmia | 51 μs | PASS | QAI |
| Google Pixel 10 | Snapdragon 8s Elite | TS Anomaly | 51 μs | PASS | QAI |
| Snapdragon 7 Gen 4 QRD | Snapdragon 7 Gen 4 | Bearing Anomaly | 51 μs | PASS | QAI |
| Samsung Galaxy S25+ | Snapdragon 8 Elite | ECG Arrhythmia | 52 μs | PASS | QAI |
| Snapdragon 8 Elite Gen 5 QRD | Snapdragon 8 Elite | ECG Arrhythmia | 52 μs | PASS | QAI |
| Google Pixel 6 | Tensor G1 | TS Anomaly | 52 μs | PASS | QAI |
| Samsung Galaxy S25 Ultra | Snapdragon 8 Elite | TS Anomaly | 52 μs | PASS | QAI |
| Snapdragon X Plus 8-Core CRD | Snapdragon X Plus | FinSense EN | 52 μs | PASS | QAI |
| Samsung Galaxy S24 (Family) | Snapdragon 8 Gen 3 | Wake Word | 53 μs | PASS | QAI |
| Samsung Galaxy S25 | Unknown | ECG Arrhythmia | 53 μs | PASS | QAI |
| Google Pixel 7 | Tensor G2 | TS Anomaly | 53 μs | PASS | QAI |
| Samsung Galaxy S25 (Family) | Unknown | TS Anomaly | 53 μs | PASS | QAI |
| Snapdragon 8 Elite QRD | Snapdragon 8 Elite | NanoVision | 53 μs | PASS | QAI |
| Samsung Galaxy S25 (Family) | Unknown | FinSense EN | 53 μs | PASS | QAI |
| Samsung Galaxy S25 (Family) | Unknown | Speech Emotion | 53 μs | PASS | QAI |
| Samsung Galaxy S25+ | Snapdragon 8 Elite | Speech Emotion | 53 μs | PASS | QAI |
| Google Pixel 8 | Tensor G3 | Bearing Anomaly | 53 μs | PASS | QAI |
| Snapdragon 8 Elite Gen 5 QRD | Snapdragon 8 Elite | TS Anomaly | 54 μs | PASS | QAI |
| Snapdragon 8 Elite Gen 5 QRD | Snapdragon 8 Elite | Speech Emotion | 54 μs | PASS | QAI |
| Snapdragon X2 Elite CRD | Snapdragon X2 Elite | HAR Activity | 54 μs | PASS | QAI |
| Samsung Galaxy S24+ | Snapdragon 8 Gen 3 | Wake Word | 55 μs | PASS | QAI |
| Google Pixel 10 Pro XL | Snapdragon 8s Elite | ECG Arrhythmia | 55 μs | PASS | QAI |
| Snapdragon 8 Elite Gen 5 QRD | Snapdragon 8 Elite | FinSense EN | 55 μs | PASS | QAI |
| Samsung Galaxy S25 Ultra | Snapdragon 8 Elite | Speech Emotion | 55 μs | PASS | QAI |
| Samsung Galaxy S25 | Unknown | Audio Event | 55 μs | PASS | QAI |
| Google Pixel 10 | Snapdragon 8s Elite | ECG Arrhythmia | 56 μs | PASS | QAI |
| Google Pixel 7 Pro | Tensor G2 | ECG Arrhythmia | 56 μs | PASS | QAI |
| Snapdragon X Plus 8-Core CRD | Snapdragon X Plus | ECG Arrhythmia | 56 μs | PASS | QAI |
| Samsung Galaxy S25 Ultra | Snapdragon 8 Elite | FinSense EN | 56 μs | PASS | QAI |
| Samsung Galaxy S25 | Unknown | Speech Emotion | 56 μs | PASS | QAI |
| Snapdragon X Plus 8-Core CRD | Snapdragon X Plus | Speech Emotion | 56 μs | PASS | QAI |
| Samsung Galaxy S25 (Family) | Unknown | Audio Event | 56 μs | PASS | QAI |
| Samsung Galaxy S24 Ultra | Snapdragon 8 Gen 3 | Wake Word | 57 μs | PASS | QAI |
| Snapdragon 8 Elite QRD | Snapdragon 8 Elite | ECG Arrhythmia | 57 μs | PASS | QAI |
| Google Pixel 7 Pro | Tensor G2 | TS Anomaly | 57 μs | PASS | QAI |
| Samsung Galaxy S24 | Snapdragon 8 Gen 3 | TS Anomaly | 57 μs | PASS | QAI |
| Samsung Galaxy S24 Ultra | Snapdragon 8 Gen 3 | TS Anomaly | 57 μs | PASS | QAI |
| Google Pixel 8 (Family) | Tensor G3 | Bearing Anomaly | 57 μs | PASS | QAI |
| Samsung Galaxy S24 (Family) | Snapdragon 8 Gen 3 | ECG Arrhythmia | 58 μs | PASS | QAI |
| Samsung Galaxy S24 Ultra | Snapdragon 8 Gen 3 | ECG Arrhythmia | 58 μs | PASS | QAI |
| Samsung Galaxy S24 Ultra | Snapdragon 8 Gen 3 | NanoVision | 58 μs | PASS | QAI |
| Snapdragon X2 Elite CRD | Snapdragon X2 Elite | Fall Detection | 58 μs | PASS | QAI |
| Samsung Galaxy S24 (Family) | Snapdragon 8 Gen 3 | TS Anomaly | 59 μs | PASS | QAI |
| Samsung Galaxy S24+ | Snapdragon 8 Gen 3 | TS Anomaly | 59 μs | PASS | QAI |
| Samsung Galaxy S24 | Snapdragon 8 Gen 3 | NanoVision | 59 μs | PASS | QAI |
| Samsung Galaxy S24 (Family) | Snapdragon 8 Gen 3 | NanoVision | 59 μs | PASS | QAI |
| Google Pixel 8 Pro | Tensor G3 | Bearing Anomaly | 59 μs | PASS | QAI |
| Samsung Galaxy S24+ | Snapdragon 8 Gen 3 | Audio Event | 59 μs | PASS | QAI |
| Google Pixel 7 (Family) | Tensor G2 | ECG Arrhythmia | 60 μs | PASS | QAI |
| Samsung Galaxy S24+ | Snapdragon 8 Gen 3 | ECG Arrhythmia | 60 μs | PASS | QAI |
| Google Pixel 9 | Tensor G4 | TS Anomaly | 60 μs | PASS | QAI |
| Samsung Galaxy S24+ | Snapdragon 8 Gen 3 | Speech Emotion | 60 μs | PASS | QAI |
| Samsung Galaxy S25 (Family) | Unknown | HAR Activity | 60 μs | PASS | QAI |
| Samsung Galaxy S25+ | Snapdragon 8 Elite | HAR Activity | 60 μs | PASS | QAI |
| Samsung Galaxy S24 | Snapdragon 8 Gen 3 | Audio Event | 60 μs | PASS | QAI |
| Samsung Galaxy S24 (Family) | Snapdragon 8 Gen 3 | Speech Emotion | 61 μs | PASS | QAI |
| Samsung Galaxy S25 Ultra | Snapdragon 8 Elite | HAR Activity | 61 μs | PASS | QAI |
| Samsung Galaxy S23 | Snapdragon 8 Gen 2 | NanoVision | 62 μs | PASS | QAI |
| Samsung Galaxy S24+ | Snapdragon 8 Gen 3 | NanoVision | 62 μs | PASS | QAI |
| Samsung Galaxy S24+ | Snapdragon 8 Gen 3 | FinSense EN | 62 μs | PASS | QAI |
| Samsung Galaxy S24 | Snapdragon 8 Gen 3 | Speech Emotion | 62 μs | PASS | QAI |
| Snapdragon 8 Elite QRD | Snapdragon 8 Elite | Speech Emotion | 62 μs | PASS | QAI |
| Google Pixel 6 (Family) | Tensor G1 | Bearing Anomaly | 62 μs | PASS | QAI |
| Samsung Galaxy S24 (Family) | Snapdragon 8 Gen 3 | FinSense EN | 63 μs | PASS | QAI |
| Snapdragon 8 Elite QRD | Snapdragon 8 Elite | FinSense EN | 63 μs | PASS | QAI |
| Samsung Galaxy S21 | Snapdragon 888 | Bearing Anomaly | 63 μs | PASS | QAI |
| Samsung Galaxy S21 Ultra | Snapdragon 888 | Bearing Anomaly | 63 μs | PASS | QAI |
Showing 100 of 923 results.

Methodology

Qualcomm AI Hub

ONNX models profiled on real devices via cloud-hosted NPU profiling across 72 phones, tablets, XR headsets, and automotive platforms. Median latency reported.

Edge Impulse

TFLite models deployed and profiled on 29 MCU/MPU boards from Cortex-M4F to Jetson Orin. Includes BrainChip neuromorphic, NXP NPU, and Synaptics edge accelerators.

Apple CoreML / ANE

ONNX → PyTorch trace → CoreML via coremltools. Benchmarked natively in Swift on M4 Max Neural Engine. 3-second sustained burst.

Google AI Edge LiteRT

ONNX → TFLite via onnx2tf. Benchmarked with XNNPACK delegate on M4 Max CPU. 3-second sustained burst.

Power Measurement

Real power via Apple IOReport (no sudo). CPU+GPU channels sampled during inference. Energy = (power × time) / count. Idle baseline subtracted.
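The energy formula above, with the idle baseline subtracted, can be written out as follows. This is a minimal sketch of the arithmetic only. It does not read IOReport, and the power values in the example are hypothetical, chosen so the result lands near the 1.01 μJ reported for TS Anomaly on LiteRT at 134,775 inf/s over a 3-second burst.

```python
def energy_per_inference_uj(active_power_w, idle_power_w, duration_s, count):
    """Energy per inference in microjoules.

    Energy = ((active power - idle power) x time) / count.
    """
    if count <= 0:
        raise ValueError("count must be positive")
    net_power_w = active_power_w - idle_power_w
    return net_power_w * duration_s / count * 1e6


if __name__ == "__main__":
    # Hypothetical power readings; only the rate and burst length come from the text.
    count = 134_775 * 3  # inferences completed in a 3-second burst
    uj = energy_per_inference_uj(
        active_power_w=0.636,  # hypothetical average power during the burst
        idle_power_w=0.500,    # hypothetical idle baseline
        duration_s=3.0,
        count=count,
    )
    print(f"{uj:.2f} uJ per inference")
```

Subtracting the idle baseline attributes only the incremental power to inference, which is why the per-inference figure can be far smaller than total package power would suggest.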

Failures

We publish every failure. Galaxy Tab A8 (2021) fails on QAI (device farm issue). Memryx MX3 and Qualcomm RB3 Gen 2 DK fail on Edge Impulse (platform bugs, not model issues).