Music Analysis
Technology
TuneLab computes tempo, key, mood, and structure directly from the audio waveform — not cached metadata from a decade ago. Custom deep learning models analyse everything from lo-fi recordings to complex polyrhythms and heavily compressed masters. Every model runs identically on your device and via API.
Float-Precision Tempo. Not a Rounded Integer.
TuneLab feeds audio through a multi-resolution spectral analysis stage, then into a recurrent neural architecture with long-term temporal context that outputs beat and downbeat probabilities across the full track.
A probabilistic sequence model then finds the globally optimal beat sequence — no greedy peak-picking, no dropped beats. The result: float-precision BPM (±0.1), half/double-tempo alternatives, and per-beat timestamps. Handles syncopation, polyrhythms, tempo changes, and even tracks with no clear downbeat.
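The decoding idea — score every admissible beat sequence globally instead of greedily picking peaks — can be illustrated with a dynamic-programming beat tracker in the spirit of Ellis (2007). This is a minimal sketch, not TuneLab's proprietary sequence model; the `tightness` gain and the half-to-double-period search window are illustrative choices:

```python
import numpy as np

def dp_beat_track(activation, fps=100.0, bpm=120.0, tightness=100.0):
    """Dynamic-programming beat decoding: pick the globally optimal
    beat sequence from a per-frame beat-probability curve, rather than
    greedily peak-picking (which drops beats under syncopation)."""
    period = int(round(fps * 60.0 / bpm))          # expected frames per beat
    n = len(activation)
    score = activation.astype(float).copy()
    backlink = np.full(n, -1, dtype=int)
    # admissible previous-beat offsets: half to double the expected period
    prange = np.arange(-2 * period, -period // 2)
    # log-squared penalty for deviating from the expected beat period
    txcost = -tightness * np.log(-prange / period) ** 2
    for t in range(-prange[0], n):
        cand = score[t + prange] + txcost
        best = int(np.argmax(cand))
        score[t] += cand[best]
        backlink[t] = t + prange[best]
    # backtrace from the best-scoring frame in the final period
    beats = [int(np.argmax(score[-period:]) + n - period)]
    while backlink[beats[-1]] >= 0:
        beats.append(int(backlink[beats[-1]]))
    return np.array(beats[::-1]) / fps             # beat times in seconds
```

On a synthetic activation curve with a spike every 0.5 s, the backtrace recovers a steady 120 BPM grid; on real network output, the same global optimisation is what keeps the grid locked through weak or missing onsets.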
Pitch-Class Spectral Analysis. 82.1% on GiantSteps.
A custom pitch-class spectrogram feeds a convolutional network trained for 24-class key classification (12 major + 12 minor). No external signal processing libraries — the entire spectral frontend is purpose-built for key detection.
Tested against the GiantSteps-MTG benchmark (the standard dataset for key detection evaluation), TuneLab achieves 82.1% accuracy — significantly outperforming classical template-matching approaches. Camelot wheel mapping is computed automatically from the 24-class output.
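The Camelot mapping is a fixed musical convention, so it can be shown concretely. The sketch below assumes a 0–11 major / 12–23 minor class layout (C = 0); that indexing is an illustrative guess, not TuneLab's documented output format:

```python
PITCH_NAMES = ["C", "C#", "D", "D#", "E", "F",
               "F#", "G", "G#", "A", "A#", "B"]

def camelot(class_id):
    """Map a 24-class key id (0-11 major, 12-23 minor, C=0) to its
    Camelot wheel code plus a human-readable key name."""
    pitch, minor = class_id % 12, class_id >= 12
    rel = (pitch + 3) % 12 if minor else pitch   # relative-major pitch class
    fifths = (rel * 7) % 12                      # position on circle of fifths
    number = (fifths + 7) % 12 + 1               # anchors C major at 8B
    code = f"{number}{'A' if minor else 'B'}"
    return code, PITCH_NAMES[pitch] + (" minor" if minor else " major")
```

Relative keys land on the same wheel number with different letters (C major is 8B, A minor is 8A), which is exactly what makes the code useful for harmonic mixing.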
Six Features. Computed from Audio. Not Metadata.
A transformer-based audio embedding model, pre-trained on 100K+ annotated tracks, produces a rich spectral representation from a 30-second clip. Specialised regression heads fine-tuned on curated reference data then extract six continuous features.
Energy, danceability, happiness, acousticness, instrumentalness, speechiness — each genuinely computed from the audio waveform, not looked up in a database. Holdout metrics are published because TuneLab's numbers hold up to scrutiny.
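The head architecture can be sketched as one small regression head per feature sitting on a frozen embedding — illustrative shapes and names only, not TuneLab's trained weights:

```python
import numpy as np

FEATURES = ["energy", "danceability", "happiness",
            "acousticness", "instrumentalness", "speechiness"]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def feature_heads(embedding, weights, biases):
    """Apply one linear regression head per feature to a frozen audio
    embedding, squashing each output into the [0, 1] range."""
    return {name: float(sigmoid(embedding @ w + b))
            for name, w, b in zip(FEATURES, weights, biases)}
```

Because the embedding is shared and each head is tiny, all six features come from a single forward pass over the 30-second clip.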
TuneLab Knows Where the Drop Is.
A self-similarity analysis across the full track detects structural repetition, then a novelty-based boundary detector locates transitions between sections. Hierarchical clustering labels each segment: intro, verse, chorus, drop, breakdown, outro.
Every section comes with start/end timestamps and a confidence score — the kind of structural data that rhythm games, VJ tools, and DJ software need but lost when Spotify deprecated their /audio-analysis endpoint.
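The boundary-detection step follows a well-known pattern (Foote-style novelty over a self-similarity matrix); here is a minimal numpy sketch under that assumption, with an illustrative kernel size and threshold:

```python
import numpy as np

def novelty_boundaries(features, kernel_size=16, threshold=0.3):
    """Correlate a checkerboard kernel along the diagonal of a cosine
    self-similarity matrix; peaks in the resulting novelty curve mark
    transitions between repeated sections."""
    # cosine self-similarity matrix over per-frame feature vectors
    F = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-9)
    ssm = F @ F.T
    # checkerboard kernel: +1 within-section blocks, -1 across-section
    half = kernel_size // 2
    k = np.kron(np.array([[1, -1], [-1, 1]]), np.ones((half, half)))
    n = len(F)
    novelty = np.zeros(n)
    for t in range(half, n - half):
        novelty[t] = np.sum(ssm[t - half:t + half, t - half:t + half] * k)
    novelty /= max(novelty.max(), 1e-9)
    # simple local-maximum picking above a threshold
    return [t for t in range(1, n - 1)
            if novelty[t] > threshold
            and novelty[t] >= novelty[t - 1]
            and novelty[t] > novelty[t + 1]]
```

Each picked frame index converts to a start/end timestamp via the frame rate, and the normalised novelty value at the peak is a natural confidence score.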
Two Models. Four Stems. GPU-Accelerated.
An ensemble of state-of-the-art transformer and hybrid architectures powers TuneLab's separation pipeline. Two-stem vocal isolation produces studio-quality splits in roughly 30 seconds; full 4-stem separation (vocals, drums, bass, other) completes in approximately 45 seconds.
Both run on dedicated GPU containers with automatic scaling, custom-optimised for high-throughput inference. Results are stored as presigned URLs with 24-hour expiry. No audio is retained beyond that window.
Active Inference vs Cached Metadata
| Capability | Spotify Audio Features (deprecated) | TuneLab |
|---|---|---|
| Method | Cached lookup from ~2014 models | Active inference from source waveform |
| Model era | Echo Nest (~2014), frozen | 2022–2024, continuously updated |
| Tempo precision | Rounded integer (116) | Float-precision (116.01), ±0.1 BPM |
| Key accuracy | Undisclosed | 82.1% on GiantSteps-MTG benchmark |
| Beat grid | Deprecated with /audio-analysis | Float-precision timestamps, per-beat |
| Song structure | Deprecated with /audio-analysis | Section labels + timestamps + confidence |
| Status | 403 Forbidden (Nov 2024) | Production API with published changelog |
9.3M Tracks Resolved. Growing with Every Query.
TuneLab maintains a continuously growing catalog of acoustic data across three tiers — each serving a different layer of the intelligence stack. Cache misses trigger real-time DSP, and the results are cached permanently.
Live Beatmatching. Two Streams. 10ms Tick.
A digital phase-locked loop continuously tracks tempo and phase across two live audio streams, correcting pitch in real time to keep them beatmatched. The control loop runs at a 10ms tick rate with ±2% pitch bend range — fast enough to lock onto tempo drift within seconds.
This is the engine behind the live radio demo — two independent internet radio streams, beatmatched automatically, mixed live in the browser. No pre-analysis, no metadata. Just the audio signal.
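The control loop can be sketched as a PI-style phase-locked loop: each 10 ms tick measures the beat-phase error between the two streams and nudges the follower's playback rate, clamped to the stated ±2% bend range. The gains below are illustrative, not TuneLab's tuning:

```python
class BeatPLL:
    """Toy phase-locked loop: nudge a follower stream's playback rate so
    its beat phase converges on a reference stream's. kp/ki are
    illustrative gains; the clamp mirrors the +/-2% pitch-bend range."""
    def __init__(self, kp=0.8, ki=0.2, max_bend=0.02, dt=0.01):
        self.kp, self.ki = kp, ki
        self.max_bend, self.dt = max_bend, dt
        self.integral = 0.0

    def tick(self, phase_error):
        """phase_error in beats, wrapped to [-0.5, 0.5).
        Returns the playback-rate multiplier for the next 10 ms."""
        self.integral += phase_error * self.dt
        bend = self.kp * phase_error + self.ki * self.integral
        bend = max(-self.max_bend, min(self.max_bend, bend))
        return 1.0 + bend
```

Simulating a follower that naturally runs 1% fast, the integral term settles at exactly the bend needed to cancel the tempo offset, and the phase error decays toward zero within a few seconds — the "locks onto tempo drift within seconds" behaviour described above.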
WASM-Accelerated Pipeline. Real Transition Forensics.
TuneLab's mix analyser runs a multi-stage WASM-accelerated pipeline on actual audio: multiple independent detection methods work in concert through a voting system to locate transitions with sub-second precision. Phase-locked beat analysis then measures drift in milliseconds at each one.
Multiple genre profiles — each with calibrated tolerances and scoring weights — produce a composite assessment: technical precision, harmonic compatibility, energy flow, and EQ quality, graded A to F with a fully transparent breakdown.
Every score is derived from the waveform itself — not from metadata lookups or pre-computed averages. Every transition is timestamped. Every grade is decomposed into sub-scores with transparent weighting. The full methodology is published, and results are reproducible: same audio in, same analysis out.
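The voting stage can be sketched as timestamp clustering across independent detectors — a simplified stand-in for the production pipeline, with illustrative tolerance and quorum values:

```python
def vote_transitions(detections, tolerance=0.5, min_votes=2):
    """Merge transition candidates from independent detectors:
    timestamps within `tolerance` seconds of each other form a cluster,
    and a cluster survives only if at least `min_votes` distinct
    detectors contributed to it."""
    def flush(cluster, merged):
        if len({d for _, d in cluster}) >= min_votes:
            merged.append(sum(t for t, _ in cluster) / len(cluster))

    events = sorted((t, d) for d, ts in enumerate(detections) for t in ts)
    merged, cluster = [], []
    for t, d in events:
        if cluster and t - cluster[-1][0] > tolerance:
            flush(cluster, merged)
            cluster = []
        cluster.append((t, d))
    if cluster:
        flush(cluster, merged)
    return merged
```

A candidate seen by only one detector (a false positive from, say, an energy-dip heuristic) is discarded, while agreement between methods pins the transition down to a sub-second average.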
Purpose-Built for Problems Others Ignore.
Four tools that solve problems no one else addresses — each built on custom spectral analysis, WASM-accelerated pipelines, and proprietary detection algorithms. All processing runs on your device.
32 Audio Tools. Zero Uploads. No Limits.
Every tool runs directly on your device — custom WASM DSP pipelines compiled from AssemblyScript, ONNX Runtime for neural inference, and the Web Audio API for real-time signal routing. SharedArrayBuffer enables true multi-threaded processing with near-native throughput.
No file ever leaves your machine. No account required. No processing caps. From BPM detection and key analysis to loudness metering and chord recognition — the full pipeline executes locally, with results computed from your actual audio waveform.
Real Analysis. Not Cached Metadata.
A REST API built on the same DSP and neural inference that powers the tools. Synchronous responses for lookups and analysis — no polling loops, no per-request surprises. Cache hits return in under 100ms; cache misses trigger real-time compute and respond in 1–5 seconds.
Universal track resolution maps any platform ID to all known cross-references plus verified acoustic features — one request, every major streaming service resolved. Beat grids with float-precision timestamps, audio embeddings for custom ML, song structure with section labels, and DJ mix intelligence combining co-occurrence data from 652K+ analysed sets.
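A resolution call might look like the sketch below. The base URL, path, and parameter names are hypothetical placeholders for illustration — consult the actual API reference for the documented request format:

```python
from urllib.parse import urlencode

def resolve_url(base, platform, track_id, features):
    """Build a track-resolution request URL. The `/v1/resolve` path, the
    `platform:id` convention, and the parameter names are illustrative
    assumptions, not TuneLab's documented API."""
    query = urlencode({"id": f"{platform}:{track_id}",
                       "features": ",".join(features)})
    return f"{base}/v1/resolve?{query}"
```

Because responses are synchronous, a caller needs only one generous timeout to cover both paths: sub-100ms cache hits and 1–5s real-time compute on a miss.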
The Engine Powers Working Products.
AI DJ Radio — running since 2020. 24/7/365 continuous mix of underground electronic music, with automated track selection driven by energy curves, harmonic compatibility, and genre coherence. No human intervention. No pre-programmed playlists. The engine described on this page was built to make this work.
Coming soon — a real-time audio platform. Powered by TuneLab.
Engineering Principles
Private by Design
- Most tools process entirely on your device — audio never leaves your machine
- Cloud Assist uploads are deleted immediately after processing — no retention, no listening
- Accounts are only required for cloud processing — everything else works without sign-up
Deterministic & Reproducible
- Every score is decomposed into sub-scores with transparent weighting — no black-box numbers
- Same audio in, same analysis out, every time — results are deterministic
- Holdout metrics and benchmark results disclosed on this page — TuneLab's numbers hold up to scrutiny
Build With Real Music Analysis
32 free audio analysis tools running on your device. A production API with 9.3M resolved tracks. Float-precision tempo, verified key detection, and features computed from the actual waveform.