The Atlantic's Alex Reisner published a searchable public database of four AI music training datasets.

Status: sourced. Sources under review.

Two datasets contain 12 million and 9 million tracks respectively; two others contain over 100,000 songs each.

Status: sourced. Sources under review.

Google and Stability AI have confirmed using the datasets in published research papers.

Status: sourced. Sources under review.

Whether music rights holders will pursue legal action based on the database findings.

Status: developing. Sources under review.

How Google and Stability AI will respond to the increased transparency around their training data sources.

Status: developing. Sources under review.

The exact licensing status of all tracks across the four datasets.

Status: sketchy. Sources under review.

Whether specific artists' work was included without authorization.

Status: sketchy. Sources under review.

The Atlantic unveils searchable database of 21M+ songs used to train AI models

01 What happened

The story, straight

The Atlantic's Alex Reisner has created a fully searchable public database cataloging four datasets of music used to train AI models. Two of the sets are massive — 12 million and 9 million tracks respectively — while the other two each contain over 100,000 songs. Google and Stability AI have both confirmed use of the datasets in published research papers, and the sets have been downloaded thousands of times. Some of the sourced music, including tracks from the Free Music Archive, is licensed for personal streaming but not for AI training use.

atlantic reporter alex reisner just built a searchable database of four massive music datasets used to train AI models. the two biggest hold 12 million and 9 million tracks each; the smaller ones still break 100K songs apiece. google and Stability AI have both confirmed using them in research papers, and the datasets have been downloaded thousands of times. some of the music — like stuff from the Free Music Archive — is only licensed for personal streaming, not for AI training.

02 Spread timeline

Where it actually started

Jun 20, 2026: Origin

The Verge reports on The Atlantic's Alex Reisner publishing a searchable public database of four AI music training datasets totaling over 21 million tracks.

the verge reports the atlantic's reisner went public with a searchable database of four AI music training datasets, over 21M tracks


03 Source receipts

Every claim, linked

The Verge: Coverage of The Atlantic's searchable AI music training database, detailing dataset sizes, confirmed use by Google and Stability AI, and licensing issues with sources like Free Music Archive.
primary · rss · receipt

04 What's solid, what isn't

What's solid and what isn't

Confirmed

The Atlantic's Alex Reisner published a searchable public database of four AI music training datasets.
Two datasets contain 12 million and 9 million tracks respectively; two others contain over 100,000 songs each.
Google and Stability AI have confirmed using the datasets in published research papers.
The datasets have been downloaded thousands of times.

Disputed

The exact licensing status of all tracks across the four datasets.
Whether specific artists' work was included without authorization.

Developing

Whether music rights holders will pursue legal action based on the database findings.
How Google and Stability AI will respond to the increased transparency around their training data sources.

05 Why it matters

The editorial take

This is the first time anyone has made AI music training data this transparent and searchable. It gives artists and rights holders a concrete tool to check whether their work was used without permission, and it puts pressure on companies like Google and Stability AI to address how they sourced their training data. The database follows Reisner's earlier work exposing training datasets for image and text AI models.

artists have been screaming about their music getting scraped for AI training with zero transparency. this database is the first real receipt — searchable, public, 21 million tracks worth. now anyone can check if their work ended up in a training set google or stability used. the pressure just got concrete.
