Version 25 (modified by anonymous, 15 years ago)
The following are possible sources of spoken audio files that might be used to create GPL acoustic models.
- Audio Source list:
- Gutenberg audio project
- Wikipedia Spoken Articles
- CMU:
- CMU Audio Database:
- Microphone array database
- Census (AN4) database
- Let's Go Speech Dialog Data (license for research only)
- CMU_SIN Speech in Noise Database
- PDA database
- Resource Management (RM1) database (no wav - Sphinx mfc only)
- CMU Communicator KAL limited domain
- CMU Communicator:
- Festvox:
- Festvox databases
- CMU ARCTIC (no restrictions)
- CMU_FAF (Facts and Fables) database
- CMU_SIN (Speech in Noise) database
- CMU Chaplain (for research only)
- CSTR US KED Timit
- Diphone Databases
- ldom Databases (Limited Domain)
- ISIP Switchboard
- American Rhetoric
- MICASE
- TalkBank Audio Files (GNU license)
- Hansard Canada
- MICASE (Michigan Corpus of Academic Spoken English)
- AUE - alt-usage-english
Other possible sources, but with licensing issues:
- Buckeye Corpus
- CSLU Speech Synthesis Research Group:
- OGIresLPC 2.1.0 voices (voice data not released yet - only for research/personal use ...)
(Also see Ticket #22)