Here is a list of possible sources of spoken audio files that might be used for the creation of GPL acoustic models.

 * Audio Source list:
   * [http://www.gutenberg.org/audio/ Gutenberg audio project]
   * [http://en.wikipedia.org/wiki/Category:Spoken_articles Wikipedia Spoken Articles]
   * CMU:
     * [http://www.speech.cs.cmu.edu/databases/ CMU Audio Database]:
       * [http://www.speech.cs.cmu.edu/databases/micarray Microphone array database]
       * [http://www.speech.cs.cmu.edu/databases/an4 Census (AN4) database]
       * [http://www.speech.cs.cmu.edu/letsgo/letsgodata.html Let's Go Speech Dialog Data] (license for research only)
       * [http://www.festvox.org/cmu_sin CMU_SIN Speech in Noise Database]
       * [http://www.speech.cs.cmu.edu/databases/pda PDA database]
       * [http://www.speech.cs.cmu.edu/databases/rm1 Resource Management (RM1) database] (no wav - Sphinx mfc only)
       * [http://www.festvox.org/dbs/dbs_com.html CMU Communicator KAL limited domain]
   * Festvox:
     * [http://www.festvox.org/dbs/ Festvox databases]
       * [http://www.festvox.org/cmu_arctic/ CMU ARCTIC] 4 single-speaker, phonetically balanced databases (no restrictions)
       * [http://www.festvox.org/cmu_faf CMU_FAF (Facts and Fables) database]
       * [http://www.festvox.org/cmu_sin CMU_SIN database] Speech in Noise
       * [http://www.speech.cs.cmu.edu/Tongues CMU Chaplain] (for research only)
       * [http://www.festvox.org/dbs/dbs_kdt.html CSTR US KED Timit]
       * Diphone Databases
         * [http://www.festvox.org/dbs/dbs_kal.html CMU US KAL diphone]
         * [http://www.festvox.org/dbs/dbs_rab.html CSTR UK RAB diphone]
   * [http://www.cavs.msstate.edu/hse/ies/projects/switchboard/releases/ ISIP Switchboard Audio Database]
   * [http://www.americanrhetoric.com/ American Rhetoric]
   * [http://www.lsa.umich.edu/eli/micase/Audio/index.htm MICASE]
   * [http://www.talkbank.org/data/ TalkBank] [http://www.talkbank.org/media/ TalkBank Audio Files] (GNU license)
     * [http://www.talkbank.org/media/SWB/ Switchboard database]
   * Hansard Canada
     * [http://www.parl.gc.ca/common/Chamber_House_Debates.asp?Language=E&Parl=39&Ses=1 House of Commons]
     * [http://www.parl.gc.ca/common/Chamber_Senate_Debates.asp?Language=E&Parl=39&Ses=1 Senate]
   * [http://micase.umdl.umich.edu/m/micase/ MICASE Michigan Corpus of Academic Spoken English]
   * [http://alt-usage-english.org/audio_archive.shtml AUE - alt-usage-english]
 * Links
   * [http://devoted.to/corpora Bookmarks for Corpus-based Linguists]
   * [http://www.inf.ed.ac.uk/resources/corpora/ Corpora and other Language and Speech Data under DICE]

Other possible sources, but with licensing issues:

 * [http://buckeyecorpus.osu.edu/ Buckeye Corpus] (also see Ticket #22)