Dialog Managers
- dragonfly - Python speech recognition framework for DNS or WSR (see the sketch after this list)
- TRINDIKIT - dialogue modelling architecture
- RavenClaw - open-source dialog system toolkit
- Ariadne - dialog manager
- Midiki - dialogue toolkit
- Communicator projects
  - Galaxy - dialog systems infrastructure
  - CMU Communicator
- Jaspis - framework for adaptive speech applications
- WAMI - Web-Accessible Multimodal Applications
- DIPPER - Dialogue Prototyping Equipment & Resources
- PED - A Planner for Efficient Dialogues
- open-allure-ds
- OWLSpeak - an ontology-based spoken dialogue management system
- Howe - a spoken dialogue system project designed for browsing instructions
- Regulus - a Prolog-based toolkit for building spoken dialogue systems
- PerlBox (Sphinx)
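
To make the first entry concrete, here is a minimal command-grammar sketch for dragonfly. It assumes dragonfly is installed and a supported engine (DNS or WSR) is running; the rule name and the two commands are purely illustrative, not taken from any toolkit above.

    # Minimal dragonfly command-and-control grammar (illustrative sketch).
    from dragonfly import Grammar, MappingRule, Key, Text

    class ExampleRule(MappingRule):
        # Spoken phrase -> action performed when the phrase is recognized.
        mapping = {
            "save file": Key("c-s"),       # press Ctrl+S
            "say hello": Text("hello"),    # type the literal text "hello"
        }

    grammar = Grammar("example commands")  # container for one or more rules
    grammar.add_rule(ExampleRule())
    grammar.load()                         # register with the running engine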
Robotics
- OpenHRI - open-source software components for Human-Robot Interaction
- Voce - a cross-platform speech synthesis and recognition library
Dialog Manager - information
- Speech Interface Guidelines - overview of speech interface design principles as applied to the range of applications that have been developed at Carnegie Mellon
State Chart XML (SCXML)
- Synergy SCXML Web Laboratory - http://www.ling.gu.se/~lager/Labs/SCXML-Lab/
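
To give a flavor of how SCXML expresses dialogue control, here is a hedged minimal sketch of a two-state confirmation chart. The element names follow the W3C SCXML specification; the state and event names are invented for illustration and are not taken from the lab above.

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- A prompt/confirm loop: stay in "ask" until a "user.yes" event arrives. -->
    <scxml xmlns="http://www.w3.org/2005/07/scxml" version="1.0" initial="ask">
      <state id="ask">
        <transition event="user.yes" target="done"/>
        <transition event="user.no"  target="ask"/>
      </state>
      <final id="done"/>
    </scxml>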
From a post by Will Walker on the kde-accessibility mailing list:
List: kde-accessibility
Subject: Re: [Kde-accessibility] Fwd: Re: paraphlegic KDE support
From: Willie Walker <William.Walker@Sun.COM>
Date: 2006-02-23 16:57:34
Message-ID: 6072A454-C87C-4612-AB8E-648FB3CA746B@sun.com

Hi All:

I just want to jump in on the speech recognition stuff. Having participated in several standards efforts (e.g., JSAPI, VoiceXML/SSML/SGML) in this area, and having developed a number of speech recognition applications, and having seen the trials and tribulations of inconsistent SAPI implementations, and having led the Sphinx-4 effort, I'd like to offer my unsolicited opinion :-).

In my opinion, there are enough differences in the various speech recognition systems and their APIs that I'm not sure efforts are best spent charging at the "one API for all" windmill. IMO, one could spend years trying to come up with yet another standard but not very useful API in this space. All we'd have in the end would be yet another standard but not very useful API with perhaps one buggy implementation on one speech engine. Plus, it would just be repeating work and making the same mistakes that have already been done time and time again.

As an alternative, I'd offer the approach of centering on an available recognition engine and designing the assistive technology first. Get your feet wet with that and use it as a vehicle to better understand the problems you will face with any speech recognition task for the desktop. Examples include:

- how to dynamically build a grammar based upon stuff you can get from the AT-SPI
- how to deal with confusable words (or discover that recognition for a particular grammar is just plain failing and you need to tweak it dynamically)
- how to deal with unspeakable words
- how to deal with deictic references
- how to deal with compound utterances
- how to handle dictation vs. command and control
- how to deal with tapering/restructuring of prompts based upon recognition success/failure
- how to allow the user to recover from misrecognitions
- how to handle custom profiles per user
- (MOST IMPORTANTLY) just what is a compelling speech interaction experience for the desktop?

Once you have a better understanding of the real problems and have developed a working assistive technology, then take a look at perhaps genericizing a useful layer to multiple engines. The end result is that you will probably end up with a useful assistive technology sooner. In addition, you will also end up with an API that is known to work for at least one assistive technology.

Will
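
The first bullet in Walker's list (building a grammar dynamically from the AT-SPI) is the kind of thing that is easy to prototype. The following is a rough sketch only, assuming pyatspi (the Python AT-SPI bindings) and a running accessibility registry; the choice of roles and all names are illustrative, not a prescribed design.

    # Sketch: harvest speakable widget names from the AT-SPI desktop tree
    # so they can seed a command-and-control grammar.  Illustrative only.
    import pyatspi

    SPEAKABLE_ROLES = {pyatspi.ROLE_PUSH_BUTTON, pyatspi.ROLE_MENU_ITEM}

    def collect_labels(acc, labels):
        """Recursively gather names of widgets a user might want to speak."""
        try:
            if acc.getRole() in SPEAKABLE_ROLES and acc.name:
                labels.add(acc.name.lower())
            for i in range(acc.childCount):
                collect_labels(acc.getChildAtIndex(i), labels)
        except Exception:
            pass  # accessibles can vanish while the tree is being walked

    labels = set()
    desktop = pyatspi.Registry.getDesktop(0)
    for i in range(desktop.childCount):
        collect_labels(desktop.getChildAtIndex(i), labels)

    # 'labels' could now become one command alternative each in a grammar,
    # which is where the confusable/unspeakable-word issues above show up.
    print(sorted(labels))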