You can use the JavaScript API to add speech-to-text and text-to-speech to your website with just a few lines of code. The code in these live examples is a good starting point for your own speech-enabled web pages.
The first step is to register for a login and password to use in the API calls. You can sign up for an account here. You can also use the account to log in to the website and monitor your account activity.
The JavaScript API has three modes: Basic, Advanced and Automatic. Basic mode is the best way to get started. It provides a simple set of methods and callbacks that you can use in your web pages. Advanced mode integrates nicely with modern JavaScript toolkits like jQuery. In Advanced mode you design the grammar segments that will create events, and you write the JavaScript callbacks that will process those events. Automatic mode lets you add basic speech processing to your web page with a minimal amount of code. You indicate which HTML tags you want speech-enabled and add some grammar hints to those tags.
Let's start with Basic mode by taking a look at the code in the parrot example.
To use speechapi you need to include two JavaScript files in your web page. The first contains the speechapi JavaScript; the second (swfobject.js) is a standard way to embed a Flash object. The Flash object is used to access the microphone from the browser and to stream audio to and from the browser.
Next you need to initialize the speechapi Flash component and establish a connection to the server.
Note that we are using the basic swfobject.js with Dynamic Publishing. A few other items worth pointing out here:
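// Address of the speech server the Flash component connects to
var flashvars = {speechServer : "rtmp://www.speechapi.com:1935/firstapp"};
// Let the Flash component call the JavaScript on this page
var params = {allowscriptaccess : "always"};
var attributes = {};
attributes.id = "flashContent";
// Replace the element with id "myAlternativeContent" with the Flash component
swfobject.embedSWF("http://www.speechapi.com/static/lib/speechapi-1.2.swf",
    "myAlternativeContent",
    "215", "138", "9.0.28", false, flashvars, params, attributes);
// Log in and register the callbacks; "flashContent" is the id set above
speechapi.setup("eli", "password", onResult,
    onFinishTTS, onLoaded, "flashContent");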
To enable recognition, use the setupRecognition method. Its first two parameters are the grammar mode and the grammar. The grammar mode can be either SIMPLE or JSGF. SIMPLE indicates that the grammar is just a comma-separated list of words or phrases. The comma acts as an 'OR': the recognizer will recognize any one of the items in the list and return it as the result. If you have more complex requirements, use JSGF mode. JSGF grammars allow you to do some interesting things, such as optional words, repeating words and embedded tags that help with semantic interpretation of the results.
You can learn more about JSGF here.
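To make the two grammar modes concrete, here is a minimal sketch of helpers that build a grammar string for each. The helper names (buildSimpleGrammar, buildJsgfGrammar) and the grammar and rule names are made up for illustration and are not part of speechapi. The JSGF text follows the standard JSGF syntax, where `|` separates alternatives and square brackets mark optional words.

```javascript
// Build a SIMPLE-mode grammar: a comma-separated list of words or phrases.
// Blank entries and surrounding whitespace are dropped.
function buildSimpleGrammar(phrases) {
  return phrases
    .map(function (p) { return p.trim(); })
    .filter(function (p) { return p.length > 0; })
    .join(",");
}

// Build a small JSGF grammar in which any one of the phrases may be spoken.
// grammarName and ruleName are hypothetical names chosen by the caller.
function buildJsgfGrammar(grammarName, ruleName, phrases) {
  return "#JSGF V1.0;\n" +
    "grammar " + grammarName + ";\n" +
    "public <" + ruleName + "> = " + phrases.join(" | ") + ";";
}

var simple = buildSimpleGrammar(["red", " green ", "", "dark blue"]);
// simple is "red,green,dark blue"

var jsgf = buildJsgfGrammar("colors", "color", ["red", "green", "dark blue"]);

// A hand-written JSGF grammar with an optional leading phrase:
var jsgfWithOptional =
  "#JSGF V1.0;\n" +
  "grammar colors;\n" +
  "public <color> = [the color] (red | green | blue);";

// In a page, the string would then be handed to the recognizer, for example:
//   speechapi.setupRecognition("JSGF", jsgf, false);
```

Building the string in one place keeps the grammar and the code that interprets the results in sync, since both can be driven from the same list of phrases.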
In our parrot example we set up a simple grammar using the content of a text box on the web page. Note that this is done inside the onLoaded method, so we are enabling recognition as soon as the Flash component is ready.
function onLoaded() {
    speechapi.setupRecognition("SIMPLE", document.getElementById('words').value, false);
}
The following code snippet shows the recognition and TTS callbacks. The recognition callback (onResult) receives the recognition results as a parameter. The result object contains the raw results in result.text. In this example the callback displays the results in a UI element on the page and then uses the text-to-speech API to repeat the results back to the user. Upon completion of the text-to-speech processing, you will receive a TTS complete callback. In this case we just alert the user that TTS is complete.
function onResult(result) {
    document.getElementById('answer').innerHTML = result.text;
    speechapi.speak(result.text, "male");
}
function onFinishTTS() {
    alert("finishTTS");
}
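Beyond echoing the text back, a recognition callback will often map the recognized phrase to an action. The following is a minimal sketch of that idea, assuming only that result.text carries the recognized phrase as described above. The helper name makeResultHandler, the command phrases and the lower-casing of the text are illustrative choices, not part of speechapi.

```javascript
// Returns a function that can be used as the recognition callback.
// `commands` maps a phrase from the grammar to a handler function;
// `onUnknown` is called for anything that is not in the map.
function makeResultHandler(commands, onUnknown) {
  return function (result) {
    // Guard against a missing result and normalize case and whitespace.
    var text = (result && typeof result.text === "string")
      ? result.text.trim().toLowerCase()
      : "";
    if (Object.prototype.hasOwnProperty.call(commands, text)) {
      return commands[text](result);
    }
    return onUnknown(text, result);
  };
}

// Example: record which handler ran for each simulated result.
var log = [];
var handleResult = makeResultHandler(
  {
    "lights on": function () { log.push("on"); },
    "lights off": function () { log.push("off"); }
  },
  function (text) { log.push("unknown:" + text); }
);

handleResult({ text: "Lights On" });
handleResult({ text: "open the door" });
handleResult(null);
// log is now ["on", "unknown:open the door", "unknown:"]
```

With a SIMPLE grammar the keys of the command map can double as the grammar itself, for example by joining them with commas.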
The final element of the parrot example is the resetGrammar method. It provides a way to change the grammar for the next recognition event.
function resetGrammar() {
    speechapi.setupRecognition("SIMPLE", document.getElementById('words').value, false);
}
The JavaScript API works by connecting to a Flash control to enable the user's microphone. The user will need to click Allow before it works.

Users sometimes need to make sure that the right microphone is chosen by right-clicking on the Flash control, clicking Settings, and selecting the microphone in use.
The Flash component provides two ways to trigger start speech and end speech events: the press-to-speak button and automatic mode. When using the press-to-speak button, a start speech event is generated when the button is pressed and an end speech event is triggered when it is released.
In automatic mode, the Flash component automatically detects the speech start and end events.
There is a third way to trigger speech events: you can use the startRecognition() and stopRecognition() JavaScript methods.
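As an illustration of the third option, here is a minimal sketch of a press-to-talk controller that you could drive from your own page element. It assumes the two methods are spelled startRecognition and stopRecognition, live on an object such as speechapi, and take no arguments; none of that is confirmed by the text above. The helper name makePushToTalk and the stand-in fakeApi object are made up for the example.

```javascript
// `api` is expected to expose startRecognition() and stopRecognition().
// Assumption: both are called with no arguments.
function makePushToTalk(api) {
  var listening = false;
  return {
    // Call when the user presses your own talk button.
    press: function () {
      if (!listening) {
        listening = true;
        api.startRecognition();
      }
    },
    // Call when the user releases the button.
    release: function () {
      if (listening) {
        listening = false;
        api.stopRecognition();
      }
    },
    isListening: function () { return listening; }
  };
}

// Stand-in object used only to show the order of the calls.
var calls = [];
var fakeApi = {
  startRecognition: function () { calls.push("start"); },
  stopRecognition: function () { calls.push("stop"); }
};

var talk = makePushToTalk(fakeApi);
talk.press();
talk.press();   // ignored: already listening
talk.release();
// calls is now ["start", "stop"]
```

In a page you would pass the real API object instead of fakeApi and call press and release from the button's mousedown and mouseup handlers. The listening flag keeps repeated events from starting recognition twice.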
You can use either RTMP streaming or HTTP streaming by specifying the URL in the SWF flashvar "speechServer".
// This tells the Flash component to use an RTMP speech server
var flashvars = {speechServer : "rtmp://www.speechapi.com:1935/firstapp"};
// This tells the Flash component to use an HTTP speech server
var flashvars = {speechServer : "http://www.speechapi.com:8000/speechcloud"};
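If a page needs to switch between the two transports, the choice can be kept in one place. This is a minimal sketch that uses the two URLs shown above. The helper name buildFlashvars and the "rtmp" / "http" keys are illustrative, not part of speechapi.

```javascript
// The two speech server URLs from the examples above.
var SPEECH_SERVERS = {
  rtmp: "rtmp://www.speechapi.com:1935/firstapp",
  http: "http://www.speechapi.com:8000/speechcloud"
};

// Returns a flashvars object for the requested transport ("rtmp" or "http").
function buildFlashvars(transport) {
  if (!Object.prototype.hasOwnProperty.call(SPEECH_SERVERS, transport)) {
    throw new Error("Unknown transport: " + transport);
  }
  return { speechServer: SPEECH_SERVERS[transport] };
}

var httpFlashvars = buildFlashvars("http");
// httpFlashvars.speechServer is "http://www.speechapi.com:8000/speechcloud"
```

The returned object can be passed to swfobject.embedSWF in place of the hand-written flashvars variable.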
Copyright speechapi.com 2009-2010