I recently bought an Amazon Echo, and I was seriously impressed by its ability to derive meaning from my commands. It really is a great product, but this isn’t an Echo review. This is an introduction to experimental Chrome APIs. With these tools, you can build your very own Alexa clone.

Speech Recognition

If you’ve ever clicked the microphone on the right-hand side of the Google search input, you’ve already experienced the power of the SpeechRecognition API. The speech recognizer will listen to what you say and convert your words to a string.

Browser support is currently limited to Chrome. In Firefox, the API can be enabled by setting the media.webspeech.recognition.enable flag in about:config.

// This API is currently prefixed in Chrome
var SpeechRecognition = (
  window.SpeechRecognition ||
  window.webkitSpeechRecognition
);

// Create a new recognizer
var recognizer = new SpeechRecognition();

// Start producing results before the person has finished speaking
recognizer.interimResults = true;

// Set the language of the recognizer
recognizer.lang = 'en-US';

// Define a callback to process results
recognizer.onresult = function (event) {
  var result = event.results[event.resultIndex];

  if (result.isFinal) {
    alert('You said: ' + result[0].transcript);
  } else {
    console.log('Interim result', result[0].transcript);
  }
};

// Start listening...
recognizer.start();
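
To get from a transcript to something Alexa-like, the final transcript has to be mapped to an action. Here is a minimal sketch of one way to do that. The `commands` table and the `matchCommand` helper are hypothetical names invented for this example, not part of the SpeechRecognition API, and the two patterns are only placeholders.

```javascript
// Hypothetical command table: each entry pairs a pattern with a handler
// that returns the text the assistant should reply with.
var commands = [
  {
    pattern: /^what time is it/,
    handler: function () {
      return 'It is ' + new Date().toLocaleTimeString();
    }
  },
  {
    pattern: /^say (.+)$/,
    handler: function (match) {
      return match[1];
    }
  }
];

// Normalize a transcript and return the reply from the first matching
// command, or null when nothing matches.
function matchCommand(transcript) {
  var text = transcript.trim().toLowerCase();

  for (var i = 0; i < commands.length; i++) {
    var match = text.match(commands[i].pattern);
    if (match) {
      return commands[i].handler(match);
    }
  }

  return null;
}
```

Inside the onresult callback, you could call matchCommand(result[0].transcript) once result.isFinal is true. The transcript is trimmed and lowercased first because recognizers may capitalize words or pad the string.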

For more information, see the SpeechRecognition documentation on the Mozilla Developer Network.

Speech Synthesis

Another API that might surprise you is the browser’s speechSynthesis API. It allows you to make your browser talk!

Browser support is a little better for this API, as it works on both Chrome and Safari (including mobile browsers).

speechSynthesis.speak(
  new SpeechSynthesisUtterance('Howdy, partner')
);
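
An utterance also exposes lang, rate, pitch, and volume properties. The sketch below applies them from a plain options object. The `configureUtterance` and `clamp` helpers are hypothetical, and the clamping ranges are the ones I understand the Web Speech API specification to define (rate 0.1 to 10, pitch 0 to 2, volume 0 to 1), so treat them as an assumption.

```javascript
// Keep a number inside [min, max].
function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

// Hypothetical helper: copy the provided options onto an utterance,
// clamping numeric values to the assumed valid ranges.
function configureUtterance(utterance, options) {
  if (options.lang !== undefined) {
    utterance.lang = options.lang;
  }
  if (options.rate !== undefined) {
    utterance.rate = clamp(options.rate, 0.1, 10);
  }
  if (options.pitch !== undefined) {
    utterance.pitch = clamp(options.pitch, 0, 2);
  }
  if (options.volume !== undefined) {
    utterance.volume = clamp(options.volume, 0, 1);
  }
  return utterance;
}
```

In the browser, this could be used as speechSynthesis.speak(configureUtterance(new SpeechSynthesisUtterance('Howdy, partner'), { rate: 0.8, pitch: 1.2 })).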

You can also specify which voice to use:

var voices = speechSynthesis.getVoices();
var utterance = new SpeechSynthesisUtterance('Howdy, partner');
// getVoices() can return an empty list until the voices have loaded
if (voices[1]) {
  utterance.voice = voices[1];
}
speechSynthesis.speak(utterance);
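
In some browsers the voice list loads asynchronously, so getVoices() may be empty the first time it is called. Below is a rough sketch of waiting for the voiceschanged event before choosing a voice. `loadVoices` and `pickVoice` are hypothetical helpers, and the `synth` parameter stands in for window.speechSynthesis so the logic can be tried outside a browser. The check for addEventListener is a precaution, on the assumption that not every browser exposes it on speechSynthesis.

```javascript
// Resolve with the voice list, waiting for voiceschanged if it is empty.
function loadVoices(synth) {
  return new Promise(function (resolve) {
    var voices = synth.getVoices();

    // Voices already available, or no way to wait for them: resolve now.
    if (voices.length > 0 || typeof synth.addEventListener !== 'function') {
      resolve(voices);
      return;
    }

    synth.addEventListener('voiceschanged', function onChange() {
      synth.removeEventListener('voiceschanged', onChange);
      resolve(synth.getVoices());
    });
  });
}

// Return the first voice whose lang matches, or null if none does.
function pickVoice(voices, lang) {
  for (var i = 0; i < voices.length; i++) {
    if (voices[i].lang === lang) {
      return voices[i];
    }
  }
  return null;
}
```

A possible usage in the browser: loadVoices(speechSynthesis).then(function (voices) { utterance.voice = pickVoice(voices, 'en-US'); speechSynthesis.speak(utterance); }). Assigning null leaves the browser's default voice in place.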

For more information, see the SpeechSynthesis documentation on the Mozilla Developer Network.