
How to add voice commands to your webpage with JavaScript

Have you ever wanted to give orders to a computer and get an answer back? The way Tony Stark talks to Jarvis is really fluid. Sadly, what we can achieve with this article and this library is more limited, and you will have to set up almost everything by hand to get a fluid feel when you talk to your computer through JavaScript and webkitSpeechRecognition.

Jarvis, make me a sandwich

Now that you are a developer, the great day has come: you can create a website with voice commands that is as flexible as you want. The HTML5 Speech Recognition API gives JavaScript access to a browser's audio stream and converts it to text. Thanks to Artyom.js, a voice commands library, this task becomes a piece of cake.

Note: webkitSpeechRecognition is only available in Google Chrome. Hopefully this feature will become a standard in all browsers, but for now Artyom can only be tried in Chrome.
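Because support depends on the browser, it can help to check for the API before enabling any voice feature. The following is a minimal sketch, not part of Artyom: supportsSpeechRecognition is a hypothetical helper name, and checking the unprefixed SpeechRecognition name as well is an assumption made so the check keeps working if a browser exposes the API without the webkit prefix.

```javascript
// Hypothetical helper (not part of Artyom): returns true when the given
// global object exposes a speech recognition constructor.
// `scope` is a parameter so the check can also run outside a browser.
function supportsSpeechRecognition(scope) {
  return Boolean(
    scope && (scope.SpeechRecognition || scope.webkitSpeechRecognition)
  );
}

// In a page, pass `window`. The typeof guard keeps the snippet harmless
// in environments where `window` does not exist.
if (typeof window !== "undefined") {
  if (supportsSpeechRecognition(window)) {
    console.log("Speech recognition is available, voice commands can be enabled.");
  } else {
    console.warn("Speech recognition is not available in this browser.");
  }
}
```

With a check like this you can hide the microphone button, or show a hint, instead of failing silently.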

About the commands

Every command is an object literal with a few key-value pairs:

  • indexes : All the available words that trigger this command
  • description : Add a little description to your command
  • action : A function that will be executed if a spoken word triggers this command

Read more about the commands here.
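To make the three keys concrete, here is a small sketch of a complete command. The phrases and the looksLikeCommand helper are made up for illustration (the helper is not part of Artyom), and treating description as optional in that check is an assumption.

```javascript
// A command with the three keys described above.
const greetingCommand = {
  indexes: ["good morning", "good afternoon"],
  description: "Replies to a greeting in the console",
  action: function (i) {
    // i is the position in `indexes` of the phrase that matched
    console.log("Matched phrase: " + greetingCommand.indexes[i]);
  }
};

// Hypothetical helper (not part of Artyom): checks the basic shape of a
// command before registering it.
function looksLikeCommand(command) {
  return Array.isArray(command.indexes) &&
    command.indexes.length > 0 &&
    typeof command.action === "function";
}

// Register the command only when an `artyom` instance exists on the page.
if (typeof artyom !== "undefined" && looksLikeCommand(greetingCommand)) {
  artyom.addCommands(greetingCommand);
}
```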

Step 1.

Add the library to your document in the head tag (you can get a copy of the library from the official repository on GitHub):

<!DOCTYPE html>
<html>
  <head>
    <title>Cooking with artyom.js</title>
    <!-- It is important to load artyom in the head tag: this gives the browser time to load all the voices -->
    <script type="text/javascript" src="path/to/artyom.min.js"></script>
    <script>
         // Create a globally accessible instance of artyom
         window.artyom = new Artyom();
    </script>
  </head>
  <body>

    <script>
      // Artyom is available!
    </script>
  </body>
</html>

Step 2.

Add your commands. It is important to read the documentation to understand how commands work. Artyom lets you add both normal and smart commands.

A normal command is triggered when the user speaks and the recognized text matches one of the indexes of the command (the entries in its array), for example:

// A normal command

artyom.addCommands({
  indexes:["Hello","Hey","Hurra"],
  action: function(i){
    // i = index of the recognized option
    console.log("Something matches");
  }
});

A smart command lets you retrieve part of the spoken text, which is useful when the command contains a variable part, for example:

artyom.addCommands({
  smart:true, // Mark this command as smart
  indexes:["How many people live in *"], // * captures whatever is spoken after "How many people live in"
  action:function(i,wildcard){
    // Normalize the captured text so "Paris" and "paris" hit the same case
    switch(wildcard.trim().toLowerCase()){
      case "berlin":
        alert("Why should I know something like this?");
      break;
      case "paris":
        alert("I don't know");
      break;
      default:
        alert("I don't know what city " + wildcard + " is. Try adding more switch cases!");
      break;
    }
  }
});
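If the idea of the * wildcard feels abstract, the sketch below shows one way such a pattern could be matched against recognized text. It is purely illustrative: matchWildcard is a hypothetical function, it is not how Artyom is implemented internally, and lowercasing the captured text is a choice made here for simplicity.

```javascript
// Illustrative only, not Artyom's internals: matches a pattern that
// contains a single "*" and returns the captured text (lowercased),
// or null when the spoken text does not fit the pattern.
function matchWildcard(pattern, spokenText) {
  const starAt = pattern.indexOf("*");
  if (starAt === -1) return null; // not a wildcard pattern

  const prefix = pattern.slice(0, starAt).trim().toLowerCase();
  const suffix = pattern.slice(starAt + 1).trim().toLowerCase();
  const text = spokenText.trim().toLowerCase();

  if (text.length < prefix.length + suffix.length) return null;
  if (!text.startsWith(prefix) || !text.endsWith(suffix)) return null;

  // Whatever sits between the fixed parts is the wildcard value.
  const captured = text.slice(prefix.length, text.length - suffix.length).trim();
  return captured.length > 0 ? captured : null;
}

console.log(matchWildcard("How many people live in *", "How many people live in Paris"));
```

The captured value plays the role of the wildcard argument that the action function receives in the command above.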

Step 3 (optional).

Check that your command works with artyom.simulateInstruction. This function simulates a voice command, so you can see how it will behave when the user talks, for example (using the previous commands):

artyom.simulateInstruction("How many people live in Paris");
// alert("I don't know");
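When you have several commands, it can be handy to run a whole list of test phrases in one go. This is a small sketch around artyom.simulateInstruction: simulateAll is a hypothetical helper, and the phrases are just the ones from the commands in this article.

```javascript
// Hypothetical helper (not part of Artyom): feeds every phrase to
// simulateInstruction and returns how many phrases were sent.
// `assistant` is any object with a simulateInstruction method.
function simulateAll(assistant, phrases) {
  phrases.forEach(function (phrase) {
    assistant.simulateInstruction(phrase);
  });
  return phrases.length;
}

// Run the simulation only when an `artyom` instance exists on the page.
if (typeof artyom !== "undefined") {
  simulateAll(artyom, ["Hello", "How many people live in Berlin"]);
}
```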

Step 4.

Start Artyom. The initialize function does the trick; you only have to configure it correctly and everything will work fine. The basic options you need to provide are:

  • lang : The code of a language supported by Artyom (see the list of supported languages in the documentation)
  • continuous : Boolean. If you are using an HTTPS connection you can set it to true; otherwise always set it to false, which activates the one-command mode
  • listen : Boolean. If set to true, Artyom starts listening; otherwise the settings are only saved
  • debug : Boolean. If set to true, all the recognized text and a lot of additional information is shown in the console

And it's really simple to use :

artyom.initialize({
   lang:"en-GB", // More languages are documented in the library
   continuous:false, // With an https connection you can activate continuous mode
   debug:true, // Show everything in the console
   listen:true // Start listening when this function is called
});

// Artyom has been started ;)
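Since continuous mode depends on HTTPS, you could derive that option from the page's protocol instead of hard-coding it. The sketch below assumes a hypothetical buildArtyomOptions helper (not part of Artyom); the option names are the ones listed above.

```javascript
// Hypothetical helper (not part of Artyom): builds the initialize options
// and enables continuous mode only when the page is served over HTTPS.
// `protocol` is expected in the form returned by location.protocol,
// for example "https:" or "http:".
function buildArtyomOptions(protocol) {
  return {
    lang: "en-GB",
    continuous: protocol === "https:",
    debug: true,
    listen: true
  };
}

// Start Artyom only when both the page and an `artyom` instance exist.
if (typeof window !== "undefined" && typeof artyom !== "undefined") {
  artyom.initialize(buildArtyomOptions(window.location.protocol));
}
```

This way the same script works on a plain http development server (one-command mode) and on an https site (continuous mode).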

Step 5.

If you want to stop artyom, use the fatality function. The instance of artyom will be stopped instantly.

artyom.fatality();

Important notes

Artyom is a robust wrapper around the speechRecognition and speechSynthesis APIs of Google Chrome, which means it offers many features that can be useful in personal voice command projects.

  • Read the official documentation of Artyom.js
  • Artyom.js can easily make your browser talk with the artyom.say instruction
  • Artyom needs to be served from a local or remote server (http or https); otherwise, for security reasons, you can't use the webkitSpeechRecognition API
  • Artyom needs the https protocol to work in continuous mode (a permanent voice assistant)
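As a closing example, a command can answer out loud by calling artyom.say inside its action. This is only a sketch: the trigger phrase and the formatTimeSentence helper are made up for illustration, and it assumes an artyom instance created as in Step 1.

```javascript
// Hypothetical helper: turns a Date into a sentence that can be spoken.
function formatTimeSentence(date) {
  return "It is " + date.getHours() + " hours and " + date.getMinutes() + " minutes";
}

// A command that replies with the current time using artyom.say.
const timeCommand = {
  indexes: ["what time is it"],
  description: "Says the current time out loud",
  action: function () {
    artyom.say(formatTimeSentence(new Date()));
  }
};

// Register the command only when an `artyom` instance exists on the page.
if (typeof artyom !== "undefined") {
  artyom.addCommands(timeCommand);
}
```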
