BrowserHelp: Control your Chrome browser via voice interactions

Published June 22, 2017 © GPL3+

Alexa BrowserHelp

BrowserHelp allows full voice control of your Chrome browser! Open any link, search with Google, scroll, and more

Intermediate · Full instructions provided · Over 3 days · 2,176

2nd Place

Amazon Alexa Skills Contest

Things used in this project

Hardware components

Amazon Alexa Amazon Echo

Software apps and online services

Amazon Alexa Alexa Skills Kit

Amazon Web Services AWS Lambda

Amazon Alexa Alexa Voice Service

Story

BrowserHelp: Control your Chrome browser via voice interactions

Inspiration

A large part of our interaction with the world nowadays comes from surfing the web. This requires constant use of a keyboard and mouse, which is a serious barrier for people who cannot use them because of physical impairments or disabilities. BrowserHelp tackles this problem by offering an alternative: natural voice interaction through your Amazon Echo device that gives you complete control over your Chrome browser.

What it does

After installing the Alexa skill and the companion Chrome extension, you can navigate the web and perform all of your basic browser interactions without laying a finger on your keyboard or mouse! Searching with Google, scrolling, managing tabs, opening arbitrary links on a page, and moving through your history are some examples. You can also set your preferred news site and up to 3 favourites for easy access.

How I built it

BrowserHelp consists of three components:

The BrowserHelp Alexa Skill, backed by a Node.js Lambda function

The BrowserHelp Chrome extension

A Node.js server, needed to:

Form a bridge between the HTTP-based requests from the Lambda function and the WebSocket connections needed by the Chrome extension

Facilitate Login with Amazon when setting up the Chrome Extension

User interaction and conversation flow are handled within the Lambda function, which uses Account Linking to identify different users. Actions to be performed are sent from the Lambda function to the server over a secure connection, together with a hashed user identifier. The Chrome extension, once installed, uses Login with Amazon via the server to acquire and store the same hashed identifier. After this, the server and the extension establish a dedicated, secure socket.io channel for that hash, through which all communication for that user runs. The extension then performs the requested actions using a mix of Chrome APIs and injected content scripts.
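As a rough sketch of the Lambda side of this flow, the snippet below hashes a user identifier and builds the HTTPS request that carries an action to the server. The SHA-256 hash, the `/action` path, the JSON field names, and the function names are assumptions made for illustration; the project's actual code may differ.

```javascript
const crypto = require('crypto');
const https = require('https');

// Assumption: SHA-256 is used here only as an example of a one-way hash.
function hashUserId(userId) {
  return crypto.createHash('sha256').update(userId).digest('hex');
}

// Build the request options and JSON body for one action.
// The '/action' path and the { user, action } payload shape are hypothetical.
function buildActionRequest(baseUrl, userHash, action) {
  const url = new URL('/action', baseUrl);
  const body = JSON.stringify({ user: userHash, action });
  return {
    options: {
      method: 'POST',
      hostname: url.hostname,
      path: url.pathname,
      headers: {
        'Content-Type': 'application/json',
        'Content-Length': Buffer.byteLength(body),
      },
    },
    body,
  };
}

// Send the action over HTTPS and resolve with the HTTP status code.
function sendAction(baseUrl, userHash, action) {
  const { options, body } = buildActionRequest(baseUrl, userHash, action);
  return new Promise((resolve, reject) => {
    const req = https.request(options, (res) => {
      res.resume(); // discard the response body
      res.on('end', () => resolve(res.statusCode));
    });
    req.on('error', reject);
    req.end(body);
  });
}
```

Inside an intent handler, a call such as `sendAction(baseUrl, hashUserId(userId), { type: 'scroll', direction: 'down' })` would then forward a "scroll down" request. Because each request is self-contained, the Lambda function needs no state between invocations.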

Challenges I ran into

The only scalable way I found to keep a Chrome extension in sync with the actions it needs to perform is to use WebSockets and the Publisher-Subscriber (PubSub) pattern. This does not work well with the stateless architecture of Lambda functions, however, because they cannot keep a WebSocket connection alive. The most scalable solution I could find was to relay all of the Lambda function's requests to a server, which creates a dedicated WebSocket channel per user that the user's Chrome extension can subscribe to.

Free-form text input, as needed for searching any website or filling in any input on a page, is still quite a challenge with Alexa. As a (hopefully) temporary solution, I decided to use the Web Speech API to interpret search queries.
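The relay idea can be illustrated with a small in-memory sketch. It uses Node's built-in `EventEmitter` as a stand-in for the per-user socket.io channels, so it is a model of the pattern rather than the project's server code; the `Relay` class and its method names are hypothetical.

```javascript
const { EventEmitter } = require('events');

// In-memory stand-in for the server: one channel per hashed user identifier.
class Relay {
  constructor() {
    this.channels = new EventEmitter();
  }

  // An extension subscribes to its own channel after logging in.
  // Returns a function that removes the subscription again.
  subscribe(userHash, handler) {
    this.channels.on(userHash, handler);
    return () => this.channels.off(userHash, handler);
  }

  // Called for each incoming HTTP request from the Lambda function.
  // Returns true if at least one subscriber received the action.
  publish(userHash, action) {
    return this.channels.emit(userHash, action);
  }
}

const relay = new Relay();
const received = [];
const unsubscribe = relay.subscribe('hash-of-user-a', (action) => received.push(action));

relay.publish('hash-of-user-a', { type: 'scroll', direction: 'down' }); // delivered
relay.publish('hash-of-user-b', { type: 'reload' }); // no subscriber, dropped
unsubscribe();
```

The return value of `publish` lets the server tell the skill whether a browser was actually listening, which could be used to give the user a spoken hint when the extension is not connected.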

Accomplishments that I'm proud of

BrowserHelp is my first project to go into production, and it's incredibly gratifying to finish a project and see it used by people from all over the world

Receiving incredibly positive feedback from multiple users during beta testing and after launch, and hearing about uses of BrowserHelp that I hadn't thought of before

Coming up with a use case for Alexa that goes beyond the purely voice-based interaction with your Echo that most skills stick to, and getting it through Amazon's strict certification process

How Do I Get Started

Visit browserhelp.me to install the Alexa Skill and Chrome Extension.

After installing the skill, enable Account Linking for that skill via either the Alexa app on your phone or the Alexa web app.

When you've installed the companion Chrome Extension, Alexa BrowserHelp, you will be prompted to log in via Login with Amazon. Log in using the same account you used to install the Alexa Skill.

Once this is complete, a message will appear telling you that you can close the tab and start using the skill, or set up your favourite websites and news site via the options page.

You are now ready to start using the skill by saying "Alexa, start BrowserHelp". Alternatively, try "Alexa, ask BrowserHelp to scroll down" or one of the other supported phrases listed below.

Setup

To set up the skill from source, follow these steps:

Clone the code from https://github.com/BerioFlow/BrowserHelp

Deploy the server on any platform, and enable HTTPS

Update all occurrences of the "serene-harbor-37271.herokuapp.com" URL in the project to the base URL of your own server

Install the Chrome extension found in the extension folder as described in https://developer.chrome.com/extensions/getstarted#unpacked

Create a new Skill for Alexa, and when configuring the Interaction Model use the settings stored in intentScheme.json, LIST_OF_ITEMS.txt, and sampleUtterances.txt from the skill directory to define the allowed voice interactions

Configure and upload your skill via the AWS CLI as described in https://developer.amazon.com/blogs/post/Tx1UE9W1NQ0GYII/publishing-your-skill-code-to-lambda-via-the-command-line-interface and use the included publish.sh to re-upload

The setup should now be complete. If the skill was uploaded correctly, it is automatically available on Alexa devices that are logged in with your Amazon account. Test the skill by saying "Alexa, start BrowserHelp".
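The step that replaces the server URL can be done by hand or scripted. The following is a minimal sketch of such a script, assuming the host appears as plain text in the project files; the `replaceHost` function and the example host `my-server.example.com` are illustrative and not part of the repository.

```javascript
const fs = require('fs');
const path = require('path');

const OLD_HOST = 'serene-harbor-37271.herokuapp.com';

// Recursively rewrite every file under `dir` that mentions OLD_HOST.
// Returns the list of files that were changed.
function replaceHost(dir, newHost, changed = []) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      // Skip dependency and version-control folders.
      if (entry.name === 'node_modules' || entry.name === '.git') continue;
      replaceHost(full, newHost, changed);
    } else if (entry.isFile()) {
      // Files are read as UTF-8 text; only files containing the old host are written.
      const text = fs.readFileSync(full, 'utf8');
      if (text.includes(OLD_HOST)) {
        fs.writeFileSync(full, text.split(OLD_HOST).join(newHost));
        changed.push(full);
      }
    }
  }
  return changed;
}
```

Calling `replaceHost('.', 'my-server.example.com')` from the root of the cloned repository would update the files in place, so it is worth committing or backing up first and reviewing the returned list of changed files.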

Supported Phrases

Try some of the following sample utterances:

Search with Google

Show News

Highlight links

Open Link {x}

Remove highlighting

Open favourite {1/2/3}

Help

Navigate {back/forward}

Scroll {up/down}

{Open/close} tab

Press Enter

Reload page

Open {Youtube/Google/Facebook/Twitter/Hacker News}
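To give an idea of how phrases like these could turn into browser behaviour, here is a hedged sketch of a dispatcher inside the extension. The action names (`openTab`, `reload`, `scroll`) and the message format are hypothetical; only the `chrome.tabs` calls themselves (`create`, `reload`, `query`, `sendMessage`) are real Chrome extension APIs. The `chromeApi` parameter exists so the sketch can be exercised outside a browser.

```javascript
// Map an incoming action to a Chrome API call.
// In an extension, pass the global `chrome` object as `chromeApi`.
// Returns true if the action type was recognised.
function handleAction(action, chromeApi) {
  switch (action.type) {
    case 'openTab':
      chromeApi.tabs.create({ url: action.url });
      return true;
    case 'reload':
      chromeApi.tabs.reload(); // reloads the selected tab of the current window
      return true;
    case 'scroll':
      // Scrolling happens inside the page, so forward it to a content script.
      chromeApi.tabs.query({ active: true, currentWindow: true }, (tabs) => {
        if (tabs.length > 0) {
          chromeApi.tabs.sendMessage(tabs[0].id, {
            type: 'scroll',
            direction: action.direction,
          });
        }
      });
      return true;
    default:
      return false;
  }
}
```

Keeping the dispatcher as one function with an injected API object makes it straightforward to add further commands and to check them with a fake `chrome` object.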

What's Next for BrowserHelp

Offer custom integration and specific voice commands for large platforms such as YouTube or Facebook

Inject the Web Speech API for filling in any form or search box on the page, in the same way as is currently done for highlighting and selecting links

Extend commands. Commands present in the next version include:

Dictate Page

List Bookmarks

Open Bookmark {x}

Press {Tab/Backspace/Spacebar/Up/Down/Left/Right}

Use Input {1/2/3}

Setting a single or repeated timer for any of the existing commands

Credits

Tim Nederveen


MSc. student Software Engineering, Owner @ Flow ICT, Check out my personal projects at www.timnederveen.me !
