Speech-to-text in the browser

Voicer is a solution that transcribes speech to text and works in your browser (Google Chrome only). It is designed to help hearing-impaired people communicate with their friends more easily, and it also lets anyone follow an audio conversation without any sound.

Demo: https://nevolin.be/voicer/?room=medium

Voicer takes your microphone input, transcribes it to text and broadcasts the text to your connected friends. It uses the Web Speech API, which is currently only available in Google Chrome. The connection is secured through HTTPS/SSL and everyone’s privacy is respected: no data is stored or shared with third parties.
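
To give an idea of how the transcription step can look, here is a minimal sketch using the Web Speech API. It is not taken from the Voicer source: the function names `collectFinalTranscripts` and `startTranscribing` are made up for this illustration, and the recognition constructor is passed in as a parameter (in Chrome that would be `window.webkitSpeechRecognition`).

```javascript
// Illustrative sketch only; function names are hypothetical, not from Voicer.

// Collect the finished (final) phrases from a SpeechRecognition "result" event.
function collectFinalTranscripts(event) {
  const texts = [];
  // Results before resultIndex were already reported in earlier events.
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    if (result.isFinal) {
      // result[0] is the first recognition alternative.
      texts.push(result[0].transcript.trim());
    }
  }
  return texts;
}

// Start continuous recognition and hand every finished phrase to onText.
// RecognitionCtor is injected; in Chrome pass window.webkitSpeechRecognition.
function startTranscribing(RecognitionCtor, onText, lang = 'en-US') {
  const recognition = new RecognitionCtor();
  recognition.continuous = true;      // keep listening after each phrase
  recognition.interimResults = false; // only report finished phrases
  recognition.lang = lang;            // assumed default language
  recognition.onresult = (event) => {
    for (const text of collectFinalTranscripts(event)) {
      onText(text);
    }
  };
  recognition.start();
  return recognition;
}
```

In the browser, the `onText` callback is where each phrase would be shown on screen and sent to the server, for example `startTranscribing(window.webkitSpeechRecognition, (text) => console.log(text))`.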

Open the app link in your Chrome browser, allow microphone access, enter your username and submit. Now you can start talking and you’ll see your words/sentences appear on screen.

Link to Source Code: https://github.com/healzer/voicer

Many months ago I was building a music bot for Discord with voice-enabled controls (e.g. play next, pause, shuffle, play random, play ). The bot got some traction, and I started getting attention from people with hearing conditions. Unfortunately, the bot has to be configured and hosted, which may be a little too hard for non-technical people. So I started looking into simpler solutions, and that is how Voicer was born. It only needs Google Chrome to work.

Other browsers such as Safari, Edge and Firefox have their Speech API in development, so hopefully they’ll be compatible soon.
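
Since support differs per browser, a page can check for the API before trying to use it. The snippet below is a small sketch of such a check, assuming the standard `SpeechRecognition` name and Chrome’s prefixed `webkitSpeechRecognition`; the helper names are hypothetical and not part of Voicer.

```javascript
// Illustrative sketch only; helper names are hypothetical.

// Return the speech recognition constructor if the environment has one, else null.
function getSpeechRecognition(globalObject) {
  return (
    globalObject.SpeechRecognition ||
    globalObject.webkitSpeechRecognition || // prefixed name used by Chrome
    null
  );
}

// Produce a short message that can be shown to the user.
function describeSupport(globalObject) {
  return getSpeechRecognition(globalObject)
    ? 'Speech recognition is available.'
    : 'Speech recognition is not available in this browser; try Google Chrome.';
}
```

In a page this would be called as `describeSupport(window)`.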

The front-end is plain JavaScript/jQuery/HTML, nothing too fancy, and the back-end runs on Node.js. Server-client communication goes over WebSockets to keep latency to a minimum.

The beautiful part is that it lets you join “rooms”, so many people can use it with just a single server running. My instance runs on a basic $5 DigitalOcean cloud app.

You can use the app as is, or you can host it yourself. The server component does not store any sensitive information about the conversations. The speech-to-text part is done by Google Chrome, in your browser. The server component is nothing more than a broker for all the connected users.
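
The room and broker idea can be sketched in a few lines. This is not the actual Voicer server code: `RoomBroker` and its methods are hypothetical names, and a “client” is assumed to be any object with a `send(string)` method, which is the shape WebSocket connections have. Messages are relayed and then forgotten; nothing is kept on the server.

```javascript
// Illustrative sketch only; class and method names are hypothetical.
class RoomBroker {
  constructor() {
    this.rooms = new Map(); // room name -> Set of connected clients
  }

  // Add a client to a room, creating the room on first use.
  join(room, client) {
    if (!this.rooms.has(room)) {
      this.rooms.set(room, new Set());
    }
    this.rooms.get(room).add(client);
  }

  // Remove a client; drop the room once it is empty.
  leave(room, client) {
    const members = this.rooms.get(room);
    if (!members) return;
    members.delete(client);
    if (members.size === 0) {
      this.rooms.delete(room);
    }
  }

  // Relay a transcribed phrase to everyone else in the room.
  // Returns how many clients received it; the message itself is not stored.
  broadcast(room, sender, username, text) {
    const members = this.rooms.get(room);
    if (!members) return 0;
    const payload = JSON.stringify({ username, text });
    let delivered = 0;
    for (const client of members) {
      if (client !== sender) {
        client.send(payload);
        delivered++;
      }
    }
    return delivered;
  }
}
```

Skipping the sender in `broadcast` is a choice made for this sketch, on the assumption that the speaker’s own text is shown locally; a server could just as well echo it back.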

You can use third-party software to keep your browser window or tab on top of all your other windows, so you can keep following the conversation while working or gaming. This won’t work with full-screen apps, so gamers need to play in windowed mode.
