Python Speech Recognition program

Theory:
This document is targeted at beginner to intermediate Windows and Linux users who are interested in learning about speech recognition and trying it out. It may also help interested developers by explaining the basics of speech recognition programming.

I started this document when I began researching what speech recognition software and development libraries were available for Windows and Linux. Automated Speech Recognition (ASR or just SR) on Windows and Linux is just starting to come into its own, and I hope this document gives it a push in the right direction by supporting both users and developers of ASR technology.

A quality microphone is key when utilizing ASR. In most cases, a desktop microphone just won't do the job. Desktop microphones tend to pick up more ambient noise, which gives ASR programs a hard time.

An utterance is the vocalization (speaking) of a word or words that represent a single meaning to the computer. Utterances can be a single word, a few words, a sentence, or even multiple sentences.

The ability of a recognizer can be examined by measuring its accuracy, that is, how well it recognizes utterances. This includes not only correctly identifying an utterance but also identifying when a spoken utterance is not in its vocabulary. Good ASR systems have an accuracy of 98% or more. The acceptable accuracy of a system really depends on the application.
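As a rough illustration of the accuracy measure described above, the following sketch scores a recognizer against a list of test utterances. The function name `recognition_accuracy`, the convention of using `None` for an out-of-vocabulary utterance, and the sample data are all assumptions made for this example; they do not come from any library.

```python
def recognition_accuracy(results):
    """Return the fraction of utterances the recognizer handled correctly.

    `results` is a list of (expected, recognized) pairs. An expected value of
    None means the utterance was outside the vocabulary, so the correct
    behaviour is for the recognizer to reject it and return None as well.
    """
    if not results:
        raise ValueError("need at least one test utterance")
    correct = sum(1 for expected, recognized in results if expected == recognized)
    return correct / len(results)


# Hypothetical test run: four utterances, one of them out of vocabulary.
sample = [
    ("play music", "play music"),
    ("send mail", "send mail"),
    ("play music", "play muse"),  # misrecognized
    (None, None),                 # correctly rejected
]
print(recognition_accuracy(sample))  # 3 of 4 correct
```

Counting a correct rejection as a success matches the point that a recognizer should also notice when an utterance is not in its vocabulary.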

Handheld microphones are also not the best choice as they can be cumbersome to pick up all the time. While they do limit the amount of ambient noise, they are most useful in applications that require changing speakers often, or when speaking to the recognizer isn't done frequently (when wearing a headset isn't an option).

The best choice, and by far the most common, is the headset style. It minimizes ambient noise while keeping the microphone at the tip of your tongue all the time. Headsets are available without earphones and with earphones (mono or stereo). I recommend the stereo headphones, but it's just a matter of personal taste.

Speech Recognition Basics:
Speech recognition is the process by which a computer (or another type of machine) identifies spoken words. Basically, it means talking to your computer AND having it correctly recognize what you are saying.

Speech recognition systems can be separated into several different classes by describing what types of utterances they have the ability to recognize. These classes are based on the fact that one of the difficulties of ASR is the ability to determine when a speaker starts and finishes an utterance. Most packages can fit into more than one class, depending on which mode they're using.
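To make the start-and-finish problem concrete, here is a minimal sketch of one simple way to locate an utterance in a list of audio samples: mark a frame as speech when its average loudness is above a threshold. The function name `find_utterance`, the frame size, and the threshold value are assumptions chosen for illustration. Real recognizers use more robust methods than this.

```python
def find_utterance(samples, frame_size=4, threshold=0.1):
    """Return (start, end) sample indices of the detected utterance, or None.

    A frame counts as speech when its mean absolute amplitude exceeds
    `threshold`. The utterance is taken to run from the start of the first
    speech frame to the end of the last one.
    """
    speech_starts = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        energy = sum(abs(s) for s in frame) / len(frame)
        if energy > threshold:
            speech_starts.append(start)
    if not speech_starts:
        return None
    end = min(speech_starts[-1] + frame_size, len(samples))
    return speech_starts[0], end


# Hypothetical signal: silence, a burst of sound, then silence again.
signal = [0.0] * 8 + [0.5, -0.6, 0.4, -0.5] * 2 + [0.0] * 4
print(find_utterance(signal))
```

With a noisy microphone the silent frames would not be exactly zero, which is one reason the threshold has to be tuned to the environment.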

How Recognizers Work:
Recognition systems can be broken down into two main types. Pattern Recognition systems compare patterns to known/trained patterns to determine a match. Acoustic Phonetic systems use knowledge of the human body (speech production, and hearing) to compare speech features (phonetics such as vowel sounds). Most modern systems focus on the pattern recognition approach because it combines nicely with current computing techniques and tends to have higher accuracy.
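The pattern recognition idea can be sketched in a few lines: store one feature vector per known word and pick the stored pattern closest to the incoming one. The feature vectors, the `closest_template` name, and the use of plain Euclidean distance are assumptions for this example only. Production systems use trained statistical models rather than a single template per word.

```python
import math


def closest_template(features, templates):
    """Return the label of the stored template nearest to `features`."""
    best_label, best_distance = None, math.inf
    for label, template in templates.items():
        # Euclidean distance between the incoming and the stored pattern.
        distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(features, template)))
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label


# Hypothetical trained patterns, one made-up feature vector per word.
templates = {
    "yes": [0.9, 0.1, 0.3],
    "no": [0.2, 0.8, 0.5],
}
print(closest_template([0.8, 0.2, 0.3], templates))
```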

About this program:

I was inspired by the Iron Man movies, especially the Jarvis software. I have used the Bing speech recognition module, and I developed the program in Python because it is easy to build in Python.

Tasks performed by the program:

The program can play music when you say "play music". It then gives you two options: play all songs, or play a specific song.
The program can send mail without you touching the computer. You say three things: "send mail", the recipient's email address, and the message. It then sends the message to that email address.
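One way to connect recognized text to tasks like these is a small lookup table of trigger phrases. This is only a sketch: the handler functions below are placeholders that return a string, and the names `COMMANDS` and `dispatch` are assumptions, not the code of the actual program.

```python
def play_music(text):
    # Placeholder: the real handler would ask "all songs or a specific song?".
    return "playing music"


def send_mail(text):
    # Placeholder: the real handler would ask for the address and the message.
    return "sending mail"


# Trigger phrase -> handler function.
COMMANDS = {
    "play music": play_music,
    "send mail": send_mail,
}


def dispatch(text):
    """Run the first command whose trigger phrase occurs in the recognized text."""
    normalized = text.lower().strip()
    for phrase, handler in COMMANDS.items():
        if phrase in normalized:
            return handler(normalized)
    return None  # nothing matched


print(dispatch("Play music please"))
```

Keeping the phrases in a dictionary means a new voice command only needs one new entry and one new function.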

Requirements:

This is a Python speech recognition program. It needs two modules:

1:- SpeechRecognition

2:- pyaudio

A good-quality headphone is also required.

You can install the modules by using these commands:

    pip install pyaudio

    pip install SpeechRecognition

You can also install them manually by downloading the package file.

Speech Recognition Code:

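    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("say!")
        # Wait up to 1 second for speech to start, then record at most 5 seconds.
        audio = r.listen(source, timeout=1, phrase_time_limit=5)
        print("recognizing")

    BING_KEY = ""  # put your Bing key here
    try:
        msg = r.recognize_bing(audio, key=BING_KEY)
        print(msg)
    except sr.UnknownValueError:
        # The service responded but could not make out any words.
        print("Microsoft Bing Voice Recognition could not understand audio")
    except sr.RequestError as e:
        # The request itself failed, for example no connection or a bad key.
        print("check internet or validation: {0}".format(e))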

This program is for Windows; if you use the Google API there, it produces an error. I also want to tell you that the Bing API is better than the Google API. The Bing API needs a Bing key, which you can get from this website: https://azure.microsoft.com/en-in/services/cognitive-services/speech/
You have to log in, and then they give you a key. If you are making this program on Linux, you can use the Google API instead with msg = r.recognize_google(audio), and then you do not need any key.
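The choice between the two APIs can be wrapped in one small helper. This is a sketch under assumptions: the `transcribe` name is made up for this example, and it simply uses Bing when a key is supplied and Google otherwise. It calls only `recognize_bing` and `recognize_google`, the two recognizer methods already used in this post.

```python
def transcribe(recognizer, audio, bing_key=""):
    """Return the recognized text, using Bing if a key is given, else Google.

    `recognizer` is expected to be an sr.Recognizer() instance and `audio`
    the value returned by its listen() call.
    """
    if bing_key:
        return recognizer.recognize_bing(audio, key=bing_key)
    return recognizer.recognize_google(audio)
```

With the names from the code above, this would be called as msg = transcribe(r, audio, BING_KEY), inside the same try block so that the existing error handling still applies.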

If you run into any error, leave a comment or contact me on a social network.