Voice Recognition: More Than One Person To Respond To.

#1

I was wondering how to set up voice recognition to respond to more than just my voice. I want my two boys to be recognized as well. My problem is that they are 5 and 7 years old and would not be able to get through the Microsoft setup. I also don't know whether more than one voice can be set up for recognition. Any help would be appreciated. Thanks.


#2

As I understand it, voice recognition has always had problems recognizing women and children. Deep voices are no problem.

#3

What you are asking for is, I believe, beyond Microsoft's SAPI, which is what EZ-Builder uses. It may be possible using Dragon NaturallySpeaking (DNS), but recognition of different people may still not work, and more work would be needed in DNS so that it controls EZ-Builder (by default it will not, and installing it doesn't replace SAPI).

I have only seen one example of speech recognition recognising two different people, and even then I have doubts about whether it was real or just scripted.

I would welcome someone telling me I am wrong. I want to be wrong, but unfortunately I don't think I am.

#4

MS SAPI will do some amount of speaker-independent recognition (i.e., it will recognize speech without training, just not as well as with training).

It is better with multi-syllable words than with short ones like "yes" and "no", so if you have scripted commands and teach your kids to clearly enunciate a few key commands, it may be able to recognize them without training.

Alan

#5

I should elaborate a bit more on my reply. I assumed you wanted Lexi to know who was talking as well as what was said.

Multiple users can use voice recognition, but the software can't tell who is speaking, if that's what you were after.

If not, every user will need to speak clearly and, when possible, carry out training so Windows SAPI can understand them better. My nephew can control Jarvis, but only with some commands; most are recognized very inaccurately.

#6

I've had some success with multiple speakers getting a proper response from my B9 robot, which uses Windows SAPI through EZ-Builder. It's hit and miss though. I just tell them the robot has an attitude that day and doesn't want to respond to a lot of people. The closer a voice is to the one trained in Windows SAPI (mine), the better the response. I'm also using a multidirectional mic called the Blue Snowball that seems to do a good job picking up commands from around the room. I can also set it to a directional setting, but I'm still playing with it a little to see which is better. I won't really know until I get B9 into his final resting place in my home.

Rich, you mentioned Dragon. I know integrating DNS into EZ-Builder somehow was being discussed a few months ago. I've been busy with summer and family-related tasks for months and away from robot building and this forum. Have you guys found a way to use that excellent program with our EZ-Builder voice recognition?

Dave Schulpius

#7

No, I didn't really get started on DNS and EZ-Builder, and it hasn't moved on since then. In fact, I've since updated my PC and haven't reinstalled DNS yet. It is something I want to get working with EZ-Builder, and from the (minimal) information I've read on using DNS it should be pretty straightforward. The main thing holding me back is that my Windows VR has been trained every day for over 2 years. It receives constant training through Vox Commando, so its accuracy is now close to (although not quite on par with) DNS out of the box (yes, DNS is better with no training than Windows with 2 years of training).

Voice recognition is always going to be difficult, and the most annoying part of any voice-controlled robot, for these reasons. If you think about how it works, it doesn't actually hear the words like we do; it matches the sound waves against what it knows. These vary from person to person, so if two people with different accents and different voices say the same thing, the sound will be different and the PC will not match the voice with the word.

You can have multiple voice profiles with Windows. This could be of use, but you would need a way to change from one to the other. It's possible via the Windows control panel, but I haven't a clue how to do it any other way. If support for multiple users is needed, it would probably pay to look into some dedicated Windows voice recognition forums with people who have been using it and mastered it for years. I'm sure Google can throw up a few sites in a search.

#8

@rich What I want her to do is respond to commands from my kids as well as myself.

#9

In that case, yes you can do that but it will need training to understand them, and the more people who train under the same profile the less accurate it will become.

You will find it difficult for any VR software to understand kids for many reasons: not only the limitations of the software but the limitations of the kids' speech too. Unless your kids speak like news anchors.

My nephews absolutely love Jarvis and Melvin (Jarvis mainly because of Iron Man). The accuracy, even after some training, is probably around the 10% mark. Their behaviour is a problem too, as mentioned before. For instance, they aren't exactly clear, so the command doesn't get picked up, so they start shouting, talking faster, shouting command after command after command...

My eldest nephew is getting better and has now realised he needs to speak clearer and slower (which is a problem for him as he has some speech problems, but this is helping him more than anything else IMO).

There is one thing that you could do to help. You can train individual words or phrases in Windows Speech. Via the speech recognition application in Windows (not the training or control panel, but the recognition application itself) you can add words and phrases. It would probably work wonders to have each required phrase set up this way rather than training in the conventional sense. After all, EZ-Builder works from a phrase list rather than dictation.
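
To illustrate why a phrase list is more forgiving than dictation, here is a minimal Python sketch. It does not use SAPI or EZ-Builder at all. It only assumes that some recogniser has already produced a rough text guess, and it snaps that guess to the nearest entry in a short command list. The phrases, the `match_phrase` helper and the 0.6 cutoff are all hypothetical choices made for the example.

```python
import difflib

# Hypothetical command list. Longer, multi-syllable phrases are easier
# to tell apart than short words such as "yes" and "no".
PHRASES = ["robot move forward", "robot turn around", "robot say hello"]


def match_phrase(heard, phrases=PHRASES, cutoff=0.6):
    """Return the known phrase closest to `heard`, or None if none is close."""
    guess = heard.lower().strip()
    # get_close_matches ranks candidates by similarity ratio and drops
    # anything scoring below the cutoff.
    matches = difflib.get_close_matches(guess, phrases, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    print(match_phrase("robot move forwad"))    # slightly garbled command
    print(match_phrase("what is the weather"))  # not in the phrase list
```

A slightly garbled command still lands on the intended phrase, while unrelated speech is rejected. Real recognition engines compare audio rather than text, so this only shows the general idea of matching against a small fixed list.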


#10

We have multiple users with our Ai core, but it uses DNS. When the Ai recognises a face, it switches the user profile over to that person. This works quite well.

As you continue to use DNS, it continually improves the phonetic model built up from the (profile) user's voice, so it just gets better and more accurate the longer you use it. I am not sure, but the Microsoft speech recognition engine 8 probably works in a similar way. The downside is that if you let multiple users share one profile, the phonetic model will get well screwed up, which may affect accuracy.

I have it on my development milestones to link DNS11 into EZ-Builder after I have the EZ:2 robot fully built. I have to do this because I want our Ai core to work with EZ-Builder, which will allow EZ-B robots to have the general conversation mode.
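
The idea of one profile per person, selected by whoever the camera recognises, can be sketched roughly as follows in Python. This is not the Ai core's code and it calls no DNS or Windows API. The `ProfileSwitcher` class, the face labels and the profile names are all hypothetical, and the step that would actually tell the speech engine to load a profile is left out.

```python
class ProfileSwitcher:
    """Keeps one recognition profile per person and tracks the active one."""

    def __init__(self, profiles, default):
        # Hypothetical mapping: face label -> speech profile name.
        self.profiles = dict(profiles)
        self.default = default
        self.active = default

    def on_face_recognised(self, face_label):
        # Unknown faces fall back to the default profile, so a stranger's
        # voice does not end up training someone else's phonetic model.
        self.active = self.profiles.get(face_label, self.default)
        return self.active


if __name__ == "__main__":
    switcher = ProfileSwitcher(
        {"dad": "profile_dad", "son_7": "profile_son_7"},
        default="profile_guest",
    )
    print(switcher.on_face_recognised("son_7"))
    print(switcher.on_face_recognised("visitor"))
```

Keeping each speaker in a separate profile is what avoids the shared-profile problem described above. How the switch is actually applied depends on the speech engine in use.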