Convert Speech To Text (not Text To Speech), Possible?

#1

I am just throwing around some new ideas I have... One of them is that I am trying to figure out if there is any way to use WriteFile and ReadFile in EZ-Builder to create a database for my robot to learn new phrases and commands on his own... So instead of just using a list of typical canned responses to questions and phrases already programmed, I would like my robot to store new commands and phrases he has never heard before... Then store an answer or response to mirror the question or phrase just spoken... This is so that the next time he hears the phrase or command he will know how to respond accordingly... i.e. learning as he goes... If this is not possible yet in EZ-Builder... maybe a new feature possibility?

Any comments or suggestions welcomed...

Cheers
Richard
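
To make the idea above concrete, here is a minimal sketch of a "learn as you go" phrase table. It is written in Python rather than EZ-Script, it does not use EZ-Builder's WriteFile or ReadFile, and the names PhraseStore, normalize, lookup and learn are all hypothetical. It assumes the phrase has already been turned into text by some speech recognizer, which is the hard part discussed in the replies.

```python
import json
from pathlib import Path
from typing import Dict, Optional


def normalize(phrase: str) -> str:
    """Lower-case and collapse whitespace so small variations still match."""
    return " ".join(phrase.lower().split())


class PhraseStore:
    """A tiny phrase -> response table persisted as a JSON file (hypothetical helper)."""

    def __init__(self, path: str) -> None:
        self.path = Path(path)
        self.responses: Dict[str, str] = {}
        # Reload anything learned in an earlier session.
        if self.path.exists():
            self.responses = json.loads(self.path.read_text(encoding="utf-8"))

    def lookup(self, phrase: str) -> Optional[str]:
        """Return the stored response, or None if the phrase is new."""
        return self.responses.get(normalize(phrase))

    def learn(self, phrase: str, response: str) -> None:
        """Store a response for a phrase and write the whole table back to disk."""
        self.responses[normalize(phrase)] = response
        self.path.write_text(json.dumps(self.responses, indent=2), encoding="utf-8")
```

A robot loop would call lookup() on each recognized phrase and, when it returns None, ask for a response and pass it to learn(). Rewriting the whole file on every learn() is fine for a few hundred phrases; a larger table would probably want a real database.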

#2

Yes but not with EZ-Builder.

I've been toying with posting a tutorial for some other software which will do this, so I'll get on to that later.

There are problems though: accuracy is low at best.

I'll post more when I'm not on my phone.

#3

Sweet, that would be awesome Rich.... I am wondering if it would ever be possible to add this to EZ-Builder as a feature at some point? If DJ could somehow accomplish this, it would be seriously brilliant...

Cheers and thanks again Rich

#4

I don't think it would be difficult to get into EZ-Builder, but the results are very poor, which is likely why it isn't part of EZ-Builder.

I tend to use payload lists or set phrases despite having the ability to speak freely with Jarvis, because the accuracy is so much higher. For instance, I have one command programmed for grocery shopping lists: I can speak an item that he knows, e.g. Bacon, and accuracy is 98%+, or I can add a new item, but accuracy is down around the 75-80% mark at best. The payload list takes priority.

I should have mentioned, the software I use isn't free. It's affordable and cheaper than DNS (which is another thing that works well but is costly), but you need to shell out a little bit. There's a free trial available to test it out anyway, so no big loss :)

Details coming when I get home and have a chance to explain it all.

#5

It should be possible for DJ to interface with Dragon NaturallySpeaking, which is the market leader in speech recognition. They do have APIs for 3rd party interfaces. It used to be easier because they would tie right into the Microsoft SAPI, but that was when speech reco didn't come standard in Windows.

Alan

#6

Rich is right.

There are 2 modes in modern speech recognition: "grammar mode" and "dictation mode". EZ-Builder uses grammar mode, which basically compares the incoming phrase with pre-programmed phrases from your script etc. To do what you are suggesting you need to use dictation mode, where you can say any phrase and the SR engine tries to work out what it is.

Grammar mode is very accurate as it only has to compare phrases. Dictation mode is not, and its accuracy depends on how good the SR engine is and how well the system knows the user's phonetic profile.
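
As a rough illustration of why a closed phrase list is easier, here is a small Python sketch of the "compare against pre-programmed phrases" idea. This is not how the Windows SR engine or EZ-Builder works internally (a real engine compares audio, not text); it only assumes we already have a noisy text guess and uses Python's difflib to pick the closest known phrase. The function name grammar_match and the 0.75 cutoff are made up for the example.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


def grammar_match(
    heard: str, grammar: List[str], cutoff: float = 0.75
) -> Optional[Tuple[str, float]]:
    """Return (best phrase, similarity) from a fixed list, or None if nothing is close."""
    best_phrase: Optional[str] = None
    best_score = 0.0
    for phrase in grammar:
        # ratio() is 1.0 for identical strings and approaches 0.0 for unrelated ones.
        score = SequenceMatcher(None, heard.lower(), phrase.lower()).ratio()
        if score > best_score:
            best_phrase, best_score = phrase, score
    if best_phrase is not None and best_score >= cutoff:
        return best_phrase, best_score
    # Nothing in the list is close enough: this is where dictation mode would be needed.
    return None
```

With a list such as ["robot move forward", "robot stop", "what time is it"], a slightly garbled "robot stopp" still resolves to "robot stop", while a phrase that is nowhere near the list returns None. Dictation mode has no such list to fall back on, so every word has to be right on its own.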

EZ-Builder uses the internal Windows SR engine, which works great for grammar mode, but I (personally) have never had much success with it in dictation mode, even after a lot of training. I have done what you are suggesting with our AI core (ARIEL), but I had to use the Dragon (DNS11) engine, as this works great for me in dictation mode.

Tony

#7

Pandorabot kind of works as a reference, but mostly I get "low confidence" phrases being spit out... thanks guys for the ideas...

#8

@Toymaker... @Alan... thanks also.

#9

Yes, Pandorabot (I assume) uses dictation mode rather than grammar mode. If you aren't satisfied with the results from that, then really it's going to be a case of training and upgrading hardware (i.e. mics). No amount of programming will alter that, and you would find the method I mentioned earlier gives very poor results too.

I've been training the same voice profile for 4 years now. It is constantly learning every time I speak a command. It has only just, over the last 6 months or so, started to give 90%+ positive results. Prior to this it was 80-90%. I have a required confidence level of 94% (otherwise the TV or even Jarvis himself will be picked up and it ends up in a never-ending loop).
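
The required confidence level can be pictured as a simple gate in front of the command handler. The sketch below is only an illustration in Python: the Hypothesis class and accept function are hypothetical names, it assumes the recognizer reports a confidence between 0.0 and 1.0, and the 0.94 default simply mirrors the 94% figure above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hypothesis:
    """One recognition result: the text heard and the engine's confidence (0.0 to 1.0)."""
    text: str
    confidence: float


def accept(hypothesis: Hypothesis, required: float = 0.94) -> Optional[str]:
    """Return the recognized text only if it clears the required confidence."""
    if hypothesis.confidence >= required:
        return hypothesis.text
    # Below the bar: treat it as background noise (the TV, the robot's own voice) and ignore it.
    return None
```

A high bar like this means some genuine commands get ignored and have to be repeated, which is the trade-off for not reacting to stray audio.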

DNS works much better; however, I didn't have time to change over to it when I was trying it out last year. It's something I may move over to, but if I do I will need to rework 4 years of work done with my current setup.

#10

It's something to consider now... I'll have to rethink what I want to do then.