Cortana Speech Recognition Integration With EZ-Builder

 
#1

We have managed to integrate Cortana's unlimited-vocabulary speech recognition (the same as dictation mode) with EZ-Builder, which means we can now input any speech into EZ-Builder at no cost, unlike Dragon. We are using the HTTP custom server, as can be seen in the screendump below.

User-inserted image
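
A minimal sketch of the general idea, in Python: the recognized text is URL-encoded and sent as an HTTP request to a listening server. The address, path and `text` parameter used here are assumptions for illustration only, not the actual EZ-Builder HTTP server interface, and `build_speech_url` and `send_speech` are hypothetical helper names.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_speech_url(base_url: str, text: str) -> str:
    """Return base_url with the recognized text as a URL-encoded query parameter.

    The parameter name "text" is an assumption made for this sketch.
    """
    return base_url + "?" + urlencode({"text": text})


def send_speech(base_url: str, text: str, timeout: float = 5.0) -> int:
    """Send the recognized text with an HTTP GET and return the status code."""
    with urlopen(build_speech_url(base_url, text), timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical address; replace with whatever the receiving server exposes.
    print(build_speech_url("http://localhost:8080/speech", "hello robot"))
```

URL-encoding matters here because dictated speech can contain spaces, question marks and other characters that are not safe in a query string.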

We added a Pandorabot to EZ-Builder (to use it with unlimited speech), but it always seems to use its own (default MS) speech recognition, and we can't see a way to send our speech string into the Pandorabot.

Can DJ or anyone advise how we may do this - running Pandorabot with reliable/accurate unlimited speech recognition would be very neat!

Thanks in advance for any help.

Tony

#2

Exciting!! I hope someone can come up with something!

#3

There's a ControlCommand() to send text. ControlCommand() can be viewed for each control in the Cheat Sheet.

What does it do? Will you be making this into a plugin for easier distribution and integration?

#4

DJ, at the moment we are just trying to see if we can use Cortana as a speech recognition input for EZ-Builder, giving us a dictation-type mode so that we are not stuck with grammar (limited vocabulary) mode. My plan is to integrate with Pandorabot, where we can ask any question rather than only the pre-defined ones that grammar mode forces. I have been playing with Cortana speech recognition for a few days now and get around 99% accuracy. It is surprisingly good!

Can you give me a bit more info (code snippet etc.) on the send-text ControlCommand() required?
As always thanks for your help here.

Tony

#5

According to Microsoft's documentation, Cortana uses the same speech engine as Windows speech recognition. Strange that you would get different results.

The ControlCommand() syntax for any control can be viewed in the Cheat Sheet tab when editing a script. Add the Pandorabot control and, when editing a script, check the Cheat Sheet tab for examples.

Here is a link that explains the Cheat Sheet and EZ-Script editor: http://www.ez-robot.com/Tutorials/Lesson/23?courseId=6

#6

DJ, I think you may be mistaken. Cortana's speech recognition is cloud-based, not PC-based, and I believe it is a much better SR engine. My reasoning is detailed below.

"Cortana’s speech recognition is actually a cloud-based system, where blocks of speech are submitted to the cloud for translation"

The above is referenced here

http://www.develop-online.net/tools-and-tech/how-windows-10-and-cortana-are-bringing-speech-recognition-to-games/0215391

Cortana also passes speech through an NLP (natural language processing) filter, which would presumably improve the SR engine output.

"The natural language processing capabilities of Cortana are derived from Tellme Networks (bought by Microsoft in 2007)" from Wikipedia

Cortana has to have a better SR engine, as the following comparison shows:

Talking to the Pandorabot via EZ-Builder (PC-based SR), I get about 80% accuracy, and it also hears itself, which causes false recognitions. If I disconnect from the net, SR continues to work, which shows that it is the internal SR (in my opinion not very good at dictation), based on a derivative of the Microsoft SR engine 6.1.

Talking to Cortana through my app yields 99% accuracy (similar to the accuracy from Dragon). If I disconnect from the net, SR stops working, which shows that it's cloud-based.

My Cortana-based SR also waits for its name, which is important to stop false recognitions when interacting with the Pandorabot.
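
The name-gating idea can be sketched roughly as follows in Python. This is only an illustration of the technique, not the code in my app: the wake name "robot" and the function name `extract_command` are assumptions made for the example. Text is passed on only when the transcript starts with the name, so anything the microphone picks up without the name (including the bot's own voice) is dropped.

```python
import re
from typing import Optional


def extract_command(transcript: str, wake_word: str = "robot") -> Optional[str]:
    """Return the text following the wake word, or None if it is absent.

    The wake word must be the first whole word of the transcript
    (case-insensitive). Punctuation right after it is skipped.
    """
    pattern = r"^\s*" + re.escape(wake_word) + r"\b[\s,.!?:]*(.*)$"
    match = re.match(pattern, transcript, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        return None  # Name not heard first: ignore the whole utterance.
    command = match.group(1).strip()
    return command or None  # Name alone carries no command.


if __name__ == "__main__":
    for heard in ("Robot, what is the capital of France?", "what time is it"):
        print(repr(heard), "->", repr(extract_command(heard)))
```

Matching on a whole-word boundary means a word that merely starts with the name (for example "robotics") does not trigger the bot.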

I may be wrong here, but then I cannot explain the huge difference in performance that I am seeing.

Tony

#7

Yes, Cortana is 100% cloud-based and runs through a service on the computer that handles this. The service is quite a pain to turn off if you don't want it running and consuming resources on the computer. The service is quite bloated, but Cortana does work fairly well without training. It is also free to use if you upgraded to Windows 10. If you didn't, the upgrade to Windows 10 is no longer available for free. Also, if you have an older Mac running Boot Camp, Windows 10 isn't an option.

#8

That's what I originally thought, but some Microsoft documentation had led me astray some time ago. This page: https://msdn.microsoft.com/cortana/getstarted

Says this...

Quote:


Windows speech

Windows speech is a set of UWP APIs that enable both speech recognition and speech synthesis across multiple languages on all Windows-10 based devices, including IoT hardware, phones, tablets, and PCs.

Cortana on Windows uses these speech APIs.



Perhaps what they are failing to say correctly is Cortana uses the speech API for synthesis, not recognition.

Lastly, if you want the Pandorabot to be disabled from listening to voice commands, simply pause the control with the checkbox. View the available ControlCommand()s using the Cheat Sheet, as previously stated.

#9

Interesting that Cortana uses TellMe. I thought Microsoft had sold them. Maybe they just sold the commercial IVR business and kept the technology. I'll need to do some research (in my last job I did some work with M$/TellMe on a partnership that fell apart shortly after I was laid off).

Alan

#10

David, yes when running my Cortana app the resources are up a bit.

With my Cortana app (and EZ-Builder) running I get 3 to 8% CPU

With it not running I get 1 to 4% CPU

This is on an i3 Windows 10 Acer micro desktop PC

The main thing is that I seem to be getting SR (dictation) accuracy that is close to Dragon, and it's free. The ability to say anything into EZ-Builder has some good applications and gets it away from having to use pre-defined grammars.

Alan, the TellMe info came from this Wikipedia page:

https://en.wikipedia.org/wiki/Cortana_(software)

Tony