Text To Speech Ideas?

Published April 18, 2006 5:01 AM | William
I've been getting back into my Speech Server book and, well, there's a lot to it.  My 4th chapter covers Grammars in depth and is a lot longer than I anticipated. If you do much speech work, then you know that Grammars are everything and arguably the most important chunk of development.

Anyway, I was proofing my Text To Speech chapter and one of the examples I came up with was, well, fairly cool. Using the TTS engine is pretty straightforward: you really just need to call the Speak method of the SpVoice object, passing in a String and some flags. There's not much more to it than that, so I was looking for a cool angle.  There are many applications that will read the text on a page, but I just created an implementation that lets you specify a URL; it retrieves the text and reads it back to you.

As I was writing it, I started to use my newfound Regex expertise to strip out the HTML, and then I realized that wasn't a good idea. Why?  Because without the markup, it's hard to differentiate between the elements on a blog or a web page. So what I need to do now is check for blog entries and, instead of stripping out the tags, replace them with recognizable text. That way they will be read aloud (the closing tags probably don't need to be) and I can tell the difference between blog titles, entries and the like.

In its simplest form (without any markup handling), here's what the code looks like.  I'll be posting the markup-handling version later, so you can have a simple blog reader that can read to you while you multi-task on something else:

// Requires: using System; using System.IO; using System.Net; using System.Text; using SpeechLib;
private void btnSpeak_Click(object sender, EventArgs e)
{
     // Speak asynchronously so the UI thread is not blocked while the page is read.
     SpeechVoiceSpeakFlags SpFlags = SpeechVoiceSpeakFlags.SVSFlagsAsync;
     SpVoice VoiceDemo = new SpVoice();
     WebRequest DemoWebRequest = WebRequest.Create("http://msmvps.com/WilliamRyan");
     String Response;
     // The using blocks close the response and the reader even if the download throws.
     using (WebResponse DemoWebResponse = DemoWebRequest.GetResponse())
     using (Stream ReceiveStream = DemoWebResponse.GetResponseStream())
     using (StreamReader readStream = new StreamReader(ReceiveStream, Encoding.UTF8))
     {
          Response = readStream.ReadToEnd();
     }
     // Hand the raw page text (markup included) to the TTS engine.
     VoiceDemo.Speak(Response, SpFlags);
}
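
The idea of replacing tags with recognizable text instead of stripping them could be sketched roughly like this. It is only a sketch under assumptions: the class name MarkupToSpeech, the cue table and the choice of tags are all hypothetical, WebUtility.HtmlDecode assumes .NET 4.0 or later, and regular expressions will not cope with every badly formed page.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class MarkupToSpeech
{
    // Hypothetical cue table: opening tag name -> text the voice should say instead.
    private static readonly Dictionary<string, string> Cues =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "h1", " Title: " },
            { "h2", " Entry title: " },
            { "li", " Item: " }
        };

    public static string ToSpeakableText(string html)
    {
        // Remove comments plus script and style blocks; none of that should be read aloud.
        string text = Regex.Replace(html, @"<!--.*?-->", " ", RegexOptions.Singleline);
        text = Regex.Replace(text, @"<(script|style)\b[^>]*>.*?</\1\s*>", " ",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        // Swap known opening tags for their spoken cue; leave everything else for the next pass.
        text = Regex.Replace(text, @"<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>", m =>
        {
            string cue;
            if (Cues.TryGetValue(m.Groups[1].Value, out cue))
                return cue;
            return m.Value;
        });

        // Any tag still present (closing tags, unknown tags, doctype) becomes a space.
        text = Regex.Replace(text, @"<[^>]*>", " ");

        // Decode entities such as &amp; and collapse runs of whitespace.
        text = WebUtility.HtmlDecode(text);
        return Regex.Replace(text, @"\s+", " ").Trim();
    }
}
```

The returned string could then be passed to Speak in place of the raw response. Closing tags are simply dropped here, in line with the hunch that they don't need to be read.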


This application also lets you point to a blog, have it read the entries, and save them to a voice file.  I tried it with a few of my favorite blogs and it's pretty decent. It has a little trouble pronouncing "Pr4n" and "a55", but that can be handled with Regular Expressions (hell, or even simple String.Replace calls). The only problem is that it doesn't lend itself well to material you want to publish.
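
A pronunciation fix-up pass of that kind might look something like the following. This is a minimal sketch, not the book's code: the SpeechTextCleaner class and the example tokens ("l33t", "w00t") are made up for illustration, and it assumes every token starts and ends with a letter or digit so that the \b word boundaries behave.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SpeechTextCleaner
{
    // Replaces each written token with the text the voice should say instead.
    // Assumption: keys begin and end with a letter or digit, so \b anchors work.
    public static string Normalize(string text, IDictionary<string, string> substitutions)
    {
        foreach (KeyValuePair<string, string> pair in substitutions)
        {
            string spoken = pair.Value;
            // Regex.Escape guards against special characters in the token itself.
            string pattern = @"\b" + Regex.Escape(pair.Key) + @"\b";
            // A MatchEvaluator is used so "$" in the spoken text is never treated as a substitution.
            text = Regex.Replace(text, pattern, m => spoken, RegexOptions.IgnoreCase);
        }
        return text;
    }
}
```

Calling Normalize on the downloaded text before handing it to Speak keeps the substitution table in one place, so new problem words can be added without touching the reader itself.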

In case you're interested, the next part of this application that I go through in Chapter 4 uses voice input to recognize what you ask for and retrieve just that. So you can ask it to read a specific blog, skip entries, repeat items it has just read and, well, anything else I can think of.  Each of these demonstrates Grammars in use and, well, they are amazingly powerful.  I've also started working on a Dual Tone Multi Frequency (DTMF) application that lets you have stuff read to you over the phone, so you can use Touch Tones to read blogs, select new ones, repeat what it read and the like.  This is giving me a little more trouble than I was expecting, but I'm getting close.

So what do you think? Am I on the right track with the idea of replacing markup tags with clearly recognizable ones, just so that it makes a better 'being read to' experience?

Comments

# Brian Madsen said on April 18, 2006 5:50 AM:

Hey Bill..

i've been attempting to strip content off sites for yonks and generally give up due to bad HTML formatting on the sites i'm scouring.

but, for blogs, why don't you just access the RSS feed or the Atom feed?

wouldn't that be generally easier than attempting to locate the actual post within the HTML?

# Marshall Harrison said on April 18, 2006 6:39 AM:

Sounds like an interesting project. I'm also curious why you don't just grab the RSS. I would think that it is more standard and should have less formatting issues for you.

# William said on April 18, 2006 7:47 AM:

Thanks guys. I was going after the HTML at first so I could use it as a general reader of web pages but it would probably make sense to just detect if it's RSS and parse it differently. It will have some branch logic but it'd probably be a lot easier. Will give it a try tonight.

Thanks!
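
The branch logic discussed above (detect whether the response is RSS and parse it differently) might be sketched like this. It assumes an RSS 2.0 feed with a root rss element and item entries carrying title and description; the FeedReader class and TryReadRss method are hypothetical names, and Atom feeds are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.Linq;

public static class FeedReader
{
    // Returns true and fills entries when content is an RSS 2.0 document;
    // returns false so the caller can fall back to treating it as an HTML page.
    public static bool TryReadRss(string content, out List<string> entries)
    {
        entries = new List<string>();
        XDocument doc;
        try
        {
            doc = XDocument.Parse(content);
        }
        catch (XmlException)
        {
            return false; // Not well-formed XML, so most likely an ordinary HTML page.
        }

        if (doc.Root.Name.LocalName != "rss")
            return false; // Well-formed XML, but not RSS 2.0 (XHTML or Atom, for example).

        foreach (XElement item in doc.Root.Descendants("item"))
        {
            // A missing title or description casts to null and concatenates as empty text.
            entries.Add((string)item.Element("title") + ". " + (string)item.Element("description"));
        }
        return true;
    }
}
```

Descriptions in real feeds often contain escaped HTML, so each entry would probably still need markup handling before being spoken.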

# Brian Madsen said on April 19, 2006 9:05 AM:

Hey Bill,

knowing less than nothing about speech server et al i was wondering what's required to run one?

sounds like an interesting project and i always like to get my hands on new toys - especially since we're decommissioning a ton of servers at work and i'm in line to get hold of quite a few of them

# William said on April 20, 2006 8:49 AM:

Basically you need an install file and a computer. If you want to do telephony, you need a phone board. I got an Intel Dialogic for under $200.00 on eBay (granted, I'm just one dork in a house and I'm not running FedEx's automated phone system with my $180.00 board or anything, but a cheapo board is all you need for testing purposes).

Download the Speech Application SDK (SASDK) 1.1 and off you go. It's more addictive than BizTalk or even crack.

# Marshall Harrison said on April 24, 2006 8:12 AM:

Brian,

Come on over to GotSpeech.Net and sign-up for the free Brooktrout Starter Kit we are giving away. Just follow the link on the right-hand side of our home page. If you win you'll have everything you need to start messing around with Speech Server.

Hey Bill - Jonathan Hassell emailed me.

# Brian Madsen said on April 27, 2006 10:58 AM:

Brilliant Marshall, thanks for that!! heading over there now.

didn't quite see the comment box here - seems that Billy here has gotten himself some lucky comment-spam :)
