Progress on the SSMLWriter and lack thereof on the Silverlight project
So as you can probably tell, any post that doesn't include something about Silverlight probably means that I'm procrastinating on it. Yep. Every time I start on it, I try to dephuk it and end up doing the exact opposite. Anyway, I decided that using the StringBuilder approach to create an SSML string to pass into the SpeechSynthesizer.SpeakSsml method was totally lame, even if it was just a temporary hack. So off I went to write an SSML writer class. My first pass used XmlWriter as a base class. Problem was that with it, you get the keys to the kingdom, so to speak. That means you can override damn near everything in the class. Or I guess you could just override the few things that you need for SSML, and Voice in particular. Any real SSMLWriter library would be worth doing well, and doing it well would take more than the 4 hours I had allocated to screwing around today. The more I looked at it, the more it looked like a great project for tomorrow when Kim and Sarah are at the Aquarium. I pretty much know where I want to go with it, but before going off half-cocked, I need to spend a little more time learning the latest SSML spec (which means knowing more than 40% of the tags without having to look them up, which is barely the case now).
So the idea is to create a buttload of properties that let you manipulate the SpeechSynthesizer, its VoiceInfo property and, in particular, the VoiceInfo properties that you can't really set now. SpeechSynthesizer can't be used as a base class (no surprise there) and, as far as I can tell, there's no way to subclass VoiceInfo in any meaningful way to solve this problem. To be perfectly candid though, I haven't spent enough time looking into it. Hence, I think the best way to handle things is to create an SSMLWriter class that gives you all of the properties you'd want to set. From there, you just deal with Strings and an intuitive set of properties, and return the .ToString() implementation of it. I need to think on it a little more, but from what I can tell, that's probably the best way to go. Just like the XmlWriter abstracts away things like namespaces, declarations, etc. so you don't have to remember each one of them and, more importantly, don't screw things up with a simple typo, this class will hopefully abstract as much of the SSML spec as possible. The spec isn't small, so I think it's going to have to be done in steps. But that's the good part: so far, most of it looks very straightforward.
Here's a quick sample of how it'd work:
public String BuildString()
{
    StringBuilder sb = new StringBuilder();
    this.SSMLWriterInstance = XmlWriter.Create(sb, this.Settings);
    this.SSMLWriterInstance.WriteStartDocument();
    // Root <speak> element in the SSML namespace
    this.SSMLWriterInstance.WriteStartElement("speak", "http://www.w3.org/2001/10/synthesis");
    this.SSMLWriterInstance.WriteAttributeString("version", "1.0");
    this.SSMLWriterInstance.WriteAttributeString("xmlns", "xsi", null, "http://www.w3.org/2001/XMLSchema-instance");
    this.SSMLWriterInstance.WriteAttributeString("xsi", "schemaLocation", null, "http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd");
    this.SSMLWriterInstance.WriteAttributeString("xml", "lang", null, CultureRegion.US);
    // <voice> element wrapping the text to speak
    this.SSMLWriterInstance.WriteStartElement("voice");
    this.SSMLWriterInstance.WriteAttributeString("gender", this.GenderType.Male);
    //TextToSpeak = Kuz the boys in the hood are always hard, come talking that trash we'll pull your card, knowing nothing in life but to be legit, don't quote me boy kuz I ain't said
    this.SSMLWriterInstance.WriteString(TextToSpeak);
    this.SSMLWriterInstance.WriteEndElement(); // closes voice
    this.SSMLWriterInstance.WriteEndElement(); // closes speak
    this.SSMLWriterInstance.Flush();
    return sb.ToString();
}
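To make the "properties in, string out" idea a bit more concrete, here is a minimal sketch of what such a wrapper might look like. Everything named here (SsmlWriterSketch, Language, Gender, Age, Text) is hypothetical and not the real class; it only covers the voice element's gender and age attributes, and it uses plain strings rather than the System.Speech enums so it stands on its own.

```csharp
using System;
using System.Text;
using System.Xml;

// Hypothetical sketch only: class and property names are assumptions,
// not the finished SSMLWriter.
public sealed class SsmlWriterSketch
{
    private const string SsmlNamespace = "http://www.w3.org/2001/10/synthesis";
    private const string XmlNamespace = "http://www.w3.org/XML/1998/namespace";

    // Plain properties stand in for voice settings you can't set directly.
    public string Language { get; set; } = "en-US";
    public string Gender { get; set; } = string.Empty; // "male", "female" or "neutral"
    public int? Age { get; set; }
    public string Text { get; set; } = string.Empty;

    // Builds the SSML document from the current property values.
    public override string ToString()
    {
        var sb = new StringBuilder();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };

        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartElement("speak", SsmlNamespace);
            writer.WriteAttributeString("version", "1.0");
            writer.WriteAttributeString("xml", "lang", XmlNamespace, Language);

            writer.WriteStartElement("voice", SsmlNamespace);
            // Only emit the attributes the caller actually set.
            if (!string.IsNullOrEmpty(Gender))
                writer.WriteAttributeString("gender", Gender);
            if (Age.HasValue)
                writer.WriteAttributeString("age", XmlConvert.ToString(Age.Value));

            // WriteString escapes characters such as & and < for us.
            writer.WriteString(Text);

            writer.WriteEndElement(); // closes voice
            writer.WriteEndElement(); // closes speak
        }

        return sb.ToString();
    }
}
```

With something like this, a caller would set Gender, Age and Text and hand the result of ToString() to SpeechSynthesizer.SpeakSsml. Letting XmlWriter do the escaping and namespace handling is the whole point; the wrapper just decides which attributes to write.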
Yah, I have a long way to go. Playing with VoiceInfo.Age, VoiceInfo.Gender, Variant, etc. can radically alter the effects here. Computerized versions of Boyz in the Hood lyrics really diminish the effect, although I noticed that John Lennon songs sound a lot better than their real-life counterparts. There's a ton of possibility here, particularly when you think about how much you could accomplish with an XSLT transform (yes, yes, yes, I was the same punk crying about how much XSLT slapped me around a few years ago, but I quickly saw the light and corrected the error of my ways). I started thinking about how easy it would be to speak some phrases, have the Windows Live API go out and find you results, and speak them back to you. That is absolute child's play. Something cooler would be to use the Community Server Web Service API to pull down new blog posts from your subscribed blogs and read them to you. Again, that doesn't appear to be all that difficult. After all, the real challenge here isn't getting text into SSML (and you don't even need SSML if that's all you wanted to do, although it would make it cooler). But coming up with ideas is always the easy part, and any washed-up, self-proclaimed ASP.NET guru can come up with ideas. It's implementing them that's the hard part, and since I don't want to sound anything like the aforementioned, well, I better stop thinking about what would be cool and focus on getting it done.
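For what it's worth, the XSLT idea could be roughed out like this. It's a sketch under loud assumptions: the input shape (posts/post/title) is made up for illustration and is not what the Community Server API actually returns, and FeedToSsmlSketch is a hypothetical name. It just turns each title into an SSML sentence using XslCompiledTransform.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical sketch: the input format and class name are assumptions.
public static class FeedToSsmlSketch
{
    // Maps <posts><post><title>...</title></post></posts> to an SSML document
    // with one <s> (sentence) element per post title.
    private const string Stylesheet = @"<?xml version='1.0'?>
<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    xmlns='http://www.w3.org/2001/10/synthesis'>
  <xsl:output method='xml' omit-xml-declaration='yes'/>
  <xsl:template match='/posts'>
    <speak version='1.0' xml:lang='en-US'>
      <xsl:for-each select='post'>
        <s><xsl:value-of select='title'/></s>
      </xsl:for-each>
    </speak>
  </xsl:template>
</xsl:stylesheet>";

    public static string Transform(string postsXml)
    {
        var xslt = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(styleReader);
        }

        using (XmlReader input = XmlReader.Create(new StringReader(postsXml)))
        using (var output = new StringWriter())
        {
            // No stylesheet parameters are needed, so the argument list is null.
            xslt.Transform(input, null, output);
            return output.ToString();
        }
    }
}
```

The returned string is what you'd pass along to SpeakSsml. A real feed would need a stylesheet written against its actual schema, which is where the work would be.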
I just noticed that I spent way too much time writing this post, and that, sadly, doesn't leave me with enough time to get back to working on Silverlight. Bummer ;-)