Ubiquity command Say

So, it’s been a long time since I last posted anything, and bla bla bla… (I’m not going to go on with this useless text.) I finally managed to work on something interesting, and I think it makes sense to blog about it…

So the topic is another Ubiquity command, Say, which I implemented recently. I got the inspiration from the command-line tool on OS X called “say”. For those of you who have not seen it, it is simply a tool that converts text to speech.

I wanted to play with a cool new HTML5 feature, the Video and Audio elements introduced in Firefox 3.1. Besides, I had noticed some noise about restrictions on these elements, so I really wanted to try it all out and form my own opinion.

Here are some links from that discussion:

http://www.bluishcoder.co.nz/2008/11/video-audio-and-cross-domain-usage.html
http://blog.mozilla.com/schrep/2008/08/08/building-the-world-we-want-not-the-one-we-have/
http://ajaxian.com/archives/video-audio-cross-origin

Well, I had an idea and a cool new feature to play with. The only thing I was missing was a free online text-to-speech service I could use. It turned out to be quite hard to find, because most of the services were not free. Among the free ones, some had poor quality, and the rest used audio formats that the new elements do not support. I was really close to giving up on this idea after 6 hours of googling when another idea came to my mind. It was a bit sneaky, but it was not illegal, so I decided to go for it. The idea was to use the demo pages of the non-free text-to-speech services. Most of these sites restrict use of their APIs from other domains, but none of them can stop you from opening their pages in the browser without showing the page content to the user :) I think most of you got the point.

Finally I found a service that could fit my needs >:)
http://www.research.ibm.com/tts/coredemo.shtml

As you might imagine, there was a restriction on the service: it could convert only 200 chars at a time. Well, that just made my work more interesting; the only thing I had to do was build a playlist from the selection to pronounce. It could have been very simple (thanks to the event listeners you can set on the video element), but I suddenly discovered that the video element was restricted even when used from chrome. This made me very upset. I do not like the idea of restrictions at all, but a restriction on use from chrome is even more stupid from my point of view. I could not believe that I was unable to play audio from a remote server. I thought I was doing something wrong, but apparently I was not :( So I went back to the old, ugly, but tested way of using plugins. Unfortunately I was not able to convert text longer than 200 chars, but it was still a cool command and still useful to me, as my goal was to find out how to correctly pronounce words I don’t know. (As you might guess from the part you’ve already read, my English is quite poor, and of course I’m not a native English speaker.)

I was not satisfied, so I kept searching for a workaround… And there was quite a simple one :). If you are using a beta or nightly build of Firefox 3.1, you might have noticed that .wav and .ogg files are opened by Firefox in a new tab inside a video tag (no matter whether the file is local or remote) :) If you have not got the idea yet, it is an iframe into which you load remote wav files. That is basically what I did: on preview, the say command creates a hidden iframe inside the preview block and loads into it the wav files generated by the text-to-speech service. As I did not like the 200-char restriction, I split the selected text into parts, where each part contains several complete words and its length is <= 200 chars. While loading each part into the hidden iframe, I also set an event listener on the video tag inside the iframe, which loads the next part into the frame when the video element finishes playing.

Unfortunately that was not the last hack, as I soon discovered that the text-to-speech service I mentioned above had a restriction on the server as well (they block your IP after heavy use of the service). In this case the solution was simple: I just jumped to another service with a slightly worse API and without this restriction. The cool thing is that with this service the limit grew from 200 to 300 chars. To avoid new restrictions from the service owners, I will not mention it in this blog!
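For the curious, here is a rough sketch of the splitting and chaining idea, not the actual code of the command. The names splitIntoParts, playSequentially and playPart are made up for this illustration, and cutting an overlong single word into slices is my own assumption. The real command would do the playing by loading each part into the hidden iframe and waiting for the video element to finish; here that step is just a callback you pass in.

```javascript
// Hypothetical helper (not taken from the command's source): split text into
// parts of at most maxLength characters without cutting words in half.
function splitIntoParts(text, maxLength) {
  var words = text.split(/\s+/).filter(function (word) {
    return word.length > 0;
  });
  var parts = [];
  var current = "";

  words.forEach(function (word) {
    // Assumption: a single word longer than the limit is cut into slices.
    while (word.length > maxLength) {
      if (current) {
        parts.push(current);
        current = "";
      }
      parts.push(word.slice(0, maxLength));
      word = word.slice(maxLength);
    }
    if (!word) {
      return;
    }
    var candidate = current ? current + " " + word : word;
    if (candidate.length <= maxLength) {
      // The word still fits into the current part.
      current = candidate;
    } else {
      // The part is full: store it and start a new one with this word.
      parts.push(current);
      current = word;
    }
  });

  if (current) {
    parts.push(current);
  }
  return parts;
}

// Hypothetical helper: play the parts one after another.
// playPart(part, onEnded) is supplied by the caller. In the browser it would
// load the audio for that part and call onEnded once playback has finished
// (for example from an "ended" event listener on the media element).
function playSequentially(parts, playPart, onDone) {
  var index = 0;

  function next() {
    if (index >= parts.length) {
      if (onDone) {
        onDone();
      }
      return;
    }
    var part = parts[index];
    index += 1;
    playPart(part, next);
  }

  next();
}
```

With these two pieces, something like playSequentially(splitIntoParts(selection, 200), playPart) would walk through the whole selection, and moving to a service with a 300-char limit only means changing the number.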

After all this, you can enjoy a nice Ubiquity command, Say. If you want to look at the source, you can get it here:

http://github.com/Gozala/ubiquity/tree/master/commands/say