Controlling Text-To-Speech from a client application

Some time ago I talked about my library that adds Text-To-Speech capabilities to a .NET Gadgeteer application. On that occasion, we saw an example in which the text to speak was hard-coded in the application.

But we can do something more interesting. Using the network features of .NET Gadgeteer, it is simple to create a service that runs on the device and waits for text-to-speech requests coming from an external client: in other words, we send some text and the device speaks it.

So, let’s see how to create this system. We’ll use a FEZ Spider mainboard, to which an Ethernet module and a Music module are connected. In the ProgramStarted method, put the usual code to initialize the modules:

ethernet.NetworkUp += new GTM.Module.NetworkModule.NetworkEventHandler(ethernet_NetworkUp);
ethernet.NetworkDown += new GTM.Module.NetworkModule.NetworkEventHandler(ethernet_NetworkDown);
ethernet.UseDHCP();

music.MusicFinished += (m) => { };

speech = new SpeechSynthesizer(BING_APPID);
speech.GetSpeakBytesCompleted += 
                    new SpeechSynthesizer.GetSpeakBytesEventHandler(speech_GetSpeakBytesCompleted);

Note that we have created an event handler for the MusicFinished event, even though we are not doing anything inside it, because otherwise the application will crash.

In the last two lines, we create an instance of the SpeechSynthesizer class from my Text-To-Speech library. We’re using an old version of it, which requires a Bing Application ID, because the new version makes SSL requests, which .NET Gadgeteer doesn’t yet support. You can find this version of the library in the source code attached to this post.

Then, in the NetworkUp event we create the service and start the device Web Server on port 80:

private void ethernet_NetworkUp(GTM.Module.NetworkModule sender, 
                                                         GTM.Module.NetworkModule.NetworkState state)
{
    Debug.Print("Network up: " + ethernet.NetworkSettings.IPAddress);

    var webEvent = WebServer.SetupWebEvent("speech");
    webEvent.WebEventReceived += new WebEvent.ReceivedWebEventHandler(webEvent_WebEventReceived);
    WebServer.StartLocalServer(ethernet.NetworkSettings.IPAddress, 80);
}

With this code, the device starts a Web Server listening on port 80 that answers GET requests for the speech path, i.e. http://device_ip/speech. These requests are handled in the webEvent_WebEventReceived handler:

private void webEvent_WebEventReceived(string path, WebServer.HttpMethod method, Responder responder)
{
    string text = responder.GetParameterValueFromURL("text");
    string language = responder.GetParameterValueFromURL("language");

    if (text != null && text.Trim().Length > 0 && language != null && language.Trim().Length > 0)
    {
        // Makes the request for speak bytes.
        speech.GetSpeakBytesAsync(text, language);
    }

    responder.Respond("OK");
}

private void speech_GetSpeakBytesCompleted(object sender, GetSpeakBytesEventArgs e)
{
    if (e.Error == null)
        music.Play(e.Data);
    else
        Debug.Print(e.Error.Message);
}

We get the text and language parameters from the query string and, if both are non-empty, we make an asynchronous request to the speech library. When it completes, the GetSpeakBytesCompleted event handler plays the speech through the Music module.
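The parameter check in the handler can also be factored into a small helper, which keeps the handler readable and makes the rule easy to reuse. The following is only a sketch: RequestValidation, IsBlank and IsValidRequest are hypothetical names that are not part of the Text-To-Speech library or of .NET Gadgeteer. It uses the same Trim().Length test as the handler above.

```csharp
using System;

// Hypothetical helper, not part of the original project.
static class RequestValidation
{
    // True when the value is null, empty, or contains only whitespace.
    public static bool IsBlank(string value)
    {
        return value == null || value.Trim().Length == 0;
    }

    // Mirrors the condition used in webEvent_WebEventReceived:
    // both query string parameters must be present and non-blank.
    public static bool IsValidRequest(string text, string language)
    {
        return !IsBlank(text) && !IsBlank(language);
    }
}
```

With this helper, the condition in the handler would become if (RequestValidation.IsValidRequest(text, language)), with the rest of the method unchanged.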

That’s it. When the application runs and the network connection is up, the IP address is shown in the Output window of Visual Studio. You can make requests with a normal browser, using this form:

http://device_ip/speech?text=[your_text]&language=[text_language]

For example:

http://192.168.0.123/speech?text=Test&language=en

Or, if you prefer, you can use the Remote Speech Client console application that is included in the source code that accompanies this post. It is interesting because it shows how you can send requests to the device from any .NET application.
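To give an idea of what such a client involves, here is a minimal sketch of a desktop console application that sends one request to the device. It is not the Remote Speech Client from the download: the class name SpeechClientSketch, the BuildSpeechUrl helper and the IP address are assumptions made for illustration. The parameters are percent-encoded with Uri.EscapeDataString; whether the device decodes escapes such as %20 depends on the Gadgeteer web server, so treat that part as an assumption too.

```csharp
using System;
using System.Net;

// Hypothetical desktop client, not the Remote Speech Client from the download.
static class SpeechClientSketch
{
    // Builds the request URL in the form http://device_ip/speech?text=...&language=...
    public static string BuildSpeechUrl(string deviceIp, string text, string language)
    {
        return "http://" + deviceIp + "/speech?text=" + Uri.EscapeDataString(text)
             + "&language=" + Uri.EscapeDataString(language);
    }

    static void Main(string[] args)
    {
        // Replace this address with the one printed in the Output window.
        string url = BuildSpeechUrl("192.168.0.123", "Hello from the desktop", "en");

        using (var client = new WebClient())
        {
            // The device replies as soon as the request is accepted,
            // before the speech has actually been played.
            string reply = client.DownloadString(url);
            Console.WriteLine(reply);
        }
    }
}
```

Note that the reply only confirms that the device received the request: the speech bytes are fetched asynchronously, so any error in that step shows up in the device's debug output, not in the client.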

Here is the complete Program.cs code of our .NET Gadgeteer application:

public partial class Program
{
    private const string BING_APPID = "";
    private SpeechSynthesizer speech;

    // This method is run when the mainboard is powered up or reset.   
    void ProgramStarted()
    {
        ethernet.NetworkUp += new GTM.Module.NetworkModule.NetworkEventHandler(ethernet_NetworkUp);
        ethernet.NetworkDown += new GTM.Module.NetworkModule.NetworkEventHandler(ethernet_NetworkDown);
        ethernet.UseDHCP();

        music.MusicFinished += (m) => { };

        speech = new SpeechSynthesizer(BING_APPID);
        speech.GetSpeakBytesCompleted +=
                       new SpeechSynthesizer.GetSpeakBytesEventHandler(speech_GetSpeakBytesCompleted);

        // Use Debug.Print to show messages in Visual Studio's "Output" window during debugging.
        Debug.Print("Program Started");
    }

    private void ethernet_NetworkDown(GTM.Module.NetworkModule sender, 
                                                        GTM.Module.NetworkModule.NetworkState state)
    {
        Debug.Print("Network down!");
    }

    private void ethernet_NetworkUp(GTM.Module.NetworkModule sender, 
                                                      GTM.Module.NetworkModule.NetworkState state)
    {
        Debug.Print("Network up: " + ethernet.NetworkSettings.IPAddress);

        var webEvent = WebServer.SetupWebEvent("speech");
        webEvent.WebEventReceived += new WebEvent.ReceivedWebEventHandler(webEvent_WebEventReceived);
        WebServer.StartLocalServer(ethernet.NetworkSettings.IPAddress, 80);
    }

    private void webEvent_WebEventReceived(string path, WebServer.HttpMethod method, Responder responder)
    {
        string text = responder.GetParameterValueFromURL("text");
        string language = responder.GetParameterValueFromURL("language");

        if (text != null && text.Trim().Length > 0 && language != null && language.Trim().Length > 0)
        {
            // Makes the request for speak bytes.
            speech.GetSpeakBytesAsync(text, language);
        }

        responder.Respond("OK");
    }

    private void speech_GetSpeakBytesCompleted(object sender, GetSpeakBytesEventArgs e)
    {
        if (e.Error == null)
            music.Play(e.Data);
        else
            Debug.Print(e.Error.Message);
    }
}

The complete application is available for download.

.NET Gadgeteer; Text-To-Speech

This entry was posted on March 11, 2012, 5:47 PM and is filed under Embedded Microcontrollers.
