Text to Speech using Java
Text to speech conversions have fascinated developers for many years. Most modern languages support the technique. In this article Akhilesh demonstrates Java powered text to speech conversion.
This tutorial shows how to convert text input into speech using Java. There are two ways to do this:
1) JSAPI 1.0 (Java Speech Application Programming Interface)
2) FreeTTS (Free Text To Speech)
JSAPI 1.0:-
JSAPI 1.0 was developed by Sun Microsystems. It covers two core technologies: speech synthesis and speech recognition.
Speech synthesis converts text into speech.
Speech recognition converts speech into text. More details about JSAPI are given below.
FreeTTS:-
FreeTTS is an open source package written entirely in the Java programming language. It can also be used to convert text into speech.
Necessity for the conversion of the Text into Speech:
Speech synthesis of text in a word processor is an aid to proofreading: grammatical and stylistic problems are often easier to detect when the text is heard. Note that if the spoken output is saved in an audio format, the file will be larger than the corresponding text file. TTS is also useful on mobile phones, where a received SMS can be heard instead of read.
Methods to convert Text into Speech
Structure analysis: Structure analysis processes the input text to determine where paragraphs, sentences and other structures start and end.
Text pre-processing: This expands any abbreviations or other forms of shorthand.
e.g. St. Joseph becomes Saint Joseph.
Text-to-phoneme conversion: This converts each word to phonemes. A phoneme is a basic unit of sound in a language.
Prosody analysis: It processes the sentence structure, words and phonemes to determine appropriate prosody for the sentence.
Waveform production: Finally, the phonemes and prosody information are used to produce the audio waveform for each sentence.
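The first two stages above can be illustrated with a small, self-contained sketch. It does not use JSAPI or FreeTTS: the class name TtsFrontEndSketch and its two-entry abbreviation table are hypothetical, and the sentence splitting relies on the JDK's java.text.BreakIterator. As a simplification, the sketch expands abbreviations before splitting so that the period in "St." is not mistaken for the end of a sentence; a real engine uses far more careful rules.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy front end for a TTS pipeline: abbreviation expansion, then sentence splitting.
final class TtsFrontEndSketch {

    // Hypothetical abbreviation table; a real engine uses context-sensitive rules.
    private static final Map<String, String> ABBREVIATIONS = new LinkedHashMap<>();
    static {
        ABBREVIATIONS.put("St.", "Saint");
        ABBREVIATIONS.put("Dr.", "Doctor");
    }

    // Text pre-processing: replace each known abbreviation with its spoken form.
    static String expandAbbreviations(String text) {
        String result = text;
        for (Map.Entry<String, String> entry : ABBREVIATIONS.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }

    // Structure analysis: find sentence boundaries with the JDK's BreakIterator.
    static List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<>();
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.US);
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String input = "St. Joseph arrived today. He will speak at noon.";
        // Print one sentence per line after expanding abbreviations.
        for (String sentence : splitSentences(expandAbbreviations(input))) {
            System.out.println(sentence);
        }
    }
}
```

The later stages (text-to-phoneme conversion, prosody analysis and waveform production) need a lexicon and voice data, which is what the FreeTTS jars described below provide.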
Requirements for Text to Speech:
The following jar files are required:
cmu_us_kal.jar
cmulex.jar
en_us.jar
freetts.jar
jsapi.jar
All of the above files except jsapi.jar are available in freetts-1.2.1-bin. FreeTTS is an open source package. Download freetts-1.2.1-bin.zip from http://sourceforge.net/projects/freetts/
1. Unzip the FreeTTS binary package and check that the \lib directory contains all of the above jar files except jsapi.jar.
2. In place of jsapi.jar, the \lib directory contains jsapi.exe.
3. Run jsapi.exe, and you will get jsapi.jar.
4. Copy all the jars (jsapi.jar, freetts.jar, cmu_time_awb.jar, cmu_us_kal.jar, etc.) to your working folder or to C:\Program files\java\jdk1.6.0_03\jre\lib\ext
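Before going further, it can help to confirm that the jars are actually visible to Java. The following is a minimal sketch, not part of JSAPI or FreeTTS: the class name SpeechClasspathCheck is hypothetical, and it only tries to load one class from jsapi.jar (javax.speech.Central) and one from freetts.jar (com.sun.speech.freetts.VoiceManager).

```java
// Checks whether the speech classes can be loaded from the current classpath.
final class SpeechClasspathCheck {

    // Returns true if the named class can be found; the class is not initialized.
    static boolean isAvailable(String className) {
        try {
            Class.forName(className, false, SpeechClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // javax.speech.Central ships in jsapi.jar; VoiceManager ships in freetts.jar.
        String[] required = {
            "javax.speech.Central",
            "com.sun.speech.freetts.VoiceManager"
        };
        for (String name : required) {
            System.out.println(name + (isAvailable(name) ? " found" : " MISSING"));
        }
    }
}
```

If either class is reported as missing, recheck where the jars were copied and how the class path is set.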
Important Classes in javax.speech package:
import javax.speech.*; import javax.speech.synthesis.*;
The javax.speech package contains two important subpackages, namely synthesis and recognition.
Engine:
The Engine interface is available inside the speech package. The Engine interface is the parent interface for all speech engines including Recognizer and Synthesizer. “Speech engine” is the generic term for a system designed to deal with either speech input or speech output.
import javax.speech.Engine;
The basic steps for using a speech engine in an application are as follows.
1. Identify the application’s functional requirements for an engine (e.g., language or dictation capability).
2. Locate and create an engine that meets those functional requirements.
3. Allocate the resources for the engine.
4. Set up the engine.
5. Begin operation of the engine, that is, resume it.
6. Use the engine.
7. Deallocate the resources of the engine.
Central:
The Central class is the initial access point to all speech input and output capabilities. Central provides the ability to locate, select and create speech recognizers and speech synthesizers.
SynthesizerModeDesc:
SynthesizerModeDesc extends the EngineModeDesc with properties that are specific to speech synthesizers. A SynthesizerModeDesc inherits engine name, mode name, locale and running properties from EngineModeDesc. SynthesizerModeDesc adds two properties:
1. The list of voices provided by the synthesizer.
2. The voice to be loaded when the synthesizer is started.
Synthesizer:
The Synthesizer interface provides primary access to speech synthesis capabilities.
Voice:
A description of one output voice of a speech synthesizer. Voice objects can be used in selection of synthesis engines (through the SynthesizerModeDesc). The current speaking voice of a Synthesizer can be changed during operation with the setVoice method of the SynthesizerProperties object.
Create a simple program using JSAPI speech synthesis.
Step 1: Import the following packages.
import java.util.Locale; import javax.speech.*; import javax.speech.synthesis.*;
Step 2: Create a Synthesizer:-
Synthesizer syn = Central.createSynthesizer(null);
This call creates the default Synthesizer for the default locale.
SynthesizerModeDesc desc = new SynthesizerModeDesc();
desc.setLocale(new Locale("de", ""));
desc.addVoice(new Voice(null, Voice.GENDER_FEMALE, Voice.AGE_DONT_CARE, null));
Synthesizer synthesizer = Central.createSynthesizer(desc);
The above code instead selects a Synthesizer for a particular locale and a particular kind of voice.
Step 3: Allocate and resume the synthesizer:
synthesizer.allocate();
synthesizer.resume();
The above code allocates the resources of the synthesizer and resumes it.
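Step 4: Get the available voices:
desc = (SynthesizerModeDesc) synthesizer.getEngineModeDesc();
Voice[] voices = desc.getVoices();
The getVoices() method returns all the voices available from the selected Synthesizer. The mode descriptor is first fetched from the running synthesizer, as in the demo program below, so that it lists the voices the engine really provides.
Step 5: Set the voice:
synthesizer.getSynthesizerProperties().setVoice(voice);
The setVoice() method sets a particular Voice, here one chosen from the voices array.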
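Step 6: Speak the text:
synthesizer.speakPlainText(speaktext, null);
synthesizer.waitEngineState(Synthesizer.QUEUE_EMPTY);
The speakPlainText(speaktext, null) method places the given text on the synthesizer's output queue. The waitEngineState(Synthesizer.QUEUE_EMPTY) call then blocks until the queue is empty, that is, until the text has been spoken.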
Step 7: Deallocate the Synthesizer:
synthesizer.deallocate();
The deallocate() method releases the resources held by the synthesizer.
Demo Programs:
Two demo programs are given below. The first one uses jsapi.jar and the second one uses freetts.jar.
1) demojsapi.java
import javax.speech.*;
import java.util.*;
import javax.speech.synthesis.*;

public class demojsapi {
    String speaktext;

    public void dospeak(String speak, String voicename) {
        speaktext = speak;
        String voiceName = voicename;
        try {
            // Ask for a general-purpose US English synthesizer.
            SynthesizerModeDesc desc = new SynthesizerModeDesc(null, "general", Locale.US, null, null);
            Synthesizer synthesizer = Central.createSynthesizer(desc);
            synthesizer.allocate();
            synthesizer.resume();
            // Look up the requested voice among those the engine provides.
            desc = (SynthesizerModeDesc) synthesizer.getEngineModeDesc();
            Voice[] voices = desc.getVoices();
            Voice voice = null;
            for (int i = 0; i < voices.length; i++) {
                if (voices[i].getName().equals(voiceName)) {
                    voice = voices[i];
                    break;
                }
            }
            // Keep the engine's default voice if the requested one was not found.
            if (voice != null) {
                synthesizer.getSynthesizerProperties().setVoice(voice);
            }
            synthesizer.speakPlainText(speaktext, null);
            synthesizer.waitEngineState(Synthesizer.QUEUE_EMPTY);
            synthesizer.deallocate();
        } catch (Exception e) {
            // A missing speech.properties file is the most common cause of failure here.
            String message = " missing speech.properties in " + System.getProperty("user.home") + "\n";
            System.out.println("" + e);
            System.out.println(message);
        }
    }

    public static void main(String[] args) {
        demojsapi obj = new demojsapi();
        // "kevin16" is one of the voices shipped in cmu_us_kal.jar.
        obj.dospeak(args[0], "kevin16");
    }
}
Running Procedure:
1) Create a folder named texttospeech.
2) Copy all the jars (jsapi.jar, freetts.jar, cmu_time_awb.jar, cmu_us_kal.jar, etc.) to that folder or to C:\Program files\java\jdk1.5.0\jre\lib\ext.
3) Create the program named demojsapi.java in that folder.
4) If all the jar files are in your working folder, set the class path as follows (the leading "." keeps the working folder itself on the class path, so that the compiled demojsapi class can be found):
set classpath=.;cmu_us_kal.jar;en_us.jar;freetts.jar;cmulex.jar;jsapi.jar;
If the jar files are not in your working folder, set the class path using their full paths, for example:
set classpath=.;C:\Program files\java\jdk1.5.0\jre\lib;C:\Program files\java\jdk1.5.0\jre\lib\cmu_us_kal.jar;C:\Program files\java\jdk1.5.0\jre\lib\en_us.jar;C:\Program files\java\jdk1.5.0\jre\lib\freetts.jar;C:\Program files\java\jdk1.5.0\jre\lib\cmulex.jar;C:\Program files\java\jdk1.5.0\jre\lib\jsapi.jar;
5) Set the Java path:
e.g. set path=c:\windows\system32;C:\Program files\java\jdk1.5.0\bin
6) Compile the program:
javac demojsapi.java
7) Run the program:
java demojsapi "Welcome to ACCET"
(If you get the error "missing speech.properties in user home: c:\documents and settings\user", copy speech.properties from the freetts-1.2.1 folder into your user home. On Windows XP, the user home is c:\documents and settings\user.)
8) You will get the expected result.
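The speech.properties error mentioned in the running procedure can be checked for in advance. The sketch below is only an illustration: the class name SpeechPropertiesCheck and its helper method are hypothetical, and it assumes the two locations that the JSAPI documentation describes for this file, namely the user home directory and the lib folder of the Java installation.

```java
import java.io.File;

// Looks for speech.properties in the directories where JSAPI is expected to read it.
final class SpeechPropertiesCheck {

    // Returns speech.properties from the first listed directory that contains it, or null.
    static File findSpeechProperties(String... directories) {
        for (String dir : directories) {
            if (dir == null) {
                continue;
            }
            File candidate = new File(dir, "speech.properties");
            if (candidate.isFile()) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String userHome = System.getProperty("user.home");
        String javaLib = System.getProperty("java.home") + File.separator + "lib";
        File found = findSpeechProperties(userHome, javaLib);
        System.out.println(found != null
                ? "Found " + found.getAbsolutePath()
                : "speech.properties not found in " + userHome + " or " + javaLib);
    }
}
```

If the file is reported as not found, copy speech.properties from the FreeTTS folder as described in step 7.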
2) demofreetts.java
import com.sun.speech.freetts.*;

public class demofreetts {
    private String speaktext;

    public void dospeak(String speak, String voice) {
        speaktext = speak;
        try {
            VoiceManager voiceManager = VoiceManager.getInstance();
            // getVoice() returns null when no voice with the given name is installed.
            Voice sp = voiceManager.getVoice(voice);
            if (sp == null) {
                System.out.println("No Voice Available");
                return;
            }
            // Load the voice, speak the text, then release the voice's resources.
            sp.allocate();
            sp.speak(speaktext);
            sp.deallocate();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        demofreetts obj = new demofreetts();
        // "kevin16" is one of the voices shipped in cmu_us_kal.jar.
        obj.dospeak(args[0], "kevin16");
    }
}
The procedure for running the second program is the same as that of the first one.
About Author:
Akhilesh Dubey is a final-year MCA student. You can reach him at akhil.bca08@gmail.com