Skip to content
/ mimic Public
forked from MycroftAI/mimic1

Mycroft's TTS engine, based on CMU's Flite (Festival Lite)

License

Notifications You must be signed in to change notification settings

Seb-Leb/mimic

 
 

Repository files navigation

Stories in Ready Mimic, the Mycroft TTS Engine

Mimic is a lightweight run-time speech synthesis engine, based on Flite (Festival-Lite). The Flite project website can be found here: http://www.festvox.org/flite/ - further information can be found in the ACKNOWLEDGEMENTS file in the Mimic repo.

###Requirements:

  • A good C compiler, some of these files are quite large and some C compilers might choke on these, gcc is fine. Sun CC 3.01 has been tested too. Visual C++ 6.0 is known to fail on the large diphone database files. We recommend you use GCC under Cygwin or mingw32 instead.
  • GNU Make
  • An audio device isn't required as mimic can write its output to a waveform file.

Supported platforms:

We have successfully compiled and run on

  • Linux, with both ARM and Intel architectures under it
  • Mac OS X
  • Android

###Compilation

TODO: update this to reflect compilation

####Linux: Obtain copy of the git repo:

git clone https://github.com/MycroftAI/mimic.git

Navigate to the mimic directory:

cd mimic

Setup configuration files for your machine:

./configure

Build mimic:

make

Note: When rebuilding, often you will only need to run make. If you make changes to compile flags you will probably want to run make clean before recompiling with make.

###Usage:

TODO: Shorten and update this to reflect current process,
update relevant filenames.

The ./bin/mimic voices contains all supported voices and you may choose between the voices with the -voice flag and list the supported voices with the -lv flag. Note the kal (diphone) voice is a different technology from the others and is much less computationally expensive but more robotic. For each voice additional binaries that contain only that voice are created in ./bin/mimic_FULLVOICENAME, e.g. ./bin/mimic_cmu_us_awb.

If it compiles properly a binary will be put in bin/, note by default -g is on so it will be bigger than is actually required

./bin/mimic "Flite is a small fast run-time synthesis engine" mimic.wav

Will produce an 8KHz riff headered waveform file (riff is Microsoft's wave format often called .WAV).

./bin/mimic doc/alice

Will play the text file doc/alice. If the first argument contains a space it is treated as text otherwise it is treated as a filename. If a second argument is given a waveform file is written to it, if no argument is given or "play" is given it will attempt to write directly to the audio device (if supported). if "none" is given the audio is simply thrown away (used for benchmarking). Explicit options are also available.

./bin/mimic -v doc/alice none

Will synthesize the file without playing the audio and give a summary of the speed.

./bin/mimic doc/alice alice.wav

will synthesize the whole of alice into a single file (previous versions would only give the last utterance in the file, but that is fixed now).

An additional set of feature setting options are available, these are debug options, Voices are represented as sets of feature values (see lang/cmu_us_kal/cmu_us_kal.c) and you can override values on the command line. This can stop mimic from working if malicious values are set and therefore this facility is not intended to be made available for standard users. But these are useful for debugging. Some typical examples are

./bin/mimic --sets join_type=simple_join doc/intro.txt Use simple concatenation of diphones without prosodic modification

./bin/mimic -pw doc/alice Print sentences as they are said

./bin/mimic --setf duration_stretch=1.5 doc/alice Make it speak slower

./bin/mimic --setf int_f0_target_mean=145 doc/alice Make it speak higher

The talking clock is an example talking clode as discussed on http://festvox.org/ldom it requires a single argument HH:MM under Unix you can call it ./bin/mimic_time `date +%H:%M`

./bin/mimic -lv List the voices linked in directly in this build

./bin/mimic -voice rms -f doc/alice Speak with the US male rms voice

./bin/mimic -voice awb -f doc/alice Speak with the "Scottish" male awb voice

./bin/mimic -voice slt -f doc/alice Speak with the US female slt voice

./bin/mimic -voice http://www.festvox.org/flite/packed/flite-2.0/voices/cmu_us_ksp.flitevox -f doc/alice Speak with KSP voice, download on the fly from festvox.org

./bin/mimic -voice voices/cmu_us_ahw.mimicvox -f doc/alice Speak with AHW voice loaded from the local file.

Voice names are identified as loadable files if the name includes a "/" (slash) otherwise they are treated as internal names. So if you want to load voices from the current directory you need to prefix them with "./".

###Voices

TODO: Explain where to find voices, and how to obtain new ones.

The voices/ directory contains several flitevox voices.

You can also find existing Flite voices here: http://www.festvox.org/flite/packed/flite-2.0/voices/

###Debugging:

The debug flag -g is already set when compiling. (This should probably be removed on release build)

Note: Currently the configure script enables compiler optimizations. These optimizations are the reason for any weird behavior while stepping through the code. (Due to the fact that the compiler has reordered/removed many lines of code.)

For now to disable optimizations edit the file mimic/config/config by hand. Near the top of the file change: CFLAGS = -g -O2 -Wall to read: CFLAGS = -g -O0 -Wall (You can also put any other debug flags here that you wish) Keep in mind that this file is auto generated by the configure script and will be overwritten if the script is run. Run make clean and then make to rebuild with the new flags.

Now you can run the program in the debugger: gdb --args ./bin/mimic -t "Hello. Doctor. Name. Continue. Yesterday. Tomorrow."

About

Mycroft's TTS engine, based on CMU's Flite (Festival Lite)

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C 99.8%
  • Other 0.2%