Abstract:
People use speech recognition and speech synthesis to help, support, and boost
their daily activities. With just one of these speech technologies, developers can
produce a variety of software. By combining both, developers can produce an even
wider variety of software. One such combination is mimicking human speech. This research will
discuss speech recognition that uses a Convolutional Neural Network as the machine
learning model and speech synthesis that uses concatenative synthesis with syllables
as the speech unit. Unlike recent related work, this research takes a simpler
approach to mimicking speech in Bahasa Indonesia. The purpose of this research is to
develop applications to collect speech, train a recognition model, and mimic speech
in Bahasa Indonesia. Users can participate by recording their speech. The recordings
are collected and used to train the model that later recognizes speech in the
application. With the trained model, users can make the computer mimic their speech.
First, the user must submit their speech to be recognized by the application. This
step is necessary to create the user's digital speech.
After that, based on the registered syllables, that is, the syllables whose speech the
application has identified, the user can generate speech by composing sentences from them.
The applications for collecting and mimicking speech are developed as web-based
applications, and the application for training is developed as a text-based application.
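
As an illustration only, the syllable-based concatenative step described above can be
sketched as follows. This is a minimal sketch under stated assumptions, not the
implementation used in this research: the function name `synthesize`, the hyphenated
input format, the in-memory `registered` dictionary, and the `word_gap` silence length
are all hypothetical, and waveforms are represented as plain lists of samples.

```python
from typing import Dict, List

Waveform = List[float]  # assumption: a waveform is a plain list of audio samples


def synthesize(sentence: str, registered: Dict[str, Waveform],
               word_gap: int = 2) -> Waveform:
    """Concatenate registered syllable waveforms into one utterance.

    Assumed input format: syllables joined by '-', words separated by spaces,
    e.g. "sa-ya ma-kan". `word_gap` is a hypothetical number of silent
    samples inserted between words.
    """
    output: Waveform = []
    words = sentence.lower().split()
    for index, word in enumerate(words):
        for syllable in word.split("-"):
            if syllable not in registered:
                # Only syllables the user has already registered can be spoken.
                raise KeyError(f"syllable not registered: {syllable!r}")
            output.extend(registered[syllable])
        if index < len(words) - 1:
            output.extend([0.0] * word_gap)  # short silence between words
    return output


# Toy data standing in for recorded syllables (not real audio).
registered = {"sa": [0.1, 0.2], "ya": [0.3], "ma": [0.4, 0.5], "kan": [0.6]}
speech = synthesize("sa-ya ma-kan", registered)
```

In this sketch, a sentence that contains an unregistered syllable raises an error
instead of producing partial output, which mirrors the requirement that syllables be
registered before they can be used to generate speech.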