     Research Journal of Applied Sciences, Engineering and Technology


Combining Lightly-supervised Learning and User Feedback to Construct and Improve a Statistical Parametric Speech Synthesizer for Malay

1Lau Chee Yong, 2Oliver Watts and 2Simon King
1Asia Pacific University, Technology Park Malaysia, Bukit Jalil, 57000 Kuala Lumpur, Federal Territory of Kuala Lumpur, Malaysia
2Centre for Speech Technology Research, University of Edinburgh, UK
Research Journal of Applied Sciences, Engineering and Technology  2015  11:1227-1232
http://dx.doi.org/10.19026/rjaset.11.2229  |  © The Author(s) 2015
Received: July 19, 2015  |  Accepted: August 30, 2015  |  Published: December 15, 2015

Abstract

In this study, we aim to reduce the human effort needed to prepare training data for speech synthesis and to improve the quality of the resulting synthetic speech. Although the statistical models themselves are learned from data, constructing a statistical parametric speech synthesizer still involves substantial human effort, especially when the data are imperfect or the language is new. Here, we use lightly-supervised methods to prepare the data and to construct the text-processing front end. The initial system is then improved iteratively using active learning, in which feedback from users is used to disambiguate the pronunciation system of our chosen language, Malay. The data are prepared using speaker diarisation and lightly-supervised text-speech alignment. The front end uses grapheme-based units. The active learning uses small amounts of feedback from a listener to train a classifier. We report evaluations of two systems, built from high-quality studio data and lower-quality 'found' data respectively, and show that the intelligibility of each can be improved using active learning.

Keywords:

Active learning, lightly-supervised methods, statistical parametric speech synthesis


Competing interests

The authors have no competing interests.

Open Access Policy

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

ISSN (Online):  2040-7467
ISSN (Print):   2040-7459
Copyright © 2024. MAXWELL Scientific Publication Corp., All rights reserved