     Research Journal of Applied Sciences, Engineering and Technology


Improve the Quality of Synthetic Speech Trained with Found Data using Silence Cutter

Lau Chee Yong, Tan Tian Swee and Mohd Nizam Mazenan
Medical Implant Technology Group (MediTEG), Cardiovascular Engineering Center, Material Manufacturing Research Alliance (MMRA), Faculty of Biosciences and Medical Engineering (FBME), Universiti Teknologi Malaysia, Malaysia
Research Journal of Applied Sciences, Engineering and Technology  2014  14:1691-1694
http://dx.doi.org/10.19026/rjaset.8.1151  |  © The Author(s) 2014
Received: July 14, 2014  |  Accepted: September 20, 2014  |  Published: October 10, 2014

Abstract

Using found data as training data in statistical parametric speech synthesis can alleviate the tedious work of constructing a speech database. However, extra silences in found data degrade the quality of the synthetic speech, because they can be incorrectly assigned to the training script and result in unnatural output. In this study, a silence cutter was created to eliminate the extra silences from the training data, and a Malay speech synthesis system was constructed using found data from the internet. Synthetic speech trained on found data with and without the silence cutter was evaluated and compared to determine the effect of the silence cutter. The results showed that the silence cutter helped to improve the naturalness of the synthetic speech and reduced the Word Error Rate (WER) in the intelligibility test. In short, using found data can alleviate the problem of preparing high-quality training data, and a silence cutter can be used to refine the found data to generate better-quality synthetic speech.
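
The abstract does not say how the silence cutter detects or removes silences, so the following is only a minimal sketch of one common approach, not the authors' implementation. It assumes silence can be detected with a frame-level RMS energy threshold, and that any silent run longer than a chosen limit is shortened to that limit. The function names and the parameters frame_ms, threshold, and max_silence_ms are hypothetical and chosen for illustration.

```python
import math


def frame_rms(frame):
    """Root-mean-square energy of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def cut_silences(samples, sample_rate, frame_ms=10, threshold=0.01,
                 max_silence_ms=200):
    """Shorten every silent run longer than max_silence_ms.

    samples: list of floats in [-1.0, 1.0] (assumed normalised audio).
    A frame counts as silent when its RMS energy is below `threshold`
    (an assumed, illustrative criterion). Silent frames beyond the
    allowed run length are dropped; everything else is kept in order.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    max_silent_frames = max_silence_ms // frame_ms
    output = []
    silent_run = 0  # number of consecutive silent frames seen so far
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        if frame_rms(frame) < threshold:
            silent_run += 1
            if silent_run > max_silent_frames:
                continue  # extra silence: drop this frame
        else:
            silent_run = 0
        output.extend(frame)
    return output


# Toy signal at 1 kHz: 0.1 s of "speech", 1 s of silence, 0.1 s of "speech".
speech = [0.5, -0.5] * 50
signal = speech + [0.0] * 1000 + speech
trimmed = cut_silences(signal, sample_rate=1000)
print(len(signal), len(trimmed))  # prints "1200 400"
```

In this sketch the one-second pause is cut down to 200 ms while the speech segments are left untouched. Keeping a short pause instead of removing silence entirely is a design assumption, made so that natural pauses in the recording can still be matched to the training script.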

Keywords:

Found data, hidden Markov model, statistical parametric speech synthesis


Competing interests

The authors have no competing interests.

Open Access Policy

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

ISSN (Online):  2040-7467
ISSN (Print):   2040-7459
Copyright © 2024. MAXWELL Scientific Publication Corp., All rights reserved