Research Article | OPEN ACCESS
Real Time Talking System for Virtual Human based on ProPhone
Itimad Raheem Ali, Ghazali Sulong and Hoshang Kolivand
MaGIC-X (Media and Games Innovation Centre of Excellence), UTM-IRDA Digital Media Centre, Universiti Teknologi Malaysia, 81310 Skudai, Johor, Malaysia
Research Journal of Applied Sciences, Engineering and Technology, 2016, 8: 611-616
Received: May 25, 2015 | Accepted: July 26, 2015 | Published: October 15, 2016
Abstract
Lip-syncing is the process of synchronizing speech with the lip motions of a virtual character. Building a virtual talking character is a challenging task because the system must control all articulatory movements and keep them synchronized with the speech signal. This study presents a virtual talking character system aimed at speeding up and simplifying the visual talking process compared with previous techniques that use the blend-shapes approach. The system constructs the lip-syncing from a set of visemes for a reduced phoneme set using a new method named Prophone, which is based on the probability of each phoneme appearing in an English sentence. The contribution of this study is a real-time automatic talking system for English based on the concatenation of visemes; the results are evaluated against the phoneme-to-viseme table produced by Prophone.
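To make the pipeline concrete, the following is a minimal Python sketch of the viseme-concatenation idea described above. The phoneme-to-viseme grouping, the function name and the example transcription are illustrative placeholders, not the paper's actual Prophone tables or viseme set.

# Illustrative sketch only (not the paper's actual Prophone data):
# the viseme grouping below is a hypothetical placeholder for the
# general idea of mapping a reduced phoneme set onto visemes.
PHONEME_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "iy": "spread", "ih": "spread",
    "aa": "open", "ae": "open",
    "uw": "rounded", "ow": "rounded",
}

def phonemes_to_viseme_track(phonemes):
    # Map each phoneme to its viseme and merge consecutive repeats,
    # so the animation holds a mouth shape instead of restarting it.
    track = []
    for ph in phonemes:
        vis = PHONEME_TO_VISEME.get(ph, "rest")  # unknown phoneme -> neutral pose
        if not track or track[-1] != vis:
            track.append(vis)
    return track

# Example: a rough placeholder transcription of "move"
print(phonemes_to_viseme_track(["m", "uw", "v"]))

In the system the abstract describes, such a viseme track would drive the character's mouth shapes in real time, with Prophone's phoneme probabilities guiding how the full English phoneme set is reduced before the mapping is applied.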
Keywords:
Phoneme, Prophone, real-time talking, virtual character, visemes
Competing interests
The authors have no competing interests.
Open Access Policy
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
ISSN (Online): 2040-7467
ISSN (Print): 2040-7459