2014 (Vol. 7, Issue: 5)
Article Information:

Linear Reranking Model for Chinese Pinyin-to-Character Conversion

Xinxin Li, Xuan Wang, Lin Yao and Muhammad Waqas Anwar
Corresponding Author:  Xinxin Li 

Key words:  Dependency model, minimum error learning method, part-of-speech tagging, word n-gram model
Vol. 7(5): 975-980
Submitted: January 31, 2013
Accepted: February 25, 2013
Published: February 05, 2014

Pinyin-to-character conversion is an important task in Chinese natural language processing. Previous work has mainly relied on n-gram language models and machine learning approaches, sometimes with additional hand-crafted or automatic rule-based post-processing. Word n-gram language models cannot solve two problems: out-of-vocabulary word recognition and long-distance grammatical constraints. In this study, we propose a linear reranking model to address these problems. Our model uses a minimum error learning method to combine different sub-models, including word and character n-gram LMs, a part-of-speech tagging model and a dependency model. The impact of each sub-model on conversion is evaluated experimentally and analyzed. Results on the Lancaster Corpus of Mandarin Chinese show that our new model outperforms the word n-gram language model.
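
The idea of linear reranking described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the feature names (`word_lm`, `char_lm`, `pos`, `dep`), the candidate sentences, the scores and the weights are all hypothetical, and the weights are set by hand here, whereas the paper learns them with a minimum error learning method. Each candidate conversion is assumed to carry one log-score per sub-model, and the reranker picks the candidate with the highest weighted sum.

```python
from typing import Dict, List, Tuple

# A candidate is a character string plus one (assumed) log-score per sub-model.
Candidate = Tuple[str, Dict[str, float]]


def linear_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of sub-model scores; features without a weight count as 0."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def rerank(candidates: List[Candidate], weights: Dict[str, float]) -> List[Candidate]:
    """Return the candidates sorted from best to worst linear score."""
    return sorted(candidates, key=lambda c: linear_score(c[1], weights), reverse=True)


# Hypothetical candidates for the pinyin "ta shi xue sheng" (made-up scores).
candidates: List[Candidate] = [
    ("他是学生", {"word_lm": -4.0, "char_lm": -6.0, "pos": -3.0, "dep": -2.0}),
    ("他市学生", {"word_lm": -3.5, "char_lm": -9.0, "pos": -5.0, "dep": -6.0}),
]

# Hand-picked weights for illustration only; not learned, not from the paper.
weights = {"word_lm": 1.0, "char_lm": 0.5, "pos": 0.3, "dep": 0.4}

best = rerank(candidates, weights)[0][0]
baseline_best = rerank(candidates, {"word_lm": 1.0})[0][0]
print(best, baseline_best)
```

In this toy data the word n-gram score alone prefers the second candidate, while the combined score prefers the first, which is the kind of correction the additional sub-models are meant to provide.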
  Cite this Reference:
Xinxin Li, Xuan Wang, Lin Yao and Muhammad Waqas Anwar, 2014. Linear Reranking Model for Chinese Pinyin-to-Character Conversion.  Research Journal of Applied Sciences, Engineering and Technology, 7(5): 975-980.
ISSN (Online):  2040-7467
ISSN (Print):   2040-7459
Copyright © 2015. MAXWELL Scientific Publication Corp., All rights reserved