     Research Journal of Applied Sciences, Engineering and Technology

    Abstract
2014(Vol.7, Issue:5)
Article Information:

Linear Reranking Model for Chinese Pinyin-to-Character Conversion

Xinxin Li, Xuan Wang, Lin Yao and Muhammad Waqas Anwar
Corresponding Author:  Xinxin Li 
Submitted: January 31, 2013
Accepted: February 25, 2013
Published: February 05, 2014
Abstract:
Pinyin-to-character conversion is an important task in Chinese natural language processing. Previous work has mainly focused on n-gram language models and machine learning approaches, sometimes with additional hand-crafted or automatic rule-based post-processing. Word n-gram language models cannot solve two problems: out-of-vocabulary word recognition and long-distance grammatical constraints. In this study, we propose a linear reranking model to address these problems. Our model uses a minimum error learning method to combine different sub-models, which include word and character n-gram LMs, a part-of-speech tagging model and a dependency model. The impact of the different sub-models on conversion is evaluated experimentally and analyzed. Results on the Lancaster Corpus of Mandarin Chinese show that our new model outperforms the word n-gram language model.
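
As a rough illustration of the linear combination described in the abstract, the sketch below scores each candidate conversion as a weighted sum of sub-model scores and sorts the candidates by that total. It is a minimal sketch, not the authors' implementation: the function name `rerank`, the feature names, the weights and all scores are hypothetical values made up for illustration. In the paper the weights are learned with a minimum error learning method, whereas here they are simply fixed by hand.

```python
from typing import Dict, List, Tuple

# A candidate is (character string, {sub-model name: log score}).
Candidate = Tuple[str, Dict[str, float]]


def rerank(candidates: List[Candidate], weights: Dict[str, float]) -> List[Candidate]:
    """Sort candidates by the weighted sum of their sub-model scores, best first."""
    def total(features: Dict[str, float]) -> float:
        # Sub-models without a weight contribute nothing to the total.
        return sum(weights.get(name, 0.0) * value for name, value in features.items())

    return sorted(candidates, key=lambda cand: total(cand[1]), reverse=True)


# Hypothetical hand-set weights (assumption: the paper learns these from data).
WEIGHTS = {"word_lm": 1.0, "char_lm": 0.5, "pos": 0.3, "dep": 0.2}

# Hypothetical candidates for the pinyin "shi shi"; all scores are invented.
CANDIDATES: List[Candidate] = [
    ("事实", {"word_lm": -4.0, "char_lm": -6.0, "pos": -2.0, "dep": -3.0}),
    ("实施", {"word_lm": -3.8, "char_lm": -7.0, "pos": -2.5, "dep": -4.0}),
]

if __name__ == "__main__":
    for text, features in rerank(CANDIDATES, WEIGHTS):
        print(text, features)
```

With these invented numbers the second candidate has the better word n-gram score on its own, but the first candidate ranks higher once the other sub-model scores are added in, which is the kind of correction a reranker is meant to allow.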

Key words: Dependency model, minimum error learning method, part-of-speech tagging, word n-gram model
Cite this Reference:
Xinxin Li, Xuan Wang, Lin Yao and Muhammad Waqas Anwar, 2014. Linear Reranking Model for Chinese Pinyin-to-Character Conversion. Research Journal of Applied Sciences, Engineering and Technology, 7(5): 975-980.
ISSN (Online):  2040-7467
ISSN (Print):   2040-7459
Copyright © 2024. MAXWELL Scientific Publication Corp., All rights reserved