2013 (Vol. 5, Issue: 07)
Article Information:

A Grammatical Evolution Approach for Content Extraction of Electronic Commerce Website

Wei Qing-jin and Peng Jian-sheng
Corresponding Author:  Wei Qing-jin 

Key words:  DOM, grammatical evolution, web content extraction, XPath
Vol. 5 (07): 2426-2432
Submitted: July 26, 2012
Accepted: September 12, 2012
Published: March 11, 2013
Abstract:

Web content extraction, the problem of identifying and extracting information of interest from Web pages, plays an important role in integrating data from different sources for advanced information-based services. This paper investigates a Grammatical Evolution (GE) approach to extracting electronic commerce information from Web pages without any given template. Although much prior research has used XPath to extract Web page content, the complexity of the XPath grammar makes it too difficult for evolutionary tools to process automatically. Hence, a reduced language integrating XPath and DOM techniques is defined in BNF grammar form and used by the GE to generate parsing solutions. In addition, a fitness evaluation method based on the fuzzy membership of the two parts of the chromosome is proposed. Finally, empirical results on several real Web pages show that the proposed technique can segment data records and extract data from them accurately, automatically and flexibly.
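
To illustrate how a GE can turn a chromosome into an extraction expression through a BNF grammar, the following is a minimal sketch of the standard genotype-to-phenotype mapping. The grammar, the tag and class names, and the function `ge_map` are hypothetical and chosen only for illustration; they are not the reduced language defined in the paper, and the fuzzy-membership fitness evaluation is not reproduced here.

```python
# Toy BNF grammar for a reduced XPath-like language.
# ASSUMPTION: this grammar is illustrative only, not the paper's grammar.
GRAMMAR = {
    "<path>": [["//", "<tag>"],
               ["//", "<tag>", "<pred>"],
               ["<path>", "/", "<tag>"]],
    "<tag>": [["div"], ["span"], ["li"], ["a"]],
    "<pred>": [["[@class='", "<cls>", "']"]],
    "<cls>": [["price"], ["title"], ["item"]],
}


def ge_map(codons, start="<path>", max_wraps=2):
    """Map a list of integer codons to a string via leftmost derivation.

    Each non-terminal with more than one production consumes one codon and
    picks production number (codon mod number_of_productions). The codon
    list is reused (wrapped) up to max_wraps times; if the derivation is
    still incomplete after that, None is returned (an invalid individual).
    """
    if not codons:
        return None
    symbols = [start]          # pending symbols, leftmost first
    output = []                # terminals emitted so far
    used = 0                   # number of codons consumed
    limit = len(codons) * (max_wraps + 1)
    while symbols:
        sym = symbols.pop(0)
        if sym not in GRAMMAR:  # terminal symbol
            output.append(sym)
            continue
        choices = GRAMMAR[sym]
        if len(choices) == 1:   # a single production consumes no codon
            chosen = choices[0]
        else:
            if used >= limit:
                return None
            chosen = choices[codons[used % len(codons)] % len(choices)]
            used += 1
        symbols = list(chosen) + symbols
    return "".join(output)


if __name__ == "__main__":
    print(ge_map([1, 2, 0]))
    print(ge_map([2, 0, 1, 3]))
```

In a full GE run, each mapped expression would be evaluated against the page's DOM and scored by a fitness function; expressions that fail to terminate within the wrap limit are typically treated as invalid individuals.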
  Cite this Reference:
Wei Qing-jin and Peng Jian-sheng, 2013. A Grammatical Evolution Approach for Content Extraction of Electronic Commerce Website.  Research Journal of Applied Sciences, Engineering and Technology, 5(07): 2426-2432.
ISSN (Online):  2040-7467
ISSN (Print):   2040-7459
Copyright © 2009. MAXWELL Science Publication, a division of MAXWELL Scientific Organization. All rights reserved.