
     Research Journal of Applied Sciences, Engineering and Technology


An Efficient EM based Ontology Text-mining to Cluster Proposals for Research Project Selection

1D. Saravana Priya and 2M. Karthikeyan
1Department of Information Technology, P.A. College of Engineering and Technology, Pollachi, India
2Department of ECE, Tamilnadu College of Engineering, Coimbatore, India
Research Journal of Applied Sciences, Engineering and Technology  2014  12:1435-1441
http://dx.doi.org/10.19026/rjaset.8.1118  |  © The Author(s) 2014
Received: June 08, 2014  |  Accepted: July 19, 2014  |  Published: September 25, 2014

Abstract

The internet and intranets hold a large and growing volume of resources, much of it in the form of text documents. Research and Development (R&D) project selection is a decision-making task common in government support agencies, universities, research institutes and technology-intensive companies. Text mining has emerged as a powerful technique for extracting previously unknown information from large collections of text documents. An ontology is a knowledge repository that defines concepts and terms together with the relationships between those concepts. Ontologies make searching for similar patterns in text more effective, efficient and interactive. The existing method for grouping proposals for research project selection applies an ontology-based text-mining technique to cluster research proposals according to their similarity in research area. This method is efficient and effective for clustering research proposals. However, the research area to which a proposal is assigned is not always accurate. This study proposes an ontology-based text-mining method to group research proposals and external reviewers according to their research areas. The proposed method uses an Efficient Expectation-Maximization (EEM) algorithm to cluster the research proposals and gives better results more efficiently.
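
As a rough illustration of clustering proposals with Expectation-Maximization, the sketch below runs standard EM for a mixture of multinomials over term-count vectors. It is not the EEM algorithm of the paper, whose details the abstract does not give. The function name `em_cluster`, the `seeds` and `alpha` parameters, the four-term vocabulary and the counts are all assumptions made for illustration, and the counts are assumed to have already been mapped to ontology concepts.

```python
import math

def em_cluster(counts, seeds, iters=30, alpha=1.0):
    """Standard EM for a mixture of multinomials (illustrative, not EEM).

    counts: list of equal-length term-count lists, one per proposal.
    seeds:  indices of proposals used to initialise one cluster each.
    Returns (labels, responsibilities).
    """
    n, k, v = len(counts), len(seeds), len(counts[0])
    weights = [1.0 / k] * k
    # Initialise each cluster's term distribution from its seed proposal,
    # with add-alpha smoothing so that no term has zero probability.
    theta = []
    for s in seeds:
        total = sum(counts[s]) + alpha * v
        theta.append([(c + alpha) / total for c in counts[s]])

    resp = []
    for _ in range(iters):
        # E-step: posterior probability of each cluster for each proposal,
        # computed in log space for numerical stability.
        resp = []
        for doc in counts:
            logp = [math.log(weights[j])
                    + sum(c * math.log(theta[j][t]) for t, c in enumerate(doc))
                    for j in range(k)]
            top = max(logp)
            unnorm = [math.exp(lp - top) for lp in logp]
            z = sum(unnorm)
            resp.append([u / z for u in unnorm])

        # M-step: re-estimate mixing weights and term distributions
        # from the soft assignments (both smoothed by alpha).
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = (nj + alpha) / (n + alpha * k)
            soft = [sum(r[j] * doc[t] for r, doc in zip(resp, counts)) + alpha
                    for t in range(v)]
            total = sum(soft)
            theta[j] = [x / total for x in soft]

    # Hard label = cluster with the highest responsibility.
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return labels, resp

# Hypothetical vocabulary: ["ontology", "clustering", "antenna", "circuit"]
proposals = [
    [3, 2, 0, 0], [2, 3, 0, 0], [4, 1, 0, 1],   # text-mining proposals
    [0, 0, 3, 2], [0, 1, 2, 3], [0, 0, 4, 1],   # electronics proposals
]
labels, resp = em_cluster(proposals, seeds=[0, 3])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Seeding each cluster from a chosen proposal keeps the run deterministic. A practical system would more likely use several random restarts and weight the terms (for example with TF-IDF) before clustering.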

Keywords:

Apriori, document clustering, ontology


Competing interests

The authors have no competing interests.

Open Access Policy

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

ISSN (Online):  2040-7467
ISSN (Print):   2040-7459
Copyright © 2024. MAXWELL Scientific Publication Corp., All rights reserved