Research Article | OPEN ACCESS
Graph-Based Text Representation: A Survey of Current Approaches
Geehan Sabah Hassan¹, Asma Khazaal Abdulsahib¹ and Siti Sakira Kamaruddin²
¹College of Education for Human Science-Ibn Rushd, University of Baghdad, Baghdad, Iraq
²School of Computing, Universiti Utara Malaysia, 06010 UUM Sintok, Malaysia
Research Journal of Applied Sciences, Engineering and Technology 2017 9:334-340
Received: May 4, 2017 | Accepted: July 6, 2017 | Published: September 15, 2017
Abstract
The data sparsity problem has grown recently as the volume of documents available on the Internet has increased. Addressing it requires choosing the best strategy for representing text content. In recent years, researchers have turned to graph-based representations, because previous studies have shown that representing data as graphs reduces the sparsity problem. This study therefore reviews the types of graphs used to represent document content. The experimental results suggest that our approaches outperform other approaches on both the synthetic and the real data sets.
Keywords:
Concept Frame Graph (CFG), Conceptual Graphs Model (CGM), Dependency Graph (DG), Formal Concept Analysis (FCA), sparsity problem, text representation schemes
Competing interests
The authors have no competing interests.
Open Access Policy
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
ISSN (Online): 2040-7467
ISSN (Print): 2040-7459