Research Article | OPEN ACCESS
Term Frequency Based Cosine Similarity Measure for Clustering Categorical Data using Hierarchical Algorithm
S. Anitha Elavarasi and J. Akilandeswari
Department of Computer Science and Engineering, Sona College of Technology, Salem, Tamil Nadu, India
Research Journal of Applied Sciences, Engineering and Technology 2015 7:798-805
Received: June 8, 2015 | Accepted: July 8, 2015 | Published: November 05, 2015
Abstract
Objects in the real world are categorical in nature. Categorical data cannot be analyzed in the same way as numerical data because they lack an inherent ordering. This study evaluates the performance of a cosine-based hierarchical clustering algorithm for categorical data. The algorithm makes use of two functions: frequency computation and Term Frequency based Cosine Similarity Matrix (TFCSM) computation. Clusters are then formed using the TFCSM-based hierarchical clustering algorithm. Results on the real-life Vote data set are evaluated for TFCSM-based hierarchical clustering and for the standard hierarchical clustering algorithm using the single-link, complete-link and average-link methods.
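The pipeline outlined in the abstract (frequency computation, a cosine similarity matrix, then agglomerative merging) can be illustrated with a minimal Python sketch. The abstract does not give the exact TFCSM formula, so the encoding used here is an assumption: each categorical value is replaced by its relative frequency within its attribute, and cosine similarity is computed on the resulting vectors. The function names (`frequency_encode`, `cosine`, `similarity_matrix`, `agglomerate`) and the toy records are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def frequency_encode(rows):
    """Replace each categorical value by its relative frequency in its column.

    Assumption: this stands in for the paper's frequency computation step.
    """
    n = len(rows)
    counts = [Counter(col) for col in zip(*rows)]
    return [[counts[j][v] / n for j, v in enumerate(row)] for row in rows]


def cosine(u, v):
    """Cosine similarity between two numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(vectors):
    """Pairwise cosine similarity matrix."""
    return [[cosine(u, v) for v in vectors] for u in vectors]


def agglomerate(sim, k, linkage="average"):
    """Merge the most similar pair of clusters until k clusters remain."""
    clusters = [[i] for i in range(len(sim))]

    def link(a, b):
        vals = [sim[i][j] for i in a for j in b]
        if linkage == "single":      # closest members: highest similarity
            return max(vals)
        if linkage == "complete":    # farthest members: lowest similarity
            return min(vals)
        return sum(vals) / len(vals)  # average link

    while len(clusters) > k:
        pairs = ((link(clusters[a], clusters[b]), a, b)
                 for a in range(len(clusters))
                 for b in range(a + 1, len(clusters)))
        _, a, b = max(pairs, key=lambda t: t[0])
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Toy voting-style records (made up for illustration, not the Vote data set).
records = [
    ("y", "n", "y"),
    ("y", "n", "y"),
    ("y", "n", "y"),
    ("n", "n", "n"),
    ("n", "n", "y"),
]
encoded = frequency_encode(records)
sim = similarity_matrix(encoded)
clusters = agglomerate(sim, k=2, linkage="average")
print(clusters)
```

Because cosine similarity ignores vector length, records whose frequency vectors are proportional to each other are treated as identical under this encoding. That is a property of the assumed encoding in this sketch and may not hold for the authors' formulation.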
Keywords:
Categorical data, clustering, cosine similarity, hierarchical clustering, term frequency
Competing interests
The authors have no competing interests.
Open Access Policy
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
ISSN (Online): 2040-7467
ISSN (Print): 2040-7459