Abstract
Article Information:
Improvement Tfidf for News Document Using Efficient Similarity
Abdolkarim Elahi, Reza Javanmard Alitappeh and Ali Shokouhi Rostami
Corresponding Author: Hamid Alinejad-Rokny
Submitted: January 30, 2012
Accepted: March 10, 2012
Published: October 01, 2012
Abstract:
This study proposes a new method for document clustering. Clustering is a powerful data
mining technique for topic discovery from documents. In document clustering, documents within a
cluster should be highly similar to one another, while documents in different clusters should be less
similar. The cosine function measures the similarity of two documents. When the clusters are not well
separated, partitioning them based only on pairwise similarity is not good enough, because some documents
in different clusters may be similar to each other and the cosine function alone cannot tell them apart.
To solve this problem, a similarity measure based on the concepts of neighbors and links is used. In this
study, an efficient similarity measure with a more accurate weighting is proposed for the bisecting
k-means algorithm. The method is evaluated on a data set of documents, and its efficiency is compared
with the cosine similarity criterion and traditional methods. Experimental results show a marked
improvement in efficiency when the proposed criterion is applied.
Key words: Neighbor, news document clustering, link function, weighting improvement
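The abstract names three ingredients: cosine similarity, neighbors, and links. The sketch below shows one common way these ideas are defined, and it should not be read as the authors' exact method. It assumes that two documents are neighbors when their cosine similarity reaches a threshold `theta`, and that the link between two documents is the number of neighbors they share. The function names, the plain TF-IDF weighting, and the `alpha` blend in `blended_similarity` are all illustrative assumptions, since the abstract does not give the paper's formulas.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (term -> weight) from tokenized documents.

    Assumption: plain tf * log(N / df) weighting, not the paper's improved scheme.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: c * math.log(n / df[t]) for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either vector is all zeros."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def neighbor_matrix(vectors, theta):
    """m[i][j] is 1 when documents i and j are neighbors (cosine >= theta)."""
    n = len(vectors)
    return [
        [1 if cosine(vectors[i], vectors[j]) >= theta else 0 for j in range(n)]
        for i in range(n)
    ]


def link(m, i, j):
    """Number of common neighbors of documents i and j."""
    return sum(a * b for a, b in zip(m[i], m[j]))


def blended_similarity(vectors, m, i, j, alpha):
    """Hypothetical blend of the normalized link count and the cosine similarity.

    Assumption: alpha in [0, 1]; the link count is normalized by the number
    of documents, which is its largest possible value.
    """
    n = len(vectors)
    return alpha * link(m, i, j) / n + (1.0 - alpha) * cosine(vectors[i], vectors[j])
```

A clustering routine such as bisecting k-means could then call `blended_similarity` wherever it would otherwise call `cosine`, so that two documents count as similar not only when they share terms but also when they share neighbors.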
Cite this Reference:
Abdolkarim Elahi, Reza Javanmard Alitappeh and Ali Shokouhi Rostami. Improvement Tfidf for News Document Using Efficient Similarity. Research Journal of Applied Sciences, Engineering and Technology, (19): 3592-3600.
ISSN (Online): 2040-7467
ISSN (Print): 2040-7459