Research Article | OPEN ACCESS
A Study of Authorship Attribution in English and Tamil Emails
A. Pandian¹ and Mohamed Abdul Karim²
¹Department of MCA, SRM University, Chennai-603 203, India
²College of Applied Sciences, Sohar, Ministry of Higher Education, Oman
Research Journal of Applied Sciences, Engineering and Technology 2014 2:203-211
Received: March 14, 2014 | Accepted: April 22, 2014 | Published: July 10, 2014
Abstract
The aim of our study is to identify the authors of unknown emails written in Tamil and English. Recent approaches to authorship attribution show that, apart from lexical measures, some other features of written language are considerably effective as discriminators of author style. However, there have been no attempts to compare the attribution potential of these features. The aim of the present study, then, is to examine the effectiveness of several style-markers in authorship attribution for two languages, English and Tamil. Equally important, we compare the usefulness of the chosen style-markers across the two languages. The results show that high attribution effectiveness can be achieved in both languages.
Keywords:
Echo state neural network, English emails, Fisher's linear discriminant method, lexical features, radial basis function, syntactic features, Tamil emails
Competing interests
The authors have no competing interests.
Open Access Policy
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
ISSN (Online): 2040-7467
ISSN (Print): 2040-7459