Volume : I, Issue : VIII, September - 2011

Document clustering for Information Retrieval – A General Perspective

P.Prabhu

Published By : Laxmi Book Publication

Abstract :

Information Retrieval (IR) is an emerging subfield of information science concerned with the representation, storage, access, and retrieval of information. Current research areas within IR include searching and querying, ranking of search results, navigating and browsing information, optimizing information representation and storage, and document classification and clustering. The primary objective of this paper is to understand how document clustering can be used to improve information retrieval. The paper first discusses, in easy steps, a method for clustering documents for information retrieval, introducing various types of web/electronic repositories. Second, it explains the steps involved in preprocessing the documents. Third, it discusses clustering methods, especially the k-means algorithm, for clustering documents.
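The pipeline outlined in the abstract (preprocess the documents, then cluster them with k-means) can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's implementation: the stop-word list, the smoothed TF-IDF weighting, the farthest-point seeding, the four sample documents, and all function names (preprocess, tfidf_vectors, kmeans) are assumptions made for this example.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real systems use larger lists).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "on"}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Represent each document as an L2-normalized sparse TF-IDF vector."""
    tokenized = [preprocess(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (one common variant, chosen here for illustration).
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    return sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in set(a) | set(b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        for term, w in v.items():
            total[term] += w
    return {t: w / len(vectors) for t, w in total.items()}


def kmeans(vectors, k, max_iter=20):
    """Plain k-means; returns one cluster label per input vector."""
    # Deterministic farthest-point seeding (an illustrative choice, not from the paper).
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:  # converged: assignments no longer change
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = mean_vector(members)
    return labels


# Hypothetical sample documents, invented for this example.
docs = [
    "Cats and dogs are popular pets",
    "Dogs chase cats in the garden",
    "Investors buy shares on the stock market",
    "The stock market fell and investors sold shares",
]
labels = kmeans(tfidf_vectors(docs), k=2)
print(labels)
```

In this toy run the two pet-related documents share terms with each other but not with the two finance-related documents, so k-means separates them into two groups. A real system would add steps such as stemming and would usually rely on a tested library implementation rather than hand-written loops.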

Cite This Article :

P. Prabhu (2011). Document clustering for Information Retrieval – A General Perspective. Indian Streams Research Journal, Vol. I, Issue VIII. http://oldisrj.lbp.world/UploadedData/504.pdf

Creative Commons License
Indian Streams Research Journal by Laxmi Book Publication is licensed under a Creative Commons Attribution 4.0 International License.