Volume: I, Issue: VIII, September 2011
Document clustering for Information Retrieval – A General Perspective
P. Prabhu
Published By: Laxmi Book Publication

Abstract: Information Retrieval (IR) is an emerging subfield of information science concerned with the representation, storage, access, and retrieval of information. Current research areas within IR include searching and querying, ranking of search results, navigating and browsing information, optimizing information representation and storage, and document classification and clustering. The primary objective of this paper is to understand how document clustering can be used to improve information retrieval. The paper first presents a step-by-step method for clustering documents for information retrieval, introducing various types of web and electronic repositories. Second, it explains the steps involved in preprocessing the documents. Third, clustering methods, especially the k-means algorithm, are discussed
for clustering documents.

Cite This Article: P. Prabhu (2011). Document clustering for Information Retrieval – A General Perspective. Indian Streams Research Journal, Vol. I, Issue VIII. http://oldisrj.lbp.world/UploadedData/504.pdf
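
The abstract names two stages, document preprocessing and k-means clustering, without giving details on this page. The following is a minimal sketch of how such a pipeline might look, not the paper's own procedure. The stop-word list, the term-frequency weighting, the fixed initial centroids, and all function names (`preprocess`, `tf_vectors`, `kmeans`) are assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Assumed, illustrative stop-word list; a real system would use a larger one.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "for", "to", "on"}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tf_vectors(docs):
    """Build unit-length term-frequency vectors over a shared vocabulary."""
    tokenized = [preprocess(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vec = [float(counts[t]) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, init_indices, max_iter=100):
    """Plain k-means (Lloyd's iteration).

    Centroids start at the vectors named by init_indices (an assumption
    chosen here so that the example is deterministic).
    """
    centroids = [list(vectors[i]) for i in init_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


docs = [
    "Search engines rank query results",
    "Ranking of search results for a query",
    "Hospitals store patient records",
    "Patient records in the hospital database",
]
labels = kmeans(tf_vectors(docs), init_indices=[0, 2])
print(labels)
```

In this toy corpus the two search-related documents share one label and the two hospital-related documents share the other. In practice k-means is sensitive to the choice of initial centroids, so random restarts or a seeding heuristic are commonly used in place of the fixed indices assumed here.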