DISCOVERING WEB DOCUMENT CLUSTERING USING WEIGHTED SCORE MATRIX AND FUZZY LOGIC

Pranali R. Raut, Prof. Nilesh R. Khochare

Abstract


In computer analysis, examiners usually have to work through a large number of files, and much of the data in those files is unstructured, which makes it difficult to examine. Document clustering addresses this problem and can yield new and useful knowledge from the documents under analysis. Clustering algorithms in current use have disadvantages related to data preparation and outliers. The main idea of this work is to convert web documents into clustered documents using data preprocessing, feature extraction, and weighted score matrix techniques.
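The abstract does not specify how each stage is implemented, so the following is only a minimal sketch of the kind of pipeline it describes, under stated assumptions: TF-IDF weights stand in for the weighted score matrix, the fuzzy step uses the standard fuzzy c-means membership formula against fixed seed centroids, and all names (`preprocess`, `weighted_score_matrix`, `fuzzy_memberships`, `STOPWORDS`) are hypothetical rather than taken from the paper.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not the authors' list).
STOPWORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "for"}


def preprocess(text):
    """Strip tag-like markup, lowercase, tokenize, and drop stopwords."""
    text = re.sub(r"<[^>]+>", " ", text)
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def weighted_score_matrix(docs):
    """Return (vocab, matrix); matrix[i][j] is a TF-IDF weight for term j in doc i.

    TF-IDF is an assumed weighting scheme used here for illustration only.
    """
    tokenized = [preprocess(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    matrix = []
    for toks in tokenized:
        tf = Counter(toks)
        total = len(toks) or 1
        row = [
            (tf[t] / total) * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t in vocab
        ]
        matrix.append(row)
    return vocab, matrix


def fuzzy_memberships(matrix, centroids, m=2.0):
    """Fuzzy c-means membership of each row in each centroid (fuzzifier m > 1)."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for row in matrix:
        dists = [math.dist(row, c) for c in centroids]
        zeros = [i for i, d in enumerate(dists) if d == 0.0]
        if zeros:
            # A row that coincides with a centroid belongs fully to it.
            memberships.append(
                [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(dists))]
            )
            continue
        inv = [d ** -exponent for d in dists]
        total = sum(inv)
        memberships.append([v / total for v in inv])
    return memberships


docs = [
    "<p>Web crawler fetches web documents</p>",
    "The crawler indexes web documents",
    "Fuzzy logic assigns membership degrees",
    "Fuzzy membership in logic systems",
]
vocab, matrix = weighted_score_matrix(docs)
# Use documents 0 and 2 as seed centroids (an arbitrary choice for the demo).
memberships = fuzzy_memberships(matrix, [matrix[0], matrix[2]])
```

Each row of `memberships` sums to 1, so a document can belong partly to several clusters instead of being forced into exactly one. A full fuzzy c-means run would also update the centroids iteratively; that step is omitted here.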


Keywords


Web Documents, Web Crawler, Fuzzy Logic, Weighted Score Matrix, Feature Extraction, Clustered Document.




Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright © 2017 INTERNATIONAL EDUCATION AND RESEARCH JOURNAL