DEPLOYING AND RESEARCHING HADOOP ALGORITHMS ON VIRTUAL MACHINES AND ANALYZING LOG FILES

Authors

  • Suyash S. Sathe
  • Ankit N. Pokharna
  • Akshay D. Lagad
  • Karan R. Hule

Abstract

Log-based analysis of user behavior in big data environments is attractive to industry because it can reveal the behavior of potential customers and thereby improve profitability. However, user behavior is dynamic, so it is difficult to capture a user’s comprehensive behavior on a single device by collecting a static dataset. In particular, growth in the number of users, network traffic, and network services brings many challenges, such as fast data collection, processing, and storage. Therefore, in this paper we propose and implement a log analysis system based on the Hadoop distributed platform. The system captures traffic and analyzes user and machine behavior in terms of search keywords, user shopping trends, website posts and replies, and web browsing history, in order to acquire users’ dynamic behavior. To evaluate the system, we capture logs from the systems, and the results show that it can capture users’ long-term behavior and acquire user behavior in detail. In computer log management and intelligence, log analysis (or system and network log analysis) is an art and science that seeks to make sense of computer-generated records (also called log or audit trail records). The process of creating such records is called data logging.

References

Mahout: Scalable machine-learning and data-mining library. http://mahout.apache.org, 2010.

http://hadoop.apache.org/

http://pig.apache.org/

Published

15-03-2016

How to Cite

Suyash S. Sathe, Ankit N. Pokharna, Akshay D. Lagad, & Karan R. Hule. (2016). DEPLOYING AND RESEARCHING HADOOP ALGORITHMS ON VIRTUAL MACHINES AND ANALYZING LOG FILES. International Education and Research Journal (IERJ), 2(3). Retrieved from https://ierj.in/journal/index.php/ierj/article/view/153