BD4111 Big Data Computing Laboratory Syllabus – Anna University PG Syllabus Regulation 2021

COURSE OBJECTIVES:

 To set up single-node and multi-node Hadoop clusters.
 To solve Big Data problems using the MapReduce technique.
 To learn NoSQL queries.
 To design algorithms that use the MapReduce technique on unstructured and structured data.
 To learn scalable machine learning using Mahout.

LIST OF EXPERIMENTS:

1. Set up a pseudo-distributed, single-node Hadoop cluster backed by the Hadoop Distributed File System (HDFS), running on Ubuntu Linux. After successful installation on one node, configure a multi-node Hadoop cluster (one master and multiple slaves).
2. MapReduce application for word counting on a Hadoop cluster.
3. Load unstructured data into a NoSQL data store and perform operations such as NoSQL queries through the API.
4. K-means clustering using MapReduce.
5. PageRank computation.
6. Use of the Mahout machine learning library to build up knowledge in big data analysis.
7. Application of recommendation systems using Hadoop/Mahout libraries.
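
The data flow behind Experiment 2 can be previewed without a cluster. What follows is a minimal plain-Java sketch, not a Hadoop program: it imitates the map, shuffle, and reduce phases in memory so that the role of each phase is visible. The class name WordCountSketch and all of its method names are hypothetical and chosen only for illustration; an actual Hadoop job would instead be built on Hadoop's Mapper and Reducer classes and would read its input from HDFS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of the MapReduce word-count data flow (no Hadoop dependency).
// All names here are hypothetical and exist only for illustration.
class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every token in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                pairs.add(Map.entry(token, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by key, with keys kept in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for one key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: run map over every line, shuffle the pairs, then reduce each group.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(emitted).entrySet()) {
            counts.put(group.getKey(), reduce(group.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("big data on Hadoop", "Hadoop stores big data");
        System.out.println(wordCount(lines));
    }
}
```

In this sketch the three phases run one after another in a single process. On a cluster, the map and reduce calls would run in parallel on different nodes, and the framework would perform the grouping step between them.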

HARDWARE/SOFTWARE REQUIREMENTS:

1. Java
2. Hadoop
3. Mahout
4. HBase/MongoDB

COURSE OUTCOMES:

CO1: Set up single-node and multi-node Hadoop clusters.
CO2: Apply the MapReduce technique to various algorithms.
CO3: Design new algorithms that use MapReduce on unstructured and structured data.
CO4: Develop scalable machine learning algorithms for various Big Data applications using Mahout.
CO5: Represent NoSQL data.

TOTAL: 30 PERIODS

REFERENCES:

1. Kristina Chodorow, “MongoDB: The Definitive Guide – Powerful and Scalable Data Storage”, O’Reilly, 3rd Edition, 2019.
2. Lars George, “HBase: The Definitive Guide”, O’Reilly, 2015.
3. Tom White, “Hadoop: The Definitive Guide – Storage and Analysis at Internet Scale”, O’Reilly, 4th Edition, 2015.
4. Robin Anil, Sean Owen, Ellen G. Friedman, Ted Dunning, “Mahout in Action”, Manning Publications, 2011.