BD4071 High Performance Computing for Big Data Syllabus:

BD4071 High Performance Computing for Big Data Syllabus – Anna University PG Syllabus Regulation 2021

COURSE OBJECTIVES:

• To learn the fundamental concepts of High Performance Computing.
• To learn the network and software infrastructure for high performance computing.
• To understand real-time analytics using high performance computing.
• To learn the different security perspectives and technologies used in HPC.
• To understand the emerging big data applications.

UNIT I INTRODUCTION

The Emerging IT Trends – IoT/IoE – Apache Hadoop for big data analytics – Big data into big insights and actions – Emergence of BDA discipline – Strategic implications of big data – BDA challenges – HPC paradigms – Cluster computing – Grid computing – Cloud computing – Heterogeneous computing – Mainframes for HPC – Supercomputing for BDA – Appliances for BDA.

UNIT II NETWORK & SOFTWARE INFRASTRUCTURE FOR HIGH PERFORMANCE BDA

Design of network infrastructure for high performance BDA – Network virtualization – Software Defined Networking – Network Functions Virtualization – WAN optimization for transfer of big data – Getting started with SANs – Storage infrastructure requirements for storing big data – FC SAN – IP SAN – NAS – GFS – Panasas – Lustre file system – Introduction to cloud storage.

UNIT III REAL TIME ANALYTICS USING HIGH PERFORMANCE COMPUTING

Technologies that support real-time analytics – MOA: Massive Online Analysis – GPFS: General Parallel File System – Client case studies – Key distinctions – Machine data analytics – Operational analytics – HPC architecture models – In-database analytics – In-memory analytics.

UNIT IV SECURITY AND TECHNOLOGIES

Security, privacy and trust for user-generated content: the challenges and solutions – Role of real-time big data processing in the IoT – End-to-end security framework for big sensing data streams – Clustering in big data.

UNIT V EMERGING BIG DATA APPLICATIONS

Deep learning accelerators – Accelerators for clustering applications in machine learning – Accelerators for classification algorithms in machine learning – Accelerators for big data genome sequencing.

TOTAL: 45 PERIODS

COURSE OUTCOMES:

Upon completion of the course, the student should be able to:
CO1: Understand the basic concepts of high performance computing systems.
CO2: Apply the concepts of network and software infrastructure for high performance computing.
CO3: Perform real-time analytics using high performance computing.
CO4: Apply the security models and big data applications in high performance computing.
CO5: Understand the emerging big data applications.

REFERENCES:

1. Pethuru Raj, Anupama Raman, Dhivya Nagaraj and Siddhartha Duggirala, “High-Performance Big-Data Analytics: Computing Systems and Approaches”, Springer, 1st Edition, 2015.
2. Kuan-Ching Li, Hai Jiang and Albert Y. Zomaya, “Big Data Management and Processing”, CRC Press, 1st Edition, 2017.
3. Chao Wang, “High Performance Computing for Big Data: Methodologies and Applications”, CRC Press, 1st Edition, 2018.
4. Khosrow Hassibi, “High-Performance Data Mining and Big Data Analytics”, CreateSpace Independent Publishing Platform, 1st Edition, 2014.
5. Thomas Sterling and Matthew Anderson, “High Performance Computing: Modern Systems and Practices”, Morgan Kaufmann Publishers, 1st Edition, 2017.

WEB REFERENCES:

1. https://www.hpcwire.com/

ONLINE RESOURCES:

1. http://hpc.fs.uni-lj.si/sites/default/files/HPC_for_dummies.pdf
2. https://www.nics.tennessee.edu/computing-resources/what-is-hpc