BD4071 High Performance Computing for Big Data Syllabus – Anna University PG Syllabus Regulation 2021
COURSE OBJECTIVES:
To learn the fundamental concepts of High Performance Computing.
To learn the network & software infrastructure for high performance computing.
To understand real time analytics using high performance computing.
To learn the different security perspectives and technologies used in HPC.
To understand the emerging big data applications.
UNIT I INTRODUCTION
The Emerging IT Trends – IoT/IoE – Apache Hadoop for big data analytics – Big data into big insights and actions – Emergence of BDA discipline – Strategic implications of big data – BDA Challenges – HPC paradigms – Cluster computing – Grid Computing – Cloud computing – Heterogeneous computing – Mainframes for HPC – Supercomputing for BDA – Appliances for BDA.
UNIT II NETWORK & SOFTWARE INFRASTRUCTURE FOR HIGH PERFORMANCE BDA
Design of Network Infrastructure for high performance BDA – Network Virtualization – Software Defined Networking – Network Functions Virtualization – WAN optimization for transfer of big data – Getting started with SANs – Storage infrastructure requirements for storing big data – FC SAN – IP SAN – NAS – GFS – Panasas – Lustre file system – Introduction to cloud storage.
UNIT III REAL TIME ANALYTICS USING HIGH PERFORMANCE COMPUTING
Technologies that support Real time analytics – MOA: Massive online analysis – GPFS: General parallel file system – Client case studies – Key distinctions – Machine data analytics – Operational analytics – HPC Architecture models – In-database analytics – In-memory analytics.
UNIT IV SECURITY AND TECHNOLOGIES
Security, Privacy and Trust for user-generated content: The challenges and solutions – Role of real time big data processing in the IoT – End to End Security Framework for big sensing data streams – Clustering in big data.
UNIT V EMERGING BIG DATA APPLICATIONS
Deep learning Accelerators – Accelerators for clustering applications in machine learning – Accelerators for classification algorithms in machine learning – Accelerators for Big data Genome Sequencing.
TOTAL: 45 PERIODS
COURSE OUTCOMES:
Upon completion of the course, the student should be able to:
CO1: Understand the basic concepts of High Performance Computing systems.
CO2: Apply the concepts of network and software infrastructure for high performance computing.
CO3: Apply real time analytics using high performance computing.
CO4: Apply the security models and big data applications in high performance computing.
CO5: Understand the emerging big data applications.
REFERENCES:
1. Pethuru Raj, Anupama Raman, Dhivya Nagaraj and Siddhartha Duggirala, “High-Performance Big-Data Analytics: Computing Systems and Approaches”, Springer, 1st Edition, 2015.
2. Kuan-Ching Li, Hai Jiang and Albert Y. Zomaya, “Big Data Management and Processing”, CRC Press, 1st Edition, 2017.
3. Chao Wang, “High Performance Computing for Big Data: Methodologies and Applications”, CRC Press, 1st Edition, 2018.
4. Khosrow Hassibi, “High-Performance Data Mining and Big Data Analytics”, CreateSpace Independent Publishing Platform, 1st Edition, 2014.
5. Thomas Sterling and Matthew Anderson, “High Performance Computing: Modern Systems and Practices”, Morgan Kaufmann Publishers, 1st Edition, 2017.
WEB REFERENCES:
1. https://www.hpcwire.com/
ONLINE RESOURCES:
1. http://hpc.fs.uni-lj.si/sites/default/files/HPC_for_dummies.pdf
2. https://www.nics.tennessee.edu/computing-resources/what-is-hpc