BN4012 Natural Language Processing (NLP) Syllabus – Anna University PG Syllabus Regulation 2021

OBJECTIVES:

➢ To familiarize students with the concepts and fundamentals of Natural Language Processing

UNIT – I INTRODUCTION TO TEXT MINING

Introduction to text mining; NLP – Purpose of using text for analysis – Text mining vs NLP – Challenges in NLP – Syntax and Semantics – Introduction to language models – NLP methods & Workflow – Applications of NLP.

UNIT – II FUNDAMENTALS OF TEXT MINING

Data extraction – Introduction to text pre-processing – Steps in text pre-processing – Tokenization – Stop Words removal – Removing HTML tags, emojis, smileys etc. – Stemming & Lemmatization – Text vectorization and the document-term matrix (DTM) – TF-IDF and Topic modeling – Text visualization.
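As an illustration of the pre-processing and vectorization steps listed in this unit, the following is a minimal sketch using only the Python standard library. The stop-word list, the two sample documents, and the function names (`preprocess`, `document_term_matrix`, `tf_idf`) are assumptions made for this example, and the weighting uses the plain idf = log(N / df) variant; libraries such as scikit-learn apply smoothed variants.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "is", "and", "of", "on"}

def preprocess(text):
    """Strip HTML tags, lowercase, tokenize, and drop stop words."""
    text = re.sub(r"<[^>]+>", " ", text)          # remove HTML tags
    tokens = re.findall(r"[a-z]+", text.lower())  # alphabetic tokens only, so smileys vanish
    return [t for t in tokens if t not in STOP_WORDS]

def document_term_matrix(docs):
    """Return (vocabulary, rows) where rows[i][j] counts term j in document i."""
    tokenized = [preprocess(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    rows = []
    for doc in tokenized:
        counts = Counter(doc)
        rows.append([counts[term] for term in vocab])
    return vocab, rows

def tf_idf(rows):
    """Weight raw counts by idf = log(N / document frequency)."""
    n_docs = len(rows)
    n_terms = len(rows[0])
    df = [sum(1 for row in rows if row[j] > 0) for j in range(n_terms)]
    return [[row[j] * math.log(n_docs / df[j]) for j in range(n_terms)]
            for row in rows]

# Hypothetical sample documents.
docs = ["<p>The cat sat on the mat</p>", "The dog sat on the log :)"]
vocab, dtm = document_term_matrix(docs)
weights = tf_idf(dtm)
```

A term that occurs in every document (here "sat") receives a weight of zero, which is the behaviour that motivates TF-IDF over raw counts.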

UNIT – III WORD EMBEDDING

Introduction to Word embedding – Limitations of count vectorizers – Cosine Similarity – Co-occurrence Matrix – Pre-trained word embedding – Applications of word embedding – Vector space models – Manipulating words in Vector spaces.
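Two of the topics above, the co-occurrence matrix and cosine similarity, can be sketched in a few lines of standard-library Python. The sample sentence, the window size of 1, and the function names are assumptions chosen for illustration; pre-trained embeddings would replace the count rows used here as word vectors.

```python
import math
from collections import defaultdict

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def cooccurrence_matrix(tokens, window=1):
    """Count how often each word pair appears within `window` positions of each other."""
    counts = defaultdict(int)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[(word, tokens[j])] += 1
    vocab = sorted(set(tokens))
    matrix = [[counts[(w, c)] for c in vocab] for w in vocab]
    return vocab, matrix

# Hypothetical sample text.
tokens = "i like nlp i like python".split()
vocab, matrix = cooccurrence_matrix(tokens, window=1)

# Each row of the matrix acts as a crude count-based vector for one word.
vec = dict(zip(vocab, matrix))
similarity = cosine_similarity(vec["nlp"], vec["python"])
```

Words that appear in similar contexts end up with similar rows, so their cosine similarity is high even though the words never occur next to each other.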

UNIT – IV SENTIMENT ANALYSIS WITH LOGISTIC REGRESSION

Supervised ML & Sentiment Analysis – Vocabulary & Feature Extraction – Negative and Positive Frequencies – Feature Extraction with Frequencies – Preprocessing – Logistic Regression – Training and Testing – Visualizing Tweets and building Logistic Regression Models.
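The workflow named in this unit (positive and negative word frequencies as features, followed by logistic regression) might look like the sketch below. The four labelled "tweets", the learning rate, the epoch count, and all function names are assumptions made for this example; it uses plain batch gradient descent rather than any particular library's solver.

```python
import math
from collections import Counter

# Hypothetical labelled tweets: 1 = positive, 0 = negative.
train = [
    ("i love this happy day", 1),
    ("great fun love it", 1),
    ("i hate this sad day", 0),
    ("awful sad hate it", 0),
]

def build_freqs(samples):
    """Count how often each word appears under each label."""
    freqs = Counter()
    for text, label in samples:
        for word in text.split():
            freqs[(word, label)] += 1
    return freqs

def extract_features(text, freqs):
    """Map a text to [bias, summed positive frequency, summed negative frequency]."""
    words = text.split()
    pos = sum(freqs[(w, 1)] for w in words)
    neg = sum(freqs[(w, 0)] for w in words)
    return [1.0, float(pos), float(neg)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(X, y, lr=0.05, epochs=200):
    """Batch gradient descent on the logistic (cross-entropy) loss."""
    theta = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for xi, yi in zip(X, y):
            error = sigmoid(sum(t * x for t, x in zip(theta, xi))) - yi
            for j, x in enumerate(xi):
                grad[j] += error * x
        theta = [t - lr * g / len(X) for t, g in zip(theta, grad)]
    return theta

def predict(text, freqs, theta):
    """Return 1 (positive) or 0 (negative) for a text."""
    features = extract_features(text, freqs)
    z = sum(t * x for t, x in zip(theta, features))
    return 1 if sigmoid(z) >= 0.5 else 0

freqs = build_freqs(train)
X = [extract_features(text, freqs) for text, _ in train]
y = [label for _, label in train]
theta = train_logistic_regression(X, y)
```

Reducing each tweet to three numbers keeps the model easy to inspect and to plot, at the cost of ignoring word order and negation.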

UNIT – V APPLICATION OF PYTHON FOR NLP

Application of Python for NLP – Processing Text files – PDF files – Utilizing regular expressions for pattern searching – Tokenization using Python – Vocabulary matching with spaCy – Part-of-speech tagging – Entity recognition – Chatbots – Natural language generation.
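For the regular-expression and tokenization topics in this unit, a minimal sketch with Python's built-in `re` module is given below. The sample text and the patterns are assumptions for illustration; the email pattern in particular is deliberately simplistic and not a full address validator.

```python
import re

# Hypothetical sample text.
text = "Contact support@example.com before 2024-01-15 or call 555-0100."

# Dates in YYYY-MM-DD form; groups capture year, month, and day.
date_pattern = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")

# A deliberately simple email pattern.
email_pattern = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

dates = [m.group(0) for m in date_pattern.finditer(text)]
date_parts = [m.groups() for m in date_pattern.finditer(text)]
emails = email_pattern.findall(text)

def tokenize(sentence):
    """Split into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)

tokens = tokenize("Hello, world!")
```

Compiling a pattern once with `re.compile` and reusing it is convenient when the same search runs over many text files.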

TOTAL: 45 PERIODS

COURSE OUTCOMES:

➢ Appreciate the significance and importance of NLP in business analytics.

REFERENCES:

1. Natural Language Processing with Python, Steven Bird, Ewan Klein, and Edward Loper, O’Reilly, 2009.
2. Text Analytics with Python, Dipanjan Sarkar, Apress, 2019.
3. Speech and Language Processing, Dan Jurafsky and James H. Martin, Prentice Hall, 2009.
4. Introduction to Information Retrieval, Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze, Cambridge University Press, 2008.
5. Deep Learning for Natural Language Processing, Palash Goyal, Sumit Pandey, and Karan Jain, Apress, 2018.
6. Machine Learning for Text, Charu C. Aggarwal, Springer International Publishing, 2023.
7. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, Aurélien Géron, O’Reilly, 2019.
8. Practical Natural Language Processing (Chapter 9), Sowmya Vajjala, Bodhisattwa Majumder, Anuj Gupta, and Harshit Surana, O’Reilly, 2020.