Natural Language Processing
Course Overview
| Course Information | | | |
|---|---|---|---|
| Instructor: | Edwin Puertas, PhD. | Email: | epuerta@utb.edu.co |
| Office: | AL-304 | School: | School of Digital Transformation |
| Hours: | 3 hours per week | Credits: | 4 |
| Modality: | Face-to-face | Methodology: | Lectures - Theoretical |
Course Purpose
Develop theoretical and practical competencies in Natural Language Processing (NLP) so that students can apply machine learning techniques and models to analyze, understand, and generate text in varied contexts, and can solve organizational and research problems with an ethical and sustainable approach.
Specific Objectives
- Understand the theoretical foundations of NLP, including language modeling, sentiment analysis, vector semantics, and neural networks
- Implement supervised and unsupervised learning models using Python, NLTK, and TensorFlow
- Evaluate and optimize NLP models through appropriate metrics and hyperparameter adjustments
- Develop NLP projects in teams, applying communication, collaboration, and critical-thinking skills
Course Content
| Module 1: Fundamentals of NLP | Module 2: Machine Learning & NLP | Module 3: Applications and Ethics |
|---|---|---|
Course Methodology
The learning process is supported by four main activities:
Thematic Presentations
Synthesis of topics presented by the professor, enriched with valuable contributions and insights.
Student Assignments
Individual activities validating students' understanding and preparation of course materials.
Workshops
Group activities reinforcing learning through practical application of concepts and techniques.
Exams
Individual evaluations measuring learning progress throughout the course.
What is NLP?
Natural Language Processing (NLP), or Computational Linguistics, is concerned with theoretical and practical issues in the design and implementation of computer systems for processing human languages.
NLP Fundamentals
Language Processing Levels
Phonetics
The study of speech sounds, examining how sounds are produced, transmitted, and perceived
Phonology
The study of sound systems and how sounds function within a particular language
Morphology
The study of word formation, examining morphemes (the smallest meaningful units of language)
Syntax
The study of sentence structure and arrangement of words and phrases
Semantics
The study of meaning in words and sentences, including lexical and compositional semantics
Pragmatics
The study of language use in social contexts, considering factors like speaker intention and implied meaning
Key NLP Components
| Component | Description |
|---|---|
| Text Preprocessing | Preparing raw text for analysis by transforming it into machine-readable format |
| Feature Extraction | Converting raw text into numerical representations |
| POS Tagging | Labeling each word with its part of speech (noun, verb, adjective, and so on) |
| Named Entity Recognition | Identifying and classifying named entities such as people, locations, and dates |
| Coreference Resolution | Identifying when different words refer to the same entity |
| Parsing | Analyzing grammatical structure to extract meaning |
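The first two components in the table, text preprocessing and feature extraction, can be illustrated with a minimal sketch that uses only the Python standard library. The stop-word list and the function names `preprocess` and `bag_of_words` are assumptions made for this example; a real project would more likely rely on a library such as NLTK or spaCy.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list used only for this example.
STOP_WORDS = {"the", "a", "an", "is", "on", "and", "of", "in"}

def preprocess(text):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def bag_of_words(tokens):
    """Map each token to the number of times it occurs (a simple numeric feature)."""
    return Counter(tokens)

tokens = preprocess("The cat sat on the mat, and the cat slept.")
features = bag_of_words(tokens)
print(tokens)           # ['cat', 'sat', 'mat', 'cat', 'slept']
print(features["cat"])  # 2
```

The regular expression is a crude tokenizer: it ignores punctuation and only handles unaccented Latin letters, which is enough to show the idea but not to process real multilingual text.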
Fundamental Techniques & Algorithms
| Technique/Algorithm | Description |
|---|---|
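| Tokenization | Dividing text into smaller units (tokens) such as words, subwords, or sentences |
| Lemmatization/Stemming | Reducing words to a base form: a dictionary lemma (lemmatization) or a truncated stem (stemming) |
| POS Tagging | Assigning a part-of-speech label to each word |
| Dependency Parsing | Identifying the syntactic relationships between the words of a sentence |
| Bag of Words | Representing a document by its word counts, ignoring word order |
| TF-IDF | Weighting a word by its frequency in a document, discounted by how many documents contain it |
| Word Embeddings | Dense vector representations of words that capture similarity in meaning |
| N-gram Models | Predicting the next word from the preceding n-1 words |
| Pre-trained Models | Models trained on large text corpora and reused or fine-tuned for downstream tasks |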
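As an illustration of the TF-IDF row, the sketch below computes the plain, unsmoothed weighting: term frequency (count divided by document length) multiplied by the logarithm of the number of documents over the number of documents containing the term. The function name `tf_idf` and the toy documents are assumptions for this example; library implementations typically add smoothing and normalization, so their numbers will differ.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    tf  = occurrences of the term in the document / tokens in the document
    idf = log(number of documents / documents containing the term)
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document

    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Toy corpus, already tokenized.
docs = [
    ["nlp", "is", "fun"],
    ["nlp", "is", "hard"],
    ["cats", "sleep"],
]
scores = tf_idf(docs)
print(round(scores[0]["fun"], 3))  # 0.366
print(round(scores[0]["nlp"], 3))  # 0.135
```

In the first document, "fun" scores higher than "nlp" because it appears in only one document, while "nlp" appears in two of the three.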
Practical Applications of NLP
Sentiment Analysis
Determining opinions or emotions expressed in text
Chatbots
Automated conversations with users
Machine Translation
Translating text between languages
Information Extraction
Identifying and extracting relevant data from large text corpora
Text Generation
Automatically creating original text
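Text generation can be sketched at toy scale with a bigram model, the simplest n-gram model: count which word follows each word, then repeatedly append the most frequent follower. The names `train_bigrams` and `generate` and the nine-word corpus are assumptions for this example; modern systems use neural language models, and even n-gram generators usually sample from the follower distribution rather than picking greedily.

```python
from collections import Counter, defaultdict

def train_bigrams(tokens):
    """Count, for each word, which words follow it in the training tokens."""
    followers = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        followers[current][nxt] += 1
    return followers

def generate(followers, start, max_words=4):
    """Greedily extend `start` with the most frequent follower at each step.

    Stops after `max_words` words or when the last word has no known follower.
    """
    words = [start]
    while len(words) < max_words and words[-1] in followers:
        next_word = followers[words[-1]].most_common(1)[0][0]
        words.append(next_word)
    return " ".join(words)

corpus = "the cat sat on the mat the cat slept".split()
model = train_bigrams(corpus)
print(generate(model, "sat"))  # sat on the cat
```

Because the choice is greedy, the output is deterministic and quickly becomes repetitive on larger corpora; sampling followers in proportion to their counts is the usual next step.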
Bibliography
- Jurafsky, D., & Martin, J. H. (2020). Speech and Language Processing (3rd ed. draft).
- Beysolow II, T. (2018). Applied Natural Language Processing with Python: Implementing Machine Learning and Deep Learning Algorithms for Natural Language Processing. Apress.
- Vajjala, S., Majumder, B., Gupta, A., & Surana, H. (2020). Practical Natural Language Processing: A Comprehensive Guide to Building Real-World NLP Systems. O'Reilly Media.
- Srinivasa-Desikan, B. (2018). Natural Language Processing and Computational Linguistics: A practical guide to text analysis with Python, Gensim, spaCy, and Keras. Packt Publishing Ltd.