Course Outline
Comprehensive training syllabus
- Introduction to NLP
- Core concepts of NLP
- Key NLP frameworks
- Commercial implementations of NLP
- Web data scraping techniques
- Utilizing APIs to extract textual data
- Managing and storing text corpora, including content and relevant metadata
- Benefits of using Python and a crash course on NLTK
- Practical insights into Corpora and Datasets
- The necessity of corpora
- Corpus analysis methods
- Various data attribute types
- Different file formats for storing corpora
- Preparing datasets for NLP applications
- Understanding Sentence Structure
- NLP components
- Natural language comprehension
- Morphological analysis: stems, words, tokens, and part-of-speech tags
- Syntactic analysis
- Semantic analysis
- Managing ambiguity
- Text Data Preprocessing
- Corpus: raw text
- Sentence tokenization
- Stemming for raw text
- Lemmatization of raw text
- Removal of stop words
- Corpus: raw sentences
- Word tokenization
- Word lemmatization
- Working with Term-Document/Document-Term matrices
- Tokenizing text into n-grams and sentences
- Customized and practical preprocessing strategies
- Analysis of Text Data
- Basic NLP features
- Parsers and parsing techniques
- Part-of-speech (POS) tagging and taggers
- Named Entity Recognition
- N-grams
- Bag of Words
- Statistical features of NLP
- Linear algebra concepts for NLP
- Probabilistic theories in NLP
- TF-IDF
- Vectorization
- Encoders and Decoders
- Normalization
- Probabilistic models
- Advanced feature engineering and NLP
- Foundations of word2vec
- Key components of the word2vec model
- Underlying logic of the word2vec model
- Extensions of the word2vec concept
- Practical applications of the word2vec model
- Case study: Applying the Bag of Words model to automatic text summarization, using both a simplified and the original version of Luhn's algorithm
- Document Clustering, Classification, and Topic Modeling
- Document clustering and pattern mining (including hierarchical clustering, k-means, and other methods)
- Comparing and classifying documents using TF-IDF weighting with Jaccard and cosine distance metrics
- Document classification using Naïve Bayes and Maximum Entropy models
- Identifying Key Text Elements
- Dimensionality reduction: Principal Component Analysis, Singular Value Decomposition, and non-negative matrix factorization
- Topic modeling and information retrieval via Latent Semantic Analysis
- Entity Extraction, Sentiment Analysis, and Advanced Topic Modeling
- Sentiment degrees: positive versus negative
- Item Response Theory
- Part-of-speech tagging applications: identifying people, places, and organizations in text
- Advanced topic modeling: Latent Dirichlet Allocation
- Case studies
- Mining unstructured user reviews
- Sentiment classification and visualization of product review data
- Analyzing search logs to identify usage patterns
- Text classification
- Topic modeling
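To give a flavour of several topics from the outline (word tokenization, Bag of Words, TF-IDF, and cosine similarity), here is a minimal sketch in plain Python. It is an illustration only, not course material: the function names, the regular-expression tokenizer, the unsmoothed `log(N / df)` weighting, and the sample sentences are all assumptions made for this example. Libraries such as NLTK typically provide more robust tokenizers and weighting variants.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def tf_idf(docs):
    """Return one {term: weight} dictionary per document.

    Weight = (term count / document length) * log(N / document frequency).
    This is one common unsmoothed variant, chosen here for brevity.
    """
    bags = [Counter(tokenize(d)) for d in docs]  # Bag of Words per document
    n_docs = len(bags)
    # Document frequency: the number of documents in which each term occurs.
    df = Counter(term for bag in bags for term in bag)
    vectors = []
    for bag in bags:
        total = sum(bag.values())
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in bag.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample corpus for illustration.
docs = [
    "The cat sat on the mat",
    "The cat lay on the rug",
    "Stock prices fell sharply today",
]
vectors = tf_idf(docs)
sim_related = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])
print(sim_related, sim_unrelated)
```

The first two sentences share several terms and therefore score higher than the unrelated pair, which shares none. Note that with this weighting a term that appears in every document gets a weight of zero.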
Requirements
Familiarity with NLP principles and an understanding of how AI is applied in business contexts.
21 Hours
Testimonials (1)
Individual support