Course Outline
Foundations of Audio Classification
- Types of sound events: environmental, mechanical, and human-generated
- Use case overview: surveillance, monitoring, and automation
- Differentiating audio classification, detection, and segmentation
Audio Data and Feature Extraction
- Audio file types and formats
- Key considerations: sampling rate, windowing, and frame size
- Extracting MFCCs, chroma features, and mel-spectrograms
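The relationship between sampling rate, frame size, and windowing can be illustrated with a minimal pure-Python sketch. The 16 kHz sampling rate, 25 ms frame, 10 ms hop, and the helper name `frame_signal` are illustrative assumptions, not values prescribed by the course; in practice a library such as librosa would normally handle this step before computing MFCCs or mel-spectrograms.

```python
import math

def frame_signal(samples, frame_size, hop_size):
    """Split samples into overlapping frames and apply a Hann window to each."""
    # Periodic Hann window: tapers frame edges to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    frames = []
    # Only full frames are kept; trailing samples that do not fill a frame are dropped.
    for start in range(0, len(samples) - frame_size + 1, hop_size):
        frame = samples[start:start + frame_size]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames

sample_rate = 16000                    # assumed sampling rate in Hz
frame_size = sample_rate * 25 // 1000  # 25 ms frame -> 400 samples
hop_size = sample_rate * 10 // 1000    # 10 ms hop   -> 160 samples

# One second of a 440 Hz sine wave as stand-in audio.
samples = [math.sin(2 * math.pi * 440 * n / sample_rate)
           for n in range(sample_rate)]

frames = frame_signal(samples, frame_size, hop_size)
print(len(frames), len(frames[0]))
```

Each windowed frame would then typically be passed to an FFT and a mel filter bank to obtain one column of a mel-spectrogram.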
Data Preparation and Annotation
- Working with UrbanSound8K, ESC-50, and custom datasets
- Annotating sound events and defining temporal boundaries
- Strategies for dataset balancing and audio augmentation
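One common balancing strategy is to weight classes inversely to their frequency so that rare sound events contribute more to the training loss. The sketch below is a minimal illustration under that assumption; the function name `inverse_frequency_weights` and the toy label list are hypothetical, and resampling or augmentation are equally valid alternatives.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Return a weight per class: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}

# Toy, imbalanced label list (hypothetical clip annotations).
labels = ["siren"] * 2 + ["dog_bark"] * 6 + ["drilling"] * 4

weights = inverse_frequency_weights(labels)
for label, weight in sorted(weights.items()):
    print(label, round(weight, 3))
```

With this scheme a class that appears at exactly the average frequency receives a weight of 1, rarer classes receive more, and more common classes receive less.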
Building Audio Classification Models
- Applying convolutional neural networks (CNNs) to audio data
- Input formats: raw waveforms versus extracted features
- Choosing loss functions and evaluation metrics, and managing overfitting
Event Detection and Temporal Localization
- Detection strategies: frame-based and segment-based approaches
- Post-processing techniques: thresholding and smoothing
- Visualizing predictions on audio timelines
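Frame-based detection followed by smoothing and thresholding can be sketched as follows. This is a minimal illustration, assuming a model has already produced one probability per frame; the helper names, the median filter width of 3, the 0.5 threshold, and the 0.5 s hop are all hypothetical choices.

```python
def median_smooth(probs, width=3):
    """Median-filter frame probabilities; the window shrinks at the edges."""
    half = width // 2
    smoothed = []
    for i in range(len(probs)):
        window = sorted(probs[max(0, i - half): i + half + 1])
        smoothed.append(window[len(window) // 2])
    return smoothed

def probs_to_events(probs, threshold, hop_seconds):
    """Turn per-frame probabilities into (onset, offset) times in seconds."""
    events = []
    onset = None
    for i, p in enumerate(probs):
        active = p >= threshold
        if active and onset is None:
            onset = i                      # event starts at this frame
        elif not active and onset is not None:
            events.append((onset * hop_seconds, i * hop_seconds))
            onset = None                   # event ended before this frame
    if onset is not None:                  # event still active at the end
        events.append((onset * hop_seconds, len(probs) * hop_seconds))
    return events

# Hypothetical per-frame probabilities with a one-frame dropout (index 4)
# and a one-frame spurious spike (index 8).
frame_probs = [0.1, 0.2, 0.9, 0.8, 0.1, 0.9, 0.7, 0.2, 0.9, 0.1, 0.1]

smoothed = median_smooth(frame_probs, width=3)
events = probs_to_events(smoothed, threshold=0.5, hop_seconds=0.5)
print(events)
```

Here the median filter bridges the single-frame dropout and suppresses the isolated spike, so thresholding yields one continuous event instead of three fragments.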
Advanced Topics and Real-Time Processing
- Utilizing transfer learning for scenarios with limited data
- Deploying models via TensorFlow Lite or ONNX
- Considerations for streaming audio processing and latency
Project Development and Application Scenarios
- Designing end-to-end pipelines from data ingestion to classification
- Developing proof-of-concept solutions for surveillance, quality control, or monitoring
- Implementing logging, alerting, and integration with dashboards or APIs
Summary and Next Steps
Requirements
- Foundational understanding of machine learning concepts and model training processes
- Practical experience with Python programming and data preprocessing
- Familiarity with the fundamentals of digital audio
Target Audience
- Data scientists
- Machine learning engineers
- Researchers and developers specializing in audio signal processing
Duration
- 21 hours