AC3348

Machine Learning Security Training

This Machine Learning (ML) Security training course teaches developers the specialized secure coding skills needed to protect ML applications from attack. Attendees learn to identify and remediate security vulnerabilities in their applications and to avoid the security pitfalls of the Python programming language.

Course Details

Duration

4 days

Prerequisites

Python development experience; the course is aimed at developers working on machine learning systems

Skills Gained

  • Understand the basics of cyber security
  • Identify different types of machine learning models
  • Understand the different ways that machine learning can be used in cybersecurity
  • Apply the concepts to real-world problems

Course Outline
  • Cyber Security Basics
    • What is security?
    • Threat and risk
    • Cyber security threat types
    • Consequences of insecure software
      • Constraints and the market
      • The dark side
    • Categorization of bugs
      • The Seven Pernicious Kingdoms
      • Common Weakness Enumeration (CWE)
      • CWE Top 25 Most Dangerous Software Errors
      • Vulnerabilities in the environment and dependencies
  • Cyber Security in Machine Learning
    • ML-specific cyber security considerations
    • What makes machine learning a valuable target?
    • Possible consequences
    • Inadvertent AI failures
    • Some real-world abuse examples
    • ML threat model
      • Creating a threat model for machine learning
      • Machine learning assets
      • Security requirements
      • Attack surface
      • Attacker model – resources, capabilities, goals
      • Confidentiality threats
      • Integrity threats (model)
      • Integrity threats (data, software)
      • Availability threats
      • Dealing with AI/ML threats in software security
  • Using ML in Cyber Security
    • Static code analysis and ML
    • ML in fuzz testing
    • ML in anomaly detection and network security
    • Limitations of ML in security
  • Malicious Use of AI and ML
    • Social engineering attacks and media manipulation
    • Vulnerability exploitation
    • Malware automation
    • Endpoint security evasion
  • Adversarial Machine Learning
    • Threats against machine learning
    • Attacks against machine learning integrity
      • Poisoning attacks
      • Poisoning attacks against supervised learning
      • Poisoning attacks against unsupervised and reinforcement learning
      • Evasion attacks
      • Common white-box evasion attack algorithms
      • Common black-box evasion attack algorithms
      • Transferability of poisoning and evasion attacks
    • Some defense techniques against adversarial samples
      • Adversarial training
      • Defensive distillation
      • Gradient masking
      • Feature squeezing
      • Using reformers on adversarial data
      • Caveats about the efficacy of current adversarial defenses
      • Simple practical defenses
    • Attacks against machine learning confidentiality
      • Model extraction attacks
      • Defending against model extraction attacks
      • Model inversion attacks
      • Defending against model inversion attacks
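The white-box evasion attacks above can be illustrated with a toy example. The sketch below uses a linear classifier, where the gradient of the score with respect to the input is simply the weight vector, so an FGSM-style step perturbs each feature against the sign of the corresponding weight. All numbers are illustrative.

```python
# Toy linear classifier: sign(w . x) decides the class. The gradient of the
# score w.r.t. the input is just w, so an FGSM-style evasion step moves each
# feature by eps against the sign of its weight.
w = [1.0, -2.0, 0.5]

def score(x):
    return sum(wi * xi for wi, xi in zip(w, x))

x = [0.5, 0.1, 0.2]      # score(x) = 0.4 -> classified positive
eps = 0.5

sign = lambda v: 1.0 if v > 0 else -1.0
x_adv = [xi - eps * sign(wi) for xi, wi in zip(x, w)]
# score(x_adv) = 0.4 - eps * (|1.0| + |-2.0| + |0.5|) = -1.35 -> class flips
```

Real attacks (e.g., against deep networks) follow the same principle but obtain the gradient by backpropagation and usually clip the perturbation to stay imperceptible.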
  • Denial of Service
    • Denial of Service
    • Resource exhaustion
    • Cash overflow
    • Flooding
    • Algorithm complexity issues
    • Denial of service in ML
      • Accuracy reduction attacks
      • Denial-of-information attacks
      • Catastrophic forgetting in neural networks
      • Resource exhaustion attacks against ML
      • Best practices for protecting availability in ML systems
  • Input Validation Principles
    • Blacklists and whitelists
    • Data validation techniques
    • What to validate – the attack surface
    • Where to validate – defense in depth
    • How to validate – validation vs transformations
    • Output sanitization
    • Encoding challenges
    • Validation with regex
    • Regular expression denial of service (ReDoS)
    • Dealing with ReDoS
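The ReDoS topics above come down to one rule: never run untrusted input through a regex with nested or ambiguous quantifiers. A minimal sketch, with an illustrative username policy:

```python
import re

# A pattern like (a+)+$ backtracks catastrophically on input such as
# "a" * 30 + "b" -- matching time grows exponentially with input length.
# Safer: an anchored allowlist character class with a bounded quantifier.
USERNAME = re.compile(r"[A-Za-z0-9_]{1,32}")

def valid_username(s: str) -> bool:
    # fullmatch anchors both ends, so no partial or embedded match slips by
    return USERNAME.fullmatch(s) is not None
```

Bounded quantifiers also cap the work the engine can do, which doubles as a length check on the validated field.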
  • Injection
    • Injection principles
    • Injection attacks
    • SQL injection
    • SQL injection basics
    • Attack techniques
    • Content-based blind SQL injection
    • Time-based blind SQL injection
    • SQL injection best practices
    • Input validation
    • Parameterized queries
    • Additional considerations
    • SQL injection and ORM
    • Code injection
      • Code injection via input()
      • OS command injection
    • General protection best practices
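The parameterized-query defense covered above can be sketched in a few lines with the standard `sqlite3` module (table and data are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

def find_user(name: str):
    # The ? placeholder binds `name` as data, never as SQL text, so a
    # payload like "' OR '1'='1" is just an unmatched literal string.
    return conn.execute(
        "SELECT role FROM users WHERE name = ?", (name,)
    ).fetchone()
```

Building the same query by string concatenation would let the classic `' OR '1'='1` payload rewrite the WHERE clause; the bound parameter makes that structurally impossible.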
  • Integer Handling Problems
    • Representing signed numbers
    • Integer visualization
    • Integers in Python
    • Integer overflow
    • Integer overflow with ctypes and NumPy
    • Other numeric problems
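The contrast between Python's arbitrary-precision integers and the fixed-width types reached through `ctypes` can be shown in two lines:

```python
import ctypes

# Python's built-in int is arbitrary precision, so 2**31 simply works:
big = 2 ** 31                              # 2147483648, no overflow

# A fixed-width C type wraps around silently at the same boundary:
wrapped = ctypes.c_int32(2 ** 31).value    # -2147483648
```

NumPy's fixed-width dtypes behave like the `ctypes` case, which is why arithmetic that is safe in pure Python can silently overflow once data enters an ML pipeline.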
  • Files and Streams
    • Path traversal
    • Path traversal-related examples
    • Additional challenges in Windows
    • Virtual resources
    • Path traversal best practices
    • Format string issues
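The path traversal best practices above boil down to canonicalize-then-check. A minimal sketch, with a hypothetical upload root:

```python
import os

BASE = os.path.realpath("/var/app/uploads")  # hypothetical upload root

def safe_resolve(user_path: str) -> str:
    # realpath resolves symlinks and ".." segments; only then can we
    # check that the result is still inside BASE.
    full = os.path.realpath(os.path.join(BASE, user_path))
    if os.path.commonpath([BASE, full]) != BASE:
        raise ValueError("path escapes the upload root")
    return full
```

Checking before canonicalization (or with a naive string prefix test, which `/var/app/uploads-evil` would pass) is the classic mistake this pattern avoids.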
  • Unsafe Native Code
    • Native code dependence
    • Best practices for dealing with native code
  • Input Validation in Machine Learning
    • Misleading the machine learning mechanism
    • Sanitizing data against poisoning and RONI
    • Code vulnerabilities causing evasion, misprediction, or misclustering
    • Typical ML input formats and their security
  • Security Features
    • Authentication
      • Authentication basics
      • Multi-factor authentication
      • Authentication weaknesses – spoofing
      • Password management
    • Information exposure
      • Exposure through extracted data and aggregation
      • Privacy violation
      • System information leakage
      • Information exposure best practices
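For the password management topic above, the standard-library sketch is a per-user random salt plus a deliberately slow KDF; the iteration count here follows current OWASP guidance for PBKDF2-HMAC-SHA256 but is otherwise an illustrative choice:

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # slow on purpose; tune to your hardware budget

def hash_password(password: str, salt=None):
    # A fresh random salt per user defeats precomputed (rainbow) tables.
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    # compare_digest runs in constant time, avoiding a timing side channel
    return hmac.compare_digest(candidate, digest)
```

Dedicated schemes such as argon2 or bcrypt (via third-party packages) are generally preferred where available; PBKDF2 is the portable stdlib fallback.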
  • Time and State
    • Race conditions
      • File race condition
      • Avoiding race conditions in Python
    • Mutual exclusion and locking
      • Deadlocks
    • Synchronization and thread safety
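The mutual exclusion material above can be demonstrated with the classic shared-counter race: `counter += 1` is a read-modify-write sequence, not an atomic operation, so concurrent increments must be guarded.

```python
import threading

counter = 0
lock = threading.Lock()

def add(n):
    global counter
    for _ in range(n):
        # Without the lock, two threads can read the same old value and
        # each write back old+1, losing an increment.
        with lock:
            counter += 1

threads = [threading.Thread(target=add, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# counter is now exactly 40_000
```

Using `with lock:` (rather than manual acquire/release) also guarantees the lock is released if the critical section raises, which is one way deadlocks creep in.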
  • Errors
    • Error handling
      • Returning a misleading status code
      • Information exposure through error reporting
    • Exception handling
      • In the except block – and now what?
      • Empty catch block
      • The danger of assert statements
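The danger of `assert` statements in the outline above is that they vanish entirely when Python runs with optimization (`python -O`), so they must never carry security checks. A minimal sketch with an illustrative function:

```python
def withdraw(balance, amount):
    # WRONG: assert 0 < amount <= balance  -- stripped under `python -O`,
    # silently disabling the check in production.
    # RIGHT: an explicit check that survives every interpreter mode.
    if amount <= 0 or amount > balance:
        raise ValueError("invalid withdrawal amount")
    return balance - amount
```

`assert` remains fine for internal invariants during development and testing; it is only input validation and access control that must use real conditionals.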
  • Using Vulnerable Components
    • Assessing the environment
    • Hardening
    • Malicious packages in Python
    • Vulnerability management
      • Patch management
      • Vulnerability management
      • Bug bounty programs
      • Vulnerability databases
      • Vulnerability rating – CVSS
      • DevOps, the build process and CI / CD
      • Dependency checking in Python
    • ML Supply Chain Risks
      • Common ML system architectures
      • ML system architecture and the attack surface
      • Protecting data in transit – transport layer security
      • Protecting data in use – homomorphic encryption
      • Protecting data in use – differential privacy
      • Protecting data in use – multi-party computation
    • ML frameworks and security
      • General security concerns about ML platforms
      • TensorFlow security issues and vulnerabilities
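A concrete instance of the ML supply chain risks above is model files distributed as pickles: `pickle.load` on an untrusted file can execute arbitrary code, because a pickle may reference any importable callable. One documented mitigation is overriding `Unpickler.find_class`; the strict all-or-nothing policy below is an illustrative choice.

```python
import io
import pickle

class SafeUnpickler(pickle.Unpickler):
    # A malicious "model" pickle can embed a reference to e.g. os.system
    # that runs during loading. Refusing every global restricts loading
    # to plain built-in containers and scalars.
    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")

def load_plain(data: bytes):
    return SafeUnpickler(io.BytesIO(data)).load()
```

In practice, prefer model formats that carry no code at all (e.g., plain weight arrays) and verify artifact integrity (signatures, hashes) before loading anything.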
  • Cryptography for Developers
    • Cryptography basics
    • Cryptography in Python
    • Elementary algorithms
      • Random number generation
      • Hashing
    • Confidentiality protection
      • Symmetric encryption
    • Homomorphic encryption
      • Basics of homomorphic encryption
      • Types of homomorphic encryption
      • FHE in machine learning
    • Integrity protection
      • Message Authentication Code (MAC)
      • Digital signature
    • Public Key Infrastructure (PKI)
      • Some further key management challenges
      • Certificates
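The Message Authentication Code topic above is available directly in the standard library; the key here is an illustrative constant, where a real deployment would fetch it from a key management service:

```python
import hashlib
import hmac

KEY = b"shared-secret-key"  # illustrative; use a random key from a KMS

def sign(message: bytes) -> bytes:
    # HMAC-SHA256 tag: proves integrity and authenticity to anyone
    # holding the same key.
    return hmac.new(KEY, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes) -> bool:
    # compare_digest prevents timing attacks on tag comparison
    return hmac.compare_digest(sign(message), tag)
```

Unlike a digital signature, a MAC is symmetric: anyone who can verify can also forge, which is why the key must stay shared only between the two communicating parties.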
  • Security Testing
    • Security testing methodology
      • Security testing – goals and methodologies
      • Overview of security testing processes
      • Threat modeling
    • Security testing techniques and tools
      • Code analysis
      • Dynamic analysis
  • Wrap Up
    • Secure coding principles
      • Principles of robust programming by Matt Bishop
      • Secure design principles of Saltzer and Schroeder
    • And now what?
      • Software security sources and further reading
      • Python resources
      • Machine learning security resources