Building Conversational AI Applications Training

Conversational AI is the technology that powers automated messaging and speech-enabled applications. It is used across diverse industries to improve the customer experience and the efficiency of customer service. Conversational AI pipelines are complex and expensive to develop from scratch. In this course, you’ll learn how to build conversational AI services using the NVIDIA® Riva framework. With Riva, developers can create customized language-based AI services for intelligent virtual assistants, virtual customer service agents, real-time transcription, multi-user diarization, chatbots, and much more.

In this workshop, you’ll learn how to quickly build and deploy a conversational AI pipeline that includes transcription, natural language processing (NLP), and speech. You’ll explore automatic speech recognition (ASR) and text-to-speech (TTS) models and their customization in detail with the NVIDIA NeMo framework, and learn how to deploy the models with Riva. Finally, you’ll explore performance and scaling considerations for production-level deployment of Riva services with Helm charts and Kubernetes clusters.

Course Details


Duration: 1 day

Prerequisites:
  • Basic Python programming experience
  • Fundamental understanding of a deep learning framework, such as TensorFlow, PyTorch, or Keras
  • Basic understanding of neural networks

Skills Gained

  • How to customize and deploy ASR and TTS models on Riva.
  • How to build and deploy an end-to-end conversational AI pipeline, including ASR, NLP, and TTS models, on Riva.
  • How to deploy a production-level conversational AI application with a Helm chart for scaling in Kubernetes clusters.

Course Outline

  • Introduction
  • Introduction to Conversational AI
    • Explore the conversational AI landscape and gain a deeper understanding of the key components of ASR pipelines.
    • Work through an ASR model example from audio to spectrogram to text.
    • Explore decoders, customizations, and additional models, including inverse text normalization (ITN), punctuation and capitalization, and language identification.
    • Deploy Riva ASR.
  • Customized Conversational AI Pipelines
    • Explore the key components of the TTS pipeline and full pipeline customizations.
    • Explore the spectrogram generator model and the vocoder model.
    • Work with text normalization and grapheme-to-phoneme (G2P) conversion to customize pronunciations.
    • Deploy a full ASR-NLP-TTS custom pipeline in Riva.
  • Inference and Deployment Challenges
    • Explore challenges related to performance, optimization, and scaling in production deployment of conversational AI applications.
    • Gain an understanding of the inference deployment process.
    • Analyze non-functional requirements and their implications.
    • Use a Helm chart to deploy a conversational AI application on a Kubernetes cluster.
  • Final Review
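
As a taste of the "audio to spectrogram" step listed under Introduction to Conversational AI, the following is a minimal sketch in plain Python. It is not taken from the course material and does not use the Riva or NeMo APIs. The function names (`hann`, `spectrogram`), the frame length, the hop size, and the synthetic test tone are all assumptions chosen for illustration. Production ASR front ends typically use an optimized FFT and mel-scaled features rather than this naive transform.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n (illustrative helper)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(samples, frame_len=64, hop=32):
    """Magnitude spectrogram via a naive DFT (hypothetical helper).

    Returns one list per frame holding the magnitudes of frequency
    bins 0 .. frame_len // 2.
    """
    window = hann(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        # Taper the frame so its edges do not smear energy across bins.
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Assumed test signal: a pure 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone_hz = 1000
samples = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)]

spec = spectrogram(samples)

# The strongest bin in the first frame should sit at the tone frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(len(spec), "frames;", "peak at", peak_hz, "Hz")
```

Each row of `spec` is one time slice and each column is one frequency bin. An acoustic model consumes a sequence of such slices, and a decoder then turns the model output into text.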