Building Multimodal RAG Applications with Google Gemini
This course shows how to use Google Gemini in real-world applications and gives you practical experience accessing and applying its capabilities.
This course introduces Google Gemini, a family of multimodal large language models developed by Google.
You’ll start by learning about LLMs, the evolution of Google Gemini, its architecture and APIs, and its diverse capabilities. Next, you’ll complete hands-on exercises using Gemini models for unimodal and multimodal text generation. You’ll then learn how the retrieval-augmented generation (RAG) process works with Gemini and LangChain, and implement a RAG application that generates textual responses from unimodal prompts and an external knowledge source. Finally, you’ll develop a customer service assistant application with a Streamlit interface that integrates RAG and Gemini for multimodal prompting with combined image and text inputs.
After completing this course, you will have in-depth knowledge of how to use Google Gemini for unimodal and multimodal prompting in real-world AI-based applications.