How to create a Multimodal Retail Recommendations system using Gemini Pro Vision
Jan 16, 2024
Gemini Pro Vision Model
Gemini Pro Vision is a multimodal large language model (LLM) developed by Google.
It’s part of the larger Gemini family of models, which are known for their ability to handle a variety of tasks across different modalities, including text, images, and videos.
In the video below, you will learn:
- How to use the Gemini Pro Vision model for visual understanding
- How to account for multimodality when prompting the Gemini Pro Vision model
- How the Gemini Pro Vision model can be used to build retail recommendation applications out of the box
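To make the multimodal prompting point concrete, here is a minimal sketch of how a retail recommendation prompt might be assembled. It assumes the model accepts an ordered list that mixes text and images, and that each image is labeled by the text placed right before it. The function name `build_retail_prompt`, the label wording, and the placeholder strings standing in for images are all hypothetical and are not taken from the video.

```python
def build_retail_prompt(room_image, chair_images):
    """Interleave text labels with images so each image has a name the model can refer to.

    room_image and chair_images are whatever image objects your SDK expects;
    this sketch only fixes the ORDER of the parts, not their type.
    """
    prompt = [
        "Consider the following chairs and the living room:",
        "room:",
        room_image,
    ]
    # Put a short text label immediately before each candidate product image.
    for index, chair in enumerate(chair_images, start=1):
        prompt.append(f"chair {index}:")
        prompt.append(chair)
    # The actual question goes last, after all the visual context.
    prompt.append(
        "For each chair, explain whether it would suit the style of the room."
    )
    return prompt


# Placeholder strings stand in for real image objects here.
parts = build_retail_prompt("<room image>", ["<chair image A>", "<chair image B>"])
print(len(parts))
```

Labeling every image and asking the question at the end is a design choice in this sketch: it gives the model a way to refer to "chair 1" or "chair 2" in its answer.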
Video Link: https://www.youtube.com/watch?v=mzMfPMV_xSk
Steps:
- Task 1. Open Python Notebook and Install Packages
- Task 2. Use the Gemini Pro Vision model
- Task 3. Visual understanding with Gemini Pro Vision
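The three tasks above could look roughly like the sketch below. It assumes the Vertex AI SDK for Python (`google-cloud-aiplatform`) and a Google Cloud project with Vertex AI enabled; the post does not say which SDK the video uses, so treat that as an assumption. The helper names `mime_type_for` and `describe_image` are hypothetical, and the model name `gemini-pro-vision` is the one current when this post was written, so it may no longer be available.

```python
# Task 1 (run once in a notebook cell, assuming the Vertex AI SDK):
#   pip install --upgrade google-cloud-aiplatform
import os

# Small lookup for the image types used in this sketch (hypothetical helper data).
MIME_TYPES = {".jpg": "image/jpeg", ".jpeg": "image/jpeg", ".png": "image/png"}


def mime_type_for(uri: str) -> str:
    """Return the MIME type for an image URI based on its file extension."""
    extension = os.path.splitext(uri)[1].lower()
    if extension not in MIME_TYPES:
        raise ValueError(f"Unsupported image type: {extension!r}")
    return MIME_TYPES[extension]


def describe_image(project_id: str, location: str, image_uri: str, question: str) -> str:
    """Ask the model a question about one image stored in Cloud Storage."""
    # Imports are deferred so this file loads even without the SDK installed.
    import vertexai
    from vertexai.generative_models import GenerativeModel, Part

    # Task 2: initialise the SDK and load the model.
    vertexai.init(project=project_id, location=location)
    model = GenerativeModel("gemini-pro-vision")

    # Task 3: send the image together with a text question.
    image_part = Part.from_uri(image_uri, mime_type=mime_type_for(image_uri))
    response = model.generate_content([image_part, question])
    return response.text
```

In a notebook you might then call `describe_image("my-project", "us-central1", "gs://my-bucket/room.jpg", "Describe this room.")`, where the project ID, region, and bucket path are placeholders for your own values.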
Done!
If you want to know more, you can refer to the below docs
Thank you :)