This application allows users to:
- Upload images and detect objects using Google Cloud Vision, Vertex AI AutoML, and the Gemini API.
- Search images by the tags (object labels) that are saved in Firestore from those detection results.
- Upload videos and perform label detection using the Google Cloud Video Intelligence API.
- Image Upload & Object Detection
  - Vision API: Object localization
  - Vertex AI: AutoML object detection
  - Gemini: Multimodal vision using Gemini 2.0
- Image Search
  - Search previously uploaded images using tag-based metadata from detections.
- Video Analysis
  - Upload a video and detect labels using Google Cloud's Video Intelligence API.
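
To get a feel for what the Vision and Video Intelligence features return, the same APIs can be called from the Google Cloud SDK shell once they are enabled in your project. This is only a sketch for exploration, not how the app itself calls the APIs; the function names, `IMAGE_PATH`, and `VIDEO_URI` are hypothetical placeholders.

```shell
# Hypothetical inputs (assumptions): a local image and a video already in Cloud Storage.
IMAGE_PATH="./sample.jpg"
VIDEO_URI="gs://your-bucket-name/sample.mp4"

# Vision API object localization on a single image.
try_object_localization() {
  gcloud ml vision detect-objects "$IMAGE_PATH"
}

# Video Intelligence label detection on a single video.
try_video_labels() {
  gcloud ml video detect-labels "$VIDEO_URI"
}
```

Calling `try_object_localization` or `try_video_labels` prints the raw API response, which shows the object names and labels that the app stores as tags.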
Before running the app, make sure to set up the following:
- ✅ Enable APIs in GCP:
- 🔑 Create Gemini API key from:
- 🔗 Link GCP project to Firebase:
  - Use Firebase Console to associate your GCP project.
- 🛠️ Install Google Cloud SDK
- 🧠 Upload and train your model in Vertex AI:
  - Vertex AI Training Guide
  - PASCAL VOC 2008 dataset: https://www.kaggle.com/datasets/sulaimannadeem/pascal-voc-2008
- 🪣 Create a Cloud Storage bucket (used for image storage)
- 🔐 Create and update Secret Manager entries for:
  - GCS_BUCKET_NAME
  - VERTEX_PROJECT_ID
  - VERTEX_ENDPOINT_ID
  - VERTEX_LOCATION (e.g., us-central1)
  - GEMINI_API_KEY
- 📄 Set PROJECT_ID in app.yaml to your GCP project number
- 🔥 Create Firestore database (used to store image metadata and search results)
  - Create the default database in Firestore and make sure it uses Native mode.
- Enable App Engine in GCP. Make sure the service account that is created has the Secret Manager Admin and Storage Admin roles.
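
Most of the setup steps above can also be scripted from the Google Cloud SDK shell. The following is a minimal sketch, not the project's own setup script. The list of service names is inferred from the products this README mentions, and `PROJECT_ID`, `BUCKET_NAME`, `REGION`, `FIRESTORE_LOCATION`, and the function names are placeholders introduced here for illustration.

```shell
# Placeholder values (assumptions): replace with your own before running.
PROJECT_ID="your-project-id"
BUCKET_NAME="your-bucket-name"
REGION="us-central1"
FIRESTORE_LOCATION="nam5"   # assumed multi-region; pick the location you need

# Enable the APIs for the products named in this README (the list is an assumption).
enable_apis() {
  gcloud services enable \
    vision.googleapis.com \
    aiplatform.googleapis.com \
    videointelligence.googleapis.com \
    firestore.googleapis.com \
    secretmanager.googleapis.com \
    storage.googleapis.com \
    appengine.googleapis.com \
    --project "$PROJECT_ID"
}

# Create the Cloud Storage bucket used for image storage.
create_bucket() {
  gcloud storage buckets create "gs://${BUCKET_NAME}" \
    --location "$REGION" --project "$PROJECT_ID"
}

# Create the default Firestore database in Native mode.
create_firestore() {
  gcloud firestore databases create \
    --location "$FIRESTORE_LOCATION" --type firestore-native \
    --project "$PROJECT_ID"
}

# Create one secret ($1 = name) with an initial version ($2 = value).
# The value is piped to gcloud on stdin via --data-file=-.
create_secret() {
  printf '%s' "$2" | gcloud secrets create "$1" \
    --replication-policy automatic --data-file=- --project "$PROJECT_ID"
}
```

After defining these functions, run `enable_apis`, `create_bucket`, and `create_firestore` once, then call `create_secret` for each entry in the Secret Manager list, for example `create_secret VERTEX_LOCATION us-central1`. To change a value later, `gcloud secrets versions add NAME --data-file=-` adds a new version to an existing secret.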
# In Google Cloud SDK shell
gcloud init # Select the appropriate Google account and project
gcloud app create # Choose the region where the app will be deployed
# In VS Code, go to root directory
gcloud app deploy app.yaml --quiet # Deploy the app
gcloud app browse # Open the deployed application URL
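
The Secret Manager Admin and Storage Admin roles mentioned in the setup list can be granted from the same shell any time after `gcloud app create` has run, because that is when the App Engine default service account is created. This is a sketch under two assumptions: that the app runs as the default service account, and that the account uses the usual `PROJECT_ID@appspot.gserviceaccount.com` address. `PROJECT_ID` and the function name are placeholders.

```shell
# Placeholder (assumption): replace with your project ID.
PROJECT_ID="your-project-id"

# Grant both roles to the App Engine default service account
# (assumed to be PROJECT_ID@appspot.gserviceaccount.com).
grant_app_engine_roles() {
  sa="serviceAccount:${PROJECT_ID}@appspot.gserviceaccount.com"
  for role in roles/secretmanager.admin roles/storage.admin; do
    gcloud projects add-iam-policy-binding "$PROJECT_ID" \
      --member "$sa" --role "$role"
  done
}
```

Run `grant_app_engine_roles` once. If the deployed app cannot read its secrets or the bucket, checking these two bindings in the IAM page is a reasonable first step.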