rdsx.dev
Tue Feb 11 2025


Image Analysis App
Website - https://theoddysey-visual-answering-transformers-m-streamlit-app-vimg1u.streamlit.app/
GitHub - https://github.com/TheODDYSEY/Visual-Answering-Transformers-Model.git
AI
Computer Vision
BLIP
Streamlit
Hugging Face
Transformers
An intelligent image analysis app using BLIP and Streamlit. Upload an image, ask questions, and get AI-powered answers instantly.
Overview
- Visual Answering Transformers Model is a deep learning-powered web app.
- Built using:
- Python
- Streamlit
- Hugging Face Transformers
- Enables users to:
- Upload images
- Ask questions (custom or predefined)
- Receive AI-generated responses
- Powered by Salesforce's BLIP (Bootstrapping Language-Image Pre-training) model.
- Perfect for:
- Researchers
- Developers
- AI Enthusiasts
Features
1. AI-Powered Image Question Answering
- Upload any image (JPEG, PNG).
- Ask natural-language questions about the image.
- Get smart, AI-generated answers using BLIP.
2. Predefined + Custom Questions
- Choose from a list of common queries.
- Or type in your own questions to dive deeper.
3. Real-Time Feedback with Streamlit UI
- Interactive web interface.
- Smooth drag-and-drop image upload.
- Instant AI responses with loading indicators.
4. Optimized for GPU Acceleration
- Detects CUDA availability.
- Uses GPU (if available) for faster processing.
5. Clean, Minimal UI
- Streamlit-powered dashboard.
- Easy for users of all levels to explore image analysis with AI.
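The question-answering and GPU-detection features above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the app's actual code: the checkpoint name `Salesforce/blip-vqa-base` and the helper names `pick_device`, `load_blip`, and `answer_question` are chosen here for illustration.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the helper still works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def load_blip(checkpoint="Salesforce/blip-vqa-base", device=None):
    """Load a BLIP VQA processor and model (checkpoint name is an assumption)."""
    from transformers import BlipForQuestionAnswering, BlipProcessor

    device = device or pick_device()
    processor = BlipProcessor.from_pretrained(checkpoint)
    model = BlipForQuestionAnswering.from_pretrained(checkpoint).to(device)
    model.eval()  # inference only, so disable dropout and similar layers
    return processor, model, device


def answer_question(image, question, processor, model, device="cpu"):
    """Answer a natural-language question about an RGB PIL image."""
    # The processor resizes/normalizes the image and tokenizes the question.
    inputs = processor(image, question, return_tensors="pt").to(device)
    # generate() produces the token ids of the answer.
    output_ids = model.generate(**inputs)
    return processor.decode(output_ids[0], skip_special_tokens=True)
```

With the model loaded via `processor, model, device = load_blip()`, a call such as `answer_question(image, "What is in the picture?", processor, model, device)` would return a short text answer. The first call to `load_blip` downloads the checkpoint from the Hugging Face Hub, so it needs network access.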
Quick Start
Prerequisites
Ensure Python, pip, and Git are installed; the commands below rely on them.
Cloning the Repository
git clone https://github.com/TheODDYSEY/Visual-Answering-Transformers-Model.git
cd Visual-Answering-Transformers-Model
Installation
pip install -r requirements.txt
Running the App
streamlit run streamlit_app.py
Access the app via http://localhost:8501
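For orientation, the predefined/custom question flow and the loading indicator described under Features could be wired up in a Streamlit script along these lines. This is a hypothetical sketch, not the repository's actual `streamlit_app.py`: the question list, the helper names `resolve_question` and `render_app`, and the `answer_fn(image, question) -> str` callable that wraps the model are all assumptions made for illustration.

```python
# Hypothetical presets; the app's real list may differ.
PREDEFINED_QUESTIONS = [
    "What is in the image?",
    "What colors are most prominent?",
    "How many people are in the image?",
]


def resolve_question(predefined: str, custom: str) -> str:
    """Prefer the user's typed question; fall back to the selected preset."""
    custom = custom.strip()
    return custom if custom else predefined


def render_app(answer_fn) -> None:
    """Draw the page; `answer_fn(image, question)` must return the answer text."""
    import streamlit as st
    from PIL import Image

    st.title("Image Analysis App")
    uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
    preset = st.selectbox("Choose a question", PREDEFINED_QUESTIONS)
    custom = st.text_input("Or type your own question")

    if uploaded is not None:
        image = Image.open(uploaded).convert("RGB")
        st.image(image, caption="Uploaded image")
        if st.button("Ask"):
            # The spinner is shown while the model call is running.
            with st.spinner("Analyzing image..."):
                answer = answer_fn(image, resolve_question(preset, custom))
            st.success(answer)
```

Calling `render_app(...)` at the bottom of a script and launching that script with `streamlit run` would serve the page on the local port shown above. Keeping `resolve_question` separate from the UI code makes the preset-versus-custom rule easy to test without starting Streamlit.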
Future Enhancements
- Improve inference speed
- Integrate OCR for reading text in images
- Add visualizations for more intuitive feedback