RSS Google AI Blog
StreetReaderAI: Towards making street view accessible via context-aware multimodal AI
Interactive streetscape tools like Google Street View offer virtual exploration, but they remain largely inaccessible to blind and low-vision users because screen readers cannot interpret the imagery. A new prototype, StreetReaderAI, uses multimodal AI to make these immersive experiences inclusive. Developed collaboratively by blind and sighted researchers, it combines context-aware AI with accessible navigation. Key features include real-time audio descriptions of the surroundings and a conversational AI for exploring scenes and local geography. Users navigate with voice commands or keyboard shortcuts and receive spoken feedback about their direction and location.

StreetReaderAI relies on two Gemini-backed subsystems, AI Describer and AI Chat, for scene analysis and interactive Q&A. AI Describer produces either a navigation-focused or a tour-guide-style description, depending on the prompt the user selects. AI Chat lets users ask detailed questions about their current and past views, and it retains memory of the conversation across the session.

A study with blind users showed a positive reception and highlighted the usefulness of both virtual navigation and AI interaction. Participants found AI Chat more engaging than AI Describer and used it six times more frequently. Future work aims at more autonomous AI agents, enhanced route planning, and richer audio feedback for a more immersive experience.
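To make the split between AI Describer and AI Chat more concrete, here is a minimal sketch of how a session might combine a selectable description prompt, directional context, and conversational memory. Everything in it is an assumption for illustration: the prompt texts, the `View` and `ChatSession` names, and the `model` callable (a stand-in for a multimodal model call) are hypothetical and not taken from StreetReaderAI. Image input is omitted, so only the text context is modeled.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical prompt presets; the actual StreetReaderAI prompts are not public here.
PROMPTS = {
    "navigation": "Describe this scene for a blind pedestrian, focusing on navigation and safety.",
    "tour_guide": "Describe this scene like a tour guide, noting landmarks and points of interest.",
}

CARDINALS = ["north", "northeast", "east", "southeast",
             "south", "southwest", "west", "northwest"]


def heading_to_cardinal(heading_deg: float) -> str:
    """Map a compass heading in degrees to one of eight cardinal directions."""
    index = int((heading_deg % 360 + 22.5) // 45) % 8
    return CARDINALS[index]


@dataclass
class View:
    """A street-level view: where the user is and which way they face (assumed fields)."""
    address: str
    heading_deg: float


@dataclass
class ChatSession:
    """Keeps a running transcript so later questions can refer to earlier views."""
    model: Callable[[str], str]  # stand-in for a multimodal model call (assumption)
    history: List[str] = field(default_factory=list)

    def _context(self, view: View) -> str:
        return f"Location: {view.address}. Facing {heading_to_cardinal(view.heading_deg)}."

    def describe(self, view: View, mode: str = "navigation") -> str:
        """Describer-style call: chosen prompt preset plus the current view context."""
        context = self._context(view)
        reply = self.model(f"{PROMPTS[mode]}\n{context}")
        self.history.append(f"[view] {context}")
        self.history.append(f"[description] {reply}")
        return reply

    def ask(self, view: View, question: str) -> str:
        """Chat-style call: the prior transcript is sent along with the new question."""
        context = self._context(view)
        transcript = "\n".join(self.history)
        reply = self.model(f"{transcript}\n{context}\nUser: {question}")
        self.history.append(f"[view] {context}")
        self.history.append(f"[user] {question}")
        self.history.append(f"[assistant] {reply}")
        return reply
```

In this sketch, a call such as `session.describe(View("Main St", 90.0), mode="tour_guide")` followed by `session.ask(...)` at a later view sends the earlier view and description back to the model, which is one simple way to let questions refer to places the user has already passed. A real system would also need to attach imagery and bound the transcript length.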