Media
Every photo and video clip across the project (daily logs, RFIs, punch items, vendor notes, submittals), AI-tagged at upload and searchable in plain English.
What you see
Media aggregates every photo and video clip across the project into one browsable, searchable surface. Source pills filter to a single origin (Daily Log / RFI / Punch / Submittal / Vendor / Meeting / Task / Manual). The view toggle switches between Gallery (40-per-page grid), Timeline (grouped by capture date), and Map (geo-tagged photos, shown when EXIF GPS is present).
How AI search works
When a photo is uploaded from any surface, it first runs through Gemini Flash vision, which extracts a one-sentence description and 3 to 8 construction tags (rebar / drywall / safety_hazard / column / south_wing / …). It then runs through Gemini's multimodal embedding model, which places it in a vector space that text queries can also hit. Because that work happens once at upload, search at retrieval time is a single SQL call, typically under 1 second across thousands of photos. The alternative (running vision on every photo on every search) would take 10+ minutes per query and cost 100× more.
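The index-once, search-many idea above can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the three-dimensional vectors stand in for real embeddings, the in-memory dictionary stands in for the database, and the names `PHOTO_INDEX` and `search` are hypothetical. The actual system runs this step as a SQL call rather than a Python loop.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical index: one embedding per photo, computed once at upload.
# Real embeddings have hundreds of dimensions; these are toy values.
PHOTO_INDEX = {
    "photo_rebar.jpg": [0.9, 0.1, 0.0],
    "photo_drywall.jpg": [0.1, 0.9, 0.1],
    "photo_hazard.jpg": [0.0, 0.2, 0.9],
}

def search(query_embedding, index, top_k=2):
    """Rank stored photos by similarity to an already-embedded text query."""
    scored = [
        (cosine_similarity(query_embedding, vec), photo_id)
        for photo_id, vec in index.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [photo_id for _, photo_id in scored[:top_k]]

# Toy stand-in for the embedding of a query such as "rebar in the slab".
query = [0.8, 0.2, 0.1]
print(search(query, PHOTO_INDEX))
```

At query time only the short text query needs to be embedded; no vision model runs over the stored photos, which is where the speed and cost difference comes from.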
Every control
Search bar: Plain English. Hybrid retrieval (BM25 over description + tags, plus multimodal embedding similarity, fused by RRF).
Source pills: Scope to Daily Log / RFI / Punch / Submittal / Vendor / Meeting / Task / Manual. Each color-tinted by source.
Gallery view: 40 tiles per page, AI tags beneath each thumbnail. Click any tile to open the lightbox.
Timeline view: Same content grouped by capture date, latest first.
Map view: Coordinate pins for photos with EXIF GPS. Geofencing filter ships in v1.1.
Lightbox: Click a tile → full-size image / video with AI description + tag chips + capture metadata (timestamp, GPS, dimensions) on the right.
Load more: Cursor-paginated by captured_at, so new uploads always land at page 1.
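The search bar's fusion step, reciprocal rank fusion (RRF), can be sketched as follows. This is a generic version of the technique, not the product's code: the function name, the photo IDs, and the constant `k=60` (a commonly used default) are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list.

    Each ID scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. Scores are summed across lists.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results for one query from the two retrievers.
bm25_hits = ["p3", "p1", "p7"]       # keyword match on description + tags
embedding_hits = ["p1", "p9", "p3"]  # nearest neighbours in vector space

print(reciprocal_rank_fusion([bm25_hits, embedding_hits]))
# ['p1', 'p3', 'p9', 'p7']
```

RRF only needs rank positions, so the BM25 scores and the embedding similarities never have to be normalized onto a common scale. Photos that both retrievers rank highly (here `p1` and `p3`) rise above photos found by only one.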
What gets indexed
Photos: JPEG, PNG, WebP, HEIC. Videos: MP4, WebM, MOV — for video, the first keyframe is extracted and embedded. EXIF data (captured_at, GPS, dimensions) is parsed client-side before upload. A perceptual hash is computed for soft-dedupe.
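As an illustration of soft-dedupe, here is a minimal sketch of one common perceptual hash, the average hash, compared by Hamming distance. The text does not say which hash the product uses, so the algorithm, the 8×8 grid size, the `max_distance=5` threshold, and all function names are assumptions. The input is a grid of grayscale values that a real pipeline would get by decoding and downscaling the image.

```python
def average_hash(pixels):
    """Average hash: one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits

def hamming_distance(hash_a, hash_b):
    """Number of bit positions where the two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")

def is_near_duplicate(hash_a, hash_b, max_distance=5):
    """Flag two images as likely duplicates (threshold is an assumed value)."""
    return hamming_distance(hash_a, hash_b) <= max_distance

# Toy 8x8 grayscale grids: a gradient, a slightly brighter copy, and its inverse.
original = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
brighter = [[min(255, v + 3) for v in row] for row in original]
inverted = [[255 - v for v in row] for row in original]

print(is_near_duplicate(average_hash(original), average_hash(brighter)))  # True
print(hamming_distance(average_hash(original), average_hash(inverted)))   # 64
```

Unlike a cryptographic hash, this kind of hash changes little when an image is re-encoded or slightly brightened. That is what makes the dedupe "soft": near-identical uploads can be flagged rather than only byte-identical ones.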