1. Home
  2. Help
  3. Workspace
  4. Media
Workspace

Media

Every photo + video clip across the project — daily logs, RFIs, punch items, vendor notes, submittals — AI-tagged at upload and searchable in plain English.

What you see

Media aggregates every photo and video clip across the project into one browsable + searchable surface. Source pills filter to one origin (Daily Log / RFI / Punch / Submittal / Vendor / Manual). View toggles between Gallery (40-per-page grid), Timeline (grouped by capture date), and Map (geo-tagged photos when EXIF GPS is present).

When a photo is uploaded — from any surface — it runs through Gemini Flash vision to extract a one-sentence description and 3-8 construction tags (rebar / drywall / safety_hazard / column / south_wing / …), then through Gemini's multimodal embedding model to land in a vector space text queries can hit. Search at retrieval time is a single SQL call — typically under 1 second across thousands of photos. The alternative (run vision on every photo on every search) would take 10+ minutes per query and cost 100× more.

Every control

Search bar

Plain English. Hybrid retrieval (BM25 over description+tags + multimodal embedding similarity, fused by RRF).

Source pills

Scope to Daily Log / RFI / Punch / Submittal / Vendor / Meeting / Task / Manual. Each color-tinted by source.

Gallery view

40 tiles per page, AI tags beneath each thumbnail. Click any tile to open the lightbox.

Timeline view

Same content grouped by capture date, latest first.

Map view

Coord pins for photos with EXIF GPS. Geofencing filter ships v1.1.

Lightbox

Click a tile → full-size image / video with AI description + tag chips + capture metadata (timestamp, GPS, dimensions) on the right.

Load more

Cursor-paginated by captured_at — new uploads always land at page 1.

What gets indexed

Photos: JPEG, PNG, WebP, HEIC. Videos: MP4, WebM, MOV — for video, the first keyframe is extracted and embedded. EXIF data (captured_at, GPS, dimensions) is parsed client-side before upload. A perceptual hash is computed for soft-dedupe.