Job Description

The company is building a living model of the world that people and machines can talk to. Powered by a proprietary database of over 30 billion posed images and a next-gen digital map, they are developing the spatial intelligence that helps humans and machines understand, navigate, and engage with the physical world.
As a Technical Anchor in their London R&D hub, you will bridge the gap between 3D computer vision and Vision-Language Models (VLMs), creating a unified framework in which machines can reason about their surroundings.
What You’ll Be Doing:
Architect Semantic Grounding: Lead research into cross-modal grounding that connects 3D spatial features with language embeddings.
Scale Understanding Capabilities: Develop algorithms for continuous semantics, allowing 3D maps to evolve and improve situational awareness.
Agentic Frameworks: Build the spatial brain for Embodied AI, enabling robots, drones, and other machines to move into mission-level reasoning.
Mult...
            


Computer Vision Researcher (VLM)
