Microsoft Industry · Research

Applied AI Scientist

CHF 130'000 – 150'000 / year
ZÜRICH
AI-TITLE · MACHINE LEARNING · DEEP LEARNING · COMPUTER VISION · GENERATIVE MODEL · POST-TRAINING · FINETUNING · ML SYSTEMS · LLMS · PYTORCH · TENSORFLOW

Overview

The Spatial AI Lab is part of the Applied Sciences Group, a Microsoft research and development organization dedicated to creating next-generation human-computer interaction technologies by leveraging the most recent AI developments and exploring new hardware capabilities and device form factors. Our team of scientists and engineers has strong expertise in computer vision, multimodal AI, and spatial and embodied AI.

Your main job will be to help create smart systems for new types of agents by training and improving multimodal AI models. In this role you will gain experience in building and using AI models for Microsoft products and large-scale AI systems. You will also have the opportunity to join cutting-edge research with partners such as ETH Zurich, publish in top-tier venues, present at workshops, and mentor students.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Responsibilities

  • Research novel machine learning algorithms and models.
  • Work on pre- and/or post-training of foundational multimodal models.
  • Build data and learning solutions for scalability, efficiency, and performance.
  • Curate training and evaluation datasets and benchmarks.
  • Optimize models for CPUs, GPUs, and NPUs and integrate them into products.
  • Collaborate across Microsoft research and engineering teams.

Qualifications

Required Qualifications:

  • A PhD in Machine Learning / Computer Vision or 3+ years of relevant industry experience.
  • Engineering skills in programming languages such as Python and/or C++.
  • Hands-on experience with modern deep learning frameworks (e.g. PyTorch, TensorFlow, JAX).
  • Self-motivated team player and problem solver who is keen to learn.
  • Ability to present complex technical concepts to a diverse audience.

Preferred Qualifications:

  • Experience in one or more of the following areas:
    • Multimodal models: hands-on experience with pre- and/or post-training of large vision-language models.
    • Techniques such as pruning, distillation, and finetuning.
    • LLMs; large vision-language models (VLMs); video generative models and diffusion algorithms; or action-based transformers and vision-language-action models (VLAs).
  • Large-Scale ML Systems: experience with large-scale machine learning compute systems.
  • Publications: track record of impact, either via research publications at top-tier machine learning or computer vision conferences (NeurIPS, ICML, CVPR, ECCV, ICCV) or via contributions to successful industry initiatives.