About exxa
exxa's mission: Empower businesses and people to run frontier AI models on hardware they own.
We're democratizing access to state-of-the-art generative AI through self-hosted solutions that free companies and individuals from cloud dependencies. Whether on your AI servers, your laptop, or an edge device, we're making frontier-level AI assistants accessible on hardware you control.
To achieve this, we're building a world-class research team to push the boundaries of efficient deployment through advanced model compression, novel architectures optimized for constrained environments, and cutting-edge inference acceleration. We're not just making AI smaller and faster; we're designing it to efficiently leverage hardware resources while maintaining frontier-level capabilities.
If you're excited to work in a fast-paced environment where your research could directly enable AI ownership for millions, we'd love to hear from you!
About the internship
- Duration: 6 months
- Location: Paris/Remote (European timezones)
- Compensation: Competitive internship compensation, with potential for full-time conversion
- Autonomy: You will be expected to work autonomously on your projects, under the supervision of exxa's CTO, Etienne Balit
- Academic collaboration: Opportunity to continue your research through the CIFRE program
What you'll do
- Research state-of-the-art algorithms for speculative decoding and quantized inference
- Train draft models tailored for speculative decoding
- Develop benchmarks and evaluation metrics for draft-model quality and speculative decoding efficiency
- Contribute to publications and open-source projects like vLLM and SGLang
- Contribute to the design of novel model architectures optimized for constrained hardware
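If speculative decoding is new to you, the core loop is easy to sketch: a small draft model cheaply proposes several tokens, and the large target model verifies them, accepting the longest agreeing prefix. The toy sketch below shows a simplified greedy-verification variant; all names and the stand-in "models" are illustrative, not exxa code (production variants verify with one batched target pass and use rejection sampling to match the target distribution).

```python
# Toy sketch of speculative decoding with greedy verification.
# Real systems use a small draft LM and a large target LM; here both
# are stand-in functions mapping a token context to the next token id.

def speculative_step(context, draft_next, target_next, k=4):
    """Propose k tokens with the draft model, then accept the longest
    prefix the target model agrees with, plus one target token."""
    # 1. Draft model proposes k tokens autoregressively (cheap).
    proposal = []
    ctx = list(context)
    for _ in range(k):
        t = draft_next(ctx)
        proposal.append(t)
        ctx.append(t)

    # 2. Target model verifies the proposals (a single batched
    #    forward pass in practice; sequential here for clarity).
    accepted = []
    ctx = list(context)
    for t in proposal:
        expected = target_next(ctx)
        if expected == t:
            accepted.append(t)
            ctx.append(t)
        else:
            accepted.append(expected)  # target's correction ends the step
            break
    else:
        accepted.append(target_next(ctx))  # all accepted: emit a bonus token
    return accepted

# Toy "models": the draft increments the last token; the target does the
# same except it insists every 3rd position is 0, so they sometimes disagree.
draft = lambda ctx: (ctx[-1] + 1) % 10
target = lambda ctx: 0 if len(ctx) % 3 == 0 else (ctx[-1] + 1) % 10
print(speculative_step([1, 2], draft, target, k=4))  # draft accepted until the target corrects
```

When draft and target agree often, each step emits several tokens for roughly the cost of one target forward pass, which is the speedup the internship's draft-model training aims to maximize.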
About you
Required qualifications:
- Currently pursuing or recently completed a Master's in Computer Science, Machine Learning, or a related field from a top-tier engineering school or university. We will prioritize candidates in their final Master's year.
- Strong mathematical foundation and programming skills, preferably in Python, with experience in PyTorch, JAX, or similar ML frameworks
- Proven ability to implement complex algorithms and conduct rigorous experimental validation
- Strong understanding of transformer and LLM architectures and training techniques
Preferred qualifications:
- Experience in LLMs and/or VLMs
- Knowledge of speculative decoding, quantization, or other inference acceleration and model compression techniques
- Experience with HPC infrastructure and distributed training/inference systems
- Familiarity with inference frameworks (vLLM, SGLang, TensorRT-LLM, etc.)
Why you should join us
Join us to bring full AI ownership to everyone! Work on frontier research in efficient model deployment, and solve hard problems with a fun, collaborative team. We're backed by top-tier VCs and offer access to advanced hardware in a low-ego environment where your work makes a real impact.
Application process
To apply, please send an email to careers@withexxa.com with your CV and anything else you think is relevant to your application (e.g. past projects, publications, a motivation letter).