Summer Research Intern
Internship | Palo Alto, CA | Abaka AI
Our Recent Related Work
SuperGPQA (NeurIPS '25) — https://supergpqa.github.io/
ACADREASON — https://arxiv.org/pdf/2510.11652
Objaverse++ — https://arxiv.org/abs/2504.07334
OmniVideoBench — https://arxiv.org/abs/2510.10689
VideoScore2 — https://www.arxiv.org/abs/2509.22799
EditReward (submitted to ICLR '26) — https://arxiv.org/abs/2509.26346
About The Role
We're looking for Summer Research Interns to help build high-quality datasets, benchmarks, and evaluation pipelines across LLMs, vision, video, 3D/4D, multimodal reasoning, agentic systems, and world models.
In this role, you'll work closely with our internal research team and external collaborators from the 2077AI Foundation, contributing to research artifacts that are actively used by leading AI labs and academic groups. This internship is ideal for students passionate about evaluation science, dataset construction, and applied AI research at scale.
Responsibilities
Design and construct high-quality datasets and benchmarks for one or more of the following areas:
LLM reasoning and QA (graduate/PhD-level difficulty)
Vision and vision-language modeling
Video understanding, temporal reasoning, and multimodal QA
3D/4D perception, embodied AI, and spatial reasoning
Evaluate LLMs, VLMs, Video-LLMs, and multimodal models on reasoning, factuality, temporal understanding, and spatial tasks.
Develop and maintain evaluation pipelines, metrics, and quality-control criteria for expert-level data generation.
Analyze model outputs, conduct error taxonomy and failure analysis, and summarize insights for internal reports and research papers.
Support research on long-context modeling, data efficiency, compression strategies, and benchmark standardization.
Contribute to open-source datasets, benchmarks, and public leaderboards in collaboration with the 2077AI Foundation.
Qualifications
Strong background in computer science, artificial intelligence, robotics, data engineering, or related fields.
Hands-on experience with machine learning or multimodal systems, including LLMs, vision models, or video models.
Proficient in Python; experience with PyTorch or similar frameworks.
Strong analytical reasoning skills and ability to reason about model behavior and data quality.
Excellent written and verbal English communication skills.
Preferred Qualifications
Experience with LLM or multimodal evaluation frameworks (e.g., LM Eval Harness, OpenCompass).
Background in computer vision, video understanding, or multimodal learning.
Experience with 3D/4D data pipelines, graphics, or robotics tools (e.g., Blender, COLMAP, PyTorch3D, Open3D).
Familiarity with NeRFs, Gaussian Splatting, SLAM, or embodied AI datasets and simulators.
Experience with video QA, action recognition, or long-context transformer models.
Relevant research experience or publications in top-tier conferences.
Compensation & Benefits
This is a paid internship, with a compensation range of $25–$60 per hour, depending on experience and qualifications. This will be an onsite internship based in our Palo Alto office.
Interns will work directly with experienced researchers, contribute to high-impact open-source benchmarks and datasets, and gain high-ownership experience shaping evaluation pipelines used by real AI teams. Exceptional performance may lead to future consideration for full-time opportunities.