Linxi "Jim" Fan

Senior Research Scientist
Lead of AI Agents

NVIDIA AI

Hello there!

I am a Senior Research Scientist at NVIDIA and Lead of the AI Agents Initiative. My mission is to build generally capable agents across physical worlds (robotics) and virtual worlds (games, simulation). I share insights about AI research and industry extensively on Twitter/X and LinkedIn. Feel free to follow me!

My research explores the bleeding edge of multimodal foundation models, reinforcement learning, computer vision, and large-scale systems. I obtained my Ph.D. at the Stanford Vision Lab, advised by Prof. Fei-Fei Li. Previously, I interned at OpenAI (w/ Ilya Sutskever and Andrej Karpathy), Baidu AI Labs (w/ Andrew Ng and Dario Amodei), and MILA (w/ Yoshua Bengio). I graduated from Columbia University as Valedictorian of the Class of 2016 and received the Illig Medal.

I spearheaded Voyager (the first AI agent that plays Minecraft proficiently and bootstraps its capabilities continuously), MineDojo (open-ended agent learning by watching hundreds of thousands of Minecraft YouTube videos), Eureka (a 5-finger robot hand doing extremely dexterous tasks like pen spinning), and VIMA (one of the earliest multimodal foundation models for robot manipulation). MineDojo won the Outstanding Paper Award at NeurIPS 2022. My work has been widely featured in news media, including The New York Times, Forbes, MIT Technology Review, TechCrunch, WIRED, and VentureBeat.

Fun fact: I was OpenAI’s very first intern in 2016. During that summer, I worked on World of Bits, an agent that perceives the web browser in pixels and outputs keyboard/mouse control. This was well before LLMs became a thing at OpenAI. Good old times!

Featured

Research Highlights

Eureka
GPT-4 writes reward functions to teach a 5-finger robot hand how to do extremely dexterous tasks like pen spinning.
Voyager
LLM-powered agent that masters Minecraft by in-context lifelong learning.
VIMA
Multimodal LLM for robot manipulation; unifies diverse robotics tasks in a single prompting framework.
MineDojo
NeurIPS Outstanding Paper Award✨. Large-scale open-ended agent learning framework in Minecraft.

Media Coverage

Publications

Visit my Google Scholar page for a comprehensive listing!

Pre-Trained Language Models for Interactive Decision-Making
Oral Presentation ✨. Neural Information Processing Systems (NeurIPS), 2022
iGibson 1.0: A Simulation Environment for Interactive Tasks in Large Realistic Scenes
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021
Kernel Approximation Methods for Speech Recognition
Journal of Machine Learning Research (JMLR), 2019
International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2016
Hybrid Ontology-Learning Materials Engineering System for Pharmaceutical Products
Computers & Chemical Engineering Journal, 2017
AIChE Annual Meeting, 2014

Experience

NVIDIA
Research Scientist
Dec 2021 – Present California
  • Conducting bleeding-edge research on foundation models for general-purpose autonomous agents.
  • Leading the MineDojo effort for open-ended agent learning in Minecraft.
  • Mentoring interns on diverse research topics.
  • Collaborating with universities: Stanford, Berkeley, Caltech, MIT, UW, etc.
 
NVIDIA
Research Intern
Jun 2020 – Sep 2020 California
  • Proposed SECANT, a state-of-the-art policy learning algorithm for zero-shot generalization of visual agents to novel environments.
  • Paper published at ICML 2021.
 
Google Cloud AI
Research Intern
Jun 2018 – Sep 2018 California
  • Created SURREAL, an open-source, full-stack, and high-performance distributed reinforcement learning (RL) framework for large-scale robot learning.
  • Paper published at CoRL 2018. Best Presentation Award finalist.
 
Stanford Vision Lab
Ph.D. in Computer Science
Sep 2016 – Sep 2021 California
 
OpenAI
Research Intern
Jun 2016 – Mar 2017 California
  • Co-designed World of Bits, an open-domain platform for teaching AI to use the web browser. World of Bits was part of the OpenAI Universe initiative.
  • Paper published at ICML 2017.
 
Mila-Quebec AI Institute
Research Assistant
Sep 2015 – Mar 2016 Montréal, Quebec, Canada
  • Systematically analyzed and proposed novel variants of the Ladder Network, a strong semi-supervised deep learning technique.
  • Mentored by Turing Award Laureate Yoshua Bengio.
  • Paper published at ICML 2016.
 
Baidu Silicon Valley AI Lab
Research Intern
May 2015 – Sep 2015 California
 
Columbia University
Research Assistant
Sep 2013 – Dec 2014 New York City
  • Columbia NLP Group, advised by Prof. Michael Collins. Studied kernel methods for speech recognition. Paper published in Journal of Machine Learning Research.
  • Columbia Vision Lab, advised by Prof. Shree Nayar. Implemented a computer vision system in MATLAB to infer astrophysics parameters from galactic images.
  • Columbia CRIS Lab, advised by Prof. Venkat Venkatasubramanian. Developed ML and NLP techniques to automate ontology curation for pharmaceutical engineering. Paper published in Computers & Chemical Engineering.