Jacob Pfau

Faculty Advisor(s): Sam Bowman

Contact: jp6263@nyu.edu
Bio: Jacob Pfau is a PhD student at the NYU Center for Data Science, working in the NYU Alignment Research Group under the supervision of Sam Bowman and He He. Jacob’s research aims to ensure that language models remain safe to use as they scale. As of 2022, he is working on empirically demonstrating failures of language models to generalize honestly. Jacob also thinks about formalizing incentives towards agency and deception in language model pre-training and fine-tuning. Previously, Jacob completed a master’s in philosophy at the University of Edinburgh, a year of a machine learning master’s program at the École Polytechnique in France, and a bachelor’s in mathematics at Amherst College. Between years at university, he worked on mis-generalization in reinforcement learning and interpretability for medical imaging.
