BEGIN:VCALENDAR
VERSION:2.0
X-WR-CALNAME;VALUE=TEXT:CBB Seminar | Social Seminar ~ Austin van Loon, Faculty, MIT Sloan School of Management
PRODID:-//Harvard events data//EN
BEGIN:VEVENT
UID:event_1627841_0
SUMMARY:CBB Seminar | Social Seminar ~ Austin van Loon, Faculty, MIT Sloan School of Management
DESCRIPTION:

Social Seminar

Austin van Loon ~ Assistant Professor, MIT Sloan School of Management

Title ~ Using LLMs as a Source of Behavioral Data

Generative AI has prompted a striking proposal for the social sciences: replace human subjects with simulated ones. The appeal is obvious (human data are expensive, LLM outputs are nearly free), but evaluations of the reliability of simulated responses are mixed. In ways that remain difficult to predict, so-called "silicon samples" diverge from real respondents, biasing treatment effects, misrepresenting subgroups, and providing false precision. For theory-testing experiments, where which results matter is determined a priori by their relationship to theory rather than post hoc by effect size, pure silicon sampling offers little evidentiary value. At the same time, these tools clearly predict human behavior with surprising accuracy in many settings, so ignoring their potential value wholesale is also irresponsible. I'll argue the right question is not whether to use LLM predictions as a source of data for social science experiments, but how. I'll present a framework for causal inference in what we call mixed-subjects design randomized controlled trials (MSD-RCTs), in which human outcomes are observed for a subset of units while model-based predictions are generated for all of them. Human data anchor identification; predictions are used as auxiliary measurements to reduce cost or improve statistical power. I outline families of estimators that (1) are unbiased for the human average treatment effect in finite samples and (2) exploit predictions to yield smaller standard errors, regardless of how biased or inaccurate those predictions are. I introduce the open-source `mixedsubjects` R package, which jointly handles estimation and budget-constrained design. I'll close by sketching ongoing work, including a project to build digital-twin infrastructure inside nationally representative panels to support mixed-subjects experimentation at scale.

***CBB and Social talks this semester will meet on Thursdays, 12:00-1:15pm, at NORTHWEST Bldg, B101 AUDITORIUM. Bring your Harvard ID card for entry to NW Bldg.***

LOCATION:Northwest Building - B101 Auditorium, Harvard ID card needed for entry
STATUS:CONFIRMED
DTSTART:20260416T160000Z
DTEND:20260416T171500Z
END:VEVENT
END:VCALENDAR