Purpose: The past decade has seen considerable debate about how best to evaluate the efficacy of educational improvement initiatives, and members of the educational leadership research community have entered the debate with great energy. Throughout this debate, the use of randomized experiments has been a particularly contentious subject. This study examines the potential benefits, limitations, and challenges involved in using experiments to evaluate professional development for principals.

Approach: We present a case study of an experimental evaluation of a professional development program for principals. The case study is grounded in key themes in recent debates about the use of experiments in educational research, scholarship on challenges in conducting experiments, and experimental studies involving principals.

Setting and Sample: The case study was conducted in an urban school district with 48 principals.

Implications for Research: The experimental component of the study allowed us to form a trustworthy summary inference about whether a professional development program had an overall effect on principals. However, the experiment did not illuminate why or how the program failed to influence principal practice. Using descriptions of the intended curriculum for principals, professional development attendance records, and interview data, we developed an understanding of why the program failed to achieve its intended goals. Based on our experiences, we support continued advocacy of research designs that bring rich evidence to bear on causal mechanisms, implementation conditions, potential measures of delivery of and adherence to treatment protocols, and measures of participants' exposure to treatment.