To Legitimize Character Education Programs, We Must Measure Student Character Better


A major obstacle standing in the way of comprehensive character education is that current measures of program effectiveness lack scientific rigor. This makes it difficult to evaluate whether specific curricula are actually improving kids’ capacities. It’s also hard for school leaders interested in implementing character education to compare commercially available programs, because each one uses a different method to measure effectiveness.

But the experience sampling approach to data collection, pushing a few questions a day to devices carried by students and teachers, could blow this field wide open and change how we verify that character education programs actually work.

We’ve said it before, but we’ll say it again: “You can’t improve what you can’t measure.” It’s a truism among social scientists, and it’s very relevant here.

There are two major deficiencies in how we approach character measurement. These are:

1) A lack of consensus in how character domains are defined

2) Tools that can’t measure the complexity of such a thing as character

Let’s tackle the first one. Because psychologists primarily use common language to describe complex behavioral phenomena, too often one word corresponds to several qualitatively different domains (e.g., “impulsiveness”; see Kindlon, Mezzacappa, & Earls, 1995), or several related domains will at times be called the same thing (e.g., “intelligence”; see Neisser et al., 1996). In short, words like “empathy” or “integrity,” when used as scientific constructs in research, have multiple meanings. This makes it hard to measure them consistently across contexts.

We’ve done quite a bit of work to help resolve this quandary. In close consultation with our colleague Christian Miller at Wake Forest University, we completed an in-depth analysis of three key character domains important to schools: diligence/grit, honesty, and compassion.

Our efforts showed, for instance, that “compassion” is theoretically distinct from “empathy” and “sympathy,” two domains researched extensively for decades (Zhou, Valiente, & Eisenberg, 2003; Batson, 1991; Eisenberg & Miller, 1987). We also now argue that “honesty” is a domain closely related to “integrity” rather than a fundamental part of it, yet we’ve heard these two words used almost interchangeably in discussions with school leaders over the past few years. That’s a problem: schools that espouse integrity as a core value can’t cultivate it if they can’t say exactly what it means.

As a result of our recent efforts to home in on exactly what these terms mean across the disciplines of psychology, philosophy, and even theology, we’re feeling much more confident these days about doing character measurement the right way for schools.

But definitional precision only gets us so far. Regardless of how a character domain is defined, how it is actually measured becomes its reality when it comes to tracking its development in students. This is the second big issue plaguing the behavioral sciences, and it’s where current character assessment efforts have fallen terribly flat.

For instance, most studies of compassion and empathy with students in school settings rely on broad self-report questions like “I am concerned about others.” A similar question for teachers might be “This child is considerate of other people’s feelings.”

But how can a student or teacher actually answer this accurately? What if Jamal is concerned about his close friends but not so much the kid he barely knows down the street who looked at him funny yesterday? What if Mr. Johnson is asked the above question for a student who just last week comforted another student who was sad but generally doesn’t do that sort of thing?

We must move toward measurement that takes into account individual situations and observed behavior rather than vague recollections of the past or self-reported behavior. Our smartphone apps that leverage the experience sampling approach do just that, asking questions only about one day’s or week’s experiences, soon after they’ve happened.
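To make the experience sampling idea concrete, here is a minimal sketch of the two pieces such an app needs: scheduling a few situation-specific prompts per school day at random times, and aggregating many situation-level ratings into per-domain scores. The item bank, question wording, rating scale, and function names are all hypothetical illustrations, not our actual survey instrument.

```python
import random
from datetime import time
from statistics import mean

# Hypothetical item bank: each prompt targets one character domain and
# asks about a concrete, recent situation rather than a global trait.
ITEM_BANK = {
    "compassion": [
        "Did you check in on a classmate who seemed upset today?",
        "Today, did you go out of your way to help someone?",
    ],
    "honesty": [
        "Was there a moment today when telling the truth was hard?",
    ],
}

def daily_prompts(n_per_day=3, window=(time(8, 0), time(15, 0))):
    """Pick a few items and random delivery times within the school day."""
    all_items = [(d, q) for d, qs in ITEM_BANK.items() for q in qs]
    items = random.sample(all_items, k=min(n_per_day, len(all_items)))
    start, end = window
    minutes = (end.hour - start.hour) * 60 + (end.minute - start.minute)
    offsets = sorted(random.randrange(minutes) for _ in items)
    return [
        (time(start.hour + m // 60, m % 60), domain, question)
        for m, (domain, question) in zip(offsets, items)
    ]

def domain_scores(responses):
    """Aggregate many situation-level ratings (1-5) into per-domain means."""
    by_domain = {}
    for domain, rating in responses:
        by_domain.setdefault(domain, []).append(rating)
    return {d: round(mean(rs), 2) for d, rs in by_domain.items()}
```

The key design point is in `domain_scores`: rather than asking one global question once a semester, the score for each domain is a running average over many timestamped, situation-specific responses, so Jamal’s compassion toward close friends and toward strangers can show up as distinct data points instead of being flattened into a single guess.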

But let’s be clear: we don’t blame schools here. They’ve never had the time or in-house expertise to figure all this out on top of their other incredibly important duties: guiding students to become the mature, engaged, curious, kind citizens of the next generation.

Schools deserve a better avenue for putting innovations from social science research, like experience sampling, to use. Our strong belief is that technology such as smartphones and tablets, together with their rapidly growing adoption in school settings, is the answer. Technology can be the bridge, and we’re trying to help build it.


Batson, C. D. (1991). The altruism question: Toward a social-psychological answer. Hillsdale, NJ: Erlbaum.

Eisenberg, N., & Miller, P.A. (1987). The relation of empathy to prosocial and related behaviors. Psychological Bulletin, 101(1), 91-119.

Kindlon, D., Mezzacappa, E., & Earls, F. (1995). Psychometric properties of impulsivity measures: Temporal stability, validity, and factor structure. Journal of Child Psychology and Psychiatry and Allied Disciplines, 36, 645-661.

Neisser, U., Boodoo, G., Bouchard Jr., T.J., Boykin, A.W., Brody, N., Ceci, S.J., Halpern, D.F., Loehlin, J.C., Perloff, R., Sternberg, R.J., & Urbina, S. (1996). Intelligence: Knowns and unknowns. American Psychologist, 51, 77-101.

Zhou, Q., Valiente, C., & Eisenberg, N. (2003). Empathy and its measurement. In S. J. Lopez & C. R. Snyder (Eds.), Positive psychological assessment: A handbook of models and measures (pp. 269–284). Washington, DC: American Psychological Association.

Also published on Medium.
