A New Tool for Handling Multiracial and Multi-Identity Data in Health Research

Gabriel J. Merrin, Syracuse University

Description/Abstract

When surveys ask about race or ethnicity, a growing number of Americans select more than one category. The multiracial population now represents over 10% of the U.S. population and is the fastest-growing racial group in the country. Yet researchers routinely collapse these individuals into an “other race” category for statistical analysis, rendering specific subgroups invisible. This brief introduces CATAcode, a free software tool that helps researchers systematically explore, document, and prepare check-all-that-apply demographic data for statistical modeling. In a demonstration with over 8,000 high school students, CATAcode revealed 85 distinct racial identity combinations from just eight response options. The tool also shows how strongly coding decisions affect representation: in one dataset, standard approaches identified only 12 Native American participants, while a priority approach increased that number to 128. Even seemingly small methodological choices can determine whether communities are statistically invisible or present in findings.
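
The contrast between collapsing and priority coding described above can be illustrated with a minimal Python sketch. This is not CATAcode's interface (CATAcode is an R package, and its functions are not shown in this brief); the function names, the "Multiracial" and "Missing" labels, and the seven toy responses below are all hypothetical and chosen only to show the general idea.

```python
from collections import Counter

# Hypothetical check-all-that-apply responses (illustrative only, not study data):
# each respondent is represented by the set of categories they selected.
responses = [
    {"White"},
    {"Black"},
    {"Native American", "White"},
    {"Asian", "White"},
    {"Native American"},
    {"Black", "Native American"},
    {"Asian"},
]


def combination_counts(rows):
    """Count each distinct combination of selected categories."""
    return Counter(" + ".join(sorted(r)) for r in rows)


def collapse_multiple(rows, label="Multiracial"):
    """Code anyone with more than one selection into a single catch-all label."""
    coded = []
    for r in rows:
        if not r:
            coded.append("Missing")          # no category selected
        elif len(r) == 1:
            coded.append(next(iter(r)))      # the single selected category
        else:
            coded.append(label)              # all combinations collapsed together
    return coded


def priority_code(rows, priority, label="Multiracial"):
    """Code a respondent into the first priority category they selected, if any."""
    coded = []
    for r in rows:
        hit = next((p for p in priority if p in r), None)
        if hit is not None:
            coded.append(hit)                # priority category wins
        elif not r:
            coded.append("Missing")
        elif len(r) == 1:
            coded.append(next(iter(r)))
        else:
            coded.append(label)
    return coded


combos = combination_counts(responses)
collapsed = Counter(collapse_multiple(responses))
prioritized = Counter(priority_code(responses, priority=["Native American"]))

print(len(combos), "distinct combinations")
print("Native American (collapsed):", collapsed["Native American"])
print("Native American (priority):", prioritized["Native American"])
```

On this toy data, the collapsed scheme codes one respondent as Native American, while the priority scheme codes three, because the two respondents who selected Native American alongside another category are no longer absorbed into the catch-all group.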

Document Type

Research Brief

Keywords

check all that apply, demographic data, social identity, self-identification, R package, open data, open materials

Disciplines

Databases and Information Systems | Demography, Population, and Ecology | Programming Languages and Compilers | Race and Ethnicity

Date

2-3-2026

Language

English

Acknowledgements

The author used Claude (Anthropic) to assist with brainstorming, organizing, and editing this brief. The extent of use was moderate. All content was reviewed, verified, and edited by the author, who takes full responsibility for the accuracy and integrity of this work. The author thanks Alyssa Kirk and Shannon Monnat for assistance with copyediting and publication.

Recommended Citation

Merrin, G. J. (2026). A New Tool for Handling Multiracial and Multi-Identity Data in Social Science Research. Lerner Center Population Health Research Brief Series. Research Brief #141. Accessed at: https://doi.org/10.14305/rt.lerner.2026.3.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Included in

Databases and Information Systems Commons, Demography, Population, and Ecology Commons, Programming Languages and Compilers Commons, Race and Ethnicity Commons

DOI

https://doi.org/10.14305/rt.lerner.2026.3
