
How many people identify as lesbian, gay, bisexual, transgender or queer? Why does this figure matter? And how is data about LGBTQ communities used (and misused) by governments, companies and community organisations?
These are some of the questions explored in a new project led by researchers from the University of Edinburgh and Cornell University.
Gender and sexuality scholars Dr Kevin Guyan (Chancellor’s Fellow in the University of Edinburgh Business School) and Dr Jamie Budnick (Assistant Professor in Sociology at Cornell) have received funding from an Edinburgh-Cornell Global Strategic Collaboration Award for their project, Gay Numbers: The Use and Misuse of Sexual Orientation and Gender Identity Data in the UK and US.
Around the world, more workplace diversity monitoring forms, censuses, surveys and other research exercises are asking questions about sexual orientation and gender identity (SOGI). In the UK, national censuses in 2021 and 2022 included new questions on sexual orientation and trans/gender identity. In the US, the Census Bureau is testing SOGI questions ahead of the 2030 census. Countries including Canada, Australia, Ireland and New Zealand have also either introduced or plan to introduce SOGI questions in their censuses.
The collection of SOGI data is often driven by good intentions – for example, gathering evidence to address inequalities and social injustices. However, the design of most data, digital and AI systems relies on fixed identity categories and limited response options, which creates specific challenges for LGBTQ communities (e.g. gender-fluid individuals).
Launched in January 2025, Gay Numbers will conduct a first-of-its-kind mapping exercise to document how SOGI data has been used and misused in the UK and US between 2020 and 2025.
The project will conclude with an online Future of SOGI Data Forum in autumn 2025, the first global gathering of policymakers, community groups and academics engaged in SOGI data, to learn from shared challenges and to identify and mitigate the ‘hidden harms’ of collecting more data.