The proportion of the population of England that self-identifies as lesbian, gay or bisexual: producing modelled estimates based on national social surveys

Objectives There is currently no widely accepted estimate of the proportion of people in England that self-identifies as lesbian, gay or bisexual (LGB), which is needed if we are to compare health inequality between different population groups. Using systematic review methods, this study identified all national social surveys with a question on sexual orientation and pooled those which represented the overall population of England. LGB proportions were synthesized into an aggregated mean estimate using weights based on sample size, response rate and missing data. The modelled estimate was stratified by socio-demographic and geographical variables.
Results Twenty-two national surveys were identified, of which 15 were suitable for pooling. Synthesis resulted in a weighted mean estimate of 2.50% of the adult population of England identifying as LGB or 'other'. The proportion was highest in men, people below 45 years of age and the London region. The (theoretical) upper limit was 5.89% if all non-responders were assumed to identify as LGB. The reported 2.50% represents a minimum and may be influenced by respondents' perceptions of confidentiality and social acceptance. It is, however, the most robust estimate currently available and can be used as a baseline to understand the health and wellbeing needs of different groups.
Electronic supplementary material The online version of this article (10.1186/s13104-017-2921-1) contains supplementary material, which is available to authorized users.


Introduction
Sexual orientation is a protected characteristic under the UK Equality Duty of the Equality Act 2010 [1], so public bodies must have due regard for the need to reduce discrimination and advance equal opportunities among those who share such protected characteristics and those who do not. Furthermore, it is important for public health bodies to understand the health and wellbeing needs of minority sexual orientation groups, as they are known to be at increased risk of poor physical and mental health behaviours and outcomes [2][3][4]. An important first step in discharging this responsibility is to know the proportion of people who self-identify as lesbian, gay or bisexual (LGB) in England. Unfortunately there is currently no agreed and supported estimate available.
The authors were commissioned by Public Health England to devise a process to use all available data to derive the best possible estimate of the proportion of people in England who self-identify as LGB. In 2009, the Office for National Statistics (ONS) developed a standard question on sexual identity for use in social surveys, which was subsequently introduced in a number of national questionnaires [5][6][7]. So far, no study has used a systematic approach to identify all surveys that measure sexual orientation and to synthesise them while taking methodological limitations into account. This study aimed to produce a robust estimate of the proportion of people who self-identify as LGB, which could be used as a baseline figure for researchers and policy makers to compare health

Methods
First, nine relevant databases (EMBASE, HSCIC, MEDLINE, SAGE, Social Care Online, Social Science Research Network, SocINDEX, UK Data Archive, Web of Science) were searched using a combination of search terms, e.g. in PubMed/MEDLINE: ("sexual orientation" OR "sexual identity" OR "same-sex relationships" or "lesbian gay bisexual") AND ("United Kingdom" OR England OR Britain) AND (survey OR questionnaire OR proportion OR prevalence OR size OR percentage OR measure OR estimate) (see Additional file 1). In addition, the grey literature was searched by exploring websites of key organizations (National Health Service, ONS, Stonewall, LGBT Foundation), hand-searching publications and drawing on the authors' contacts. The survey inclusion criteria were: (1) geographical coverage of at least the whole of England or sub-geographies that together form a representative sample of the whole of England; (2) targeting the general population or a sub-set of the general population unlikely to affect sexual identity proportions; and (3) including a direct question on a person's sexual identity. There was no time restriction, but only the most recent version of a survey series or longitudinal cohort was included.
Second, methodological data were extracted from each identified survey, including: geographical coverage, data collection period, survey population, survey design, sampling method, sample size and response rate. With regard to the sexual orientation question, data were extracted on: question format, mode of administration, response categories and proportions. Response categories included both substantive answers (heterosexual/straight; lesbian/gay; bisexual; other) and non-substantive answers (don't know, prefer not to say, refused, no answer). Individual survey proportions of LGB people were calculated as the sum of the proportions of 'gay/lesbian', 'bisexual' and 'other' among all those who were asked the question on sexual orientation. The responses were limited to the population of England.
Third, a quality assessment was done to determine which surveys to pool into a synthesized estimate and what weights to assign to each survey in a synthesis. Surveys with study populations that did not represent the general population of England in terms of age, gender or other characteristics may introduce bias in LGB proportions and were therefore not pooled but reported separately [8,9]. For the synthesis we used an adaptation of a previously developed method to enumerate minority ethnic groups from surveys [10,11]. Weights were assigned to methodological characteristics that differed between surveys, were conceptually linked to sexual orientation response quality and were quantifiable, which were: survey sample size; overall survey response rate; and proportion of missing data. To avoid overweighting we used a logarithmic transformation of sample size. We used an inverse proportion of all non-substantive answers (don't know, prefer not to say, refused, no answer) as a weight for missing data. Five different combinations of weights were used to explore their relative effects (see Additional file 2). The fifth method incorporated all three weights and was considered to be the most robust method. A range was constructed around the aggregated mean estimate by performing a sensitivity analysis around missing data, calculating the most extreme scenarios where people with non-substantive answers would have been either all heterosexual or all lesbian/gay/bisexual.
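The weighting and sensitivity steps described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure from Additional file 2: the `Survey` fields, the function names, the choice to combine the three weights by multiplication, and the reading of "inverse proportion" as one minus the proportion of non-substantive answers are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Survey:
    """Hypothetical container for the data extracted from one survey."""
    name: str
    n_asked: int          # people asked the sexual orientation question
    n_lgb: int            # answered 'lesbian/gay', 'bisexual' or 'other'
    n_missing: int        # non-substantive answers (don't know, refused, ...)
    response_rate: float  # overall survey response rate, between 0 and 1


def weight(s: Survey) -> float:
    """Combined weight: log sample size x response rate x (1 - missing share).

    Multiplying the three components is an assumption of this sketch.
    """
    missing_share = s.n_missing / s.n_asked
    return math.log(s.n_asked) * s.response_rate * (1.0 - missing_share)


def weighted_mean(surveys, proportion) -> float:
    """Weighted mean of a per-survey proportion."""
    total_weight = sum(weight(s) for s in surveys)
    return sum(weight(s) * proportion(s) for s in surveys) / total_weight


def pooled_estimate(surveys) -> float:
    """Pooled LGB proportion among all those asked the question."""
    return weighted_mean(surveys, lambda s: s.n_lgb / s.n_asked)


def pooled_range(surveys):
    """Extreme scenarios for the non-substantive answers.

    Lower bound: all of them are heterosexual (same as the pooled estimate).
    Upper bound: all of them are lesbian, gay or bisexual.
    """
    lower = pooled_estimate(surveys)
    upper = weighted_mean(surveys, lambda s: (s.n_lgb + s.n_missing) / s.n_asked)
    return lower, upper


if __name__ == "__main__":
    # Made-up figures for illustration; these are not the surveys in this study.
    demo = [
        Survey("Survey A", n_asked=1_000, n_lgb=20, n_missing=50, response_rate=0.60),
        Survey("Survey B", n_asked=100_000, n_lgb=3_000, n_missing=1_000, response_rate=0.40),
    ]
    low, high = pooled_range(demo)
    print(f"pooled estimate {low:.2%}, theoretical maximum {high:.2%}")
```

Because the logarithm compresses sample size, a survey with a hundred times more respondents receives well under twice the sample-size weight in this sketch, which is the intended protection against overweighting.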
Finally, we stratified the mean estimate by age, gender and region. Since LGB proportions could not be stratified for all original surveys, we selected a baseline survey to provide a standard distribution of LGB. Ideally, the Census of England and Wales 2011 would have been used for this purpose, but this survey did not include a question on sexual orientation [12]. We found that the GP (general practitioner) Patient Survey 2015 best resembled the population of England in terms of age, gender and region. Using the distribution of LGB across strata from the GP Patient Survey and our synthesized mean LGB estimate, we calculated the estimated number of adult LGB people in England in 2015 and divided this by the total population numbers based on mid-2015 ONS' estimates [13].
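One possible reading of this stratification step is sketched below in Python. It assumes the baseline survey supplies, for each stratum, the share of all LGB respondents who fall into that stratum, and that these shares are used to allocate the pooled national LGB count across strata. The function name, argument names and all figures are hypothetical and are not taken from the GP Patient Survey or the ONS estimates.

```python
def stratified_proportions(pooled_mean, baseline_lgb_share, population):
    """Allocate the pooled LGB count across strata and return stratum proportions.

    pooled_mean:        synthesized LGB proportion for the whole population
    baseline_lgb_share: stratum -> share of the baseline survey's LGB
                        respondents in that stratum (shares sum to 1)
    population:         stratum -> population count for that stratum
    """
    if abs(sum(baseline_lgb_share.values()) - 1.0) > 1e-9:
        raise ValueError("baseline LGB shares must sum to 1")
    total_lgb = pooled_mean * sum(population.values())
    return {
        stratum: total_lgb * baseline_lgb_share[stratum] / population[stratum]
        for stratum in population
    }


if __name__ == "__main__":
    # Made-up strata and counts, for illustration only.
    shares = {"younger": 0.5, "middle": 0.3, "older": 0.2}
    counts = {"younger": 12_000_000, "middle": 15_000_000, "older": 16_000_000}
    for stratum, p in stratified_proportions(0.025, shares, counts).items():
        print(f"{stratum}: {p:.2%}")
```

By construction, the population-weighted average of the stratum proportions equals the pooled mean, so the stratified figures remain consistent with the national estimate.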

Results
We identified a total of 664 records; 617 from published data sources and 47 from grey sources. Of these, 636 were excluded because they did not meet the inclusion criteria. After full-text screening of the remaining 28 surveys, six more were excluded: four were previous versions of more recent surveys already included and two surveys formed part of an umbrella survey (Integrated Household Survey 2014) which was already included. The remaining 22 surveys were similar in terms of study design, question format and substantive response categories, while differences were found in terms of study populations, sampling methods, sample sizes, survey response rates, modes of question administration and non-substantive response categories (Additional file 3).
The proportions of LGB and 'others' among people who were asked the question on sexual orientation in the 22 surveys are shown in Fig. 1. Percentages ranged from 0.90% (95% CI 0.40, 1.83) to 5.52% (95% CI 4.63, 6.56). The proportion of missing data ranged from 0.10 to 24.06%. Sample sizes ranged from 825 to 854,032 and response rates from 28 to 100%. The results of one survey, the First Longitudinal Study of Young People in England: Waves 1-7 2009-2010, could not be obtained.
Seven surveys were excluded from pooling because their study population was limited in terms of age, gender or health conditions. Thus, 15 of the 22 surveys were suitable for pooling and their measures of sexual orientation were synthesized using the five different weighting methods described above (Table 1).
The sensitivity analysis around the mean LGB estimate of 2.50% (Method 5) resulted in a minimum of 2.50% and a maximum of 5.89%, when people who responded 'prefer not to say', 'refused', 'don't know' or 'no answer' were assumed to be all heterosexual or all LGB, respectively. The mean estimate was then stratified by age, gender and region (Table 2). Applied to the adult population of England in mid-2015, the proportion of LGB and 'other' was highest among young adults from 18 to 34 (3.74%) and decreased with each older age group. The proportion was higher in men

Discussion
This study provides an aggregated weighted estimate of the size of the LGB (and 'other') population of England of 2.50%, with a range of 2.50-5.89% based on a sensitivity analysis of missing data. This would project to an estimated 1.08 million adults self-identifying as belonging to a sexual minority among a total of 43.1 million adults in 2015 (ONS mid-2015 estimates). The upper bound of 5.89% should be treated with caution, as it represents the theoretical maximum if all people who did not respond informatively to a question on sexual orientation had identified as LGB. The aggregated mean of 2.50% provides the lowest possible estimate of LGB in the given sources and is slightly lower than estimates from national household surveys in the United States, Canada and Australia, where figures of 2.4-3.5%, 3.0% and 3-4%, respectively, have been reported [14][15][16][17][18]. This is the first study to adopt a systematic, weighted approach to identify and combine the results of existing surveys into an estimate of the LGB population of England. Both the search strategy and synthesis methodology were discussed with a group of experts in the field of sexual orientation surveys in England. As a result, we are confident that all relevant surveys are included and that the methodology is robust. However, our LGB estimates are clearly sensitive to problems in the original surveys and to societal factors relating to the reporting of sexual orientation.

Limitations
Our study had several limitations:
• The synthesis included weights based on survey sample size, response rate and proportion of missing data. Yet other factors may also have influenced non-response and misreporting of sexual orientation, including mode of question administration and survey context. However, given the lack of knowledge about the direction and magnitude of these effects, the current methodology could not include quantitative weights for them. We also did not include a weight for variance, as is usual in meta-analysis, because we felt it was more important to use weights conceptually linked to problems of reporting sexual orientation than weights reflecting precision.
• The stratified aggregated LGB estimates are valid only to the extent that the population distributions of the baseline survey (the GP Patient Survey 2015) are representative of the national population of England. While the distributions of age, gender and region were very similar between the two, there was variation in ethnicity, which meant that we could not confidently report estimates stratified by ethnicity. Using the GP Patient Survey as our baseline survey also meant that we were not able to stratify by local authority or disability, because the survey did not provide this information. These issues could be resolved if the original surveys stratified their results by these factors or if the Census included a question on sexual orientation [19]. It would also be useful to have more surveys conducted at local level to get a better understanding of geographical differences.
• Using general population surveys to quantify the proportion of people who self-identify as LGB may have underestimated this group, because some people may inaccurately report their sexual identity in survey settings, influenced by perceptions of confidentiality and social acceptance [20][21][22].
While these issues may change slowly over time, future surveys may be able to produce more reliable and realistic estimates if the context and mode of administration of sexual identity questions are optimized. Also, social acceptance may increase if the question is adopted in the national Census. Finally, it is important to acknowledge that sexual identity as used in national surveys is not coterminous with sexual orientation, which is the term used in the Equality Act to legally protect LGB people from discrimination, and that any of these estimates are therefore likely to underestimate the actual size of this population.

Abbreviations
GP: general practitioner; LGB: lesbian, gay and bisexual; ONS: Office for National Statistics.

Authors' contributions
KH and WL generated the initial idea for the study. SvK, MF and KH searched and identified relevant surveys. SvK and KH extracted and analysed data and wrote the draft report. All authors contributed to the review of the manuscript. All authors read and approved the final manuscript.