Repeated assessment of work-related exhaustion: the temporal stability of ratings in the Lund University Checklist for Incipient Exhaustion

Screening inventories are important tools in clinical settings and research but may be sensitive to temporary fluctuations. Therefore, we revisited data from a longitudinal study with the Lund University Checklist for Incipient Exhaustion (LUCIE) that comprised occupationally active individuals (n = 1355; 27–52 years; 57% women) and one initial paper-and-pencil survey and 10 subsequent equally spaced online surveys. In the present study we examine to what extent the LUCIE scores changed across 3 years (11 assessments) and whether episodes of temporarily elevated LUCIE scores (LTE) coincided with reports of negative or positive changes at work or in private life. In the total sample, the prevalence rates for the four LUCIE classifications of signs of increasing exhaustion (from no exhaustion to possible exhaustion disorder) ranged from 65.4–73.0%, 16.6–20.9%, 6.2–9.6%, and 3.4–5.0%. Of 732 individuals screened for LTE episodes, 16% had an LTE episode. The LTE episodes typically coincided with reports of adverse changes at work or, to a lesser extent, in private life. Thus, LUCIE classifications appear reliable and lend themselves to repeated use on the same individuals, or groups of individuals. Even single episodes of elevated LUCIE scores appear to appropriately indicate adverse reactions to the work situation.


Introduction
Screening inventories are important tools in occupational health care and research settings. However, for practical and economic reasons, they are typically applied only once and may thus be sensitive to temporary fluctuations related to the individual, the context, or statistical phenomena (e.g., regression to the mean) [1]. During repeated assessment, the complexity of the test, the number of administrations and the time between assessments are also concerns [2,3]. Because re-test effects can create ambiguous results and contribute to unreliable classifications of various medical and psychiatric conditions, it is essential to understand the temporal stability of test scores [2,4,5].
To further the knowledge on repeated assessment of work-related exhaustion, we revisited a validation study entailing the Lund University Checklist for Incipient Exhaustion (LUCIE) and 11 assessments across 3 years [6][7][8]. LUCIE is intended to assess behaviors, feelings and symptoms associated with prodromal stages of exhaustion disorder (ED) [6,7]. As such, it aligns with clinical experience and research that suggest that early detection/intervention is important [9,10]. The present objective was to examine how stress and exhaustion warning scores changed across the study period and whether episodes of temporary elevations in LUCIE were associated with personality trait scores or coincided with reports of negative or positive changes at work or in private life. Presumably, temporary elevations that coincide with reported changes in work and/or private life would indicate that LUCIE has an appropriate sensitivity to real-life changes. The research questions were:
• To what extent is the point prevalence of stress and exhaustion warnings in LUCIE stable across 11 consecutive measurements?
• Are temporary stress or exhaustion warnings common, and are they preceded by, or concurrent with, reports of changes at work and/or in private life?
• Do individuals with temporarily elevated stress or exhaustion warnings differ from individuals who never display stress or exhaustion warnings regarding demographic characteristics, personality traits and descriptions of work and private life stressors?
Measures
LUCIE entails 28 items covering six domains that make up two supplementary scales: the Stress Warning Scale (SWS) (0-100) and the Exhaustion Warning Scale (EWS) (0-100). Using pre-defined cut-off scores on each scale, the SWS and EWS are combined into a four-step ladder of incremental stress symptomatology: Step 1-GG (normal: SWS green zone and EWS green zone), Step 2-YG (SWS yellow zone and EWS green zone), Step 3-RG (SWS red zone and EWS green zone), and Step 4-RR (possible ED: SWS red zone and EWS red zone). For details on the scoring and development of LUCIE see Persson et al. [7]. Passing episodes of elevated SWS and EWS scores (i.e., LUCIE Temporary Elevation [LTE]) were identified for each individual. An LTE episode/case was defined by temporarily scoring in the red zone on either scale (i.e., Step 3-RG or Step 4-RR) while scoring at Step 1-GG or Step 2-YG in the assessment before and after. Given this definition and the study design with 11 assessments, up to 5 LTE episodes per individual could occur.
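The step ladder and the LTE definition can be illustrated with a minimal Python sketch. It is not the authors' scoring code: the cut-off scores are not reproduced here, so the sketch takes zone labels ("green", "yellow", "red") as input, and the function names `lucie_step` and `find_lte_episodes` are hypothetical. Zone combinations other than the four named above are rejected rather than guessed; see Persson et al. [7] for the full scoring rules.

```python
from typing import List

# The four zone combinations named in the text, keyed as (SWS zone, EWS zone).
STEP_BY_ZONES = {
    ("green", "green"): 1,   # Step 1-GG (normal)
    ("yellow", "green"): 2,  # Step 2-YG
    ("red", "green"): 3,     # Step 3-RG
    ("red", "red"): 4,       # Step 4-RR (possible ED)
}


def lucie_step(sws_zone: str, ews_zone: str) -> int:
    """Map a pair of zone labels to a LUCIE step (1-4)."""
    try:
        return STEP_BY_ZONES[(sws_zone, ews_zone)]
    except KeyError:
        # Assumption: combinations not listed in the text are left undefined here.
        raise ValueError(f"zone combination not covered: {sws_zone}/{ews_zone}")


def find_lte_episodes(steps: List[int]) -> List[int]:
    """Return the indices of assessments that qualify as LTE episodes.

    An episode is a Step 3 or Step 4 assessment whose immediate
    predecessor and successor are both at Step 1 or Step 2.
    """
    return [
        i
        for i in range(1, len(steps) - 1)
        if steps[i] >= 3 and steps[i - 1] <= 2 and steps[i + 1] <= 2
    ]


# An alternating low/high pattern across 11 assessments gives the maximum of 5 episodes.
alternating = [1, 3, 1, 4, 2, 3, 1, 3, 1, 4, 1]
episodes = find_lte_episodes(alternating)
print(len(episodes))
```

Because each episode needs a low-scoring assessment on both sides, two consecutive red-zone assessments do not count as an LTE under this reading of the definition.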
Personality traits were assessed in five dimensions at T0 with a Swedish 44-item version of the Big Five Inventory (BFI) [12,13].
Two forced-choice items asked: "Has your situation at work (alternatively in your private life) changed in a positive or negative direction during the past couple of months?" [6]. Participants were also encouraged to complete an optional free-text field (480 characters).

Data management, statistical analysis and analysis of free-text answers
LTE cases were drawn from the control group sample (n = 745) in a previous study [6]. None of these participants (n = 745) had shown a sustained stress or exhaustion warning (i.e., over several consecutive quarters) in the previous longitudinal study [6], but some displayed intermittent elevations in LUCIE scores (i.e., only one quarter). Thus, we targeted only control group participants with intermittent LTE episodes. In this group, 82% had completed all 11 surveys, 17% failed to reply to 1 to 3 surveys, and < 1% failed to respond to ≥ 4 surveys [6].
Because the items "Changes in the situation at work and in private life" were introduced at T1, the search for LTE cases entailed waves T1 to T10 and 732 individuals. When LUCIE scores across three consecutive quarters (Q) confirmed an LTE for the first time, the elevation phase was set to Q2, the preceding phase to Q1 and the return phase to Q3. The LTE data were compiled into a new data set and merged with the data from non-LTE participants at T8 to T10.
Statistical analysis applied traditional non-parametric and parametric testing using the IBM/SPSS software version 25 (the two-tailed alpha level was set to ≤ 0.05). Sensitivity analyses evaluated potential effects of participant dropout. Thematic analyses of free-text commentaries used the categories established in our previous study [6].
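The phase alignment described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the function name `first_lte_phases` is hypothetical, LUCIE steps are coded 1 to 4, a missing survey is coded as `None`, and a triplet containing a missing survey is assumed not to confirm an LTE.

```python
from typing import Dict, Optional, Sequence


def first_lte_phases(
    steps: Sequence[Optional[int]], first_wave: int = 1
) -> Optional[Dict[str, int]]:
    """Locate the first confirmed LTE and label its three phases.

    steps[k] is the LUCIE step (1-4) at wave first_wave + k, or None if
    the survey is missing. Returns the wave numbers of Q1 (preceding
    phase), Q2 (elevation phase) and Q3 (return phase), or None if no
    LTE is found.
    """
    for i in range(1, len(steps) - 1):
        before, peak, after = steps[i - 1], steps[i], steps[i + 1]
        if before is None or peak is None or after is None:
            continue  # assumption: incomplete triplets cannot confirm an LTE
        if peak >= 3 and before <= 2 and after <= 2:
            return {
                "Q1": first_wave + i - 1,
                "Q2": first_wave + i,
                "Q3": first_wave + i + 1,
            }
    return None


# Hypothetical participant, waves T1..T10, with a missing survey at T3.
phases = first_lte_phases([1, 2, None, 1, 3, 2, 1, 4, 1, 1])
print(phases)  # {'Q1': 4, 'Q2': 5, 'Q3': 6}
```

Only the first qualifying triplet is returned, which mirrors the statement that phases were fixed when an LTE was confirmed for the first time; the later elevation at T8 in the example is ignored.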

Results
Both the participation rate and the median SWS scores declined slightly between T0 and T4, but stabilized thereafter (Table 1). Sensitivity analyses entailing the subset of participants that had complete data across the 11 assessments (n = 670; 49%) indicated a similar pattern of decline in SWS scores. The median EWS score mostly exhibited a floor effect throughout the study (Table 1).

Table 1 Distribution of prevalence rates and median scores (Mdn) with accompanying 95% confidence intervals [95% CI] for LUCIE classes (Step 1 GG to Step 4 RR) and the SWS and EWS scales across the 11 assessment rounds for the total study sample at each round
Cut-off scores for the SWS score: Step 1 GG
The SWS and EWS scores were generally higher in the LTE group than in the control group across all three quarters (p < 0.001; Mann-Whitney U-test; Additional file 3), and most clearly so at Q2 (Elevation phase).
Ratings of both negative and positive changes at work were more frequent among LTE cases (71% and 54%, respectively) than among controls (39% and 46%, respectively) (χ²: p < 0.001; Fig. 1; Additional file 4). For both types of ratings, the largest difference occurred at Q2, at which 19% among controls, and 58% among LTE cases, reported a partly or highly negative change at work (χ²: p < 0.001). Conversely, 27% of the controls reported a partly or highly positive change at work whereas only 15% of the LTE cases did (χ²: p < 0.001).
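For readers who want to see how such a comparison works, the following Python sketch computes a Pearson chi-square test for a 2×2 table. The counts are not taken from the study data: they are back-calculated from the rounded percentages for negative changes at work at Q2 (58% of 116 LTE cases, 19% of 616 controls) under the assumption that no responses were missing, so the result is illustrative only. The function name `chi_square_2x2` is hypothetical, and no continuity correction is applied.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test (df = 1) for the table [[a, b], [c, d]].

    Returns the test statistic and the two-sided p-value. For one degree
    of freedom the chi-square survival function equals erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Approximate counts reconstructed from rounded percentages (assumption).
n_lte, n_controls = 116, 616
lte_negative = round(0.58 * n_lte)            # LTE cases reporting a negative change
controls_negative = round(0.19 * n_controls)  # controls reporting a negative change

chi2, p_value = chi_square_2x2(
    lte_negative, n_lte - lte_negative,
    controls_negative, n_controls - controls_negative,
)
print(f"chi2 = {chi2:.1f}, p < 0.001: {p_value < 0.001}")
```

With these reconstructed counts the statistic lies well above 10.83, the critical value for p = 0.001 at one degree of freedom, which is in line with the reported p < 0.001.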
Ratings of negative and positive changes in private life were more frequent among LTE cases (41% and 49%, respectively) than among controls (23% and 38%, respectively) (χ²: p < 0.001; Fig. 1; Additional file 4). For ratings of negative changes, the largest difference occurred at Q2, at which 10% among controls and 28% of LTE cases reported a partly or highly negative change in their private situation (χ²: p < 0.001). For ratings of positive changes, the largest difference occurred at Q3, at which 18% among controls and 29% of the LTE cases reported positive changes in the private situation (χ²: p = 0.006).
The analysis of the free-text commentaries gave a deeper understanding of complaints, and delineated the interplay between work life and private life. See Additional files 5 and 6 for a listing and in-depth analysis of free-text answers, respectively. Notably, however, when analyzing the 45 free-text answers from the 48 LTE cases that had rated negative changes in private life on the forced-choice item, it became clear that some had misattributed a negative impact from work as a "negative change in private life". Thus, if discounting reports like "feeling worn out due to work" and reports flagging spillover from work to family as a private burden, only 29% of the total group of 116 participants with an LTE had a solely (genuine) private burden unrelated to work, in contrast to the 41% reported above (see Additional file 6 for computation details).

Table 2 Baseline demographical characteristics and personality traits according to the Big Five Personality Inventory (BFI) of the participants identified as having a LUCIE temporary elevation (LTE) and participants without any LTE across the 11 assessments (controls)
An LTE episode/case was defined by temporarily scoring in the red zone on the LUCIE SWS or EWS scales (i.e., Step 3-RG or Step 4-RR) while scoring at Step 1-GG or Step 2-YG in the assessment before and after. Comparisons with categorical data were made with Pearson Chi Square tests. Comparisons involving continuous outcomes were made with one-way analysis of variance F-tests (ANOVA)
LTE: LUCIE temporary elevation
Reports of simultaneous negative changes at work and in the private sphere were infrequent among LTE cases at Q1 (7%) and Q3 (3%) but rose to 20% at Q2. Some 20% of LTE cases did not report any negative change at work or in the private sphere during Q1 to Q3 (see Additional file 7 for further details).

Discussion
The prevalence rates for the stress and exhaustion warnings in LUCIE (i.e., Step 1-GG to Step 4-RR) were essentially stable throughout the study period, although the median SWS scores declined between T0 and T4, indicating a weak drift towards better health. Conspicuously, the participation rates declined in parallel. However, the sensitivity analyses argue against participant dropout as an explanation for the decreasing SWS scores. Notably, only 16% displayed an LTE, and women were overrepresented with a ratio of 2:1. Despite a minute effect size, the higher neuroticism scores among LTE cases corroborate previous cross-sectional and longitudinal findings suggesting that personality traits and stress reactions to some extent are related [6,7,14]. More important, however, is that the LTE episodes coincided more frequently with ratings of changes in the work situation, and predominantly so during the elevation phase (Q2), when compared with changes reported to occur in the private life sphere. The analysis of the free-text commentaries strengthened this view. Indeed, some LTE cases misattributed work exposures as being private life stressors. Thus, even a short-term deterioration of the work situation appears to be associated with the reporting of stress and exhaustion symptoms in LUCIE. In accordance with previous findings in cases of long-term elevation of LUCIE scores [6], LUCIE appears to be a sensitive measure of short-term stress symptoms/signs related primarily to the work situation and, as such, is probably a useful tool in the clinical screening of early signs of stress symptomatology and exhaustion in working populations.
Although LTE cases more frequently reported both negative and positive changes at work and, to a lesser extent, in the private situation, 20% of the LTE cases did not report any negative change whatsoever. This puzzle remains even after analyzing the LTE episodes in relation to a control question documenting the occurrence of circumstances that in theory could have biased the replies in the original survey (e.g., pregnancy, menopause, pain, somatic disease, disturbed sleep due to small children or late habits, or other unspecified private life burdens; data not shown). Yet, humans sometimes display symptoms without being able to attribute them to a specific external or internal factor. Such unknown, or random, variation underlines that results from screening instruments on the individual level are only fully understood in a confident dialogue with the person screened. Since temporary fluctuations in mood and performance may occur even in the absence of any identifiable factor known to the individual, single temporary elevations in LUCIE scores should be conceived as possible indications of increased stress symptoms.

Fig. 1 Ratings of changes in the work situation (left graph) and in the private situation (right graph). Within each graph the left panel shows ratings during the three quarters of fulfillment of the criterion among LUCIE temporary elevated cases (LTE; n = 116), whereas the right panel shows the corresponding data for controls (n = 616)

Conclusions
Participation rates and median stress warning scores declined independently from each other during the first five assessment rounds but stabilized thereafter. The overall pattern of results suggests that LUCIE classifications are reliable and lend themselves to repeated use on the same individuals, or groups of individuals. Thus, even single episodes of elevated LUCIE scores appear to appropriately indicate adverse reactions to the work situation.

Limitations
Since the participants had a long education and all were healthy when entering the study, the results may underestimate population levels of stress and exhaustion warnings and the occurrence of temporary elevations (LTE episodes). The calculations of 95% confidence intervals (CI), and analysis of LTE data, did not account for clustering within individuals. Thus, the CIs may be too narrow due to an underestimation of the standard errors.
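To give a sense of how much clustering can matter, the sketch below applies the standard design-effect approximation for equally sized clusters, DEFF = 1 + (m − 1)ρ, where m is the number of observations per individual and ρ is the intraclass correlation. This was not part of the analysis reported here; the value of ρ is purely hypothetical because no intraclass correlation is reported, and the function names are illustrative.

```python
import math


def design_effect(obs_per_person: float, icc: float) -> float:
    """Variance inflation for equally sized clusters: 1 + (m - 1) * rho."""
    return 1.0 + (obs_per_person - 1.0) * icc


def adjusted_half_width(naive_half_width: float, obs_per_person: float, icc: float) -> float:
    """Widen a naive CI half-width by the square root of the design effect."""
    return naive_half_width * math.sqrt(design_effect(obs_per_person, icc))


# Hypothetical example: three quarters (Q1-Q3) per person and rho = 0.5.
deff = design_effect(3, 0.5)
wider = adjusted_half_width(1.0, 3, 0.5)
print(deff, round(wider, 3))
```

Under these hypothetical values the variance would double and the interval would be about 41% wider than the naive one; with ρ close to zero the naive and adjusted intervals would coincide.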