Evaluating broad-scale system change using the Consolidated Framework for Implementation Research: challenges and strategies to overcome them

Objective The objective of this paper is to demonstrate the utility of the CFIR framework for evaluating broad-scale change by discussing the challenges to be addressed when planning the assessment of broad-scale change and the solutions developed by the evaluation team to address those challenges. The evaluation of implementation of Patient-Centered Care and Cultural Transformation (PCC&CT) within the Department of Veterans Affairs (VA) will be used as a demonstrative example. Patient-Centered Care (PCC) is personalized health care that considers a patient’s circumstances and goals. The VA is working towards implementing PCC throughout its healthcare system through multiple interventions with a singular long-term goal of cultural transformation; however, little is known about the factors influencing its implementation. This paper discusses the issues that arose using CFIR to qualitatively assess the factors influencing implementation of cultural transformation.
Results Application of CFIR to this broad-scale evaluation revealed three strategies recommended for use in evaluating implementation of broad-scale change: (1) the use of adapted definitions for CFIR constructs (especially given the new application to broad-scale change), (2) the use of a mixed deductive-inductive approach with thematic coding to capture emergent themes not encompassed by CFIR, and (3) the use of CFIR for expedited analysis and synthesis for rapid delivery of findings to operational partners. This paper is among the first to describe use of CFIR to guide the evaluation of a broad-scale transformation, as opposed to discrete interventions. The processes and strategies described in this paper provide a detailed example and structured approach that can be utilized and expanded upon by others conducting broad-scale evaluations.
Although CFIR was the framework selected for this evaluation, the strategies described in this paper (use of adapted definitions, use of a mixed deductive-inductive approach, and the approach for expedited analysis and synthesis) can be transferred and tested with other frameworks.


Background
Improving performance and initiating broad-scale change at the organizational level in healthcare often involves multiple interventions, or a collection of interventions including complex, multi-faceted interventions needing careful coordination and adaptation to the specific context in which they are being implemented [1]. While understanding the process of dissemination of these practices is a priority [2] and efforts have been made to identify and describe mechanisms for change at the health system level when implementing complex multi-dimensional interventions [1], challenges exist in evaluating implementation of these complex interventions.
In evaluation, theories and frameworks describe and prescribe aspects of an assessment, rooted in the needs or requirements of the customer and the purpose of the inquiry. These include activities or strategies, methods choices, and the responsibilities of and products to be provided by the evaluators [3]. Evaluators must consider the complexity and coordination of the multiple interventions when selecting an appropriate evaluation strategy, but often end up relying on evaluation of the interventions individually due to the paucity of frameworks available for evaluating broad-scale change requiring multiple interventions.
The Consolidated Framework for Implementation Research (CFIR) has primarily been used to evaluate implementation of single, discrete interventions or programs [4][5][6], yet it may also be particularly useful for evaluating broad-scale programs implemented by large, integrated healthcare systems. The CFIR complements interventions built on process-oriented theories focused on how implementation should be planned, organized, and scheduled [7], aspects which are critical when coordinating and evaluating implementation of multiple, complex interventions. CFIR offers a comprehensive, unifying taxonomy of constructs related to the intervention, inner and outer settings, characteristics of individuals, and implementation process [8]. Because CFIR offers a wide-reaching set of constructs, it is possible and practical to apply these constructs as a comprehensive set of a priori codes for deductive coding. This, in turn, provides a means to expedite the analysis of large amounts of qualitative data [9] and facilitates the rapid turnaround of recommendations to leadership, operations partners or programs.
The objective of this paper is to demonstrate the utility of the CFIR framework for evaluating broad-scale change by discussing the challenges to be addressed when assessing broad-scale change and the solutions developed to address those challenges, using the evaluation of implementation of Patient-Centered Care and Cultural Transformation (PCC&CT) within the Department of Veterans Affairs (VA) as a demonstrative example.

PCC&CT in VA
The VA model of patient-centered care (PCC) focuses on whole person care, by providing care that is personalized, proactive, and patient-driven [10]. The VA's commitment to PCC was solidified with the creation of the Office of Patient-Centered Care and Cultural Transformation (OPCC&CT) in 2010 [10] which was established to promote cultural transformation in VA. To achieve this task, OPCC&CT used an approach that included a broad range of interventions and innovations that required piloting and evaluating, as well as testing of many different strategies for supporting their implementation across the system at the clinical and organizational level. Examples of interventions or innovations include: redesigning the environment of care to make it more comfortable and inviting (e.g., use of more natural light, providing better maps and way-finding for navigating the hospital) and providing training and promoting use of diverse approaches by clinicians where the patient is recognized as the primary member of a supportive team with an equal voice and with the choice of goals that may encompass all facets of their life, even those beyond primary health concerns [10].
OPCC&CT sought to understand how PCC was being implemented, and engaged health services researchers to conduct an implementation evaluation; given the size and scope of the transformation, two groups were selected to conduct the evaluation (located in Chicago and Boston). The goal of the evaluation described in this manuscript, a component of a larger evaluation, was to develop a set of recommendations for future rollout of PCC to the broader VA organization by describing the key lessons learned from understanding individual and organizational factors, key barriers and facilitators, and the strategies used and their impact on implementation; the results of which have been expected for publication elsewhere [11].

Challenges in assessing implementation of PCC&CT
This study used a realist approach to evaluation of PCC in VA [12], which recognizes that changes resulting from implementation of interventions/programs occur in complex and dynamic ways and result in both planned and unplanned processes and outcomes [12]. Given that the evaluation was utilizing CFIR in a new way, the evaluation team took the following approach to planning its use in the evaluation: (1) assessing its fit for the application, (2) closely tying the methodological approach and analytic strategy to the framework, and (3) projecting the use of the framework to structure a set of recommendations that could be used by the operations partner to support enhancement of implementation.

Identifying a framework and assessing its fit
Prior to the start of the evaluation, discussions with leadership in OPCC&CT focused on the goals of the program office: (1) to describe ongoing implementation efforts of the multiple interventions across the Centers of Innovation (COIs) and (2) to understand the factors influencing those implementation efforts. Recognizing the complexity of the evaluation, the team selected CFIR to plan and guide the implementation evaluation because of the comprehensive nature of its constructs and the flexibility offered in recommendations for its use [12]. Also, the comprehensive nature of the framework lends itself to use as an initial coding structure [13] because its numerous, dynamic constructs offer coverage for wide-ranging themes and ensure the capture of those factors important to implementation [14].
The evaluation team used the "menu of constructs" process which involves identifying and including only constructs essential to the evaluation, which facilitates shorter, focused interviews and expedited analysis [12]. The evaluation teams had exploratory discussions with PCC leaders to gain a preliminary understanding of ongoing and planned innovations and their questions and goals for the evaluation. Following these discussions, the two evaluation teams met to review the scope of the evaluation and the constructs of CFIR that were most relevant and applicable to the evaluation [12] and critical to the questions and goals of key leadership, and followed that with a discussion with OPCC&CT to ensure the "essential constructs" needed to meet the evaluation goals were included. Specific reasons for the selection of each of the constructs can be found in Table 1. As recommended in a recent systematic review of use of CFIR, the framework was integrated into the evaluation design, data collection, and analysis [15] (Table 1).

Developing a methodological approach and analytic strategy
Qualitative data were collected through semi-structured interviews with key stakeholders at each VA facility. CFIR was used to guide development of the interview questions, as well as the coding structure and data analysis. Operational definitions were developed for the CFIR constructs selected for the evaluation (Table 2) based on consensus between the evaluation teams to ensure members collected and analyzed data based on the same understanding of the domains and constructs [16]. These definitions were used by the evaluation teams to inform development of an interview guide to be used by both groups (Table 2). The interviews were conducted by a team of experienced qualitative researchers. Interviews were audio-recorded and transcribed, and descriptive field notes [17] were taken in instances where audio recordings were not collected.
A mixed deductive-inductive [18][19][20] approach to coding was used to analyze data from the interviews. In this approach, operational examples are used to define each construct and to create an initial code list for the analysis [21]. Deductive coding was guided by CFIR [4] using a structured analytical tool (Table 1) to facilitate rapid qualitative analysis. Inductive coding was used to capture themes that were not represented in CFIR as a way to ensure coding was reflective of the data, especially given that CFIR had mainly been used to assess discrete interventions rather than larger-scope programmatic evaluations. Although inter-coder reliability was not calculated initially, the study-specific operational definitions and newly created inductive codes resulted in a high level of agreement between coding teams, requiring few consensus discussions. In cases where consensus discussions were required, there was 100% agreement after consensus (no discussions ended in disagreement or required a third coder).
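As an illustration only, the mixed deductive-inductive coding logic described above could be sketched as a small data structure. This is not the structured analytical tool used in the evaluation; the class and function names (`Codebook`, `code_segment`), the placeholder definition, and the interview excerpt are hypothetical. The a priori codes stand in for CFIR constructs with adapted definitions, and inductive codes are added only when no a priori code covers the theme.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Codebook:
    """A priori (deductive) codes plus codes added during analysis (inductive)."""
    deductive: Dict[str, str]  # code name -> operational definition
    inductive: Dict[str, str] = field(default_factory=dict)

    def add_inductive(self, name: str, definition: str) -> None:
        # An emergent theme becomes a new code only if it is not already an a priori code.
        if name in self.deductive:
            raise ValueError(f"{name!r} is already an a priori code")
        self.inductive[name] = definition

    def origin(self, name: str) -> str:
        # Report whether a code came from the framework or emerged from the data.
        if name in self.deductive:
            return "deductive"
        if name in self.inductive:
            return "inductive"
        raise KeyError(f"unknown code: {name!r}")


def code_segment(codebook: Codebook, segment: str, codes: List[str]) -> dict:
    """Attach analyst-chosen codes to one transcript segment, tagging each by origin."""
    tagged: List[Tuple[str, str]] = [(c, codebook.origin(c)) for c in codes]
    return {"segment": segment, "codes": tagged}


codebook = Codebook(deductive={
    "Intervention Characteristics/Intervention Source":
        "History of PCC-related program(s) or practice(s) and perceived source of the initiative.",
    "Inner Setting/Culture": "Placeholder definition (hypothetical).",
})
# Inductive code created during analysis for a theme CFIR did not fully capture.
codebook.add_inductive("creation story", "Previous history of PCC efforts at the facility.")

coded = code_segment(
    codebook,
    "Hypothetical interview excerpt.",
    ["Inner Setting/Culture", "creation story"],
)
print(coded["codes"])
```

Keeping the origin of each code explicit makes it straightforward to report later which themes were captured by the framework and which emerged from the data.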

Projecting use of CFIR for development of a structured set of recommendations
The final step of this process required synthesizing early findings of the evaluation for rapid use by our operations partner in the field. This process involved: (1) conducting an additional level of analysis on data within the key domains to define how and why each domain was salient in the context of the evaluation and (2) developing a set of recommendations that could be utilized by leadership to enhance implementation of the program. The evaluation team planned to utilize a process similar to the qualitative coding described above. Key domains were split between the evaluation teams and were independently analyzed by investigators to develop definitions, which were then refined within each team. Next, a set of recommendations was developed by the individual evaluation teams assessing the key domains. Finally, several meetings were conducted with the full evaluation team to discuss the definitions of the key domains and the recommendations for finalization.

Strategies identified for using CFIR to evaluate implementation of broad-scale change
During the evaluation, the strategies identified by the evaluation team for use of the CFIR in this new application were tracked, as they were considered essential components for understanding the approach to and use of CFIR for this broad-scale evaluation. Aspects of each of these strategies were tracked through documents that were also used for completing required elements of the evaluation (such as meeting minutes). The strategies identified by the team and the documented aspects of those strategies are presented in the results below. Three overarching strategies were identified during the evaluation: (1) the creation of adapted definitions for the CFIR constructs to account for its application to the broad-scale evaluation (Table 1); (2) the use of a mixed deductive-inductive coding process, which demonstrated the flexibility of the CFIR framework for complex evaluation through the emergence of both additional CFIR constructs not initially accounted for by the study team (Table 3) and several new key themes from the co-occurring inductive thematic coding (Table 4); and (3) the use of CFIR for the rapid analysis and synthesis of the data into key domains impacting implementation of PCC&CT to develop recommendations for the VA OPCC&CT leadership to support enhancement of implementation and expansion opportunities of the program (Table 5). These findings are described in further detail below.

Adapted definitions
As a first step, the evaluation team reviewed the CFIR domains and constructs and developed adapted definitions based on the study context, including: (1) the broad scope of the intervention(s), (2) the broad-scale change targeted, (3) the input and existing knowledge of the operations partner OPCC&CT, and (4) the goals of the evaluation, including assessing what had already occurred and what was currently in progress. Some definitions required more adaptation than others. For example, the domains/constructs "Intervention Characteristics/Complexity" and "Inner Setting/Culture" did not necessarily require adaptation of the definition; however, the questions associated with measuring those constructs did have to be broader than those typically associated with a single-intervention evaluation. Adapted definitions for the other constructs are available in Table 2, and the standard short descriptions of these constructs can be found at the CFIR Wikipage [22].
Other domain constructs required adaptation of the definitions to fit the goals of the evaluation and the needs of the operations partner. For example, the domain/construct "Intervention Characteristics/Intervention Source" required adaptation from the standard short description "perceptions of key stakeholders about whether the intervention is externally or internally developed" [22] to the adapted definition "History of PCC-related program(s) or practice(s) and perceived source of the initiative." The primary purpose for this adaptation was two-fold: (1) the participants interviewed in this study had some level of involvement in the implementation of PCC at their facility, and therefore were familiar with OPCC&CT and the source of the intervention, and (2) OPCC&CT spent much time and energy on strategies to expose individuals to the PCC cultural transformation, and knowledge of the source of the intervention was (presumably) widely known. Instead, the emphasis was placed on understanding the history of PCC at the facilities, which, in some cases, was the source of the intervention in that it was an innovation already present at a facility and adopted by OPCC&CT as a recommendation for other facilities.
In another example, the domain/construct "Outer Setting/Patient Needs & Resources" was adapted from the standard short definition "the extent to which patient needs, as well as barriers and facilitators to meet those needs, are accurately known and prioritized by the organization" to the adapted definition "Identified patient needs, processes used to identify them, barriers and facilitators associated with meeting needs and strategies for engaging patients to identify ways to address them." The need to adapt the definition for this construct was largely driven by the characteristics of the transformation, namely, its patient-centered and patient-driven nature. OPCC&CT was interested in more information beyond just understanding patient needs and the barriers/facilitators to addressing them, such as the processes and strategies for engaging patients as partners to identify, strategize, and address those needs.

Emergence of CFIR constructs (deductive) and new thematic codes (inductive)
CFIR is composed of 39 constructs, of which 19 were selected by the evaluation team to gather data on via targeted interview questions (Table 1); these constructs were selected as part of the "menu of constructs" approach [12], focusing on the questions essential to the evaluation.
Interview data revealed that all 19 (100%) of the selected constructs targeted in the interview guide were identified as important influences on the implementation of PCC. Several interview questions encouraged longer narrative-type answers, such as: "What do you think about when you hear the term patient-centered care? What are the key elements for care to be patient-centered from your perspective?" and "Tell me a little bit about the history of transforming the organization to become more patient-centered." These types of questions, along with follow-up and probe questions asked to further explore participants' perspectives, resulted in the emergence of additional CFIR constructs beyond those selected in the menu of constructs process prior to the start of the evaluation. In fact, another 16 CFIR constructs emerged across 4 of the 5 CFIR domains (Table 3) when using deductive coding with the CFIR structured analytical tool (Table 2). Interestingly, although OPCC&CT and the evaluation team placed lesser emphasis on the factors in the outer setting and characteristics of individuals domains and constructs, multiple additional constructs emerged in these two domains.
The mixed deductive-inductive approach to coding enabled the team to utilize thematic coding (inductive) to create codes for additional themes that: (1) were not fully represented by a CFIR construct, (2) provided context-specific details, or (3) offered advantages for organization of ideas. An example for each of these is provided below, and the thematic codes and their definitions are provided in Table 4.
For example, one of the emerging codes that was not fully represented by a CFIR construct was 'key strategies.' This code was used to explore key strategies utilized in implementation; an example that emerged from the data included taking chances with novel ideas that resulted in "quick wins" and "sparks" of innovation across the hospital that encouraged staff to embrace the idea of PCC. In another example, the code 'role in VA' was used as a context-specific detail to differentiate the dual roles that some served (one OPCC-specific role and one in VA in general). For example, an individual might serve as a Patient-Centered Care Coordinator within the transformation and as a Nurse in a clinical VA role, an arrangement often referred to as "collateral duty"; this dual role could result in a dual perspective that should be differentiated.
Another code, 'creation story', was used to capture the previous history of PCC efforts at the facility. This code was important, particularly given that (1) the sites were selected as COIs, in part, based on their status as leaders in cultural transformation and (2) the evaluation was being conducted after the transformation had already begun. Unlike the 'Tension for Change' construct within CFIR, which is focused on identifying a 'need' for change (often reflective of a discrete issue or set of issues), the code 'creation story' encourages a more narrative reflection on the setting in which PCC was being implemented at its inception.
Some codes simply offered advantages for organizing ideas, such as the "PCC Barriers" and "PCC Facilitators" codes, where all mentions of barriers and facilitators could be placed for easy access rather than having to search within CFIR construct codes to identify them. By examining where barriers and facilitators were double-coded with CFIR constructs, the team was able to determine overarching themes that hindered or facilitated implementation of PCC innovations. Similarly, the code "Golden Nugget" was used to flag quotes that stood out to the coding teams or that were emblematic, particularly successful, or salient with regard to the construct, allowing easy identification of these exemplary quotes.
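The double-coding check described above can be illustrated with a short sketch that counts how often the organizing codes co-occur with other codes on the same segment. The coded segments below are invented for illustration and are not study data; the function name `cooccurrence` is hypothetical, and the sketch assumes each segment is represented simply as the set of codes applied to it.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

# Organizing codes named in the text; every other code on a segment is
# treated here as a construct or theme it may be double-coded with.
ORGANIZING_CODES = {"PCC Barriers", "PCC Facilitators"}


def cooccurrence(coded_segments: Iterable[Set[str]]) -> Counter:
    """Count (organizing code, other code) pairs applied to the same segment."""
    counts: Counter = Counter()
    for codes in coded_segments:
        organizing = codes & ORGANIZING_CODES
        others = codes - ORGANIZING_CODES
        for org in organizing:
            for other in others:
                pair: Tuple[str, str] = (org, other)
                counts[pair] += 1
    return counts


# Invented example segments (sets of codes applied to each segment).
segments = [
    {"PCC Barriers", "Inner Setting/Available Resources"},
    {"PCC Barriers", "Inner Setting/Available Resources", "Inner Setting/Culture"},
    {"PCC Facilitators", "Inner Setting/Leadership Engagement"},
    {"Inner Setting/Culture"},  # no organizing code, so it contributes nothing
]

counts = cooccurrence(segments)
for (org, other), n in counts.most_common():
    print(f"{org} x {other}: {n}")
```

Sorting the pairs by frequency gives a quick view of which constructs are most often discussed as barriers or as facilitators, which can then be examined qualitatively.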

Rapid, actionable feedback
Finally, the initial discussions, in which key evaluation questions were identified by the PCC leadership and the evaluation team and connected to the CFIR framework, together with the study-specific definitions developed, facilitated delivery of rapid, actionable feedback on the evaluation. The availability of these context-specific definitions for the constructs allowed for identification of factors influencing implementation in an organized and easily accessible way. It also enabled the evaluation team to deliver a methodologically sound, prompt analysis of the data, which facilitated development of timely, meaningful recommendations to the operational partner. To demonstrate this point, 107 interviews were conducted, transcribed, and analyzed over a period of approximately 5 months. The evaluation team used this assessment to create a set of recommendations that could be used to facilitate the development of strategies and processes to support future implementation efforts; these recommendations were delivered at the end of the sixth month. These findings are explored further in Table 5.
These examples and others were reported as part of a white paper developed by the evaluation teams, which was described in the OPCC&CT annual report as informing the strategies the office was taking in moving the program forward.

Discussion
The multiple, complex interventions often required for implementing broad-scale change create challenges for evaluation teams. In particular, there are no guidelines or recommended frameworks for evaluating implementation of broad-scale change. This study is among a small number of studies to use CFIR for conceptualizing an evaluation and for guiding data collection, coding, and analysis [12] and one of the first to use it to evaluate a broad-scale system change. In addition, examples in which CFIR has been used to assess implementation involving multiple interventions aimed at broad-scale change are limited; the authors identified one other example in which CFIR was used to assess implementation of a continuum of psychosocial interventions [23]. As such, it required a number of steps and processes that exercised the flexibility of the framework in new ways. This paper describes the steps taken to plan an evaluation and the strategies developed to utilize the CFIR framework for evaluation of broad-scale change.
The appropriateness of the application of CFIR in the evaluation of this broad-scale change is demonstrated by the ability of the framework's constructs to "fit" the data. This is evidenced by the fact that constructs that were not 'pre-selected' by the study team as potentially relevant to this large-scale implementation emerged from the data and were captured by the evaluation team post hoc [14]. The current study differs from other studies using CFIR to evaluate discrete interventions [4][5][6][12], in which findings are nearly exclusively tied to the framework [24], an approach that may not be appropriate for the evaluation of a broad-scale change.
The application of CFIR in the context of broad-scale change required the creation of adapted definitions to account for this unique application; these definitions were used both to develop interview questions and to analyze interview data. In another study of "complex system interventions", Smith et al. [25] described a number of adaptations, including changing the names of domains and constructs within CFIR to address distinctive features of the interventions being studied as well as modifying definitions of the constructs to incorporate terminology and examples of the specific interventions. In that study, an evaluation was not conducted; rather, CFIR was used to inform the development of new frameworks to be used in future evaluation efforts in process redesign (PR), patient-centered medical homes (PCMH), and care transitions. The work completed by this group is important because it not only exposes the advantages of adapting and refining existing CFIR constructs and definitions, but also details the process of doing so. The current study builds upon this work not only by demonstrating adaptation of the CFIR constructs in the context of a broad-scale evaluation, but also by using those adapted definitions to design data collection tools and a supporting analytic framework.
Utilizing a mixed deductive-inductive approach [18,19] allowed for the identification of emergent themes that were not represented in CFIR and that may be unique to evaluating large-scale transformations, rather than discrete intervention implementation. These themes needed independent codes, offered context-specific information, or were grouped together for better organization. Further, the approach used in this evaluation builds upon the work of Damschroder and colleagues, who used CFIR to evaluate a large-scale weight management program in VA and also shared some details about their process, including choosing not to do parallel inductive coding while remaining open to new themes (though the group felt that significant themes were encompassed by CFIR) [12]. One of the reasons the PCC evaluation team chose CFIR as a framework was its flexibility and the openness of the creators of the framework to testing its flexibility and applicability. The current study suggests that while application of CFIR as a deductive analytical framework without inductive coding to allow for emergent themes is appropriate in some cases, in other cases utilizing inductive coding to capture those themes is vitally important.
The use of CFIR facilitated the rapid analysis and synthesis of a large number of interviews in a short period of time: 107 interviews in 5 months, with final synthesis and delivery of findings by the end of month six. This evaluation approach resulted in a methodologically sound, easily digestible, and actionable set of findings and recommendations for the operations partners in a white paper entitled Lessons from the Field for Implementing Patient-Centered Care and Cultural Transformation. This proved critical for OPCC&CT, as they quickly operationalized the findings and disseminated a document to the field and their stakeholders entitled Lessons from the Field-Operational Tactics for Implementing Patient Centered Care and Cultural Transformation, which proposed "operational tactics" or steps for addressing findings from the white paper and was described in OPCC&CT's Annual Report [26].

Conclusions
Utilizing CFIR in a relatively new application, a broad-scale evaluation with multiple interventions, yielded a number of important processes and insights that should be considered to expand its application to future broad-scale evaluations. This study demonstrates the utility and value of utilizing a comprehensive framework with a directed, yet flexible approach to evaluation, which has implications for the broader field of implementation science. A collection of programs with multiple interventions with sometimes staggered, sometimes simultaneous beginnings presents a very challenging, complex evaluation that requires a balance between focus and flexibility. The insights that emerged from the study suggest that application of frameworks to organize findings and ideas from these complex evaluation environments is critical to delivering well-formulated recommendations that are derived from data and driven by a sound theoretical basis. This study contributes to the larger implementation literature not only on the fit of framework constructs in action, but also on the utility and practicality of using these frameworks for different applications.
In addition, the key analytic processes described in this paper provide a detailed example and structured approach that can be utilized and expanded upon by others in the implementation science community conducting broad-scale evaluations. Although CFIR was the framework selected for this evaluation, the analytical processes described in this paper (use of adapted definitions, use of a mixed deductive-inductive approach, and the approach for expedited analysis and synthesis) can be transferred and tested with other frameworks. Continuing to test frameworks in general, and reporting experiences with use of these frameworks in new ways, provides important ongoing insight to the implementation science community.

Limitations
This study had several limitations. First, the evaluation team chose not to use the full CFIR analysis approach, which involves rating of CFIR constructs by case [12]; although this approach offers the ability to compare across sites, the 4 COIs on which this evaluation was based represent a small number of sites, and therefore the application was not appropriate. Second, similar to the analysis conducted by Damschroder and colleagues (2013) [12] for their evaluation, discrepancies in coding were not quantified in this study. However, no issues were encountered while reaching consensus on disparate codes [12], which suggests that use of the constructs as a priori codes in a structured analytical tool is appropriate. Finally, this study may not highlight the potential application of CFIR in other healthcare contexts, and additional studies may be needed.

Authors' contributions
JNH collected, analyzed, and interpreted interview data and led the conceptualization and writing of the manuscript. SML analyzed and interpreted interview data and assisted with the conceptualization and writing of the manuscript. BGB was a funded PI for the study and collected, analyzed, and interpreted data as well as contributed to the conceptualization and writing of the manuscript. GMF collected, analyzed, and interpreted data and contributed to the conceptualization and writing of the manuscript. JS collected, analyzed, and interpreted data and contributed to the conceptualization and writing of the manuscript. NM collected and analyzed data as well as contributed to the writing of the manuscript. SLL was a funded PI for the study and interpreted the data as well as contributed to conceptualization and writing of the manuscript. All authors read and approved the final manuscript.