Revisiting Retrospective Chart Review: An Evaluation of Nursing Home Palliative and End-of-Life Care Research

Introduction
As palliative care becomes a more mainstream approach in facilitating end-of-life or terminal care in long-term care (LTC) homes [1], researchers have sought to describe and evaluate this method of care. In a variety of health care fields, including palliative care, a popular method of data collection is retrospective chart review [2,3]. Using the retrospective chart review method allows researchers to examine, record and understand past clinical events documented in persons' medical charts [4,5]. However, despite retrospective chart review's touted feasibility, issues with the reliability and validity of this data collection method have been identified, due to both the limitations of the chart itself and the limited guidance on using this method provided in the literature [4,6,7]. It is important to examine the reliability and validity of retrospective chart review used in palliative care research, as the findings of studies employing this method may ultimately shape how end-of-life care is provided in LTC homes. Therefore, the purpose of this paper is to review and evaluate the reliability and validity of the retrospective chart review method in LTC home-based palliative and end-of-life care research.

Retrospective Chart Review
In order to appreciate the use of retrospective chart review in health care research, it is essential to define it and understand both the advantages and the areas for caution when using this data collection method. Retrospective chart review is a specific type of data collection method used in archival research and can be understood in two parts [5]. First, the term retrospective means to look back in time, and in this case, at clinical events [4]. Second, the information in the medical chart is used as the source of data. This data collection method is commonly used in studies with retrospective designs, where research questions cannot be answered prospectively [4]. It is especially appropriate in evaluating approaches to palliative and end-of-life care, as neither the dying person nor their family member is burdened with actively participating in the research process at or near the time of death.
Although study designs and data collection methods should be based on the most rigorous way of answering research questions [8], several advantages of employing the retrospective chart review method encourage its use. It has been touted as a "quick and dirty" option because the clinical data already exist and just have to be abstracted from the medical charts [2,4]. Another advantage of retrospective chart review is its relatively low cost when compared to prospective trials [4,9]. In addition, medical charts are generally accessible to researchers, and can be a source of clinical richness and accuracy. Due to these advantages, retrospective chart review may be suitable for pilot work in the LTC home setting, if valid and reliable methods are employed.
While it is recognized that chart reviews can be a convenient method of data collection, there are many complexities in retrieving relevant, high-quality information [10]. Importantly, authors have noted a lack of published, well-established approaches to retrospective chart review, which leaves the validity and reliability of the method in question [6,7,11]. To understand how validity and reliability affect retrospective chart reviews, it is first important to define each. Validity describes the degree to which a tool, protocol or process accurately represents the concept or topic it was designed to measure [12]. Reliability describes the degree to which a tool, protocol or process will generate the same or similar results when it is used over time, with the assumption that what is being measured remains unchanged [12].
Several limitations of retrospective chart review that threaten the overall validity and reliability of the method have been noted, concerning both the chart itself and the process of abstraction. Recognized limitations of the medical chart itself include inaccurate, incomplete or illegible documentation, as well as variance in the quality and location of the information recorded by medical professionals [4,13,14]. Recognized limitations of the chart review process include missing charts; lack of a clear procedure for data abstraction and for handling missing or incomplete data; lack of abstractor training or blinding to the study purpose; and inconsistency or mistakes in coding chart information [11,15]. Together, these limitations may negatively impact the validity, and especially the reliability, of the retrospective chart review method and any subsequent study findings.

Chart Review in The Palliative Care Context
The definitions of palliative care provided by the Canadian Hospice Palliative Care Association [16] and the World Health Organization [17] include that it is an approach to care which aims to enhance quality of living and dying for persons facing life-threatening illness, through the relief of pain and suffering, and attention to physical, psychosocial and spiritual needs. For the purposes of this paper, palliative care is considered an approach to caring for residents at the end of their life or facing a terminal diagnosis. However, no definite time period has been determined for when the palliative approach to end-of-life or terminal care is appropriate [18].
The need to improve palliative care for residents of LTC homes has been recognized in combination with continuing efforts to advance LTC home staff's knowledge [19,20]. More high-quality studies are needed to test interventions to improve LTC home processes and resident outcomes [19]. It is postulated that a retrospective chart review method may serve as an entry point for researchers and LTC home personnel to assess the quality of current palliative care processes and outcomes. Pilot trials using this approach, or accessing published studies that have used retrospective chart reviews, may lead to the development of interventions or implementation of quality assurance programs aimed at improving palliative and end-of-life care. Given the possible threats to the reliability and validity of this method, it is important to examine the palliative care literature. Therefore, the following research question was explored: Do palliative or end-of-life care studies, set in LTC homes, use reliable and valid methods when employing retrospective chart review as the sole approach to data collection?

Definitions
The following definitions will be applied throughout this paper. A LTC home provides accessible 24-hour nursing care for persons over 18 years of age [21]. References to nursing homes, long-term care geriatric institutions or facilities in the studies are defined as a LTC home. Any reference to a chart review is defined as data collected from a chart review, chart audit, clinical record review, medical record audit or medical record review process, as reported in the literature. The chart itself may contain past documented medical histories, clinical orders, test results, and assessment and care notes, specific to an individual person. The chart can be in either paper or electronic format, or a combination of both formats.

Search strategy
In order to search for relevant literature, a library consultant helped to determine electronic databases and key search terms for LTC homes, palliative and end-of-life care, retrospective, and chart reviews. See Table 1 for an example of the search terms. The following electronic databases were searched and yielded the following results in January 2014: Ageline (n=31), Excerpta Medica Database (EMBASE) (n=258), Cumulative Index to Nursing and Allied Health Literature (CINAHL) (n=104), and Medline (n=179). No date restrictions were used in the search, but the results were limited to the English language.
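The Boolean structure of such a search can be illustrated with a minimal sketch. It assumes the pattern noted with Table 1, in which the terms within each grouping are joined with OR and the three groupings are then joined with AND. The function names and the individual terms below are hypothetical placeholders rather than the authors' actual search strings, and database-specific syntax (field tags, truncation symbols) is ignored.

```python
def or_group(terms):
    """Join one grouping of search terms with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(groups):
    """Combine the OR-joined groupings with AND."""
    return " AND ".join(or_group(g) for g in groups)


# Hypothetical example terms: NOT the search terms used in the paper.
groups = [
    ["long-term care", "nursing home"],        # setting
    ["palliative care", "end-of-life care"],   # type of care
    ["retrospective", "chart review"],         # data collection method
]

query = build_query(groups)
print(query)
```

Keeping each grouping as its own list makes it straightforward to add a synonym to one concept without changing how the concepts are combined.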

Study selection
Three inclusion criteria were used to select the studies for evaluation. First, the study had to employ retrospective chart review as the main method of data collection. Second, the study description had to clearly indicate that the data were abstracted from LTC home residents' charts. Therefore, studies that included data collection from multiple types of settings (e.g., hospice or hospital) were included, as long as at least one cohort of charts was obtained from a LTC home setting. Also, the data had to be abstracted and analyzed from the persons' charts, as opposed to being part of a secondary analysis. Finally, the study had to focus on the provision of palliative or end-of-life care to residents of LTC homes. As described above, terminal and end-of-life care processes are relevant to the provision of palliative care in LTC homes.

[Table 1. Column headings: Group of Search Terms; Terms. First grouping listed: Long-term care home.]
*Each grouping of search terms was first combined using the OR command. The three groupings were then combined with the AND command.

Two further exclusion criteria were used. First, studies that mainly abstracted resident information from large administrative data sets such as the Resident Assessment Instrument Minimum Data Set (MDS)*1 were excluded. Several issues have been raised around the consistency of the MDS's psychometric performance in everyday use, its syndrome-specific scales (e.g., depression and pain), its ability to accurately represent residents' clinical status, and the attitudes of home staff towards its completion [22]. In addition, DiCenso et al. [23] advocate caution in using large administrative data sets in research, due to a lack of relevant details, inaccuracy, and incompleteness of information. Second, studies using multiple data collection methods to answer the same research question or purpose were excluded. It was postulated that such studies would give insufficient detail about the chart review in the methods section, given the limited space that academic journals may provide [10]. If insufficient detail was provided, the evaluation of the reliability and validity of the method may not accurately reflect the actual quality of the study.

Description of the evaluation tool and approach to synthesis
In 1996, Gilbert et al. [11] suggested that eight methodological strategies, based on the works of Boyd et al. [24] and Horwitz et al. [25], would assess the validity, reliability and general quality of data gleaned from medical charts. The authors used these eight strategies to develop evaluation criteria, and then applied them to published emergency medicine research articles that employed chart review as the primary source of data. Each of the 244 articles reviewed by Gilbert et al. [11] was examined and given a yes or no rating for the following criteria: abstractors trained, inclusion/exclusion criteria described, important variables defined, standardized abstraction forms used, abstractors' performance monitored, abstractors blinded to the study objective and patient assignment, inter-rater reliability discussed and inter-rater agreement tested. Gilbert et al. [11] reported on the proportion of these articles that adhered to the eight criteria, and concluded that strong chart review methods were lacking.
Others have published practice guidelines suitable for conducting retrospective chart reviews [6,7,10,15,26,27]. In their practice guidelines, these authors have included criteria used by Gilbert et al. [11] for addressing the reliability and validity of chart reviews. In addition, Panacek [5] offered Gilbert et al.'s [11] criteria as a resource for assessing chart review studies. Thus, the eight criteria employed by Gilbert et al. [11] were selected to guide the evaluation of the LTC home-based, palliative care literature. In this evaluation, a yes or no rating was assigned to each of Gilbert et al.'s [11] criteria. The context and quality of the study description helped to determine the yes or no rating.

*1 MDS is a standardized assessment tool designed to communicate quality indicators to LTC home and government
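The rating approach described above can be sketched in a few lines. The sketch assumes each study receives one of the codes listed in the Table 3 legend (Y, N or NA) for each of the eight criteria, and that studies rated NA are left out of the denominator for that criterion. The study names and ratings below are invented for illustration only and do not reproduce Table 3.

```python
CRITERIA = [
    "abstractors trained",
    "inclusion/exclusion criteria described",
    "important variables defined",
    "standardized abstraction forms used",
    "abstractors' performance monitored",
    "abstractors blinded",
    "inter-rater reliability mentioned",
    "inter-rater agreement tested",
]


def rate(*codes):
    """Pair the eight criteria with one study's rating codes ('Y', 'N' or 'NA')."""
    if len(codes) != len(CRITERIA):
        raise ValueError("one rating code is required per criterion")
    return dict(zip(CRITERIA, codes))


def adherence(ratings):
    """Proportion of applicable studies rated 'Y' on each criterion.

    Studies rated 'NA' on a criterion are excluded from its denominator;
    the result is None when no study is applicable.
    """
    summary = {}
    for criterion in CRITERIA:
        applicable = [r[criterion] for r in ratings.values() if r[criterion] != "NA"]
        summary[criterion] = applicable.count("Y") / len(applicable) if applicable else None
    return summary


# Hypothetical ratings: NOT the values reported in Table 3.
ratings = {
    "Study A": rate("Y", "Y", "Y", "Y", "N", "N", "Y", "Y"),
    "Study B": rate("NA", "Y", "Y", "N", "N", "NA", "N", "N"),
    "Study C": rate("N", "Y", "N", "Y", "N", "N", "Y", "N"),
}

summary = adherence(ratings)
for criterion, proportion in summary.items():
    print(f"{criterion}: {proportion}")
```

Excluding NA ratings from the denominator keeps a criterion that could not apply to a study from being counted as a failure of that study.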

Search results
Without accounting for duplicates, the literature search yielded a combined total of 572 articles. The inclusion and exclusion criteria were applied to each title and abstract, which resulted in 16 articles identified for the evaluation (see Figure 1). However, it was clear that Chen et al. [28] and Lamberg et al. [29]; Hickman et al. [30] and Hickman et al. [31]; and Travis et al. [32] and Travis et al. [33] produced separate articles derived from the same retrospective chart review data collection processes. Therefore, a total of 13 unique retrospective chart review processes from 16 articles were identified for the evaluation. Hereafter, the reviewed articles are referred to as studies, to account for the repeated chart review processes. For a summary of the included studies see Table 2.
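The counts reported here can be checked with simple arithmetic. The sketch below uses only figures stated in the text: the per-database hits from the search strategy and the three pairs of articles that share a chart review process. The variable names are illustrative.

```python
# Hits per database, as reported in the search strategy (January 2014).
database_hits = {"Ageline": 31, "EMBASE": 258, "CINAHL": 104, "Medline": 179}
total_hits = sum(database_hits.values())

# Articles retained after screening titles and abstracts.
articles_selected = 16

# Reference numbers of article pairs derived from the same chart review process.
shared_process_pairs = [(28, 29), (30, 31), (32, 33)]

# Each pair counts as a single chart review process, so one article per pair
# is subtracted from the article count.
unique_processes = articles_selected - len(shared_process_pairs)

print(total_hits, unique_processes)
```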

Evaluation criteria
The following presents the results of the review and evaluation of the eight criteria that were applied to the 13 studies (16 articles). See Table 3 for a summary of the evaluation.

Abstractor trained
Overall, the studies provided a sparse description of abstractor training and qualification, which limits the reliability of the method. Two studies reported that abstractor training or instruction was provided [34,35], and one study reported that the data abstractors were highly experienced nurse data collectors [18]. Of note, it was clear from the description given in Hall et al. [1] that all of the authors were involved in the development of the data abstraction tool and completed the chart review process. Similarly, in Suhrie et al. [36] the data abstractor designed the abstraction tool, and in Takezako et al. [37] the primary author was the sole data abstractor. Therefore, in these three studies, it was assumed that training was not necessary. However, the abstractors' qualifications for, and previous experience with, retrospective chart review were not detailed. For the remaining studies, this criterion was not well addressed or clarified. Due to the limited description of the abstractor training, this criterion related to reliability was found to be lacking.

Inclusion and exclusion criteria described
This criterion was consistently well addressed, thereby giving the reader a clear understanding of the sample of charts that were included in the review processes. Inclusion criteria were described in all 13 studies. With one exception [30,31], all of the studies stated that they sampled only the charts of residents who had died within a certain time period, for example between May 2001 and 2002 [35]. Exclusion criteria were noted in five studies. Most authors listed exclusion criteria that were based on the resident either dying outside of the LTC home or unexpectedly [1,37-39] or not residing in the LTC home for a long enough period of time [29]. On the whole, the 13 studies demonstrated attention to the inclusion and exclusion criteria, which were appropriate to the topic of palliative care.

Important variables defined
The authors of the reviewed studies were attentive to providing detail around the important variables. However, it was more common for the important variables to be listed than to be given an operational definition. An operational definition would have better addressed the validity of the chart review processes. With one exception [39], all of the authors tended to list variables that were more objective in nature, such as resident demographic characteristics, medications, and presence or absence of clinical symptoms. However, three studies clearly defined the variables that directly related to their research questions [28,29,32,33,36]. In addition, DiGiulio et al. [34] provided an explanation of the functional assessment staging tool (FAST) used in Alzheimer's disease diagnosis and prognostication. Also of note, studies indicated that their variables were informed by the literature [1,33,40] or by the literature in combination with clinical experience [28,29,37].

Standardized abstraction forms used
While it was inferred that the important variables would serve to inform the abstraction tool, its design and use were, overall, less clearly detailed. Nine studies reported that an abstraction tool was used, and this was indicated in a variety of ways. First, four studies stated that their abstraction form drew on existing work, including the Toolkit After death Chart Review [35], Latimer's tool to audit hospital care of the dying [1], the Medication Appropriateness Index [36] and the Physician Orders for Life-Sustaining Treatment form [30,31]. In these studies, the design of the abstraction form was more clearly explained. Second, Powers and Watson [18] as well as Keay et al. [41] provided details around addressing the validity of the variables included in their abstraction forms. Third, pre-testing of the abstraction tool was indicated in three studies [32-34,41]. Last, Keay et al. [40] stated that their protocol had received clearance from an institutional ethics review board. While some detail was included, overall, an insufficient description of the abstraction form's design and use was found. Therefore, the reliability and validity of the abstraction forms should be questioned.
Also, the description of the uniform handling of missing or conflicting information was lacking. Five studies commented on this aspect. Chen et al. [28] and Lamberg et al. [29], and DiGiulio et al. [34] indicated that when ambiguous cases were identified, chart reviewers sought assistance from the LTC home employees to resolve issues. Similarly, Solloway et al. [35] reported that the data abstractors could consult the study investigators to sort out any problems. Hickman et al. [31] also described a process of consensus building among researchers and dropping cases with insufficient information. Takezako et al. [37] mentioned missing charts. This aspect may not have been described if missing or conflicting data were not commonly encountered by the investigators.

[Table 2, summary of the included studies. Recovered column headings: References; Location of Study; Focus.]

Abstractors' performance monitored
This criterion was addressed only in the study by Hickman et al. [30,31], where the inter-rater reliability of chart abstraction was assessed at regular intervals. That only one study reported such monitoring limits the overall reliability and validity of the chart review process in this body of literature.

Abstractors blinded to study objective and patient assignment
Similar to the previous criterion, the blinding of the abstractors was not well addressed in the study descriptions. In the studies where it was clear that the investigators also collected the data, abstractor blinding would not have been possible [1,36,37]. However, only Suhrie et al. [36] directly indicated that the data abstractor could not be blinded. Overall, abstractor blinding was not well described, and therefore, it was inferred that this measure of reliability was not well utilized in the chart review process.

Inter-rater reliability mentioned
Inter-rater reliability and its related measures, such as abstractor agreement and confirmation, were mentioned in all but four of the reviewed studies. Like the use of a standardized abstraction form, this criterion was indicated in a variety of ways. First, Powers and Watson [18], Hickman et al. [30] and Hickman et al. [31], and Keay et al. [41] directly mentioned measuring inter-rater reliability. Second, Solloway et al. [35] stated that no processes were in place to measure inter-rater reliability. Third, Hall et al. [1], and Travis et al. [32] and Travis et al. [33] commented on abstractor agreement. Also, while less conclusive, Hall et al. [1] indicated that data collection often occurred concurrently between abstractors. In addition, the notion of reliability was implied by Suhrie et al. [36], and Chen et al. [28] and Lamberg et al. [29], as they reported that specific data abstracted from the chart were confirmed by a second reviewer, but the term 'inter-rater reliability' was not mentioned.

[Content of Tables 4 and 5; original table layout not recoverable.]
Keay et al. [40]
The quality of the LTC homes and staff experience can vary across sites.
Carefully consider the setting and context from which the charts were sampled in the article.
Due to the international perspective of the literature, the reader should take into account whether any standards or legislation that may have impacted the provision of palliative care are similar or applicable to their own geographic location.
To enhance the validity, access to resident charts should be planned with the LTC homes so that a representative sample of the population of interest is obtained.
*Note: In all but Hickman et al. [30] and Hickman et al. [31], the residents in the reviewed studies were deceased and the charts were obtained; however, easy access to chart data is not always guaranteed.
Obtaining ethics clearance from the institutional review board may be necessary [46].
As the prevalence of residents with dementia continues to grow in LTC homes [47], obtaining consent to access this population's charts may be problematic. As noted, many people do not give advance directive plans about their participation in research after their decision-making capacity is lost [48].
Aaronson and Burman [44]; Hall et al. [1]
Assess the quality and completeness of the charts that will be used for data collection.
*Note: Authors of three studies reported that the quality of the charts was excellent [28,29,34]. However, the quality of charts should not always be assumed.
The quality of charts should not always be assumed; therefore, do a pilot test before committing to data extraction.
Determine whether applicable data is being recorded in alternate locations (e.g., physician offices, as in Hall et al. [1]).
Determine how the data is stored (e.g., paper chart and/or electronically). It is important to know which source to use, what information could be duplicated within the two sources, and on which type the staff are more likely to record information.
Engel et al. [7]
Consider the suitability of the person that will be procuring the charts and abstracting data.
LTC home personnel may be most familiar with the layout of the chart, and possibly even the content of the charts, which would enhance the reliability of the data abstraction. However, caution is warranted because these persons may not be qualified or may lack experience. This may seriously affect the reliability of the findings, as noted by Engel et al. [7].
If LTC home staff will be procuring and abstracting the data, clear instruction and ongoing monitoring are recommended [49,50].
Hall et al. [1]
Select an appropriate period of time from which to abstract data.
Select a time frame of documentation that will adequately measure variables of interest (e.g.

Inter-rater agreement tested
An appropriate test of inter-rater agreement is the Kappa statistic because it measures agreement beyond what would be expected by chance [42]. Three studies indicated the use of this more rigorous test by reporting the inter-rater reliability of the abstracted data [30,31], the chart abstraction tool [41] and symptom identification [18]. In addition, the study by Travis et al. [32] and Travis et al. [33] reported a 100% agreement on the qualitative coding of all resident charts reviewed, whereas Hall et al. [1] reported a 95% agreement between the abstractor and auditor on the data abstracted from 20 charts. Overall, both mentioning and testing inter-rater reliability criteria were poorly addressed.
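The difference between raw percent agreement and the Kappa statistic can be shown with a short worked example. The sketch below computes Cohen's kappa for two abstractors as (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from each abstractor's category frequencies. The function name and the ratings are hypothetical and are not data from any of the reviewed studies.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e is the chance agreement implied by each rater's category frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical codes for one symptom (Y = documented, N = not documented)
# abstracted from 20 charts by two abstractors who disagree on one chart.
abstractor = ["Y"] * 10 + ["N"] * 10
auditor = ["Y"] * 9 + ["N"] * 11

raw_agreement = sum(a == b for a, b in zip(abstractor, auditor)) / len(abstractor)
kappa = cohen_kappa(abstractor, auditor)
print(f"raw agreement = {raw_agreement:.2f}, kappa = {kappa:.2f}")
```

In this invented example the two abstractors agree on 19 of 20 charts (95%), yet half of that agreement would be expected by chance, so kappa is about 0.90. The gap between the two figures widens as one category comes to dominate the ratings, which is why percent agreement alone is a weaker indicator of reliability.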

Interpretation of findings
In revisiting the findings from Gilbert et al.'s [11] evaluation of the validity and reliability of the chart review method in emergency medicine research, the LTC home-based palliative care literature produces similar results. As in Gilbert et al. [11], the inclusion and exclusion criteria and the important variables were consistently well described in this literature. These two particular criteria relate to demonstrating the validity of retrospective chart review.
However, it was clear that the remaining six criteria provided by Gilbert et al. [11] were not as well described, especially the data abstractors' monitoring and blinding, and testing for inter-rater reliability. These criteria relate more closely to the reliability of the retrospective chart review method. Therefore, the reliability of this data collection method was not well described in this literature. These findings should be kept in mind both when using the literature and when conducting palliative care research.

Suggestions: Using palliative care literature
Overall, the reliability of the chart review method in the palliative care literature was not well indicated. This should be considered when using research findings collected from retrospective chart review as a source of information. In the future, it is recommended that readers favour articles that address each of Gilbert et al.'s [11] eight criteria, to ensure both the validity and reliability of the methods before using any findings to inform practice or direct research. Table 4 provides additional suggestions for critiquing a palliative care article using retrospective chart review.

Suggestions: conducting retrospective chart review in palliative care, LTC home-based research
Conducting a retrospective chart review is a useful approach to data collection to assess the quality and provision of palliative and end-of-life care. As described above, there are several advantages to conducting retrospective chart reviews, which are applicable to the LTC home setting [2,4,9]. This approach may be favourable in a LTC home because the barriers to conducting research in this setting may be avoided. These barriers may include staff time away from duties, obtaining participant consent, access to residents, as well as assessment of resident capacity [43]. However, the methodological articles and studies reviewed in this paper highlight several issues that may impact retrospective chart review and should be considered before designing and implementing a study. See Table 5 for suggestions.
Overall, there was a limited description of the measures used for evaluating the validity, and especially the reliability, of the retrospective chart review method in the LTC home-based, palliative care literature. As such, readers should proceed with caution when using this source of literature to inform palliative care research and practice, and carefully consider whether measures of validity and reliability were included. In addition, palliative care researchers should carefully consider the source of data as well as the available guidelines when engaging in retrospective chart reviews.

Table 1: Example of Search Terms.

Table 3: Summary of the Validity and Reliability Evaluation for Articles (n=16).
Legend: Y=Yes; N=Not mentioned; NA=Not applicable; *=Methods also described in related study article; **3 articles not included due to Not Applicable rating; ***1 article not included due to Not Applicable rating. Evaluation criteria from "Chart Reviews in Emergency Medicine Research: Where are the Methods?" by E. Gilbert, S. R. Lowenstein, J. Koziol-McLain, D. C. Barta, and J. Steiner, 1996, Annals of Emergency Medicine, 27, p. 306. Copyright by the American College of Emergency Physicians.

Table 4: Using Palliative Care Literature.

Table 5: Conducting Palliative Care Research Using Retrospective Chart Review.
... [35] Hickman et al. [31]). For another example, as noted in Hall et al. [1] and Solloway et al. [35], it was not reasonable to expect to abstract data around the reasons behind resident's end-of-life care wishes, if only reviewing chart data from the final 48 hours of life.