"Data salvage" in Hospital Emergency Departments: Extracting Usable Information from Electronic Data Systems

Saturday, March 31st, 2007
Martin von Randow
Research Officer
Social Statistics Research Group
Department of Sociology
The University of Auckland
Auckland, New Zealand


Peter Davis
Professor
Department of Sociology
The University of Auckland
Auckland, New Zealand


Antony Raymont
Senior Research Fellow
Health Services Research Centre
Victoria University of Wellington
Wellington, New Zealand


Roy Lay-Yee
Research Fellow
Social Statistics Research Group
Department of Sociology
The University of Auckland
Auckland, New Zealand


Daniel Patrick
Research Programme Manager
Social Statistics Research Group
The University of Auckland
Auckland, New Zealand

Abstract
Aims
This paper reports on the "data salvage" steps that were required to extract usable operational information from the electronic data capture systems of four New Zealand hospital Emergency Departments (EDs).

Methods
The National Primary Medical Care Survey (NatMedCa), carried out over 2001/02, was a national survey of ambulatory care. The dominant component was a nationally representative sample of general practitioners (GPs) and their patients. A sample of four EDs spread across New Zealand was also drawn. Each ED was asked to report on all of its patients during four (Monday to Sunday) weeks in 2001. The four EDs supplied this information from their electronic data capture systems, covering a total of 15,655 visits over the four weeks.

Results
A list of variables was requested from each ED for the purpose of the survey, but not all of these variables were provided. The data dumps received from the four EDs involved were all formatted differently, often with different variable names for the same items, and data for some of the requested variables were often either partially or completely missing. An ad hoc process was developed and performed on the four data sets in order to identify the core variables from the data provided, edit these for clarity and comparability, and code them as required.

Conclusions
There are substantial difficulties in comparing data across EDs, and between EDs and community-based ambulatory care. The principal problem is the lack of a common, usable template across EDs for the capture of data on core clinical and related variables. A further issue is the coding of such data. Adopting standardised coding systems would greatly assist in understanding the dynamics of ambulatory care across different modalities.


Introduction

The National Primary Medical Care Survey (NatMedCa) aimed to describe primary medical care in New Zealand. The New Zealand Health Survey 2002/03[1] asked respondents what sources of care they had accessed: 80 percent indicated that they had visited a general practitioner (GP), 13.5 percent that they had visited a commercial Accident and Medical clinic (A&M), and 7.3 percent that they had visited a public hospital Emergency Department (ED) in the previous year. EDs are accessible without referral; for example, Christchurch Public Hospital’s ED reported in 1998 that 43 percent of its patients were self-referred.[2] EDs must therefore be considered constituents of the primary health care system in New Zealand, serving a sizeable proportion of the population.

In principle, therefore, individuals in need can choose whether to seek care from a GP, an A&M or the local ED, and the relationships among these services are complex. Preliminary analyses from the NatMedCa data set drawing comparisons of the work undertaken by general practices, A&Ms and EDs in New Zealand have been published elsewhere in a technical report.[3] Current systems of classification for diagnoses do not necessarily indicate the severity or urgency of a problem, and may not indicate the appropriate source of care. To assess the choice to use a general practice, an A&M or an ED, further information, perhaps a purpose-designed index, is required.

This paper is methodological and practical in orientation rather than descriptive of results. It describes the steps that had to be taken – "data salvage" – in order to extract valid and comparable information from the EDs’ electronic data capture systems. This was to facilitate analysis both across EDs, and between EDs and ambulatory care providers in the community sector. A more detailed description of the ED arm of the NatMedCa study can be found in the same technical report.[3]


Methods

The NatMedCa survey 2001/02
NatMedCa involved a nationally representative, multistage, probability sample of GPs and patient visits. The primary purpose of the survey was to collect data on the content of patient visits. For two periods, each of one week, every selected GP completed a detailed questionnaire for a 25 percent systematic sample of patient visits. The questionnaire was adapted from the National Ambulatory Medical Care Survey (NAMCS), a rolling survey of ambulatory care providers in the United States.[4] The overall response rate was 70 percent.

Over the same period, all primary health care practices affiliated with Health Care Aotearoa (HCA), the main umbrella organisation for non-profit community-governed practices, were invited to participate, as were a 50 percent random sample of all A&Ms distributed over the country. Overall, 70 percent of HCA practices and 55 percent of A&Ms responded. Further details of the methodology and results from the HCA practices are to be found in NatMedCa Technical Reports 2[5] and 3,[6] while those for A&Ms are outlined in Technical Report 5.[7]

ED study arm
Data were requested on all attendances at four EDs for one week in each quarter in 2001; namely, the (Monday to Sunday) weeks beginning 5 February, 7 May, 6 August and 5 November. Data items were confined to those that could be accessed from the electronic databases of each organisation.
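The sample weeks can be expressed as a simple date filter. The following is a minimal sketch, not the procedure actually used: the function name `sample_week` is hypothetical, and only the four start dates come from the text.

```python
from datetime import date, timedelta

# Week start dates as stated in the text (all in 2001).
WEEK_STARTS = [date(2001, 2, 5), date(2001, 5, 7), date(2001, 8, 6), date(2001, 11, 5)]


def sample_week(visit_date):
    """Return the 1-based sample week containing visit_date, or None."""
    for number, start in enumerate(WEEK_STARTS, start=1):
        # Each week runs Monday to Sunday, ie, the start date plus six days.
        if start <= visit_date <= start + timedelta(days=6):
            return number
    return None


# Check that every stated start date falls on a Monday (weekday() == 0).
all_mondays = all(start.weekday() == 0 for start in WEEK_STARTS)
```

A visit dated outside all four weeks returns None and would be excluded from the extract.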

ED1 was approached to represent a small city, ED2 a large metropolitan centre and ED4 a South Island city. The availability of a comparable data set from ED3, a larger South Island city, led to its inclusion. Data from each of these EDs should be viewed independently, and projection to the whole of New Zealand should be considered speculative.

The data dumps detailing the four weeks’ activity were received at The University of Auckland late in 2003. The EDs’ data capture systems were presumably designed to gather information for internal use rather than to serve as broad data repositories for external researchers, and the consequences of this became apparent in working with the data.

Table 1 shows the number of patients seen by each of the EDs in each of the four weeks of observation in 2001. All EDs saw their highest number of patients in the third week of observation, which was in late winter.

Table 1: Number of visits reported by each ED in each week


A specific list of variables was originally requested from each ED for the purposes of the survey, but not all of these variables were provided. The data dumps received from the four EDs were all formatted differently, often with different names for the same items, and data for some of the requested variables were either partially or completely missing. An ad hoc process was developed and performed on the four data sets to extract as many variables as possible that were fully comparable across them. The variables ultimately included in the analysis are listed and described in table 2.

Table 2: Variables comparable for all Emergency Departments



Results

Data cleaning
The data cleaning process is described in some detail below, and summarised in the flowchart (figure 1).

Prior to analysis, the data sets that were received from the four EDs had to be checked, restructured, coded, collated and documented in various ways. This was necessary to convert all raw data sets into a compatible form, and some variables had to be converted to codes for the useful presentation of results.
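Converting differently named variables into a compatible form might look like the sketch below. It is illustrative only: the source column names in `COLUMN_ALIASES` are assumptions, since the names each ED actually used are not given here, and only three core variables are shown.

```python
# Hypothetical source column names; the real names used by each ED are not given in the text.
COLUMN_ALIASES = {
    "dob": "date_of_birth",
    "birth_date": "date_of_birth",
    "triage": "triage_code",
    "triage_cat": "triage_code",
    "final_dx": "final_diagnosis",
    "diagnosis_final": "final_diagnosis",
}

# A subset of core variables, chosen for illustration.
CORE_VARIABLES = ["date_of_birth", "triage_code", "final_diagnosis"]


def harmonise(row):
    """Rename known aliases and return only core variables; absent ones become None."""
    renamed = {}
    for name, value in row.items():
        key = name.strip().lower()
        renamed[COLUMN_ALIASES.get(key, key)] = value
    return {var: renamed.get(var) for var in CORE_VARIABLES}
```

Keeping missing variables as explicit None values makes it visible which EDs did not supply a given item.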

Most of the ED data sets already contained unique IDs for patient visits. Where they did not, IDs were created, initially on the basis of a unique date of birth but later also on a unique time of visit, to account for twins coming in together (three instances of this were found). More specifically, the original situation with visit IDs was as follows.

  • ED1 - recorded only one row of data for each visit 
  • ED2 - sometimes recorded multiple rows of data for individual visits, most commonly when they resulted in the prescription of more than one drug 
  • ED3 - sometimes recorded multiple diagnoses and/or actions for individual visits, but the visit ID was always given 
  • ED4 - sometimes recorded multiple, identical rows of data for the same visit, but a visit ID field was provided, albeit not very clearly.
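The creation of visit IDs and the collapsing of repeated rows can be sketched as follows. This is an assumption-laden illustration rather than the process used: the keys `date_of_birth` and `arrival` are hypothetical field names, and two unrelated patients sharing both values would be merged incorrectly.

```python
def assign_visit_ids(rows):
    """Group rows into visits keyed on (date of birth, arrival time).

    Each row is a dict with the hypothetical keys 'date_of_birth' and
    'arrival'. Rows sharing both values are treated as one visit, so a visit
    spread over several rows (eg, one row per drug) receives a single ID.
    Exact duplicate rows are dropped.
    """
    ids = {}
    visits = {}
    for row in rows:
        key = (row["date_of_birth"], row["arrival"])
        if key not in ids:
            ids[key] = len(ids) + 1  # IDs are assigned in order of first appearance
        bucket = visits.setdefault(ids[key], [])
        if row not in bucket:
            bucket.append(row)
    return visits
```

Including the arrival time in the key is what separates patients with the same date of birth, such as twins, provided their recorded times differ.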

All of the ED data sets gave the date and time of each visit, but some had both in the same column, causing Excel to hide the time component, which then needed to be extracted.
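Separating a combined date and time value is straightforward once the layout is known. The sketch below assumes a few plausible layouts; the formats in `FORMATS` are guesses, as the actual layouts in the data dumps are not described.

```python
from datetime import datetime

# Candidate layouts are assumptions; the actual formats are not given in the text.
FORMATS = ["%d/%m/%Y %H:%M", "%d/%m/%Y %H:%M:%S", "%Y-%m-%d %H:%M:%S"]


def split_date_time(value):
    """Return (date, time) parsed from a combined string, or (None, None)."""
    for fmt in FORMATS:
        try:
            stamp = datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue  # try the next candidate layout
        return stamp.date(), stamp.time()
    return None, None
```

Values that match none of the layouts are returned as missing rather than guessed at.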

Figure 1: Flowchart of ED data checking and reconciliation process

 


Data classification
Ethnicity data were problematic, as is frequently the case in studies of this kind. There was no indication of the methods used for the coding of this variable by the EDs, and this could have affected comparability. Ethnicity was coded to the nine categories used in New Zealand’s 2001 Census and analysed using either those categories or broader groups based on them. ED3 did not list patients’ marital status or ACC status (Accident Compensation Corporation; a New Zealand government agency that covers costs incurred by injury), so these were not analysed for any of the EDs, but marital status was made comparable for the other EDs so it would be available if desired later.

Most of the data sets recorded four-digit domicile codes for the residential addresses of patients. However, in one data set codes had to be generated from the domicile descriptions. The domicile codes that were provided used the 1996 classification system; these were converted to 2001 codes so that NZDep2001 codes (an area measure of deprivation) could be derived. Some codes had been retired between 1996 and 2001 and where retired domicile codes were given, or there was no domicile information, the cases were left unclassified.
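The two-step lookup described above, from 1996 domicile code to 2001 code to NZDep2001 value, can be sketched as below. Every code and value in the two tables is invented for illustration; none is taken from the real classifications.

```python
# Illustrative values only: none of these codes or deprivation values are real.
DOMICILE_1996_TO_2001 = {"1234": "2234", "1250": "2250"}
NZDEP2001_BY_DOMICILE = {"2234": 7, "2250": 3}


def nzdep_for(domicile_1996):
    """Return the NZDep2001 value for a 1996 domicile code, or None if unclassified."""
    if not domicile_1996:
        return None  # no domicile information supplied
    code_2001 = DOMICILE_1996_TO_2001.get(domicile_1996.strip())
    if code_2001 is None:
        return None  # retired or unrecognised 1996 code
    return NZDEP2001_BY_DOMICILE.get(code_2001)
```

Returning None in both failure cases mirrors the decision to leave such cases unclassified.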

The ED data sets contained a triage code for each visit, based on urgency; the operational requirements for triage categories used by the EDs in managing their patients are shown in table 3. It was assumed that all of the EDs were using this system, but ED4 was found to have consistently higher (less urgent) average triage values than the other EDs.
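A comparison of average triage values across EDs could be computed along the following lines. This is a sketch on made-up input; the function name and the (ED, triage code) pair layout are assumptions, and it presumes numeric triage codes in which higher values are less urgent.

```python
from collections import defaultdict


def mean_triage_by_ed(visits):
    """Return {ed: mean triage code} from (ed, triage_code) pairs, skipping missing codes."""
    totals = defaultdict(lambda: [0, 0])  # ed -> [sum of codes, number of codes]
    for ed, triage in visits:
        if triage is None:
            continue
        totals[ed][0] += triage
        totals[ed][1] += 1
    return {ed: total / count for ed, (total, count) in totals.items()}
```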

Table 3: Definitions of the ED triage categories



A diagnosis for each visit was also coded from the data sets. In most of the ED data sets there were two variables describing diagnoses or similar, eg, provisional diagnosis and final diagnosis. The final diagnosis was taken as the "dominant" variable and was used for coding in the first instance; the provisional diagnosis was used for coding only if a final diagnosis was not given. The data set from ED1 also contained a "presenting problem" variable, but this was considered different from diagnosis and was not used for coding. ED3 was the only one that appeared to have employed a formal system for reporting diagnoses, with far fewer unique values than the other EDs (and fewer spurious entries), and with codes in the form of abbreviations of the text strings.
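The precedence rule for diagnoses amounts to a simple fallback. A minimal sketch, with a hypothetical function name, is:

```python
def dominant_diagnosis(final, provisional):
    """Return the final diagnosis if present, else the provisional one, else None."""
    for text in (final, provisional):
        # Treat None and whitespace-only strings as "not given".
        if text is not None and text.strip():
            return text.strip()
    return None
```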

Coding of diagnosis data was performed using the READ version 2 (READ2) classification system. A significant number of visits to EDs do not result in a clear pathological diagnosis, and READ2 makes provision for symptoms, investigations, administrative functions, intended actions and other types of entry. While this system was designed for use in general practice, and focuses on rather less serious ailments than might be presented to EDs, coding the data in this way meant that they could be compared with other NatMedCa data, for instance from GPs and A&Ms. The ED diagnoses were entered as free text in the data sets, and coding was performed using a mixture of electronic and manual processes.

The NextGeneration coding software,[8] developed by Dr Ashwin Patel, attempted to assign a READ2 code[9] to each reported diagnosis through keyword/phrase searches of its database. However, 60 percent of cases could not be so coded due to vagaries in the text, abbreviations, and text strings that included additional information (eg, "laceration ran into door").

The uncoded diagnosis strings were reviewed by the lead ED investigator (Dr Antony Raymont) and a text term more closely matching one within the coding tool’s database was suggested for each of the 2,990 uncoded items. This process enabled many more of the diagnoses to be coded electronically but there were still many values that remained uncoded. These were dealt with by interactive coding methods (finding matches or close matches through keyword searches within the database itself). These matches then had to be checked again by the lead investigator, who gave yet further suggestions for unsatisfactory matches. This process was repeated several times, until all of the reported diagnoses had a READ2 code attached, even if, as in 1,480 of the 15,655 cases (9.5 percent), the code was "not coded". These included patients who had left before being assessed, cases where an impenetrable abbreviation was used and cases of missing data.
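The general shape of this process, keyword matching with reviewer-suggested terms and a "not coded" fallback, can be sketched as below. It is not the NextGeneration software, whose internals are not described here: the keywords, codes and suggested terms are placeholders, and the codes are not real READ2 entries.

```python
# Placeholder phrases and codes; these are not real READ2 entries.
KEYWORD_CODES = {"laceration": "CODE_LAC", "fracture": "CODE_FRAC"}
# Hypothetical reviewer suggestions mapping an unmatched string to a matchable term.
SUGGESTED_TERMS = {"lac r hand": "laceration"}
NOT_CODED = "not coded"


def code_diagnosis(text):
    """Return a code by keyword search, applying any reviewer suggestion first."""
    if not text or not text.strip():
        return NOT_CODED  # missing data
    cleaned = text.strip().lower()
    cleaned = SUGGESTED_TERMS.get(cleaned, cleaned)
    for keyword, code in KEYWORD_CODES.items():
        if keyword in cleaned:
            return code
    return NOT_CODED


# Share of visits left as "not coded", using the counts reported in the text.
not_coded_percent = round(100 * 1480 / 15655, 1)
```

Each review round in effect adds entries to the suggestion table and reruns the matcher over whatever remains uncoded.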

Drugs were coded using similar software, under the Pharmacodes/ATC (Anatomical Therapeutic Chemical) system,[10] as were other therapeutic actions. No limit was placed on the number of actions coded per visit, but cases that received at least one code via the software were not examined further. ED2 separated actions into drugs and procedures, and reported pharmaceutical prescriptions in some detail. The other EDs combined both types into "actions" and reported far less drug detail. ED3 again had a more formal system of recording, but provided almost no detail on drugs at all. The variables in the data sets were again in free text form and many of them were too long to be fully coded electronically. Interactive coding outside the software was again required, but not as frequently as for diagnoses.

Among the non-drug actions there were again various troublesome abbreviations, and in a number of cases some of the detail was almost certainly missed. There were also text strings that read more like diagnoses (eg, "leg fracture") or implied more major surgery, which were not accounted for in the ATC system and had to be manually categorised, usually simply into "Actions" or "Investigations".

The coding that was successful directly through the electronic keyword matching took only minutes to complete; the "interactive" coding process described above was conducted via email over a period of months, but would have amounted to a good working week of searches both within the coding tool database and on the Internet. The need for this manual coding was found to be one of the most limiting factors for the purposes of analysis, and the implementation of some form of diagnosis and action coding in the EDs’ data capture systems would be of great utility in the future.

Information on the disposition of patients was present in all of the ED data sets, but the lack of detail given in some cases resulted in the comparable variable for analysis being reduced to just three categories: admitted, referred and discharged. The data for ED3 were later found to be incorrect, and the updated data subsequently received from them provided only the percentage of patients that had been admitted, which further limited the finer analysis to considering only the other three EDs.
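Reducing disposition to three comparable categories is a lookup from each ED's raw values. In the sketch below the raw strings are hypothetical, as the wording used by the EDs is not given; only the three target categories come from the text.

```python
# Hypothetical raw values; the EDs' actual disposition wording is not given in the text.
DISPOSITION_GROUPS = {
    "admitted to ward": "admitted",
    "admitted": "admitted",
    "referred to gp": "referred",
    "referred to outpatients": "referred",
    "discharged home": "discharged",
    "discharged": "discharged",
}


def collapse_disposition(raw):
    """Map a raw disposition string to admitted, referred or discharged, else None."""
    if not raw:
        return None
    return DISPOSITION_GROUPS.get(raw.strip().lower())
```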



Data outcomes
Reproduced below are two tables outlining key descriptive information drawn from the data derived from the ED systems. The demographic comparisons in table 4 are plausible, eg, a higher proportion of patients of European ethnicity in the two South Island EDs (3 and 4). Further checking against other data sources would be required, however, to ascertain the accuracy of the clinical data in table 5. The percentages in the tables do not necessarily sum to 100 due to rounding.

Table 4: Summary of demographic differences among ED users




Table 5: Summary of usage patterns of each ED




[a] Note: rounding errors of up to 0.7 may apply.
[b] Note: rounding errors of up to 1.6 may apply.
[c] Note: rounding errors of up to 0.1 may apply.


Discussion and Conclusion
The hospital ED is a vital part of the New Zealand health system. Not only does it provide the entry point to care for a significant proportion of inpatients, but it also performs a very important function in the broader system of ambulatory and first-contact (primary) care. What this study shows is that the important role – both clinical and political – that EDs play in the wider health system has in the recent past not been matched by the attention paid to the quality of their data capture. While "admission and discharge" data on public hospital inpatients follow international protocols and are deposited with the New Zealand Health Information Service, encounters in EDs are treated differently, with no formal requirements imposed on data collection.

Similar findings to ours have been reported in other studies. Amouh et al found computerised systems in EDs in Belgium to be lacking, with inadequate data models, clumsy user interfaces and poor integration with other clinical information systems. This was attributed in part to the complexity of combining information from a frequently fluid staff pool in the ED setting.[11] Downing and Wilson found similar limitations, and also issues with the comparability of data received from a range of different EDs in the UK.[12]

The difficulties encountered in extracting meaningful and comparable information from ED data capture systems raise issues about the quality of information available for management purposes both internally for ED operations and for the wider health system. The problem seems not to be the lack of data capture; all EDs were able to provide a data dump for the designated sample weeks, although not uniformly for all variables requested. Though there are issues of accuracy and completeness, the main shortcomings seem to be in ensuring conformity of such data to standard formats for ease of extraction, interpretation and comparison.

The Health Information Strategy for New Zealand[13] pushes for more consistency in the collection and coding of health care data. One current project looking to deal with these issues is the National Non-admitted Patient Collection (NNPAC).[14] NNPAC is a store of data from Emergency Departments and similar sources of care. Over its three-year duration, it aims to standardise the format, transmission and error correction methods of data sourced from different District Health Boards (DHBs). The data stored do not include information on diagnoses or actions, which were the most difficult areas to "salvage" as described above, but NNPAC is certainly a step towards alleviating many concerns.

For our purposes, given the low quality of electronically captured ED data in 2001, it was necessary to undertake considerable processing to get the data into a usable form. If the current situation persists, then analysts and researchers will continue to face this burden on time and effort. In addition to the advances of projects such as NNPAC, this experience argues for uniform requirements to be applied to ED data capture systems: the development of a minimum data set, common variable formats and standardised coding schemes.


References

  1. Ministry of Health. A portrait of health: key results of the New Zealand health survey. Public Health Intelligence Occasional Bulletin No 21. Wellington: Ministry of Health; 2004. http://www.moh.govt.nz/moh.nsf/0/3D15E13BFE803073CC256EEB0073CFE6/$File/aportraitofhealth1.pdf. Accessed 11 March 2007.
  2. Hider P, Helliwell P, Ardagh M, Kirk R. The epidemiology of emergency department attendances in Christchurch. N Z Med J. 2001;114(1129):157–9.
  3. Raymont A, von Randow M, Patrick D, Lay-Yee R, Davis P. The National Primary Medical Care Survey (NatMedCa): 2001/02: Report 8: a description of the activity of selected hospital emergency departments in New Zealand. Wellington: Ministry of Health; 2005. http://www.moh.govt.nz/publications/natmedca. Accessed 27 March 2007.
  4. NAMCS. Data file documentation. Atlanta, GA: National Centre for Health Statistics, Ambulatory Care Branch; 1994.
  5. Crampton P, Lay-Yee R, Davis P. The National Primary Medical Care Survey (NatMedCa): 2001/02: Report 2: primary health care in community-governed non-profits: the work of doctors and nurses. Wellington: Ministry of Health; 2004. http://www.moh.govt.nz/natmedca. Accessed 27 March 2007.
  6. Crengle S, Lay-Yee R, Davis P. The National Primary Medical Care Survey (NatMedCa): 2001/02: Report 3: Māori providers: primary health care delivered by doctors and nurses. Wellington: Ministry of Health; 2004. http://www.moh.govt.nz/natmedca. Accessed 27 March 2007.
  7. Hider P, Davis P, Lay-Yee R. The National Primary Medical Care Survey (NatMedCa): 2001/02: Report 5: a comparison of the work of doctors in accident and medical clinics and in general practice. Occasional Paper. Ministry of Health; http://www.moh.govt.nz/natmedca. Accessed 27 March 2007.
  8. My Practice Limited, Next Generation .Net Limited. http://www.mypractice.co.nz/. Accessed 11 March 2007.
  9. Clinical terms (the Read Codes) version 3 reference manual. http://www.connectingforhealth.nhs.uk/terminology/readcodes/publications. Accessed 11 March 2007.
  10. WHO Collaborating Centre for Drug Statistics Methodology. http://www.whocc.no/atcddd/. Accessed 11 March 2007.
  11. Amouh T, Gemo M, Macq B, Vanderdonckt J, El Gariani AW, Reynaert MS, Stamatakis L, Thys F. Versatile clinical information system design for emergency departments. IEEE Trans Inf Technol Biomed. 2005; 9(2):174–83.
  12. Downing A, Wilson R. Regional surveillance of accident and emergency department attendances: experiences from the West Midlands. J Public Health (Oxf). 2005; 27(1):82–4.
  13. Health Information Strategy for New Zealand. 2005. Ministry of Health. http://www.nzhis.govt.nz/publications/strategy.html. Accessed 22 March 2007.
  14. National Non-admitted Patient Collection (NNPAC). 2006. Ministry of Health. http://www.nzhis.govt.nz/documentation/nnpac/index.html. Accessed 22 March 2007.

