- Abstract
- Introduction
- Background
- A proposal for patient management system enhancement
- Conclusion
- References
- Footnotes
Abstract
The implementation of New Zealand’s Primary Health Care Strategy and Health Information Strategy will result in an increased requirement to consistently apply accurate standardised coding to primary care data, resulting in a significant increase in workload within primary care organisations. This paper argues the benefits of using an IT system to automatically assign codes to medical records in general practices based on the free text entered by the practitioner. The paper reviews the existing computer assisted coding technology and the implications of the approaches being used. The paper then outlines the design of an automatic coding module to support a practice management system. 
Introduction
The general practice medical record is the richest data store of routinely collected personal health information. The move to team-based, primary health care (PHC) with a focus on chronic disease management and the need to demonstrate improved health outcomes for the investment in PHC has resulted in an increased need to share this rich data store and to report on its contents. [1-2] Whilst several problems exist with sharing and reporting on the general practice data store, two main problems concern the quality of the data and the need to be able to communicate and process the data efficiently.
The classification and subsequent computer coding of medical records can increase the efficiency of electronic communication and processing of health information.[3] The use of standardised coding systems would improve the quality of data stored in practice management systems and thus its ability to be shared for patient treatment, planning and funding purposes,[1] thus benefiting both patients and the health care system.[4]
Coding is currently used in general practice in New Zealand. However, it is neither standardised nor used consistently. In a study of 938 general practices in New Zealand in early 2003, 99 percent of practices used a practice management system (PMS) but consultation diagnoses were coded in only 64.7 percent of practices. Of these practices, 24.6 percent indicated that they only “sometimes” or “occasionally” used codes after entering the consultation details or only coded for ACC purposes. In addition, whilst 94.6 percent of these practices recorded screening information or kept disease registers on their PMS, what they recorded varied widely, ranging from cervical screening (97.8 percent) to prostatic specific antigen (PSA; 2.7 percent).[5]
Even if the electronic medical record (EMR)[a] contains coding, the data may be only partially coded,[6, 7] or the code assigned to an event may be inaccurate. Inaccurate coding can occur due to selection of the wrong code,[8] because the range of potential codes may be wide and not standardised,[9] or the correct code may not actually exist.[6] Faulconer and de Lusignan in 2004 reported that the sensitivity of the use[b] of the diagnostic code for COPD was 79% while positive predictive value was 75.3%, and that patients with COPD had to be identified from additional searches.[8] Likewise, Gray et al reported in 2003 that of 3803 patients with a pre-existing ischaemic heart disease (IHD) Read code, 15 percent were found to have no evidence of IHD.[10]
This has significant implications for the accuracy and quality of the subsequent analyses, reporting and health planning based on such inaccurate or incomplete coding. The question thus arises of how to increase the accuracy and quality of the coding applied to medical records.
In general practice, the general practitioner (GP) usually assigns codes to the EMR as part of their recording of the consultation. Enabling GPs to code more accurately and consistently by training, development and incentivisation has been advocated.[6] Repeated assessments, feedback and training have been shown to improve the quality of coded clinical data in general practice but the authors note that this method has cost and resource implications.[7]
The requirement to manually code medical records is an additional and unattractive workload for GPs,[4, 11] reducing the time available for direct patient care,[12] especially if it requires documentation and the coding of data extraneous to the clinical encounter[12] but required, eg, for financial, billing or performance management purposes. Newrick et al reported a range of 53–227 hours/month spent by general practices recording data into computer systems or manual records and, by extrapolation, suggested that about 1,230,000 hours would be spent every month collecting data in general practice in the UK.[11] Likewise, Letrilliart et al reported that GPs spend an average of two minutes per patient visit manually coding “reason for referral” using the International Classification of Primary Care (ICPC), and calculated that the systematic coding of the problems managed in all general practice encounters in France may account for the equivalent of about $US 640 million each year.[4]
The introduction of performance management indicators for general practice based on precisely defined, reliable and valid information covering a widening range of clinical conditions[13] has resulted in an increased requirement to consistently apply accurate standardised coding to general practice data. This has significant workload implications for practitioners working in general practice, an important consideration in view of the developing workforce shortage in general practice,[14] which will further increase the workload of those who remain.
This paper explores the possibility of using computer systems to automate the coding process of medical data in general practices in order to both improve the quality of coding and reduce the administrative burden for GPs. The paper first reviews previous work on computer assisted coding (CAC) and then proposes an approach that is suitable to be used in general practice and in the wider PHC arena.
Computer assisted code entry
Clinical coding by computers using standardised coding systems has been proposed to improve accuracy[15] with no additional workload for GPs[4] and is increasingly being used in EMRs.[16]
Many EMRs offer a basic level of CAC, in the form of semi-automatic coding systems operating at the time of data entry (eg, “popup” or “drop-down” menus of standard clinical terms).[4, 16] These facilities are now being extended to CAC using structured input, whereby the selection of successive menu items results in narrative text being produced and entered into the EMR,[16] such as that used in MedTech 32 Advanced Forms. However, these semi-automatic coding systems place additional constraints on the GP to use some kind of structured language or to enter data in a pre-defined manner, thus affecting the way in which they enter clinical data in the EMR.
CAC from free text
A major problem with coding systems has been their inability to cope with the level of detail required by clinical practice.[3, 17] Coding systems tend to restrict the freedom of data entry and the level of detail coded in order to facilitate subsequent analysis, whilst users (ie, health professionals) wanting “expressivity without any concern for subsequent analysis may use natural language”[17] in the form of free text entered into the EMR.
The ability to automatically code free text would increase the completeness of any analysis of medical records. Currently only limited information in health records is in a structured format,[3] and important clinical data may be recorded in free text, where it is invisible to traditional standardised coding systems and thus also “invisible” to any analysis.[3, 18] This ability would also allow GPs to continue to enter data in a flexible manner that best suits their individual consultation recording style and the recording requirements of the clinical encounter, rather than restricting them by imposing the particular needs of the coding system and any subsequent data analyses.
It is this use of CAC from free text entered in the medical record that offers the means to increase coding without placing any additional constraint on practitioners[4] and achieve Lewis’s “ideal of a (coding) system that combines the advantages of structured records with the richness of free text”.[3]
The use of CAC of medical records has been an area of active research for some time, particularly in the US. The report “Delving into Computer-assisted Coding” by the American Health Information Management Association (AHIMA) outlines the history of the CAC field,[16] and the report “Automated Coding Software: Development and Use to Enhance Anti-Fraud Activities” by FORE (AHIMA’s Foundation for Research and Education) surveys and compares 13 CAC vendors / service providers.[19] There are many publications describing case studies of CAC technology use and comparing CAC performance with the performance of human coders (see [20] and the references therein).
However, CAC from free text appears to have been applied to, or to be under development for, restricted medical domains such as outpatient reporting[16] and hospital referral letters,[4] but does not appear to have been applied to general practice consultations, especially within New Zealand, except for specific research projects.[21, 22]
CAC technology
The underlying automatic reasoning engines of current CAC software rely on combining soft computing (such as probabilistic reasoning using Bayesian networks or neural networks) and rule-based reasoning. Soft computing approaches require the use of learning datasets: sets of medical records with “known good” codes assigned. The advantage of the soft computing approach is that one does not need to explicitly provide rules for all relevant situations or even to determine which situations are relevant. On the other hand, one potential advantage of using rule-based reasoning only is the ability of the reasoning engine to justify its decisions as consequences of applying some specific rules, triggered, eg, if certain keywords are present in the context of a particular linguistic structure. This could be useful in audits (when discrepancies need to be justified) and, more broadly, to enhance the confidence in code assignments by a reasoning engine. With a soft computing approach, by contrast, such justifications are more difficult to provide: the impact of the learning dataset is normally of a statistical nature, so a human-readable justification is hard to give even in terms of a viewable collection of precedents. To the best of our knowledge, not one of the existing systems is capable of justifying the code assignments it makes. It is in this respect that human coders are superior, as they can always be asked to justify their decisions. According to the FORE report,[19] all of the current CAC software / service providers recommend that automatically assigned codes should be verified by human coders, although in practice this may not always occur.
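To illustrate the kind of justification a purely rule-based engine could return, the following Python sketch assigns a code when a keyword is found outside a negated context and reports which rule fired. It is a minimal sketch only: the rule names, the code labels (eg, RESP.COPD) and the crude negation cue are hypothetical and are not taken from any existing CAC product or coding system.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    name: str     # human-readable rule identifier, quoted in the justification
    pattern: str  # keyword pattern (regular expression) that triggers the rule
    code: str     # hypothetical code assigned when the rule fires


# Hypothetical rules and code labels, for illustration only.
RULES = [
    Rule("copd-keyword", r"\b(copd|chronic obstructive)\b", "RESP.COPD"),
    Rule("asthma-keyword", r"\basthma\b", "RESP.ASTHMA"),
]

# Crude negation cue: "no", "denies" or "without" within 30 characters
# before the keyword and with no full stop in between.
NEGATION = re.compile(r"\b(no|denies|without)\b[^.]{0,30}$")


def assign_code(note: str) -> Optional[tuple[str, str]]:
    """Return (code, justification) for the first rule that fires, else None."""
    text = note.lower()
    for rule in RULES:
        match = re.search(rule.pattern, text)
        if match is None:
            continue
        # Skip a keyword that appears in a negated context.
        if NEGATION.search(text[:match.start()]):
            continue
        justification = f"rule '{rule.name}' matched '{match.group(0)}'"
        return rule.code, justification
    return None
```

Because every assignment carries the name of the rule that produced it, an auditor could trace a disputed code back to a specific, inspectable rule, which is the property that statistical approaches find hard to match.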
Another possible benefit of using decision engines that rely on learning datasets (soft computing approaches) is that potentially they may be applied to capture and to re-use tacit knowledge. In particular, if a given organisation (relying on manual coding) is known to have established good practice in terms of how codes are assigned to EMRs, they could then use their data to train a CAC reasoning engine. Then, this engine could be used to assist human coders in other organisations, thus guiding the new organisations to use the good practice established by the learning dataset originator. For this approach to be feasible, the reasoning engine should be able to accurately capture the coding practices. This does, in fact, appear to be feasible at the current level of sophistication of information technology. However, we note that it is easier to judge if a reasoning engine captures the existing practice well (something quite specific) than to judge if it is accurate in general (and thus approaches the difficult to define "gold standard").
In principle, the GP filling in the record is in the best position to assign a relevant code, as they are in possession of the information regarding the patient, including any information that is not captured in the record. However, there are two factors that make it difficult to rely on practitioners: (1) the focus of the practitioner is on caring for the patient; assigning codes to a medical record is a low priority task; and (2) GPs’ time is very expensive. Hence, in some countries (eg, the US) it is a common practice to outsource the coding task to professional coders, who assign codes based on the text of the EMR only, as entered by the GP. However, we are not aware of any studies that compare the quality of codes assigned by GPs with the quality of codes assigned by professional coders or by CAC software. One would expect that the quality of codes assigned by physicians might vary greatly depending on the organisational culture determining the attitude of practitioners to coding.
Current vendors offer solutions ranging from deploying and training software "on site" (such as A-Life) to providing a web service allowing encoding of one record at a time, with the text of the record sent over the Internet using a secure protocol (eg, Kiwi-Tek). The physical architecture and the associated data ownership and data security issues have implications for any potential improvements in the quality of coding. In principle, the wider the context, the better the code quality that can be attained. In particular, one would expect the best results if the entire medical history of a particular patient, including socio-economic demographics, plus any regional issues such as local rates of disease, etc, are taken into account. If such a wide context were made available to CAC systems, then one could expect CAC systems to eventually surpass human coders in terms of the trustworthiness of the codes they assign. However, if codes are to be assigned via an Internet service, in order to take the wider contextual data into account, the CAC provider would need to have access to such data, allowing them to effectively duplicate the whole of the health service provider database. This would imply an unrealistically high degree of trust and raise significant privacy and confidentiality issues. On the other hand, if CAC software were deployed on-site, and operated entirely independently, it could not take advantage of relevant contextual information available only off-site (eg, the onset of an epidemic). Perhaps the most promising architecture would consist of largely autonomous CAC installations, each matching the administrative scope that defines data ownership and the need for data security, which would communicate with the CAC provider or a similar entity to obtain the relevant current off-site contextual information.
While EMR coding is an important practice, there are limits to its usefulness. Coding is by definition restricted to a pre-determined schema. Hence, in a sense, decisions made based on coded data would rely on the state of the medical field as it was when the schema was created. If a new major health phenomenon were to occur, the evidence provided by existing EMR codes would be incomplete. Such a phenomenon could be a pandemic involving a new pathogen (eg, "bird flu"). To be able to deal with emergent health phenomena in an informed way, one should be able to mine medical records in real time, rather than rely on codes as intermediaries, although this raises significant privacy and data ownership issues.
A proposal for patient management system enhancement
From the previous discussion it would appear that there would be significant benefits for general practices if their PMS incorporated a CAC component, which could automatically generate codes for encounters derived from the free text consultation notes entered by GPs at the time. This section describes a proposal for the architecture of such a system. To improve the coding process, a PMS incorporating an EMR should be able to:
- Assess the quality of coding by an individual GP or by an organisation or a division in an organisation.
- Generate codes automatically when codes are not assigned by the practitioner.
- Assist the GP during data entry by suggesting codes based on the free text information they are entering.
For each layer, a topic tracking algorithm based on the algorithm described in Franz et al’s “Unsupervised and Supervised Clustering for Topic Tracking”[24] is applied to place each record entered by a given GP into one of the clusters. An initial training set can be generated using records (with codes assigned) entered by GPs who are judged to be compliant with a given set of coding regulations by assigning correct, informative codes to the records they create. To distinguish between positive and negative assertions (eg, regarding the presence or absence of a certain condition), a limited semantic measure to quantify similarity between records[25] can be employed. The topic tracking algorithm makes the best effort to assign each record to a cluster, based on calculating how similar this record is to other records (including training set records) falling under the corresponding code or one of its subcodes. All records are assigned to clusters, and the value corresponding to the confidence level of each assignment is calculated. Once records are assigned at all levels, each record ends up being assigned to a number of codes at different levels of detail. Then, out of this set, the code corresponding to the lowest level of granularity for which the confidence level is better than an acceptable confidence level chosen a priori (and thus the likelihood of an incorrect assignment is smaller than a value deemed to be acceptable) is associated with the record as its "soft code". A record that cannot be assigned to any of the codes with sufficient confidence remains unassigned. Thus, this approach to clustering medical records can be described as restrictive clustering, as introduced in Siersdorfer and Sizov’s “Restrictive Clustering and Metaclustering for Self-Organizing Document Collections”.[26] Assignment of codes is restricted to cases when it can be achieved with a high degree of certainty.
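The final selection step described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the per-level assignments and their confidence values are taken as given (the topic tracking step that produces them is not shown), "lowest level of granularity" is read as the most detailed level of the code hierarchy, and the names LevelAssignment and select_soft_code as well as the code labels are hypothetical.

```python
from typing import NamedTuple, Optional


class LevelAssignment(NamedTuple):
    level: int         # depth in the code hierarchy; larger means more detailed
    code: str          # hypothetical code chosen for the record at this level
    confidence: float  # confidence of the cluster assignment, in [0, 1]


def select_soft_code(assignments: list[LevelAssignment],
                     min_confidence: float) -> Optional[str]:
    """Pick the most detailed code whose confidence exceeds the threshold.

    Returns None when no level is confident enough, so the record stays
    unassigned (the "restrictive" part of restrictive clustering).
    """
    acceptable = [a for a in assignments if a.confidence > min_confidence]
    if not acceptable:
        return None
    return max(acceptable, key=lambda a: a.level).code


# Example: the level-3 assignment is too uncertain, so level 2 is used.
example = [
    LevelAssignment(1, "RESP", 0.95),
    LevelAssignment(2, "RESP.COPD", 0.85),
    LevelAssignment(3, "RESP.COPD.ACUTE", 0.40),
]
soft_code = select_soft_code(example, min_confidence=0.8)
```

Raising the threshold trades coverage for certainty: fewer records receive a detailed soft code, and more fall back to a coarser code or remain unassigned.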
Note that a soft code reflects the current state of the system (the body of records it contains and, in particular, the training records it employs) and can be adjusted any time.
The soft code is used to fulfil the three requirements of a PMS incorporating an EMR outlined above. The difference between the soft code and the "hard code" entered by GPs (if one is entered at all) is taken as an indicative measure of the coding quality. While the actual quality of coding for a given record can only be judged by a suitably qualified professional, we expect that significant discrepancies between soft codes and hard codes for a large number of records could indicate (but not necessarily imply) that there is a problem with the approach to coding taken by a given GP, organisation or division in an organisation. This fulfils the first requirement.
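As a rough illustration of this indicative measure, the Python sketch below computes, for each GP, the share of records in which the hard code differs from the soft code. The function name, the record layout (GP identifier, hard code, soft code) and the code labels are hypothetical, and exact equality of codes is a simplifying assumption; a real measure might give partial credit when two codes share a parent in the hierarchy.

```python
from collections import defaultdict
from typing import Iterable, Optional


def discrepancy_rates(
    records: Iterable[tuple[str, Optional[str], Optional[str]]]
) -> dict[str, float]:
    """Return, per GP, the share of comparable records whose codes differ."""
    compared = defaultdict(int)
    differing = defaultdict(int)
    for gp, hard_code, soft_code in records:
        # Only records carrying both a hard and a soft code can be compared.
        if hard_code is None or soft_code is None:
            continue
        compared[gp] += 1
        if hard_code != soft_code:
            differing[gp] += 1
    return {gp: differing[gp] / compared[gp] for gp in compared}


rates = discrepancy_rates([
    ("gp_a", "X1", "X1"),
    ("gp_a", "X2", "X3"),
    ("gp_a", None, "X1"),   # no hard code entered: not comparable
    ("gp_b", "X1", "X1"),
])
```

A high rate for one GP or organisation would only flag records for review by a qualified professional; it would not by itself establish that the hard codes are wrong.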
When codes are not assigned by the GP, soft codes can be used as surrogates of hard codes in situations where a code is required (eg, for reporting purposes). This fulfils the second requirement.
Finally, the PMS may assign a soft code to a record that has just been entered and prompt the GP about whether they would like to accept the resulting soft code as a hard code (a decision for which, ultimately, the practitioner will be responsible). This fulfils the third requirement. This approach to code entry is very amenable to implementation with voice recognition interfaces, in which coded records would be entered by voice only, without the GP having to physically manipulate a computer. Thus, it could be easily integrated as part of a voice-enabled PMS, such as the one described in Teel et al’s “Voice-Enabled Structured Medical Reporting”.[27]
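The second and third requirements could be handled along the following lines. This Python sketch is illustrative only: the Record class and the two function names are hypothetical, and the GP's response to the prompt is passed in as a simple boolean rather than collected through a real user interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    text: str
    hard_code: Optional[str] = None  # code confirmed by the practitioner
    soft_code: Optional[str] = None  # code proposed by the CAC module


def reporting_code(record: Record) -> Optional[str]:
    """Prefer the practitioner's hard code; fall back to the soft code."""
    if record.hard_code is not None:
        return record.hard_code
    return record.soft_code


def confirm_suggestion(record: Record, accepted: bool) -> Record:
    """Promote the soft code to a hard code only if the GP accepts it."""
    if accepted and record.hard_code is None and record.soft_code is not None:
        record.hard_code = record.soft_code
    return record


note = Record(text="Worsening cough, known COPD", soft_code="RESP.COPD")
code_for_report = reporting_code(note)          # soft code used as surrogate
note = confirm_suggestion(note, accepted=True)  # GP accepts the suggestion
```

Keeping the two fields separate preserves the distinction between what the system proposed and what the practitioner took responsibility for.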
As this approach has not been used before in primary care and we are currently at the design stage, the approach outlined above may need to undergo changes before it is implemented. While the approach to topic tracking we are suggesting broadly follows the seminal paper by Koller and Sahami on classifying documents hierarchically,[28] we hope that re-classifying records level by level will reduce the problem of record misclassification near the point of origin. It is difficult to compare our approach with those used by existing CAC applications because, owing to the proprietary nature of many of them, performance data is more readily available than design details. We count on benefiting from recent advances in web mining and document clustering, two very rapidly developing research areas related to CAC. As we improve our understanding of the peculiarities of EMR data patterns in New Zealand practice and the requirements for various quality aspects of code assignment, the details of the algorithm are likely to be adjusted. However, we expect that the overall approach will remain stable.

Conclusion
Using CAC from free text would increase the completeness of analysis of EMR data, allow GPs to continue to enter data in a flexible manner appropriate to their individual consultation recording style and the recording needs of the clinical encounter, and offer the means to increase the quality and completeness of the coding of clinical encounters in general practice without placing any additional workload on the practitioner. This paper reviews the current uses of CAC technology and examines the different approaches to the implementation of CAC technology. The paper presents the outline of the design of an automatic CAC module, based upon recent advances in document clustering techniques developed in text and web mining. This CAC module could enhance an existing PMS by providing the functionality to assess the quality of coding by an individual practitioner (or by an organisation or a division in an organisation), generate codes automatically when codes are not assigned by the practitioner and, finally, assist GPs during data entry by suggesting codes based on the free text information they have entered. Theoretically, this would enable the increased coding requirements implicit within New Zealand’s Primary Health Care Strategy and Health Information Strategy to be met without increasing GPs’ workloads. The challenge now is to see if it can be done in practice.
References
- Health Information Strategy Steering Committee. Health information strategy of New Zealand. Wellington: Ministry of Health; 2005.
- Ministry of Health. The primary health care strategy. Wellington: Ministry of Health; 2001.
- Lewis A. Health informatics: information and communication. Adv Psychiatr Treat 2002;8:165–171.
- Letrilliart L, Viboud C, Boëlle P, Flahault A. Automatic coding of reasons for hospital referral from general medicine free-text reports. Proceedings of the AMIA Symposium 2000;(20 Suppl):487–91.
- Didham R, Harrison K, Wood R, Hall J, Martin I. Information technology systems in general practice. Dunedin School of Medicine: RNZCGP Research Unit; 2003.
- Sanderson H, Adams T, Budden M, Hoare C. Lessons from the central Hampshire electronic health record pilot project: evaluation of the electronic health record for supporting patient care and secondary analysis. Br Med J 2004;328:875–878.
- Porcheret M, Hughes R, Evans D, Jordan K, Whitehurst T, Ogden H et al. Data quality of general practice electronic health records: The impact of a program of assessments, feedback, and training. J Am Med Inform Assoc 2004;11:78–86.
- Faulconer E, de Lusignan S. An eight-step method for assessing diagnostic data quality in practice: chronic obstructive pulmonary disease as an exemplar. Inform Prim Care 2004;12(4):243–54.
- Gray J, Orr D, Majeed A. Use of Read codes in diabetes management in a south London primary care group: implications for establishing disease registers. Br Med J 2003;326:1130–1134.
- Gray J, Ekins M, Scammell A, Carroll K, Majeed A. Workload implications of identifying patients with ischaemic heart disease in primary care: population-based study. J Public Health 2003;25:223–227.
- Newrick D, Spencer J, Jones K. Collecting data in general practice: need for standardisation. Br Med J 1996;12:33–34.
- Brett AS. New guidelines for coding physicians’ services - a step backward. N Engl J Med 1998;339(23):1705–1708.
- McColl A, Roderick P, Gabbay J, Smith H, Moore M. Performance indicators for primary care groups: an evidence based approach. Br Med J 1998;317:1354–1360.
- RNZCGP. 2005 RNZCGP membership survey: part 1 general practitioner demographics, working arrangements and hours worked. Wellington: RNZCGP; 2005.
- Norris AC. Current trends and challenges in health informatics. Health Inform J 2002;8:205–213.
- AHIMA e-HIM Work Group on Computer-Assisted Coding. Delving into computer-assisted coding (AHIMA Practice Brief). J AHIMA 2004;75(10):48A-H.
- Schulz E, Price C, Brown P. Symbolic anatomic knowledge representation in the Read codes Version 3: structure and application. J Am Med Inform Assoc 1997;4(1):38–48.
- Anandarajah S, Tai T, de Lusignan S, Stevens P, O’Donoghue D, Walker M et al. The validity of searching routinely collected general practice computer data to identify patients with chronic kidney disease (CKD): a manual review of 500 medical records. Nephrol Dial Transplant 2005;20(10):2089–2096.
- FORE. Automated coding software: development and use to enhance anti-fraud activities. Chicago, Illinois: American Health Information Management Association; 2005.
- Hripcsak G, Knirsch C, Zhou L, Wilcox A, Melton G. Using discordance to improve classification in narrative clinical databases: An application to community-acquired pneumonia. Comput Biol Med 2006; in press.
- Ministry of Health. Family doctors: methodology and description of the activity of private GPs. The National Primary Medical Care Survey (NatMedCa): 2001/02. Report 1. Wellington: Ministry of Health; 2004.
- Using routinely collected primary care data in research: pitfalls and potential. Proceedings of the primary care research forum of the Royal New Zealand College of General Practitioners Conference July 2004, Wellington, New Zealand. RNZCGP.
- Stearns MQ, Wang AY, Price C, Spackman KA. SNOMED clinical terms: overview of the development process and project status. Proceedings of the American Medical Informatics Association Symposium Fall 2001:662–6.
- Franz M, McCarley J, Ward T, Zhu W-J. Unsupervised and supervised clustering for topic tracking. SIGIR’01 2001:310–317.
- Nallapati R. Semantic language models for topic detection and tracking. HLT-NAACL’03 2003:1–6.
- Siersdorfer S, Sizov S. Restrictive clustering and metaclustering for self-organizing document collections. SIGIR’04 2004: 226–233.
- Teel M-M, Sokolowski R, Rosenthal D, Belge M. Voice-enabled structured medical reporting. CHI’98 1998: 595–602.
- Koller D, Sahami M. Hierarchically classifying documents using very few words. Proceedings of ICML-97, 14th International Conference on Machine Learning 1997: 170-178.

Footnotes
a The term “medical record” used in this paper means an electronic medical record unless otherwise specified. The issues with manual coding apply equally to paper-based medical records and electronic medical records. Computer-assisted coding obviously requires the paper-based record to have been converted into an electronic format first.
b The sensitivity of the use of a code is the proportion of people with the disease (here, COPD) who have the corresponding code recorded. The higher the sensitivity, the lower the false negative rate. Sensitivity = true positives / (true positives + false negatives).
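As a worked illustration of the two measures quoted in the Introduction, the Python sketch below computes sensitivity from the formula above and positive predictive value from its standard definition, true positives / (true positives + false positives). The counts are hypothetical and are not taken from the cited studies.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of people with the disease who carry the code."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Proportion of people carrying the code who really have the disease."""
    return true_pos / (true_pos + false_pos)


# Hypothetical practice: 100 patients truly have the disease, 80 of them are
# coded; a further 20 patients carry the code without having the disease.
sens = sensitivity(true_pos=80, false_neg=20)
ppv = positive_predictive_value(true_pos=80, false_pos=20)
```

In this hypothetical practice both measures equal 0.8: one in five patients with the disease would be missed by a code search, and one in five patients returned by the search would not have the disease.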




















