GESDOR - A Generic Execution Model for Sharing of Computer-Interpretable Clinical Practice Guidelines

 
Abstract
We developed the Guideline Execution by Semantic Decomposition of Representation (GESDOR) model to share guidelines encoded in different formats at the execution level. For this purpose, we extracted a set of generalized guideline execution tasks from the existing guideline representation models. We then created the mappings between specific guideline representation models and the set of the common guideline execution tasks. Finally, we developed a generic task-scheduling model to harmonize the existing approaches to guideline task scheduling. The evaluation has shown that the GESDOR model can be used for the effective execution of guidelines encoded in different formats, and thus realizes guideline sharing at the execution level.
 
Brief Description
Sharing of computer-interpretable clinical practice guidelines (CPGs) is a critical requirement for guideline development, dissemination and implementation [Shortliffe EH et al., 1998, A study of collaboration among medical informatics research laboratories]. In addition to conferring cost efficiency in guideline development, guideline sharing leads to improved acceptance of guideline implementation systems, and thus promotes the use of guidelines.

In this study, we propose an alternative approach to guideline sharing at the execution level: the Guideline Execution by Semantic Decomposition of Representation (GESDOR) model. This approach is based on the observation that the different guideline representation models contain similar execution tasks, which are used to support the implementation of CPGs. In the GESDOR model, guidelines can be encoded in different formats. A set of generalized guideline execution tasks is extracted from the existing guideline representation models and is then used to drive the execution of specific guidelines encoded in those formats. The relationship among the guideline instances, the guideline representation models in which the guideline instances are encoded, and the generalized guideline execution tasks is shown in Figure 1.

Figure 1. The relationship among the guideline instances, the guideline models, and the generalized guideline execution tasks in GESDOR. The guideline instances are encoded in specific representation models, while these models are mapped to the generalized guideline execution tasks. The guideline tasks are then used to drive the execution of the guideline instances encoded in different formats.
The GESDOR guideline execution model comprises
  1. a set of guideline representation models, which defines the domain to which the GESDOR guideline execution model can be applied,
  2. a set of generalized guideline execution tasks that are extracted from the existing guideline representation models,
  3. a set of mapping relationships, each of which corresponds to a specific guideline representation model defined in (1) and provides the semantic links from the elements of that model to the guideline tasks defined in (2), and
  4. a generic task-scheduling model, which harmonizes the existing approaches to task scheduling during guideline execution.
The GESDOR model is built on the approach to guideline execution used by GLEE, the execution engine for guidelines encoded in the GLIF3 format. In contrast to GLEE, the GESDOR model uses generalized guideline execution tasks to drive the execution of guidelines. Specifically, guidelines encoded in different formats are stored in a guideline repository, from which they can be retrieved and translated into instances of the guideline tasks. This translation process is directed by the mapping relationship between the generalized guideline execution tasks and the model in which a guideline is encoded. Once the translation has been completed, the GESDOR guideline execution engine uses the guideline task instances, along with the generic task-scheduling model that harmonizes the existing approaches to task scheduling, to drive the execution of the guideline. The overall system architecture of the GESDOR model is shown in Figure 2.


Figure 2. The overall system architecture of the GESDOR guideline execution model.
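
The translation step described above can be illustrated with a minimal Python sketch. It is not the GESDOR implementation. The task types, the element class names in the mapping table, the dictionary layout of an encoded guideline, and the simple breadth-first scheduler are all assumptions made for illustration. The sketch only shows the idea that a model-level mapping directs the translation of guideline elements into generalized task instances, which a single scheduler can then execute regardless of the source format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TaskInstance:
    """An instance of a generalized execution task (hypothetical shape)."""
    task_type: str                      # illustrative types: "action", "decision"
    name: str
    next_steps: List[str] = field(default_factory=list)


# Hypothetical model-level mappings: for each representation model,
# element class -> generalized task type. The class names are illustrative.
MAPPINGS: Dict[str, Dict[str, str]] = {
    "GLIF3": {"Action_Step": "action", "Decision_Step": "decision"},
    "PROforma*": {"Action": "action", "Decision": "decision"},
}


def translate(model: str, elements: List[dict]) -> Dict[str, TaskInstance]:
    """Translate encoded guideline elements into task instances,
    directed by the mapping for the given representation model."""
    mapping = MAPPINGS[model]
    return {
        el["name"]: TaskInstance(mapping[el["class"]], el["name"],
                                 list(el.get("next", [])))
        for el in elements
    }


def execute(tasks: Dict[str, TaskInstance], start: str) -> List[Tuple[str, str]]:
    """Toy scheduler: visit tasks breadth-first from the start task and
    return the order in which they were activated."""
    trace, pending, seen = [], [start], set()
    while pending:
        name = pending.pop(0)
        if name in seen:
            continue
        seen.add(name)
        task = tasks[name]
        trace.append((task.task_type, task.name))
        pending.extend(task.next_steps)
    return trace


# The same two-step toy guideline, encoded once per (hypothetical) model.
glif_encoding = [
    {"class": "Decision_Step", "name": "check_age", "next": ["give_dose"]},
    {"class": "Action_Step", "name": "give_dose"},
]
proforma_encoding = [
    {"class": "Decision", "name": "check_age", "next": ["give_dose"]},
    {"class": "Action", "name": "give_dose"},
]

glif_trace = execute(translate("GLIF3", glif_encoding), "check_age")
proforma_trace = execute(translate("PROforma*", proforma_encoding), "check_age")
```

Because both encodings are reduced to the same task instances before scheduling, the two traces in this toy example are identical, which is the property the evaluation below examines on real guidelines.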
When the four different approaches were used to execute the DTP immunization guideline, consistent final recommendations were generated in 1978 of the 2007 cases (98.56%). In the remaining 29 cases (1.44%), the recommendations generated by GESDOR GLIF3, GESDOR PROforma*, and GLEE were inconsistent with those generated by the EzVac system. The kappa value of 0.98 indicated a high level of agreement among the results.
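
The percentages above follow directly from the case counts, as the short Python check below shows. The kappa value of 0.98 cannot be reproduced from this page, because it depends on the per-case recommendations of each system. The `cohen_kappa` function is therefore only a generic two-rater sketch, and the toy label lists are hypothetical and not taken from the study.

```python
from collections import Counter
from typing import Sequence


def percent(part: int, total: int) -> float:
    """Percentage rounded to two decimals."""
    return round(100.0 * part / total, 2)


def cohen_kappa(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Cohen's kappa for two raters: (observed - expected) / (1 - expected).
    Undefined (division by zero) when expected agreement equals 1."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)


# Case counts reported in the text.
consistent_pct = percent(1978, 2007)
inconsistent_pct = percent(29, 2007)

# Hypothetical toy labels, for illustration only.
system_a = ["give", "give", "defer", "defer"]
system_b = ["give", "give", "defer", "give"]
toy_kappa = cohen_kappa(system_a, system_b)
```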

When using the three different approaches to execute the cough guideline (the ad hoc approach did not apply here), consistent recommendations were generated in all of the 20 cases.

Comparison of the execution paths when GESDOR GLIF3 and GLEE were used to execute the DTP immunization guideline and the cough guideline indicated that the activation traces and the start traces were exactly the same for all the cases of the two guidelines. However, a significant portion of the cases (1946 out of the 2007 cases for the DTP immunization guideline, and all 20 cases for the cough guideline) had inconsistent results when the chaining records were used in the comparison.

Comparison of the execution paths when GESDOR GLIF3 and GESDOR PROforma* were used to execute the DTP immunization guideline and the cough guideline indicated that the activation traces and the start traces were exactly the same for all the cases of the two guidelines. Here the chaining records did not apply, as GLIF3 and PROforma* have different types of primary tasks.
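
A per-case comparison of execution paths of this kind might be sketched as follows. The trace layout (a dictionary from trace kind to an ordered list of entries), the kind names, and the toy runs are assumptions for illustration. They do not reflect the actual trace formats of GESDOR or GLEE.

```python
from typing import Dict, Iterable, List

Trace = Dict[str, List[str]]  # trace kind -> ordered entries (assumed layout)


def count_mismatches(runs_a: Dict[str, Trace],
                     runs_b: Dict[str, Trace],
                     kinds: Iterable[str]) -> Dict[str, int]:
    """For each trace kind, count the cases whose traces differ
    between two engines. Cases are matched by case identifier."""
    kinds = list(kinds)
    mismatches = {kind: 0 for kind in kinds}
    for case_id, trace_a in runs_a.items():
        trace_b = runs_b[case_id]
        for kind in kinds:
            if trace_a.get(kind) != trace_b.get(kind):
                mismatches[kind] += 1
    return mismatches


# Hypothetical toy runs: engine B records extra scheduling information
# in its chaining records but activates and starts the same steps.
engine_a = {
    "case1": {"activation": ["s1", "s2"], "start": ["s1", "s2"],
              "chaining": ["s1->s2"]},
    "case2": {"activation": ["s1"], "start": ["s1"], "chaining": []},
}
engine_b = {
    "case1": {"activation": ["s1", "s2"], "start": ["s1", "s2"],
              "chaining": ["s1->s2", "scheduler:s2"]},
    "case2": {"activation": ["s1"], "start": ["s1"],
              "chaining": ["scheduler:s1"]},
}

all_kinds = count_mismatches(engine_a, engine_b,
                             ["activation", "start", "chaining"])
# When chaining records do not apply, simply leave them out of the comparison.
without_chaining = count_mismatches(engine_a, engine_b, ["activation", "start"])
```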

Finally, we used physicians' judgments as the gold standard to evaluate the clinical validity of the final recommendations generated by the systems. For the DTP immunization guideline, all the 29 inconsistent cases and 20 cases that were randomly selected from the 1978 consistent cases were reviewed by two physicians. In the first round of the review, the physician judges did not know the recommendations generated by the systems. Instead, their judgments were based solely on the case descriptions. In this round, the sensitivity and the specificity of GESDOR GLIF3, GESDOR PROforma*, and GLEE (these three systems had the same final recommendations) were 99.71% and 67.65% respectively; and the sensitivity and the specificity of EzVac were 99.43% and 67.48% respectively. To improve the reliability of the judgments, the 5 cases in which the judgments by the physicians were different from any of the four systems were sent back to the physicians for a second review, along with the results generated by the systems this time. In the second round of the review, the sensitivity and the specificity of GESDOR GLIF3, GESDOR PROforma*, and GLEE were 99.80% and 80.74% respectively; and the sensitivity and the specificity of EzVac were 99.53% and 80.55% respectively. Here the EzVac system was used as an external reference to evaluate the other three systems.
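
For reference, sensitivity and specificity against a gold standard are computed as in the Python sketch below. The page does not give the underlying counts or define what was treated as a positive item, so the boolean lists here are hypothetical and serve only to show the calculation.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(system: Sequence[bool],
                            gold: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of system output against a
    gold standard, where True marks a positive item."""
    pairs = list(zip(system, gold))
    tp = sum(1 for s, g in pairs if s and g)          # true positives
    fn = sum(1 for s, g in pairs if not s and g)      # false negatives
    tn = sum(1 for s, g in pairs if not s and not g)  # true negatives
    fp = sum(1 for s, g in pairs if s and not g)      # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: physicians' judgments as gold standard.
gold_standard = [True, True, True, False, False]
system_output = [True, True, False, False, True]
sens, spec = sensitivity_specificity(system_output, gold_standard)
```

In this toy example the system finds two of the three positive items and correctly rejects one of the two negative items.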

For the cough guideline, two physicians reviewed all 20 cases. The percentages of correct, acceptable, and wrong diagnoses for cases 1 to 10 were 38.89%, 47.22%, and 13.89% respectively; for cases 11 to 20 they were 46.94%, 44.90%, and 8.16% respectively. Because the first 10 cases had been used to tune the encoding of the decision criteria, they served as a reference against which to measure performance on the last 10 cases.
 
Discussion
The results showed that the recommendations generated by GESDOR GLIF3, GESDOR PROforma*, and GLEE were exactly the same for all the cases in both guidelines. This means that the GESDOR model works well in generating guideline-based recommendations. Because these recommendations are what is ultimately used in clinical decision support, they are the most important outcome in the evaluation.

The execution paths of GESDOR GLIF3 and GESDOR PROforma* were exactly the same for all the cases in both guidelines. This means the GESDOR model is generalizable in that it can be applied to different guideline representation models.

The activation traces and the start traces generated by GESDOR GLIF3 and GLEE were consistent for all the cases of both guidelines. Analyses found that the inconsistent results in the comparison of the chaining records were due to the extra information added by the generic task-scheduling model of GESDOR. This means that the chaining records should be used conservatively when applying the GESDOR model for guideline execution, for example in implementing an explanation function for a clinical decision support system, where the chaining records play a critical role. It is important to note, however, that this problem of the generic task-scheduling model does not affect the final recommendations generated by the system.

The clinical validity of the final recommendations generated by GESDOR GLIF3 and GESDOR PROforma* reached the level of the reference systems. Specifically, in the execution of the DTP immunization guideline, the sensitivity and the specificity of the systems were at the same level as those of the EzVac system; in the execution of the cough guideline, the accuracy of the systems on the last 10 cases was slightly better than on the first 10 cases.

Process modeling tools have been used previously to implement care plans [12]. The GESDOR model differs from previous approaches in that it focuses on process-centered knowledge management, with the generalized guideline execution task ontology serving as a process-oriented reorganization of the guideline execution knowledge that is common across different guideline models.

Several ontology mapping models and tools have been developed previously for different purposes [13,14]. The ontology mapping model in GESDOR differs from previous approaches in that (1) it focuses on instance translation directed by model-level mapping, and (2) it has its own languages for specifying slot mappings and mapping conditions, and thus provides flexibility in developing the mapping model.

The GESDOR model provides connections among different guideline representation models, similar to the way the UMLS bridges different controlled medical terminologies [15]. As a long-term goal, as more guideline representation models are included in the application domain of GESDOR, a comprehensive standard for guideline representation could be developed and widely accepted.

In this study, we assume that PROforma* uses the same expression language as that in GLIF3. For models using different expression languages, we believe that the general principle of GESDOR still applies, although its effectiveness needs to be evaluated further in those cases. Ideally, the GESDOR model should be tested with guidelines encoded in their original formats. We plan to request additional resources to further investigate the feasibility and the generalizability of the GESDOR model.
 
Conclusion
The GESDOR model can be used for the effective execution of guidelines that are encoded in different formats, and thus realizes guideline sharing at the execution level. GESDOR's chaining records should be used conservatively.
 
Publications
  1. Wang D, Peleg M, Bu D, Cantor M, Landesberg G, Lunenfeld E, Tu SW, Kaiser GE, Hripcsak G, Patel VL, Shortliffe EH. "GESDOR - a generic execution model for sharing of computer-interpretable clinical practice guidelines." AMIA Annual Symposium Proceedings. 2003;
 

University of Rochester Medical Center
601 Elmwood Avenue, Box 689, Rochester, NY 14642.
Last update: 5/10/2011