State of the clinical modelling program and international CKM

I sent this email to the openEHR lists yesterday. Published here for broader sharing, and perhaps as a resource for future benchmarking…


22 July 2019

Dear colleagues,

We recently passed the eleventh anniversary of the first upload to the international CKM – the body temperature archetype. As Europe readies itself for summer holidays and the clinical review season slows down, it is a good time to review the progress of the openEHR clinical modelling program.

Roughly 6 weeks ago I created and downloaded a number of reports from CKM. I’ve spent some time analysing the data and thought I’d share what I learned with you.

This exploration was triggered by a tweet from Ewan Davis last December asking:

“How many person hours do you think has gone in to creating the openEHR archetypes available via CKM - I think it must be in excess of 100,000 hours (40 person years)”

It took a while to gather the data and propose reasonable assumptions so that we could make time and effort estimates, but here goes…

CKM stats

(as of 5 July 2019):

  1. Community

    • Registered users – 2239

    • Countries represented – 95

  2. Archetype library

    • Total archetypes – 785

    • Active archetypes

      • Published – 93

      • Published as v1, needing reassessment – 6

      • In review – 31, with at least 7 about to be published

      • Draft – 351

      • Initial (in incubators) – 110

    • Proposed archetypes – 10

Behind the scenes

(from CKM reports, May 2019)

  1. Number of archetypes which have completed or are undergoing a review process – 130

  2. Number of review rounds completed – 295

  3. Number of archetype reviews completed by all reviewers – 2995

  4. Number of unique reviewers – 272

  5. Reviews completed per review round – 10.15

  6. Average number of reviews per archetype – 23.04

  7. Average number of reviews per reviewer – 11.01

  8. On average, during the past 12 months, approximately 100 unique reviewers logged into CKM on 900 occasions per month.
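
The derived averages in items 5–7 follow directly from the raw counts; here is a minimal Python sketch to reproduce them, with the numbers hard-coded from the CKM reports:

    reviewed_archetypes = 130   # archetypes that have completed or are undergoing review
    review_rounds = 295
    reviews_completed = 2995
    unique_reviewers = 272

    print(f"Reviews per review round: {reviews_completed / review_rounds:.2f}")       # 10.15
    print(f"Reviews per archetype:    {reviews_completed / reviewed_archetypes:.2f}") # 23.04
    print(f"Reviews per reviewer:     {reviews_completed / unique_reviewers:.2f}")    # 11.01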

Time estimates

This is where things become interesting…

[Table omitted: breakdown of estimated hours by activity, totalling approximately 16,289 hours.]

This equates to roughly 8.5 person years.

Obviously, I have made some assumptions about the average time for many activities, and if we factor in incidental conversations, pondering modelling conundrums or cross-pollination between CKMs, we could reasonably increase the estimate to 10 person years. However, try as I might, there was no way I could justify bumping the figures up to achieve estimates of 20, much less 40, person years.

These numbers reflect the work on archetypes that are owned and managed in the international CKM. This includes an estimate of the work done by reviewers and editors from the Apperta and Norwegian CKMs where their archetypes now reside in the international CKM or in multiple CKMs. It does not reflect the work done on reviews in the now-retired Australian CKM, although estimates of design time for those archetypes have been part of the assumptions.

I interpret Ewan’s estimate to reflect his impression that the effort to achieve what we have done so far was huge. I too believed that the effort was epic, but in my head it was still only in the ballpark of about half his initial estimate. That the actual effort appears to be only 8–10 person years totally surprised me. Initially my figures were considerably lower; I went back to them and tried to massage them upward, because this is obviously a rather inexact science – more an educated guesstimate – but this is as far as I feel comfortable going.

In addition, Thomas Beale estimates that on average there are 14 clinically significant data elements per archetype, according to the ADL Workbench. These are the relevant data points that we design, review etc. So 785 archetypes × 14 data points/archetype suggests that we have a library of approximately 10,990 data points, none of which are duplicates or overlap within the governed archetypes. And if we accept my estimate of a total of 16,289 hours, the amount of time per data element is 16,289/10,990 – only 1.48 hours each, covering design, review, maintenance and governance.
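
For transparency, the headline figures can be reproduced with a few lines of arithmetic – a minimal sketch, noting that the ~1,920 working hours per person year (48 weeks × 40 hours) is my own assumption, not a figure from the CKM reports:

    total_hours = 16289              # overall estimate from the table above
    hours_per_person_year = 1920     # assumed: 48 weeks x 40 hours

    archetypes = 785                 # archetypes in the international CKM
    data_points_per_archetype = 14   # Thomas Beale's ADL Workbench estimate

    data_points = archetypes * data_points_per_archetype
    print(f"Data points:          {data_points}")                              # 10990
    print(f"Person years:         {total_hours / hours_per_person_year:.1f}")  # ~8.5
    print(f"Hours per data point: {total_hours / data_points:.2f}")            # ~1.48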

What conclusions can we draw?

  • Firstly, modelling ‘openEHR style’ seems to be quite efficient, surprising even those of us who are involved daily, and this unique collaborative and crowdsourced approach to standardisation of clinical data is working well. On top of that, if you remember that more than 95% of the editorial work and reviewers’ time has been volunteered, then this truly has been an extraordinary community endeavour.

  • Secondly, the ratio of reviewer time to design time is noteworthy – 1,498 hours of review compared to 10,437 hours of design. In effect, we have successfully minimised reviewer effort by making each 30-minute review count as efficiently as possible, and that has been achieved by attention to detail and by spending time investigating and developing strong design patterns before archetypes are sent out for review. Over the years we have made some bad design choices and had to rethink our approach. Gradually we have developed some good patterns and, before you ask where we have documented them, I will point you to the published archetypes – each of them functions as a potential pattern for the next archetype we intend to develop, and we reference and reuse those patterns as much as possible. In this way our library is growing and our modelling is improving. As an example, a current area of serious rework is the Physical examination archetypes, which are being ‘renovated’ at present. It makes me think that every hour spent in design is a good investment of time and effort – that may not have seemed apparent in the early days, but it is paying off for the archetypes we are designing years later, based on the (good and bad) learnings from those earliest archetype designs.

  • Thirdly, we have some insights into the modelling community, and for the first time we have some idea of the level of activity across the various roles. We also have an estimate of the size of the library at data element level, so that we can compare it with similar modelling efforts elsewhere in the world.

I would particularly like to thank my co-lead, Silje Ljosland Bakke, and Ian McNicoll for their dedicated efforts, and of course all of the other Editors, Reviewers and Translators who have so generously volunteered their time and expertise to create a strong, free and public foundation for digital health data standards.

We should all be very proud of this work. It will be our legacy, living on well after we’ve all retired.

Kind regards

Heather Leslie

Representing FHIR clinical content in openEHR

Over the past few days I’ve attempted the task of representing the FHIR Skin and Wound Assessment profiles using openEHR archetypes.

I note that there are three Skin and Wound Assessment FHIR Implementation Guides available online:

  1. "Full CIMI" version – which is the one I chose to model;

  2. Federal Health Information Model (FHIM) version; and

  3. MITRE’s ‘mini-CIMI’ version.

I’m a clinical modeller, a clinician by background, so I’m always looking at how we can best represent clinical data in ways that are friendly to grassroots, non-technical clinicians of any sort. The FHIR IG is a tough beast to decipher, despite my more than a decade of experience gathering patient requirements and turning them into implementable specifications.

My intent as I started this work was specifically that as a clinician I didn’t want to have to fully understand the FHIR representation. I wanted to be able to look at the clinical data and recreate it using my familiar openEHR tooling and representations.

I estimate that it has taken me nearly 2 full days of work – much more than I anticipated – mostly spent trawling through the myriad of online pages for each FHIR resource and associated profile, then piecing together the connections visually so that I could create or reuse appropriate openEHR archetypes and templates. The openEHR representation didn’t take long, largely because of reuse. It was the analysis that was the time killer.

Despite all of that effort, I am still not confident that I’ve got it right. But the following post reflects my experience, plus learnings and some queries.

My modelling assumptions

The three base clinical representations that I’ve gleaned are the Wound Presence Assertion, the Wound Absence Assertion and the Wound Assessment Panel.

Rightly or wrongly, my openEHR templates assume the following:

  • ‘Wound Presence Assertion’ profile is the equivalent of our recording a diagnosis and overview of a wound – so I’ve created a Wound Presence Assertion template, based on the EVALUATION.problem_diagnosis archetype;

  • ‘Wound Absence Assertion’ profile is the positive assertion that a wound is excluded or not present; and

  • ‘Wound Assessment Panel’ is the equivalent of clinical examination findings about a single, identified wound – so I’ve created a Wound Assessment Panel template, based on the OBSERVATION.exam archetype.

  • If FHIR components were modelled as 0..1 occurrences then I added them to the archetype at the root level – see the size measurements and edge-related data elements in the Examination of a Wound CLUSTER archetype.

  • If FHIR components were modelled as 0..* occurrences then I added them as repeatable internal cluster groupings – see the Tunnelling and Undermining clusters in the same archetype.
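
The last two assumptions amount to a simple decision rule; here is a minimal Python sketch of the logic (the function and return labels are mine, purely for illustration, and not part of either specification):

    def archetype_placement(fhir_max_occurs: str) -> str:
        """Map a FHIR component's upper cardinality to its place in the CLUSTER archetype."""
        if fhir_max_occurs == "1":
            # 0..1 -> a single data element at the archetype root
            return "root-level element"
        # 0..* -> a repeatable internal cluster grouping
        return "repeatable internal cluster"

    # e.g. size measurements (0..1) vs Tunnelling and Undermining (0..*)
    assert archetype_placement("1") == "root-level element"
    assert archetype_placement("*") == "repeatable internal cluster"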

openEHR Representation

You’ll note that I haven’t created a template for the Wound Absence Assertion. I’ll be curious to hear the use case from the FHIR modellers, but I cannot imagine a scenario where a clinician will explicitly record that a wound is absent, not there. They will record that a previously known wound has healed over, or that the skin in the area is normal. But to record that a pressure ulcer on the right buttock is absent – I don’t think so, not even in a medico-legal scenario! If someone can provide me with a use case, I’m happy to reconsider…

I’ve uploaded the resulting two templates and the associated archetypes to a public incubator on CKM. You can view them all here: https://ckm.openehr.org/ckm/#showProject=1013.30.9.

The two templates comprise six reused archetypes, plus two new archetypes that I created.

The clinical content within the FHIR IG is generally very sound, and I can see that a lot of work has gone into its development, especially the value sets. It is a very useful resource, if you can discern the content amongst all of the rest of the tech spec. I must admit I got very frustrated and very confused and had to restart a few times.

Now that I’ve teased it out, I’m very grateful to those who did the hard yards of clinical analysis that underpins this Skin and Wound assessment. Credits on the IG attribute the domain content analysis to Susan Matney from Intermountain Healthcare. I have some other questions for clarification and would like to discuss a few issues, but otherwise this is a really sound piece of work and I’m very pleased with the end result in openEHR.

The templates are up on CKM for you to take a look at.

The rather ugly CKM default UI for templates is deliberately designed to display the individual data elements and most of the relevant constraints – making it much easier for a clinician to review and approve the content. CKM doesn’t display the URLs of value sets, although if you download the .oet and .opt files you will find them safely stored within the code.

Questions, issues

General modelling

I’ve reconfirmed that the openEHR reference model is a godsend. That the data types have set attributes which don’t need to be represented over and over again – including null flavours etc – is something I now appreciate enormously. Brilliant. There is endless repetition of RM-related data in the FHIR resources, and it takes ages to locate the real clinical data amongst everything else.
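
To make that concrete, here is a loose sketch in Python (not the actual RM classes – the names and fields are illustrative only): the reference model defines each data type’s attributes once, so an archetype constrains only what is clinically relevant.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class DvQuantity:
        # Defined once in the reference model; every archetype that uses a
        # quantity inherits these attributes (plus null flavours etc.)
        magnitude: float
        units: str
        precision: Optional[int] = None

    # An archetype then only constrains the clinical specifics,
    # e.g. a wound length measured in centimetres:
    wound_length = DvQuantity(magnitude=4.5, units="cm")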

Wound Presence Assertion

  • Anatomical location of the wound is recorded only in this model. I’m not sure this is a good idea. I think that anatomical location should also be modelled for each examination event (ie included in the Wound Assessment Panel) so that what is being examined is clear and unambiguous. At present the anatomical location of the wound represented by the Assessment Panel appears to be associated with the Presence Assertion only via a common identifier (WoundIdentifier).
    Note that I can only find the Anatomical location model in the Logical model and not in the Profile, so maybe I’m missing something?

  • In the logical model, Laterality is represented as ‘Unilateral left’, ‘Unilateral right’ and ‘Bilateral’. I totally agree with the left and right, but I have a major problem with identifying one (or more) wound(s) as ‘bilateral’. ‘Bilateral’ should probably not refer to direct observational exam findings at all, but it may have some value in recording conclusions. For example, in examination of each ankle, the finding of pitting oedema may be made, but each side is likely to need explicit recording of different severity or association with ulceration etc, so the findings from each side should be recorded separately. However, the conclusion of the clinician, at a higher level of abstraction in a physical findings summary or as a diagnosis, may well be bilateral ankle oedema; it is just not advised at the point of recording the examination findings.
    Given that this representation of anatomical location is in the Assertion and as best I can tell the concept being modelled is a single Wound (SNOMEDCT::416462003), the notion of a bilateral wound is not appropriate.

  • Clinical status values – now these were tricky. We have an existing ‘Problem/diagnosis qualifier’ archetype. It is a messy beastie, largely because clinicians are notoriously messy in how they record these kinds of things. The FHIR value set used in this template comprises values from 4 (yes, four) data elements that we have identified as having completely different axes in our archetype. The FHIR value set is drawn from parts of each of our ‘Active/Inactive’, ‘Resolution phase’, ‘Remission status’ and ‘Episodicity’ data elements.

Wound Assessment Panel

  • As above, I’m concerned that there is no explicit recording of the Anatomical location/laterality parameters so that we can track examination findings over time, especially if there are multiple wounds. An identifier as a connector seems a little flimsy to me.

  • The concept seems to be a generic wound, but in reality the data elements seem to be focused on recording the findings of an open, ulcerated lesion. If the wound was a long laceration, for example, there are parameters missing, such as beginning and end points or location relative to a body landmark. An animal bite might also be difficult to record at the best of times, and something simple like a narrative description would also be helpful.

  • There is a data point about a pressure ulcer association, with two values – device and pressure site – which reinforces the focus on a pressure ulcer and is very specific. I have modelled that same data point as a repeating cluster pattern of a ‘Factor’ with associated present/absent attributes, to make this model more applicable to a range of wounds.

  • Tunnelling is a tricky concept to model. The FHIR model seems to assume that the tunnelling radiates out from the edge of the wound, and doesn’t allow deep tunnelling from the middle of the wound to be recorded.

  • The use of a clockface direction is common in a few clinical scenarios, including this one. However, the openEHR experience has identified that in order to represent it accurately, a few assumed items need to be recorded, such as the central landmark around which the clockface is oriented, as well as the anatomical landmark that identifies the 12 o’clock reference point. See our recently published Circular anatomical location archetype.

  • The Undermining model is represented in the archetype/template as a repeating CLUSTER to allow multiple measurements of the amount undermined in different directions. In the FHIR model, the length and direction are both optional. In the archetype I’ve made the length mandatory as there is no point recording a direction by itself.

  • I did not model measured exudate volume in the template, as it is not clear to me how you would measure a volume at a single examination (assuming my assumption about the recording context is correct). Rather, it is usually recorded over time, and so does not seem to belong here. I did model the Amount, with a value set that is not available to view, as I assume it is descriptive and could be appropriate here.

  • Episode is modelled within this context; however, it seems to me that it is probably better placed in the Assertion. See the ‘Episodicity’ data element within the Problem/diagnosis qualifier archetype.

  • Similarly, ‘Trend’ feels a little tricky in the context of examination findings. I guess that it could be useful if it is part of a sequence of exam findings and if it correlates back to the longer timeframe of the ‘Course description’ narrative within the Problem/Diagnosis archetype.

  • There are a few data elements which record Yes/No/Unknown answers, as though answering a questionnaire within the context of recording exam findings. In openEHR we tend to record these as findings that are Present/Absent/Indeterminate, so that we can bind arguably more meaningful codes to each value and not mix history-like recording with observations.
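
To illustrate that last point, the questionnaire-style answers map one-to-one onto the exam-finding values we prefer – a trivial sketch (the value strings are illustrative; real archetypes bind terminology codes to each value):

    # Hypothetical mapping from questionnaire-style answers to exam-finding values
    ANSWER_TO_FINDING = {
        "Yes": "Present",
        "No": "Absent",
        "Unknown": "Indeterminate",
    }

    print(ANSWER_TO_FINDING["Unknown"])  # Indeterminate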


There is no doubt that, for a grassroots clinician, reviewing the content of the IG is next to impossible. Perhaps the FHIR community has a more clinician-friendly view that I’m not aware of. It is absolutely needed!

Wouldn’t it be great if the FHIR and openEHR communities could collaborate so that this CKM representation could be used to support the FHIR work… but I’ve probably said that a million times… or maybe more. Perhaps one day <sigh>.

A common data language is essential for digital health disruption

The lack of a common health data language has been ‘the elephant in the room’ for a very long time. Unfortunately, very few people acknowledge the need for a clinical lingua franca as a critical foundation for eHealth. The mainstream view seems to be that messages are, or will be, enough, and that creating a standard language for health information is either too hard or too complicated. Is it really that hard? Or is that just the view of those with a vested interest in perpetuating the message paradigm?