The dummies’ guide to integrating LLMs and AI into EHRs
Ambient virtual assistants are dead. Long live ambient virtual assistants and conversational AI.
Part 1: A brief summary of the eternal struggle of the intersection of clinical documentation and technology
Since Dr. Larry Weed invented problem-oriented charting in the 1970s, the major challenge of computer/human clinical documentation has been balancing four key requirements.
The documentation must be thorough enough for medical completeness and reimbursement purposes.
The documentation must be succinct enough that people will actually read it.
The documentation must be comprehensible to a wide variety of audiences both within and outside of the author’s specialty.
The author must not spend so much time doing #1-#3 that they consider switching careers to farming or product management.
The solutions to this have varied over time. Paper documentation templates came first. Then came transcription. Transcription was seen as expensive, so pre-Meaningful Use EHRs pitched their utility as “lower documentation costs via documentation macros”.1 As computer/user interfaces matured and became more interoperable, partial dictation and then speech-to-text solutions like Dragon became more accessible. And the cost of transcription itself fell as the work was offshored and bolstered by speech-to-text. And, of course, scribes were available to certain specialists in various care settings. A scribe differs from an ordinary transcriptionist in that their goal is not just to write the note but also to file any discrete information as required (either by internal good documentation practices or external billing requirements).
The most recent attempt to solve the documentation-to-communication balancing act was virtual assistants. There are a few flavors of them, but they generally accomplish the following tasks.
Aggregate more data about patients before the visit to enhance documentation.
Provide immediate and intelligent voice-to-text that can convert prose into captured discrete data (orders, billing, medications, dx/problems). This capture of prose is done through “smart rooms”, speakers/mobile devices, or other workflow-specific devices.
Write said discrete and non-discrete data into the EHR.
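Concretely, the “convert prose into discrete data” step amounts to entity extraction over the exam transcript. A minimal sketch of that idea, using hypothetical keyword patterns as a stand-in for a real clinical NLP model (e.g. a service like Amazon Comprehend Medical or a trained NER model):

```python
import re

# Hypothetical patterns standing in for a real clinical NLP/NER model.
PATTERNS = {
    "medication": r"\b(lisinopril|metformin|atorvastatin)\b",
    "order": r"\border (?:a |an )?([a-z0-9 ]+?)(?:\.|,|$)",
}

def extract_discrete_data(transcript: str) -> dict:
    """Turn free-text exam prose into discrete, filable categories."""
    transcript = transcript.lower()
    return {category: re.findall(pattern, transcript)
            for category, pattern in PATTERNS.items()}

note = "Continue metformin 500 mg twice daily. Order a basic metabolic panel."
print(extract_discrete_data(note))
# → {'medication': ['metformin'], 'order': ['basic metabolic panel']}
```

A production system would of course use a model rather than regexes, and would then map each hit to a code system (RxNorm, ICD-10, CPT) before filing it.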
At a high level, these virtual assistants were designed to replace either a scribe or a well-trained transcriptionist. By being able to pull clinical data before and during exams and emulating some amount of clinical/administrative training, virtual assistants used that input to fully document in the EHR, causing a clinician to spend less time on clicks/mental overhead.
For the most part, none of these vendors has dominated the market or deprecated transcription/scribes as predicted up to this point. I would say the Ambient Virtual Assistant era is roughly 10 years old at this point.2 This is perhaps debatable, but I would point towards 5 key events from around that time:
Meaningful Use Stage 1, the entry phase of provider EHR adoption in the US, ends. The majority of US physicians now use an EHR, a share that keeps growing as MU picks up steam.
Augmedix, the first dedicated virtual assistant company (AFAIK), launches on a Google Glass based platform.
The iPhone 4S is released in 2011, the first Apple product to feature “Siri” in its modern incarnation (later refined alongside Touch ID on the 5S in 2013).
Nuance in 2013 claims that Dragon Medical is now 98% accurate out of the box.3
Amazon buys the Polish company Ivona (Dobra robota!) and releases its technology as Alexa the next year to compete with Siri.
Heading into the ten-year mark, the global medical transcription market was estimated at 1.5 billion USD in 2021, yet that year most ambient AI technologies captured only a small percentage of that overall revenue (which does not include scribes and other non-technical equivalents). Nuance at its acquisition in 2021 anticipated selling only 10-23 million USD of Dragon Ambient eXperience (DAX). Augmedix did 30.3M USD of revenue in FY 2022. Other entrants were either acquired by Nuance (Saykara, iScribes) or are startups like Suki or Robin Healthcare without public disclosures.4
A decade in, why are there no world-killing technologies on the market yet? It’s hard to say exactly. But these are my hypotheses, somewhat grounded in my own discovery work on building a digital assistant into a healthcare application:
While everyone was claiming to have AI, it was light on the intelligence end of things. Yes, speech-to-text and other technologies like Amazon Comprehend are ML-driven, but they were hardly generation-changing tech. They were still mostly effective only when buttressed by humans (transcriptionists, scribes, even clinicians for review). For the most part you still had to craft intents as an application developer and then train users how to use those intents (e.g. “Siri, play ‘Never Gonna Give You Up’ on Spotify.”)
The mediums for the input of voice-based data are complicated. A device like an Alexa or on-phone recording is fine in a quiet Ambulatory exam room but is not necessarily a good fit in a busy acute care setting.
Conversely, other tools which may be better suited to more crowded care settings have not outperformed the classic dictaphone itself. Google Glass and phone microphones were generally lacking or made the user look rather dorky. This is getting better with massive investment in digital audio quality through televideo applications like Zoom/Teams/Meet, but it is still not available to the masses.
Integration technology was nascent. Yeah, it’s easy enough to receive a scheduling/registration feed and push notes into an EHR, but setting up something which can push all of the discrete-data high holies (meds, allergies, problems/hx, orders, charges) or respond to data ingested mid-exam is no trivial feat in HL7v2. Despite their claims, the data integrator companies didn’t make this problem any simpler. More possible, certainly, but not simpler. These APIs now exist in FHIR, but write capabilities back into EHRs are limited to market-leading vendors like Epic, Cerner, and a few other smaller players. RPA does fine with automated, non-real-time tasks but may struggle when operating in real time during an exam.
At the end of the day, the early adopters of any technology share two attributes: they believe they have a solvable problem, and they believe your technology is far-and-away the best way to solve it. And anyone considering a virtual scribe might find the technology cheaper, perhaps, but it is certainly not better than a scribe in most cases. In many cases this may mean changing the paradigm of “better” (e.g. the Butterfly iQ+ may not be as versatile or as focused on certain exams for ultrasound power users, but it puts a probe in your pocket that works immediately upon opening the app and passes a four-foot drop test).
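To make the FHIR write-back point concrete, here is a minimal sketch of pushing a finished note into an EHR as a FHIR R4 DocumentReference. The base URL, token, and patient ID are placeholders; real vendor endpoints (Epic, Cerner, etc.) add their own app registration and SMART auth steps:

```python
import base64
import json
from urllib.request import Request, urlopen

FHIR_BASE = "https://ehr.example.com/fhir/R4"  # placeholder endpoint
TOKEN = "placeholder-oauth2-bearer-token"      # from SMART backend-services auth

def build_document_reference(patient_id: str, note_text: str) -> dict:
    """Wrap a clinical note as a FHIR R4 DocumentReference resource."""
    return {
        "resourceType": "DocumentReference",
        "status": "current",
        "type": {"coding": [{"system": "http://loinc.org",
                             "code": "11506-3",  # LOINC: Progress note
                             "display": "Progress note"}]},
        "subject": {"reference": f"Patient/{patient_id}"},
        "content": [{"attachment": {
            "contentType": "text/plain",
            "data": base64.b64encode(note_text.encode()).decode(),
        }}],
    }

def post_note(patient_id: str, note_text: str):
    """POST the resource to the server's DocumentReference endpoint."""
    body = json.dumps(build_document_reference(patient_id, note_text)).encode()
    req = Request(f"{FHIR_BASE}/DocumentReference", data=body, method="POST",
                  headers={"Content-Type": "application/fhir+json",
                           "Authorization": f"Bearer {TOKEN}"})
    return urlopen(req)  # would fail against the placeholder endpoint
```

Note that this only covers the note itself; filing the discrete high holies (meds, orders, charges) means separate writes to MedicationRequest, ServiceRequest, and the like, each with its own vendor-by-vendor support matrix.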
This isn’t to say that these virtual assistants are a failure. On the contrary. One thing I often tell healthcare newbies is that it’s easy to predict that some trend will change healthcare; the challenge is always understanding the time horizon on which it will occur. The need to reduce the digital burden on clicians by creating better tools has been obvious for a decade. But until LLMs, we didn’t have something truly better: something better than a human doing the task, better than a scribe at documentation and data summarization. While many of these technologies generally failed versus (or were predicates to) newer LLMs, they laid the groundwork and the proof required to execute on the vision of AI.
The virtual assistants are already better integrated. They are already aligned with EHR vendors. They already have some proof of reliability, and proof that those vendors could cross the chasm and survive in the rough and tumble world of enterprise health system sales. While their actual AI tech quality may be debatable, healthcare technology sales is often more about trust than quality. If the maxim of product is “right thing, right time” then they are seemingly in the pole position.
What does this mean? If you are a nascent LLM vendor, it means you are behind. This is true of any startup entering a new space against incumbents, but it is specifically true now as everyone, buyers and vendors alike, is gathering their footing around LLMs. If your hypothesis is that you can build a more useful LLM faster, you must also ensure that you integrate those insights or that documentation at the correct inflection point of the clinical workflow. This means understanding patient-provider flow, both as people perceive it and as the underlying technology systems like EHRs implement it. And it means understanding the limitations of those systems. The good news and the bad news about clinical data integration is that the playing field is as level as ever. APIs are standardized. Old integrations against VB6-built systems are being deprecated. Anyone can pay a vendor like Redox or Health Gorilla to pull data. IE11 has finally been EOL’d. Health systems are finally somewhat savvy about cloud computing, or at least have some framework for trusting those systems. The opportunity is there for a team of non-EHR/informaticist dorks who are good at LLMs and understand the user needs of clinicians to take the market.
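To make “anyone can pull data” concrete: against any FHIR R4 endpoint (direct from the EHR or via an aggregator), a patient’s problem list is one standardized REST call away. A minimal sketch with a placeholder base URL:

```python
import json
from urllib.request import Request, urlopen

FHIR_BASE = "https://ehr.example.com/fhir/R4"  # placeholder; real base URLs vary by vendor

def problem_names(bundle: dict) -> list:
    """Flatten a FHIR search Bundle of Condition resources into display strings."""
    return [entry["resource"]["code"]["text"]
            for entry in bundle.get("entry", [])]

def fetch_problem_list(patient_id: str, token: str) -> list:
    """One standardized call: GET /Condition?patient=...&category=problem-list-item."""
    req = Request(
        f"{FHIR_BASE}/Condition?patient={patient_id}&category=problem-list-item",
        headers={"Accept": "application/fhir+json",
                 "Authorization": f"Bearer {token}"})
    return problem_names(json.load(urlopen(req)))  # network call; placeholder URL
```

The same search pattern works for meds (MedicationRequest), allergies (AllergyIntolerance), and the rest. Which is exactly the leveling of the playing field: the read path no longer requires an HL7v2 interface engine project.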
But you’ve got to get there fast. Because this is happening now. So how do you become a provider/patient/EHR intersectional expert fast? Let me tell you how in part 2.
In hindsight, this is hilarious.
Let me have this as a solid decade. It’s good storytelling!
Older versions of Dragon required some amount of user training on documentation semantics, as well as training Dragon on the user’s voice.
These startups may be on even footing or even outpacing the publicly traded companies as they seem more directionally oriented (we will come back to this when talking about solutions). I also did not include startups/companies who were more patient/consumer oriented like Orbita and Vocera. Their care coordination/patient digital journey technology is important but the point of this is to talk about clinician/technology experience vis-a-vis AI. Perhaps this is a good topic for another blog post since there is some overlap in use cases.