Inlining feature added to Cumulus ETL #47
mikix announced in Announcements
The Cumulus ETL just added a neat new feature: the ability to inline attachments into your DocumentReference or DiagnosticReport NDJSON.
What Is Inlining?
Inlining is the process of taking an original NDJSON attachment definition that points at a URL, downloading the referenced URL, and stuffing the results back into the NDJSON with some extra metadata.
Now the data is stored locally in your downloaded NDJSON and can be processed independently of the EHR.
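As a rough illustration of the idea (not the ETL's actual implementation), the sketch below inlines a single FHIR Attachment. It assumes the standard FHIR Attachment fields `url`, `data` (base64), `size` (byte count) and `hash` (base64 SHA-1); which metadata the ETL really writes is not shown in this post. The `inline_attachment` name and the `fetch` callable are hypothetical stand-ins for an authenticated download from the EHR.

```python
import base64
import hashlib
from typing import Callable


def inline_attachment(attachment: dict, fetch: Callable[[str], bytes]) -> dict:
    """Return a copy of a FHIR Attachment with its URL contents embedded.

    `fetch` is a hypothetical callable that downloads the bytes behind a URL.
    """
    url = attachment.get("url")
    # Nothing to do if there is no URL or the data is already inlined.
    if not url or "data" in attachment:
        return attachment

    payload = fetch(url)

    inlined = dict(attachment)  # shallow copy, leave the input untouched
    # Standard FHIR Attachment fields (assumed here, not confirmed for the ETL):
    inlined["data"] = base64.b64encode(payload).decode("ascii")
    inlined["size"] = len(payload)
    inlined["hash"] = base64.b64encode(hashlib.sha1(payload).digest()).decode("ascii")
    return inlined
```

The original `url` and `contentType` are kept, so the inlined record still says where the content came from.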
Why Would I Want This?
This lets you download the attachments once up front, rather than every time you want to run NLP over clinical notes.
OK, How Do I Inline?
Simply pass --inline when doing a bulk export to have the ETL start an inline job after the export finishes. Or use the new inline command to work on already-exported data.
Using the default settings, both approaches will inline any DiagnosticReport and DocumentReference resources, downloading text, HTML and XHTML attachments. There are flags to control which resources are examined and which mimetypes are downloaded. See --help for details.
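To make the default selection concrete, here is a minimal sketch of how attachments might be picked out of NDJSON lines by resource type and mimetype. It is not the ETL's code: the function names are hypothetical, and the exact mimetype strings (`text/plain`, `text/html`, `application/xhtml+xml`) are an assumption about what "text, HTML and XHTML" means. The attachment locations (`content[].attachment` for DocumentReference, `presentedForm[]` for DiagnosticReport) follow the FHIR resource definitions.

```python
import json
from typing import Iterable, Iterator

# Assumed defaults, mirroring the description above (not taken from the ETL source).
DEFAULT_RESOURCES = frozenset({"DiagnosticReport", "DocumentReference"})
DEFAULT_MIMETYPES = frozenset({"text/plain", "text/html", "application/xhtml+xml"})


def attachments_of(resource: dict) -> list:
    """Return the Attachment dicts held by a supported FHIR resource."""
    kind = resource.get("resourceType")
    if kind == "DocumentReference":
        return [c["attachment"] for c in resource.get("content", []) if "attachment" in c]
    if kind == "DiagnosticReport":
        return list(resource.get("presentedForm", []))
    return []


def wanted_attachments(
    ndjson_lines: Iterable[str],
    resources: frozenset = DEFAULT_RESOURCES,
    mimetypes: frozenset = DEFAULT_MIMETYPES,
) -> Iterator[dict]:
    """Yield attachments that have a URL and a matching base mimetype."""
    for line in ndjson_lines:
        if not line.strip():
            continue  # skip blank lines
        resource = json.loads(line)
        if resource.get("resourceType") not in resources:
            continue
        for attachment in attachments_of(resource):
            # Drop parameters such as "; charset=utf-8" before comparing.
            base_type = attachment.get("contentType", "").split(";")[0].strip().lower()
            if base_type in mimetypes and "url" in attachment:
                yield attachment
```

Passing a smaller `resources` or `mimetypes` set plays the role that the command-line flags play in the ETL.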