Mine That Record

November 26, 2009

This isn’t that easy

Filed under: Exploring the depths — Dr. H @ 12:49 am

I spent most of the afternoon trying to get one measurement from our database for patients who take a certain medication. Since our data warehouse (i2b2) does not support this kind of query, I had to delve into a backup of the SQL database that hosts the original records.

The schema for our clinical database is reasonable, and someone like me, with a working knowledge of SQL and some relational database experience, can grasp it quickly. However, the way the schema is used is flabbergasting, and I’m not completely sure whether the cause is our installation, our users, or the end-user software. For example, when people enter a medication order erroneously, I would expect one of two sensible things to happen:

  1. The medication order is deleted and an appropriate entry reflecting this is added to the audit log (I’m not sure if our EHR supports this), or
  2. The medication order is followed by an immediate cancellation order (which our EHR supports).

However, our clinical system, in its infinite wisdom, stores these corrections as discontinuation orders. Yes, the ones that mean “Mrs. Smith? You know that medication you are taking? Please stop taking it.” So we have patients taking drugs legitimately and then stopping for infinitesimal, or zero, amounts of time. Instead of looking for cancellation orders, we have to do date arithmetic in SQL and hope that we are interpreting what we see in the data tables correctly.
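
To make the date arithmetic concrete, here is a minimal sketch in Python rather than SQL. It assumes the interval of interest is the time between an order and its discontinuation, which is my reading of the data and not something the EHR documents. The five-minute cutoff and the function name are hypothetical; the real threshold would have to come from looking at the data.

```python
from datetime import datetime, timedelta

# Hypothetical cutoff: a discontinuation this soon after the order is
# treated as a data-entry correction, not a clinical decision.
ERROR_WINDOW = timedelta(minutes=5)


def classify_discontinuation(ordered_at, discontinued_at, window=ERROR_WINDOW):
    """Label an order by the time between it and its discontinuation."""
    if discontinued_at is None:
        return "active"
    elapsed = discontinued_at - ordered_at
    if elapsed < timedelta(0):
        # A discontinuation dated before the order cannot be interpreted.
        return "inconsistent"
    if elapsed <= window:
        return "probable entry error"
    return "discontinued"


ordered = datetime(2009, 11, 25, 10, 0)
print(classify_discontinuation(ordered, ordered))
print(classify_discontinuation(ordered, ordered + timedelta(days=30)))
```

The same comparison can be written as a date difference in a SQL WHERE clause. The hard part is not the arithmetic but deciding what the cutoff should be.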

Don’t get me started on the measurements. “The measurement’s wrong? Why, let’s just enter a new measurement and leave the wrong one there! Flags and deleting are for sissies!”

November 25, 2009

MetaMap Tools update

Filed under: Code,Metamap — Dr. H @ 8:36 pm

My first edition of metamap_tools had several problems and inefficiencies. I’ll keep updating it as I use it and refine it. It currently creates a process every 500 lines and, instead of keeping track of line IDs itself, relies on MetaMap to do so.
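
The "one process every 500 lines" idea can be sketched roughly as follows. This is not the code in metamap_tools: it is a simplified illustration written for Python 3 (the package itself targets Python 2.6), and the names batches and run_batch are made up for this example.

```python
import shlex
import subprocess
from itertools import islice

BATCH_SIZE = 500  # lines handed to each process, as described above


def batches(lines, size=BATCH_SIZE):
    """Yield successive lists of at most `size` lines."""
    iterator = iter(lines)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def run_batch(command, batch):
    """Start one fresh process for a batch and return its standard output.

    `command` is a command-line string such as the METAMAP_BINARY setting.
    """
    result = subprocess.run(
        shlex.split(command),
        input="\n".join(batch) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# 1200 input lines are split into batches of 500, 500 and 200.
print([len(batch) for batch in batches(range(1200))])
```

Starting a new process per batch means a crashed or hung instance costs at most one batch of work, at the price of paying the startup time again for every batch.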

The current edition is much, much less prone to die suddenly from a broken pipe and in fact I haven’t seen it die… yet.

Automated Retrieval Console

Filed under: Classification — Dr. H @ 1:37 am

The Automated Retrieval Console (ARC) from MAVERIC looks like an incredibly promising tool. It does not do anything revolutionary, but it automates a common and tedious clinical data warehouse task: creating classifiers to find records based on clinical text.

It allows the developer to isolate the clinical task, i.e. labeling the original records, and trains a battery of classifiers simultaneously so the developer/informatician can choose the best one easily. I’m looking forward to implementing this on our data warehouse. I believe it will reduce the workload of our warehouse team while simultaneously improving their productivity.

November 23, 2009

MetaMap tools

Filed under: Code,Metamap — Dr. H @ 6:20 pm

MetaMap is my bread and butter. I use it every day, all day, thanks to the wonderful folks at the NLM. You can use MetaMap through the SKR server at the NLM, which I do regularly.

Unfortunately, the SKR server does not always work, and the machines that process citations can be slow. That’s why I try to process my own text as much as I can. I use an Ubuntu Linux installation in a VMware virtual machine that has four CPU cores and 2 GB of RAM allocated on my Mac Pro.

My usual workflow produces long lists of sentences to be passed to MetaMap, so I use the NLM Scheduler’s “single line delimited input with ID” format a lot. The downloadable version of MetaMap can’t handle that format, so I wrote my own mini-scheduler that understands the format and controls a variable number of MetaMap instances to process the records as fast as it can. It’s written in Python 2.6 and is quite simple and easy to modify, so I’m publishing it under the GPL because someone, somewhere, might have a use for it.

Modify the following line in the script to point to your MetaMap installation:

METAMAP_BINARY="/opt/public_mm/bin/metamap08 -iDN"

The metamap_tools package is available from GitHub.
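
For readers who have not seen the input format, here is a rough sketch of the two jobs a mini-scheduler like this has to do: read the records and divide them among several instances. It assumes each line has the form ID|text, with a pipe between the identifier and the sentence. The function names are invented for this example and are not the ones used in metamap_tools.

```python
def parse_record(line):
    """Split one 'ID|text' line into a (record_id, text) pair."""
    record_id, separator, text = line.rstrip("\n").partition("|")
    if not separator or not record_id:
        raise ValueError("expected 'ID|text', got %r" % line)
    return record_id, text


def assign_to_workers(records, worker_count):
    """Deal records out round-robin, one list per worker."""
    return [records[i::worker_count] for i in range(worker_count)]


lines = [
    "0001|Patient denies chest pain.\n",
    "0002|History of type 2 diabetes.\n",
    "0003|No known drug allergies.\n",
]
records = [parse_record(line) for line in lines]
for worker, share in enumerate(assign_to_workers(records, 2)):
    print(worker, [record_id for record_id, _ in share])
```

Only the first pipe is treated as the separator, so a sentence that itself contains a pipe character stays intact.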

Mine That Record!

Filed under: General musings — Dr. H @ 5:20 pm

Welcome to Mine That Record! I am a health informatics researcher working on using patient data to perform high-quality clinical research on a large scale. As it turns out, this is surprisingly difficult to do.

I plan to use this blog to chronicle my adventures and misadventures in actually carrying this out.
