Josh Cooper
 Pub. Date: 2024.04.08


Introduction

Digital Imaging and Communications in Medicine (DICOM) is a technical standard for the digital storage and transmission of medical images and related information. If a patient receives a mammogram, the image made by the imaging machine is saved and transmitted along with their name, age, and various other pieces of personally identifying information.

Orthanc is a free, open-source, lightweight server for storing and transmitting DICOM images. One of its excellent features is a plugin API, and plugins are a very common way of extending or modifying the features of software.

In medical research, researchers must follow the privacy laws set forth in the HIPAA Privacy Rule. This means Protected Health Information (PHI) is required by law to be scrubbed from such images before they can be used in any form of research. One form of research that needs images like these in great quantities is AI research, where hundreds of thousands of images, if not millions, are needed to adequately train a neural network.

Thus it falls upon researchers to remove PHI from DICOM images in some fashion or another. This is where our Orthanc Filter plugin comes in. Until now, Dr. Rajapakshe had to manually scrub data from thousands of files. With a little configuration by way of a JSON file, he can automate this process and save himself hundreds of hours of work that can be spent on more meaningful things.

 

Description

This was a capstone project at UBC, built largely by myself and two other developers who know backend development quite well. Brian Zhou and Iwan Levin designed the relational database models and wrote the SQL needed to integrate researcher requirements. I handled most other development tasks, from designing the software architecture and CMake files to implementing the code that hard links DICOM images across directories for easy access by particular image data, such as a truncated date of birth.

The plugin lets medical researchers configure a JSON file, stored on the system running the Orthanc server software, to specify which DICOM data to erase, which dates to truncate, and how to truncate them. Additional settings control how files are organized on the storage medium.
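To make date truncation concrete, here is a minimal sketch of how it could work. It is not the plugin's actual code. It assumes the date is a DICOM DA value in YYYYMMDD form, and the names `DatePrecision` and `truncate_da` are hypothetical, as is the choice to reset discarded fields to "01".

```cpp
#include <string>
#include <string_view>

// Hypothetical setting: how much of the date to keep.
// The plugin's real option names are not shown in this post.
enum class DatePrecision { Year, Month, Day };

// Truncate a DICOM DA value (YYYYMMDD) by resetting the discarded
// fields to "01". Anything that is not exactly eight digits is
// returned unchanged rather than guessed at.
std::string truncate_da(std::string_view da, DatePrecision keep) {
    if (da.size() != 8) return std::string(da);
    for (char c : da) {
        if (c < '0' || c > '9') return std::string(da);
    }
    std::string out(da);
    switch (keep) {
        case DatePrecision::Year:  out.replace(4, 4, "0101"); break;  // keep YYYY
        case DatePrecision::Month: out.replace(6, 2, "01");   break;  // keep YYYYMM
        case DatePrecision::Day:   break;                             // keep as-is
    }
    return out;
}
```

Keeping the value eight characters long means the element's length field does not change, so the rest of the file can stay where it is.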

Objectives

  • Develop a well-tested C++ plugin for Orthanc servers.
  • Implement features for date truncation, PHI removal, and filesystem organization.
  • Provide solid documentation (both user & developer) and an easily extensible codebase.

Highlights

Successes

  • A robust and flexible architecture that allowed development to proceed easily with minimal refactoring as user requirements were amended.
  • Unit testing that was both modular and comprehensive, and proved quite valuable when a few small refactors were necessary.
  • Extremely fast processing speeds through clever data processing and DICOM tag matching in a simple buffer processing loop.
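
For readers curious what a buffer processing loop with tag matching can look like, here is a rough sketch rather than the plugin's real implementation. It assumes the buffer holds only Explicit VR Little Endian data elements (no 128-byte preamble or "DICM" prefix), that the targeted elements hold text values, and that sequences with undefined length are out of scope. The names `Tag` and `blank_tags` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <string_view>
#include <utility>
#include <vector>

using Tag = std::pair<std::uint16_t, std::uint16_t>;  // (group, element)

static std::uint16_t read_u16(const std::vector<std::uint8_t>& b, std::size_t i) {
    return static_cast<std::uint16_t>(b[i] | (b[i + 1] << 8));
}

static std::uint32_t read_u32(const std::vector<std::uint8_t>& b, std::size_t i) {
    return static_cast<std::uint32_t>(b[i]) |
           (static_cast<std::uint32_t>(b[i + 1]) << 8) |
           (static_cast<std::uint32_t>(b[i + 2]) << 16) |
           (static_cast<std::uint32_t>(b[i + 3]) << 24);
}

// VRs encoded with two reserved bytes followed by a 32-bit length.
static bool has_long_length(std::string_view vr) {
    static constexpr std::string_view kLong[] = {
        "OB", "OD", "OF", "OL", "OV", "OW", "SQ",
        "SV", "UC", "UN", "UR", "UT", "UV"};
    for (std::string_view v : kLong) {
        if (v == vr) return true;
    }
    return false;
}

// Overwrite the value of every matching element with spaces.
// Returns how many elements were blanked. Stops early, without
// reading out of bounds, if the buffer is truncated.
std::size_t blank_tags(std::vector<std::uint8_t>& buf, const std::set<Tag>& targets) {
    std::size_t pos = 0;
    std::size_t blanked = 0;
    while (buf.size() - pos >= 8) {  // tag (4) + VR (2) + short length (2)
        const Tag tag{read_u16(buf, pos), read_u16(buf, pos + 2)};
        const std::string_view vr(
            reinterpret_cast<const char*>(buf.data() + pos + 4), 2);

        std::size_t header = 8;
        std::uint32_t len = 0;
        if (has_long_length(vr)) {
            if (buf.size() - pos < 12) break;  // header itself is truncated
            header = 12;
            len = read_u32(buf, pos + 8);
        } else {
            len = read_u16(buf, pos + 6);
        }

        if (len == 0xFFFFFFFFu) break;                // undefined length: not handled here
        if (len > buf.size() - pos - header) break;   // value runs past the buffer

        if (targets.count(tag) != 0) {
            for (std::size_t i = 0; i < len; ++i) buf[pos + header + i] = ' ';
            ++blanked;
        }
        pos += header + len;
    }
    return blanked;
}
```

Every advance of `pos` is checked against the remaining buffer size before any byte is read or written, so a malformed length field ends the loop instead of walking off the end of the buffer.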

Challenges

  • Simple math, apparently. I overcomplicated the calculations for moving through the buffer and working out its size, which caused a segmentation fault. Because of other time constraints (i.e. exams & holidays), it took roughly a month and a half to find.
  • Team management. There was a fourth team member who, for reasons still unknown, wouldn’t contribute to development or communicate with other members. I tried everything I could think of to communicate and provide support in order to help him contribute. All attempts failed. He provided rubric-grade project management, i.e. meeting minutes and reports to the professor.
  • Obtaining code reviews on pull requests from team members. The other members were new to C++ and opted to defer to me on what was good, best, or problematic.

Results

The result is an extremely fast, configurable, and end-user-friendly system that researchers can use to automate the redaction of PHI from DICOM images as they are received or generated.