Kick Out Java Fundamental: How to improve DICOM

While DICOM (Digital Imaging and Communications in Medicine) provides a necessary format for handling device connectivity, it could be improved by addressing the following concerns.

DICOM is based on legacy serial communications

DICOM originated as a way to standardize streaming imaging data, such as a series of ultrasound images. As such, the structure of the format assumes that the length of a tag's value, and of the file as a whole, cannot always be determined. This is of course untrue, since the length of a block, tag, or file can certainly be computed somewhere along the line. For example, the device acquiring the data no doubt buffers data as it is collected, in which case it can compute the length. I do concede that the total number of “frames” collected during a “live session” can be unknown, for example, the total number of frames collected by an ultrasound device while live-streaming. The size of each frame, however, can be determined, much as the size of each slice collected by a CT scanner is known.
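To make the point concrete, here is a minimal sketch of the two ways a sequence can be framed in the Explicit VR Little Endian encoding: with a length computed from buffered data, or with the undefined-length marker 0xFFFFFFFF followed by a Sequence Delimitation Item. The function names are my own, the sequence tag (0008,1140) is picked only for illustration, and the payload is a single empty item.

```python
import struct

UNDEFINED_LENGTH = 0xFFFFFFFF
# Sequence Delimitation Item: tag (FFFE,E0DD) followed by a zero length.
SEQ_DELIMITER = struct.pack("<HHI", 0xFFFE, 0xE0DD, 0)


def sq_header(group, element, length):
    # Tag (4 bytes), VR "SQ" (2), reserved (2), 32-bit length (4).
    return struct.pack("<HH2sHI", group, element, b"SQ", 0, length)


def encode_defined(group, element, payload):
    """The producer buffered the payload, so the length is simply known."""
    return sq_header(group, element, len(payload)) + payload


def encode_undefined(group, element, payload):
    """The length is declared unknown; a delimiter marks the end instead."""
    return sq_header(group, element, UNDEFINED_LENGTH) + payload + SEQ_DELIMITER


# One empty item, tag (FFFE,E000) with zero length, as the sequence payload.
empty_item = struct.pack("<HHI", 0xFFFE, 0xE000, 0)
defined = encode_defined(0x0008, 0x1140, empty_item)
undefined = encode_undefined(0x0008, 0x1140, empty_item)
```

A producer that buffers its items before writing can always use the first form. The second form costs the reader a scan for the delimiter before it knows where the sequence ends.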

DICOM is NOT object oriented

DICOM was created at a time when object-oriented programming (OOP) was itself being developed. As such, DICOM does not adhere to now well-developed OOP structures, making it difficult to implement the DICOM specification in object-oriented programming languages such as C++. The developer must bridge the gap between the language's object model and an old-style file format specification based on a complex header-body-footer layout, rather than working with something as well-structured and easily parsable as XML, for example.

DICOM splits a single 3D volume scan into hundreds of 2D files

High-resolution CT and MRI scans are split into several hundred files. This is because each scan is typically stored as a stack of 2D DICOM image files. As commonly used, DICOM doesn’t store a true 3D volume in a single file, which makes file management cumbersome for volumetric data. As 3D data has become more and more important over the last few decades, file formats designed for 3D data, such as NIfTI, ANALYZE, and others, have become more prominent. These formats allow all image slices to be stored in a single file, reducing the number of files significantly.
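As a rough illustration of the bookkeeping this pushes onto every reader, the sketch below reassembles a volume from per-slice records. The Slice class and its z_mm field are hypothetical stand-ins for a decoded DICOM file and its slice position; no real DICOM parsing happens here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Slice:
    z_mm: float               # hypothetical: position along the scan axis
    pixels: List[List[int]]   # one 2D image, rows x columns


def assemble_volume(slices: List[Slice]) -> List[List[List[int]]]:
    """Sort slices by position and stack them into a [z][row][col] volume."""
    ordered = sorted(slices, key=lambda s: s.z_mm)
    return [s.pixels for s in ordered]


# Files rarely arrive in anatomical order, so the reader has to sort them.
unordered = [Slice(2.5, [[3]]), Slice(0.0, [[1]]), Slice(1.25, [[2]])]
volume = assemble_volume(unordered)
```

With a single-file volumetric format, the ordering is fixed by the file layout and this step disappears.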

DICOM uses lossy compression on medical images

Image data often requires substantial storage and, when transmitted over a network, can require significant time as well. As such, DICOM incorporated various compression schemes in an effort to reduce the storage and time required for image data. The problem, however, is that DICOM included “lossy” compression algorithms that in effect reduce image quality in an effort to save space and time. This is a problem when you consider that DICOM manages a person's medical images. It is entirely possible that the compression scheme may obscure a tumor within a CT scan, causing the radiologist to miss a diagnosis. Medical image data should never be compressed with a scheme that “loses” data. Lossy transfer syntaxes such as JPEG Baseline remain part of DICOM, which makes me wonder how they made it into the specification in the first place.
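The difference between the two kinds of scheme can be shown with a toy example. This is only a sketch: zlib stands in for a lossless codec and dropping low-order bits stands in for a lossy one. Neither is meant to represent a specific DICOM compression scheme, and the sample values are made up.

```python
import struct
import zlib


def lossless_roundtrip(samples):
    """Compress and decompress 16-bit samples; every value survives."""
    raw = struct.pack(f"<{len(samples)}H", *samples)
    restored = zlib.decompress(zlib.compress(raw))
    return list(struct.unpack(f"<{len(samples)}H", restored))


def lossy_roundtrip(samples, dropped_bits=4):
    """Discard the low-order bits of each sample, as a crude lossy stand-in."""
    return [(s >> dropped_bits) << dropped_bits for s in samples]


samples = [1000, 1003, 1010, 1040]   # hypothetical 16-bit pixel values
exact = lossless_roundtrip(samples)
degraded = lossy_roundtrip(samples)
```

Here `exact` matches the input, while in `degraded` the first two samples, which started out different, end up with the same value. Small differences between neighboring pixels are exactly what a lossy scheme is free to throw away.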

DICOM duplicates patient info for each image file

There is a significant amount of redundancy within DICOM. Each 2D image slice in a 3D volume duplicates all patient information. If all of the slices were stored in a single file, a single instance of patient data would suffice.
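A back-of-the-envelope calculation shows the scale of the duplication. The numbers below are hypothetical, chosen only to illustrate the arithmetic: a 300-slice scan with 2 KiB of patient and study metadata repeated in each file.

```python
def redundant_bytes(slice_count, shared_header_bytes):
    """Bytes spent repeating a shared header in every file after the first."""
    return max(slice_count - 1, 0) * shared_header_bytes


# Hypothetical scan: 300 slices, 2 KiB of shared metadata per file.
wasted = redundant_bytes(300, 2048)
print(f"{wasted} bytes of duplicated metadata")
```

The absolute figure is small next to the pixel data, but every copy must also be parsed, validated, and kept consistent with the others.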

DICOM changes compression and decompression mid-stream

The DICOM specification supports the notion of changing compression schemes mid-stream. This stems from DICOM's origins in streaming devices. However, a file that changes compression arbitrarily makes software implementations difficult and compromises efficiency.

DICOM uses variable-sized block headers

Again, as a result of being based on legacy streaming devices, DICOM data block headers support variable sizes. This complicates software implementations and again compromises efficiency.
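As a sketch of what this means for a parser, consider the Explicit VR Little Endian encoding, where the header is 8 bytes for some value representations (VRs) and 12 bytes for the rest. The function name and return shape below are my own. The set of short-form VRs follows my reading of the standard, so verify it against the edition you target.

```python
import struct

# VRs whose header carries a 16-bit length right after the VR (8 bytes total).
# Every other VR has 2 reserved bytes and a 32-bit length (12 bytes total).
SHORT_VRS = {
    "AE", "AS", "AT", "CS", "DA", "DS", "DT", "FL", "FD", "IS", "LO",
    "LT", "PN", "SH", "SL", "SS", "ST", "TM", "UI", "UL", "US",
}


def parse_header(buf, offset=0):
    """Return (tag, vr, value_length, header_size) for one data element."""
    group, element, vr_bytes = struct.unpack_from("<HH2s", buf, offset)
    vr = vr_bytes.decode("ascii")
    if vr in SHORT_VRS:
        (length,) = struct.unpack_from("<H", buf, offset + 6)
        header_size = 8
    else:
        (length,) = struct.unpack_from("<I", buf, offset + 8)
        header_size = 12
    return (group, element), vr, length, header_size


# A short-form element (PN) and a long-form element (OW).
name = struct.pack("<HH2sH", 0x0010, 0x0010, b"PN", 8) + b"DOE^JOHN"
pixels = struct.pack("<HH2sHI", 0x7FE0, 0x0010, b"OW", 0, 4) + b"\x00" * 4
```

The parser cannot know how many header bytes to read until it has already decoded the VR, so a header can never be fetched in one fixed-size read.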

DICOM uses interpret-as-you-go methodology

The DICOM specification requires an “interpret as you go” methodology in that you must read and decode each block of data sequentially. This is again due to its origins in a streaming architecture. In many cases, this methodology requires the file pointer to backtrack in order to read properly. Backtracking reduces efficiency on hard disks, since that technology is designed to read and write data efficiently in large blocks. Moving the file pointer back and forth to read small amounts of data is very inefficient.
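The sequential nature shows up even in the simplest encoding. The sketch below assumes Implicit VR Little Endian, where every header is a 4-byte tag plus a 4-byte length, and it ignores undefined lengths and nested sequences. The helper names are hypothetical. Because there is no index, reaching an element means decoding the header of every element before it.

```python
import struct


def element(group, elem, value):
    """Encode one element: tag (4 bytes), 32-bit length (4 bytes), value."""
    return struct.pack("<HHI", group, elem, len(value)) + value


def find_element(buf, target):
    """Walk the elements in order; return (value, headers_read)."""
    offset = 0
    headers_read = 0
    while offset + 8 <= len(buf):
        group, elem, length = struct.unpack_from("<HHI", buf, offset)
        headers_read += 1
        value_start = offset + 8
        if (group, elem) == target:
            return buf[value_start:value_start + length], headers_read
        # The next header's position depends on this element's length.
        offset = value_start + length
    return None, headers_read


dataset = (
    element(0x0008, 0x0060, b"CT")
    + element(0x0010, 0x0010, b"DOE^JOHN")
    + element(0x0010, 0x0020, b"1234")
)
value, headers_read = find_element(dataset, (0x0010, 0x0020))
```

Finding the third element required three header reads. On a real file, the element a viewer usually wants, the pixel data, sits at the very end.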

A format that lets you put anything into a block of data is not a standard

DICOM allows a data producer to insert any kind of data into private tags. This is analogous to inserting any kind of file into a ZIP file. Once you open up the ZIP file, you still need to know how to interpret the data. As such, even though DICOM is a “standard”, it is still possible that two “DICOM compliant” devices may in fact not be capable of communicating with each other.
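A reader can at least recognize private data, since private elements use an odd group number. The sketch below separates the two kinds. The dictionary layout, the function names, and the vendor payload are hypothetical.

```python
def is_private(tag):
    """Private data elements use an odd group number."""
    group, _element = tag
    return group % 2 == 1


def split_elements(elements):
    """Separate elements a generic reader can interpret from opaque ones."""
    standard = {t: v for t, v in elements.items() if not is_private(t)}
    private = {t: v for t, v in elements.items() if is_private(t)}
    return standard, private


elements = {
    (0x0010, 0x0010): b"DOE^JOHN",          # Patient's Name, a standard element
    (0x0029, 0x1010): b"\x01\x02\x03\x04",  # hypothetical vendor payload
}
standard, private = split_elements(elements)
```

Recognizing a private element is the easy part. Without the vendor's documentation, the four bytes in `private` remain just four bytes.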

DICOM supports every data type under the sun, but only a few are used in practice

DICOM wanted to support all kinds of medical image data, even data that would only come along in the future. As such, it supports the storage of all data types. The problem is that in practice, it is extremely complicated to write software that can read all types of data. Further, considering that most image data is limited to only a handful of data types (such as shorts or unsigned shorts), it makes little sense to support everything else. Supporting all data types makes DICOM reader implementations complex and inefficient.
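Even for plain integer pixels, a reader has to pick the sample type at run time from two header attributes, Bits Allocated (0028,0100) and Pixel Representation (0028,0103), where 0 means unsigned and 1 means two's complement. The sketch below covers only the simple integer cases and assumes little-endian, uncompressed data. The function name and lookup table are my own.

```python
import struct

# (bits_allocated, pixel_representation) -> struct format code
_FORMATS = {
    (8, 0): "B", (8, 1): "b",
    (16, 0): "H", (16, 1): "h",
    (32, 0): "I", (32, 1): "i",
}


def decode_pixels(raw, bits_allocated, pixel_representation):
    """Decode little-endian integer samples according to the header values."""
    code = _FORMATS[(bits_allocated, pixel_representation)]
    count = len(raw) // (bits_allocated // 8)
    return list(struct.unpack(f"<{count}{code}", raw))


# The same bytes mean different things depending on the declared type.
raw = struct.pack("<3h", -1000, 0, 1200)
as_signed = decode_pixels(raw, 16, 1)
as_unsigned = decode_pixels(raw, 16, 0)
```

Six combinations already need a lookup table, and this leaves out floating-point data, packed bits, color samples, and every compressed encoding.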

DICOM reports are crammed into images

In practice, reports (i.e., radiology reports) are crammed into DICOM-compatible image data. DICOM was envisioned at a time before formats such as PDF and DOC were robust and popular. This makes encoding, decoding, and longevity of reports complicated and inefficient.

DICOM is very difficult to implement in software

Taking all these problems into consideration, DICOM is a beast to implement in software. There are so many pitfalls that a robust implementation is almost a pipe dream. Any implementation must include substantial quality assurance testing to ensure robustness.

Summary

Despite many “problems,” DICOM is widely used and has provided substantial benefit to patients, physicians, and healthcare organizations. However, I wonder how much better healthcare could be if these problems were eliminated from DICOM.

Thursday, 26 September 2013