|J Pathol Inform 2019,
Dual-Personality DICOM-TIFF for whole slide images: A migration technique for legacy software
David A Clunie
Pixelmed Publishing, LLC, Bangor, Pennsylvania, USA
|Date of Submission||02-Dec-2018|
|Date of Acceptance||06-Mar-2019|
|Date of Web Publication||03-Apr-2019|
Dr. David A Clunie
Pixelmed Publishing, LLC, Bangor, Pennsylvania 18013
Source of Support: None, Conflict of Interest: None
| Abstract|| |
Despite recently organized Digital Imaging and Communications in Medicine (DICOM) testing and demonstration events involving numerous participating vendors, it is still the case that scanner manufacturers, software developers, and users continue to depend on proprietary file formats rather than adopting the standard DICOM whole slide microscopic image object. Many proprietary formats are Tag Image File Format (TIFF) based, and existing applications and libraries can read tiled TIFF files. The sluggish adoption of DICOM for whole slide image encoding can be temporarily mitigated by the use of dual-personality DICOM-TIFF files. These are compatible with the installed base of TIFF-based software, as well as newer DICOM-based software. The DICOM file format was deliberately designed to support this dual-personality capability for such transitional situations, although it is rarely used. Furthermore, existing TIFF files can be converted into dual-personality DICOM-TIFF without changing the pixel data. This paper demonstrates the feasibility of extending the dual-personality concept to multiframe-tiled pyramidal whole slide images and explores the issues encountered. Open source code and sample converted images are provided for testing.
Keywords: Digital Imaging and Communications in Medicine, Tag Image File Format, whole slide imaging
|How to cite this article:|
Clunie DA. Dual-Personality DICOM-TIFF for whole slide images: A migration technique for legacy software. J Pathol Inform 2019;10:12
| Introduction|| |
Whole slide imaging (WSI) entails scanning an entire microscope slide at high spatial resolution and encoding the result in a manner that is amenable to viewing by means of software that simulates the use of an optical microscope. As WSI becomes more popular, significant interoperability issues have arisen as scanner manufacturers and software developers continue to depend on proprietary formats, even if they are based on commonly known formats such as Tag Image File Format (TIFF). The Digital Imaging and Communications in Medicine (DICOM) standard was extended to support WSI in 2010, and recently, demonstrations of the interoperability of commercial implementations of DICOM WSI have begun. However, the incorporation of support for the DICOM WSI format in receiving applications, especially those used for research, has been sluggish, perhaps because the underlying libraries used by those applications do not yet support the DICOM WSI format.
A little-known feature of the DICOM File Format is that it was specifically designed to allow its pixel data content to be shared with another file format, by allowing for an unspecified preamble before the DICOM content begins. This mechanism can be used to create DICOM WSI files that are also TIFF WSI files, but which share the same compressed pixel data bytes encoding the tiles that compose the image. Since both TIFF and DICOM support the use of the same compression schemes, such as traditional baseline lossy JPEG, and a similar tiling scheme, the compressed pixel data shared in each tile of a dual-personality DICOM-TIFF can be identical to the compressed pixel data of a source TIFF image, thereby permitting transcoding of the format without further loss than present in the original source.
Thus, the problem of a receiving system being unable to read an incompatible multiframe-tiled compressed image file format can be solved by creating a single file that is formatted as both the incompatible image file format and the compatible image file format, sharing the same image pixel data.
This approach allows a scanner vendor or conversion software implementer to save the very large high-resolution layer of the WS image only once, in both formats at the same time, and thus gain all of the benefits of using the DICOM format without sacrificing compatibility with the installed base of TIFF-based software.
| Technical Background|| |
TIFF has evolved from its earliest public release as a desktop publishing format to become a relatively mature and widely implemented format, used for many professional and nonmedical applications. Its current specification has been stable for many years and includes the tile-based pixel data representation that is fundamental to WSI applications, even though it is not included in the baseline requirements for TIFF implementations. The extensions relevant to WSI are the BigTIFF extension and the updated JPEG encoding. Like DICOM, TIFF is encoded as name–value pairs using defined “tags” with a standard interpretation, with the opportunity to add additional proprietary tags. TIFF allows for more than one image to be present in a single file. The images are described by image file directories (IFDs), which contain lists of tags and values, some of which are byte offsets from the start of the file to other content, including compressed or uncompressed pixel data.
TIFF, and particularly TIFF extended with larger than 32-bit byte offsets (BigTIFF), has proven popular as the basis for various vendors' proprietary formats. For example, the Leica/Aperio SVS format is based on TIFF/BigTIFF, and SVS files are valid TIFF files. The common pattern among different vendors using TIFF has been to use the tiled rather than stripped mechanism for encoding the pixel data and to encode multiple images within the same file, one being the base (high resolution) layer and the others being other layers of the pyramid or images for other purposes, such as slide labels. Numerous authors and libraries have attempted to define a set of “TIFF rules” (profiles) for constraining some of the choices that may be made, such as whether or not to use IFDs or sub-IFDs for lower resolution layers of the usual pyramid, how many pyramid layers to include, in what order, and with what ancillary information. It seems that none of these has really succeeded in displacing the scanner vendors' own proprietary choices.
When DICOM was initially released in 1993, extending the earlier ACR-NEMA standards, images were expected to be interchanged using a dedicated network protocol. There was no “file format” defined per se, although it had already become common practice to persist network data sets on disk. The first formal DICOM file format standard was published in 1995, and initially targeted cardiology and ultrasound applications. The ultrasound community in particular had already dabbled with storing images on interchangeable media, and a TIFF-based format called DEFF had been defined. Given this investment, the vendors of ultrasound devices were interested in creating DICOM files that were compatible with the TIFF and QuickTime files that they were already creating. Accordingly, the vendors' representatives proposed a mechanism that used a preamble of unspecified content (128 bytes in length) at the start of the file before the DICOM content began and a means of hiding non-DICOM content within a standard data element, with the intention of sharing the pixel data; this became a feature of the DICOM PS3.10 file format when it was released. However, documentation of this feature was sparse. Implementations were few, and those were restricted to single frame uncompressed files, as far as this author is aware. Early versions of the author's dicom3tools utilities were able to add a TIFF header when writing a DICOM file. Such files were created and publicly distributed, for example, for the JPEG 2000 medical image test data set. Others have also occasionally referenced this capability.
For WSI, additional aspects of the DICOM encoding are relevant. DICOM WSI images are always multiframe (with each frame corresponding to a tile) and each frame is always compressed, unlike typical radiology DICOM images, which are often exchanged as single-frame images, and often without compression. Fortunately, each WSI compressed frame is usually encoded as a single fragment, i.e., a single contiguous range of bytes. This is exactly the same encoding as TIFF uses for tiled image encoding. The same compression schemes such as baseline DCT Huffman encoded JPEG or JPEG 2000 are usually used. The means of encoding the DICOM header is exactly the same regardless of the image type. Unlike TIFF, DICOM does not use physical byte offset pointers to organize its data elements or pixel data.
A TIFF file begins with the first bytes of a file, but only a relatively small amount of information is required in a fixed location at the start of the file, and this can be encoded before the required DICOM content begins after 128 bytes. The initial TIFF content defines physical byte offsets to the remaining TIFF information, which can be distributed anywhere in the file, including after the DICOM content. DICOM defines a data element, Data Set Trailing Padding (FFFC, FFFC), which is specifically intended for the purpose of encoding such non-DICOM content.
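As a minimal sketch of the layout just described, the first 132 bytes of a little-endian dual-personality file could be assembled as follows. The function name `dual_personality_prefix` and the use of zero padding are assumptions made for illustration; the 128-byte preamble, the "DICM" prefix, and the classic TIFF header (byte order, the number 42, and the offset of the first IFD) follow the respective specifications.

```python
import struct

PREAMBLE_LENGTH = 128  # fixed size of the DICOM PS3.10 preamble


def dual_personality_prefix(first_ifd_offset: int) -> bytes:
    """Return the first 132 bytes of a little-endian DICOM-TIFF file.

    Hypothetical helper: the caller decides where the first IFD will live,
    for example inside Data Set Trailing Padding (FFFC,FFFC).
    """
    # Classic TIFF header: byte order "II", magic number 42, first IFD offset.
    tiff_header = struct.pack("<2sHI", b"II", 42, first_ifd_offset)
    # Zero-fill the rest of the preamble (an assumption of this sketch),
    # then append the DICOM prefix that DICOM readers look for.
    preamble = tiff_header.ljust(PREAMBLE_LENGTH, b"\x00")
    return preamble + b"DICM"
```

The offset passed in is only known once the rest of the DICOM content has been laid out, so in practice this header would likely be written (or patched) last.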
| Approach|| |
TIFF files may be encoded with one or more images, and each image may be encoded as strips or tiles. To create tiled WS images where the tiles are shared between the TIFF and DICOM representations, the tiled approach illustrated in the third column of [Figure 1] is used. The first column of the figure illustrates the approach that has been used in the past to encode typical single frame radiology images, although usually only one strip is used. The second column shows how multiframe compressed or uncompressed images that are not tiled can be encoded using the TIFF strip mechanism; this can be used for most of the other types of images for which DICOM is used.
|Figure 1: Logical organization of TIFF data structures referencing DICOM pixel data for single uncompressed frames as strips, multiframe compressed or uncompressed images as strips, and tiled multiframe compressed or uncompressed images as tiles. DICOM: Digital Imaging and Communications in Medicine, TIFF: Tag Image File Format, IFD: Image file directory|
For clarity, the manner in which the entire file is created is illustrated first for a single-frame image in [Figure 2]. The first column of the figure shows a pure DICOM file encoding, with no TIFF content, and in particular illustrates the empty 128-byte preamble and the absence of the Data Set Trailing Padding data element. DICOM-aware software will recognize that the file is a DICOM file based on the “DICM” bytes beginning after the empty preamble, which is ignored. The second column illustrates the encoding of a dual TIFF-DICOM file when a single TIFF IFD will fit in the preamble, and no Data Set Trailing Padding data element is needed. The third column shows the use of the TIFF IFD Offset to point to the IFD “hidden” in the DICOM Data Set Trailing Padding data element.
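The two recognition mechanisms do not interfere with each other, which the following sketch tries to make concrete. It classifies a file from its first 132 bytes; the function name `personalities` is hypothetical, and the check is deliberately shallow (it does not validate any content beyond the signature bytes).

```python
def personalities(head: bytes) -> set:
    """Report which formats the leading bytes of a file claim to be."""
    found = set()
    # DICOM: the four bytes "DICM" follow the 128-byte preamble.
    if len(head) >= 132 and head[128:132] == b"DICM":
        found.add("DICOM")
    # TIFF: byte-order mark plus 42 (classic) or 43 (BigTIFF) at offset 0.
    if head[:4] in (b"II*\x00", b"MM\x00*"):
        found.add("TIFF")
    elif head[:4] in (b"II+\x00", b"MM\x00+"):
        found.add("BigTIFF")
    return found


# A pure DICOM file has an empty preamble; a dual-personality file does not.
pure_dicom = b"\x00" * 128 + b"DICM"
dual = (b"II*\x00" + (8).to_bytes(4, "little")).ljust(128, b"\x00") + b"DICM"
```

Here `personalities(pure_dicom)` yields only DICOM, while `personalities(dual)` yields both DICOM and TIFF.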
|Figure 2: DICOM data element encoding of plain DICOM file, DICOM-TIFF image in which the IFD fits in the preamble, and DICOM-TIFF image in which the IFD is in the Data Set Trailing Padding and referenced from the TIFF-IFD Offset in the preamble. DICOM: Digital Imaging and Communications in Medicine, TIFF: Tag Image File Format, IFD: Image file directory|
[Figure 3] then shows the solution used for a WS image, in which the entries in the TIFF IFD in DICOM Data Set Trailing Padding are used to point to each of the compressed frames embedded within the DICOM Pixel Data element.
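The values that would populate the TIFF TileOffsets and TileByteCounts tags can be derived by walking the items of the encapsulated DICOM Pixel Data, as in the sketch below. It assumes a little-endian encoding, one fragment per frame, and an in-memory buffer; the function name `fragment_locations` is hypothetical, and `first_item_offset` is assumed to be the position of the Basic Offset Table item, i.e., just after the Pixel Data element's own header.

```python
import struct

ITEM = b"\xfe\xff\x00\xe0"                # (FFFE,E000) Item, little endian
SEQUENCE_DELIMITER = b"\xfe\xff\xdd\xe0"  # (FFFE,E0DD) Sequence Delimitation


def fragment_locations(data: bytes, first_item_offset: int) -> list:
    """Return (absolute offset, length) of each fragment's payload."""
    locations = []
    pos = first_item_offset
    is_offset_table = True  # the first item is the Basic Offset Table
    while True:
        tag = data[pos:pos + 4]
        if tag == SEQUENCE_DELIMITER:
            return locations
        if tag != ITEM:
            raise ValueError(f"unexpected bytes at offset {pos}")
        (length,) = struct.unpack_from("<I", data, pos + 4)
        if not is_offset_table:
            # Payload starts after the 4-byte tag and 4-byte length.
            locations.append((pos + 8, length))
        is_offset_table = False
        pos += 8 + length
```

Because DICOM item lengths are always even, a fragment payload may include a trailing padding byte; a real WSI file would also be far too large to hold in memory, so an implementation would presumably seek within the file instead.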
|Figure 3: DICOM-TIFF image for tiled multiframe compressed or uncompressed images, showing Data Set Trailing Padding TIFF IFD tile offsets referencing DICOM pixel data. DICOM: Digital Imaging and Communications in Medicine, TIFF: Tag Image File Format, IFD: Image file directory|
Given a source of tiles, whether they are produced directly by a scanner or read from some standard or proprietary file format, one can follow the steps outlined in the flow chart of [Figure 4]. Typically, WS images encoded in TIFF files follow some predetermined raster scan order corresponding to their location on the physical slide and are already sorted in an appropriate order. Usually, they will also have been lossy compressed with a standard algorithm that corresponds to a DICOM Transfer Syntax.
|Figure 4: Flow chart illustrating the process of creating dual-personality files, such as in DICOM-TIFF format, from a source of tiles. DICOM: Digital Imaging and Communications in Medicine, TIFF: Tag Image File Format|
| Variations|| |
The original proposals to DICOM to standardize WSI envisaged two opposite extremes. One approach was to encode each tile as a single DICOM file, the method described in the Aperio patents. Alternatively, a single huge frame might be used for the entire image, which could then take advantage of the JPEG 2000 wavelet domain multiresolution decomposition, and be accessed by the JPEG Interactive Protocol. An intermediate solution, using the DICOM multiframe representation to encode either all the tiles of the entire pyramid or all the tiles of a single resolution layer, was then proposed by the author as a compromise, making use of experience gained defining and using multiframe formats for multidimensional computed tomography (CT) and magnetic resonance (MR).
DICOM images have traditionally required that certain attributes of multiframe images are constant for all frames encoded in a single file. This includes the size of the frames as well as the physical size of the pixels with those frames. When WSI was added to DICOM, it was natural to separate each resolution layer of the WSI pyramid, each of which has a different physical pixel size, into separate files. This is the opposite of the approach used when vendors encode TIFF WSI files, which is to encode the entire pyramid in one file.
Therefore, when creating a dual-personality DICOM-TIFF WSI file, one has the choice to either encode only a single layer, with the expectation that a TIFF recipient will perform its own downsampling for viewing, or to find some place to encode the additional lower resolution layers, such that they are visible to a TIFF reader, but hidden from a DICOM reader (which expects separate files).
Preliminary experiments suggested that some TIFF WSI viewing tools failed when the lower layers of the pyramid were not present in the same file. Hence, as a proof of concept, a private DICOM data element was defined in which the additional layers could be encoded and made available as TIFF IFDs. The result is that the files are slightly bulkier than they would otherwise be.
Although the base TIFF standard specifies a mechanism for encoding tiles and permits multiple subimages within the same file, it does not describe how these images should be arranged as pyramids, other than to state that “the first one must be the full-resolution image.” As a consequence, various ad hoc conventions, as well as attempts at defining the “right” way to encode TIFF pyramids, have been described. None of these appears to have become dominant. The most commonly observed pattern is a list of IFDs starting with the base layer and an implicit order of decreasing size, with all layers other than the base layer designated with a NewSubfileType tag value of reduced resolution version, and without the use of trees defined by the SubIFDs tag. These approaches are notably different from the OMERO TIFF pyramid format, which instead encapsulates a single JPEG 2000-compressed bitstream, which contains a multiresolution decomposition. The resolution (microns per pixel [MPP], in WSI terms) is also not typically specified despite the existence of appropriate standard TIFF tags, and various libraries seem to deduce this in relative terms by heuristic means based on the order of the images and the number of pixels in each.
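The commonly observed pattern, a flat chain of IFDs in which every layer after the first is flagged as a reduced resolution version, can be sketched as follows. The helper `pack_ifd` is hypothetical and handles only entries whose value is a single inline LONG; a usable IFD for a tiled image needs many more tags (compression, tile dimensions, tile offsets, and byte counts), which are omitted here. The image sizes and offsets in the example are arbitrary.

```python
import struct

LONG = 4                     # TIFF field type: 32-bit unsigned integer
NEW_SUBFILE_TYPE = 254
IMAGE_WIDTH = 256
IMAGE_LENGTH = 257
REDUCED_RESOLUTION = 1       # bit 0 of NewSubfileType


def pack_ifd(entries: dict, next_ifd_offset: int) -> bytes:
    """Pack a classic little-endian IFD of single-LONG inline values."""
    body = struct.pack("<H", len(entries))           # number of entries
    for tag in sorted(entries):                      # ascending tag order
        body += struct.pack("<HHII", tag, LONG, 1, entries[tag])
    return body + struct.pack("<I", next_ifd_offset)  # 0 ends the chain


# Two-layer chain: a base layer IFD placed at offset 8 is 42 bytes long,
# so it links to a half-size layer at offset 50, which ends the chain.
base = pack_ifd(
    {NEW_SUBFILE_TYPE: 0, IMAGE_WIDTH: 1000, IMAGE_LENGTH: 800}, 50)
reduced = pack_ifd(
    {NEW_SUBFILE_TYPE: REDUCED_RESOLUTION, IMAGE_WIDTH: 500,
     IMAGE_LENGTH: 400}, 0)
```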
Ideally, one would not recompress the supplied pixel data if it is already compressed in a manner that involved loss, whether it was a consequence of deliberate quality reduction, for example, quantization of the transformed JPEG DCT coefficients or chrominance channel downsampling, or inadvertent loss incurred by the use of irreversible DCT transformation or color space conversion from RGB to YCbCr. The choices made by the original creator of the tiles should be reflected in the base (highest resolution) layer of the dual-personality DICOM-TIFF output. Consequently, a pyramidal, JPEG-compressed, tiled TIFF file can be converted without loss into new dual-personality DICOM-TIFF files, one for each input layer, by reusing the compressed bitstream, rather than decompressing and recompressing it.
The question then arises as to what choices to make when encoding de novo decimated pyramid layers, i.e., whether the same compression scheme should be used, what choices of quality factor or target bit rate, color space conversion, and chrominance downsampling should be made. If the tile source supplies ordinary JPEG baseline compressed data that has been color space converted, then these choices are relatively straightforward. However, some vendors, such as Leica/Aperio, have made a conscious choice not to color convert their JPEG compressed images, so the value of RGB for the TIFF PhotometricInterpretation tag in their SVS files really does mean that the components are RGB and not YCbCr; so, care needs to be taken that the transcoded DICOM file metadata reflects that in its Photometric Interpretation value. To communicate the choice of RGB to various JPEG codecs, it may also be useful to include an appropriate APP14 marker segment in the compressed JPEG bitstream. Decimated pyramid layers that are included in the same file do not need to follow that pattern and instead may use the more common YCbCr encoding.
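One way such a marker segment might be added is sketched below. The segment layout follows the widely used Adobe APP14 convention (the identifier "Adobe", a version, two flag words, and a one-byte transform code, where 0 signals that no color transform was applied); the function name `insert_adobe_app14` and the decision to place the segment immediately after the SOI marker are assumptions of this sketch.

```python
def insert_adobe_app14(jpeg: bytes, transform: int = 0) -> bytes:
    """Insert an Adobe APP14 segment right after the SOI marker.

    transform: 0 = none (e.g., components are RGB), 1 = YCbCr, 2 = YCCK.
    """
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    segment = (
        b"\xff\xee"            # APP14 marker
        + b"\x00\x0e"          # segment length 14, counting these two bytes
        + b"Adobe"             # identifier
        + b"\x00\x64"          # version 100
        + b"\x00\x00"          # flags0
        + b"\x00\x00"          # flags1
        + bytes([transform])   # color transform code
    )
    return jpeg[:2] + segment + jpeg[2:]
```

Only the header of the bitstream changes; the entropy-coded data is copied untouched, so no additional loss is introduced.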
An additional minor complication caused by the TIFF encoding of JPEG images is that for compactness, TIFF allows (but does not require) the tables that define various aspects of decompression that are common to all frames to be factored out. The individual strips or tiles can then be sent in so-called “Abbreviated Format.” DICOM requires the opposite and explicitly requires the so-called “Interchange Format,” i.e., with the tables, rather than the abbreviated format. Since the creator of the dual-personality file has complete control of the writing process, the Interchange Format can always be used, but this means that care must be taken when copying tiles from an existing TIFF file, i.e., to make sure to insert the tables if they are absent in the source.
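A minimal sketch of that table insertion follows, assuming the shared tables are available as a tables-only JPEG stream delimited by SOI and EOI (as the TIFF JPEGTables tag supplies them). The function names are hypothetical, and the check for existing tables is simplistic: it only looks for DQT and DHT segments before the first SOS marker.

```python
SOI, EOI = b"\xff\xd8", b"\xff\xd9"


def defines_tables(jpeg: bytes) -> bool:
    """True if DQT and DHT segments both appear before the first SOS."""
    seen = set()
    pos = 2  # skip SOI
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
        code = jpeg[pos + 1]
        if code == 0xDA:              # SOS: entropy-coded data follows
            break
        if code in (0xDB, 0xC4):      # DQT, DHT
            seen.add(code)
        # Segment length is big endian and includes its own two bytes.
        pos += 2 + int.from_bytes(jpeg[pos + 2:pos + 4], "big")
    return seen == {0xDB, 0xC4}


def to_interchange_format(jpeg_tables: bytes, tile: bytes) -> bytes:
    """Merge shared tables into an abbreviated-format tile."""
    if not (jpeg_tables.startswith(SOI) and jpeg_tables.endswith(EOI)):
        raise ValueError("tables stream must be delimited by SOI and EOI")
    if not tile.startswith(SOI):
        raise ValueError("tile is not a JPEG stream")
    if defines_tables(tile):
        return tile                   # already self-contained
    # Drop the EOI of the tables and the SOI of the tile, then concatenate.
    return jpeg_tables[:-2] + tile[2:]
```

The tile grows by the size of the tables, so the TIFF TileByteCounts and the DICOM fragment lengths need to be computed from the merged stream rather than copied from the source.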
Some problems arise producing a DICOM file in the first place if the source of tiles is an existing proprietary TIFF-based file. Although general conversion issues are beyond the scope of this paper, to produce a valid dual-personality file, one has to produce a valid DICOM file in the first place. Some mandatory information may be hard to obtain. Although the TIFF standard defines tags for some DICOM-required metadata, they are not often used. For example, the DateTime tag can be used to communicate the creation date and time, but it is usually absent. The same applies to device make and model, etc. Accordingly, it may be necessary to obtain this information out of band or resort to parsing unstructured or semistructured text that may be present in Image Description, for example.
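As an illustration of the kind of parsing that may be needed, the sketch below splits a pipe-separated `key = value` description string of the sort seen in SVS files. The layout is an assumption based on commonly encountered files rather than a documented format, the function name is hypothetical, and the sample string is made up for the example (only the MPP and magnification values echo those reported for the test file later in this paper).

```python
def parse_description(description: str) -> dict:
    """Extract key/value pairs from a pipe-separated description string."""
    fields = {}
    # The first part is assumed to be a free-text header and is skipped.
    for part in description.split("|")[1:]:
        key, separator, value = part.partition("=")
        if separator:  # ignore parts that are not key = value pairs
            fields[key.strip()] = value.strip()
    return fields


# Illustrative input only; not copied from any real file.
sample = "Example Scanner Library\n1000x800 JPEG/RGB|AppMag = 20|MPP = 0.4990"
fields = parse_description(sample)
mpp = float(fields["MPP"])
```

Values recovered this way are strings of uncertain provenance, so a converter would probably want to validate them before copying them into DICOM attributes.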
Like DICOM, TIFF supports the presence of ICC Profile information in a specific tag, although unlike DICOM, it is a private tag, and it is not required to be supplied. Its use has been observed in some WSI scanner vendors' images; so, the profile needs to be propagated into the appropriate DICOM data element value to achieve color consistency. In its absence, a default profile such as sRGB may be supplied, to at least achieve consistency subsequently, in the absence of information from the vendor. Care should be taken not only to propagate the information into the DICOM data element but also to re-create the TIFF tag in the dual-personality IFD.
| Procedure|| |
As a proof of concept, a Pure Java implementation of the process described was implemented. It makes use of existing functionality in the open source commercially reusable PixelMed Java DICOM toolkit. The new code reads existing TIFF WSI files and converts them to dual-personality DICOM-TIFF files that comply with the DICOM Visible Light Whole Slide Microscopy Image Storage SOP Class. The entire set of tiles provided in the source file is encoded in the order in which they are supplied, and the DICOM header describes them with a Dimension Organization of TILED_FULL since they are assumed to be nonsparse and in a predictable order.
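With a nonsparse, predictably ordered set of tiles, the location of each frame follows from its index alone, which is what makes a per-frame position list unnecessary. The sketch below assumes the tiles run left to right and then top to bottom, and returns zero-based pixel offsets of the tile's top left corner; the function name and the example dimensions are hypothetical.

```python
import math


def tile_origin(frame_index: int, total_columns: int, total_rows: int,
                tile_columns: int, tile_rows: int) -> tuple:
    """Return the zero-based (x, y) pixel offset of a tile's top left corner."""
    tiles_across = math.ceil(total_columns / tile_columns)
    tiles_down = math.ceil(total_rows / tile_rows)
    if not 0 <= frame_index < tiles_across * tiles_down:
        raise IndexError("frame index outside the tiled matrix")
    # Row-major order: the column index varies fastest.
    tile_row, tile_column = divmod(frame_index, tiles_across)
    return tile_column * tile_columns, tile_row * tile_rows


# A 1000 x 600 pixel layer with 256 x 256 tiles has 4 x 3 = 12 frames.
origin = tile_origin(5, 1000, 600, 256, 256)
```

Tiles on the right and bottom edges extend past the image boundary whenever the image size is not a multiple of the tile size, which is why the tile counts are rounded up.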
For each image found in the source TIFF file, a new DICOM-TIFF file is written, i.e., any original pyramidal layers encountered in the source TIFF file are written as separate new files.
Each DICOM-TIFF written, however, also includes a newly created set of decimated pyramidal layers as described earlier, encoded within a private DICOM data element. The first TIFF IFD describes the base (highest resolution) layer for that file. Successive IFDs are written for each decimation and flagged with a standard NewSubfileType tag with a value indicating a reduced-resolution image. The dimensions (ImageWidth and ImageLength) make apparent the relative size of each successive layer. The X and Y Resolution tags are either:
- Given constant values of 1, with an unspecified type, defining square pixels of unknown size, as seems to be the common practice among the existing WSI TIFF file creators;
- Or, an assumed value, or a value extracted from any proprietary ImageDescription string, is provided for the base layer and then divided by two for each successive layer.
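The second option can be sketched as a small calculation, assuming the base layer pixel spacing is known in microns per pixel (MPP) and that the resolution is to be expressed in pixels per centimeter (a ResolutionUnit of 3). The function name is hypothetical.

```python
def layer_resolutions(base_mpp: float, layers: int) -> list:
    """Pixels per centimeter for each layer, halving at every decimation."""
    base = 10_000.0 / base_mpp  # 1 cm = 10,000 microns
    return [base / (2 ** level) for level in range(layers)]


# A base layer at 0.5 MPP with two decimated layers.
resolutions = layer_resolutions(0.5, 3)
```

TIFF stores XResolution and YResolution as RATIONAL values (a numerator and a denominator), so each result would still need to be converted to a suitable pair of integers before being written.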
The first TIFF IFD is encoded using the Photometric Interpretation and JPEG bitstream as supplied, except as modified to include an APP14 segment and tables (Interchange Format), to avoid any loss caused by color space conversion or recompression. Subsequent layers are compressed de novo with baseline JPEG, YCbCr conversion, and chrominance downsampling since loss has already been incurred during decimation. This may result in a Photometric Interpretation that is different for the base layer than for the other layers. For example, converted RGB JPEG compressed SVS files may have a base layer of RGB and down-sampled layers of YBR_FULL_422.
Pixel data in the JPEG Abbreviated Format is converted to the JPEG Interchange Format. BigTIFF rather than TIFF files are created if necessary, i.e., if any file offsets cannot be encoded in 32 bits. An experimental extended offset table may be written in a private DICOM data element as proposed in a recent DICOM CP.
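The choice between the two header forms might look like the sketch below, which assumes a little-endian file and that the caller already knows the largest offset that must be stored. The function name is hypothetical; the BigTIFF header layout (version 43, an offset size of 8, a reserved zero, and a 64-bit first IFD offset) follows the BigTIFF design.

```python
import struct

MAX_CLASSIC_OFFSET = 0xFFFFFFFF  # largest value a 32-bit offset can hold


def tiff_header(first_ifd_offset: int, largest_offset: int) -> bytes:
    """Return a classic TIFF header, or a BigTIFF header if required."""
    if largest_offset <= MAX_CLASSIC_OFFSET:
        # Classic TIFF: 8 bytes in total.
        return struct.pack("<2sHI", b"II", 42, first_ifd_offset)
    # BigTIFF: 16 bytes in total.
    return struct.pack("<2sHHHQ", b"II", 43, 8, 0, first_ifd_offset)
```

Either form is far smaller than 128 bytes, so both fit in the DICOM preamble.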
To test the usability of the resulting dual-personality files in existing TIFF-aware WSI software, publicly available sample images were converted and then tested in several readily available free viewing tools:
- Sedeen version 5.2.3 from PathCore
- QuPath version 0.1.2, ± BioFormats extension v0.0.7
- OpenSlide Java demonstration viewer version 0.12.2
- Pathomation viewer version 2.0.1118.
This list of tools is not exhaustive, and in particular does not include viewers that may be available from scanner vendors that are designed primarily to view their own proprietary formats or viewers that require the installation of a server to support testing a viewer.
The re-encoding of the compressed pixel data in the source TIFF files into the dual-personality files involves reuse of the existing compressed bitstream, rather than decompression and recompression. Accordingly, the decompressed display appearance should be identical, qualitatively and quantitatively, to that of the original; so, no quantitative evaluations were performed.
| Results|| |
Since the objective of the experiment was only to establish that dual-personality files are feasible and explore the effect of different encoding strategies, rather than to produce a robust conversion tool, a single source of tiles from a single file was deemed to be sufficient. The tests were performed by converting the source TIFF file “http://openslide.cs.cmu.edu/download/openslide-testdata/Aperio/CMU-1.svs.”
The results are summarized in [Table 1].
The OpenSlide viewer was able to display both DICOM-TIFF and DICOM-BigTIFF versions of the dual-personality files but was not able to open a pure DICOM file without the TIFF content, as expected. Zooming the images out maximally, however, caused a crash related to requesting a tile from the OpenSlide library that does not exist. The viewer worked regardless of the filename extension (“.dcm” or “.tif”).
The Sedeen viewer attempted to open the DICOM-TIFF file as a DICOM file since it was named with a “.dcm” extension and then failed to read it. If the DICOM-TIFF and DICOM-BigTIFF files were renamed as “.tif,” the behavior changed, and Sedeen displayed the images and recognized the pyramidal content as shown in the Image Properties dialog.
QuPath opened and displayed the DICOM-TIFF file but failed to open the DICOM-BigTIFF file, reporting that BigTIFF was not supported yet. Further, QuPath failed to correctly interpolate the supplied images during zooming, showing heavily pixelated images. This was despite QuPath correctly displaying the source SVS TIFF image and successfully handling a so-called “generic TIFF” version of the same file supplied by CMU, “http://openslide.cs.cmu.edu/download/openslide-testdata/Generic-TIFF/CMU-1.tiff,” which is encoded very similarly.
The Pathomation viewer opened all permutations of TIFF and BigTIFF files named “.tif” without complaint, and the metadata displayed that it had recognized them as tiled TIFF files, not DICOM files. When the same files were renamed as “.dcm,” the Pathomation viewer opened and displayed them, after a lag, presumably while it performed its own downsampling to create a pyramid since only base layer DICOM files were tested. The metadata indicated they had been read as DICOM files.
The viewers using libtiff, as well as libtiff-based utilities, report a warning, “JPEGFixupTagsSubsampling: Warning, Unable to auto-correct subsampling values, likely corrupt JPEG compressed data in first strip/tile; auto-correcting skipped.” This can be avoided by including the TIFF YCbCrSubSampling tag since, indeed, the default values assumed by libtiff (of 2) are incorrect for the sample image tested (which does not have any chrominance down-sampling in the base layer). This works with the OpenSlide viewer, but the addition of this tag causes Sedeen to fail.
The OpenSlide, Sedeen, and Pathomation viewers recognized the pyramidal structure regardless of the presence or absence of appropriate values for XResolution and YResolution. This was tested with meaningful values for each layer (with a ResolutionUnit in centimeters), with a constant value of 1 with unspecified units, and without the resolution-related tags being included at all.
In addition to being able to display and manipulate the loaded image, some viewers can also make use of the image metadata. For example, Sedeen can report such values as Pixel Size and Magnification to the user. If the resolution-related TIFF tags are correctly populated in the dual-personality DICOM-TIFF image (e.g., by extracting the MPP values from an SVS proprietary ImageDescription tag value), Sedeen displays appropriate values (0.499 MPP and ×20 in the case of the “CMU-1.svs” test file), but not otherwise.
As expected, since the compressed JPEG bitstream is not changed during conversion, no qualitative differences were observed in the displayed appearance of the source TIFF image and the converted dual-personality DICOM-TIFF images.
| Discussion|| |
Although only a small number of TIFF WSI viewers were tested, the experiment was relatively successful. With further exploration, possibly including inspection of the open source code of viewers and libraries such as OpenSlide, the encoding of the TIFF IFDs could no doubt be refined for maximum interoperability. Closed and proprietary applications have not yet been tested. Open source applications that require nontrivial server installation and configuration, such as caMicroscope, the digital slide archive, and OMERO iViewer, were also not tested, although this would undoubtedly be worthwhile. The expectation is that if the current approach works with at least one OpenSlide-based tool, it will likely work with others. The use of a single file as a single source of tiles was sufficient for the purpose of the experiment, but obviously, a greater variety of files of different flavors from different scanner vendors would be needed to validate a robust conversion tool for operational use.
The approach of encoding TIFF with successive IFDs, each encoding down-sampled layers of the pyramid, together with accompanying resolution information in standard tags, was the most successful. If additional metadata is available, there are various proposed mechanisms for including it in TIFF tags, and this is compatible with the dual-personality approach as long as a separate file (e.g., of XML metadata) is not required. The DICOM attributes can of course already encode an extensive collection of WSI-related metadata in a standard manner. The matter of whether and how to include a label image in the TIFF content, as is done with some proprietary formats like SVS, has not yet been explored.
Although the use of dual-personality files in DICOM is rare, there are few barriers to its use for the WSI application.
When DICOM images are transported using the traditional DICOM network storage protocols, the file meta information and preamble are removed, discarding the additional TIFF information. However, it can be recreated by the receiving software from the DICOM metadata. Indeed, any software receiving DICOM WSI images can theoretically be updated to add the additional dual-personality TIFF information, even if it was not present before the image was sent. This issue is not encountered with the DICOMweb storage and retrieval services (STOW-RS and WADO-RS), which are capable of sending the entire binary PS3.10 file including the TIFF preamble.
To avoid duplication of the entire image content, the approach depends on the organization of the pixel data being the same in both formats, i.e., that tiles be used, and that the same compression scheme is supported. This has been demonstrated for the typical baseline JPEG-compressed tiles usually encountered. It can also be applied when JPEG 2000 compression is used within each tile, although there is greater uncertainty about how JPEG 2000 should be used within TIFF (i.e., there is no official TIFF document or standard TIFF tag). The use of whole image JPEG 2000 is not the approach that DICOM elected to standardize for the WSI application. Using JPEG 2000 on the whole image would not allow for a dual-personality DICOM-TIFF file, unless the TIFF format and libraries were also extended to support this pattern of use. On the other hand, the tiled approach to lossless encoding of dual-personality files is possible, since most JPEG 2000 codecs support both reversible and irreversible decompression, as does DICOM. Although there are JPEG lossless schemes, they are rarely supported in the common codecs and would not likely be handled by TIFF libraries. JPEG-LS has also been considered for WSI applications, but there is no standardized support for it in TIFF.
It is unfortunate that additional space is required to store the lower levels of the pyramid “hidden” within each DICOM file to support TIFF viewers; the same layers also need to be stored in separate DICOM files to satisfy DICOM viewers. This expands the space required from an additional 30% to the base layer for one encoding, to an additional 60% for both forms. This penalty may be a reasonable tradeoff to achieve greater interoperability, but it is understood that there is considerable sensitivity among users to storage costs for WSI. This could be mitigated by stripping off the TIFF components for long-term archival, as well as discarding the lower layer DICOM files. Both could be recreated on demand when needed in the future. It should also be noted that in the current implementation, the lower layers, if added, are created de novo by relatively crude downsampling as a proof of concept. If the original source of the data is a lossy compressed pyramidal TIFF file, a more sophisticated implementation could extract and reuse any lower levels of the pyramid that are present, rather than recreating them by downsampling. This would result in a mathematically lossless conversion of the entire pyramid as is already true of the base layer.
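The order of magnitude of these figures can be checked with a short calculation under two simplifying assumptions that are not stated in the text: each successive layer is decimated by a factor of two in each dimension, and every layer compresses to the same number of bytes per pixel as the base layer. The function name is hypothetical.

```python
def pyramid_overhead(layers: int) -> float:
    """Size of the lower layers as a fraction of the base layer.

    Each layer has one quarter of the pixels of the layer above it,
    so the total is a geometric series with ratio 1/4.
    """
    return sum(0.25 ** level for level in range(1, layers + 1))


one_copy = pyramid_overhead(10)  # lower layers stored once
both_forms = 2 * one_copy        # hidden in the file and as separate files
```

Under these assumptions, the overhead of one copy approaches one third of the base layer as layers are added, and storing both forms doubles it, which is in line with the approximate 30% and 60% quoted above.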
The idea of multipersonality files is not unique to DICOM. In the space exploration community, the Video Image Communication and Retrieval (VICAR) and Planetary Data System (PDS) file formats are used, and a large body of software has been developed. PDS files have the ability to encapsulate the VICAR format's label and share the same bulk data. However, that implementation requires existing software to be modified to find the old format buried within the new. This is different from the DICOM approach, which works with completely unmodified TIFF software, and is the problem that the solution described in this paper seeks to address.
In theory, the dual-personality approach described for DICOM and TIFF is not limited to those formats. As described earlier, the original DICOM file format proposal also envisaged the use of the Apple QuickTime format. The principle seems to be limited to file formats that (a) have recognition mechanisms separately located in the file, (b) either or both allow for the use of byte offsets to locate organizing structures and pixel data, and (c) can share the same contiguous ranges of bytes for pixel data, including sharing the same compression schemes (or encoding of uncompressed pixel data).
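Condition (a) can be illustrated with the two formats discussed here: a TIFF reader looks for its byte-order mark and version number in the first four bytes of the file, whereas a DICOM reader skips a 128-byte preamble and looks for the characters "DICM". The following is a minimal sketch of a check for both signatures; the function names and the synthetic header are hypothetical and are not taken from the code accompanying this paper.

```python
import struct

DICOM_MAGIC_OFFSET = 128  # DICOM files start with a 128-byte preamble, then b"DICM"


def detect_personalities(header: bytes) -> dict:
    """Report which format signatures are present in the first bytes of a file."""
    is_dicom = header[DICOM_MAGIC_OFFSET:DICOM_MAGIC_OFFSET + 4] == b"DICM"

    is_tiff = False
    # b"II" marks little-endian TIFF, b"MM" marks big-endian TIFF.
    byte_order = {b"II": "<", b"MM": ">"}.get(header[:2])
    if byte_order is not None and len(header) >= 4:
        (version,) = struct.unpack(byte_order + "H", header[2:4])
        is_tiff = version in (42, 43)  # 42 = classic TIFF, 43 = BigTIFF

    return {"tiff": is_tiff, "dicom": is_dicom}


def make_demo_header() -> bytes:
    """Build a synthetic 132-byte header carrying both signatures (illustrative only)."""
    # Little-endian classic TIFF header; the IFD offset (132) is a placeholder value.
    tiff_header = b"II" + struct.pack("<H", 42) + struct.pack("<I", 132)
    preamble = tiff_header.ljust(DICOM_MAGIC_OFFSET, b"\x00")
    return preamble + b"DICM"


if __name__ == "__main__":
    print(detect_personalities(make_demo_header()))
```

In practice, the first 132 bytes of a real file could be read and passed to `detect_personalities`. The check relies only on the signatures sitting at different byte positions, which is what allows both to coexist in one file.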
The utility of the dual-personality DICOM-TIFF approach is, of course, time-limited. It is expected that there will soon come a day when all relevant libraries and software, commercial or open-source, will have been updated to use the standard DICOM-tiled WSI format. Slide scanner vendors will produce DICOM natively “inside,” without any proprietary intermediate format, just as CT, MR, and US scanner vendors do today. In the interim, use of the dual-personality approach allows a hybrid configuration to be used, without requiring abandoning existing useful tools, or worse, deferring adoption of DICOM and its many benefits. Despite continued negativity about DICOM from some parts of the community and attempts to reinvent the wheel by developing competing approaches, to the extent that those are TIFF-based, the dual-personality approach may represent a useful compromise.
Thanks to Lawrence Tarbox, Aaron Waitz and Harry Solomon for reviewing the preliminary manuscript, as well as Angelos Pappas and Marcial Garcia Rojo for giving feedback on the readability of converted images, and Andreas Vendeland and Peter Bankhead for doing both.
| Conclusion|| |
The slow progress toward the ubiquitous use of DICOM for WSI encoding can be mitigated by the creation of dual-personality DICOM-TIFF files that are compatible with the installed base of TIFF-based software, yet which offer all the benefits of the standard format. Furthermore, dual-personality DICOM-TIFF files can be created without compromising the fidelity of the pixel data, if the source compressed pixel data bitstream for each tile is a DICOM-supported scheme such as baseline lossy JPEG or reversible or irreversible JPEG 2000.
Financial support and sponsorship
None.
Conflicts of interest
There are no conflicts of interest.
| References|| |
National Electrical Manufacturers Association. Digital Imaging and Communications in Medicine (DICOM) Standard PS3. Available from: http://www.dicomstandard.org/current/. [Last accessed on 2018 Dec 02].
DICOM Standards Committee. DICOM Supplement 145 - Whole Slide Microscopic Image IOD and SOP Classes. National Electrical Manufacturers Association; 2010. Available from: ftp://medical.nema.org/medical/dicom/final/sup145_ft.pdf. [Last accessed on 2018 Dec 02].
Singh R, Chubb L, Pantanowitz L, Parwani A. Standardization in digital pathology: Supplement 145 of the DICOM standards. J Pathol Inform 2011;2:23.
Rojo MG, Sánchez A, Bueno G, de Mena D. Standardization of pathology whole slide images according to DICOM 145 supplement and storage in PACS. 13th European Congress on Digital Pathology. Berlin, Germany; 27 May, 2016. Available from: http://www.globalengage.co.uk/pathology/docs/Rojo.pdf. [Last accessed on 2018 Dec 02].
Herrmann MD, Clunie DA, Fedorov A, Doyle SW, Pieper S, Klepeis V, et al. Implementing the DICOM standard for digital pathology. J Pathol Inform 2018;9:37.
Marques Godinho T, Lebre R, Silva LB, Costa C. An efficient architecture to support digital pathology in standard medical imaging repositories. J Biomed Inform 2017;71:190-7.
Clunie D, Hosseinzadeh D, Wintell M, De Mena D, Lajara N, Garcia-Rojo M, et al. Digital imaging and communications in medicine whole slide imaging connectathon at digital pathology association pathology visions 2017. J Pathol Inform 2018;9:6.
Goode A, Gilbert B, Harkes J, Jukic D, Satyanarayanan M. OpenSlide: A vendor-neutral software foundation for digital pathology. J Pathol Inform 2013;4:27.
National Electrical Manufacturers Association. Digital Imaging and Communications in Medicine (DICOM) Standard PS3. Rosslyn, VA; 1993. Available from: ftp://medical.nema.org/medical/dicom/1992-1995/. [Last accessed on 2018 Dec 02].
Tesche G. SPI: A PACS interface specification. Med Inform (Inform Health Soc Care) 1988;13:281-8.
Ratib O, Hoehn H, Girard C, Parisot C. PAPYRUS 3.0: DICOM-compatible file format. Med Inform (Lond) 1994;19:171-8.
DICOM Standards Committee. DICOM Supplement 1 - Media Storage and File Format for Media Interchange. National Electrical Manufacturers Association; 1995. Available from: ftp://medical.nema.org/medical/dicom/final/sup01_ft.pdf. [Last accessed on 2018 Dec 02].
DICOM Standards Committee. DICOM Supplement 2 - Media Storage Application Profiles. National Electrical Manufacturers Association; 1995. Available from: ftp://medical.nema.org/medical/dicom/final/sup02_ft.pdf. [Last accessed on 2018 Dec 02].
DICOM Standards Committee. DICOM Supplement 3 - Media Format and Physical Media Interchange. National Electrical Manufacturers Association; 1995. Available from: ftp://medical.nema.org/medical/dicom/final/sup03_ft.pdf. [Last accessed on 2018 Dec 02].
DICOM Standards Committee. DICOM Supplement 4 - X-Ray Angiographic Image Objects and Media Storage. National Electrical Manufacturers Association; 1996. Available from: ftp://medical.nema.org/medical/dicom/final/sup04_ft.pdf. [Last accessed on 2018 Dec 02].
DICOM Standards Committee. DICOM Supplement 5 - Ultrasound Application Profile, IOD and Transfer Syntax Extension. National Electrical Manufacturers Association; 1995. Available from: ftp://medical.nema.org/medical/dicom/final/sup05_ft.pdf. [Last accessed on 2018 Dec 02].
Hall FM, Hranek GA, Kreuzer LB, McNary LT, Rabold MJ, Rock DA, et al. Ultrasound image information archiving system. US Patent 6,253,214; 2001. Available from: http://www.google.com/patents/US6253214. [Last accessed on 2018 Dec 02].
Waitz AS, Bono JE, Lincoln RL, Lowery JH, Connell WL, Jacobson JR, et al. Ultrasonic image data formats. US Patent 5,636,631; 1997. Available from: http://www.google.com/patents/US5636631. [Last accessed on 2018 Dec 02].
Waitz A. Re: DICOM file Preamble and DEFF, QuickTime. Personal Communication; 16 April, 2018.
Kreuzer L. Re: DICOM file preamble and DEFF, QuickTime. Personal Communication; 16 April, 2018.
Fauquex J. Color management for DICOM images considered as TIFF 16. J Display Technol 2008;4:410-4.
ISO. ISO/IEC 10918-1 Information Technology - Digital Compression and Coding of Continuous-Tone Still Images: Requirements and Guidelines; 1994.
ISO. ISO/IEC 15444-1 Information Technology - JPEG 2000 Image Coding System: Core Coding System; 2016.
Christopoulos C, Skodras A, Ebrahimi T. The JPEG2000 still image coding system: An overview. IEEE Trans Consum Elec 2000;46:1103-27.
Tuominen VJ, Isola J. The application of JPEG2000 in virtual microscopy. J Digit Imaging 2009;22:250-8.
ISO/IEC 15444-9: 2005 Information Technology - JPEG 2000 Image Coding System: Interactivity Tools, APIs and Protocols; 2005.
Tuominen V, Isola J. Linking whole-slide microscope images with DICOM by using JPEG2000 interactive protocol. J Digit Imaging 2010;23:454-62.
International Colour Consortium. Specification ICC.1:2004-10 (Profile version 4.2.0.0). Image Technology Colour Management - Architecture, Profile Format, and Data Structure. Available from: http://www.color.org/icc1V42.pdf. [Last accessed on 2018 Dec 02].
DICOM Standards Committee. DICOM CP 1713 - More compact use of Per-Frame Functional Group Macros in Non-Sparse VL Whole Slide Microscopy Image IOD. National Electrical Manufacturers Association; 2018. Available from: ftp://medical.nema.org/medical/dicom/final/cp1713_ft2_WSIPerFrameFunctionalGroupMacro.pdf. [Last accessed on 2018 Dec 02].
DICOM Standards Committee. DICOM CP 1818 - Large Compressed Images may have More Frames than fit in the Basic Offset Table. National Electrical Manufacturers Association; 2018. Available from: ftp://medical.nema.org/medical/dicom/cp/cp1818_lb_whenoffsettabletoosmall.pdf. [Last accessed on 2018 Dec 02].
Hosseinzadeh D, Shojaii R, Martel AL. Selective Decoding for Digital Microscopy Images Using the Sedeen Viewer. Pathology Informatics Conference; 2010.
Martel AL, Hosseinzadeh D, Senaras C, Zhou Y, Yazdanpanah A, Shojaii R, et al. An image analysis resource for cancer research: PIIP-pathology image informatics platform for visualization, analysis, and management. Cancer Res 2017;77:e83-6.
Bankhead P, Loughrey MB, Fernández JA, Dombrowski Y, McArt DG, Dunne PD, et al. QuPath: Open source software for digital pathology image analysis. Sci Rep 2017;7:16878.
QuPath - Open Source Software for Digital Pathology. Available from: http://qupath.github.io/. [Last accessed on 2018 Dec 02].
Sucaet Y, Pappas A, Wim W. Free Whole Slide Image Viewer - PMA.Start | Universal Digital Microscopy Software; 2018. Available from: http://free.pathomation.com/. [Last accessed on 2018 Dec 02].
Saltz J, Sharma A, Iyer G, Bremer E, Wang F, Jasniewski A, et al. A containerized software system for generation, management, and exploration of features from whole slide tissue images. Cancer Res 2017;77:e79-82.
Gutman DA, Khalilia M, Lee S, Nalisnik M, Mullen Z, Beezley J, et al. The digital slide archive: A software platform for management, integration, and analysis of histology for cancer research. Cancer Res 2017;77:e75-8.
Bidgood WD Jr., Horii SC, Prior FW, Van Syckle DE. Understanding and using DICOM, the data interchange standard for biomedical imaging. J Am Med Inform Assoc 1997;4:199-212.
Genereaux BW, Dennison DK, Ho K, Horn R, Silver EL, O'Donnell K, et al. DICOMweb™: Background and application of the web standard for medical imaging. J Digit Imaging 2018;31:321-6.
Kalinski T, Zwönitzer R, Roßner M, Hofmann H, Roessner A, Guenther T. Digital imaging and communications in medicine (DICOM) as standard in digital pathology. Histopathology 2012;61:132-4.
Levoe SR. Personal Communication; 07 May 2018.
Deen RG. Personal Communication; 07 May 2018.
[Figure 1], [Figure 2], [Figure 3], [Figure 4]