Keep all tags #50

ryder-cobean-nih · 2024-09-17T21:51:23Z

This pull request makes a small change to rap_sitkCore that lets the user retain all tags and their values exactly as they appear when the file is loaded. rap_sitkCore's functions for converting color spaces and safely opening a file for use in processing pipelines are vital, but in some workflows other tools perform the necessary anonymization. This change lets the image be opened with less alteration of its metadata, if desired.

keep_all_tags defaults to False, so if it is omitted (as it is in existing uses in other scripts), existing behavior remains the same.

usage:

from pathlib import Path
from rap_sitkCore import read_dcm

img = read_dcm(Path(path_to_dicom), keep_all_tags=True)

keep_all_tags defaults to False for continuity with existing uses of the module. Setting it to True lets the user bypass the default stripping of unsupported tags, for applications in which other services perform anonymization.
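The backward-compatible default described above can be illustrated with a minimal sketch. This is not the rap_sitkCore implementation: `filter_tags` and `ALLOWED_KEYWORDS` are hypothetical names, and a plain dict stands in for the DICOM dataset.

```python
from typing import Dict

# Hypothetical allowlist; the real list lives in rap_sitkcore and is not shown here.
ALLOWED_KEYWORDS = {"StudyDate", "Modality"}


def filter_tags(tags: Dict[str, str], keep_all_tags: bool = False) -> Dict[str, str]:
    """Return every tag when keep_all_tags is True, otherwise only allowlisted ones."""
    if keep_all_tags:
        return dict(tags)
    return {k: v for k, v in tags.items() if k in ALLOWED_KEYWORDS}


tags = {"StudyDate": "20240917", "Modality": "CR", "PatientName": "DOE^JANE"}
default_result = filter_tags(tags)  # existing callers: behavior unchanged
full_result = filter_tags(tags, keep_all_tags=True)  # opt in to keeping everything
```

Because the new parameter is keyword-only in practice and defaults to False, callers that never pass it keep getting the filtered result.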

blowekamp

The lower-level _read_dcm_pydicom function also copies only the whitelisted tags, partly because support for converting all value representations (VR) to the SITK string representation was not implemented.

Please let me know if you would like to continue to work on the PR, or if I should address it.

blowekamp · 2024-09-18T13:30:46Z

rap_sitkcore/read_dcm.py

+ for tag_name in _keyword_to_copy:
+ key = keyword_to_gdcm_tag(tag_name)
+ if key in img:
+ out[key] = img[key]


After srgb2gray is run the output image img contains none of the original DICOM tags.

When `keep_all_tags` is true, all the tags need to be copied.

ryder-cobean-nih · 2024-09-27

@blowekamp I made the requested changes. Could you please review? I think this covers it in the lower-level function.

blowekamp · 2024-09-30

When keep_all_tags is true, the tags still need to be copied from img to out.

blowekamp · 2024-09-27T13:52:09Z

rap_sitkcore/read_dcm.py

+ else:
+ img[key] = str(float(de.value))
+ elif de.VR in ["CS", "UI"]:
+ img[key] = de.value


These two blocks are very similar but have different behavior. Is this intentional? Were there files that had issues that needed to be addressed?

ryder-cobean-nih · 2024-09-27

Check out these changes in the latest commit:

def _get_string_representation(de: pydicom.dataelem.DataElement) -> str:
    """
    Get the string representation of the DICOM tag.

    Parameters:
        de (pydicom.dataelem.DataElement): The DICOM date element (a particular tag and its metadata).

    Returns:
        str: The string representation of the DICOM tag.
    """
    try:
        if de.value in [None, ""]:
            return ""
        elif de.VR == "DS":
            if de.VM > 1:
                return convert_float_list_to_mv_ds(de.value)
            else:
                return str(float(de.value))
        elif de.VR == "US":
            return str(int(de.value))
        else:
            return de.value
    except (TypeError, ValueError) as e:
        raise RuntimeError(
            f'"Error parsing data element "{de.name}" with value "{de.value}" '
            f'and value representation "{de.VR}". Error: {e}'
        )


def _read_dcm_pydicom(filename: Path, keep_all_tags: bool = False) -> sitk.Image:
    """
    Reading implementation with pydicom for DICOM
    """
    ds = pydicom.dcmread(filename)
    arr = ds.pixel_array

    if ds.PhotometricInterpretation == "MONOCHROME2":
        img = sitk.GetImageFromArray(arr, isVector=False)
    elif ds.PhotometricInterpretation == "MONOCHROME1":
        # only works with unsigned
        assert ds.PixelRepresentation == 0
        # use complement to invert the pixel intensity.
        img = sitk.GetImageFromArray(~arr, isVector=False)
    elif ds.PhotometricInterpretation in ["YBR_FULL_422", "YBR_FULL", "RGB"]:
        if ds.PhotometricInterpretation != "RGB":
            from pydicom.pixel_data_handlers.util import convert_color_space

            arr = convert_color_space(ds.pixel_array, ds.PhotometricInterpretation, "RGB")
        img = sitk.GetImageFromArray(arr, isVector=True)
    else:
        raise RuntimeError(f'Unsupported PhotometricInterpretation: "{ds.PhotometricInterpretation}"')

    # keep_all_tags is either all tags other than PixelData or the tags specified in
    # _keyword_to_copy, provided they are present in the dataset
    if keep_all_tags:
        _keyword_to_copy = [elem.keyword for elem in ds if elem.keyword != "PixelData"]
    else:
        _keyword_to_copy = [keyword for keyword in _keyword_to_copy if keyword in ds]

    # iterate through all tags and copy the ones specified in _keyword_to_copy
    # to the SimpleITK image
    for tag in _keyword_to_copy:
        de = ds.data_element(tag)
        key = f"{de.tag.group:04x}|{de.tag.elem:04x}"
        img[key] = _get_string_representation(de)

    return img

introduce function _get_string_representation() which returns the string representation of a data element with proper error handling, and simplify definition of the _keyword_to_copy list based on whether keep_all_tags is True
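The snippet above calls convert_float_list_to_mv_ds, whose body is not shown in this thread, and builds metadata keys as "gggg|eeee". As a rough sketch of both steps, assuming the helper simply joins the values with a backslash (the DICOM delimiter for multi-valued elements), something like the following would work. `join_multi_valued_ds` and `format_tag_key` are hypothetical names, not functions from rap_sitkcore.

```python
from typing import Iterable


def join_multi_valued_ds(values: Iterable[float]) -> str:
    """Join decimal values with a backslash, the DICOM multi-value delimiter."""
    return "\\".join(str(float(v)) for v in values)


def format_tag_key(group: int, element: int) -> str:
    """Format a (group, element) pair as 'gggg|eeee' in lowercase hex."""
    return f"{group:04x}|{element:04x}"


# Pixel Spacing (0028,0030) has VR "DS" and holds two values.
spacing = join_multi_valued_ds([0.5, 0.25])
key = format_tag_key(0x0028, 0x0030)
```

The key format matches the f-string used in _read_dcm_pydicom; the joining behavior is only an assumption about the unseen helper.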

blowekamp

There is also a need for testing. The cases when I have seen the tags not being copied should be triggering some test failures.

blowekamp · 2024-09-30T13:35:25Z

rap_sitkcore/read_dcm.py

+ if keep_all_tags:
+ _keyword_to_copy = [elem.keyword for elem in ds if elem.keyword != "PixelData"]
+ else:
+ _keyword_to_copy = [keyword for keyword in _keyword_to_copy if keyword in ds]


Isn't this going to overwrite the file local _keyword_to_copy? Please use a different variable name.

blowekamp · 2024-09-30T13:36:23Z

rap_sitkcore/read_dcm.py

+ for tag_name in _keyword_to_copy:
+ key = keyword_to_gdcm_tag(tag_name)
+ if key in img:
+ out[key] = img[key]


When keep_all_tags is true, the tags still need to be copied from img to out.
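One way to picture the requested behavior is a helper that copies either a fixed list of keys or every key found on the source image. The sketch below is an assumption about the shape of the fix, not the code that was merged: `copy_tags` is a hypothetical name, and `FakeImage` is a stand-in that mimics only the SimpleITK metadata accessors GetMetaDataKeys, GetMetaData, and SetMetaData so the example runs without SimpleITK installed.

```python
class FakeImage:
    """Minimal stand-in exposing the SimpleITK metadata accessors used below."""

    def __init__(self, metadata=None):
        self._metadata = dict(metadata or {})

    def GetMetaDataKeys(self):
        return tuple(self._metadata)

    def GetMetaData(self, key):
        return self._metadata[key]

    def SetMetaData(self, key, value):
        self._metadata[key] = value


def copy_tags(src, dst, keys=None):
    """Copy metadata from src to dst; keys=None copies every key present on src."""
    present = set(src.GetMetaDataKeys())
    wanted = src.GetMetaDataKeys() if keys is None else keys
    for key in wanted:
        if key in present:  # skip requested keys the source does not carry
            dst.SetMetaData(key, src.GetMetaData(key))
    return dst


img = FakeImage({"0008|0060": "CR", "0010|0010": "DOE^JANE"})
out_all = copy_tags(img, FakeImage())  # keep_all_tags-style copy
out_some = copy_tags(img, FakeImage(), keys=["0008|0060", "0020|000d"])  # allowlist copy
```

Passing keys=None corresponds to the keep_all_tags case: after a color conversion produces a fresh output image, every key on the input is carried over instead of only the allowlisted ones.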

…n whether the `keep_all_tags` parameter is True. Remove overwrite of _keyword_to_copy, Also make sure to copy out keys in case that image is RGB and keep_all_tags is True

…han keywords (resolves edge case in which a keyword is missing in the image with the tag present). Resolved issue of reusing variable. When keep_all_tags is true, the tags still need to be copied from img to out when calling srgb2gray. This is now resolved.

blowekamp · 2024-10-02T20:24:23Z

Updated work here: #53

ryder-cobean-nih added 2 commits September 17, 2024 14:36

add keep_all_tags parameter

b79a656

defaults to FALSE for continuity with existing uses of the module. Enables user to bypass the default stripping of unsupported tags for use in applications in which other services perform anonymization.

fix line breaks

50a533c

ryder-cobean-nih mentioned this pull request Sep 17, 2024

Add a boolean parameter to keep all dicom tags at loading #49

Open

blowekamp requested changes Sep 18, 2024

changes to _read_dcm_pydicom to support keep_all_tags

6e5b76a

blowekamp reviewed Sep 27, 2024

simplify control flow in _read_dcm_pydicom

94f2391

introduce function _get_string_representation() which returns the string representation of a data element with proper error handling, and simplify definition of the _keyword_to_copy list based on whether keep_all_tags is True

blowekamp reviewed Sep 30, 2024

ryder-cobean-nih added 2 commits September 30, 2024 17:59

definition of the _keyword_to_copy list has been simplified based o…

1f17902

…n whether the `keep_all_tags` parameter is True. Remove overwrite of _keyword_to_copy, Also make sure to copy out keys in case that image is RGB and keep_all_tags is True

blowekamp closed this Oct 2, 2024
