
Guide Pseudonymising Data

Marc Modat edited this page Nov 20, 2020 · 1 revision



Pseudonymising_a_Session

Log in to the non-anonymised XNAT and navigate to the subject, and then the session, that you would like to pseudonymise (more information about how to navigate DASHER is provided here). On the session page, select Pseudonymise Session in the right-hand menu and a pop-up window will appear. Select anonymise in the pop-up window and the Pseudonymisation form will appear.

First select the target project on the pseudonymised XNAT to which the pseudonymised data should be added. Each project is either linked to a remote server or designated for local research (see Adding a Project for a Remote Server and Adding a Project for Local Research for more details).

Once the target project is selected, you can choose between a manual or an automatic pseudonymisation method. If the data is being pseudonymised for a specific clinical trial, manual pseudonymisation should be selected. If the data is being pseudonymised for general research, either manual or automatic pseudonymisation can be selected.

Also, if you are pseudonymising RTTQA benchmark data, select the corresponding tick-box.

Automatic pseudonymisation

Automatic pseudonymisation generates a subject ID and session ID for the pseudonymised data automatically by using a hashing algorithm. This results in long, cryptic, and seemingly random IDs. However, hashing has the advantage that the same input ID will generate the same output ID, which in turn means that data from the same patient in the non-anonymised XNAT will end up under the same patient in the pseudonymised XNAT.
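
The property described above (same input ID, same output ID) can be illustrated with a minimal sketch. This is not DASHER's actual algorithm, which is not specified on this page; the function name `pseudonymise_id`, the use of HMAC-SHA256, and the secret key are assumptions made purely for illustration.

```python
import hashlib
import hmac


def pseudonymise_id(original_id: str, secret_key: bytes) -> str:
    """Return a deterministic pseudonym for original_id.

    Assumption: a keyed SHA-256 hash (HMAC) stands in for whatever
    hashing algorithm DASHER uses internally.
    """
    return hmac.new(secret_key, original_id.encode("utf-8"), hashlib.sha256).hexdigest()


key = b"hypothetical-site-secret"  # hypothetical key, for illustration only

# The same patient ID always maps to the same pseudonym...
first = pseudonymise_id("PATIENT-0001", key)
second = pseudonymise_id("PATIENT-0001", key)

# ...while a different patient ID maps to a different one.
other = pseudonymise_id("PATIENT-0002", key)

print(first == second)  # deterministic mapping
print(first != other)   # distinct patients stay distinct
```

The output is a 64-character hexadecimal string, which is why automatically generated IDs look long and seemingly random.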

When Automatic is selected, pseudonymisation can be initiated by clicking on Pseudonymise session.

Manual pseudonymisation

Manual pseudonymisation requires the user to define the subject ID and session ID for the pseudonymised data. When Manual is selected, text boxes will appear for entering the IDs. Please do not use IDs that contain white space or special characters.
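
Before typing IDs into the form, it may help to check them against the rule above. The sketch below is not part of DASHER; it assumes that "no white space or special characters" means ASCII letters, digits, underscores and hyphens only, which may be stricter or looser than what DASHER actually accepts. The name `is_safe_id` is hypothetical.

```python
import re

# Assumption: a "safe" ID consists only of ASCII letters, digits,
# underscores and hyphens. DASHER's real rule may differ.
SAFE_ID = re.compile(r"[A-Za-z0-9_-]+")


def is_safe_id(candidate: str) -> bool:
    """Return True if candidate is non-empty and contains only allowed characters."""
    return SAFE_ID.fullmatch(candidate) is not None


for candidate in ["TRIAL01_S001", "TRIAL 01", "subject#7"]:
    print(candidate, is_safe_id(candidate))
```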

Manual pseudonymisation also allows you to specify whether the data is being pseudonymised for a specific trial. You can select one of the installed clinical trials from the list, or no-trial if the data is being pseudonymised for general research. Selecting a trial has two effects. Firstly, the header is modified to indicate that the data belongs to the specific trial, enabling the data to be automatically associated with that trial when it is uploaded to the central server. Secondly, any RT structure sets included in the session will be checked to ensure that all required structures are included, and that only permitted labels (structure names) have been used - see Adding Clinical Trials for more details.

If no-trial has been selected, pseudonymisation can be initiated by clicking on Pseudonymise session.

If a clinical trial has been selected, you must first check that any RT structure sets conform to the trial by clicking on Validate. If the structure sets conform to the trial, pseudonymisation can be initiated by clicking on Pseudonymise session. However, if the structure sets do not conform to the trial, the session cannot be pseudonymised until further action is taken:

  • If any of the labels are not in the list of permitted labels, they will be highlighted in yellow. The labels must be edited using the text boxes so that they match one of the permitted labels. Note - some trials include permitted labels such as 'dummy' which can be used for 'extra structures' such as those created to aid plan optimisation. In addition, wildcards are used in the permitted and required lists, for example DOSE* - where * stands for free text. Once all of the label names have been edited, the structure set can be checked again by clicking on Validate.

  • If any of the required structures are missing, the structure set cannot be pseudonymised and the pseudonymisation must be cancelled by clicking Close. Note - it is not possible to add extra structures to a structure set that has already been imported into DASHER. A new structure set must be created that contains all of the required structures, and this must be imported (along with the associated scan, e.g. the planning CT) into DASHER. This may cause a conflict in the Prearchive, but this can be resolved by assigning the new structure set and CT scan a different session ID (see Importing Data into DASHER for more details).
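
The two checks described above can be sketched as follows. This is an illustration, not DASHER's implementation: the function name `check_structure_set` is hypothetical, and shell-style matching from Python's `fnmatch` module is assumed as a stand-in for DASHER's wildcard handling, which may behave differently (for example with respect to case sensitivity).

```python
from fnmatch import fnmatchcase


def check_structure_set(labels, permitted, required):
    """Return (not_permitted, missing) for one RT structure set.

    not_permitted: labels that match no entry in the permitted list
                   (these would need editing before re-validating).
    missing:       required entries that no label matches
                   (these would force the pseudonymisation to be cancelled).
    Assumption: entries may contain shell-style wildcards such as DOSE*.
    """
    not_permitted = [
        label for label in labels
        if not any(fnmatchcase(label, pattern) for pattern in permitted)
    ]
    missing = [
        pattern for pattern in required
        if not any(fnmatchcase(label, pattern) for label in labels)
    ]
    return not_permitted, missing


# Hypothetical trial specification and structure set, for illustration only.
permitted = ["PTV", "CTV", "DOSE*", "dummy"]
required = ["PTV", "CTV"]
labels = ["PTV", "DOSE_95", "Help1"]

not_permitted, missing = check_structure_set(labels, permitted, required)
print(not_permitted)  # ['Help1']
print(missing)        # ['CTV']
```

In this example, Help1 could be renamed to a permitted label such as dummy, but the missing CTV could only be fixed by importing a new structure set.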

Pseudoymisation_form

In the example above a dataset is pseudonymised for the target project KCL using the manual pseudonymisation method and the clinical trial specifications for testtrial1. Note the three columns of structure labels. The first lists the structure names currently used in the RT-structure set. The second one lists all permitted labels for the selected clinical trial. The third column lists those structures that must be present for this clinical trial.

Pseudonymised Session Report

After a few minutes the Pseudonymised Session Report will be added under Assessments on the Session Page and under Experiments on the Subject Page. See Overview of DASHER for how to navigate to the subject and session pages. Note, if you stay on the session page after initiating the pseudonymisation you may need to refresh your browser to see the Pseudonymised Session Report appear.

assessment

experiment

Clicking on Pseudonymised Session brings up the Pseudonymised Session Report. You have the option to follow the link to the corresponding session in the pseudonymised XNAT by clicking on View Pseudonymised Session.

PA_session_report

Viewing the DICOM header of the pseudonymised data

The DICOM header of the pseudonymised data can be viewed to ensure that all identifiable patient information has been removed. When on the pseudonymised session page, click on the plus sign next to the scan number for the header you want to view, and then click on View DICOM Headers.

view_dicom_headers

Note - this shows the most common DICOM header fields, including those where patient information is usually stored. However, it does not show all of the DICOM header fields, e.g. it does not show nested DICOM fields. Therefore, to be sure that all patient information has been removed, we recommend exporting the pseudonymised data (see Exporting Data from DASHER) and inspecting the DICOM header with a tool that allows it to be fully examined.

