
Regulatory Information

dbeasley1 edited this page Mar 25, 2021 · 4 revisions

Summary of system purpose

DASHER (Data Anonymisation and Synchronisation in HEalthcare Research) is an open-source, web-based system for managing and viewing pristine medical images, de-identifying them, and transferring the de-identified data from hospitals to externally hosted central repositories for clinical trials. It provides a means to ensure that data conforms to clinical trial protocols and to minimise the risk of transferring data containing identifiable information. It consists of two XNAT servers: the first manages pristine images and de-identifies and harmonises the data, which is then sent to the second. From the second XNAT server, de-identified data can be uploaded to remote repositories.

Installation and security considerations

DASHER requires several dependencies that are downloaded from various internet websites during the build process. Internet access is thus required during the installation process, but it is not required for DASHER to be functional after the installation is complete.

DASHER uses two instances of XNAT, each containing a web server. The two web servers can be configured to use SSL certificates. Both XNAT instances are deployed on the same physical server but within separate containers. The non-de-identified XNAT server is accessible via port 443 (standard HTTPS, or port 80 if SSL certificates are not used). The de-identified XNAT server is accessible on port 444 (a non-standard HTTPS port, or port 8082 if SSL certificates are not used). In addition, port 8104 (which can be changed to port 104) must be accessible to allow medical images to be sent using the DICOM protocol.
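
As an illustration of the port layout described above, the following minimal sketch checks whether the three ports accept TCP connections. It is not part of DASHER: the function names and the `DASHER_PORTS` mapping are hypothetical, and the SSL port numbers are assumed (substitute 80 and 8082 for a deployment without certificates, or 104 if the DICOM port has been changed).

```python
import socket

# Ports taken from the description above (SSL variants).
# The labels and this mapping are illustrative, not DASHER configuration.
DASHER_PORTS = {
    "pristine XNAT (HTTPS)": 443,
    "de-identified XNAT (HTTPS)": 444,
    "DICOM receiver": 8104,
}


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_dasher_ports(host: str) -> dict:
    """Map each service label to whether its port accepts connections."""
    return {label: port_is_open(host, port) for label, port in DASHER_PORTS.items()}
```

Calling `check_dasher_ports` with the hostname of the DASHER server from a workstation inside the hospital network returns one boolean per service. A successful TCP connection only shows that the port is reachable; it does not confirm that XNAT itself is healthy.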

To transfer de-identified data off-site from the de-identified XNAT server, a connection to the remote destination must be permitted. The transfer is over port 443, using RESTful calls. Only the required remote destinations need to be authorised through the firewall.
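
The firewall rule above amounts to an allow-list of HTTPS destinations on port 443. The sketch below expresses that rule in Python, for example as a pre-flight check in a site's own tooling. The hostname in `ALLOWED_DESTINATIONS` and the function name are hypothetical; the actual enforcement in a DASHER deployment is done by the hospital firewall, not by code like this.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list: a real site would list its own trial repositories.
ALLOWED_DESTINATIONS = {"xnat.trial-repository.example.org"}


def is_permitted_destination(url: str, allowed_hosts=ALLOWED_DESTINATIONS) -> bool:
    """Return True only for HTTPS URLs on port 443 whose host is allow-listed."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    try:
        # .port is None when the URL has no explicit port; HTTPS defaults to 443.
        port = parts.port if parts.port is not None else 443
    except ValueError:
        # Raised by urlsplit for a malformed or out-of-range port.
        return False
    # .hostname is already lower-cased by urlsplit.
    return port == 443 and parts.hostname in allowed_hosts
```

With this allow-list, `https://xnat.trial-repository.example.org/data/projects` is accepted, while the same host over plain HTTP or on another port is rejected.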

Detailed description of data processed or held

Pristine medical image data can be stored in one instance of XNAT, including all identifiable information stored in the DICOM metadata such as name, address, date of birth and gender. The de-identified XNAT instance, which is used to transfer data, contains only de-identified information. The de-identification process requires manual verification for all new data types, and a quarantine mechanism has been designed to ensure that non-validated data cannot be sent to an external data repository.

Summary of system users and administrators

A system administrator is required to ensure the smooth running of the XNAT servers, to add and remove users, and to configure the system (for example, adding clinical trial information). The system is designed to require minimal maintenance. Users (radiologists, clinicians, physicians) will use the platform to manage, de-identify and transfer data. The XNAT administrator sets permissions for each user, so only users with permission to transfer data off-site are able to do so.

Description of key data flows and data flow diagram

DASHER consists of two separate XNAT instances running within the same Docker service, one hosting pristine medical imaging data, the other the derived de-identified data. Figure 1 illustrates DASHER – it has been designed to have a clear boundary between the server hosting pristine data, and the server containing de-identified data. The customised XNAT servers have restricted functionality by default, limiting the possibility of transferring identifiable information, improving usability and allowing for improved automated maintenance. Using Docker ports, and customising the XNAT servers, data can only be imported into the pristine medical imaging data XNAT and data can only be transferred to a remote XNAT using the de-identified XNAT server.

Figure 1: DASHER workflow

Users import pristine data by physically copying the data onto the server or by using DICOM Push from PACS or radiotherapy treatment planning systems. The data is checked to ensure it meets criteria defined by the local site. XNAT organises data into projects. If the data passes the checks, it is placed in a project named after the hospital site (for example GSTT). From there, the data can be viewed, checked and de-identified.
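
The import check can be pictured as a filter applied to each incoming series. The sketch below is an assumption-laden illustration, not DASHER's implementation: the `Series` class, the allowed modalities, the blocked description patterns and the function names are all hypothetical stand-ins for whatever criteria a local site defines. Only the project name GSTT comes from the example above.

```python
import re
from dataclasses import dataclass


@dataclass
class Series:
    """Minimal stand-in for an incoming DICOM series (hypothetical)."""
    modality: str
    description: str


# Hypothetical site criteria: accepted modalities and unwanted scan types.
ALLOWED_MODALITIES = {"CT", "MR", "RTSTRUCT", "RTPLAN", "RTDOSE"}
BLOCKED_DESCRIPTION_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r"localizer", r"scout", r"dose report")
]


def passes_site_criteria(series: Series) -> bool:
    """Accept a series only if its modality is allowed and no blocked pattern matches."""
    if series.modality.upper() not in ALLOWED_MODALITIES:
        return False
    return not any(p.search(series.description) for p in BLOCKED_DESCRIPTION_PATTERNS)


def target_project(series: Series, site_project: str = "GSTT"):
    """Return the site project for accepted series, or None for rejected ones."""
    return site_project if passes_site_criteria(series) else None
```

Here a planning CT is routed to the site project, while a scout view or an unexpected modality is rejected and returns `None`.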

De-identification:

When de-identifying, a clinical trial must be selected from a dropdown menu. Clinical trial session and subject labels are required. Labels in radiotherapy DICOM files are checked to ensure they conform to the clinical trial protocol, which is contained in a file distributed by the clinical trial managers and uploaded in advance into the non-de-identified (non-DI) XNAT by the XNAT administrator. DicomEdit is used to perform the first step of de-identification. If this is successful, a Python script is run on the data to modify the DICOM headers, hash UIDs and change dates. In addition, clinical trial information is inserted into the header. After this step, a script compares the non-DI and de-identified (DI) datasets to ensure all of the DICOM fields are de-identified. If this is successful, the DI data is uploaded to the DI XNAT server, into a project linked to the remote XNAT server. The user can then log on to the DI XNAT server, check the data and, given the necessary permissions, upload it to an external XNAT server.
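
The second de-identification step and the comparison that follows it can be sketched as below. This is a simplified illustration under stated assumptions, not the script DASHER ships: plain dictionaries stand in for DICOM headers, the field lists are a small hypothetical subset, the salted SHA-256 scheme and the `2.25.` UID prefix are choices made for this example, and all function names are invented.

```python
import hashlib
from datetime import datetime, timedelta

# Hypothetical subsets of header fields, keyed by DICOM keyword.
UID_FIELDS = ("StudyInstanceUID", "SeriesInstanceUID", "SOPInstanceUID")
DATE_FIELDS = ("StudyDate", "SeriesDate")
IDENTIFYING_FIELDS = ("PatientName", "PatientBirthDate", "PatientAddress")


def hash_uid(uid: str, salt: str) -> str:
    """Replace a UID with a deterministic, salted hash in valid UID form."""
    digest = hashlib.sha256((salt + uid).encode("utf-8")).hexdigest()
    # Use the first 128 bits as a decimal number so the result contains only
    # digits and dots and stays well under the 64-character UID limit.
    return "2.25." + str(int(digest[:32], 16))


def shift_date(da: str, offset_days: int) -> str:
    """Shift a DICOM date (YYYYMMDD) by a fixed number of days."""
    shifted = datetime.strptime(da, "%Y%m%d") + timedelta(days=offset_days)
    return shifted.strftime("%Y%m%d")


def deidentify(header: dict, salt: str, offset_days: int, trial_info: dict) -> dict:
    """Return a copy with UIDs hashed, dates shifted, identifiers removed and
    clinical trial information inserted."""
    out = dict(header)
    for field in UID_FIELDS:
        if field in out:
            out[field] = hash_uid(out[field], salt)
    for field in DATE_FIELDS:
        if field in out:
            out[field] = shift_date(out[field], offset_days)
    for field in IDENTIFYING_FIELDS:
        out.pop(field, None)
    out.update(trial_info)
    return out


def leaked_fields(original: dict, deidentified: dict) -> list:
    """List protected fields whose original value survives in the output."""
    protected = UID_FIELDS + DATE_FIELDS + IDENTIFYING_FIELDS
    return [
        field for field in protected
        if field in original and deidentified.get(field) == original[field]
    ]
```

In this sketch an upload to the DI server would only proceed when `leaked_fields(original, deidentified)` returns an empty list. Hashing with a per-trial salt keeps the mapping consistent across files of the same study, so series remain linked after de-identification.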

Risk Register

| Risk | Likelihood | Impact | Mitigation | Contingency |
| --- | --- | --- | --- | --- |
| Unauthorised Access / Change to Data | Low | Medium | The XNAT servers are only accessible within the trust. DASHER limits the actions of a user, limiting how data can be modified. Any de-identified data cannot be modified without root user access and not transferrable off-site. | Site investigates user. |
| Non-anonymised patient data is transferred outside the hospital network | Low | High | It is not possible to transfer medical images to the de-identified XNAT without de-identification. The images are de-identified twice and all files are checked before transfer from the non-de-identified to the de-identified XNAT. Only the de-identified XNAT Docker container has ports open to transfer files, and only the de-identified XNAT has the plugin required to synchronise data. DICOM headers are read when scans arrive in the de-identified XNAT. The header fields must contain the expected de-identified code; if they do not, the data is deleted or quarantined. Synchronising data from within the hospital to the central repository is a manual process: the user has to select the data and click synchronise, which requires checks of the data. With the previous checks in place, it would not be possible to synchronise data. It would require deliberate methods and disabling of data. | The received data would need to be checked; details of each session transferred should be automatically checked and an automatic email sent to the people involved in the research and in sending the data. As soon as data is checked at the remote XNAT and found to contain PHI, the remote XNAT should be stopped to prevent any further data being sent. Any such data will be immediately deleted and the hospital informed. |
| Server hacked / Malware / Ransomware / Cyberfraud / Compromised Web Site | Low | High | The server sits behind the hospital firewall, and XNAT is only accessible within the trust. Passwords exist only in user-read password files inside the container. Local policies for network configuration are in place. As Docker containers are used, only certain ports of each container are exposed, which limits the ability to 'hack' the servers. The de-identified and non-de-identified XNATs reside on different ports, so it is recommended that the de-identified ports are closed on the external firewall, preventing any access. | The server will be shut down immediately and ports closed. If there is data loss, DASHER can be wiped and reinstalled. If a backup exists, this can be put in place. |
| PI put in wrong clinical trial information | Low | High | Synchronisation is manual, so data can be checked for PI before synching. A warning is in place before synching. | Similar to emailing PI, this is the responsibility of the person synchronising data. |
| Incorrect sync details entered | Medium | Medium | Only de-identified data can be synchronised. Details and credentials are checked during the initial setup of the server. | Investigate why System Validation was not followed. Check credentials with remote XNAT administrators. |
| Service does not work as expected | Medium | Medium | Docker-based, therefore should be the same across all installations. Full documentation available. Constraints in place to ensure only a few actions are possible with DASHER. | Contact developers. |
| Storage runs out of space | Medium | Low | Automated maintenance scripts clean the container of old and unnecessary files. Button added to allow DICOM files to be deleted. | Contact developers for assistance. Recommend that the local site allocates greater space. |
| Wrong data sent | Medium | Low | Series Import Filters in place to prevent unwanted scan types being imported. Protocol filter in place to ensure the sessions conform to what is expected. | |
| Unexpected power loss | High | Low | In case of sudden power loss, the service can be restarted without loss of data. | Follow instructions in the DASHER manual to restart Docker. |
| Data loss | Medium | Low | Local policies for backup in place. | Local backup used. Docker service restarted. |