
Charité/BIH Virtual Research Environment (VRE) – a module for the Health Data Platform (HDP)

The goal of the Charité Health Data Platform and the Virtual Research Environment is to make medical data secure, findable and usable for research purposes.

With industry partner Indoc and the Business Division IT


What is the Virtual Research Environment?

The Virtual Research Environment (VRE) integrates with the systems of the Charité Health Data Platform (HDP) and extends its functionality.

The HDP is developed within the Digital Medicine Platform. It ingests patient data from the Charité hospital and makes it available for use in research. It is still a work in progress. The HDP is made up of many different components, including a data lake, a clinical data zone and a research data zone. Its underlying technology is an Apache Hadoop cluster, an entire ecosystem of frameworks, tools and software built for scalable and reliable storage and distributed processing of data.

The goal of the VRE is to make it easier for researchers to find and securely access data and use it in innovative ways. In summary, the Virtual Research Environment will extend the Health Data Platform by providing:

1) workflows for radiologic imaging data,

2) a model for cataloguing data and making it easier to find,

3) workbenches for modelling, simulating and analyzing data,

4) a portal that is also open to external users who do not belong to Charité, and

5) interoperability with international data commons such as those developed under the European Open Science Cloud, for example the Virtual Brain Cloud or the Human Brain Project.

We follow the FAIR guiding principles for scientific data management: Findability, Accessibility, Interoperability and Reusability. We want to ensure that this development is open and accessible to all clinicians and scientists at Charité and BIH so we can solve today's pressing medical challenges together!

Key facts

We are looking to integrate more Use Cases. Use Cases are currently being developed in collaboration with

  • Prof. Dr. Carsten Finke with the Use Case Generate
  • Virtual Brain Cloud (lead: Prof. Dr. Petra Ritter)
  • Dr. Claudia Chien, Prof. Dr. Kerstin Ritter, Prof. Dr. Friedemann Paul with the Use Case deepMS

Architecture of the Virtual Research Environment

Research Portal
The primary interface for researchers to access VRE functions and resources, including data capture tools, interactive dashboards and viewers, query tools and analysis workspaces.

Data Gateway API
An application programming interface that enables the Research Portal to exchange data and metadata with VRE systems, and allows the VRE to be interoperable with other data platforms, data sources and systems.

Green Room
An environment in which hospital data, such as data derived from electronic health records (EHRs), picture archiving and communication systems (PACS), laboratories, and other sources, can be de-identified, transformed and prepared prior to being transferred to VRE systems and made available for research use.
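
To make the idea of de-identification concrete, the following is a minimal Python sketch, not the VRE's actual pipeline. The field names, the `DIRECT_IDENTIFIERS` set and the `deidentify` function are hypothetical and chosen for illustration only; the sketch assumes a record is a flat dictionary and that coarsening a birth date to a year is an acceptable transformation.

```python
# Hypothetical set of direct identifiers; a real deployment would define its own.
DIRECT_IDENTIFIERS = {"name", "address", "phone", "insurance_number"}


def deidentify(record: dict) -> dict:
    """Return a copy of `record` without direct identifiers.

    The birth date (assumed to be an ISO string such as "1980-05-17")
    is coarsened to a birth year. The input record is not modified.
    """
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    birth_date = cleaned.pop("birth_date", None)
    if birth_date is not None:
        cleaned["birth_year"] = int(str(birth_date)[:4])
    return cleaned


if __name__ == "__main__":
    source = {"name": "Jane Doe", "birth_date": "1980-05-17", "diagnosis": "G35"}
    print(deidentify(source))
```

Dropping and coarsening fields is only one of several possible de-identification steps; imaging data, free text and rare diagnoses usually need additional handling.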

Data Lake
A zone within the VRE in which data of any type can be received, stored, catalogued, and ultimately ingested into platform databases. Standard data models and ontologies are applied to allow datasets to be aggregated and processed. Data can be further de-identified here for broader sharing. Quality assurance, quality control and preprocessing pipelines prepare the data for visualisation and analysis.

Data Warehouse
A set of databases and services which transform diverse data into a unified, federated context. Federation is critical for harmonizing data and metadata so that information about participants and datasets can be queried, visualized, and analyzed across studies, data sources and modalities. The Metadata Repository and Knowledge Graph implement standardized and extensible schemas to represent metadata derived from datasets as well as annotations generated by researchers.

Shared Services
Automated services that are required by VRE components. The Participant Registry includes systems for generating and storing unique pseudonymized identifiers for all research participants contributing data to the VRE. Identity and access management systems allow the Research Portal and other VRE components to assign or validate the identity and permissions of VRE users.
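
As an illustration of what generating and storing unique pseudonymized identifiers can look like, here is a small in-memory Python sketch. It is an assumption-laden example, not the VRE's Participant Registry: the class name, the `P-` prefix and the identifier length are hypothetical, and a real registry would persist the mapping in a protected database.

```python
import secrets


class ParticipantRegistry:
    """Toy registry mapping source identifiers to random pseudonyms (illustrative only)."""

    def __init__(self) -> None:
        self._by_source: dict[str, str] = {}  # source id -> pseudonym
        self._issued: set[str] = set()        # all pseudonyms handed out so far

    def get_or_create(self, source_id: str) -> str:
        """Return the existing pseudonym for `source_id`, or issue a new unique one."""
        if source_id in self._by_source:
            return self._by_source[source_id]
        pseudonym = "P-" + secrets.token_hex(8)
        while pseudonym in self._issued:  # retry on the (very unlikely) collision
            pseudonym = "P-" + secrets.token_hex(8)
        self._issued.add(pseudonym)
        self._by_source[source_id] = pseudonym
        return pseudonym


if __name__ == "__main__":
    registry = ParticipantRegistry()
    first = registry.get_or_create("hospital-id-0001")
    again = registry.get_or_create("hospital-id-0001")
    print(first == again)
```

Random pseudonyms carry no information about the source identifier, so re-identification requires access to the stored mapping.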

Workspaces and Analytics Resources
Workspaces are flexible and interactive environments in which users can access, visualise and analyse their data with a range of analysis and visualisation tools. These are supported by underlying computing infrastructure as well as privacy preserving linkage systems that allow de-identified datasets to be linked and compared.
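
One simple way to picture privacy preserving linkage is a keyed-hash approach, sketched below in Python. This is an assumption for illustration, not a description of the linkage systems the VRE uses; production systems often rely on more error-tolerant techniques. The function names, the choice of fields and the shared secret key are all hypothetical.

```python
import hashlib
import hmac


def link_token(first_name: str, last_name: str, birth_date: str, key: bytes) -> str:
    """Derive a linkage token from normalized identifying fields with HMAC-SHA256.

    Parties holding the same secret `key` derive the same token for the same
    person without exchanging the identifying fields themselves.
    """
    normalized = "|".join(p.strip().lower() for p in (first_name, last_name, birth_date))
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def link(dataset_a: dict, dataset_b: dict) -> list:
    """Pair records from two datasets (token -> de-identified record) that share a token."""
    shared = dataset_a.keys() & dataset_b.keys()
    return [(dataset_a[token], dataset_b[token]) for token in sorted(shared)]


if __name__ == "__main__":
    key = b"example-shared-secret"  # hypothetical; never hard-code real keys
    imaging = {link_token("Jane", "Doe", "1980-05-17", key): {"mri_sessions": 2}}
    labs = {link_token(" jane", "DOE", "1980-05-17", key): {"lab_results": 5}}
    print(link(imaging, labs))
```

Exact-match tokens like these fail on typos or name variants, which is one reason real linkage systems tend to use fuzzier encodings.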

Charité/BIH Infrastructure and Services
Various IT systems and services are integrated to provide networking, computing, storage and other infrastructure required by the VRE. This includes services that transfer data from existing research and

Timeline of VRE development

VRE draft project schedule showing the major deployment stages (represented as orange and purple bars), with associated tasks and timelines.