COMING SOON! COVID-19 Clinical Data Sets for Research
April 29, 2020
In this unique time, the University recognizes the need and opportunity to provide researchers timely access to rapidly accumulating electronic patient data for COVID-19 patients. The five Clinical and Translational Science Award institutions (UC Davis, UC Irvine, UCLA, UCSD, and UCSF) want investigators to be aware of two such data sets: CORDS and ACT.
UC BRAID: Creation of a UC-Wide COVID Research Data Set (CORDS)
UC Health EVP Carrie Byington asked UC BRAID, in coordination with the UC Health Data Warehouse (UCHDW) team in the Center for Data-Driven Insights and Innovation (CDI2), to accelerate the creation of a COVID Research Data Set (CORDS). The goal is to align our collective efforts to address the pandemic by creating a centralized, secure research data set and then facilitating access to this data set at the local level.
UC BRAID has assembled a team of clinicians, investigators and bioinformaticists from each campus to work with the UCHDW team to create CORDS. By combining patient data from the five medical centers, CORDS will let investigators make discoveries that would not be possible using local data alone. The data set will evolve over time and become more valuable as we include more patients and more clinical depth in the form of additional captured elements.
Timely and broad access to the data set is paramount. In consultation with campus IRB directors and the Office of the President on the regulatory landscape for systemwide data sharing, it was determined that use of this limited data set is considered ‘non-human subjects research’. As such, individual investigators will not need IRB approval for research use of this data set.
Investigators will need to sign a Data Use Agreement. Re-identifying or contacting patients is not allowed. Downloading or printing the data is prohibited, as is combining it with other data sets (unless specifically permitted by your Local Responsible Party and the UCHDW Director).
To access CORDS: The data set will be securely transferred to each of the health campuses for use within their own secure virtual systems for research purposes. Investigators can request queries and/or have direct access according to local policies and procedures. Please contact your local Research Analytics/Informatics teams for a more complete description, opportunities for training, and methods for accessing the data set. UC BRAID will provide updates about training materials and opportunities.
See below for a short description of the currently available data set and plans to create enriched versions.
CORDS v1.0 was developed by matching the daily COVID test result file from sites with patients in the UCHDW. It contains patient demographics, past medical history, and all numeric lab results, in addition to the COVID-19 test result. Each UC Health location is expected to be able to pull down the data set to a local server and follow location-based practices for access. This is expected shortly after IRB approval for standing up the data set is granted; that approval is currently under review. [Note: While the establishment of the data set requires IRB review, access to it by individual investigators does not.]
We are finalizing the creation of v2.0, which will add data elements focused on critical care and ventilator settings in the ICU. Because v2.0 will be harmonized manually, it will take some time to aggregate and harmonize the data from the five health systems before we can release it to the research community. A release date has not yet been set, but the teams are working on it with high priority. Further steps are planned to automate the process, as we have done with the UCHDW, to facilitate the data refresh cycle.
Note that while CORDS v2.0 will be more clinically rich, the additional data elements are ICU-centric. Future efforts are planned to find and include data from operating room (OR) and emergency room (ER) encounters as well.
CORDS v3.0 (planned): A small team of clinicians and researchers has been assigned to define v3.0, and this effort is just kicking off. The team will focus on data elements missing from v2.0 that we want to map into the next version, thus increasing its utility. A time frame is not yet set; it will depend on the selection of clinical values, investigator requests, completeness, and the ability to reach UC consensus.
Accrual to Clinical Trials (ACT) Data Set
UC is part of the leadership team for the NCATS Accrual to Clinical Trials (ACT), a federated data set that allows open access to de-identified EMR data across a national network of leading academic medical centers. The platform was established to bring real-time cohort exploration and discovery to researchers. The COVID data ontology is currently being implemented and will be available soon. The UC effort is particularly valuable because it enables access to nationwide data through the i2b2 Foundation and to OMOP data across the state.
Please visit the ACT Network website for more information, including instructional tutorials.