Clinical Data Management: Roles, Steps, and Software Tools
About 3.6 million data points are generated during Phase III clinical trials, which is three times more than 10 years ago. The Phase III stage has been conducted the same way for decades: It happens before approving the medication or therapy approach, involves up to 3,000 participants, and can last for several years. So what has caused such a dramatic rise in data volume?
Some key contributors to this growth are extensive drug development projects targeting rare diseases, the use of biomarkers and genetic data, and the growing number of patient data sources, from web-based questionnaires to wearables. No matter the cause, clinical researchers need to harness this information and make the most of it. Clinical data management emerged to address this challenge.
What is clinical data management
Clinical data management (CDM) is a set of practices that handle information generated during medical research. It aims to ensure data quality, integrity, and compliance with internal protocols and state regulations.
Also, the CDM process helps keep key clinical trial stakeholders on the same page:
1. Sponsors — pharmaceutical companies, institutions, and other organizations that initiate, monitor, and finance the trial.
2. CROs (contract research organizations) — research entities hired by the sponsor to plan and run the investigation.
3. Sites — centers coordinating data collection from trial participants.
CDM plays a crucial role in evaluating the safety and effectiveness of drugs, diets, medical devices, digital therapeutics tools, and other types of treatment, diagnostics, or methods to prevent health problems. If properly handled, it significantly reduces the time required to launch a new medical product.
Clinical data management team and stages
CDM activities start early in the clinical trial process, once the trial protocol, describing the study objectives and methodology, is designed. As a rule, data-related responsibilities are allocated across
- a clinical data manager who supervises the entire CDM process;
- a database programmer or designer;
- data entry associates;
- a medical coder who translates diagnoses, procedures, adverse events, and other health data into industry-specific codes; and
- a quality control associate.
Now, let’s see how data management unfolds and who does what at each stage.
Stages and roles in clinical data management.
Data management plan design
Experts in charge: data manager, database designer
A data management plan or DMP is a document detailing all procedures, tasks, milestones, and deliverables throughout the CDM lifecycle. It gives a roadmap on how to work with information and handle possible risks. Another important function is to clearly communicate what happens in the course of the trial to each stakeholder.
The DMP typically describes the following aspects:
- data to be gathered from trial participants,
- existing data that can be integrated,
- data formats,
- metadata and its standards,
- storage and backup methods,
- security measures to protect confidential information,
- data quality procedures,
- responsibility assignments across team members,
- access and sharing mechanisms and limitations,
- long-term archiving and preservation procedures,
- the cost of data preparation and archiving, and
- compliance with relevant regulations and requirements.
The DMP must be ready at the trial design stage, before the first participant is enrolled. This ensures that data is collected in the correct format and properly organized. However, the plan is not immutable: It has to be updated throughout the trial to capture any changes that influence data management.
eCRF or electronic case report form design
Experts in charge: data manager, database designer
The case report form is a printed or electronic questionnaire for collecting data from study participants and reporting it to trial sponsors. The document is created specifically for each research project in accordance with
- the trial protocol, and
- recommendations of the Clinical Data Acquisition Standards Harmonization (CDASH). They are developed by the Clinical Data Interchange Standards Consortium (CDISC) to streamline industry-wide data exchange. Say, CDASH recommends collecting dates in the unambiguous DD-MON-YYYY format. (Read our article on CDISC standards to learn more.)
Since the early 1990s, electronic CRFs (eCRFs) have been gradually replacing their paper-based counterparts, which leads to faster data collection and better quality of information. As of 2020, 84 percent of sites, sponsors, and CROs had either gone paperless or planned to make this transition soon.
An example of eCRF structure.
Well-designed case report forms collect only the data necessary for the particular study, avoiding any redundancy. The fields to be filled in may include
- demographics (age, gender),
- basic measurements (height, weight),
- vital signs (blood pressure, temperature, etc.) captured at various time points,
- lab exams,
- medical history,
- adverse events, and
- more, based on the research requirements.
Data managers create data entry screens and eCRF layouts in collaboration with a database programmer. The design usually goes through several review cycles before finalization.
Clinical trial database design
Experts in charge: database designer, data manager
A clinical trial database is a set of data collected during the study and organized in rows and columns. It’s designed with the CRF structure in mind. In other words, the database incorporates a questionnaire schema of the case report forms.
To save storage space, some questionnaire answers can be stored as codes that stand for meaningful categories. In this case, database specialists document the decoding rules, that is, how each code maps to an element of the CRF.
Combined coding approach. Source: Applied Clinical Trials.
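As a minimal sketch of such a decoding description, consider a hypothetical smoking-status question stored as integer codes. The code values, labels, and function names below are illustrative assumptions, not part of any standard or of the approach shown in the figure.

```python
# Hypothetical codelist: maps stored integer codes to the CRF answer options.
SMOKING_STATUS_CODES = {
    0: "Never smoked",
    1: "Former smoker",
    2: "Current smoker",
}


def decode(codelist: dict[int, str], code: int) -> str:
    """Translate a stored code back into the CRF answer it stands for."""
    try:
        return codelist[code]
    except KeyError:
        # An unknown code points to a data entry or mapping error.
        raise ValueError(f"Code {code!r} is not defined in the codelist") from None


def encode(codelist: dict[int, str], answer: str) -> int:
    """Translate a CRF answer into the code stored in the database."""
    reverse = {label: code for code, label in codelist.items()}
    if answer not in reverse:
        raise ValueError(f"Answer {answer!r} is not defined in the codelist")
    return reverse[answer]
```

Keeping the codelist as data rather than scattering the codes through the program means the same mapping can serve both data entry and later decoding for analysis.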
Before launching into the production (research) environment, the database is tested with dummy data in a secure environment separate from the study.
Electronic data capture in clinical trials
Experts in charge: clinicians, data manager, data entry associate, medical coder
As we mentioned above, CRFs are the main instrument of data capture in clinical trials. The information for report forms is traditionally gathered by clinicians or data entry associates from participants when they visit medical facilities. Yet, in recent years medical sites have stopped being the only point of data capture. Nowadays, details for trials are also extracted from
- Electronic Health Records (EHRs),
- medical devices (blood pressure monitors, ECG recording machines, and others),
- a laboratory information management system (LIMS), and
- patient-reported outcomes (PROs) or any descriptions of health conditions that come directly from patients, without mediation and interpretation from medical experts.
With paper-based questionnaires, the data is manually entered into printed forms and then transferred to the database. In the case of eCRFs, information is digital from the start, and some fields can be populated automatically, with data transmitted from medical devices or EHRs by robotic process automation (RPA) tools. In either case, all information that makes it to the forms and then to the clinical database must go through the data validation process.
Data validation: edit checks, source data verification, and data anonymization
Experts in charge: data manager, database designer, quality control associate
Clinical data validation is a series of quality tests to ensure accuracy, consistency, legibility, and integrity of information. This includes the following steps.
Electronic edit checks. Edit checks are created by a database designer and embedded into eCRFs to automatically compare inputs against numerical and logical criteria. This prevents unlikely values from appearing in the document. Say, a check for the body temperature field may flag all entries lower than 95 or higher than 105 if the system expects measurements in Fahrenheit.
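The temperature check described above can be sketched as a simple range rule. This is an illustrative assumption about how such a check might look; the `RangeCheck` class and the `body_temp_f` field name are hypothetical and do not come from any particular EDC system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RangeCheck:
    """A numerical edit check: flag values outside [low, high]."""

    field: str
    low: float
    high: float

    def run(self, record: dict) -> Optional[str]:
        """Return a message describing the problem, or None if the entry passes."""
        value = record.get(self.field)
        if value is None:
            return f"{self.field}: value is missing"
        if not self.low <= value <= self.high:
            return f"{self.field}: {value} is outside {self.low}-{self.high}"
        return None


# Body temperature in Fahrenheit, using the bounds from the example above.
temperature_check = RangeCheck(field="body_temp_f", low=95.0, high=105.0)
```

An entry of 37.2, for instance, would be flagged, which is a typical sign that the value was recorded in Celsius rather than Fahrenheit.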
Source data verification (SDV). SDV is a process of checking CRF entries against original medical records and other source files. The aim of this step is to confirm that an eCRF contains all relevant information and truly represents a participant’s profile.
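In practice, SDV involves human review of source documents, but the comparison at its core can be illustrated with a short sketch. The function and field names here are hypothetical, and the example assumes both the CRF entry and the source record are already available as dictionaries with matching field names.

```python
def verify_against_source(crf_entry: dict, source_record: dict) -> list[str]:
    """List the fields where the CRF entry and the source record disagree."""
    discrepancies = []
    # Look at every field that appears on either side, in a stable order.
    for field in sorted(crf_entry.keys() | source_record.keys()):
        crf_value = crf_entry.get(field)
        source_value = source_record.get(field)
        if crf_value != source_value:
            discrepancies.append(
                f"{field}: CRF has {crf_value!r}, source has {source_value!r}"
            )
    return discrepancies
```

An empty result means the two records agree on every field; each returned message is a candidate for follow-up with the site.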
Data anonymization. Before submission to sponsors, clinical data must be de-identified to comply with the Health Insurance Portability and Accountability Act (HIPAA). This means removing all elements of protected health information (PHI) that can link the document to a particular person.
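A minimal sketch of this step is shown below. The list of identifier fields is a hypothetical example; the actual set of PHI elements is defined by the applicable regulation and the study's data management plan. Note that dropping fields alone does not guarantee de-identification, since free-text fields and dates may still carry identifying details.

```python
# Hypothetical set of direct identifiers; a real list is defined by the
# applicable regulation and the study's data management plan.
PHI_FIELDS = {"name", "address", "phone", "email", "ssn", "date_of_birth"}


def de_identify(record: dict, subject_id: str) -> dict:
    """Return a copy of the record without PHI fields, labeled with a study ID."""
    cleaned = {key: value for key, value in record.items() if key not in PHI_FIELDS}
    # The study-specific subject ID replaces the personal identifiers.
    cleaned["subject_id"] = subject_id
    return cleaned
```

The function returns a new dictionary and leaves the original record untouched, so the identified source data stays intact at the site.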
Database lock and data archiving
Experts in charge: data manager, database designer
On study completion, the database is locked so that no changes can be made to the information. After that, clean data is submitted to stakeholders for statistical analysis, reporting and, finally, publication of the results. However, all these steps are beyond the clinical data management workflow.
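The lock itself can be pictured as a flag that turns the database read-only. The sketch below is an illustration under that assumption; the `TrialDatabase` class is hypothetical, and real systems typically enforce the lock through permissions and documented sign-off procedures rather than a single in-memory flag.

```python
class LockedDatabaseError(Exception):
    """Raised when a write is attempted after database lock."""


class TrialDatabase:
    def __init__(self) -> None:
        self._records: dict[str, dict] = {}
        self._locked = False

    def save(self, subject_id: str, record: dict) -> None:
        """Store a record; refused once the database is locked."""
        if self._locked:
            raise LockedDatabaseError("The database is locked; no changes allowed")
        self._records[subject_id] = dict(record)

    def get(self, subject_id: str) -> dict:
        # Return a copy so callers cannot alter stored data through it.
        return dict(self._records[subject_id])

    def lock(self) -> None:
        """Freeze the database at study completion."""
        self._locked = True
```

Reads keep working after `lock()` is called, which is what allows the clean data to be handed over for analysis.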
All essential documents and trial supplies must be archived for at least three years. This enables post-trial monitoring and evaluation as well as trial reconstruction to facilitate further research.
Clinical data management software
Spreadsheets and common office programs are not enough to address all the challenges of clinical data management. You need software capable of handling large volumes of documents and customized for medical studies: a clinical data management system (CDMS), also called an electronic data capture (EDC) system.
CDMSs can be specifically tailored for psychiatry studies, medical device clinical trials, drug development, or other cases. However, all of them have common features covering basic data management operations.
21 CFR Part 11 compliance. Title 21 of the CFR (the Code of Federal Regulations) regulates food and drugs produced or consumed across the USA. Specifically, its Part 11 sets rules for information systems used by companies regulated by the FDA (the Food and Drug Administration). Compliance with this document means that the technology has built-in mechanisms for data security and traceability, namely
- access controls ensuring that only authorized users, under unique IDs (electronic signatures), are able to enter the system and work with data;
- forced periodic password resets; and
- audit trails, or keeping chronological records of all operations and changes to the database.
All these precautions are taken to guarantee that the data produced by the system can be trusted.
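To illustrate the audit trail idea, here is a minimal sketch of a store that logs every change together with the user and a timestamp. The class and field names are hypothetical, and the example is not a compliant implementation on its own; it only shows the principle of recording who changed what, when, and from which value.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class AuditEntry:
    timestamp: datetime
    user_id: str
    subject_id: str
    field: str
    old_value: Any
    new_value: Any


class AuditedRecords:
    """Stores field values and logs every change in an append-only trail."""

    def __init__(self) -> None:
        self._values: dict[tuple[str, str], Any] = {}
        self._trail: list[AuditEntry] = []

    def set_value(self, user_id: str, subject_id: str, field: str, value: Any) -> None:
        key = (subject_id, field)
        entry = AuditEntry(
            timestamp=datetime.now(timezone.utc),
            user_id=user_id,
            subject_id=subject_id,
            field=field,
            old_value=self._values.get(key),  # None for a first entry
            new_value=value,
        )
        # Log first, then apply the change.
        self._trail.append(entry)
        self._values[key] = value

    @property
    def trail(self) -> tuple[AuditEntry, ...]:
        # Expose a read-only snapshot; entries are never edited or removed.
        return tuple(self._trail)
```

Because each entry keeps both the old and the new value, the full history of any field can be reconstructed from the trail alone.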
eCRF designer. This module provides a set of templates and drag-and-drop functionality to construct eCRF layouts and data entry fields in accordance with standards. Custom forms are then saved to be reused in future projects. Also, the designer allows for programming edit checks.
Query management. In terms of clinical trials, a query is a request for clarification from trial sponsors to researchers. Such requests are made during data review, before database lock, and are aimed at resolving errors and inconsistencies. The query management feature facilitates communication among data managers, sponsors, and other stakeholders. It helps resolve all questions faster.
Monitoring. The monitoring capabilities include but are not limited to scheduling, adverse event tracking, and automatic notifications to sponsors.
Medical coding support. The component automates code search, mapping across coding systems, and error checking.
Data import and export. Some systems are equipped with powerful data integration tools that facilitate multisite studies. The data export feature, in turn, automatically transforms information from the trial database into formats suitable for statistical analysis and required by trial sponsors and regulatory authorities.
Below, we’ll review the most widely used CDMSs accommodating all the above-mentioned features.
Clinical data management software suites, compared
IBM Clinical Development: a reliable platform for decentralized studies
Used by 3000+ clinical trials
Pricing model: individual plan
Pros: reliability, coding with Watson AI
Cons: archaic UI, high price
IBM Clinical Development (ICD) is an end-to-end cloud-based system allowing for data capturing from various sources and targeting large-scale, decentralized studies. Its capabilities extend beyond the trial data management cycle, covering analytical reports.
The strong points of ICD are reliability, a large library of pre-built forms, ease of use, and a flexible, modular structure. You can choose and pay for only those features that you really need. The medical coding part is supported by the AI power of IBM Watson.
Keep in mind, though, that you will need a programmer to set up a trial. Among other cons noted by users are archaic UI, slow customer support, and high cost of ownership, which puts the system out of reach for startup projects.
Oracle Clinical Research Suite: the best support for paper-based studies
Used by 1000+ studies
Pricing model: individual plan
Pros: stability, integrations with Oracle products
Cons: slow data entry, high price
One of the oldest CDMSs on the market comes as a combination of three integrated subsystems:
- Oracle Clinical automating trial protocol design, data validation, and reporting;
- Oracle Remote Data Capture, an EDC solution with layout editor to generate collection forms; and
- Thesaurus Management System for standardizing clinical terminology.
Running on top of the Oracle database, the suite demonstrates great stability and is recognized as one of the best systems for paper-based data collection. It seems to be a natural choice for companies already using other Oracle products.
At the same time, its data entry part involves repetitive and tedious manual operations. Also, Oracle isn’t meant for small companies with limited budgets.
Castor EDC: an affordable way to speed up trial builds
Used by 8500+ studies
Pricing model: based on study needs
Pros: ease of setup and use, quick support, affordability
Cons: limited functionality
Castor EDC became popular across 90+ countries due to its proven ability to significantly shorten the duration of a clinical study build. Intuitive UI simplifies creating eCRFs, assigning study roles, and adding new users. The data can be smoothly integrated from various sources: EHR systems, medical devices, wearables, etc. Should any problems arise, you’ll get help from an expert in no more than 30 minutes.
Relatively low price makes the platform affordable for small companies. Quite expectedly, it lacks some features and export formats that larger and more expensive systems typically have.
TrialKit: an intuitive CRF designer for virtual studies
Used by 7000+ studies
Pricing model: based on number of features
Pros: ease of use, smooth integrations with wearables
Cons: steep learning curve, limited functionality
Aimed at decentralized virtual studies, TrialKit comes in two versions: web-based and mobile. Due to a drag-and-drop CRF builder and a library of ready-to-use templates, companies can create eCRFs with no programming skills and launch trials in days instead of weeks. In addition to EHRs, the tool easily integrates data from wearables like Fitbit, Apple Watch, etc.
As for cons, it takes time to learn how to work with the platform. Another major complaint relates to limitations in functionality.
How to choose and adopt a CDMS: best practices
Here are some additional tips on how to select and implement a CDMS so that it contributes to the trial's success.
Check compatibility and integration options. Make sure that the intended CDMS smoothly links to other platforms you’ll use during the trial, namely, an EHR system, LIMS, and others. In case of compatibility issues, you’ll need to find a tech partner with system integration expertise.
Learn more about the level of support offered by a vendor. This includes training programs, the quality of user documentation provided, and speed of response in case of any issues. Also, check if the provider will help you with the system deployment, integrations, and development of new features once you need them.
Make sure that all parties are satisfied. Before finalizing your choice and implementing a new platform, try to get feedback from all intended users: CROs, sponsors, staff, and technical support at the research site. Take advantage of a free trial or demo version so that everybody can test the user interface and share their experience.
Start from the core functionality. Redundant features not only inflate your budget but create additional complexities for staff who will have to get familiar with a new system. Later, you can expand functionality as needed and as your project grows by adding new modules from the same provider, integrating third-party tools or using custom development.
Test and validate the system against the eCRF. Once electronic forms are designed, you need to run user acceptance testing (UAT). End users like data entry associates, clinicians, and researchers should determine if they feel comfortable with the eCRF structure and if the form contains all needed fields.
Take your time to set up a database. Quite naturally, trial sponsors want to get the system running in no time. However, it's important to balance speed and risk reduction. Take your time to thoroughly design and test a database before the study starts. Software changes during the trial might be costly and impact data validity.