Expanding Translational Research Concepts with Genedata Profiler

Despite significant investments in research and development (R&D), biopharmaceutical companies continue to struggle with low productivity [1]. Delivering a new therapeutic product to the market is a complex, time-consuming endeavour with a high attrition rate. To increase the probability of success of therapeutics in development and accelerate time to market, companies turn to their biggest asset: data.

The biopharmaceutical industry generates a wealth of data across the whole R&D process. When allied with cutting-edge digital solutions, these data can help make informed decisions to optimize the outcome of clinical trials and maximize returns while benefiting patients in need. Although the primary, project-specific use of R&D data has become commonplace, leveraging data beyond its original intent is emerging as a game-changer. This secondary use of data is a characteristic of translational research. By making productive use of data primarily collected for discovery or clinical research purposes, translational research seeks to uncover hidden trends and generate data-driven hypotheses that could accelerate the development of more effective, life-changing therapies. Insights from early research can help identify druggable molecular targets and enable the design of optimal therapeutics. Without foundational knowledge of the biological processes and molecular pathways underlying disease mechanisms, there would be no clinical pipeline. Moreover, clinical trial data (including demographics, pharmacotherapy, laboratory tests, treatment response, and toxicity data, assembled to monitor patient clinical outcomes and support regulatory submissions) can, if explored in conjunction with patient molecular profiles, shed light on predictive and prognostic signatures as well as markers of adverse events. Ultimately, such learnings can improve a product’s efficacy and safety by helping to refine patient cohorts, define (and adapt) dosing strategies, or design combination therapy solutions.

Translational research is often viewed as a linear process that bridges pre-clinical and clinical research. However, this paradigm, where one stage builds upon and informs another, can be applied anywhere in the R&D life cycle to drive data-informed decision-making. In addition, to enhance innovation, identify bottlenecks, and mitigate risks faster, data use can be expanded across research programs, clinical studies, indications, or assets. For instance, comparing information on drug candidates’ critical quality attributes collected at the discovery research or chemistry, manufacturing, and controls (CMC) stage could help select and prioritize the most efficacious, safe, and manufacturable assets while reducing the risk of costly late-stage failures. Historical data from bioprocess development could also be used to streamline routine operations and foster process innovation. Post-marketing surveillance data, collected to assess a drug’s performance and safety in the broader population, can help identify potential alternative uses of approved drugs, maximizing their clinical benefit and financial returns. These are just a few examples.

Applying the translational research paradigm at the institutional level for informed decision-making can generate significant value across the pharma industry value chain. Yet, this can be realized only if certain important data and organizational challenges are first addressed. 

Challenges of the Translational Research Approach  

Translational research involves close collaboration between distinct stakeholder groups. Data producers, governance and compliance experts, and data consumers with varying levels of technical know-how must work in synchrony, regularly exchanging knowledge to advance the overall team’s R&D projects. While some may be highly technologically adept, routinely using programming languages and advanced expert solutions to curate, process, and analyze data, others may require intuitive and user-friendly tools to leverage data. Moreover, translational research relies on cross-disciplinary interactions and data transfer across drug development stages, therapeutic areas, or modalities. Yet this exchange is often hampered: data is trapped in silos because the company’s units work separately from one another. Not only does each group apply its own protocols and selected analytical tools to generate data and data-driven insights, but it also stores them in its own systems, making data unavailable for broader use beyond their original scientific purpose. Improving data discoverability and access while ensuring sensitive patient data is protected can be difficult to realize in practice. Yet this is important, as the requirements of data consumers working at different points of the drug development pathway vary. For example, scientists at the discovery and non-clinical research stages rely on openly accessible data for flexible hypothesis testing and insight generation. Clinical researchers, on the other hand, need to respect patient data privacy and comply with stringent regulatory requirements. As results generated towards the final stages of drug development are submitted to regulatory authorities, any analyses of these data need to be conducted within a compliant environment.

Finally, because the data gathered across the R&D lifecycle are generated using diverse technologies, they are often heterogeneous in type and format. To be leveraged in cross-technology or cross-study analyses, data therefore require curation and harmonization. In addition, biopharma companies often collaborate with external partners such as Contract Research Organizations (CROs) to conduct exploratory analyses, receiving results in a format that may not comply with their ways of working. This may lead to further delays, as additional time is needed to structure the data before they can be integrated with other datasets for downstream analysis.

Catalyze Translational Research with Genedata Profiler 

To overcome these bottlenecks and successfully apply data for secondary use, biopharma companies can leverage Genedata Profiler®, a trusted translational research platform. Genedata Profiler democratizes knowledge and enhances interdisciplinary collaboration by centralizing access to multi-source data and making them FAIR (findable, accessible, interoperable, and reusable), while addressing the needs of the various stakeholder groups involved. In practice, it streamlines data capture and enables self-service integration and analysis of diverse data modalities, ultimately accelerating the generation of insights that present opportunities for innovation and increased productivity.

Facilitating Interdisciplinary Collaboration 

Enhanced interdisciplinary collaboration is pivotal for an effective translational research approach. Genedata Profiler offers a central, secure location where data and generated knowledge can be stored and effectively exchanged between all stakeholder groups. With the platform, data producers using the laboratory techniques and advanced technologies relevant to their research domain can deposit all their experimental results and publish them for use by other teams. Data can be imported manually, semi-automatically, or fully automatically from data buckets such as Amazon S3, or accessed through data virtualization, which federates data from external systems without duplication. The platform’s interoperability also allows data producers to feed data directly from Laboratory Information Management Systems (LIMS), Electronic Laboratory Notebooks (ELNs), Biological Sample Management (BSM) systems, Clinical Data Repositories (CDRs), and other systems used by the company or its partnering contract development and manufacturing organizations (CDMOs). Genedata Profiler equips data scientists to import data with associated metadata, so all datasets are correctly annotated and easier for different user groups to find from an intuitive Data Portal. Data loss is prevented as all data assets inherit the source file metadata throughout the data lifecycle. Within the platform, research projects are organized in project-specific folders, allowing relevant data files to be easily identified and shared. This provides a one-stop shop where all stakeholder groups can navigate across projects to find and access data, results, and workflows in a self-service manner.

Data Security & Regulatory Readiness 

To protect sensitive patient data, reduce the risk of data breaches, and prevent costly financial consequences, organizations can leverage the fine-grained access and data handling permissions of Genedata Profiler. Rather than placing control in the hands of one individual, the platform decentralizes data governance, allowing the creators of data products to regulate who can access them and which actions can be performed on them. By assigning different roles to stakeholders, governance managers can define what actions stakeholders can perform with the platform, which projects they can access, and what data-related activities they can handle within a project or study. For further control over who uses the data within Genedata Profiler, access to individual datasets can be configured for each stakeholder using access tags. Further along the drug development pathway, regulatory readiness becomes more important, and analyses must be conducted in a validated environment. To comply with regulatory requirements, biopharma and biotech companies need to provide authorities with documentation proving completion of the necessary tests (IQ: Installation Qualification, OQ: Operational Qualification, and PQ: Performance Qualification). Without prior knowledge, this whole process can become a bottleneck. With Genedata Profiler, quality and compliance managers can prepare the required documents faster using the provided templates. Because the platform has out-of-the-box functionalities (e.g. data integrity checks, record traceability, controlled workflows, and automated reporting) to meet common compliance requirements, quality and compliance managers get a head start, accelerating the delivery of clinical trial results to regulatory authorities.
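The combination of role-based permissions and dataset-level access tags described above can be illustrated with a minimal Python sketch. It is not Genedata Profiler's API: the role names, the `ROLE_ACTIONS` table, and the `User`, `Dataset`, and `is_allowed` names are all hypothetical, chosen only to show how the two checks can be layered.

```python
from dataclasses import dataclass, field

# Hypothetical mapping of roles to permitted actions (illustrative only,
# not the actual roles or actions of Genedata Profiler).
ROLE_ACTIONS = {
    "data_producer": {"read", "upload"},
    "data_scientist": {"read", "upload", "run_workflow", "publish"},
    "data_consumer": {"read"},
}


@dataclass
class User:
    name: str
    roles_by_project: dict                         # project name -> role name
    access_tags: set = field(default_factory=set)  # tags granted to this user


@dataclass
class Dataset:
    name: str
    project: str
    required_tags: set = field(default_factory=set)  # tags a user must hold


def is_allowed(user: User, dataset: Dataset, action: str) -> bool:
    """Return True only if the project role AND the access tags both permit it."""
    role = user.roles_by_project.get(dataset.project)
    if role is None:
        return False  # no role in this project: no access at all
    if action not in ROLE_ACTIONS.get(role, set()):
        return False  # the role does not include this action
    # The user must hold every tag the dataset requires.
    return dataset.required_tags <= user.access_tags


# Example with made-up users and datasets.
biologist = User("biologist", {"study_a": "data_consumer"}, {"deidentified"})
clinical = Dataset("clinical_outcomes", "study_a", {"deidentified", "clinical"})
omics = Dataset("expression_matrix", "study_a", {"deidentified"})

can_read_omics = is_allowed(biologist, omics, "read")
can_read_clinical = is_allowed(biologist, clinical, "read")
```

In this sketch the role decides which actions are possible within a project, while the tags narrow access to individual datasets, so a user can hold a valid role in a study and still be kept away from its most sensitive tables.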

Multimodal Data Integration 

To derive translational insights that spur R&D innovation, the diverse data modalities originating from non-clinical, clinical, and post-approval drug development stages often need to be integrated for further analysis. To enable this, raw, unstructured, or semi-structured data first need to be converted into a unified format. Genedata Profiler automates data processing and harmonization steps, creating virtual, queryable, and interoperable data tables. These tables can then be combined as needed, depending on the particular use case, to create an analysis-ready dataset for further exploration. Once published, the dataset is augmented with relevant metadata that provides data consumers with analytical context, guiding them to the right analytical tool for insight generation. Ultimately, this transforms various data into fit-for-purpose data products, allowing the same data to be leveraged by different data consumers in a variety of secondary applications and maximizing the value unlocked from organizational data. To streamline data product creation, data scientists can build workflows composed of domain-specific, out-of-the-box plugins. Alternatively, they can work programmatically using R, Python, or the command line. They can also run community-approved pipelines, such as those built with Nextflow by Seqera, within Genedata Profiler, ensuring correct metadata handling and data governance. For even greater flexibility, data wrangling and analytical tools can be developed in the data scientists’ preferred programming language in Genedata Profiler’s integrated development environment (IDE), benefiting from containerization, version control, traceability, and reproducibility. Once workflows for data processing and integration are approved, other members of the organization can re-run them autonomously from the web, ensuring process standardization and reproducibility of results.
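To make the idea of harmonization concrete, the following Python sketch maps two differently structured sources onto one shared schema and combines them into a single analysis-ready table that records where each row came from. It is a generic illustration, not the platform's implementation: the source names, column names, `COLUMN_MAPS`, `harmonize`, `combine`, and all sample values are made up for the example.

```python
# Hypothetical per-source column mappings onto one shared schema.
COLUMN_MAPS = {
    "clinical_csv": {"SUBJID": "subject_id", "RESP": "response"},
    "omics_tsv": {"patient": "subject_id", "gene_expr_TP53": "tp53_expression"},
}


def harmonize(rows, source):
    """Rename source-specific columns and normalize the subject identifier."""
    mapping = COLUMN_MAPS[source]
    harmonized = []
    for row in rows:
        record = {unified: row[raw] for raw, unified in mapping.items()}
        # Normalize identifiers so that " p001 " and "P001" match.
        record["subject_id"] = str(record["subject_id"]).strip().upper()
        # Keep simple provenance metadata alongside the data.
        record["_sources"] = [source]
        harmonized.append(record)
    return harmonized


def combine(*tables):
    """Merge harmonized tables on subject_id, accumulating provenance."""
    merged = {}
    for table in tables:
        for record in table:
            key = record["subject_id"]
            target = merged.setdefault(key, {"subject_id": key, "_sources": []})
            for column, value in record.items():
                if column == "_sources":
                    target["_sources"].extend(value)
                elif column != "subject_id":
                    target[column] = value
    return list(merged.values())


# Made-up example rows from two sources with different conventions.
clinical_rows = [{"SUBJID": " p001 ", "RESP": "CR"}, {"SUBJID": "P002", "RESP": "PD"}]
omics_rows = [{"patient": "p001", "gene_expr_TP53": 7.2}]

dataset = combine(
    harmonize(clinical_rows, "clinical_csv"),
    harmonize(omics_rows, "omics_tsv"),
)
```

Keeping the `_sources` field on every merged row is a deliberately simple stand-in for the metadata and lineage tracking described above; a production system would record far richer context.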

Facilitated Insight Generation for Decision-Making 

To enhance innovation potential within biopharma organizations, data consumers need easy access to relevant datasets and an analytical toolbox for flexible, self-service data exploration. Yet to use these efficiently, they first need to understand the context of the data: its provenance, transformation, and analytical purpose. Constant back-and-forth between data consumers such as biologists, the data producers who generate data, and the data scientists who convert data into data assets could slow down and hinder insight derivation. Genedata Profiler provides a single environment for all stakeholder groups, where data can be seamlessly deposited, curated, and consumed. Importantly, all stakeholders gain transparency over how data have been generated and transformed, as the platform provides the data lineage via automatically generated, easily retrievable, comprehensive reports. Data consumers, if authorized, can navigate between projects to discover the data products they are interested in via the platform’s Data Portal. The Data Portal’s intuitive functionalities enable them to preview and filter datasets according to their research questions before analysis and to identify suitable analytical tools for investigating their hypotheses through cross-study analysis. In Genedata Profiler, data consumers can choose from inbuilt analytical tools or integrated solutions. They can leverage point-and-click applications pre-configured by their data scientist colleagues, as well as business intelligence tools such as PowerBI, Qlik, or Spotfire, which connect through APIs. Programming experts can benefit from Genedata Profiler’s analytics environment for efficient exploration of high-dimensional data, for example with Posit or Jupyter Notebook.

Supporting AI Applications 

Standardized, analysis-ready data products and a well-documented data lineage are important for more than aiding consumption by subject-matter experts in their preferred tools. By bringing context to datasets, the data product creation process of Genedata Profiler makes data interpretable and usable by a broader community of users as well as by AI tools. This facilitates the secondary use of data. Genedata Profiler provides a single point of access to vast amounts of harmonized, standardized, and contextualized data on which AI applications can be trained and applied to facilitate scientific insight generation. Whether the task calls for association models to identify patient subpopulations from real-world data (RWD), machine-learning algorithms to discover predictive omics signatures, or deep-learning models for histology slide image analysis and segmentation, Genedata Profiler enables scientists to apply these models to large amounts of high-dimensional data by providing scalable and elastic computational power. Ultimately, this supports informed scientific, clinical, and operational decisions.

Final Remarks

With the increasing volume of R&D data and the availability of AI-based analytics, there is immense potential to drive innovation and boost productivity in the biopharmaceutical industry. Yet although data is a valuable resource, it often remains untapped. Leveraging data for secondary purposes presents a significant opportunity but requires addressing certain data and organizational limitations. A translational research platform like Genedata Profiler can play a crucial role here. By promoting data accessibility, integrity, and security, as well as streamlining interdisciplinary data collaboration, Genedata Profiler empowers companies to maximize the value of their data, accelerating scientific breakthroughs and enhancing R&D operational efficiency.

Visit our website to learn more about how to uncover actionable scientific insights from complex big data and elevate your R&D projects. 

 

1. Agrawal G., Kautzky J., Keane H., Parry B., Sartori V., and Silverstein A. Accelerating clinical trials to improve biopharma R&D productivity. McKinsey & Company, January 2024.