Artificial Intelligence in Biotechnology and Pharmaceuticals: Evolution, Applications, and Regulatory Frontiers

Artificial Intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to think, learn, and solve problems like humans. AI systems are designed to perform tasks such as recognizing speech, making decisions, translating languages, or identifying objects. AI is rapidly transforming the pharmaceutical and biotechnology industries, driving innovation across the entire drug development lifecycle, from target identification and molecular design to clinical trial optimization and regulatory review. By leveraging advanced machine learning algorithms, natural language processing, and big data analytics, AI enables faster, more precise decision-making and has the potential to significantly reduce the time and cost of bringing new therapies to market. This review explores the historical context and evolution of AI along with the current landscape of AI applications in pharma and biotech, highlighting key breakthroughs, emerging trends, and the challenges that must be addressed to fully harness the power of AI in advancing healthcare solutions. Table 1 lists the AI-related key terms and their definitions.

Table 1 AI-related key terms and their definitions

Historical Context and Evolution

The concept of AI dates to the 1950s, when Alan Turing introduced the Turing Test, a way to measure a computer’s intelligence [1]; the term “Artificial Intelligence” itself was introduced by John McCarthy in 1955 in an application for funding. It was not until the 1980s, however, that AI was applied in pharmaceutical research, where it assisted decision-making by simulating the problem-solving expertise of human specialists in specific domains such as chemistry or pharmacology [2]. In the 1990s, significant advances in machine learning and data mining made it possible to apply AI to biological data, allowing researchers to handle large datasets and uncover new insights about molecular interactions, genetic patterns, and disease mechanisms [3].

These advances continued through the 1990s and 2000s, as machine learning and data mining were applied to ever larger biological datasets. The completion of the Human Genome Project [4] in 2003 led to a surge in personalized medicine and in AI applications within genomics and bioinformatics, allowing researchers to mine and interpret complex genetic data [5]. Additionally, AI-driven virtual screening methods emerged, accelerating drug discovery by predicting how molecules would interact with biological targets, while machine learning also began to be used for drug repurposing, identifying new uses for existing drugs [6].

AI technology took a great leap in the 2010s with breakthroughs in deep learning, particularly convolutional and recurrent neural networks. These improved the analysis of complex biological data such as protein structures and chemical compounds [7, 8], making AI integral to drug discovery through protein folding predictions and structure-function relationship modeling [9, 10]. During this period, AI-driven biotech companies such as Insilico Medicine, Relay Therapeutics, Recursion Pharmaceuticals, Owkin, and Atomwise came into existence, as did systems such as DeepMind’s AlphaFold, and worked alongside major pharmaceutical companies to accelerate drug design and clinical trial optimization [10, 11]. Furthermore, AI played a key role in personalized medicine by tailoring treatment plans based on individual patient data, such as genomics and clinical history [12, 13]. Figure 1 illustrates the timeline of AI evolution in the pharma and biotech sector, along with current and emerging systems in which machine learning is expected to learn and apply knowledge across a broad range of tasks at a level similar to humans.

Fig. 1 Timeline of AI evolution in pharma and biotech, from rule-based beginnings to autonomous drug discovery, illustrating key milestones where artificial intelligence has transformed drug development, clinical research, and precision medicine

Current Integration of Artificial Intelligence

AI is rapidly making inroads into every sector. Its integration into the biotechnology and pharmaceutical sectors has been accelerating, transforming traditional processes and enabling more efficient drug development, precision medicine, and diagnostics [14] (Fig. 2). In drug design, advanced systems such as DeepMind’s AlphaFold predict protein structures with high precision, dramatically speeding up drug development [15]. In healthcare, AI enables earlier disease detection and improved diagnostic tools, especially for diseases like cancer and Alzheimer’s, through image recognition and biomarker analysis [16, 17]. Additionally, AI is enhancing the use of real-world data (e.g., from wearables and health records) to improve patient outcomes and drug effectiveness [18]. In biomanufacturing, AI is optimizing the production of biologic drugs, improving yield, cost-efficiency, and consistency [14, 19, 20]. For example, Cradle Bio applies machine learning to optimize protein engineering and the design of biomaterials, and Antiverse’s AI platform uses vast biological datasets to predict optimal antibody structures for therapeutic targets. These advancements promise to reduce the time and costs associated with drug development and to transform patient care. In recent years, AI technologies have been increasingly used to optimize clinical trials by improving patient recruitment, predicting outcomes, and designing more efficient studies [21]. Natural Language Processing (NLP) tools have enhanced the ability to mine scientific literature, electronic health records, and clinical trial data, providing novel ways to identify new biomarkers, drug interactions, and treatments [22].

Fig. 2 Key areas where AI is currently being integrated in the biotech and pharma sector. This figure illustrates the primary domains where artificial intelligence is transforming the pharmaceutical and biotechnology industries. From accelerating drug discovery and optimizing clinical trials to enhancing manufacturing and enabling personalized medicine, AI plays a pivotal role across the entire drug development lifecycle. Additional applications include real-world evidence analysis, diagnostic imaging, regulatory automation, and commercial strategy optimization

Genetic variations, either independently or combined with environmental factors, can alter gene expression profiles, disrupt protein metabolism processes, and lead to pathological changes associated with diseases. Analyzing changes in gene expression is essential to identify key genes and pathways related to disease progression, which can serve as potential targets for therapeutic intervention. High-throughput microarray and RNA sequencing technologies based on next-generation sequencing provide detailed insights into the transcriptome of cells or tissues. However, the high dimensionality and complexity of the data often limit the ability to extract meaningful information about specific biological processes of a disease. As a result, many researchers have shifted from traditional statistical methods to machine learning (ML) approaches, effectively revealing complex biological characteristics.
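As a rough illustration of how an ML approach can cope with high-dimensional expression data, the sketch below first keeps only the most variable genes and then classifies a new sample by its nearest class centroid. This is a deliberately simple stand-in, not a method taken from the studies cited in this review; the function names, the toy expression values, and the labels are all hypothetical.

```python
import math
from statistics import mean, pvariance


def top_variable_genes(samples, k):
    """Return the indices of the k genes with the highest variance across samples."""
    n_genes = len(samples[0])
    variances = [pvariance([s[g] for s in samples]) for g in range(n_genes)]
    return sorted(range(n_genes), key=lambda g: variances[g], reverse=True)[:k]


def fit_centroids(samples, labels, genes):
    """Compute the mean expression profile of each class over the selected genes."""
    centroids = {}
    for label in set(labels):
        rows = [s for s, lab in zip(samples, labels) if lab == label]
        centroids[label] = [mean(r[g] for r in rows) for g in genes]
    return centroids


def predict(sample, centroids, genes):
    """Assign the class whose centroid is closest in Euclidean distance."""
    profile = [sample[g] for g in genes]
    return min(centroids, key=lambda lab: math.dist(profile, centroids[lab]))


# Toy data (made up): rows are samples, columns are expression levels of 4 genes.
samples = [
    [5.0, 1.0, 9.0, 2.0],
    [5.1, 1.2, 8.8, 2.1],
    [5.0, 6.0, 3.0, 2.0],
    [4.9, 6.3, 2.9, 2.2],
]
labels = ["control", "control", "disease", "disease"]

genes = top_variable_genes(samples, k=2)          # reduce dimensionality first
centroids = fit_centroids(samples, labels, genes)
prediction = predict([5.0, 5.8, 3.2, 2.0], centroids, genes)
print(genes, prediction)
```

Real transcriptomic datasets contain thousands of genes and far fewer samples, so practical pipelines add normalization, cross-validation, and regularization on top of a filter-then-classify pattern like this one.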

For example, a group of researchers has developed a multimodal deep learning (DL) model capable of ascertaining the precise relationship between transcription factors (proteins that regulate gene activity) by leveraging an aspect of DNA called DNA breathing, in which the double-helix structure opens and closes spontaneously [23]. The model holds promise for aiding the development of drugs targeting diseases rooted in gene activity. Similarly, by leveraging AI’s data analysis and predictive power, gene editing techniques like CRISPR can achieve greater precision and effectiveness in targeting genes linked to various diseases [24]. CRISPR enables the modification of these genes, either to correct mutations or to optimize the patient’s treatment response based on their genetic profile. In another approach, a new generative AI method has been developed that precisely controls how genes are switched on or expressed in specific kinds of cells in the body, with potential applications in gene therapy [25].

Regulatory Guidelines for AI in Biotech and Pharmaceuticals

As AI’s role in drug development grows, regulatory agencies like the U.S. Food and Drug Administration (FDA) and European Medicines Agency (EMA) have started creating frameworks to guide its ethical and regulatory use in both drug development and healthcare applications. Given the rapid growth of AI applications in biotechnology and pharmaceuticals, having comprehensive regulatory frameworks in place becomes essential to ensure patient safety, data privacy, and effective product outcomes. By maintaining appropriate oversight, regulatory bodies help balance the innovation potential of AI with the need for patient safety and efficacy. Table 2 provides a summary of key regulatory guidelines for AI applications in the sector, categorized by relevant oversight bodies. These guidelines are constantly evolving as AI technology advances, and stakeholders must stay informed about the latest developments in regulatory frameworks to ensure compliance and mitigate risks.

Table 2 Key regulatory guidelines for AI applications

The U.S. FDA, Health Canada (HC), and the UK Medicines and Healthcare products Regulatory Agency (MHRA) have jointly identified 10 guiding principles that can inform the development of good machine learning practice (GMLP). The 10 guiding principles identify areas where the International Medical Device Regulators Forum (IMDRF), international standards organizations, and other collaborative bodies could work to advance GMLP. Areas of collaboration include research, creating educational tools and resources, international harmonization, and consensus standards, which may help inform regulatory policies and guidelines. Stakeholders can continue to provide feedback through the public docket (FDA-2019-N-1185) at Regulations.gov.

The AI applications or tools used in biotech and pharma sectors can be broadly categorized into regulated and unregulated domains, with each having distinct levels of oversight, risk management, and application scopes. Understanding the distinction between regulated and unregulated AI applications, along with their respective guidelines, is key to navigating the integration of AI into the biotechnology and pharmaceutical industries.

Regulated AI Applications

Regulated AI applications are subject to stringent oversight by relevant governmental or international regulatory bodies because they directly impact patient safety. In the pharmaceutical and biotechnology industries, these bodies include the U.S. FDA, the EMA in Europe, and other national health agencies. Regulated applications typically involve critical functions such as drug discovery, clinical trials, patient data handling, medical devices, and diagnostics, where the safety, efficacy, and privacy of the product or service are paramount. Examples include the use of AI in drug discovery, in diagnostics, and in personalized medicine. AI platforms like Insilico Medicine’s PandaOmics and Atomwise use AI models to predict potential drug candidates and optimize molecular designs. These platforms are becoming integral to pharmaceutical companies’ Research and Development (R&D) processes, but they must meet regulatory requirements to ensure that their predictions and the subsequent drug development processes adhere to safety standards. AI-powered tools such as Zebra Medical Vision and Aidoc provide diagnostic assistance for radiologists by analyzing medical imaging to detect conditions like cancer, hemorrhages, and fractures. These tools are regulated as medical devices and must comply with FDA or EMA standards for clinical accuracy and reliability. AI systems like Tempus use vast amounts of genomic data to guide personalized cancer treatments. These systems rely on AI algorithms to analyze genetic mutations and recommend targeted therapies, a process governed by medical regulations to ensure patient safety and compliance with data privacy laws.

Unregulated AI Applications

Unregulated AI applications, on the other hand, do not directly impact patient safety or involve critical medical decision-making, and therefore do not fall under the same stringent regulatory frameworks. These AI tools may still have significant value, but they typically operate in areas such as early-stage research, non-invasive diagnostics, or administrative functions within biotech and pharma companies. Examples include AI in drug repurposing and AI in pharma operations. AI-driven tools like Healx analyze existing drug databases to identify new therapeutic uses for approved drugs. While these AI systems play a crucial role in innovation, they typically do not require direct regulatory oversight unless the repurposed drug proceeds to clinical trials or becomes a medical device. AI applications such as Aizon in manufacturing or Aiforia for visualizing microscopy data aid operational efficiency, predictive maintenance, and quality control. These tools enhance the overall functioning of pharmaceutical companies but remain outside the realm of clinical or patient-facing applications, and thus require less regulation.

Challenges and Limitations Associated with AI

Despite the enormous potential for AI to advance R&D in biotech and pharmaceuticals, a complex and rapidly evolving set of challenges and limitations remains unresolved. These key issues include IP and data rights, privacy protection, the use of synthetic data, digital twins, the need for standards and guidelines for AI governance, and non-technical barriers to the adoption of AI [26, 27, 28].

Intellectual Property and Data Rights

Authorship and Ownership

Determining who owns AI-generated works is contentious, as conventional IP laws and existing regulations do not clearly address creations made by autonomous systems [26]. In addition, it is often difficult to ascertain the original data sources and prevent the unauthorized use of scientific data from these sources.

Patentability

The patentability of AI algorithms and inventions is under scrutiny, with debates on whether AI can be considered an inventor under current laws [26]. Current legal frameworks do not adequately protect researchers’ rights as inventors or co-inventors of AI-based technologies. For example, assigning attribution becomes even more fraught if the invention results from collaboration between AI tools and human researchers rather than human inventors alone.

Copyright Issues

AI’s reliance on training data, often sourced from copyrighted materials, raises concerns about potential copyright infringement and misappropriation of proprietary information [29, 30]. Beyond data ownership, access, and management, there are fundamental concerns about the responsibilities of AI tool providers to comply with established policies for protecting IP. Notably, OpenAI’s allegation that DeepSeek unlawfully used its AI models through “distillation” is ironic, given that OpenAI itself is alleged to have violated numerous copyrights in accessing third-party data to train its own AI models [31].

Privacy Protection

Because AI systems can process and analyze vast amounts of personal data, it is crucial to protect individuals’ privacy and reduce the risks of harmful misuse of their information [32].

Data Privacy

The collection and processing of personal data by AI systems must comply with regulations like GDPR and CCPA, which impose strict data handling requirements [27, 29]. Ensuring that AI systems respect and protect individuals’ privacy is vital for progress. Federated learning systems enable life sciences researchers to collaborate on patient data at scale while the data remain at their source sites, without restricting innovations that are only possible when analyzing large samples drawn from representative and diverse patient populations around the world [33, 34].
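The core idea behind federated learning can be sketched with federated averaging, a common scheme in which each site trains on its own data and only model parameters are sent to a coordinating server. The example below is a minimal sketch under simplifying assumptions: a one-parameter linear model, two hypothetical sites, and made-up measurements. It is not a description of the systems cited in [33, 34], and a production deployment would typically add safeguards such as secure aggregation.

```python
def local_update(weight, data, lr=0.01, epochs=20):
    """Run gradient descent for y ~ weight * x using one site's private data."""
    for _ in range(epochs):
        grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= lr * grad
    return weight


def federated_average(global_weight, sites, rounds=30):
    """Repeatedly send the global weight to each site and average the results.

    Only the updated weight and the sample count leave a site; the raw
    (x, y) records are never pooled on the server.
    """
    for _ in range(rounds):
        updates = [(local_update(global_weight, data), len(data)) for data in sites]
        total = sum(n for _, n in updates)
        # Weight each site's contribution by how many samples it holds.
        global_weight = sum(w * n for w, n in updates) / total
    return global_weight


# Hypothetical sites, each holding its own (dose, response) pairs.
site_a = [(1.0, 2.0), (2.0, 4.1)]
site_b = [(3.0, 6.2), (4.0, 7.9), (5.0, 10.1)]

w = federated_average(0.0, [site_a, site_b])
print(round(w, 3))
```

Both toy sites follow roughly the same trend (a slope near 2), and the shared model recovers that trend even though neither site sees the other's records.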

Data Security

Safeguarding the data used by AI systems from unauthorized access and breaches is essential. The sensitivity and volume of data handled by AI systems make them attractive targets for cyberattacks. In healthcare settings, “medjacking” or medical device hijacking of devices and equipment is a particular concern and can place patients at risk [32].

Use of Synthetic Data

Synthetic data is data generated artificially rather than collected from real-world processes, and it offers a way to address some of the IP, data rights, and privacy issues raised earlier [35, 36].

Advantages

Synthetic data can help protect privacy, as it does not contain real personal information. It can also be used to augment training datasets, enabling the development of more robust AI models [35]. This can accelerate the implementation of new AI tools.

Disadvantages

The quality and representativeness of synthetic data are critical. Poorly generated synthetic data can lead to biased or ineffective AI models. The methodology employed to create synthetic data requires careful validation to ensure it adequately reflects the rich complexities of real-life data without an undue loss of fidelity [36].
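One simple fidelity check of the kind described above is to compare the distribution of a variable in the real data with its distribution in the synthetic data. The sketch below computes the two-sample Kolmogorov-Smirnov statistic, the largest gap between the two empirical cumulative distribution functions, where 0 means the samples are distributed identically and 1 means they do not overlap at all. The biomarker values are made up for illustration, and a single-variable check like this is only one part of a proper validation, which would also examine correlations between variables and downstream model performance.

```python
from bisect import bisect_right


def ks_statistic(real, synthetic):
    """Return the maximum absolute gap between the two empirical CDFs."""
    real_sorted, synth_sorted = sorted(real), sorted(synthetic)
    gap = 0.0
    # The largest gap between two step functions occurs at an observed value.
    for v in real_sorted + synth_sorted:
        cdf_real = bisect_right(real_sorted, v) / len(real_sorted)
        cdf_synth = bisect_right(synth_sorted, v) / len(synth_sorted)
        gap = max(gap, abs(cdf_real - cdf_synth))
    return gap


# Hypothetical biomarker measurements (made-up values).
real = [4.8, 5.1, 5.3, 5.6, 6.0]
good_synthetic = [4.9, 5.0, 5.4, 5.5, 6.1]   # similar spread to the real data
poor_synthetic = [7.0, 7.2, 7.5, 7.9, 8.1]   # shifted away from the real data

ks_good = ks_statistic(real, good_synthetic)
ks_poor = ks_statistic(real, poor_synthetic)
print(ks_good, ks_poor)
```

A generator that produced the shifted sample would be flagged by this check, while the closer sample would pass on to more demanding tests.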

Digital Twins

Digital twins are virtual replicas of physical entities such as tissues, organs, or even entire patients that enable real-time planning, monitoring, simulation, and testing for R&D purposes [26].

Data Integration

Creating an accurate digital twin requires integrating data from multiple sources, including sensors, IoT devices, and historical databases such as patient records. Ensuring interoperability and consistency across these data sources can be challenging [29].

Model Accuracy

The effectiveness of a digital twin depends on the accuracy of the underlying models including the representativeness of the sample(s) used. Inaccurate or incomplete models can lead to poor decision-making and suboptimal outcomes [32].

Need for Standards and Guidelines for AI Governance

A pressing issue in the field of AI is the lack of standardized frameworks and guidelines for AI governance. The rapid advancement of AI technology has far outpaced the development of traditional regulatory frameworks, leading to a fragmented and inconsistent policy landscape that compounds the risks and uncertainties associated with utilizing AI [27, 29, 37].

Establishing Standards

There is an urgent need for international standards that define best practices for the development, deployment, and management of AI systems. These standards should address issues such as fairness, transparency, accountability, and security. The FDA recognizes the increased use of AI throughout the drug product life cycle across a range of therapeutic areas and has released a series of draft guidance reports [26].

Guidelines for Ethical AI

Clear guidelines are needed to ensure that AI systems are developed and used ethically. This includes addressing biases in AI algorithms, ensuring transparency in AI decision-making processes, and promoting accountability for the outcomes of AI-driven actions. The EU Artificial Intelligence Act (AI Act) is a law that regulates the development, use, and deployment of AI in the European Union. The Act aims to make AI safer, more ethical, and more transparent [30, 33, 39, 40].

Non-Technical Barriers to AI Adoption

Workforce Readiness

The adoption of AI in R&D for biotech and pharmaceuticals requires a skilled workforce that is capable of utilizing AI. There is currently a substantial gap between the demand for AI expertise and the supply of qualified professionals. Industry and academia must be prepared to invest in education and training programs, which are crucial to bridging this gap [40].

Cultural Resistance

Organizational culture can be a barrier to AI adoption for R&D teams. Employees and stakeholders may resist changes brought about by AI due to fear of job loss, lack of trust, as well as a healthy skepticism about the technology’s benefits. Creating small AI experiments within an organization to acclimate R&D teams to the potential impacts and possible unintended consequences is essential for overcoming these barriers [39, 41, 42].

Ethical Concerns

Ethical considerations, such as the impact of AI on employment, social inequalities, and human rights, can also impede AI adoption [29, 39, 42]. Developing AI systems that are aligned with societal values and ensuring that their deployment does not exacerbate existing inequalities is an important priority for the biotech and pharmaceuticals industry.

Conclusion and Recommendations

This review underscores the transformative role of Artificial Intelligence (AI) in reshaping the biotechnology and pharmaceutical industries. From revolutionizing drug discovery and diagnostics to enabling precision medicine and optimizing clinical trials, AI is accelerating progress across the entire biomedical pipeline. The integration of generative AI with genome editing technologies and personalized medicine approaches holds significant promise for decoding complex biological systems and tailoring treatments to individual genetic profiles.

Tracing the evolution of AI in this field—from early adoption in the 1980s to the surge of machine learning and data mining in the 1990s and 2000s—the paper highlights how milestones such as the Human Genome Project catalyzed a new era of genomics and bioinformatics. The 2010s brought deep learning to the forefront, with convolutional and recurrent neural networks driving major advancements in drug development and clinical research.

Despite these achievements, challenges remain. Issues such as intellectual property rights, data privacy, and the lack of standardized practices highlight the critical need for robust regulatory frameworks and ethical oversight. Regulatory bodies like the FDA and EMA are actively working to distinguish between regulated AI—used in high-stakes applications like clinical trials and diagnostics—and unregulated uses in early research or administrative tasks, where the risks to patient safety are minimal.

Future research and development in AI applications will increasingly focus on integrating machine learning with genomics, precision medicine, and drug discovery. AI algorithms can identify disease-associated genetic variants, map gene regulatory networks, and predict the functional effects of mutations, offering deeper insights into complex biological systems. Generative AI, combined with genome editing technologies and multi-omics data, is poised to accelerate personalized treatment strategies tailored to individual genetic profiles. Emphasis will also grow on explainable AI, federated learning for secure data sharing, and interdisciplinary collaboration to ensure ethical, transparent, and impactful clinical implementation.

Overall, the responsible and innovative application of AI in biotech and pharma holds the potential to revolutionize healthcare, but it must be guided by clear ethical standards and regulatory alignment to realize its full potential.
