Frontiers in Computational Chemistry: Volume 7

Frontiers in Computational Chemistry (Volume 7) offers a comprehensive overview of the latest advances in molecular modeling techniques for drug discovery and development. This book focuses on key computational approaches such as rational drug design, adsorption studies, quantum mechanical calculations, and molecular interactions in drug development. It provides insights into lead generation, optimization, and the creation of novel chemical entities targeting various biological mechanisms, including inflammation.
The chapters explore modern computational tools and their applications, particularly in low- and middle-income countries (LMICs). The book is essential for researchers, academics, and professionals in computational chemistry, molecular modeling, and pharmaceutical sciences.

Readership: Students and researchers.


Page count: 464

Year of publication: 2024




Table of Contents
BENTHAM SCIENCE PUBLISHERS LTD.
End User License Agreement (for non-institutional, personal use)
Usage Rules:
Disclaimer:
Limitation of Liability:
General:
PREFACE
List of Contributors
In Silico Tools to Leverage Rational Drug Design and Development in LMICs
Abstract
INTRODUCTION
DRUG DISCOVERY PROCESS: AN OVERVIEW
ADVANTAGES AND DISADVANTAGES OF RATIONAL DRUG DISCOVERY APPROACHES
EARLY DRUG DEVELOPMENT: IN SILICO TOOLS
Therapeutic Targets: Identification, Prioritization and Validation
Computational Methods to Aid Target Identification
Text Mining: Identification of Disease-Associated Entities
Microarray Data Mining
Open Proteomics
Chemogenomics
Integrated Data Mining
Computational Methods to Aid Target Prioritization
Computational Methods to Aid Target Validation
Identification and Optimization of Drug Candidates
Novel Drug Development vs. Repurposing
VLS and Drug Design
Compound Libraries
Hit-to-Lead and Lead Optimization Tools
Additional Chemoinformatics Tools
PHARMACOKINETICS AND PHARMACODYNAMICS PREDICTION
ARTIFICIAL INTELLIGENCE (AI) FOR DRUG DISCOVERY
COST-EFFECTIVE AND COST/BENEFIT RATIOS
EXAMPLES OF COMPUTATIONAL TOOLS APPLIED TO DRUG DISCOVERY IN LMICs
CONCLUDING REMARKS
LIST OF ABBREVIATIONS
REFERENCES
The Computational Chemistry in Adsorption Studies: The Cases of Drug Carriers and Biosensors
Abstract
INTRODUCTION
ADSORPTION
Types of Adsorption
Chemisorption
Physisorption
Adsorbents
Adsorbent Materials
Activated Carbon
Low-Dimensional Structures: Carbon-Based Structures
Low-Dimensional Structures: Boron-Nitride Structures
Implication of Adsorption in Drug Carriers and Biosensors
The Computational Chemistry and The Adsorption Processes
Electronic Parameters to Elucidate the Adsorption Phenomena
Application of Computational Chemistry for Drug Carriers Development
Application of the Computational Chemistry for Biosensors
CONCLUSION
ACKNOWLEDGEMENTS
REFERENCES
Perspective on the Role of Quantum Mechanical Calculations on Cellular Molecular Interactions
Abstract
INTRODUCTION
Quantum Mechanics
Background
Studies of QM in Biological Systems
Metallic Bonding and Geometry in Organometallics
Evidence for Quantum Chemical Effects in Receptor-Ligand Binding between Integrin and Collagen Fragments — A Computational Investigation With an Impact on Tissue Repair, Neuro-oncology and Glycobiology
Eckert, et al., 2021 [34]
Correlation between Biological Activity and Binding Energy in Systems of Integrin with Cyclic RGD-Containing Binders: A QM/MM Molecular Dynamics Study
Xiang, et al., 2012 [43]
Metal Ion Dependent Adhesion Sites in Integrins: A Combined DFT and QMC Study on Mn2+
San Sebastian et al., 2007 [53]
Small-Molecule Inhibitors of Integrin α2β1 that Prevent Pathological Thrombus Formation via an Allosteric Mechanism
Miller, et al., 2009 [68]
A Quantum Mechanical Analysis of RGD-Integrin Binding via DFT and Semi-Empirical Methods
Photoinduced Electron Transfer
Density Functional Theory Based Analysis of Photoinduced Electron Transfer in a Triazacryptand-Based K+ Sensor
Briggs and Besley, 2015 [89]
Blocking the Dark State as Sensing Mechanism of 3-Nitro-1,8-Naphthalimide Derivatives for Detection of Carbon Monoxide in Living Cells
Yu et al., 2021 [99]
Mechanisms: Bond Breaking and Bond Formation
Succinimide Formation from an NGR-Containing Cyclic Peptide: Computational Evidence for Catalytic Roles of Phosphate Buffer and the Arginine Side Chain
Kirikoshi et al., 2017 [128]
Glycolic Acid-Catalyzed Deamidation of Asparagine Residues in Degrading PLGA Matrices: A Computational Study
Manabe et al., 2015 [135]
Subsystem Methods
Divide and Conquer Method
Pose Scoring by NMR
Wang et al., 2004 [148]
Fragment Molecular Orbitals Approach
Fragment Molecular Orbital Calculations with Implicit Solvent Based on the Poisson–Boltzmann Equation: II. Protein and Its Ligand-Binding System Studies
Okiyama, et al., 2018 [161]
Using the Fragment Molecular Orbital Method to Investigate Agonist–Orexin-2 Receptor Interactions
Heifetz et al., 2016 [170]
Fragment Molecular Orbital Based Interaction Analyses on COVID-19 Main Protease − Inhibitor N3 Complex (PDB ID: 6LU7)
Hatada et al., 2020 [176]
X-Pol Model
Semi-Empirical Methods
Comparison of ab Initio, DFT, and Semiempirical QM/MM Approaches for Description of Catalytic Mechanism of Hairpin Ribozyme
Mlynsky, et al., 2014 [210]
Semi-Empirical and Linear-Scaling DFT Methods to Characterize Duplex DNA and G-Quadruplexes in the Presence of Interacting Small Molecules
Ortiz de Luzuriaga, et al. 2022 [225]
Advances in Computer Science
Machine Learning in Quantum Chemistry
Solvation Free Energy Calculations with Quantum Mechanics/Molecular Mechanics and Machine Learning Models
Zhang et al., 2018 [245]
Quantum Computing and Machine Learning Algorithms
Summary
Acknowledgements
REFERENCES
Computational Approaches in Evaluating the 5-HT Subtype Receptor Mechanism of Action for Developing Novel Chemical Entities
Abstract
INTRODUCTION
MULTISCALE MOLECULAR MODELING
FRAGMENT-BASED LEAD DISCOVERY FOR GPCR-BASED DRUG DISCOVERY
Accurate and Dependable Computational Methods for Lead Discovery
Molecular Dynamics Simulations and Free Energy Perturbation Approach (MD/FEP+)
MOLECULAR MODELING MECHANISMS FOR LIGAND DESIGN
Homology Modeling, Docking, Dynamics, and the Role of Mutations/Non-mutations in Ligand Binding and Specificity
Specific Databases for Pharmacological Analysis and Analogues for 5-HT Receptor
Quantum Mechanics (QM) for Identifying Critical Binding Site Residues and in GPCR Studies
Hartree-Fock (HF)
Density Functional Theory (DFT)
Configuration Interaction (CI)
Coupled Cluster (CC)
Moller-Plesset Perturbation Theory (MP2)
CONCLUSION
ACKNOWLEDGEMENTS
LIST OF ABBREVIATIONS
REFERENCES
Current Trends in Molecular Modeling to Discover New Anti-inflammatory Drugs Targeting mPGES-1
Abstract
INTRODUCTION
PHYSIOLOGY OF INFLAMMATION
Pharmacological Intervention
mPGES-1 Functions
TARGETING MPGES-1 INHIBITION
Structure and Functions
Catalytic Mechanism
MOLECULAR MODELING APPROACHES TO DISCOVER MPGES-1 INHIBITORS
Virtual Screening
Molecular Docking and Dynamics
Pharmacophore Modeling
Fragment-Based Drug Design (FBDD)
Quantitative Structure-Activity Relationship (QSAR)
FINAL CONSIDERATIONS AND FUTURE OUTLOOKS
CONFLICTS OF INTEREST
ACKNOWLEDGEMENTS
REFERENCES
Frontiers in Computational Chemistry
(Volume 7)
Edited by
Zaheer Ul-Haq
Dr. Panjwani Center for Molecular Medicine and Drug Research
International Center for Chemical and Biological Sciences
University of Karachi
Karachi, Pakistan
&
Angela K. Wilson
Department of Chemistry
Michigan State University
East Lansing, MI
USA

BENTHAM SCIENCE PUBLISHERS LTD.

End User License Agreement (for non-institutional, personal use)

This is an agreement between you and Bentham Science Publishers Ltd. Please read this License Agreement carefully before using the book/echapter/ejournal (“Work”). Your use of the Work constitutes your agreement to the terms and conditions set forth in this License Agreement. If you do not agree to these terms and conditions then you should not use the Work.

Bentham Science Publishers agrees to grant you a non-exclusive, non-transferable limited license to use the Work subject to and in accordance with the following terms and conditions. This License Agreement is for non-library, personal use only. For a library / institutional / multi user license in respect of the Work, please contact: [email protected].

Usage Rules:

All rights reserved: The Work is the subject of copyright and Bentham Science Publishers either owns the Work (and the copyright in it) or is licensed to distribute the Work. You shall not copy, reproduce, modify, remove, delete, augment, add to, publish, transmit, sell, resell, create derivative works from, or in any way exploit the Work or make the Work available for others to do any of the same, in any form or by any means, in whole or in part, in each case without the prior written permission of Bentham Science Publishers, unless stated otherwise in this License Agreement.

You may download a copy of the Work on one occasion to one personal computer (including tablet, laptop, desktop, or other such devices). You may make one back-up copy of the Work to avoid losing it.

The unauthorised use or distribution of copyrighted or other proprietary content is illegal and could subject you to liability for substantial money damages. You will be liable for any damage resulting from your misuse of the Work or any violation of this License Agreement, including any infringement by you of copyrights or proprietary rights.

Disclaimer:

Bentham Science Publishers does not guarantee that the information in the Work is error-free, or warrant that it will meet your requirements or that access to the Work will be uninterrupted or error-free. The Work is provided "as is" without warranty of any kind, either express or implied or statutory, including, without limitation, implied warranties of merchantability and fitness for a particular purpose. The entire risk as to the results and performance of the Work is assumed by you. No responsibility is assumed by Bentham Science Publishers, its staff, editors and/or authors for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products instruction, advertisements or ideas contained in the Work.

Limitation of Liability:

In no event will Bentham Science Publishers, its staff, editors and/or authors, be liable for any damages, including, without limitation, special, incidental and/or consequential damages and/or damages for lost data and/or profits arising out of (whether directly or indirectly) the use or inability to use the Work. The entire liability of Bentham Science Publishers shall be limited to the amount actually paid by you for the Work.

General:

Any dispute or claim arising out of or in connection with this License Agreement or the Work (including non-contractual disputes or claims) will be governed by and construed in accordance with the laws of Singapore. Each party agrees that the courts of the state of Singapore shall have exclusive jurisdiction to settle any dispute or claim arising out of or in connection with this License Agreement or the Work (including non-contractual disputes or claims).

Your rights under this License Agreement will automatically terminate without notice and without the need for a court order if at any point you breach any terms of this License Agreement. In no event will any delay or failure by Bentham Science Publishers in enforcing your compliance with this License Agreement constitute a waiver of any of its rights.

You acknowledge that you have read this License Agreement, and agree to be bound by its terms and conditions. To the extent that any other terms and conditions presented on any website of Bentham Science Publishers conflict with, or are inconsistent with, the terms and conditions set out in this License Agreement, you acknowledge that the terms and conditions set out in this License Agreement shall prevail.

Bentham Science Publishers Pte. Ltd. 80 Robinson Road #02-00 Singapore 068898 Singapore Email: [email protected]

PREFACE

Computational Chemistry has evolved into a multifaceted discipline, encompassing a wide range of applications from understanding protein-ligand interactions to the development of large nano-carriers for drugs. Frontiers in Computational Chemistry aims to present comprehensive material on the application of computational techniques in biological and chemical processes. This includes computer-aided molecular design, drug discovery and delivery, lead generation and optimization, quantum and molecular mechanics, computer and molecular graphics, as well as the creation of new computational methods and efficient algorithms for simulating a wide range of biophysical and biochemical phenomena, particularly in analyzing biological or chemical activity.

In this volume, we explore five distinct perspectives on the application of simulation methods in drug design and discovery, biosensing, and the elucidation of cellular molecular interactions:

Chapter 1, "In Silico Tools to Leverage Rational Drug Design and Development in LMICs," underscores the significant impact of computational tools on drug discovery and development, especially in low- and middle-income countries. This chapter highlights various strategies for drug target selection, optimization of novel drug candidates, and cost-effective drug repurposing.

Chapter 2, "Computational Chemistry in Adsorption Studies: The Cases of Drug Carriers and Biosensors," explores the role of computational methods in designing nanomaterials for drug carriers and biosensors. It provides an overview of adsorption processes, with examples of adsorbent materials (e.g., activated carbon) and the main interactions in adsorbate-adsorbent complex formation, supported by density functional theory.

Chapter 3, "Perspective on the Role of Quantum Mechanical Calculations on Cellular Molecular Interactions," examines how quantum mechanical calculations enhance our understanding of cellular interactions, including metal interactions and hydrogen bonding. The chapter emphasizes the importance of these calculations in studying the Arg-Gly-Asp (RGD) sequence, crucial for cellular binding to the extracellular matrix (ECM). Since cell adhesion to the ECM occurs via integrin-RGD binding, these calculations significantly impact our understanding of cellular adhesion and movement along the ECM.

Chapter 4, "Computational Approaches in Evaluating the 5-HT Subtype Receptor Mechanism of Action for Developing Novel Chemical Entities," focuses on molecular modeling techniques for studying G-protein coupled receptors (GPCRs) and 5-HT receptors related to neurological disorders.

Chapter 5, "Current Trends in Molecular Modeling to Discover New Anti-inflammatory Drugs Targeting mPGES-1," highlights the latest advances in computational methods for designing anti-inflammatory drugs targeting mPGES-1. Chapters 4 and 5 both cover the application of various computational methods, including homology modeling, docking, dynamics, and quantum mechanical/molecular mechanical (QM/MM) approaches for their respective targets.

We hope this volume provides valuable insights and shares advancements in the field of computational chemistry, demonstrating its essential role in the ongoing quest for innovative solutions in drug design and development.

Zaheer Ul-Haq
Dr. Panjwani Center for Molecular Medicine and Drug Research
International Center for Chemical and Biological Sciences
University of Karachi
Karachi, Pakistan
&
Angela K. Wilson
Department of Chemistry
Michigan State University
East Lansing, MI, USA

List of Contributors

Anne Dayse Soares da Silva: Cesmac University Center, Pharmacy Department, Maceió, Brazil
Arushi Chauhan: Department of Biophysics, Postgraduate Institute of Medical Education and Research, Chandigarh, India
Binglin Sui: Department of Chemistry, University of North Dakota, Grand Forks, ND 58202-9024, USA
Daniel Calazans Medeiros: Cesmac University Center, Pharmacy Department, Maceió, Brazil
Erwin García-Hernández: Tecnológico Nacional de México Campus Zacapoaxtla, Subdirección de Posgrado e Investigación, División de Mecatrónica, Zacapoaxtla, Puebla, México
Georgina A. Cardama: Center of Molecular and Translational Oncology, National University of Quilmes, Bernal, Argentina
Igor José dos Santos Nascimento: Postgraduate Program of Pharmaceutical Sciences, Pharmacy Department, State University of Paraíba, Campina Grande, Brazil
Mouhmad Elayyan: Department of Chemistry, University of North Dakota, Grand Forks, ND 58202-9024, USA
Mark R. Hoffmann: Department of Chemistry, University of North Dakota, Grand Forks, ND 58202-9024, USA
Marianny de Souza: Cesmac University Center, Pharmacy Department, Maceió, Brazil
Paula L. Bucci: Institute of Sustainable Processes, University of Valladolid, Valladolid, 47011, Spain
Pramod K. Avti: Department of Biophysics, Postgraduate Institute of Medical Education and Research, Chandigarh, India
Ricardo Olimpio de Moura: Postgraduate Program of Pharmaceutical Sciences, Pharmacy Department, State University of Paraíba, Campina Grande, Brazil
Washley Phyama De Jesus Marinho: Drug Development and Synthesis Laboratory, Department of Pharmacy, State University of Paraíba, Campina Grande 58429-500, Brazil
Yvnni Maria Sales de Medeiros e Silva: Postgraduate Program of Pharmaceutical Sciences, Pharmacy Department, State University of Paraíba, Campina Grande, Brazil

In Silico Tools to Leverage Rational Drug Design and Development in LMICs

Paula L. Bucci1, Georgina A. Cardama2,*
1 Institute of Sustainable Processes, University of Valladolid, Valladolid, 47011, Spain
2 Center of Molecular and Translational Oncology, National University of Quilmes, Bernal, Argentina

Abstract

Drug discovery and development is a time-consuming, complex, and expensive process. Even in the best scenario, it takes about 15 years, since drug candidates have a high attrition rate. Therefore, drug development projects rarely take place in low- and middle-income countries (LMICs). Traditionally, this process consists of four sequential stages: (1) target identification and early drug discovery, (2) preclinical studies, (3) clinical development, and (4) review, approval and monitoring by regulatory agencies.

During the last decades, computational tools have offered interesting opportunities for research and development (R&D) in LMICs, since these techniques are affordable, reduce wet lab experiments in the first steps of the drug discovery process, reduce animal testing by aiding experiment design, and also provide key knowledge involving clinical data management as well as statistical analysis.

This book chapter aims to highlight different computational tools to enable early drug discovery and preclinical studies in LMICs for different pathologies, including cancer. Several strategies for drug target selection are discussed: identification, prioritization and validation of therapeutic targets; particularly focusing on high-throughput analysis of different “omics” approaches using publicly available data sets. Next, strategies to identify and optimize novel drug candidates as well as computational tools for cost-effective drug repurposing are presented. In this stage, chemoinformatics is a key emerging technology. It is important to note that additional computational methods can be used to predict possible uses of identified human-aimed drugs for veterinary purposes.

Application of computational tools is also possible for predicting pharmacokinetics and pharmacodynamics as well as drug-drug interactions. Drug safety is a key issue and it has a profound impact on drug discovery success.

Finally, artificial intelligence (AI) has also served as a potential tool for drug design and discovery, expected to be a revolution for drug development in several diseases.

It is important to note that the development of drug discovery projects is feasible in LMICs and in silico tools are expected to potentiate novel therapeutic strategies in different diseases.

Keywords: Artificial intelligence, Bioinformatics, Chemoinformatics, Computational tools, Diseases, Drug design, Drug-drug interactions, In silico, Low and middle income countries, Novel therapeutic strategies, Omics, Target identification.
*Corresponding author Georgina A. Cardama: Center of Molecular and Translational Oncology, National University of Quilmes, Bernal, Argentina; E-mail: [email protected]

INTRODUCTION

A typical drug discovery process is long, expensive and complex. It traditionally consists of four sequential stages: (1) target identification and early drug discovery, (2) preclinical studies, (3) clinical development and (4) review, approval and monitoring by regulatory agencies (Fig. 1).

Fig. (1) The drug discovery process encompasses target identification and validation, early drug discovery, preclinical studies, clinical development, and review, approval, and monitoring by regulatory agencies.

The high costs and lengthy timelines associated with drug development have been well documented in the literature. One of the latest reports estimates a mean value of $1.3 billion in R&D investment required to bring a new therapeutic agent to market, with significant variation by therapeutic area. For example, the cost per drug for nervous system agents is around $765 million, while anticancer and immunomodulating agents can cost about $2.7 billion per drug [1]. Prior research has estimated that preclinical costs account for, on average, 42.9% of total capitalized drug development costs [2]. Additionally, the development time of a typical innovative drug is around 12 to 15 years, with 7 to 9 years typically spent in the early drug discovery and preclinical phases [3].

The complexity of the drug development process has also been well-established. This process requires expertise from various scientific fields, including chemistry, pharmacy, physics, biochemistry, and medicine [4]. Despite these challenges, there remains a compelling necessity to search for new therapeutic agents to address a plethora of unmet medical needs worldwide, including certain types of cancer, rare diseases, neglected tropical diseases, antibiotic resistance, immunological disorders, and neurodegenerative diseases.

Importantly, research has highlighted the imbalance between disease burden and global health research attention; with diseases more prevalent in high-income countries receiving significantly more research focus. This imbalance contributes to widening healthcare access inequalities globally [5, 6]. In line with this, it is important for low- and middle-income countries (LMICs) to pursue research projects addressing their own unmet medical needs that otherwise would not be tackled.

Overall, modern drug discovery has shifted towards more rational, knowledge-driven approaches that leverage computational tools, structural biology, and a deeper understanding of disease mechanisms. These strategies aim to improve the efficiency and success rate of identifying promising therapeutic candidates. They include computational and in silico techniques like virtual screening and molecular modeling, which allow researchers to rapidly evaluate large chemical libraries and identify promising compounds even before physical screening. Fragment-based drug discovery, which starts with smaller molecular fragments, and target-based approaches that design compounds to modulate specific disease-implicated proteins, have also become more prevalent. Additionally, drug repurposing strategies that leverage existing approved drugs have proven to be a rational and efficient path to new indications.

This book chapter aims to highlight different valuable computational tools that can be used to accelerate research and reduce drug development costs, particularly focusing on early drug development stages. It provides an overview of rational drug discovery strategies, including computational techniques, fragment-based design, target-based screening, and drug repurposing. This chapter seeks to provide a guide for researchers, particularly those without extensive computational expertise, to explore the various rational tools and techniques that can be applied to advance their drug discovery projects, with a focus on anticancer drug development.

DRUG DISCOVERY PROCESS: AN OVERVIEW

As mentioned before, the drug discovery process is long and complicated. Before starting a drug discovery project, there is a pre-discovery phase in which a disease or condition with an unmet medical need is identified. This phase entails the study of the underlying mechanisms of the disease and usually includes a review of the state of the art and the gathering of detailed information about the molecular basis of the pathology, if available. This can lead to the hypothesis that inhibition or activation of one or more components of this characterized mechanism will result in a change in cell behavior and affect disease progression in a particular patient population. These components become putative molecular targets [7].

As a first step of the drug discovery process, it is important to identify the most suitable molecular target. In one particular disease, there may be several putative targets, but many may be undruggable, while others may be involved in a particular disease-associated mechanism whose modulation may not have a direct effect on disease progression. Selecting the right target is extremely challenging. In fact, it has been shown that one of the main reasons for the lack of clinical efficacy of novel drugs is poor target validation and selection [8]. Some key recommendations are defined by the Guidelines on Target Assessment for Innovative Therapeutics (GOT-IT) working group, in a valuable document that gives a robust framework for the process of target selection and prioritization, especially for academic research [9].

Upon target selection, the next step is to identify novel molecules able to modulate its activity in order to interfere with or prevent the progression of the disease. In this early drug discovery phase, several strategies can be used and combined, such as high throughput screening (HTS) and computer-aided drug design (CADD), among others. After identifying a candidate, known as a hit, the molecule undergoes systematic modifications of its structure to enhance potency, improve its physicochemical characteristics, and reduce the possibility of unwanted effects. After rounds of testing, the candidate drug is identified and the stage of preclinical testing starts. This promising candidate is evaluated in different in vitro and in vivo models to determine preclinical efficacy, safety, tolerability, pharmacodynamics (PD) and pharmacokinetics (PK), as well as possible biomarkers to be used in a putative clinical setting. If the candidate drug is effective and safe, it is eligible to be taken forward for clinical trials. In the clinical development stage, safety and tolerability are tested in Phase I, while efficacy and dosing are determined in Phase II trials. Finally, in Phase III, the efficacy of the drug candidate is evaluated in a larger patient population. Drug candidates that show therapeutic effectiveness, safety, and adequate pharmaceutical quality are reviewed by regulatory agencies and may be approved. Finally, these approved new drugs are subjected to follow-up studies (Phase IV), which can change the labeling of the novel drug, or include other observations such as drug-drug interactions [10].

ADVANTAGES AND DISADVANTAGES OF RATIONAL DRUG DISCOVERY APPROACHES

All the aforementioned rational approaches serve as a valuable roadmap for modern drug development projects. However, it is important to note that the discovery of many drugs known and used today was not always based on a pre-existing idea of specific targets and pharmacological mechanisms for a particular disease. Rather, a significant portion of drug development has historically followed a more empirical, serendipitous paradigm.

The rational approaches to drug discovery that have emerged in recent years offer several key advantages but also come with some potential drawbacks that must be carefully considered.

On the positive side, these rational methods have demonstrated improved efficiency and success rates in identifying viable lead compounds. By leveraging computational screening, fragment-based design, and a deeper understanding of disease biology and molecular targets, researchers can narrow down the pool of candidates before starting wet lab testing. This can accelerate the identification of mechanism-based therapeutics that are more likely to have favorable pharmacological profiles and reduced off-target effects. The detailed target information generated through rational approaches is also valuable for regulatory approval processes.

However, some notable disadvantages exist as well. Many of the computational models and algorithms underpinning rational drug design are highly complex, requiring specialized expertise and robust technological infrastructure that may not be readily available, especially in resource-constrained settings. There is also the potential for inherent biases or limitations in the data and assumptions used in these in silico approaches. Additionally, there is a risk of overlooking serendipitous discoveries that could arise from more exploratory, phenotypic screening methods.

These key advantages and disadvantages highlight the importance of considering a balanced, multi-pronged strategy for drug discovery. While the rational approaches mentioned in this chapter can have a tremendously positive impact, they may need to be complemented by other empirical techniques such as manual or visual selection of drug candidates to maximize the chances of successful drug development.

EARLY DRUG DEVELOPMENT: IN SILICO TOOLS

Therapeutic Targets: Identification, Prioritization and Validation

The mining of available biomedical data and information has greatly boosted target discovery in the large-scale “omics” era. Therefore, computational tools in LMICs are affordable strategies that play a vital role in advancing biomedical research and addressing public health priorities in regions with high poverty, inequality, and disease burden indicators.

The process of finding novel drugs and biomarkers for human diseases starts with target discovery. In the biomedical field, the term “target” refers to a wide variety of biological events, including molecular functions, pathways, and phenotypes, that may involve different molecular entities such as genes, proteins, and RNAs. In this regard, data mining refers to a bioinformatics methodology that blends biological principles with computational tools and statistical techniques, mainly employed for target discovery, selection, and prioritization. In most drug discovery projects nowadays, physical high-throughput screening (HTS) and computer-aided drug design (CADD) are frequently combined to enhance the success of the drug development process [11].

The process of target identification involves studying the mechanism and points of intervention in the disease or condition of interest and verifying that a potential component is important in initiating and/or maintaining the disease. Strategies and approaches encompass a wide range of technologies available to study diseases, such as molecular biology, functional assays, image analysis, and in vivo studies related to functional assessment, among others. Some techniques used in this phase include data mining [12], phenotype screening [13] and epigenetic, genomic, transcriptomic and proteomic methods [13, 14].

After identifying putative targets, prioritization approaches [14] serve as key tools to rank these targets based on their likelihood of being suitable targets in the context of a specific disease. Several computational approaches are available, including network-based approaches, phenome-wide association studies involving genetic evidence, and machine learning methods, among others.

Finally, potential targets must then be validated to determine whether they limit disease progression or induction. Establishing a strong link between target and disease increases confidence in the scientific hypothesis and thus success and efficacy in later phases of the drug discovery process [15]. The validation of a particular target involves the technical assessment of whether the target plays a key role in a disease process and whether pharmacological modulation of the target could be effective in a defined patient population. Expression and enzymatic assays are commonly used techniques, as is the development of knockout/knock-in animal models using antisense/siRNA and genomic strategies. Translation between humans and animals can also be an important feature to build confidence in the development of screening assays for lead molecule identification [16].

Regarding target identification, prioritization and validation, several computational methods have been shown to be useful, faster, less biased and more informative, since they are able to integrate and analyze available data systematically. However, it is important to estimate the predictive power of each method in order to validate the performance of the model. In fact, target prediction methods can have a profound impact on the success of a drug discovery process [17]. Next, we describe some of these valuable tools regarding target selection.

Computational Methods to Aid Target Identification

Text Mining: Identification of Disease-Associated Entities

Text mining has been widely applied to identify disease-associated entities (genes/proteins) and understand their roles in disease. In fact, one important strategy is to build a network of specific gene interactions that have shown to have a key role in a specific disease. The development of such a network commonly starts with a literature search of text articles stored in PubMed Central (PMC), based on dependency parsing and the support vector machine (SVM) method.
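To make the SVM step concrete, the sketch below (assuming scikit-learn is available) trains a linear classifier to flag sentences that assert a gene-disease association. The sentences and labels are toy examples invented for illustration, not drawn from any curated corpus.

```python
# Toy sketch of SVM-based sentence classification for gene-disease
# association mining; a real pipeline would operate on dependency-parsed
# PubMed Central text rather than these invented sentences.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

sentences = [
    "BRCA1 mutations increase the risk of breast cancer",
    "TP53 loss of function drives tumor progression",
    "The assay was performed at room temperature",
    "Samples were stored at -80 C before analysis",
]
labels = [1, 1, 0, 0]  # 1 = asserts a gene-disease association

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(sentences, labels)
print(clf.predict(["EGFR amplification is associated with glioblastoma"]))
```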

Databases with permissive licenses include manually curated associations from Genetics Home Reference (GHR) [18] and the UniProt Knowledgebase (UniProtKB) [19], genome-wide association study (GWAS) [20] results from DistiLD, and mutation data from the Catalogue of Somatic Mutations in Cancer (COSMIC) [21].

Using text mining, it is possible to assign curated bibliographic sources and comparable quality scores to each stored piece of information, allowing it to be downloaded in bulk when necessary. The information is then made available as a web resource (e.g., http://diseases.jensenlab.org/) aimed at end users interested in individual diseases or genes.

An illustrative example of these approaches is reported by Pospisil et al. [22], where the combination of textual-structural mining of PubMed abstracts, universal gene/protein databases (UniProt, InterPro, NCBI Entrez) and pathway knowledge bases (LSGraph and Ingenuity Pathway Analysis) is used to identify putative enzymes in the extracellular space of different tumor types. In this regard, the LSGraph program, a popular tool for bibliographic mining, is utilized to extract entities from a curated database using keywords and Gene Ontology (GO) terms. These entities have been updated and expanded with relevant functional annotations, and then categorized based on cellular locations and biochemical functions within the Ingenuity knowledge base.

Another commonly used mining tool is GeneWays. It is designed to automatically analyze a large volume of full-text research articles in order to predict physical interactions (edges) between candidate disease genes (seed nodes) that are mentioned in the literature [23].

Text mining is a valuable tool for extracting biological entities and knowledge from a vast number of research articles. However, there are still some challenges that need to be addressed. One issue is the variability in terms used to describe biomedical concepts, which can result in incorrect associations between molecular biology and human diseases. Another limitation is the limited access to full-text articles and citation information, as more comprehensive and detailed information is often found in the full text rather than in abstracts. This can lead to an underestimation of the number of entities identified through text mining.

Microarray Data Mining

Microarray data mining involves the utilization of bioinformatics techniques to analyze microarray data and identify biological elements and pathways that characterize a specific phenotype, such as a human disease [24, 25]. With the generation of large amounts of microarray data, it has become increasingly important to address the challenges of data quality and standardization related to this technology [26]. Presently, microarray data mining has demonstrated its efficacy in identifying target genes linked to human diseases.

There are two common methods for in-depth microarray data analysis: clustering and classification [27, 28]. Clustering is an unsupervised approach that groups genes or samples into clusters with similar patterns characteristic of the group. K-means clustering is a data mining/machine learning algorithm used to cluster observations into groups of related observations without any prior knowledge of those relationships. It is one of the simplest clustering techniques and is commonly used in medical imaging and biometrics [29]. On the other hand, a self-organizing map (SOM) is a neural network-based non-hierarchical clustering approach.
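As a minimal sketch of the clustering idea (scikit-learn assumed, with a random stand-in for a normalized expression matrix), K-means can group genes by the similarity of their expression profiles:

```python
# K-means clustering of genes by expression profile. The matrix is
# random toy data; a real study would use normalized microarray values.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
expression = rng.normal(size=(100, 12))  # 100 genes x 12 samples

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
gene_cluster = kmeans.fit_predict(expression)
print(np.bincount(gene_cluster))  # number of genes in each cluster
```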

Classification is also known as class prediction or discriminant analysis. Generally, classification is a process of learning from examples: given a set of pre-classified examples, the classifier learns to assign an unseen test case to one of the classes [30]. In a typical supervised analysis, the overall gene expression profiles of tissues or fluids associated with a certain disease are compared with those of normal tissues or fluids (e.g., cancer vs. healthy tissues or fluids), from which a list of target genes or biological pathways that are important in the disease is identified. In this type of analysis, it is important to complement the workflow with supervised classification methods such as linear discriminant analysis, nearest-neighbor search and genetic algorithms [31].
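A hedged sketch of such class prediction, using a nearest-neighbor classifier on simulated expression profiles (all values random, with an artificial shift added to a few "marker" genes so the classes are separable):

```python
# Nearest-neighbor class prediction (disease vs. healthy) from
# expression profiles; the data are simulated for illustration only.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 500))        # 60 samples x 500 genes
y = np.array([0] * 30 + [1] * 30)     # 0 = healthy, 1 = disease
X[y == 1, :20] += 1.5                 # shift 20 artificial "marker" genes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
knn = KNeighborsClassifier(n_neighbors=3).fit(X_tr, y_tr)
print("test accuracy:", knn.score(X_te, y_te))
```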

Differentially expressed genes are the genes whose expression levels are significantly different between two groups of experiments [32].
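For illustration, a per-gene two-sample t-test with a Benjamini-Hochberg false-discovery-rate cutoff can be sketched as follows; the data are simulated and the 5% FDR level is an arbitrary choice:

```python
# Per-gene two-sample t-test plus Benjamini-Hochberg FDR control,
# on simulated data with 50 truly shifted genes.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
healthy = rng.normal(size=(1000, 10))  # 1000 genes x 10 samples
disease = rng.normal(size=(1000, 10))
disease[:50] += 2.0                    # 50 genes truly differ

t, p = stats.ttest_ind(disease, healthy, axis=1)

# Benjamini-Hochberg: largest k with p_(k) <= (k/m) * alpha;
# the k genes with the smallest p-values (order[:k]) are called.
m = len(p)
order = np.argsort(p)
below = p[order] <= 0.05 * np.arange(1, m + 1) / m
k = below.nonzero()[0].max() + 1 if below.any() else 0
print(k, "genes called differentially expressed at 5% FDR")
```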

Some computational methods are used to analyze microarray data. One of them is Gene Set Enrichment Analysis (GSEA). GSEA is employed to determine the statistical significance and consistency of differences in gene expression between two biological states. Gene sets are constructed based on prior biological knowledge, including information on biochemical pathways, genes located in the same cytogenetic band, those sharing a common Gene Ontology category, or any user-defined set. The main aim of GSEA is to identify whether the genes within a set are predominantly found at the top or bottom of a ranked list, indicating a correlation with the phenotypic class distinction [33]. GeneCards (http://www.genecards.org/) is another commonly used tool, in which the retrieved genes are organized by putative function.
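The core of the GSEA statistic is a running sum that steps up at gene-set members and down otherwise; the sketch below implements this idea in a simplified, unweighted form on toy data (unlike the full method, there is no gene-level weighting or permutation-based significance testing):

```python
# Simplified GSEA-style running-sum enrichment score: walk down a
# ranked gene list, stepping up at gene-set members, down otherwise.
ranked_genes = [f"g{i}" for i in range(200)]      # best to worst
gene_set = {f"g{i}" for i in range(0, 40, 2)}     # 20 genes, near the top

n, hits = len(ranked_genes), len(gene_set)
up, down = 1.0 / hits, 1.0 / (n - hits)           # steps sum to zero
running, peak = 0.0, 0.0
for g in ranked_genes:
    running += up if g in gene_set else -down
    peak = max(peak, abs(running))
print("enrichment score ~", round(peak, 3))
```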

In spite of its advantages, microarray data mining also presents a number of limitations and challenges for target discovery [34]. First, data mining of a list of target genes is not the end of genomic analysis, and since gene expression levels do not always correlate with protein levels, follow-up experiments are required to validate protein expression levels and protein functions [35]. Second, microarray data exist at a variety of scales depending on the specific technology platform as well as the individual experimental procedures. Therefore, microarray data from different laboratories are not always directly comparable. Third, data availability and integration can be a challenge for microarray data mining. In the post-genomic era, the explosion of gene expression data requires timely data storage and updating of gene databases. Moreover, different data storage formats across databases have posed a great challenge for data mining and analysis [36].

Basic microarray data analysis tasks include the classification, clustering, and identification of differential genes using gene expression profiles exclusively. Nonetheless, the connection of gene expression profiles with external sources can facilitate the emergence of new findings and information.

Currently, driven by the exponential growth of microarray data in recent years, considerable effort has been made to develop microarray databases with timely public accessibility in a manner that facilitates target discovery. Indeed, the identification of functional elements such as transcription-factor binding sites (TFBS) on a whole-genome level is the next challenge for genome sciences and gene-regulation studies.

Open Proteomics

The Open Proteomics Database (OPD) (http://bioinformatics.icmb.utexas.edu/OPD/) and the EMBL Proteomics Identifications Database (PRIDE) (www.ebi.ac.uk/pride/) have been released to the public, providing access to valuable proteomic datasets.

In order to uncover diagnostic signature patterns, various mining methods, such as Bayesian analysis, rule-based analysis, and similarity scoring, have been proposed for these resources [37].

These databases serve as a valuable resource for researchers working in the field of proteomics, enabling them to access and analyze mass spectrometry data for diverse biological studies.

Chemogenomics

Chemogenomic data mining is an emerging approach in the field of data mining. This innovative technology focuses on interpreting chemical genomics data and analyzing various phenotypes of interest, including viability, cell morphology, behavior, and gene expression profiles. By utilizing small molecule chemical libraries in conjunction with cell libraries, researchers are able to generate a 2D matrix that represents the chemical library on one axis and the library of different cell types on the other axis [38]. However, the process of chemogenomic data mining presents certain challenges, which have prompted the development of specialized mining tools and methods. These tools aim to systematically profile and analyze the data [39]. To achieve this, several supervised or unsupervised clustering algorithms have been proposed. These algorithms help identify a subset of genes within the entire dataset that possess significant functions.

Integrated Data Mining

The identification of potential targets in the field of medicine is a challenging task, mainly due to the intricate nature of human diseases and the diverse range of biological data available. It is widely acknowledged that no single data mining approach can fully comprehend the complex cellular mechanisms and reconstruct biological networks, both now and in the foreseeable future. Therefore, in order to enhance the discovery of valuable targets, it is imperative to integrate and analyze data from various sources and disciplines, while considering the strengths and limitations of each approach. Among the most commonly employed strategies is the combination of text mining with high-throughput data analysis, such as genomic, proteomic, or chemogenomic data [40]. This integration has proven to be instrumental in the identification of disease markers and potential drug targets.

Publicly available curated databases such as Gene Expression Omnibus (GEO), GWAS Central (https://www.gwascentral.org/) and the GWAS Catalog (https://www.ebi.ac.uk/gwas/), ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/), DisGeNET (https://www.disgenet.org/) and genomic datasets are key tools to extract and analyze population-based and evidence-based targets associated with different diseases [41, 42].
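As one illustration of programmatic access to these resources, the GEOparse package (assumed installed via pip) can download a GEO series and inspect its samples; the accession used here is only a placeholder, not a recommendation of a particular dataset:

```python
# Fetching a public expression series from GEO with GEOparse
# (assumed API; accession is a placeholder example).
import GEOparse

gse = GEOparse.get_GEO(geo="GSE2553", destdir=".")
for gsm_name, gsm in list(gse.gsms.items())[:3]:
    print(gsm_name, gsm.metadata.get("title"))
```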

As an interesting example, an in silico molecular study of asthma-associated targets in LMICs has been reported. This research leveraged computational techniques to evaluate thousands of asthma-associated molecular targets and patient expression datasets to identify the most relevant therapeutic targets. In this study, the authors used a proprietary tool called Ontosight® Discover (https://ontosight.ai/) (US20200090789A1) to annotate asthma-associated genes and proteins. In addition, they also collected and evaluated asthma-related patient datasets through bioinformatics- and machine learning-based approaches to identify the most suitable targets. A disadvantage of such proprietary tools is that they are not free; users must pay for access. Even though this report used licensed software, it is an important example of target identification using valuable computational tools.

Computational Methods to Aid Target Prioritization

The implementation of several of the aforementioned bioinformatic tools leads to the identification of many putative targets, but computer-based target prioritization methods have been less studied so far [43]. However, several useful tools have been described. Of particular interest are platforms that systematically integrate and harmonize different databases comprising different information types to finally prioritize targets within a particular disease. The underlying strategy varies depending on the platform used, as well as the scoring criteria and the visualization interface.

One example is DisGeNET, an open-access platform that integrates databases with text-mined data and then features a score based on the supporting evidence to prioritize gene-disease associations [44]. Additionally, GuiltyTargets presents an association approach that uses a genome-wide protein-protein interaction network annotated with disease-specific differential gene expression and uses positive-unlabeled machine learning for candidate ranking [43]. Of interest, the Open Targets platform is a public-private partnership to establish an informatics platform that associates targets and diseases, providing tools to prioritize these target-disease hypotheses [45]. Other examples of interesting resources for target prioritization are PHAROS [46] and TargetMine [47], among others.

Computational Methods to Aid Target Validation

Target validation is a key part of target selection. Once the target is identified, it is important to confirm the causal link between a potential target and a specific disease phenotype. For this purpose, there are many datasets used to establish the association of the target with the disease using a computational approach [48]. The application of advanced molecular techniques has led to an increased amount of genomic and lately proteomic data [49]. Therefore, bioinformatic tools play an increasingly important role in the target validation process, where biomedical knowledge leading to biological functions of putative targets can be mined from different databases.

In silico analysis involves, in part, the evaluation of differential expression patterns in groups of patients or between normal and diseased cells or tissues. There are several platforms, mainly derived from microarray data, that gather information about expression levels, polymorphisms, mutations, etc. Alongside this, it is important to address where the expression of the putative molecular target is localized. Large-scale omics data also serve as a key input to analyze this issue.

Some important bioinformatic tools to aid this important task include, but are not limited to, the Drug Gene Interaction Database (DGIdb) [50], the Therapeutic Target Database [51], and many of the databases cited before, including Gene Expression Omnibus (GEO) [52], Open Targets [53], and TargetMine [54], among others.

Lastly, it is imperative in the validation process to include functional assays to substantiate the target's role in a disease mechanism or biological effect, using relevant in vitro and in vivo models and high-quality patient tissue samples.

Identification and Optimization of Drug Candidates

Computational tools have become a standard approach for accelerating drug development and discovery in the pharmaceutical industry, and they are particularly valuable in LMICs. In this regard, a plethora of chemoinformatic tools have emerged in the last decades. “Chemoinformatics” refers to the integration of software for chemical information processing; it bridges the fields of chemistry and computer science. Chemoinformatics can be divided into the following areas of analysis:

1- Selection of biological targets and collection of possible ligand datasets: this involves characterization of the compound by searching freely available data sources and extracting molecule information from databases, as well as performing substructure and similarity searches. This allows the extraction of substructure fragments or other chemical descriptors [55].

2- Selection and prioritization of chemical characteristics: finding chemical fingerprints that represent chemical characteristics of the compound, which allows comparing different compounds based on shared chemical characteristics.

3- Study and evaluation of model prediction: the chemical fingerprints identified in the previous step can be used in several machine learning models to predict chemical and physicochemical properties in QSAR/QSPR analysis (further explained below) from the three-dimensional chemical structure [56, 57].

4- Compound optimization and hit identification: the structural features of the identified 3D compounds allow their level of chemical similarity to be determined. The 3D Tanimoto index, a widely used 3D similarity metric, calculates the ratio of shared molecular volumes between two ligands [58] (a fingerprint-based sketch of this similarity idea is given after this list). Statistical models trained on such representations can then make inferences about new compounds by comparison [59].

Finally, this workflow allows the identification and optimization of potential drug candidates using virtual screening, dynamic simulation, and docking simulation experiments, or a combination of these approaches [59, 60].
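As a small illustration of the similarity idea from step 4, the sketch below computes a Tanimoto coefficient with RDKit; note that this is the common 2D fingerprint-based variant, used here as a stand-in for the 3D volume-based index described above:

```python
# 2D fingerprint Tanimoto similarity with RDKit (aspirin vs.
# salicylic acid), as a fingerprint-based analogue of the 3D index.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

aspirin = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")
salicylic_acid = Chem.MolFromSmiles("Oc1ccccc1C(=O)O")

fp1 = AllChem.GetMorganFingerprintAsBitVect(aspirin, 2, nBits=2048)
fp2 = AllChem.GetMorganFingerprintAsBitVect(salicylic_acid, 2, nBits=2048)
print("Tanimoto similarity:", DataStructs.TanimotoSimilarity(fp1, fp2))
```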

Virtual screening involves rapidly screening large chemical libraries to identify compounds with desirable characteristics. On the other hand, molecular dynamics simulations predict the behavior and interactions of molecules in large biological systems, while docking simulations predict the binding affinity of small compounds to their target proteins. These methods are responsible for greatly accelerating the process of new drug discovery and increasing the rate at which promising candidates are found [61] and are further described below.

Novel Drug Development vs. Repurposing

Despite the precise and rigorous nature of the drug discovery process, these projects come with their share of challenges. In fact, the attrition rate is high: many compounds that start the journey do not make it to the end for various reasons, including failure in efficacy, unexpected side effects, or commercial considerations. Therefore, several alternatives are used to reduce the failure risk, with drug repurposing strategies being the most relevant, particularly in LMICs. Drug repurposing involves exploring new uses for existing drugs, a faster and more cost-effective approach. Repositioning schemes for already existing drugs are interesting since the use of validated, toxicologically safe, and approved pharmaceuticals can increase the success rate of drug development [62-64].

A drug repurposing strategy usually consists of three steps: identification of a candidate drug for a given indication, mechanistic assessment of the drug effect in preclinical models, and further evaluation of efficacy in clinical trials. For the first step, computational approaches are typically used and usually involve systematic analysis of large-scale data.

The hypothesis of identifying a potential drug for a given disease can be based on signature matching. This approach is based on the comparison of a particular drug candidate against another drug (drug-drug similarity), a disease (estimating drug-disease similarity), or a clinical condition using transcriptomic, proteomic, or metabolomic data, chemical structures, or adverse-effect profiles. In this regard, it is important to mention the Connectivity Map (CMap) resource, a popular platform designed for data-driven drug repositioning using a large transcriptomic compendium [65]. Other widely used approaches are molecular docking, pathway mapping, retrospective clinical analysis, etc. [64]. Some of these tools are described below.
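In miniature, signature matching can be sketched as a correlation between a disease expression signature and a drug-induced signature, where strong anti-correlation suggests a reversal (repurposing) candidate. The vectors below are random stand-ins, not real CMap data:

```python
# Toy signature matching: a drug whose expression signature
# anti-correlates with the disease signature is hypothesized to
# reverse it (vectors here are simulated stand-ins).
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(3)
disease_signature = rng.normal(size=978)   # e.g., an L1000-like gene panel
drug_signature = -disease_signature + rng.normal(scale=0.5, size=978)

rho, _ = spearmanr(disease_signature, drug_signature)
print("Spearman rho:", round(rho, 2), "(strongly negative => reversal candidate)")
```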

VLS and Drug Design

Once the target is identified, several bioinformatics tools can provide its predicted structure. Commonly used targets are proteins; in this regard, there are multiscale models that take into consideration structural and thermodynamic features that can be used to model the 3D structure of the protein target [66]. For this purpose, some key tools are listed in Table 1.

Table 1. Target (gene-centric) databases for integrative knowledge graphs.

AlphaFold Protein Structure Database: Open-access database of protein structures predicted by a state-of-the-art AI system. https://alphafold.ebi.ac.uk/
BindingDB: Enables predictions of binding affinities, primarily between proteins and drug-like molecules. https://www.bindingdb.org/bind/index.jsp
Binding MOAD: High-quality ligand-protein binding data, all derived from the PDB. https://bindingmoad.org/
PDBbind database: Prediction and quantification of binding affinity data for biomolecular complexes found in the PDB. http://www.pdbbind.org.cn/
Protein Data Bank (PDB): 3D structures of proteins, nucleic acids, and complex assemblies, including enzymes and disease-related molecules. https://www.rcsb.org/
Sequence Read Archive (SRA): Large repository of sequencing data pertaining to all biological fields. https://www.ncbi.nlm.nih.gov/sra
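For example, predicted structures can be retrieved programmatically from the AlphaFold Protein Structure Database; the URL pattern and model version below are assumptions based on the public file layout, and the UniProt accession is an arbitrary example:

```python
# Downloading a predicted structure from the AlphaFold database.
# URL pattern/model version are assumptions; P69905 (human hemoglobin
# alpha subunit) is an arbitrary example accession.
import requests

uniprot = "P69905"
url = f"https://alphafold.ebi.ac.uk/files/AF-{uniprot}-F1-model_v4.pdb"
resp = requests.get(url, timeout=30)
resp.raise_for_status()
with open(f"AF-{uniprot}.pdb", "w") as fh:
    fh.write(resp.text)
print("saved", len(resp.text), "characters of PDB text")
```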

Once the target's structure is available, identifying putative binding sites is necessary to properly carry out the drug design phase. For binding site identification and analysis, several tools are available, such as SiteHound [67] and fPocket [68].
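As a minimal sketch, fpocket (assuming the command-line tool is installed) can be launched from Python to detect candidate pockets; the structure file name is a hypothetical placeholder:

```python
# Pocket detection with the fpocket command-line tool; "-f" names the
# input PDB file. Results are written to a <name>_out/ directory.
import subprocess

subprocess.run(["fpocket", "-f", "target_structure.pdb"], check=True)
```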

After defining the desired drug binding site, virtual screening can be used to discover new ligands on the basis of the biological structure of the target protein [69]. This computer-aided approach significantly reduces the possible number of candidates for in vitro and in vivo testing, providing key opportunities for drug discovery in LMICs.

There are several drug design methods, but two approaches are most widely used: structure-based drug design and ligand-based drug design. In the first approach, experimental structural data provided by crystallography, NMR, etc., enable the protein 3D structure to be determined with high precision. Molecular modeling software can be used to analyze the physicochemical properties of the selected drug binding sites on the target protein, examining key residues, electrostatic and hydrophobic fields, and hydrogen bonds [70]. On the other hand, ligand-based drug design relies on known ligands that bind to the selected target. Starting from such a molecule, different computational models are used to design novel molecules that interact with the target.

Virtual screening is a relevant component of computer-aided drug design. Its main purpose is to predict which compounds could present pharmacological activity, reducing the number of compounds that actually need to be tested experimentally. Some common methods of virtual screening include molecular docking, pharmacophore modeling, and quantitative structure-activity relationship (QSAR) modeling.

In molecular docking, the software predicts the interaction patterns between the target and small molecules or peptides, taking into consideration spatial shape, energy matching, and molecular conformation. The list of freely accessible tools is extensive, and many are web-based resources. Some of them include AutoDock [71], AutoDock Vina [72], RosettaLigand [73], GlamDock [74], and EDock [75]. Many other platforms are reviewed in a thorough report by Glaab [76]. Interestingly, in recent years several open-source platforms have been developed to carry out virtual screening of ultra-large compound libraries, such as VirtualFlow [77].
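A hedged sketch of launching an AutoDock Vina run from Python is shown below; the flag names follow the Vina command line, while the file names and search-box parameters are placeholders that must match your own prepared PDBQT inputs:

```python
# Launching an AutoDock Vina docking run via the command line.
# Receptor/ligand files and box coordinates are placeholders.
import subprocess

cmd = [
    "vina",
    "--receptor", "receptor.pdbqt",
    "--ligand", "ligand.pdbqt",
    "--center_x", "10.0", "--center_y", "12.5", "--center_z", "-3.0",
    "--size_x", "20", "--size_y", "20", "--size_z", "20",
    "--exhaustiveness", "8",
    "--out", "docked_poses.pdbqt",
]
subprocess.run(cmd, check=True)  # poses and scores go to docked_poses.pdbqt
```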

Pharmacophore modeling is a technique that identifies the essential 3D features of a molecule that are necessary for its biological activity. It abstracts the key structural and electronic characteristics of a set of active compounds into an idealized 3D model [78]. Useful open-access platforms for pharmacophore-based virtual screening include Pharmit [79], DrugOn [80], and ZINCPharmer [81], among others.
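As a small illustration of the underlying idea, the following RDKit sketch enumerates the pharmacophoric features (hydrogen-bond donors and acceptors, aromatic rings, etc.) of a single active compound; dedicated platforms such as those above then align and combine such features across sets of actives. The aspirin SMILES is only a placeholder.

```python
# Illustrative sketch: extract pharmacophoric features from a known active
# using RDKit's built-in chemical feature factory.
import os
from rdkit import Chem, RDConfig
from rdkit.Chem import AllChem, ChemicalFeatures

fdef = os.path.join(RDConfig.RDDataDir, "BaseFeatures.fdef")
factory = ChemicalFeatures.BuildFeatureFactory(fdef)

mol = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")  # placeholder: aspirin
mol = Chem.AddHs(mol)
AllChem.EmbedMolecule(mol, randomSeed=42)  # 3D coordinates for spatial features

for feat in factory.GetFeaturesForMol(mol):
    pos = feat.GetPos()
    print(feat.GetFamily(), feat.GetType(),
          (round(pos.x, 2), round(pos.y, 2), round(pos.z, 2)))
```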

Finally, QSAR modeling is used to develop predictive models that correlate chemical structure with biological activity. These models are then applied to the virtual screening of large chemical databases to identify promising compounds for further experimental testing. Several freely available tools can derive QSAR models for virtual screening of the biological activity of chemical compounds, such as DPubChem [82] and the workflow described by Mansouri et al. [83], among many others [84].
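The following minimal sketch illustrates the general QSAR workflow with RDKit and scikit-learn: encode structures as fingerprints, fit a model on known activities, and score new compounds. The four molecules and their activity labels are invented purely for illustration; a real model requires a curated dataset and proper validation (cross-validation and an external test set).

```python
# Minimal QSAR sketch: Morgan fingerprints as descriptors, a random forest
# as the structure-activity model. Data below are hypothetical placeholders.
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from sklearn.ensemble import RandomForestClassifier

smiles = ["CCO", "CCN", "c1ccccc1O", "CC(=O)Oc1ccccc1C(=O)O"]
activity = [0, 0, 1, 1]  # hypothetical active/inactive labels

def fingerprint(smi: str) -> np.ndarray:
    mol = Chem.MolFromSmiles(smi)
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=2048)
    return np.array(fp)

X = np.array([fingerprint(s) for s in smiles])
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, activity)

# Screen a new compound: predicted probability of activity.
query = fingerprint("c1ccccc1N").reshape(1, -1)
print(model.predict_proba(query)[0])
```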

Compound Libraries

There is a wide variety of chemical databases containing small molecules available for virtual screening. Comparative studies have shown substantial overlap among the different chemical collections; nevertheless, each database has unique features that may make it more suitable for a given project. Ready-made virtual libraries can be used, but users can also generate their own. Some are freely available and have already been filtered for “drug-likeness”, while others cover a wider portion of chemical space. Some collections are drawn directly from vendor ligand catalogs, ensuring that hits can be purchased immediately after selection. Many databases contain novel structures, while other libraries comprise natural products [85] or approved drugs for repurposing projects [86]. It is important to mention that many of these databases need to be prepared and filtered before virtual screening (a minimal filtering sketch is given after Table 2). Some of them are listed in Table 2.

Table 2. Chemical databases for virtual library screening experiments.

- ZINC20: provides downloadable 2D and 3D versions as well as a website that enables rapid molecule lookup and analog search. www.zinc20.docking.org [87]
- PubChem: large repository of chemical compounds and their biological activities obtained through biological assays. https://pubchem.ncbi.nlm.nih.gov/ [88]
- NCI Open Database: compounds from the Developmental Therapeutics Program at NCI/NIH. http://dtp.nci.nih.gov
- ChEMBL: a resource of bioactive molecules with drug-like properties, first introduced in 2009 and continuously incorporating new features. https://www.ebi.ac.uk/chembl/ [89]
- COCONUT: an aggregated dataset of elucidated and predicted natural products collected from open sources, with a web interface for browsing. https://coconut.naturalproducts.net [90]
- SuperNatural 3.0: a freely available database of natural products and derivatives. http://bioinf-applied.charite.de/supernatural_3 [91]
- e-Drug3D: a collection of annotated 3D structures of FDA-approved drugs. http://chemoinfo.ipmc.cnrs.fr/e-drug3d.html [92]
- Collective Molecular Activities of Useful Plants (CMAUP) database: summarizes the biological activities of traditional medicinal plants worldwide, including metadata on human target proteins and disease indications. http://bidd.group/CMAUP/ [93]
- Benzylisoquinoline Alkaloid Database (BIAdb): alkaloids as a source of therapeutic agents. https://webs.iiitd.edu.in/raghava/biadb [94]
- NaturaL prOducTs occUrrence databaSe (LOTUS) online: an open-source project for natural products (NPs) storage. https://lotus.naturalproducts.net/ [95]
- Naturally Occurring Plant-based Anti-cancer Compound-Activity-Target database (NPACT): compounds isolated from medicinal plants with reported anti-cancer activity (including in vitro or in vivo testing). http://crdd.osdd.net/raghava/npact/ [96]
- Nuclei of Bioassays, Ecophysiology and Biosynthesis of Natural Products Database (NuBBEDB): database covering chemical and biological information on natural products from Brazil. https://nubbe.iq.unesp.br/portal/nubbe-search.html [97]
- Traditional Chinese Medicine Integrated Database (TCMID) and Traditional Chinese Medicine (TCM) Database@Taiwan: repertoires of compounds from Chinese medicinal plants, including 3D structures of the isolated compounds. http://bidd.group/TCMID/ and http://tcm.cmu.edu.tw/about01 [98]
- Vendor ligand catalogs: e.g., ENAMINE and MolPort. https://enamine.net/ and https://www.molport.com/ [99]
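As mentioned above, downloaded libraries typically require preparation before screening. A minimal sketch of one common filtering step, applying a strict variant of Lipinski's rule of five with RDKit, is shown below; "library.smi" is a placeholder file containing one SMILES string per line.

```python
# Minimal library-preparation sketch: keep only compounds satisfying all four
# Lipinski criteria (a strict variant; the original rule tolerates one violation).
from rdkit import Chem
from rdkit.Chem import Descriptors, Lipinski

def passes_rule_of_five(mol) -> bool:
    return (Descriptors.MolWt(mol) <= 500
            and Descriptors.MolLogP(mol) <= 5
            and Lipinski.NumHDonors(mol) <= 5
            and Lipinski.NumHAcceptors(mol) <= 10)

with open("library.smi") as src, open("filtered.smi", "w") as dst:
    for line in src:
        mol = Chem.MolFromSmiles(line.strip())
        if mol is not None and passes_rule_of_five(mol):  # skip unparsable entries
            dst.write(Chem.MolToSmiles(mol) + "\n")
```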
Hit-to-Lead and Lead Optimization Tools

Once a hit is identified, computational chemistry and molecular modeling play an important role during the hit-to-lead (H2L) stage by both suggesting putative optimizations and decreasing the number of compounds to be experimentally evaluated.

Typically, H2L involves chemical modification of the validated hit to optimize its affinity for the target so that it becomes a lead compound. This stage of the drug discovery process is usually time-consuming and expensive: trial-and-error strategies involve cycles of compound synthesis, evaluation, and selection or rejection, with the aim of reaching suitable affinity while maintaining selectivity. The computer-aided approaches already described, such as QSAR, molecular docking, and pharmacophore screening, among others, can be used to improve H2L.

Several authors have described some of the most important algorithms to tackle H2L optimization, such as LigBuilder [100], AILDE [101, 102], and ChemoDOTS [102], among others [103].
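A simple operation that often supports H2L in practice is ranking candidate analogs by structural similarity to the validated hit. The sketch below scores invented placeholder candidates by Tanimoto similarity of Morgan fingerprints using RDKit; the most similar analogs would then be prioritized for synthesis and assay.

```python
# Hedged hit-expansion sketch: rank analogs by Tanimoto similarity to a hit.
# The hit and candidate SMILES below are illustrative placeholders only.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

hit = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")
candidates = ["CC(=O)Oc1ccccc1C(=O)N", "c1ccccc1", "CC(=O)Oc1ccc(Cl)cc1C(=O)O"]

hit_fp = AllChem.GetMorganFingerprintAsBitVect(hit, 2, nBits=2048)
scored = []
for smi in candidates:
    mol = Chem.MolFromSmiles(smi)
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)
    scored.append((DataStructs.TanimotoSimilarity(hit_fp, fp), smi))

# Highest-similarity analogs first.
for score, smi in sorted(scored, reverse=True):
    print(f"{score:.2f}  {smi}")
```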

Additional Chemoinformatics Tools

Diverse methods have been utilized to successfully implement chemoinformatics in drug discovery. Many of them have been described above, but numerous other key tools are available for data mining, virtual screening, and structure-activity relationship studies. Some of these tools are listed below:

- ChemDraw is a molecular editor for Macintosh and Microsoft Windows, first developed by David A. Evans and Stewart Rubenstein in 1985 and later by the chemoinformatics company CambridgeSoft. Along with Chem3D and ChemFinder, it is part of the ChemOffice suite of programs [104].
- ChemReader is a fully automated tool that extracts chemical structure information from images in research articles and translates it into standard chemical formats that can be searched and analyzed [105].
- ChemSketch is a molecular modeling program that allows the drawing and modification of chemical structures, together with structural analysis covering chemical bonds and functional groups.
- ChemWindow is a program developed by Bio-Rad Laboratories, Inc. for drawing chemical structures, 3D visualization, and database searching.
- Chemistry Development Kit (CDK) is Java software for use in bioinformatics and chemoinformatics, available for Windows, Linux, and Macintosh. It supports 2D molecule generation, 3D geometry generation, descriptor and fingerprint calculation, and various chemical structure formats [106].
- ChemmineR is an R-language chemoinformatics package for analyzing data on small drug-like molecules; it enables similarity searching, clustering, and classification of chemical compounds using a wide range of algorithms [107].
- JME molecular editor is a Java applet that allows the creation and modification of chemical compounds and reactions and can display molecules within an HTML page [108].
- Molecular Operating Environment (MOE) is a software program built on a scientific vector language whose applications include structure- and fragment-based design, pharmacophore modeling, protein and molecular modeling and simulations, as well as cheminformatics and QSAR.
- Open Babel is software for the interconversion of chemical file formats; it also allows substructure searching and fingerprint calculation [108]. It is available for Windows, Linux, and Macintosh (see the brief sketch after this list).
- OpenEye is a drug discovery and design software kit whose application areas include the generation of chemical structures, docking, shape comparison, cheminformatics, and visualization. OpenEye toolkits are available in several programming languages: C++, Java, and Python [109].
- Chemaxon provides various chemoinformatics software programs, applications, and services for drawing and visualizing chemical structures, searching and managing chemical databases, clustering chemical compounds, and drug discovery and design [110].
- Online Chemical Modeling Environment (OCHEM) is a web-based platform designed to automate and simplify the typical steps of QSAR modeling, providing a comprehensive tool for data storage, model development, and the publication of chemical information. OCHEM estimates the accuracy of predictions, assesses the applicability domain, and allows users to seamlessly integrate predictions with other approaches. Its primary objective is to consolidate a comprehensive range of chemoinformatics tools into a single, accessible, user-friendly resource. OCHEM is free for web users and is available online at http://www.ochem.eu [111].
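To give a concrete flavor of one of these tools, the short sketch below uses Open Babel's Python layer (exposed as openbabel.pybel in Open Babel 3.x) to convert an SDF file to SMILES; "input.sdf" is a placeholder file name.

```python
# Sketch of file-format interconversion with Open Babel's Python bindings.
# Any input format Open Babel recognizes works the same way.
from openbabel import pybel

with open("output.smi", "w") as out:
    for mol in pybel.readfile("sdf", "input.sdf"):  # lazily iterates molecules
        out.write(mol.write("smi"))
```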

Various other tools, such as PowerMV, PaDEL, CDD (Collaborative Drug Discovery), RDKit, 3D-e-Chem, MedChem Studio, MedChem Designer, Mol2Mol, Chimera, VMD, ArgusLab, ChemTK, and Premier Biosoft, are also widely used for chemoinformatics applications.

PHARMACOKINETICS AND PHARMACODYNAMICS PREDICTION

The pharmacokinetic (PK) profiles of new chemical entities are key determinants of efficacy and toxicity. These profiles are governed by absorption, distribution, metabolism, and excretion (ADME). Each ADME parameter can be assessed experimentally. Absorption is evaluated mainly through solubility and membrane permeability; distribution through protein binding, tissue binding, and P-glycoprotein binding; metabolism using liver microsomes or hepatic clearance; and excretion via renal clearance and urinary excretion rates.

Interestingly, ADME prediction using in silico models has become essential in the drug discovery process. Several IT companies have developed robust predictive platforms for various ADME parameters; however, high licensing fees limit access to this commercial software [112]. Importantly, freely available prediction platforms have been developed, and some of them are listed in Table 3.

Table 3. Freely available ADME prediction platforms.

- SwissADME: free web tool giving access to a pool of robust predictive models for PK, drug-likeness, bioavailability, etc., designed for specialists and non-experts alike [113, 114]. http://www.swissadme.ch/
- pkCSM: integrated platform for the rapid evaluation of PK and toxicity properties, using graph-based signatures to build its predictive models [114]. http://structure.bioc.cam.ac.uk/pkcsm
- iD3-INST: a platform comprising high-quality open-access databases and innovative computational models with high predictive performance for the PK, cardiotoxicity, and liver injury of small-molecule drugs. https://www.id3inst.org
- ADMETlab 2.0: web server for the prediction of the pharmacokinetic and toxicity properties of chemicals. https://admetmesh.scbdd.com
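While the platforms in Table 3 provide full predictive profiles through web interfaces, simple absorption-related descriptors can also be computed offline as a first-pass screen. The sketch below uses RDKit to calculate molecular weight, logP, topological polar surface area, and hydrogen-bond donor/acceptor counts for a single example molecule (aspirin, used purely as a placeholder).

```python
# Offline first-pass ADME-relevant profiling with RDKit: these descriptors
# relate mainly to oral absorption and complement full web-based predictions.
from rdkit import Chem
from rdkit.Chem import Descriptors, Lipinski

mol = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")  # placeholder: aspirin
print("MolWt :", Descriptors.MolWt(mol))           # molecular weight (g/mol)
print("cLogP :", Descriptors.MolLogP(mol))         # calculated lipophilicity
print("TPSA  :", Descriptors.TPSA(mol))            # polar surface area (A^2)
print("HBD   :", Lipinski.NumHDonors(mol))         # hydrogen-bond donors
print("HBA   :", Lipinski.NumHAcceptors(mol))      # hydrogen-bond acceptors
```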