Modern Drug Design & Development including medicinal, combinatorial and biological chemistry as well as in silico modeling and agrochemistry.

FIG. 1: JAK docking model for the identification of JAK Inhibitors 

Achievements of the CI Team

The team has published more than 200 scientific papers (full-paper articles, letters and comprehensive reviews) in authoritative, high-impact-factor scientific journals. In addition, a number of government-sponsored or government-supported projects (RSF, RFBR, MESR, RMIT, etc.) have been successfully carried out, maintained and supervised.

Several small molecule drugs have been developed and are currently being tested in clinics.

Among them are: neuraminidase inhibitors (AV5080, AV5027 and AV5075S, Phase I), NS5A inhibitor (AV4025, Phase II), androgen receptor antagonist (ONC1-13B, Phase I), 5HT6R serotonin antagonists (AVN-211 and AVN-322, Phase I/II), AVR-560 (against HCV, Phase I); new anticancer drugs and selective ASPGP-R and PSMA-R drug conjugates (targeted drug delivery) are currently undergoing preclinical studies.

The project leader is the author of several specialized software packages in the field of chemoinformatics and in silico modeling, as well as commercial focused libraries for HTS.

The prime focus is on professional scientific research in the field of medicinal chemistry and in silico drug design, and on scientific literacy and article writing (from the original idea or materials, through manuscript writing, preparation, editing, correction and revision, to the resulting publication);

- Research in the field of modern DD&D as well as chemo- and bioinformatics, including medicinal, biological, computational and combinatorial chemistry, biological and virtual screening, agrochemistry and preclinical evaluation. The list includes the following specific areas: in silico drug design, H2L and L2D optimization, targeted (focused) and random library design, advanced data mining, ADME/Tox assessment (pharmacokinetic profiling), QSAR modeling (3D molecular docking, 3D and topological pharmacophore modeling/searching, bioisosteric morphing, similarity, etc.) and related areas. In silico, the main focus is on Kohonen self-organizing maps, Sammon mapping, Principal Component Analysis (PCA), Stochastic Proximity Embedding (SPE), classical artificial neural networks (ANN) and related algorithms;

- Business partnership and collaboration with leading global pharmaceutical companies (Big Pharma), e.g. Novartis, Merck, Bayer, Sanofi, Johnson & Johnson, Pfizer, Roche, GSK, AstraZeneca, Abbott, Eli Lilly, etc., as well as numerous Russian academic institutions and companies, including MSU (mainly the chemistry and biology departments), IBG RAS, IBC RAS, MTU, MISIS, Dmitry Mendeleev UCT, MIPT, St. Petersburg University, SCHELKOVO AGROCHIM, agrosintez, etc. Scientific consulting, expert opinion and development (CRO) in numerous government R&D projects spanning bioinformatics, medicinal and biological chemistry, pharmacology, computational modeling and software development;

The core developer/architect is the inventor of several computational algorithms and software packages in the field of advanced data mining, bioinformatics and nonlinear analysis, particularly for medicinal chemistry and agrochemical applications. These include:

1. SmartMining (ChemDiv, Inc.), DD&D, in silico modeling;

2. ARC-Soft (exclusive agrochemical software developed for "Shelkovo-agrochim");

3. InformaGenesis (the "master" software product, InformaGenesis);

4. Focused Library Manager (CDL) (ChemDiv, Inc.);

5. Nano-Predict, prediction of the toxicity of different types of nanotubes;

6. Neural-based Intelligent Scientific Translator (NIST), an automatic scientific translator (the official registration number in ROSPATENT / programs for computers / №2011617859 / 07.10.2011);

7. Chemoinformatic software PC_Searcher, for specific medicinal chemistry purposes.

In silico Screenings

The following describes a maximum-diversity set that is being prepared from the stock of 1,550,000 compounds available to our Partners.

This compound set will complement the existing diversity of the current stock and will be combined with a focused set from the Chemoinformatic (CI) & Medicinal Chemistry (MC) Team directed at the requested target(s).

Number of chemotypes

The team is known for its ability to expand the portfolio of validated chemical library systems, develop new reactions, scale up intermediates, and innovate in the design of diversity- and target-specific chemistry space.

The compounds will be selected from our Partners’ collection of more than 1,550,000 diverse drug-like small molecules. Selection can be made against the existing customer library and/or take into account the customer’s structural and functional considerations, so that the added compounds increase the diversity of the original data set as much as possible and cover the chemical space left void between the libraries.
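Selection against an existing library can be illustrated with greedy MaxMin picking, a widely used diversity-selection heuristic. The source does not state which selection algorithm is actually applied, so the following is only a minimal sketch under that assumption. Compounds are represented as hypothetical sets of structural features, and `jaccard_distance` and `maxmin_pick` are illustrative names, not part of any tool mentioned here.

```python
def jaccard_distance(a, b):
    """Distance between two feature sets: 1 - |intersection| / |union|."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def maxmin_pick(candidates, reference, n_pick):
    """Greedily pick up to n_pick candidate ids.

    candidates: dict mapping compound id -> set of features (hypothetical).
    reference:  list of feature sets for the existing customer library.
    Each step takes the candidate whose nearest already-covered compound
    (reference or previously picked) is farthest away.
    """
    # Distance from every candidate to its nearest reference compound.
    nearest = {
        cid: min((jaccard_distance(feats, ref) for ref in reference), default=1.0)
        for cid, feats in candidates.items()
    }
    chosen = []
    while nearest and len(chosen) < n_pick:
        # Sorting the ids first makes ties resolve deterministically.
        best = max(sorted(nearest), key=nearest.get)
        chosen.append(best)
        best_feats = candidates[best]
        del nearest[best]
        # A picked compound now also counts as "covered" chemical space.
        for cid in nearest:
            d = jaccard_distance(candidates[cid], best_feats)
            nearest[cid] = min(nearest[cid], d)
    return chosen


# Toy data: single letters stand in for structural fragments.
existing_library = [{"a", "b"}]
stock = {"x": {"a", "b"}, "y": {"c", "d"}, "z": {"a", "c"}}
picked = maxmin_pick(stock, existing_library, n_pick=2)
```

In this toy case the compound identical to the existing library ("x") is picked last, because it adds no new chemical space.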

The collection consists of more than 15,000 chemical families (chemotypes), which provides the highest available diversity score and a reasonable number of analogs in each series.

Diversity calculations (described in Trepalin et al.).

Molecular diversity calculation procedures include assessment of 2D structural fragment descriptors, diversity of heterocycles, and maximal plate diversity. Each algorithm allows optimization of both intra- and interlibrary diversities.

The ChemoSoft™ approach is used to dissect a molecule into structural fragments and calculate the diversity. Taking each atom in the molecule in turn as a centroid, the algorithm determines two structural fragments (“screens”) around each centroid: one comprising the “nearest” atoms (captured in a sphere with a radius of one valence bond) and one comprising the “nearest” and “next-to-nearest” atoms (captured in a sphere with a radius of two valence bonds). After the screens are determined for each molecule, a combined “screen key” is created that contains all of the fragments describing the two (or more) molecules, including fragments that are common to the molecules under consideration as well as unique fragments. A binary code is then determined for each molecule: 1 or 0 is assigned to each screen that is present or absent in the molecule, respectively. The molecule keys are then compared using the following equation:
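The comparison equation is not reproduced in this text. The sketch below therefore assumes the Tanimoto coefficient, a common choice for comparing binary keys, which may differ from the formula used in ChemoSoft™. It also simplifies the screens: molecules are heavy-atom graphs, bond orders are ignored, and each screen records only the element symbols in the first and second shells. The function names `screens`, `binary_codes` and `tanimoto` are hypothetical.

```python
def screens(atoms, bonds):
    """Collect the one-bond and two-bond screens around every atom.

    atoms: dict mapping atom index -> element symbol.
    bonds: iterable of (index, index) pairs.
    """
    neighbours = {a: set() for a in atoms}
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)
    found = set()
    for centre, element in atoms.items():
        shell1 = neighbours[centre]
        # Atoms exactly two bonds away from the centroid.
        shell2 = set().union(*(neighbours[n] for n in shell1)) - shell1 - {centre}
        near = tuple(sorted(atoms[n] for n in shell1))
        far = tuple(sorted(atoms[n] for n in shell2))
        found.add((element, near))       # radius of one bond
        found.add((element, near, far))  # radius of two bonds
    return found


def binary_codes(screen_sets):
    """Build the combined screen key and one 0/1 code per molecule."""
    key = sorted(set().union(*screen_sets))
    codes = [[1 if s in have else 0 for s in key] for have in screen_sets]
    return key, codes


def tanimoto(code_a, code_b):
    """Assumed similarity measure: c / (a + b - c) on two binary codes."""
    a, b = sum(code_a), sum(code_b)
    c = sum(x & y for x, y in zip(code_a, code_b))
    return c / (a + b - c) if (a + b - c) else 1.0


# Heavy-atom graphs of ethanol (C-C-O) and propane (C-C-C).
ethanol = screens({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)])
propane = screens({0: "C", 1: "C", 2: "C"}, [(0, 1), (1, 2)])
key, (code_ethanol, code_propane) = binary_codes([ethanol, propane])
similarity = tanimoto(code_ethanol, code_propane)
```

With this simplified screen definition the two molecules share a single screen (a carbon with one carbon neighbour), so their similarity is low despite the common C-C backbone.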

Analog Series

The team developed and practiced a strategy combining the rigor of single compound synthesis in liquid phase with the high throughput of parallel synthesis and purification of combinatorial methods. One promising method enhancing the efficiency of compound synthesis is the use of multicomponent reactions, in which several building blocks are brought together in a single step.

We applied this method for the Ugi, Biginelli, Passerini, Tsuge, and other reactions and developed several significant modifications of known multicomponent reactions.

Such an approach gives our customers the opportunity to search directly for hit analogs among the compounds available from stock.

Privileged structures / Focused Sets

Privileged structures are defined as chemical scaffolds that are present in many biologically active ligands and that determine the molecule’s specificity (Evans et al., 1988).

Using “privileged” scaffolds as building blocks is advantageous in the synthesis of diverse sets of derivatives for discovery libraries, particularly when no small-molecule ligands are known for the target and no structural information is available. To enrich the IP potential of these libraries, we applied the privileged scaffolds approach to implement structural morphing of privileged structures based on functional equivalents of their constituent heteroatoms.

The team actively explores the space of natural and semi-natural scaffolds as an important library development strategy. When synthetic routes are designed for a new series, we maintain the “genetic originality” of the compounds by retaining their unique core scaffolds and applying focused modifications to the side chains.

Our strategy for designing lead generation (focused) libraries includes:

• Gathering project data (reference compounds, X-ray structure if available, literature, patents)

• Computational chemistry (methods depend on the target)

• Chemistry evaluation (investigate parallel feasibility)

• CCE database. Library ideas. Synthesis

The computational tools that we use regularly to narrow a large chemical space down to a more relevant one are listed below.

• ChemoSoft™ (MIPT)

• Smart Mining (MIPT)

• Cerius2 (Accelrys)

• Discovery Studio (Accelrys)

• NeuroSolution (NeuroDimension)

• ISIS Base (MDL)

• AutoDock (Scripps)

• Surflex (Biopharmics)

• MolSoft ICM (MolSoft LLC)

The first two programs were created here at MIPT/CDRI; the last one comes from our partner. All of these tools help maximize your chance of finding an active compound using such familiar approaches as docking, searches based on 3D pharmacophore models and shape similarity (the target-based strategy), and 2D fingerprint similarity, QSAR models and substructure searches (the ligand-based strategy).

We also actively use a Neural Network (NN) approach, frequently termed an AI-based approach, to assess various library parameters that influence ADME/PK characteristics, such as potential interactions with P450, blood-brain barrier permeability, DMSO solubility, the probability of modulating particular target classes, etc.

For such libraries, using ChemoSoft™, we can efficiently predict the major physicochemical parameters that are routinely used to assess the compounds’ drug-likeness based on Lipinski’s rule of 5, along with certain ADME predictions.

Examples of available focused libraries platforms

Kinase library (set of sub-libraries), GPCR library, AGRO library, Apoptotic library, CNS library (set of sublibraries), Fragment based library, HSP90 library, ion channel library, proteases library, NHR library, phosphatases library, peptidomimetic library, focused diversity set.

Examples of custom made focused sets:

• GPCRs: Serotonin 5-HT6, 5-HT7, 5-HT2C, Glutamate mGluR5, mGluR2/3, mGluR7, Galanin Gal-3, Dopamine D1, Histamine H1, H3, Neurotensin NT1/2, Bradykinin B1/2, Tachykinin NK1, NK3, Orexin OX-2, Opioid-like ORL-1, Urotensin, Bombesin brs3, Chemokine CXCR4, CXCR3, CCR5, CXCR1/2, CCR2, PAR-1, Oxytocin, Glucagon, Glucagon-like GLP-1, Cannabinoid CB-1/2, GPR30, GPR116, GPR119, TAAR1, GPR40, SNORF25, Niacin, etc.

• Kinases: VEGFR2, Raf-1, PI3Kalpha, CDK-1, Akt, mTOR, PKC-beta, GSK-3b, MK2, JNK1, IKKa, c-Met, CHK1, IGF1R, Rho, Aurora 1/2, etc.

• Ion Channels: P2X7, Vanilloid, NMDA, AMPA, iGluR3, Kv1-3, nAChR-alpha7, nAChR-alpha4beta2, etc.

• Nuclear Receptors: PR-2, RARgamma, ERRalpha, FXR, LXR, ERbeta, GR, etc.

• Others: MCL-1, BCL-2, HSP-90, GlyT1/2, PDZ, Sulfotransferase, etc.

For a project with a customer, we propose a special set of 23,000 compounds directed at the requested target(s) (see the description in a separate presentation).

Drug-Likeness/Potential IP

Most screening compounds in the collection satisfy stringent criteria that are partly predictable from molecular properties. In establishing the diversity-oriented chemical space for discovery libraries, we use four types of Medicinal Chemistry Filters (MCF), which sieve out compounds containing chemical groups that are undesirable in drug development for various reasons.

The first filter (MCF-1) screens the compounds for the presence of 100 chemical groups considered reactive, unstable, or toxicophoric (e.g. haloanhydrides, hydrazines, aldehydes, etc.).


The second filter (MCF-2), based on 30 groups (e.g. naphthylamines, barbiturates, acetals, thiols, etc.), flags the chemotypes believed to be toxic, carcinogenic, mutagenic, etc. The applicability of this filter depends on the purpose of the library design. For example, a library targeted against metal-containing enzymes (MMP, PDF, HAD, etc.) should include compounds with all types of chelating groups (hydroxamic acids, thiols, oximes, etc.).

The third filter (MCF-3) evaluates the physicochemical parameters of the compounds and classifies them in accordance with Lipinski’s “rule of 5” (LR5) of drug-likeness (Lipinski et al., 2001) and Veber’s “rule of 2” (VR2) (Veber et al., 2002). However, this filter is not universal, as the rules relate to the compounds’ bioavailability rather than to their efficacy. For example, the majority of anti-infective or oncolytic drugs do not conform to either LR5 or VR2. We recommend applying human medicinal chemistry expertise to further analyze the compounds flagged by MCF-3.
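A rule-based filter of the MCF-3 kind can be sketched as follows. This is an illustration only, not the team's implementation: it assumes the commonly cited thresholds (LR5: molecular weight ≤ 500, logP ≤ 5, at most 5 hydrogen-bond donors and 10 acceptors; VR2: at most 10 rotatable bonds and a polar surface area ≤ 140 Å²) and takes the descriptors as precomputed inputs. The `Descriptors` class, the function names and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties of one compound (hypothetical container)."""
    mol_weight: float          # molecular weight, Da
    logp: float                # calculated octanol/water logP
    h_donors: int              # hydrogen-bond donors
    h_acceptors: int           # hydrogen-bond acceptors
    rotatable_bonds: int
    polar_surface_area: float  # polar surface area, square angstroms


def lipinski_violations(d):
    """Return the rule-of-5 criteria that the compound breaks."""
    checks = {
        "mol_weight > 500": d.mol_weight > 500,
        "logp > 5": d.logp > 5,
        "h_donors > 5": d.h_donors > 5,
        "h_acceptors > 10": d.h_acceptors > 10,
    }
    return [name for name, broken in checks.items() if broken]


def veber_violations(d):
    """Return the Veber criteria that the compound breaks."""
    checks = {
        "rotatable_bonds > 10": d.rotatable_bonds > 10,
        "polar_surface_area > 140": d.polar_surface_area > 140,
    }
    return [name for name, broken in checks.items() if broken]


def mcf3_flags(d):
    """Report violations per rule set; the final decision stays with a chemist."""
    return {"LR5": lipinski_violations(d), "VR2": veber_violations(d)}


# Illustrative descriptor values, not measured data.
small_molecule = Descriptors(180.2, 1.2, 1, 4, 3, 63.6)
large_molecule = Descriptors(812.0, 6.3, 6, 12, 15, 200.0)
report_small = mcf3_flags(small_molecule)
report_large = mcf3_flags(large_molecule)
```

Returning the list of broken criteria, rather than a pass/fail verdict, fits the recommendation above that flagged compounds be reviewed by a medicinal chemist instead of being rejected automatically.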

The fourth filter (MCF-4) screens the compounds on the basis of their novelty and IP potential. For example, it would reject trivial compounds readily obtainable by coupling of two simple commercially available reagents. These simplistic, easy to synthesize “garbage” compounds are present in almost all random libraries from commercial sources, yet they pass through the general in silico filters (as a rule, these compounds lack reactive and toxicophoric groups, have low molecular weight, high hydrophilicity, low number of rotatable bonds, etc.). The MCF-4 filter detects such compounds and allows the library designer to make a decision on the desirability of having such compounds in the collection.

For IP assessment of novel compounds, we run similarity searches in SciFinder and use the scientific databases Beilstein, Integrity and Wombat.

SAR support

All products resulting from customer screening projects can be backed by hit-to-lead development chemistry support from the MIPT Medicinal Chemistry Team.

• typically, small teams of highly skilled synthetic chemists supervised by a project manager (2-4 chemistry FTEs per biological target)

• project timelines – 6-12 months (e.g. “Milestone X – improvement of in vitro potency 1 µM → <100 nM in enzymatic assay, 10 µM → <1 µM in cell-based assay – 3-4 months” requires 3-FTE synthetic support)

• frequent project team meetings with assay biology team and the client’s project team (project manager)

• highly flexible synthetic program, with frequent synthetic target changes

• the team is motivated by and rewarded upon achieving a particular contractual milestone

All project activities require a high level of data and information security and sound data-keeping practices for adequate IP protection.

Resynthesis and re-supply

All products can be supported by adequate replenishment and scale-up. Resupply is subject to MIPT’s current stock availability. Compound resynthesis (subject to the Customer’s confirmation) will be offered by MIPT if no adequate sample quantity is available for resupply.

The CI Team’s comprehensive chemistry skill set includes:

• Transition-metal-catalyzed reactions (Pd-catalyzed Sonogashira, Buchwald, Heck and Suzuki couplings, Grubbs metathesis, etc.)

• Large-scale (100-150 g per run) organometallic reactions (lithiation of aromatics and heteroaromatics; Cu, Mg, Zn, Sm chemistry)

• Modern protective group strategies

• High-pressure and high-temperature reactions

• Microwave-assisted organic synthesis (set of Biotage and CEM automated reactors)

• Solid phase supported library synthesis (Teabag technology)

• Liquid phase parallel synthesis (Solid phase catalysts, Scavenger resins, SPE purification)

Quality Control

Purity is not less than 90% (+/- 5% accuracy). We provide 100% quality control for all compounds and guarantee more than 95% purity (+/- 5% accuracy). Purity is confirmed by 1H NMR and/or LC (UV)/MS spectra, supplied in electronic format (MS TIF files) for all compounds available from stock.

FIG. 2: Molecular structures relevant to hair growth, obtained through our CIS2 project 2018