Updates
Release 2020.05.01
AMED cardiotoxicity database release 2020.05.01 expands the coverage of inhibition data to all human ion channels related to heart rate. The ion channels were obtained from UniProt as human ion channels annotated with GO:0086091 (regulation of heart rate by cardiac conduction) or GO:0002027 (regulation of heart rate). Inhibitory activities assigned to these ion channels were then extracted from ChEMBL (v25) and merged with those for hERG, Nav1.5, Kv1.5, and Cav1.2 already registered in the previous release 2020.04.02. Table 1 lists the targets and the number of assay results and compounds registered in AMED cardiotoxicity database release 2020.05.01.
Table 1

| Target | Target type | The number of assays | The number of compounds |
|---|---|---|---|
| CAB2 | SINGLE PROTEIN | 1 | 1 |
| CACNA2D1 | SINGLE PROTEIN | 164 | 156 |
| Cav1.2 | SINGLE PROTEIN | 578 | 427 |
| Cav1.3 | SINGLE PROTEIN | 36 | 29 |
| Cav3.1 | SINGLE PROTEIN | 1,146 | 746 |
| HCN4 | SINGLE PROTEIN | 66 | 51 |
| hERG | SINGLE PROTEIN | 345,280 | 324,114 |
| hTRPM4 | SINGLE PROTEIN | 1 | 1 |
| KCNE1(MinK) | SINGLE PROTEIN | 9 | 9 |
| Kir2.1 | SINGLE PROTEIN | 19 | 19 |
| Kv1.5 | SINGLE PROTEIN | 1,387 | 1,086 |
| Kv4.3 | SINGLE PROTEIN | 76 | 68 |
| Kv7.1 | SINGLE PROTEIN | 109 | 73 |
| Nav1.5 | SINGLE PROTEIN | 2,860 | 2,233 |
| Nav1.8 | SINGLE PROTEIN | 315 | 227 |
| Ryanodine receptor 2 | SINGLE PROTEIN | 63 | 53 |
| Sodium channel alpha subunit | PROTEIN FAMILY | 16 | 16 |
| Voltage-gated L-type calcium channel | PROTEIN FAMILY | 290 | 215 |
| Voltage-gated potassium channel | PROTEIN FAMILY | 19 | 19 |
| Voltage-gated calcium channel | PROTEIN COMPLEX GROUP | 51 | 49 |
| CACNA2D1/Cav2.2/CAB3 | PROTEIN COMPLEX | 2 | 1 |
| Cav1.2/CACNA2D1/CAB1 | PROTEIN COMPLEX | 64 | 49 |
| Cav2.1/CACNA2D1/CAB1 | PROTEIN COMPLEX | 6 | 6 |
| KChIP2/Kv4.3 | PROTEIN COMPLEX | 4 | 4 |
| Kir3.1/Kir3.2 | PROTEIN COMPLEX | 249 | 121 |
| Kir3.1/Kir3.4 | PROTEIN COMPLEX | 324 | 177 |
| Kv7.1/KCNE1(MinK) | PROTEIN COMPLEX | 175 | 133 |
| Kv7.1/Misshapen-like kinase 1 | PROTEIN COMPLEX | 3 | 3 |
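The GO-term selection step described for this release can be sketched as follows. This is only an illustration under stated assumptions: the record layout (`accession`, `go_terms`), the function name, and the example entries are hypothetical placeholders, not real UniProt data or the pipeline actually used.

```python
# GO terms named in the release note.
HEART_RATE_GO_TERMS = {"GO:0086091", "GO:0002027"}


def select_heart_rate_channels(entries):
    """Return the entries annotated with at least one heart-rate GO term.

    `entries` is a hypothetical list of dicts with "accession" and
    "go_terms" keys, standing in for whatever UniProt export is used.
    """
    return [e for e in entries if HEART_RATE_GO_TERMS & set(e["go_terms"])]


# Placeholder records for illustration only (not real UniProt entries).
example_entries = [
    {"accession": "EXAMPLE_A", "go_terms": ["GO:0086091", "GO:0005216"]},
    {"accession": "EXAMPLE_B", "go_terms": ["GO:0005216"]},
    {"accession": "EXAMPLE_C", "go_terms": ["GO:0002027"]},
]

selected = select_heart_rate_channels(example_entries)
print([e["accession"] for e in selected])  # → ['EXAMPLE_A', 'EXAMPLE_C']
```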
Updates
Release 2020.04.02
AMED cardiotoxicity database release 2020.04.02 contains the following updates to the data and prediction models. Detailed descriptions of each update and function will be available soon.
Update of database
1. Inclusion of activity information about other ion channels related to cardiotoxicity (Nav1.5, Kv1.5, and Cav1.2) along with hERG.
2. Updates following those of the public databases.
• ChEMBL (→v.24)
• NCGC (→v2.1)
3. Additional assay results measured in the AMED "Development of a Drug Discovery Informatics System" project by RIKEN, outsourced to Eurofins (hERG) and Icagen (Nav1.5, Kv1.5, and Cav1.2).
Additional functions
1. Updated hERG discrimination model
• New model corresponding to database update
• Assessment of the applicability domain based on molecular similarity to the training data
• Output values are now scaled as probabilities from 0 to 1
• Change of SVM implementation from SVMlight to scikit-learn (sklearn.svm.SVC)
2. New prediction models
• hERG regression model (implemented using scikit-learn, sklearn.svm.SVR)
• Nav1.5 discrimination/regression model (implemented using tensorflow.keras) with pre-training on data for other sodium channels
• Kv1.5 discrimination model (implemented using tensorflow.keras) with pre-training on data for other potassium channels
• Cav1.2 discrimination model (implemented using tensorflow.keras) with pre-training on data for other calcium channels
3. Downloads of search/prediction results
• Search/prediction results can be downloaded by clicking the "download" button on the results page.
• Due to the calculation cost, SDF input for prediction is limited to 100 compounds per run.
• Direct download of the whole database was removed because of the large data size and an ongoing collaboration with a software company. Once the license and usage conditions are settled, we intend to provide an application form for data download.
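Because prediction accepts at most 100 compounds per SDF upload, a larger file has to be split on the user's side before submission. The sketch below shows one way to do that using only the standard SDF record delimiter (a line containing `$$$$`). The function names are hypothetical, the example records are placeholders rather than valid molecules, and nothing here is part of the site's own tooling.

```python
MAX_COMPOUNDS_PER_RUN = 100  # limit stated in the release note


def split_sdf_records(sdf_text):
    """Split SDF text into records; each record ends with a '$$$$' line.

    Trailing lines that are not closed by '$$$$' are ignored.
    """
    records, current = [], []
    for line in sdf_text.splitlines():
        current.append(line)
        if line.strip() == "$$$$":
            records.append("\n".join(current) + "\n")
            current = []
    return records


def batch_records(records, size=MAX_COMPOUNDS_PER_RUN):
    """Group records into SDF strings holding at most `size` records each."""
    return ["".join(records[i:i + size]) for i in range(0, len(records), size)]


# Placeholder input: 250 dummy records, used only to exercise the delimiter logic.
sdf_text = "".join(f"placeholder_{i}\n\n\n$$$$\n" for i in range(250))
batches = batch_records(split_sdf_records(sdf_text))
print([b.count("$$$$") for b in batches])  # → [100, 100, 50]
```

Each element of `batches` can then be saved as its own file and submitted as a separate run.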
AMED Cardiotoxicity Database is a database of small molecules that bind to various ion channels and potentially pose a cardiotoxic risk.
AMED Cardiotoxicity Database compiles cardiotoxicity-related information from publicly available databases and integrates it in a standardized format. As an initial target, bioactivities for the hERG potassium channel were collected from ChEMBL, the NIH Chemical Genomics Center, and hERGCentral, because inhibition of the hERG potassium channel is closely related to prolongation of the QT interval, and assessing this risk could greatly help to avoid delays in the development of therapeutic compounds and the withdrawal of marketed drugs.
Data sources
1. ChEMBL
ChEMBL is a bioactivity database maintained by the European Bioinformatics Institute and is frequently used in cheminformatics research as the de facto standard database. Using the target ID of hERG (CHEMBL240), 2,153 hERG-related bioassays were found in ChEMBL version 22, and 10,976 activity entries for these assays were extracted. To ensure the validity of the data, entries with a low confidence value, undesirable data validity comments, or a potential-duplicate flag were excluded. hERG-related assays that did not measure inhibitory activity were removed manually after checking the assay descriptions.
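The three exclusion criteria above can be pictured as a simple row filter. The sketch below is illustrative only: the keys `confidence_score`, `data_validity_comment`, and `potential_duplicate` are assumed to mirror ChEMBL columns of those names, the cutoff of 8 is an assumption (the text does not state what counts as a "low" confidence value), and treating every non-empty validity comment as undesirable is a simplification.

```python
MIN_CONFIDENCE = 8  # assumed cutoff; the source only says "low confidence value"


def keep_activity(row, min_confidence=MIN_CONFIDENCE):
    """Return True if an activity row passes all three exclusion criteria."""
    if row["confidence_score"] < min_confidence:
        return False  # low-confidence target assignment
    if row["data_validity_comment"]:
        return False  # simplification: any validity comment is treated as undesirable
    if row["potential_duplicate"]:
        return False  # flagged as a potential duplicate
    return True


# Hypothetical rows for illustration; they are not taken from ChEMBL.
rows = [
    {"activity_id": 1, "confidence_score": 9, "data_validity_comment": None, "potential_duplicate": 0},
    {"activity_id": 2, "confidence_score": 4, "data_validity_comment": None, "potential_duplicate": 0},
    {"activity_id": 3, "confidence_score": 9, "data_validity_comment": "Outside typical range", "potential_duplicate": 0},
    {"activity_id": 4, "confidence_score": 9, "data_validity_comment": None, "potential_duplicate": 1},
]

kept = [r for r in rows if keep_activity(r)]
print([r["activity_id"] for r in kept])  # → [1]
```

The manual removal of assays that do not measure inhibition is a curation step and is not captured by a filter like this.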
2. NIH Chemical Genomics Center (NCGC)
Quantitative high-throughput screening data on in vitro hERG channel blockage, generated by NCGC, were obtained from PubChem BioAssay (AID = 588834). The data related to hERG (about 2,688 compounds) in the LOPAC1280 library (Sigma) were determined by the FluxOR™ thallium flux assay. The data contain EC50 values for both hERG inhibitors and activators, along with some undefined data, because the EC50 values were calculated by automated sigmoid curve fitting of the hERG dose-response data with the Hill equation, without distinguishing the sign of the Hill coefficient (inhibitor or activator) or the fitting quality (inconclusive entries). Some results in this dataset are also included in the ChEMBL database. However, the outcome comments that specify whether an EC50 value refers to an inhibitor, an activator, or an inconclusive entry are omitted in ChEMBL, so the corresponding entries were excluded from the ChEMBL dataset. In the NCGC dataset, hERG inhibitors were defined as entries with both an outcome comment specifying "inhibitor" and sufficient inhibitory activity (EC50 < 10 µM in this case). Compounds with EC50 exceeding 10 µM, compounds specified as activators, and inconclusive entries were defined as negative compounds.
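The labeling rule for the NCGC dataset can be written as a small function. This is a minimal sketch, assuming the outcome comment is available as a plain string such as "inhibitor", "activator", or "inconclusive" (hypothetical spellings) and that a missing EC50 is represented by `None`. An EC50 of exactly 10 µM is not covered by the text and is treated as negative here.

```python
EC50_CUTOFF_UM = 10.0  # threshold stated in the text (EC50 < 10 µM)


def label_ncgc_entry(outcome, ec50_um):
    """Return 'inhibitor' or 'negative' for one NCGC entry.

    An entry is an inhibitor only if its outcome comment says "inhibitor"
    AND its EC50 is below the cutoff. Activators, inconclusive entries,
    and weak inhibitors are all labeled negative.
    """
    if outcome == "inhibitor" and ec50_um is not None and ec50_um < EC50_CUTOFF_UM:
        return "inhibitor"
    return "negative"


print(label_ncgc_entry("inhibitor", 1.2))      # → inhibitor
print(label_ncgc_entry("inhibitor", 25.0))     # → negative
print(label_ncgc_entry("activator", 0.5))      # → negative
print(label_ncgc_entry("inconclusive", None))  # → negative
```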
3. hERGCentral
hERGCentral is a database containing hERG activity information for more than 300,000 compounds. Because the hERGCentral database (www.hergcentral.org) is currently unavailable, the percent inhibition values of 318,496 compounds at 10 µM, determined with IonWorks Quattro (MDC, Sunnyvale, CA) in population patch clamp (PPC) mode, were retrieved from the supporting information of the paper on the statistical analysis of the hERGCentral dataset published by Fang et al. (2015).
The number of compounds
AMED cardiotoxicity database consists of 9,259 hERG inhibitors (IC50 ≤ 10 µM) and 279,718 inactive compounds (IC50 > 10 µM). An assessment of structural diversity using Murcko frameworks revealed that the database contains about twice as many scaffolds for hERG inhibitors as any of the existing databases and covers 18.0% of the chemical space occupied by all compounds in ChEMBL (438,551 frameworks).
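The coverage and scaffold-ratio figures can be checked with a few lines of arithmetic. The sketch below uses only the framework counts reported on this page; the variable names are illustrative.

```python
# Murcko framework counts reported on this page.
amed_all_frameworks = 79_014
chembl_all_frameworks = 438_551

coverage = amed_all_frameworks / chembl_all_frameworks
print(f"Coverage of ChEMBL frameworks: {coverage:.1%}")  # → 18.0%

# Inhibitor frameworks in each source database versus the integrated database.
source_inhibitor_frameworks = {"ChEMBL": 2_474, "NCGC": 173, "hERGCentral": 2_708}
amed_inhibitor_frameworks = 5_203

ratios = {
    name: amed_inhibitor_frameworks / count
    for name, count in source_inhibitor_frameworks.items()
}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```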
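| Database | Class | Number of compounds | Number of Murcko frameworks |
|---|---|---|---|
| ChEMBL | Inhibitors | 4,793 | 2,474 |
| ChEMBL | Inactives | 5,275 | 3,012 |
| ChEMBL | All | 10,068 | 4,954 |
| NCGC | Inhibitors | 232 | 173 |
| NCGC | Inactives | 1,234 | 504 |
| NCGC | All | 1,466 | 639 |
| hERGCentral | Inhibitors | 4,321 | 2,708 |
| hERGCentral | Inactives | 274,536 | 73,419 |
| hERGCentral | All | 278,857 | 74,687 |
| AMED cardiotoxicity database | Inhibitors | 9,259 | 5,203 |
| AMED cardiotoxicity database | Inactives | 279,718 | 75,868 |
| AMED cardiotoxicity database | All | 288,977 | 79,014 |

Reference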
Sato T, Yuki H, Ogura K, Honma T. Construction of an integrated database for hERG blocking small molecules.
PLOS ONE 13(7) (2018): e0199348. https://doi.org/10.1371/journal.pone.0199348
Last Update: 2020-05-01
Copyright © 2017 AMED - RIKEN . All Rights Reserved.