Title
Selectivity profiling of BCRP versus P-gp inhibition: from automated collection of polypharmacology data to multi-label learning
Abstract
Background: The human ATP binding cassette transporters Breast Cancer Resistance Protein (BCRP) and Multidrug Resistance Protein 1 (P-gp) are co-expressed in many tissues and barriers, especially at the blood–brain barrier and at the hepatocyte canalicular membrane. Understanding their interplay in affecting the pharmacokinetics of drugs is of prime interest. In silico tools to predict inhibition and substrate profiles towards BCRP and P-gp might serve as early filters in the drug discovery and development process. However, to build such models, pharmacological data must be collected for both targets, which is a tedious task, often involving manual and poorly reproducible steps.
Results: Compounds with inhibitory activity measured against BCRP and/or P-gp were retrieved by combining Open Data and manually curated data from literature using a KNIME workflow. After determination of compound overlap, machine learning approaches were used to establish multi-label classification models for BCRP/P-gp. Different ways of addressing multi-label problems are explored and compared: label-powerset, binary relevance and classifiers chain. Label-powerset revealed important molecular features for selective or polyspecific inhibitory activity. In our dataset, only two descriptors (the numbers of hydrophobic and aromatic atoms) were sufficient to separate selective BCRP inhibitors from selective P-gp inhibitors. Also, dual inhibitors share properties with both groups of selective inhibitors. Binary relevance and classifiers chain make it possible to improve the predictivity of the models.
Conclusions: The KNIME workflow proved a useful tool to merge data from diverse sources. It could be used for building multi-label datasets of any set of pharmacological targets for which data are available either in the open domain or in-house. By applying various multi-label learning algorithms, important molecular features driving transporter selectivity could be retrieved. Finally, using the dataset with missing annotations, predictive models can be derived in cases where no accurate dense dataset is available (not enough data overlap or no well-balanced class distribution).
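The three multi-label strategies named in the abstract differ mainly in how the two transporter labels are turned into training targets. The following is a minimal sketch of those target transformations only, not the authors' KNIME workflow or models. The compound identifiers, the descriptor values and all function names are hypothetical and chosen for illustration; None marks a missing annotation.

```python
# Hypothetical toy annotations: compound id -> (BCRP label, P-gp label).
# 1 = inhibitor, 0 = non-inhibitor, None = not measured for that transporter.
annotations = {
    "cpd_a": (1, 0),
    "cpd_b": (0, 1),
    "cpd_c": (1, 1),
    "cpd_d": (0, 0),
    "cpd_e": (1, None),
    "cpd_f": (None, 0),
}

# Made-up descriptor vectors, one per compound (illustrative values only).
features = {
    "cpd_a": [3, 2],
    "cpd_b": [7, 1],
    "cpd_c": [5, 2],
    "cpd_d": [1, 0],
    "cpd_e": [4, 3],
    "cpd_f": [2, 1],
}

POWERSET_NAMES = {
    (1, 0): "BCRP-selective",
    (0, 1): "P-gp-selective",
    (1, 1): "dual inhibitor",
    (0, 0): "non-inhibitor",
}


def label_powerset(data):
    """Label-powerset: one multi-class target per compound.

    Each distinct label combination becomes its own class, so only
    compounds annotated for both transporters (dense data) can be used.
    """
    return {
        cid: POWERSET_NAMES[labels]
        for cid, labels in data.items()
        if None not in labels
    }


def binary_relevance(data):
    """Binary relevance: one independent binary target per transporter.

    A compound missing one annotation still contributes to the other target.
    """
    targets = {"BCRP": {}, "P-gp": {}}
    for cid, (bcrp, pgp) in data.items():
        if bcrp is not None:
            targets["BCRP"][cid] = bcrp
        if pgp is not None:
            targets["P-gp"][cid] = pgp
    return targets


def chain_training_rows(data, descriptors):
    """Classifier chain, second link: predict P-gp from descriptors + BCRP label.

    This sketch uses the measured BCRP label at training time and therefore
    keeps only compounds annotated for both transporters.
    """
    rows = {}
    for cid, (bcrp, pgp) in data.items():
        if bcrp is not None and pgp is not None:
            rows[cid] = (descriptors[cid] + [bcrp], pgp)
    return rows


if __name__ == "__main__":
    print(label_powerset(annotations))
    print(binary_relevance(annotations))
    print(chain_training_rows(annotations, features))
```

In this toy setup, label-powerset can use only the four fully annotated compounds, whereas binary relevance keeps five compounds per transporter. That difference is one way to read the abstract's remark that models can still be derived from a dataset with missing annotations.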
Keywords
BCRP; P-glycoprotein; Open Data; Multi-label classification; Binary relevance; Classifiers chain; Selective inhibition; Polyspecific inhibition; KNIME; Open PHACTS
Object type
Language
English [eng]
Persistent identifier
https://phaidra.univie.ac.at/o:533839
Published in
Title
Journal of Cheminformatics
Volume
8
Publisher
Springer Nature
Publication date
2016