MTPNet is a novel deep learning framework for activity cliff prediction, a crucial task in drug discovery and material design. Unlike previous models limited to single-target prediction, MTPNet leverages multi-grained semantic information from receptor proteins to guide molecular representation learning, achieving state-of-the-art performance across both regression and classification tasks.
Existing computational methods for activity cliff prediction handle only a single binding target, which restricts the applicability of these prediction models.
MTPNet addresses this limitation by incorporating Macro-level Target Semantic (MTS) and Micro-level Pocket Semantic (MPS) guidance into a unified framework. It dynamically conditions molecular embeddings on protein-level semantics, capturing complex molecular-protein interactions.
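To make "conditioning molecular embeddings on protein-level semantics" more concrete, here is a minimal sketch of one generic conditioning scheme: feature-wise scale-and-shift modulation. This is a stand-in chosen for illustration, not MTPNet's actual MTS/MPS mechanism, which this README does not spell out. The function names, dimensions, and random weights are all hypothetical.

```python
import math
import random


def linear(weights, vector):
    """Multiply a (rows x cols) weight matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def condition_on_protein(mol_embedding, protein_embedding, w_scale, w_shift):
    """Scale and shift each molecular feature using values predicted
    from the protein embedding (illustrative only, not MTPNet's layer)."""
    # 1 + tanh(.) keeps the scale in (0, 2) and equal to 1 for a zero input.
    scale = [1.0 + math.tanh(s) for s in linear(w_scale, protein_embedding)]
    shift = linear(w_shift, protein_embedding)
    return [g * h + b for g, h, b in zip(scale, mol_embedding, shift)]


# Hypothetical sizes and randomly initialised (untrained) weights.
rng = random.Random(0)
mol_dim, prot_dim = 4, 3
w_scale = [[rng.uniform(-0.5, 0.5) for _ in range(prot_dim)] for _ in range(mol_dim)]
w_shift = [[rng.uniform(-0.5, 0.5) for _ in range(prot_dim)] for _ in range(mol_dim)]

molecule = [0.2, -1.0, 0.5, 0.8]   # one molecular embedding
target_a = [1.0, 0.0, 0.0]         # embedding of one protein target
target_b = [0.0, 1.0, 0.0]         # embedding of a different target

# The same molecule gets a different representation for each target.
print(condition_on_protein(molecule, target_a, w_scale, w_shift))
print(condition_on_protein(molecule, target_b, w_scale, w_shift))
```

The point of the sketch is only that a single molecular embedding yields target-specific representations once it is conditioned on the receptor, which is what lets one model serve several binding targets.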
On 30 benchmark datasets from MoleculeACE, MTPNet outperforms existing methods with an average 18.95% RMSE improvement across several mainstream GNN architectures.
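As a reading aid for the RMSE figure, the sketch below shows how an average relative RMSE improvement can be computed. It assumes the average is taken over per-dataset relative improvements; the exact aggregation behind the 18.95% figure is not stated here. The RMSE pairs are made-up placeholders, not MTPNet results.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(baseline_rmse, model_rmse):
    """Percentage by which model_rmse is lower than baseline_rmse."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


def mean_relative_improvement(pairs):
    """Average the per-dataset improvements over (baseline, model) pairs."""
    gains = [relative_improvement(b, m) for b, m in pairs]
    return sum(gains) / len(gains)


# Hypothetical (baseline RMSE, guided-model RMSE) pairs for three datasets.
example_pairs = [(0.80, 0.64), (1.00, 0.90), (0.50, 0.40)]
print(round(mean_relative_improvement(example_pairs), 2))
```

Lower RMSE is better, so a positive value means the guided model has a smaller error than its baseline.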
MTPNet supports two major tasks:
- Regression
  - Benchmark platform: MoleculeACE
  - 30 datasets, each representing an individual macromolecular target
  - Over 35,000 molecules in total
  - 12 datasets contain fewer than 1,000 training samples → ideal for low-data evaluation
- Classification
  - Dataset: CYP3A4 activity cliff data
  - Source: [Veith et al., 2009], curated by [Rao et al., 2022]
  - 3,626 active inhibitors/substrates vs. 5,496 inactive compounds
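The CYP3A4 class counts above imply a moderately imbalanced classification problem. The short calculation below derives the class balance and a simple majority-class baseline from those counts. Whether MTPNet applies any class weighting is not stated here, so `pos_weight` is shown only as a common option.

```python
# Counts taken from the dataset description above.
n_active = 3626
n_inactive = 5496

total = n_active + n_inactive
active_fraction = n_active / total

# Ratio of negatives to positives, a common choice for a positive-class weight.
# This is an illustrative option, not a documented MTPNet setting.
pos_weight = n_inactive / n_active

# Accuracy of a trivial classifier that always predicts "inactive".
majority_baseline_accuracy = max(n_active, n_inactive) / total

print(f"total compounds: {total}")
print(f"active fraction: {active_fraction:.3f}")
print(f"pos_weight: {pos_weight:.3f}")
print(f"majority baseline accuracy: {majority_baseline_accuracy:.3f}")
```

Because always predicting "inactive" already scores well above 50% accuracy, metrics that account for imbalance are more informative than raw accuracy on this dataset.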
- Clone this repository:
- Create the Conda environment (CUDA 11.8 compatible):
📝 Tip: You can also specify individual datasets using `--dataset CHEMBL244_Ki`, etc.
By default, training uses all datasets (`--dataset ALL`). You can also train on a single dataset by replacing `ALL` with one of the dataset names:
📌 Each dataset corresponds to a specific protein target. You can find all supported names in the MoleculeACE benchmark.
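The sketch below illustrates how a `--dataset` option with an `ALL` default could be resolved into concrete dataset names. It is not this repository's actual training script: `KNOWN_DATASETS`, `split_dataset_name`, `resolve_datasets`, and `build_parser` are hypothetical names, and the registry contains only the one dataset named in this README. It also assumes dataset names follow the `<target ID>_<measurement type>` pattern seen in `CHEMBL244_Ki`.

```python
import argparse

# Hypothetical registry: only CHEMBL244_Ki is named in this README;
# the full list comes from the MoleculeACE benchmark.
KNOWN_DATASETS = ["CHEMBL244_Ki"]


def split_dataset_name(name):
    """Split a name such as 'CHEMBL244_Ki' into ('CHEMBL244', 'Ki')."""
    target_id, sep, measurement = name.rpartition("_")
    if not sep or not target_id or not measurement:
        raise ValueError(f"unexpected dataset name: {name!r}")
    return target_id, measurement


def resolve_datasets(choice, known=KNOWN_DATASETS):
    """Expand 'ALL' to every known dataset, or validate a single name."""
    if choice == "ALL":
        return list(known)
    if choice not in known:
        raise ValueError(f"unknown dataset: {choice!r}")
    return [choice]


def build_parser():
    """Build a parser with the --dataset option described above."""
    parser = argparse.ArgumentParser(description="Illustrative dataset selection")
    parser.add_argument("--dataset", default="ALL",
                        help="dataset name, or ALL for every dataset")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--dataset", "CHEMBL244_Ki"])
for name in resolve_datasets(args.dataset):
    print(name, split_dataset_name(name))
```

Validating the name up front gives a clear error for a mistyped dataset instead of a failure later during data loading.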
This project is licensed under the MIT License.


