In this study, we developed an AI-driven protein stability prediction framework that integrates AlphaFold-generated structures, molecular refinement, and deep learning techniques. Multiple machine learning and neural network architectures were evaluated, with a multi-input CNN model leveraging protein contact maps delivering the best performance. The framework enables efficient prediction of mutation-induced stability changes, reducing reliance on extensive experimental screening and supporting faster, data-driven protein engineering.
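As a rough illustration of the contact-map input mentioned above, the sketch below builds a binary Cα-Cα contact map from residue coordinates. The function name `ca_contact_map`, the 8 Å cutoff, and the toy coordinates are assumptions made for this example. The study's actual map construction and thresholds are not specified here.

```python
import math


def ca_contact_map(ca_coords, cutoff=8.0):
    """Return a binary N x N contact map from C-alpha coordinates.

    `cutoff` is in angstroms; 8.0 is an assumed, commonly used threshold,
    not a value taken from the study.
    """
    n = len(ca_coords)
    return [
        [1 if math.dist(ca_coords[i], ca_coords[j]) <= cutoff else 0
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy chain: three residues spaced 3.8 angstroms apart,
# plus one distant residue that contacts only itself.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
cmap = ca_contact_map(coords)
for row in cmap:
    print(row)
```

A map like this is a symmetric 2D array, which is why it can be fed to a 2D CNN in the same way as a single-channel image.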
Highlights:
- By integrating 1D contact scores and 2D spatial maps, the model captures complex protein interactions more accurately than traditional approaches.
- Explored Random Forest, single-input 2D CNN, multi-input 2D CNN, and multimodal CNN approaches; the multi-input 2D CNN trained on sum of contact scores (SCS), van der Waals, and Cα-Cα contact maps outperformed the other models.
- Achieved an accuracy of 0.679, a negative predictive value of 0.74, and a specificity of 0.81 in predicting protein stability changes.
- Our approach addresses data heterogeneity and overfitting through rigorous normalization and validation techniques.
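
The metrics reported in the highlights (accuracy, negative predictive value, specificity) all derive from a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the study's evaluation code. The function name `binary_metrics`, the label encoding (1 = positive class, 0 = negative class), and the toy labels are assumptions for illustration.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, NPV, and specificity for binary labels (0 or 1).

    Assumes 1 marks the positive class and 0 the negative class; which
    stability class is treated as positive in the study is not stated here.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "npv": ratio(tn, tn + fn),          # TN / (TN + FN)
        "specificity": ratio(tn, tn + fp),  # TN / (TN + FP)
    }


# Hypothetical toy labels, not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Negative predictive value and specificity both have true negatives in the numerator, so together they describe how reliably the model identifies the negative class.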

