Guolin Ke is currently the head of the Machine Learning Group at DP Technology, working on AI for Science. Previously, he was a Senior Researcher in the Machine Learning Group at Microsoft Research Asia (MSRA), where he focused on high-performance machine learning algorithms and large-scale pretrained language models.
Guolin’s current research focuses on AI for Science, specifically large-scale representation learning for molecules, 3D geometry learning, 3D molecular generation, protein structure prediction, and more. His goal is to accelerate scientific discovery through machine learning. If you are interested in joining his team, please email your CV to him.
During his academic career, Guolin won several machine learning competitions, including first place (1M CNY) in the 1st Alibaba (Tmall) Big Data Competition (2014) and first place (100K USD) in the 1st Didi Algorithm Competition (2016). In 2016, during his internship at MSRA, he created LightGBM, one of the most popular GBDT tools; it has received ~15K stars on GitHub and 220M+ total downloads. In 2021, he led the development of Graphormer, which won first place in the quantum prediction track of the Open Graph Benchmark Large-Scale Challenge (KDD Cup 2021) and first place in the Open Catalyst Challenge (NeurIPS 2021). In 2021 and 2022, he led the development of Uni-Fold, the first reimplementation of AlphaFold with open-source training code and the full training dataset.
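For readers unfamiliar with LightGBM, here is a minimal, self-contained training sketch using its native Python API; the synthetic data and parameter values are illustrative only and are not tied to any of the projects above.

```python
# Minimal LightGBM usage sketch; data and parameters are illustrative.
import lightgbm as lgb
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data for demonstration.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

train_set = lgb.Dataset(X_train, label=y_train)
val_set = lgb.Dataset(X_val, label=y_val, reference=train_set)

params = {
    "objective": "binary",
    "metric": "auc",
    "learning_rate": 0.1,
    "num_leaves": 31,  # LightGBM grows trees leaf-wise; this caps tree size
}

booster = lgb.train(params, train_set, num_boost_round=100, valid_sets=[val_set])
print("validation AUC:", roc_auc_score(y_val, booster.predict(X_val)))
```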
Nov 11, 2023, LightGBM is selected for AI100: Top 100 AI Achievements (1943-2021).
Oct 4, 2023, Guolin is listed among the World's Top 2% Scientists.
Jul 7, 2023, Uni-Mol+ wins FIRST place on the Open Catalyst 2020 IS2RE Direct benchmark.
Mar 16, 2023, Uni-Mol+ wins FIRST place in the OGB Large-Scale Challenge PCQM4Mv2.
Feb 16, 2023, we release Uni-Fold MuSSe, a de novo protein complex prediction tool that requires only a single sequence input.
Aug 31, 2022, we release Uni-Fold Symmetry, supporting the end-to-end prediction of extremely large protein complexes.
Aug 6, 2022, we release the PyTorch version of Uni-Fold, the first reimplementation of AlphaFold and AlphaFold-Multimer to include training code and datasets.
Jul 21, 2022, LightGBM is faster thanks to quantized training and the new GPU version. See our paper for more information, and the configuration sketch after this news list.
Jun 10, 2022, we release Uni-Mol, the first 3D molecular/pocket pretraining framework, which outperforms SOTA in molecular property prediction, protein-ligand binding pose prediction, molecular conformation generation, and more.
Dec 8, 2021, Graphormer-3D wins FIRST place in the Open Catalyst Challenge at NeurIPS 2021.
Jun 17, 2021, Graphormer wins FIRST place in the OGB Large-Scale Challenge PCQM4M-LSC at KDD Cup 2021.
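As a companion to the Jul 21, 2022 news item, here is a hedged sketch of turning on quantized training in LightGBM. The parameter names (use_quantized_grad, num_grad_quant_bins) follow recent LightGBM releases and should be checked against the documentation for your installed version.

```python
# Sketch: gradient-quantized training in LightGBM (recent releases, e.g. 4.x).
# Parameter names are assumptions; verify against the docs for your version.
import lightgbm as lgb
import numpy as np

# Synthetic regression data for demonstration.
rng = np.random.default_rng(1)
X = rng.normal(size=(5000, 50))
y = X @ rng.normal(size=50) + rng.normal(scale=0.1, size=5000)

params = {
    "objective": "regression",
    "use_quantized_grad": True,  # quantize gradients/hessians to low-bit integers
    "num_grad_quant_bins": 4,    # number of quantization bins
}
booster = lgb.train(params, lgb.Dataset(X, label=y), num_boost_round=50)
```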
The full list of publications is available on Google Scholar.
Jingqi Wang, Jiapeng Liu, Hongshuai Wang, Musen Zhou, Guolin Ke, Linfeng Zhang, Jianzhong Wu, Zhifeng Gao, and Diannan Lu. “A Comprehensive Transformer-Based Approach for High-Accuracy Gas Adsorption Predictions in Metal-Organic Frameworks”. Nature Communications 15, 1904 (2024).
Lin Yao, Wentao Guo, Zhen Wang, Shang Xiang, Wentan Liu, and Guolin Ke. “Node-Aligned Graph-to-Graph: Elevating Template-free Deep Learning Approaches in Single-Step Retrosynthesis”. JACS Au 2024, 4, 3, 992–1003.
Qingsi Lai, Lin Yao, Zhifeng Gao, Siyuan Liu, Hongshuai Wang, Shuqi Lu, Di He, Liwei Wang, Cheng Wang, and Guolin Ke. “End-to-End Crystal Structure Prediction from Powder X-Ray Diffraction”. arXiv preprint arXiv:2401.03862, 2024.
Shuqi Lu, Zhifeng Gao, Di He, Linfeng Zhang, and Guolin Ke. “Highly Accurate Quantum Chemical Property Prediction with Uni-Mol+”. arXiv preprint arXiv:2303.16982, 2023.
Jinhua Zhu, Zhenyu He, Ziyao Li, Guolin Ke, and Linfeng Zhang. “Uni-Fold MuSSe: De Novo Protein Complex Prediction with Protein Language Models”. bioRxiv 2023.
Yuejiang Yu, Shuqi Lu, Zhifeng Gao, Hang Zheng, and Guolin Ke. “Do Deep Learning Models Really Outperform Traditional Approaches in Molecular Docking?” arXiv preprint arXiv:2302.07134, 2023.
Gengmo Zhou, Zhifeng Gao, Zhewei Wei, Hang Zheng, and Guolin Ke. “Do Deep Learning Methods Really Perform Better in Molecular Conformation Generation?” arXiv preprint arXiv:2302.07061, 2023.
Shuqi Lu, Lin Yao, Xi Chen, Hang Zheng, Di He, and Guolin Ke. “3D Molecular Generation via Virtual Dynamics”. arXiv preprint arXiv:2302.05847, 2023.
Lin Yao, Ruihan Xu, Zhifeng Gao, Guolin Ke, and Yuhang Wang. “Boosted Ab Initio Cryo-EM 3D Reconstruction with ACE-EM”. arXiv preprint arXiv:2302.06091, 2023.
Gengmo Zhou, Zhifeng Gao, Qiankun Ding, Hang Zheng, Hongteng Xu, Zhewei Wei, Linfeng Zhang, and Guolin Ke. “Uni-Mol: A Universal 3D Molecular Representation Learning Framework”. ICLR 2023.
Ziyao Li, Shuwen Yang, Xuyang Liu, Weijie Chen, Han Wen, Fan Shen, Guolin Ke, and Linfeng Zhang. “Uni-Fold Symmetry: Harnessing Symmetry in Folding Large Protein Complexes”. bioRxiv 2022.
Ziyao Li, Xuyang Liu, Weijie Chen, Fan Shen, Hangrui Bi, Guolin Ke, and Linfeng Zhang. “Uni-Fold: An Open-Source Platform for Developing Protein Folding Models beyond AlphaFold”. bioRxiv 2022.
Yu Shi, Guolin Ke, Zhuoming Chen, Shuxin Zheng, and Tie-Yan Liu. “Quantized Training of Gradient Boosting Decision Trees”. NeurIPS 2022.
Yu Shi, Shuxin Zheng, Guolin Ke, Yifei Shen, Jiacheng You, Jiyan He, Shengjie Luo, Chang Liu, Di He, and Tie-Yan Liu. “Benchmarking Graphormer on Large-Scale Molecular Modeling Datasets”. arXiv preprint arXiv:2203.04810, 2022.
Payal Bajaj, Chenyan Xiong, Guolin Ke, Xiaodong Liu, Di He, Saurabh Tiwary, Tie-Yan Liu, Paul Bennett, Xia Song, and Jianfeng Gao. “METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals”. arXiv preprint arXiv:2204.06644, 2022.
Shuqi Lu, Di He, Chenyan Xiong, Guolin Ke, Waleed Malik, Zhicheng Dou, Paul Bennett, Tie-Yan Liu, and Arnold Overwijk. “Less Is More: Pretrain a Strong Siamese Encoder for Dense Text Retrieval Using a Weak Decoder”. EMNLP 2021.
Shengjie Luo, Shanda Li, Tianle Cai, Di He, Dinglan Peng, Shuxin Zheng, Guolin Ke, Liwei Wang, and Tie-Yan Liu. “Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding”. NeurIPS 2021.
Chengxuan Ying, Mingqi Yang, Shuxin Zheng, Guolin Ke, Shengjie Luo, Tianle Cai, Chenglin Wu, Yuxin Wang, Yanming Shen, and Di He. “First Place Solution of KDD Cup 2021 & OGB Large-Scale Challenge Graph Prediction Track”. arXiv preprint arXiv:2106.08279, 2021.
Chengxuan Ying, Tianle Cai, Shengjie Luo, Shuxin Zheng, Guolin Ke, Di He, Yanming Shen, and Tie-Yan Liu. “Do Transformers Really Perform Badly for Graph Representation?” NeurIPS 2021.
Dinglan Peng, Shuxin Zheng, Yatao Li, Guolin Ke, Di He, and Tie-Yan Liu. “How Could Neural Networks Understand Programs?”. ICML 2021.
Chengxuan Ying, Guolin Ke, Di He, and Tie-Yan Liu. “LazyFormer: Self Attention with Lazy Update”. arXiv preprint arXiv:2102.12702, 2021.
Guolin Ke, Di He, and Tie-Yan Liu. “Rethinking Positional Encoding in Language Pre-Training”. ICLR 2021.
Qiyu Wu, Chen Xing, Yatao Li, Guolin Ke, Di He, and Tie-Yan Liu. “Taking Notes on the Fly Helps Language Pre-Training”. ICLR 2021.
Zhenhui Xu, Linyuan Gong, Guolin Ke, Di He, Shuxin Zheng, Liwei Wang, Jiang Bian, and Tie-Yan Liu. “MC-BERT: Efficient Language Pre-Training via a Meta Controller”. arXiv preprint arXiv:2006.05744, 2020.
Mingqing Xiao, Shuxin Zheng, Chang Liu, Yaolong Wang, Di He, Guolin Ke, Jiang Bian, Zhouchen Lin, and Tie-Yan Liu. “Invertible Image Rescaling”. ECCV 2020.
Zhenhui Xu, Guolin Ke, Jia Zhang, Jiang Bian, and Tie-Yan Liu. “Light Multi-Segment Activation for Model Compression”. AAAI 2020.
Guolin Ke, Zhenhui Xu, Jia Zhang, Jiang Bian, and Tie-Yan Liu. “DeepGBM: A Deep Learning Framework Distilled by GBDT for Online Prediction Tasks”. KDD 2019.
Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. “LightGBM: A Highly Efficient Gradient Boosting Decision Tree”. NeurIPS 2017.
Qi Meng, Guolin Ke, Taifeng Wang, Wei Chen, Qiwei Ye, Zhi-Ming Ma, and Tie-Yan Liu. “A Communication-Efficient Parallel Algorithm for Decision Tree”. NeurIPS 2016.