      Research Fellow

      School of Computing, Faculty of Engineering and Physical Sciences

      University of Leeds, UK

      Office (LS):  Room 8.02, E.C.Stoner Building, Leeds, UK, LS2 9JT  

      Email: r.yang1 AT leeds DOT ac DOT uk (main) or renyu.yang1 AT gmail DOT com  

What's New

(27/04/2021) One paper on GPU cluster scheduling is accepted by IEEE Trans. on Parallel and Distributed Systems (TPDS).

(23/02/2021) I will serve as Track Chair for IEEE International Conference on Cloud Computing (CLOUD'21). See the Call for Papers

(31/01/2021) Our work on resource-efficient incremental learning for heterogeneous information networks is accepted by IEEE Trans. on Computers (TC).

(17/01/2021) Our work on streaming social event detection is accepted by ACM Trans. on Knowledge Discovery from Data (TKDD).

(19/10/2020) I will serve as PC member for the International Joint Conference on Artificial Intelligence 2021 (IJCAI'21).

~$ No news is good news...Check older events.

Short Bio

I am currently a research fellow, funded by a UK EPSRC grant, with the School of Computing, Faculty of Engineering and Physical Sciences, University of Leeds, UK. I am also a visiting research scientist with the Big Data and Brain Computing Research Center (BDBC), Beijing, China. I was previously a research scientist at Edgetic Ltd., a UK-based high-tech startup that employs distributed scheduling, machine learning, hardware-software modeling, etc. to reshape the future of data center efficiency. I have been leading several China/UK national and international projects on distributed resource scheduling, cloud storage and geo-distributed data processing for intelligent decision making and massive-scale data analysis. We are building large-scale resource management infrastructures and system profiling frameworks to support those functionalities. My research mainly focuses on: 1) scalable and intelligent resource scheduling at Internet scale; 2) system dependability through effective system failover, long-tail task mitigation, quantitative reliability modeling, etc.; 3) improved system utility for resource management through resource over-subscription mechanisms, multi-objective optimization, etc.; and 4) applied machine/deep learning for representation learning, anomaly detection, etc.

I have published more than 40 peer-reviewed papers in the fields of distributed systems, cloud computing, big data analytics and applied deep learning techniques. They appear in top journals and conference proceedings such as IEEE Transactions on Parallel and Distributed Systems (TPDS), IEEE Transactions on Computers (TC), IEEE Transactions on Knowledge and Data Engineering (TKDE), IEEE Transactions on Services Computing (TSC), ACM Transactions on Knowledge Discovery from Data (TKDD), IEEE Internet Computing, VLDB, IEEE ICDCS, IEEE DSN, USENIX LISA, ACM SoCC, etc. One of my papers won the Best Paper Award at IEEE ISADS 2013. I was awarded the Grand Class of Scientific and Technological Progress Award of the Chinese Institute of Electronics in 2017 (the only grand class award since the award was established) for my key participation in and contribution to reliable resource management and scheduling at massive scale.

I obtained my BSc and PhD from the School of Computing, Beihang University, China in June 2011 and January 2017, respectively. I worked for many years at both the State Key Laboratory of Software Development Environment (SKLSDE) at Beihang University and the Distributed Systems and Services (DSS) lab at the University of Leeds. I led several UK-China research collaborations including the COLAB (short for Leeds, Alibaba, Beihang) project and developed strong links with the China-UK computer science community (such as Tsinghua University, Peking University, SJTU, NUDT, University of Edinburgh, Lancaster University, Newcastle University, etc.). I co-founded the SIGRS (Special Interest Group on Resource Scheduling) and the subsequent MSDS (Massive-Scale Distributed Systems) Research Group in COLAB. From 2014 to 2016, I was also with Fuxi, the Distributed Resource Scheduling Team in Alibaba Cloud Inc., participating in research and development on resource scheduling and performance optimization at Internet scale. I am a member of IEEE.

Research Interests

Massive-scale distributed systems of big data, cloud computing and IoT

• Play with distributed systems to improve their efficiency and reliability in adapting to ever-changing environments: resource management, resource scheduling, fault tolerance, dependability, etc.

• Play with data: system profiling, log analysis and applied deep/machine learning (e.g., GNN)


Education

• 2007.9 - 2011.6:    BSc. Computer Science (equivalent to 1st Class Honours), supervised by Dr. Tianyu Wo, Beihang University, China

• 2011.9 - 2017.1:    PhD. Computer Science (with Honours), supervised by Prof. Jie Xu, Beihang University, China

• 2012.10 - 2013.1, 2013.12 and 2016.3:   Visiting Student (Supervised by Dr. Paul Townend and Prof. Jie Xu),   University of Leeds, UK   

Academia & Industry Experience

• 2011.9 - 2017.2:   Teaching and Research Assistant, Team Leader, SIGRS and MSDS, Beihang University, Beijing, China   

• 2014.2 - 2016.2:    R&D Intern (Mentored by Mr. Zhuo Zhang and Chao Li), Fuxi Distributed Scheduling Group, Apsara, Alibaba Cloud Computing Inc., Beijing, China     

• 2016.4 - 2016.4:   Visiting Researcher (working with Dr. Zhenyu Wen),   University of Edinburgh, UK

• 2017.3 - 2018.3:    Research Scientist and Project Director, Big Data and Brain Computing Research Center (BDBC), Beijing, China   

• 2018.3 - 2020.8:    Research Scientist, Edgetic Ltd., UK   

Selected Publications  

• G.Yeung, B.Borowiec, R.Yang*, A.Friday, R.Harper, P.Garraghan. Horus: Interference-Aware and Prediction-Based Scheduling in Deep Learning Systems. IEEE Trans. on Parallel and Distributed Systems (TPDS), 2021

• H.Peng, R.Yang, Z.Wang, J.Li, L.He, P.S.Yu, A.Zomaya, R.Ranjan. LIME: Low-Cost and Incremental Learning for Dynamic Heterogeneous Information Networks. IEEE Trans. on Computers (TC), 2021

• H.Peng, J.Li, Y.Song, R.Yang, R.Ranjan, P.S.Yu and L.He. Streaming Social Event Detection and Evolution Discovery in Heterogeneous Information Networks. ACM Trans. on Knowledge Discovery from Data (TKDD), 2021

• R.Yang, C.Hu, X.Sun, P.Garraghan, T.Wo, Z.Wen, H.Peng, J.Xu and C.Li. Performance-aware Speculative Resource Oversubscription for Large-scale Clusters. IEEE Trans. on Parallel and Distributed Systems (TPDS), 2020

• Z.Wen, T.Lin, R.Yang*, S.Ji, R.Ranjan, A.Romanovsky, C.Lin, J.Xu. GA-Par: Dependable Microservice Orchestration Framework for Geo-Distributed Clouds. IEEE Trans. on Parallel and Distributed Systems (TPDS), 2019

• P.Garraghan, X.Ouyang, R.Yang*, D.Mckee and J.Xu. Straggler Root-cause and Impact Analysis for Massive-scale Virtualized Cloud Datacenters. IEEE Trans. on Services Computing (TSC), 2019

• R.Yang, Y.Zhang, P.Garraghan, Y.Feng, J.Ouyang, J.Xu, Z.Zhang, C.Li. Reliable Compute Service in Massive-scale Systems through Rapid Low-cost Failover. IEEE Trans. on Services Computing (TSC), 2017

• R.Yang, Z.Wen, D.Mckee, T.Lin, J.Xu and P.Garraghan. Software-Defined Fog Orchestration for IoT Services. Book Chapter, Fog and Fogonomics Challenges and Practices, Wiley, ISBN: 978-1-119-50109-1, 2020

• Z.Wen, R.Yang*, P.Garraghan, T.Lin, J.Xu and M.Rovatsos. Fog Orchestration for IoT Services. IEEE Internet Computing, 2017

• R.Yang', J.Zhu', C.Hu, T.Wo, S.Xue, J.Ouyang and J.Xu. Perphon: a ML-based Agent for Workload Co-location via Performance Prediction and Resource Inference. ACM Symposium on Cloud Computing (SoCC), 2019

• X.Sun, C.Hu, R.Yang*, P.Garraghan, T.Wo, J.Xu, J.Zhu, C.Li. ROSE: Cluster Resource Scheduling via Speculative Over-subscription. IEEE ICDCS 2018 and ACM SOSP 2017 (poster)

• Z.Zhang, C.Li, Y.Tao, R.Yang*, H.Tang and J.Xu. Fuxi: a Fault-Tolerant Resource Management and Job Scheduling System at Internet Scale. VLDB 2014

• L.Cui, J.Li, T. Wo, B.Li, R.Yang, Y.Cao, and J.Huai. HotRestore: A Fast Restore System for Virtual Machine Cluster. USENIX LISA 2014

• X.Ouyang, P.Garraghan, R.Yang, P.Townend and J.Xu. Reducing Late-Timing Failure at Scale: Straggler Root-Cause Analysis in Cloud Datacenters. IEEE DSN 2016

~$ More publications can be found in: [Complete List] [DBLP] [Google Scholar]

Selected Research Grants

I have had the opportunity to work closely with many institutes around the world and to participate in many government- and industry-funded research projects that solve real-world problems with innovative cloud and analytics technologies.

~$ grants -l

• Algorithmic Support for Massive Scale Distributed Systems (EP/T01461X/1), funded by UK Engineering and Physical Sciences Research Council (EPSRC), co-author and project leader, 2020.9 -

• Server Modelling Capability (No.45904), funded by Innovate UK Programme, UKRI, key member, 2020.2 - 2020.8

• BODEN TYPE Data Centre ONE Project (European Commission’s Horizon 2020 Programme), Member, 2018.6 - 2019.12

• Cloud Operating System - Multiple-grained Resource Management and Scheduling (No. 2016YFB1000503), supported by National Key Research and Development Programme, co-author, project leader/coordinator, 2016.6 - 2018.5

• Storage Model and Mechanism for Joint Clouds (No. 2016YFB1000103), supported by National Key Research and Development Programme, co-author, project leader/coordinator, 2016.5 - 2017.11

• On-demand Resource Aggregation and Execution in Massive-scale iVCE Environments (No.2011CB302602), supported by China 973 Fundamental Research and Development Programme, co-author, project leader/coordinator, 2012.1 - 2015.9

• ChinaCloud Phase II: Fundamental Key Software for Cloud Platforms (No.2013AA01A213), supported by National 863 Hi-Tech Research and Development Programme, Key Member, 2013.1 - 2014.1

• ChinaCloud Phase I: Resource Virtualization and Scheduling (No. 2011AA01A202), supported by National 863 Hi-Tech Research and Development Programme, Key Member, 2011.9 - 2013.12

• Natural Science Foundation of China (91118008, 90818028, 61170294)

Research Team

I have been co-advising and/or working closely with many students, mainly from COLAB. The programmes primarily include PhD and postgraduate projects (the MSc research programme in China and MSc projects of modules COMP5200M and COMP3211 at the University of Leeds).

~$ team -l current

• PhD Students: Jianyong Zhu (2017.5- , Beihang Uni., with Prof. Chunming Hu), Xiaoyang Sun (2019.9- , Uni of Leeds, with Prof. Jie Xu and Dr. Zheng Wang), Gingfung Yeung (2018.1- , Lancaster University, UK, with Dr. Peter Garraghan), Yiming Hei (2019.9- , Beihang Uni., with Prof. Jianwei Liu and Dr. Hao Peng), Youmei Song (2017.9- , Beihang Uni., with Prof. Jie Xu and Prof. Tianyu Wo)

• Undergraduate Final Year Project Students: Chris London, Charlee Boyle, Kunchaya Soinak, Sam Hepburn (Uni. of Leeds, 2020 Autumn)

• MSc Projects: Zixuan Fu, Kenan Jiang, Mengwei Zhao, Chao Guo, Wang Tang, Zhenbin Jin, Xianjie Yang (MSc. of COMP5200M, Spring 2021, Uni. of Leeds)

~$ team -l alumni

• Jaber Almutairi (PhD., 2018.6 - 2020.1, Uni. of Leeds, with Prof. Jie Xu), Xiaochen Sun (2019.9-2020.9, Uni. of Leeds, Visiting PhD. Student from ISCAS, with Prof. Jie Xu)

• Shiqing Xue (2018.9-2020.12, Beihang Uni.), Dr. Qi Shen (2017.9-2019.8, Postdoc researcher, Beihang Uni.)

• Xuyang Cao, Yuecheng Pei, Yuncheng Xie, Guoxiang Liao (MSc of COMP5200M, 2019.4-2019.9, Uni. of Leeds)

• Junqing Xiao (MSc, 2016.9-2018.12, first employment at Alibaba Group), Ximing Qu (MSc, 2016.9-2018.12, first employment at Alibaba Group)

• Xiaoyang Sun (MSc, 2015.9-2018.3, PhD student at Uni. of Leeds, UK), Lian Du (MSc, 2015.9-2018.3, first employment at Alibaba Group), Hailun Wang (MSc, 2015.9-2017.3, first employment at Baidu Inc.)

• Xixu Wang (MSc, 2014.9-2017.3, first employment at Baidu Inc.), Wenbo Jiang (MSc, 2013.9-2016.3, first employment at NetEase Inc.), Yunchang Zhao (MSc, 2013.9-2016.3, first employment at an Internet startup)

• Michael Hao Tong (B.Sc, 2012.9-2014.6, PhD student at Uni. of Chicago, US), Yuda Wang (MSc, 2012.9-2015.3, first employment at China Telecom.)

I have been collaborating closely with many brilliant researchers around the world.

~$ team -l collaborator

• Dr. Ismael Solis Moreno (now at IBM Research), Dr. Peter Garraghan (now Lecturer at Lancaster University)

• Dr. Zhenyu Wen (now Postdoc at Newcastle University), Dr. Xue Ouyang (now assistant professor at NUDT), Tao Lin (EPFL)

• Dr. Lei Cui (now associate professor at CAS), Dr. Hao Peng (now assistant professor at Beihang University)

• Jin Ouyang, Jiamang Wang (senior staff engineer at Alibaba Group), etc.

Teaching and Presentations

• Cloud Computing (CS61839, Graduate Level, Beihang University), Spring 2017 (together with Prof. Chunming Hu et al.)

• Operating Systems (XJCO2211, Undergraduate Level, University of Leeds), Autumn 2020 (together with Prof. Jie Xu)

• Talks given at various venues, such as academic conferences and industry meetups, can be found here: ~$ talks

Selected Professional Services

• Program Chair: IEEE International Conference on Joint Cloud Computing (JCC) 2020, 2019, etc.

• Area/Track Chair: IEEE International Conference on Cloud Computing (CLOUD) 2021.

• PC Member: IJCAI'21, ACM/IEEE UCC'18-'21, IEEE/ACM BDCAT'16-'21, etc.

• Journal Reviewer: IEEE TPDS, TDSC, TC, TSC, TSE, TCC, ACM CSUR, etc.

• More professional services can be found here: ~$ services

Selected Awards     

Outstanding Service Award, IEEE JCC-SOSE, 2020

The Grand Class of Scientific and Technological Progress Award of the Chinese Institute of Electronics, 2017 (中国电子学会科技进步特等奖)

Outstanding Graduate Award of Beijing Graduates, 2017

Guanghua Scholarship Award for outstanding students, 2014

Best Paper Award of the IEEE 11th International Symposium on Autonomous Decentralized Systems (ISADS), 2013

Excellent Academic Award of The Institute of Advanced Computing Technology, Beihang University

Excellence Student Scholarship of Beihang University


Outside of work, I really enjoy traditional Chinese calligraphy, architectural design, cooking foods from around the world, music and long-distance running.

