Professor Wang, Cho-Li

BS Nat. Taiwan; MS, PhD S. Calif
BEng(CE) Programme Coordinator; Professor


Tel: (+852) 2857 8458
Fax: (+852) 2559 8447
Email: clwang [AT] cs [DOT] hku [DOT] hk
Homepage: https://www.cs.hku.hk/~clwang

Professor Cho-Li Wang received his B.S. degree in Computer Science and Information Engineering from National Taiwan University in 1985. He obtained his M.S. and Ph.D. degrees in Computer Engineering from the University of Southern California in 1990 and 1995, respectively. He is currently a professor in the Department of Computer Science. Professor Wang's research interests include parallel architecture, operating systems, performance optimization on heterogeneous multicore systems (GPU/AI chips), high-performance software systems for cloud computing, and large-scale distributed deep learning systems. Professor Wang has published papers in various peer-reviewed journals and conference proceedings. He serves or has served on the editorial boards of several international journals, including IEEE Transactions on Computers (TC), IEEE Transactions on Cloud Computing, Multiagent and Grid Systems (MGS), Journal of Information Science and Engineering (JISE), and International Journal of Pervasive Computing and Communications (JPCC). He was the program chair for Cluster'03, CCGrid'09, InfoScale'09, ICPADS'09, ISPA'11, FCST'11, FutureTech'12, and Cluster2012, and the General Chair for IPDPS2012. He is a founding member of China's Supercomputing Innovation Alliance (超级计算创新联盟), an executive member of the IEEE Technical Committee on Parallel Processing (TCPP), and an Amazon AWS Educate Cloud Ambassador (2020). Prof. Wang served as a member of the Engineering Panel of the Hong Kong Research Grants Council from 2017 to 2022.

Research Interests

Operating Systems, Computer Architecture, Virtual Machines and Cloud Computing, Big Data Computing Systems, Performance Optimization on Manycore/GPU/AI Processors, Large-scale Distributed Deep/Federated Learning Systems.

Selected Publications (DBLP)

  • Zhaorui Zhang, Cho-Li Wang, "MIPD: An Adaptive Gradient Sparsification Framework for Distributed DNNs Training", IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 33, No. 11, pp. 3053-3066, Nov. 2022. (Link)

  • Xueyu Wu, Cho-Li Wang, “KAFL: Achieving High Training Efficiency for Fast-K Asynchronous Federated Learning”,  42nd IEEE International Conference on Distributed Computing Systems (ICDCS’22), July 10-13, 2022, Bologna, Italy.

  • Zhaorui Zhang, Cho-Li Wang, "SaPus: Self-Adaptive Parameter Update Strategy for DNN Training on Multi-GPU Clusters", IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 33, No. 7, July 2022, pp. 1569-1580. (Link)

  • Zhuoran Ji and Cho-Li Wang, Efficient Exact K-Nearest Neighbor Graph Construction for Billion-Scale Datasets using GPUs with Tensor Cores, the 36th ACM International Conference on Supercomputing (ICS'22), June 27-30, 2022.

  • Zhaorui Zhang, Zhuoran Ji, Cho-Li Wang, "Momentum-Driven Adaptive Synchronization Model for Distributed DNN Training on HPC Clusters", Journal of Parallel and Distributed Computing (JPDC), Vol. 159, January 2022, Pages 65-84. (Link)

  • Zhuoran Ji and Cho-Li Wang, "Accelerating DBSCAN Algorithm with AI Chips for Large Datasets", 50th International Conference on Parallel Processing (ICPP’21), Aug. 9-12, 2021, Chicago, Illinois, USA. (Link)

  • Zhuoran Ji and Cho-Li Wang, "Collaborative GPU Preemption via Spatial Multitasking for Efficient GPU Sharing", 27th International European Conference on Parallel and Distributed Computing (Euro-Par'21), 30 Aug. – 3 Sept. 2021.

  • Zhuoran Ji and Cho-Li Wang, "CTXBack: Enabling Low Latency GPU Context Switching via Context Flashback", 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS’21), 17-21 May 2021. 

  • Xueyu Wu, Xin Yao and Cho-Li Wang, "FedSCR: Structure-based Communication Reduction for Federated Learning," IEEE Transactions on Parallel and Distributed Systems (TPDS), 32(7): 1565-1577 (July 2021). (Link)

  • Xin Yao and Cho-Li Wang, Probabilistic Consistency Guarantee in Partial Quorum-based Data Store, IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 31, Issue 8, Aug. 2020, Page 1815 – 1827.

  • Huanxin Lin, Cho-Li Wang. On-GPU Thread-Data Remapping for Nested Branch Divergence, Journal of Parallel and Distributed Computing, Volume 139, May 2020, Pages 75-86.

  • Hao Wu, Weizhi Liu, Huanxin Lin,  and Cho-Li Wang, A Model-Based Software Solution for Simultaneous Multiple Kernels on GPUs, ACM Transactions on Architecture and Code Optimization (TACO), Volume 17, Issue 1, March 2020. 

  • Huanxin Lin, Cho-Li Wang, Efficient Low-Latency Packet Processing Using On-GPU Thread-Data Remapping, Journal of Parallel and Distributed Computing (JPDC), Volume 133, November 2019, Pages 51-62.

  • Xin Yao, Xueyu Wu, Cho-Li Wang, "FluentPS: A Parameter Server Design with Low-frequency Synchronization for Distributed Deep Learning", 2019 IEEE International Conference on Cluster Computing (Cluster 2019), Sept. 23-26, 2019, Albuquerque, NM, USA.

  • Xin Yao, Mingzhe Zhang, Cho-Li Wang, EC-Shuffle: Dynamic Erasure Coding Optimization for Efficient and Reliable Shuffle in Spark, The 19th Annual IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing (CCGrid 2019), Larnaca, Cyprus, May 14-17, 2019.

  • Huanxin Lin, Cho-Li Wang, Hongyuan Liu, “On-GPU Thread-Data Remapping for Branch Divergence Reduction,” ACM Transactions on Architecture and Code Optimization (TACO), Vol. 15, No. 3, Oct. 2018.

  • Mingzhe Zhang, King Tin Lam, Xin Yao, Cho-Li Wang, SIMPO: A Scalable In-Memory Persistent Object Framework Using NVRAM for Reliable Big Data Computing, ACM Transactions on Architecture and Code Optimization (TACO), Volume 15 Issue 1, April 2018.

  • Hongyuan Liu, King Tin Lam, Huanxin Lin, Cho-Li Wang, Junchao Ma, Lightweight Dependency Checking for Parallelizing Loops with Non-Deterministic Dependency on GPU, The 22nd IEEE International Conference on Parallel and Distributed Systems (ICPADS 2016), Dec. 13-16, 2016, Wuhan, China. [Best Paper Award]

  • Zhiquan Lai, King Tin Lam, Cho-Li Wang, and Jinshu Su, PoweRock: Power Modelling and Flexible Dynamic Power Management for Many-core Architectures, IEEE Systems Journal, Issue: 99, pp. 1-13, 20 January 2016.

  • Sheng Di, Cho-Li Wang, Franck Cappello, Adaptive Algorithm for Minimizing Cloud Task Length with Prediction Errors, IEEE Transactions on Cloud Computing, Vol. 2, No. 2, pp. 194-207, April-June 2014.

  • S. Di and C.L. Wang, Dynamic Optimization of Multi-Attribute Resource Allocation in Self-Organizing Clouds, IEEE Transactions on Parallel and Distributed Systems (TPDS), 14 May 2012.

  • S. Di and C.L. Wang, Decentralized Proactive Resource Allocation for Maximizing Throughput of P2P Grid, Journal of Parallel and Distributed Computing (JPDC), Vol. 72, No. 2, February 2012, pp. 308–321.

Recent Research Grants

  • Co-PI: Hong Kong RGC Research Impact Fund (RIF) project entitled “Edge Learning: the Enabling Technology for Distributed Big Data Analytics in Cloud-Edge Environment” (Ref: R5060-19), led by Prof. Guo Song from PolyU. Project period: May 01, 2020 to April 30, 2025.
  • Co-PI: CRF Equipment Fund 2019/20, “X-GPU: An Extreme GPU Cluster for Interdisciplinary Research on Molecular Dynamics Simulations and Genomics Studies”, led by Dr. Xuhui Huang from HKUST.
  • Co-PI: Hong Kong RGC Collaborative Research Fund (C5026-18G) entitled “Multi-stage Big Data Analytics on Complex Systems: Methodologies and Applications”, led by Prof. Jiannong Cao from PolyU. Project period: June 28, 2019 to June 27, 2022.
  • RGC's General Research Fund (2016-2019): Big-Little Heterogeneous Computing with Polymorphic GPU Kernels
  • RGC's General Research Fund (2015-2018): Software Architecture for Fault-Tolerant Multicore Computing with Hybridized Non-Volatile Memories
  • Huawei research grant (2015-2017): Big Data Acceleration on GPU-based Heterogeneous Architecture
  • RGC's General Research Fund (2012-2015): Scalable Cloud-on-Chip Runtime Support with Software Coherence for Future 1000-Core Tiled Architectures.
  • Huawei research grant (2012-2013): A New Multikernel OS for High Throughput Computing on Manycore Systems
  • RGC's General Research Fund (2011-2013): Transparent Runtime and Memory Coherence Support for GPU Based Heterogeneous Many-Core Architecture.
  • China 863 Project (2006-2010): Research on Adaptive Grid Service Technology at the University of Hong Kong (香港大学网格自适应服务技术研究; CNGrid HKU Grid Point).