The School of Computing and Data Science (https://www.cds.hku.hk/) was established by the University of Hong Kong on 1 July 2024, comprising the Department of Computer Science, the Department of Statistics and Actuarial Science, and the Department of AI and Data Science.

Seminars and Events (Including Past and Upcoming)
December 19, 2025
  • Title: LLM based Zero Shot Speech Synthesis

    Time: 02:00pm 

    Venue: CB 308

    Speaker(s): Dr. Shujie Liu

    Remark(s): 

    Abstract

    With the rapid development of large language models (LLMs) in natural language processing, speech LLMs have also begun to receive increasing attention. In this talk, we will introduce VALL‑E, a zero‑shot text‑to‑speech (TTS) synthesis approach built upon large language models. Leveraging the in‑context learning capabilities of LLMs, VALL‑E can generate high‑quality, personalized speech using only a three‑second audio prompt from an unseen speaker. Building upon this foundation, we will further introduce several extensions of VALL‑E, including VALL‑E X (the multilingual version), VALL‑E 2 (addressing stability issues), PALLE (combining AR and NAR modelling), and MELL‑E and FELLE (based on continuous speech representations).

    About the speaker

    Shujie Liu is a Principal Researcher at MSRA Hong Kong. His research focuses on natural language processing, speech processing, and machine learning. He has published over 100 papers in top-tier conferences and journals in NLP and speech, co‑authored the book Machine Translation, and contributed to Introduction to Artificial Intelligence. He has won multiple first‑place awards in international NLP and speech evaluation campaigns and has served as a reviewer and area chair for several major conferences. His research has been widely deployed in Microsoft products, including Microsoft Translator, Skype Translator, Microsoft IME, and Microsoft Speech Services.

  • Title: Towards Scalable Serverless LLM Inference Systems

    Time: 10:30am 

    Venue: CB 308

    Speaker(s): Prof. Minchen Yu

    Remark(s): 

    Abstract

    Serverless computing has become a compelling cloud paradigm for model inference due to its high usability and elasticity. However, current serverless platforms suffer from significant cold-start overhead, especially for large models, limiting their ability to deliver low-latency, resource-efficient inference. In this talk, I will present three systems we built for scalable serverless inference. First, Torpor proposes node-level GPU pooling that enables fine-grained GPU sharing and fast model swapping. Second, LambdaScale leverages high-speed interconnects to scale models across nodes and performs pipelined inference for lower latency. Third, for emerging large mixture-of-experts (MoE) models, we design fine-grained expert scheduling with elastic scaling to improve the cost-effectiveness of MoE inference.

    About the speaker

    Minchen Yu is an Assistant Professor at the School of Data Science, The Chinese University of Hong Kong, Shenzhen. He received his Ph.D. from Hong Kong University of Science and Technology. His research interests cover cloud computing and distributed systems, with a recent focus on serverless computing and machine learning systems. His research has been published at various prestigious venues, such as NSDI, ATC, EuroSys, INFOCOM, and SoCC, and has been applied in leading cloud platforms, such as Alibaba Cloud. He received the Best Paper Runner-Up Award at IEEE ICDCS 2021.

December 18, 2025
  • Title: Towards Consistent and Physically Plausible Visual Generation

    Time: 11:00am 

    Venue: CB 308

    Speaker(s): Departmental Seminar by Prof. Jianfei Cai, Monash University

    Remark(s): 

    Abstract

    Recent advances in large language models (LLMs) and multimodal large language models (MLLMs) have significantly enhanced the understanding and encoding of textual information. Leveraging these capabilities, a growing number of diffusion-based generative models have emerged for text-conditioned visual generation, spanning text-to-image, text-to-video, and text-to-3D tasks. While these models offer remarkable flexibility and produce increasingly realistic content, they still face fundamental challenges: aligning precisely with user intent, maintaining spatial, view, and temporal consistency, and adhering to the laws of physics. In this talk, I will present several recent research projects from my group that attack these challenges. PanFusion enforces global consistency in text-to-panorama image generation; MVSplat360 uses image conditions and explicit 3D representation to enhance view consistency of 3D generation; and VLIPP integrates physics-informed priors to ensure physically plausible text-to-video generation. I will conclude by pointing out the limitations and discussing future directions such as developing world models.

    About the speaker

    Jianfei Cai is a Professor at the Faculty of IT, Monash University, where he served as the inaugural Head of the Data Science & AI Department. Before that, he was Head of the Visual and Interactive Computing Division and Head of the Computer Communications Division at Nanyang Technological University (NTU). His major research interests include computer vision, deep learning and multimedia. He is a co-recipient of paper awards in ACCV, ICCM, IEEE ICIP and MMSP, and a winner of Monash FIT’s Dean's Researcher of the Year Award and Monash FIT Dean's Award for Excellence in Graduate Research Supervision. He serves or has served as an Associate Editor for TPAMI, IJCV, IEEE T-IP, T-MM, and T-CSVT, and as Senior/Area Chair for CVPR, ICCV, ECCV, ACM Multimedia, ICLR and IJCAI. He was the Chair of IEEE CAS VSPC-TC during 2016-2018. He served as the leading TPC Chair for IEEE ICME 2012, the best paper award committee chair & co-chair for IEEE T-MM 2020 & 2019, and the leading General Chair for ACM Multimedia 2024. He is a Fellow of IEEE.

     

December 17, 2025
  • Title: Genetics of Human Longevity: Regional Association Signals and Cross-cohort Replication

    Time: 02:30pm 

    Venue: Room 301, Run Run Shaw Building

    Speaker(s): Prof. Heping Zhang

    Remark(s): 

    Abstract

    Understanding the genetic basis of human longevity remains a fundamental challenge in aging research. Although previous genome-wide association studies (GWAS) have identified a few loci, most prominently APOE, their replication has often been inconsistent, and most analyses have relied primarily on single-variant testing. To better understand the genetics of human longevity, we examined genetic determinants of age at death in the UK Biobank (UKB) and conducted independent replication in the All of Us (AoU) cohort. After standard quality control and covariate adjustment, we performed single-variant GWAS and complemented these with the Regional Association Score (RAS) framework to capture cumulative regional effects. To gain functional insight, we carried out transcriptome-wide association studies (TWAS) using RNA sequencing data from Mayo Pilot brain tissue. The UKB discovery and AoU replication analyses revealed robust associations across chromosome 19 encompassing APOE, APOC1, and NECTIN2 (PVRL2), reaffirming this locus as the major genetic contributor to human lifespan. In addition, suggestive and potentially novel associations, such as those involving TRIM10 on chromosome 6, highlight new avenues for investigation. Together, these results underscore the enduring importance of the APOE region in longevity, demonstrate the value of regional association approaches alongside conventional GWAS, and provide new leads for elucidating the molecular mechanisms of human aging. This is joint work with Yiran Jiang and Yue Hu.

December 16, 2025
  • Title: AI-Assisted System Security: from Offense to Defense

    Time: 10:30am 

    Venue: CB 308

    Speaker(s): Prof. Yan Chen

    Remark(s): 

    Abstract

    The emergence of Large Language Models (LLMs) has significantly transformed many fields. In this talk, we will present our recent research on leveraging LLMs for both offensive and defensive cybersecurity applications—specifically, automated penetration testing, as well as automated vulnerability discovery and patching. Evaluations show that our systems achieve state-of-the-art performance among open-source solutions in these areas, including in the recent DARPA AI Cyber Challenge (AIxCC) competition. We believe these advancements will play a critical role in combating major cyber threats, such as Advanced Persistent Threats (APTs).

    About the speaker

    Yan Chen received his Ph.D. in Computer Science from the University of California at Berkeley in 2003, after which he joined Northwestern University, USA, where he became a Full Professor in 2014. His research interests are in security and measurement for networking systems. He won the DOE Early CAREER Award in 2005, the DOD (Air Force Office of Scientific Research) Young Investigator Award in 2007, and the Most Influential Paper Award of ACM ASPLOS in 2018. In 2024, he co-led the team that was selected as one of the seven finalists in the DARPA AI Cyber Challenge (AIxCC), securing a total of $3 million in funding. According to Google Scholar, his papers have been cited over 17,000 times, and his h-index is 63. He is a Fellow of IEEE.

December 12, 2025
  • Title: AI Planning for Data Exploration

    Time: 04:00pm 

    Venue: HW312, Haking Wong Building, HKU

    Speaker(s): Prof. Sihem Amer-Yahia

    Remark(s): 

    Abstract

    Data Exploration is an incremental process that helps users express what they want through a conversation with the data. Reinforcement Learning (RL) is one of the most notable approaches to automate data exploration and several solutions have been proposed. With the advent of Large Language Models and their ability to reason sequentially, it has become legitimate to ask the question: would LLMs and, more generally, AI planning outperform a customized RL policy in data exploration? More specifically, would LLMs help circumvent retraining for new tasks and strike a balance between specificity and generality? This talk will attempt to answer this question by reviewing RL training and policy reusability for data exploration.

    About the speaker

    Sihem Amer-Yahia is a Silver Medal CNRS Research Director and Deputy Director of the Lab of Informatics of Grenoble. She works on exploratory data analysis and algorithmic upskilling. Prior to that she was Principal Scientist at QCRI, Senior Scientist at Yahoo! Research and Member of Technical Staff at AT&T Labs. Sihem served as PC chair for SIGMOD 2023 and as the coordinator of the Diversity, Equity and Inclusion initiative for the database community. In 2024, she received the IEEE TCDE Impact Award, the SIGMOD Contributions Award, and the VLDB Women in Database Award.

  • Title: Updating Quantum Beliefs

    Time: 10:30am 

    Venue: CB 308

    Speaker(s): Prof. Valerio Scarani

    Remark(s): 

    Abstract

    In the classical world, upon receiving new evidence, beliefs are (well… should be) updated according to Bayes’ rule. The quantum analog of this rule is not straightforward and has been the subject of several discussions. Recently, we have found the quantum Bayes’ rule that emerges from a “minimum change principle”, that is, the idea that one should update one’s beliefs in the least disruptive way. In many cases, it coincides with the Petz map, one of the most popular previously proposed candidates; in some situations, it is a map that nobody had anticipated. I shall also discuss how the quantum belief should not be restricted to the state, but comprises any information we may have about its preparation. Based on this observation, we have been able to unify various proposals for the task of “smoothing” in a single framework.

    About the speaker

    Valerio Scarani is Principal Investigator (currently serving as Deputy Director) at the Centre for Quantum Technologies, and Professor at the Department of Physics, National University of Singapore. His research is in theoretical quantum information science, with work in quantum cryptography, Bell nonlocality and other topics.

     

December 11, 2025
  • Title: Data Science and AI for Remote Sensing

    Time: 03:00pm 

    Venue: CB 308

    Speaker(s): Departmental Seminar by Prof. Peng Gong, The University of Hong Kong

    Remark(s): 

    Abstract

    Satellite remote sensing is a field expanding exponentially, with data at the PB level. In the past 40 years, it has evolved from partially covering the Earth's surface with 100–30 meter resolution to covering every corner of the Earth at 30 cm resolution. The repeat frequency is improving, and there is a potential point expected to be reached when we collect Earth surface data at submeter resolution constantly. The speed of data acquisition moves ahead, with data processing and information extraction several blocks behind. How should we fill the gap? Environmental scholars learning from the computer science community is clearly not enough. We need better pattern recognition and machine learning technologies to make better use of the explosion of Earth observation data. Will computer scientists, particularly data scientists and AI researchers, join forces to tackle these problems? I propose the concept of iEarth, calling for data scientists and AI researchers to join hands with environmental scientists to tackle today's grand environmental challenges of human society, such as food insecurity, disaster early warning and prevention, water and energy shortages, global health, and climate change.

    About the speaker

    Professor Peng Gong is the Vice-President and Pro-Vice-Chancellor (Academic Development) at The University of Hong Kong, where he also serves as Chair Professor of Global Sustainability in the Departments of Geography and Earth & Planetary Sciences (since 2021). He holds a BS and MS from Nanjing University and a PhD from the University of Waterloo. His academic career spans York University, the University of Calgary, and UC Berkeley, where he became a full professor in 2001. He later founded Tsinghua University’s Department of Earth System Science (2016) and served as Dean of Science (2017).

    In addition to being a Foreign Member of the Academy of Europe, he was the Founding Editor-in-Chief of Geographic Information Sciences (now Annals of GIS). He advises Future Earth and the Earth Commission, and co-chairs the Lancet Climate Change and Health Commission and Countdown 2030. An interdisciplinary leader, he co-founded the Center for Assessment and Monitoring of Forest and Environmental Resources at UC Berkeley and established key Chinese institutions, including the first Earth System Science Institute in China at Nanjing University.

    His research spans urbanization and health, environmental change monitoring, and infectious disease modelling. He received research awards from the American Society for Photogrammetry and Remote Sensing, the Association of American Geographers and the Joint Board Council of Science China and Science Bulletin. Over 30 of his former PhD students now hold faculty positions at top universities worldwide.

December 09, 2025
  • Title: Genetic and Epigenetic Landscape of Self-Identified Hispanics in All Of Us

    Time: 11:30am 

    Venue: CB 308

    Speaker(s): Dr. Fritz Sedlazeck

    Remark(s): 

    Abstract

    Hispanic populations in the United States are highly admixed and genetically diverse, yet remain underrepresented in genomic studies. To address this, we present the first large-scale long-read sequencing analysis of 1,490 self-reported Hispanic individuals from the All of Us Research Program, capturing small variants, structural variants, tandem repeats (TRs), and CpG methylation. We characterize global and local ancestry across the cohort, enabling ancestry-aware analysis of genetic and epigenetic features. Over 10.3 million previously unknown autosomal variants are identified, including medically relevant alleles stratified by local ancestry and pathogenic risk, revealing 402 carriers with potential risk for subsequent generations. We discover 135 individuals with TR alleles exceeding established pathogenic ranges, and conduct the first genome-wide TR-mQTL analysis, identifying 3,329 TR alleles associated with methylation. Allele-specific methylation (ASM) is resolved at >12,000 loci per genome and 24 novel recurrent ASM loci are identified. This includes ancestry-specific regulatory activity such as activation of paralogous genes driven by ancestry-enriched variants and epigenetic markers. These findings establish a foundational resource for biomedical research and highlight the critical role of ancestry-aware analyses in understanding gene regulation, disease risk, and personalized medicine.

    About the speaker

    Dr. Fritz Sedlazeck is an Associate Professor at the Human Genome Sequencing Center at Baylor College of Medicine and an Adjunct Associate Professor at Rice University. His research focuses on algorithmic developments and high-performance computing for genomic and genetic applications. Specifically, he studies ways to improve the characterization of complex genomic alterations between individuals’ genomes based on large genomic sequencing data and as such improve our understanding of complex phenotypes such as human diseases.

December 08, 2025
  • Title: Fighting Noise with Noise: Causal Inference with Many Candidate Instruments

    Time: 04:00pm 

    Venue: Room 301, Run Run Shaw Building

    Speaker(s): Prof. Linbo Wang

    Remark(s): 

    Abstract

    Instrumental variable methods provide useful tools for inferring causal effects in the presence of unmeasured confounding. To apply these methods with large-scale data sets, a major challenge is to find valid instruments from a possibly large candidate set. In practice, most of the candidate instruments are often not relevant for studying a particular exposure of interest. Moreover, not all relevant candidate instruments are valid as they may directly influence the outcome of interest. In this article, we propose a data-driven method for causal inference with many candidate instruments that addresses these two challenges simultaneously. A key component of our proposal involves using pseudo variables, known to be irrelevant, to remove variables from the original set that exhibit spurious correlations with the exposure. Synthetic data analyses show that the proposed method performs favourably compared to existing methods. We apply our method to a Mendelian randomization study estimating the effect of obesity on health-related quality of life.

    About the speaker

    Linbo Wang is an associate professor at the University of Toronto, Canada, where he holds a joint appointment in the statistics, mathematics, and computer science departments. His research interests are in causal inference and graphical models. He is currently a Canada Research Chair in Causal Machine Learning.

     




Division of Computer Science,
School of Computing and Data Science

Rm 207 Chow Yei Ching Building
The University of Hong Kong
Pokfulam Road, Hong Kong
香港大學計算與數據科學院, 計算機科學系
香港薄扶林道香港大學周亦卿樓207室

Email: csenq@hku.hk
Telephone: 3917 3146

Copyright © School of Computing and Data Science, The University of Hong Kong. All rights reserved.