[MDS Job Talk] Neural Knowledge Integration from Large-Scale Multimodal Health Data
Speaker: Suqi Liu (Harvard)
Date: 1/7/25
Abstract: Data is playing an increasingly crucial role in modern health sciences. With the transition to electronic health systems, growing collaboration among institutions, and the rapid advancement of genome sequencing technologies, health data has become more accessible than ever before. However, the scale of health data presents unique challenges for statistics and data science, including high dimensionality, diverse modalities, and complex structures. In this talk, I will highlight several efforts I have undertaken in recent years to address these challenges using statistical, probabilistic, and AI-driven tools, ranging from practical applications to the theoretical foundations of data science. First, I will focus on learning, with theoretical guarantees, a comprehensive medical knowledge graph that integrates heterogeneous relationships from various sources, using embedding-based neural networks combined with representation learning. Next, I will discuss a method for harmonizing data across multiple institutions using graph neural networks (GNNs), along with a theoretical analysis of GNN performance in graph alignment tasks through random geometric graphs. Finally, I will present a framework for unifying the representation of biomedical concepts and genetic markers by leveraging large language models (LLMs).