Yanjie Fu
-
BYENG 506, Brickyard Engineering 699 S Mill Ave Tempe, AZ 85281
-
Mail code: 8809
Campus: Tempe
-
Dr. Yanjie Fu is an associate professor in the School of Computing and AI at Arizona State University. He received his Ph.D. degree from Rutgers University in 2016, his B.E. degree from the University of Science and Technology of China (USTC), and his M.E. degree from the Chinese Academy of Sciences (CAS). He has research experience in industry research labs, such as Microsoft Research Asia and IBM Thomas J. Watson Research Center. He has published prolifically in refereed journals and conference proceedings, such as IEEE TKDE, IEEE TMC, ACM TKDD, ACM SIGKDD, ICLR, AAAI, IJCAI, VLDB, IEEE ICDE, WWW, and ACM SIGIR. He currently serves as an Associate Editor of ACM Transactions on Knowledge Discovery from Data.
His teaching and research have been recognized by: 1) three junior faculty awards: US NAE Grainger Foundation Frontiers of Engineering early career engineer (2023), US NSF CAREER (2021), and US NSF CRII (2018) awards; 2) several best paper (runner-up, finalist) awards (e.g., IEEE ICDM 2021 best paper finalist, ACM SIGSPATIAL 2020 best paper runner-up, ACM SIGKDD 2018 best student paper finalist); 3) several community and industrial recognitions: 2024 Stanford Elsevier World’s Top 2% Scientists, 2022 Baidu Scholar global top Chinese young scholars in AI, 2021 Aminer.org AI 2000 Most Influential Scholar Award Honorable Mention in Data Mining, and 2016 Microsoft Azure Research Award; 4) several university-level awards: Fulton Engineering Top 5% Teaching Recognition Award, Reach the Stars Award, University System Research Board Award, and University Interdisciplinary Research Award. He is committed to data science education. His Ph.D. graduates have joined academia as tenure-track faculty members.
He is broadly interested in data mining, AI, and their interdisciplinary applications. His research involves two major efforts: 1) on Data for AI (D4AI), how can the structure knowledge of data guide AI? His lab has contributed projects including D4AI-spatial, D4AI-timeseries, and D4AI-causal outliers. 2) on AI for Data (AI4D), how can AI augment, reprogram, and knowledgize data? Contributed projects include AI4D-RL, AI4D-Gen, and AI4D-LLM. His recent focuses are space-time intelligence, data-centric AI, and sim2decision. He also explores emerging topics in multimodal reasoning, LLMs, and agentic AI with his students. He has been fortunate to work with collaborators from scientific and social domains, including urban and regional planning, earth and environmental science, learning and educational science, disaster and community resilience, and social computing.
AI ⇄ Data
AI systems, unlike humans, are brittle: they often struggle when faced with novel situations and are highly sensitive to small perturbations, which can lead to catastrophically poor performance. My research aims to develop robust machine intelligence with imperfect and complex data, by building tools to address learning framework, algorithmic, data, and computing challenges. My research involves two major efforts: 1) on the AI side, I study how the structure knowledge of data can guide AI; 2) on the data side, I examine how AI can transform, reprogram, augment, and knowledgize data. I take two important steps towards this vision. The first step (data representation construct) aims to integrate structure knowledge to achieve deep robust representations that cope with imperfect and complex data. The second step (learning strategy construct) aims to integrate robust representations with adaptive and interactive learning to cope with uncertain and constrained environments.
A. AI For The Data: AI for Data Augmentation, Reprogramming, Knowledgization
A-1 Data-centric AI: from Reinforcement Decisions to Generative Intelligence
- Key concepts: learning mechanism unknown feature representation space knowledge via generative modeling and continuous latent reasoning for data editing
- Ph.D. Dissertation Student: Dongjie Wang (2020-2024), a tenure-track assistant professor at the University of Kansas
- Key papers:
A-2 Reinforcement Learning and Automated Data Science
- Key concepts: self-optimizing feature selection and generation as decision sequences of reinforcement policy networks
- Ph.D. Dissertation Student: Kunpeng Liu (2017-2022), a tenure-track assistant professor at Clemson University
- Key papers:
- Automated Feature Selection: A Reinforcement Learning Perspective (TKDE)
- Interactive Reinforcement Learning for Feature Selection with Decision Tree in the Loop (TKDE)
- Efficient Reinforced Feature Selection via Early Stopping Traverse Strategy (ICDM)
- Automating Feature Subspace Exploration via Multi-Agent Reinforcement Learning (KDD)
- Awards:
- IEEE ICDM Best Paper Finalist
B. AI By The Data: Data’s Structure Knowledge for Guiding AI
B-1 Deep Time Series Learning
- Key concepts: from time series regularities to time series shifts
- Ph.D. Dissertation Student: Wei Fan (2020-2023), a tenure-track assistant professor at the University of Auckland
- Key papers:
B-2 Structure Knowledge Guided Spatial-Temporal Representation Learning
- Key concepts: integrating spatial-temporal knowledge guidance (e.g., collective, peer, dynamic, substructure knowledge) with graph embedding to amplify spatiotemporal representation learning
- Ph.D. Dissertation Student: Pengyang Wang (2017-2021), a tenure-track assistant professor at the University of Macau
- Key papers:
- Awards:
- ACM SIGKDD’18 Best Student Paper Finalist
B-3 Time Series and Root Cause Analysis
- Key concepts: learning causal structures from multivariate time series
- Ph.D. Students: Dr. Dongjie Wang, Arun Vignesh
- Key papers:
C. AI Beyond The Data: AI to Reason Beyond Data Distribution and Dynamics as Extrapolation Knowledge for Simulation, Decisions, World Models
- Key concepts: simulation as generative AI; decision making as policy learning or agentic reasoning; building spatial, time, social, and physical knowledge-guided digital copy of complex system distribution and dynamics; reason decision intelligence from multimodal semi-structure hybrid data
- Ph.D. Students: Haoyue Bai
- Key papers:
- Generative Simulation and Iterative Decision Policies for Supply Chain Optimization (INFORMS Winter Simulation Conference 2025)
- Brownian Bridge Augmented Surrogate Simulation and Injection Planning for Geological CO2 Storage (NeurIPS 2025, under review)
AI Interactions: Embedding AI into Critical Systems and Interdisciplinary Research
- AI and Urban Planning
- Key concepts: urban planning as land-use configuration generation
- Ph.D. Students: Dr. Dongjie Wang
- Community Building: AAAI 25 urban planning workshop, 25 AI and Cities Forum
- Key papers:
- Towards Automated Urban Planning: When Generative and ChatGPT-like AI Meets Urban Planning
- Reimagining City Configuration: Automated Urban Planning via Adversarial Learning (SIGSPATIAL 20)
- Human-instructed Deep Hierarchical Generative Learning for Automated Urban Planning (AAAI 23)
- Automated Urban Planning for Reimagining City Configuration via Adversarial Learning: Quantification, Generation, and Evaluation (ACM Transactions on Spatial Algorithms and Systems 23)
- Deep human-guided conditional variational generative modeling for automated urban planning (IEEE ICDM 23)
- Hierarchical Reinforced Urban Planning: Jointly Steering Region and Block Configurations (SIAM DM 23)
- Awards:
- ACM SIGSPATIAL Best Paper Runner-up
- Human Mobility Modeling
- Key concepts: human mobility as stochastic processes, word/topic modeling
- Human Mobility Synchronization and Trip Purpose Detection with Mixture of Hawkes Processes (KDD’17)
- Representing Urban Forms: A Collective Learning Model with Heterogeneous Human Mobility Data (TKDE)
- Machine Learning for Urban Vibrancy Analysis with Crowd-sourced Geo-tagged Data
- Machine Learning for In-App Behavior Analysis
- A Multi-Label Multi-View Learning Framework for In-App Service Usage Analysis (TIST)
- Effective and Real-time In-App Activity Analysis in Encrypted Internet Traffic Streams (KDD’17)
- Service Usage Classification with Encrypted Internet Traffic in Mobile Messaging Apps (TMC)
- Service Usage Analysis in Mobile Messaging Apps: A Multi-Label Multi-View Perspective (ICDM’16)
- Machine Learning for Mobile Recommender Systems
Courses
2026 Spring
| Course Number | Course Title |
|---|---|
| CSE 595 | Continuing Registration |
| CSE 792 | Research |
| CSE 795 | Continuing Registration |
| CSE 792 | Research |
| CSE 572 | Data Mining |
| CSE 593 | Applied Project |
| DSE 572 | Data Mining |
2025 Fall
| Course Number | Course Title |
|---|---|
| CSE 599 | Thesis |
| CSE 792 | Research |
| CSE 799 | Dissertation |
| CSE 590 | Reading and Conference |
| CSE 580 | Practicum |
| CSE 790 | Reading and Conference |
| CSE 792 | Research |
| CSE 572 | Data Mining |
2025 Summer
| Course Number | Course Title |
|---|---|
| CSE 584 | Internship |
2025 Spring
| Course Number | Course Title |
|---|---|
| CSE 493 | Honors Thesis |
| CSE 599 | Thesis |
| CSE 792 | Research |
| CSE 799 | Dissertation |
| CSE 792 | Research |
| CSE 790 | Reading and Conference |
| CSE 572 | Data Mining |
2024 Fall
| Course Number | Course Title |
|---|---|
| CSE 492 | Honors Directed Study |
| CSE 792 | Research |
| CSE 580 | Practicum |
| CSE 790 | Reading and Conference |
| CSE 790 | Reading and Conference |
| CSE 792 | Research |
| CSE 572 | Data Mining |
2024 Summer
| Course Number | Course Title |
|---|---|
| CSE 584 | Internship |
| CSE 792 | Research |
2024 Spring
| Course Number | Course Title |
|---|---|
| CSE 792 | Research |
| CSE 790 | Reading and Conference |
| CSE 572 | Data Mining |
2023 Fall
| Course Number | Course Title |
|---|---|
| CSE 792 | Research |
| CSE 580 | Practicum |
| CSE 792 | Research |
| CSE 572 | Data Mining |