Zhichao Cao
Mail code: 8809
Campus: Tempe
Zhichao Cao is an assistant professor in the School of Computing and Augmented Intelligence at Arizona State University. He leads the Intelligent Data Infrastructure Lab (ASU-IDI), where he conducts research on database systems (e.g., key-value stores, graph databases, and time-series databases), storage systems (e.g., file systems, cloud storage, and deduplication systems), and next-generation data infrastructure (e.g., disaggregated infrastructure, computing-in-X, and wireless datacenters). His research interests also include the design and development of data management systems for new memory and storage technologies, such as SMR, IMR, NVM, CXL, RDMA, ZNS, and DNA. His research further encompasses big data systems, with a focus on query engines for large-scale scientific computing in HPC and storage solutions for AI/ML platforms (LLM offloading, vector databases, and LLMs for storage system optimization).
Zhichao Cao has served as a Program Committee member for several top conferences, including USENIX ATC, USENIX FAST, SIGMOD, VLDB, ICPP, and HotStorage. He has served as proceedings chair of SIGMOD, publicity chair of MSST, and virtual chair of HotStorage. He is a reviewer for top journals, including ACM Transactions on Storage, IEEE Transactions on Computers, and IEEE Transactions on Cloud Computing.
Prior to joining ASU, Zhichao Cao worked as a research scientist at Facebook, where he contributed to storage and database research from 2018 to 2021. He earned his bachelor's degree in Automation from Tsinghua University in 2013 and his doctoral degree in Computer Science from the University of Minnesota, Twin Cities (advised by Prof. David H.C. Du), in 2020.
Hiring!
Recruiting 1-2 fully funded Ph.D. students for Fall 2025, focusing on research areas related to data systems for LLMs (e.g., KV-cache, offloading, vector databases, and hierarchical databases). Currently collaborating closely with Meta and Oak Ridge National Laboratory (ORNL) in this field.
- Ph.D. Computer Science, University of Minnesota, Twin Cities, 2020
- M.S. Computer Science, University of Minnesota, Twin Cities, 2019
- B.S. Automation, Tsinghua University, China, 2013
Data Infrastructure: key-value stores (RocksDB, LevelDB, HBase); NoSQL databases (GraphDB); data deduplication; backup and archive systems; file systems; hierarchical storage systems; distributed storage systems; compute-storage disaggregation; memory disaggregation; AI/ML for data infrastructure (auto-tuning, tracing, and analyzing); data infrastructure for AI/ML (KV-cache offloading, memory extension, vector databases, and sustainable LLMs)
Storage for Big Data: cloud storage; object storage; storage systems for big graphs; storage systems in IoT
New Memory and Storage Techniques: Disaggregated Memory (with CXL and RDMA); Non-Volatile Memory (NVM); Shingled Magnetic Recording (SMR); Interlaced Magnetic Recording (IMR); Zoned Namespace SSDs (ZNS SSDs); DNA- and Glass-based storage
[SOSP'24] Shushu Yi, Shaocong Sun, Li Peng, Yingbo Sun, Ming-Chang Yang, Zhichao Cao, Qiao Li, Myoungsoo Jung, Ke Zhou, Jie Zhang. "BIZA: Design of Self-Governing Block-Interface ZNS AFA for Endurance and Performance". The 30th ACM Symposium on Operating Systems Principles (SOSP 2024).
[TC'24] Yixun Wei, Zhichao Cao, David HC Du. “CPI: A Collaborative Partial Indexing Design for Large-Scale Deduplication Systems.” IEEE Transactions on Computers, November 2024
[HotStorage’24] Viraj Thakkar, Madhumitha Sukumar, Jiaxin Dai, Kaushiki Singh, Zhichao Cao. “Can Modern LLMs Tune and Configure LSM-based Key-Value Stores?” 16th ACM Workshop on Hot Topics in Storage and File Systems (HotStorage), 2024. Best Paper Award!
[HotStorage’24] Chongzhuo Yang, Zhang Cao, Chang Guo, Ming Zhao, Zhichao Cao. “Can ZNS SSDs be Better Storage Devices for Persistent Cache?” 16th ACM Workshop on Hot Topics in Storage and File Systems (HotStorage), 2024.
[MSST'24] Zhang Cao, Chang Guo, Ziyuan Lv, Anand Ananthabhotla, Zhichao Cao. "SAS-Cache: A Semantic-Aware Secondary Cache for LSM-based Key-Value Stores". The 38th International Conference on Massive Storage Systems and Technology (MSST), Research Track Full Paper, 2024.
[MSST'24] Gaoji Liu, Chongzhuo Yang, Qiaolin Yu, Chang Guo, Wen Xia, Zhichao Cao. "Prophet: Optimizing LSM-Based Key-Value Store on ZNS SSDs with File Lifetime Prediction and Compaction Compensation". The 38th International Conference on Massive Storage Systems and Technology (MSST), Research Track Full Paper, 2024.
[SIGMOD'24] Qiaolin Yu, Chang Guo, Jay Zhuang, Viraj Thakkar, Jianguo Wang, Zhichao Cao. "CaaS-LSM: Compaction-as-a-Service for LSM-based Key-Value Stores in Storage-Disaggregated Infrastructure". Proceedings of ACM Conference on Management of Data (SIGMOD), Research Track Full Paper, 2024.
[ICCD'23] Zhichao Cao, Hao Wen, Fenggang Wu, David H.C. Du. "SMRTS: A Performance and Cost-Effectiveness Optimized SSD-SMR Tiered File System with Data Deduplication". The 41st IEEE International Conference on Computer Design (ICCD) (Acceptance rate: 28%), Research Track Full Paper, 2023.
[ICCD'23] Hao Wen, Zhichao Cao, Bingzhe Li, David Du, Ayman Abouelwafa, Doug Voigt, Shiyong Liu, Jim Diehl, and Fenggang Wu. "K8sES: Optimizing Kubernetes with Enhanced Storage Service-Level Objectives". The 41st IEEE International Conference on Computer Design (ICCD) (Acceptance rate: 28%), Research Track Full Paper, 2023.
[TOS’22] Zhichao Cao, Huibing Dong, Yixun Wei, Shiyong Liu, and David H.C. Du. “IS-HBase: An In-Storage Computing Optimized HBase with I/O Offloading and Self-Adaptive Caching in Compute-Storage Disaggregated Infrastructure.” ACM Transactions on Storage, Volume 18, Issue 2, May 2022.
[TOS’22] Hiwot Tadese Kassa, Jason Akers, Mrinmoy Ghosh, Zhichao Cao, Vaibhav Gogte, Ronald Dreslinski. “Power-optimized Deployment of Key-value Stores Using Storage Class Memory.” ACM Transactions on Storage, Volume 18, Issue 2, May 2022.
[TOS’22] Xiongzi Ge, Zhichao Cao, and David H.C. Du. “HintStor: A Framework to Study I/O Hints in Heterogeneous Storage.” ACM Transactions on Storage, Volume 18, Issue 2, May 2022.
[ATC’21] Hiwot Tadese Kassa, Jason Akers, Mrinmoy Ghosh, Zhichao Cao, Vaibhav Gogte, Ronald Dreslinski. “Improving Performance of Flash Based Key-Value Stores Using Storage Class Memory as a Volatile Memory Extension.” 2021 USENIX Annual Technical Conference, 2021 (Acceptance rate: 64/341=23% as Full Paper).
[FAST’20] Zhichao Cao, Siying Dong, Sagar Vemuri, and David H.C. Du. “Characterizing, Modeling, and Benchmarking RocksDB Key-Value Workloads at Facebook.” 18th USENIX Conference on File and Storage Technologies, 2020 (Acceptance rate: 23/138=17% as Full Paper).
[FAST’19] Zhichao Cao, Shiyong Liu, Fenggang Wu, Guohua Wang, Bingzhe Li, and David H.C. Du. “Sliding Look-Back Window Assisted Data Chunk Rewriting for Improving Deduplication Restore Performance.” 17th USENIX Conference on File and Storage Technologies, 2019 (Acceptance rate: 26/145=18% as Full Paper).
[TOS’19] Zhichao Cao, Hao Wen, Xiongzi Ge, and David H.C. Du. “TDDFS: A Tier-aware Data Deduplication based File System.” ACM Transactions on Storage, 2019.
[FAST’18] Zhichao Cao, Hao Wen, Fenggang Wu, and David H.C. Du. “ALACC: Accelerating Restore Performance of Data Deduplication Systems Using Adaptive Look Ahead Window Assisted Chunk Caching.” 16th USENIX Conference on File and Storage Technologies, 2018 (Acceptance rate: 23/139=17% as Full Paper).
Posters and Work-in-Progress
[FAST’24] Madhumitha Sukumar, Jiaxin Dai, Kaushiki Singh, Vikriti Lokegaonkar, Viraj Thakkar, Zhichao Cao. “LLM-assisted Automatic-Configuration and Tuning Framework for LSM-based Key-Value Stores.” 22nd USENIX Conference on File and Storage Technologies, 2024.
[FAST’23] Kritshekhar Jha, Ian Mcdonough, Alexander Sutila, Zhichao Cao, and Ming Zhao. “DM-ZCache: Zoned Namespace (ZNS) SSD based Caching.” 21st USENIX Conference on File and Storage Technologies, 2023.
[FAST’23] Jinghuan Yu, Yixun Wei, Zhichao Cao, David H.C. Du, and Chun Jason Xue. “Level-based Shard Migration in Distributed LSM KV Store.” 21st USENIX Conference on File and Storage Technologies, 2023.
[FAST’17] Zhichao Cao, Fenggang Wu, Hao Wen, and David H.C. Du. “Optismr: Restore-Performance Optimization for Deduplication Systems Using SMR Drives.” 15th USENIX Conference on File and Storage Technologies, 2017.
[FAST’17] Hao Wen, Zhichao Cao, Yang Zhang, and David H.C. Du. “Guaranteed QoS with Integrated Control for Networked Storage.” 15th USENIX Conference on File and Storage Technologies, 2017.
[SoCC’14] Xiongzi Ge, Zhichao Cao, and David H.C. Du. “OneStore: Integrating Local and Cloud Storage with Access Hints.” ACM Symposium on Cloud Computing, 2014.
Courses
2025 Spring
Course Number | Course Title |
---|---|
CSE 599 | Thesis |
CSE 792 | Research |
CSE 799 | Dissertation |
CSE 580 | Practicum |
CSE 330 | Operating Systems |
CSE 330 | Operating Systems |
2024 Fall
Course Number | Course Title |
---|---|
CSE 792 | Research |
CSE 799 | Dissertation |
CSE 580 | Practicum |
CSE 511 | Data Processing at Scale |
2024 Summer
Course Number | Course Title |
---|---|
CSE 584 | Internship |
CSE 792 | Research |
CSE 584 | Internship |
CSE 792 | Research |
CSE 330 | Operating Systems |
CSE 330 | Operating Systems |
CSE 330 | Operating Systems |
CSE 330 | Operating Systems |
2024 Spring
Course Number | Course Title |
---|---|
CSE 599 | Thesis |
CSE 792 | Research |
CSE 580 | Practicum |
CSE 330 | Operating Systems |
CSE 330 | Operating Systems |
2023 Fall
Course Number | Course Title |
---|---|
CSE 792 | Research |
CSE 580 | Practicum |
CSE 792 | Research |
CSE 330 | Operating Systems |
CSE 330 | Operating Systems |
CSE 330 | Operating Systems |
CSE 330 | Operating Systems |
CSE 330 | Operating Systems |
2023 Summer
Course Number | Course Title |
---|---|
CSE 584 | Internship |
CSE 584 | Internship |
CSE 599 | Thesis |
2023 Spring
Course Number | Course Title |
---|---|
CSE 599 | Thesis |
CSE 580 | Practicum |
CSE 511 | Data Processing at Scale |
2022 Fall
Course Number | Course Title |
---|---|
CSE 580 | Practicum |
CSE 330 | Operating Systems |
CSE 330 | Operating Systems |
CSE 330 | Operating Systems |
CSE 330 | Operating Systems |
CSE 330 | Operating Systems |
2022 Spring
Course Number | Course Title |
---|---|
CSE 511 | Data Processing at Scale |
Spring 2022: CSE 520
- Program Committee of USENIX FAST 2026
- Program Committee of ACM SIGMOD 2026
- Program Committee of USENIX ATC 2025
- Program Committee of USENIX FAST 2025
- Program Committee of ACM SIGMOD 2025
- Program Committee of VLDB 2025
- Program Committee of IEEE ICDCS 2025
- Program Committee of ACM HotStorage 2025
- Program Committee of USENIX ATC 2024
- Program Committee of ACM SIGMOD 2024 (Demo track)
- Program Committee of VLDB 2024
- Publicity Co-Chair of MSST 2024
- Program Committee of ACM HotStorage 2024
- Program Committee of ACM SYSTOR 2024
- Proceedings Co-Chair of ACM SIGMOD 2023
- Program Committee of ACM SIGMOD 2023
- Program Committee of ICPP 2023
- Program Committee of ACM HotStorage 2023
- Virtual Chair of ACM HotStorage 2022
- Program Committee of IEEE NAS 2022
- Program Committee of ACM APSys 2022
- Reviewer of ACM Transactions on Storage (TOS), 2022, 2023, 2024
- Reviewer of IEEE Transactions on Computers (TC), 2022
- Reviewer of IEEE/ACM Transactions on Networking (TON), 2022, 2023
- Reviewer of IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2022
- Reviewer of International Journal of Future Generation Computer Systems, 2021
- Reviewer of IEEE Transactions on Cloud Computing, 2022
- Reviewer of Computer Communications, 2022
- Reviewer of IEEE Intelligent Systems, 2022
- Volunteer of International Conference on Parallel Processing (ICPP’14)
- Research Scientist, Facebook, Oct. 2019 - Dec. 2021
- Research Collaborator, Facebook, Sep. 2018 - Sep. 2019
- Research Intern, Facebook, Jun. 2018 - Aug. 2018
- Research Intern, Veritas, Jun. 2016 - Aug. 2016
- Research Intern, Hewlett-Packard (HPE), Jun. 2015 - Aug. 2015
- Research Intern, Hewlett-Packard (HPE), Jun. 2014 - Aug. 2014