Hello
Shuli Jiang (蒋书礼)
I am fortunate to be advised by Professor Gauri Joshi.
I also work closely with Professor Steven Wu and Professor Pranay Sharma (who will join IIT Bombay in January 2025).
Prior to becoming a Ph.D. student, I obtained my B.S. and M.S. in Computer Science at Carnegie Mellon University.
Previously, I also worked with Professor Artur Dubrawski at the Auton Lab, affiliated with the Robotics Institute.
Job Hunting
I am looking for industry research positions. I am interested in differential privacy, distributed optimization, federated learning (FL), LLM security, and efficient training of LLMs!
Please let me know if you have openings! :)
About Me
My research interests lie broadly in theory and applications of federated learning, communication-efficient distributed learning algorithms, differential privacy and security issues of large foundation models.
I am currently working on three topics in secure and privacy-preserving machine learning:
Distributed algorithms for federated learning under limited communication cost
Differentially private learning algorithms that ensure the privacy of user data
Security vulnerabilities of foundation models, including large language models (LLMs)
Previously, I also worked on outlier detection, (distributed) sketching algorithms, multi-view clustering, etc.
Contact
Email: [first name] + j [AT] andrew.cmu.edu
Publications
(αβ: alphabetical order, **: contribution order)
Preprints
(**) Shuli Jiang, Swanand Kadhe, Yi Zhou, Farhan Ahmed, Ling Cai, Nathalie Baracaldo
Turning Generative Models Degenerate: The Power of Data Poisoning Attacks
[arXiv]
Workshop Proceedings:
(**) Shuli Jiang, Swanand Kadhe, Yi Zhou, Ling Cai, Nathalie Baracaldo
Forcing Generative Models to Degenerate Ones: The Power of Data Poisoning Attacks
NeurIPS 2023 Workshop on Backdoors in Deep Learning - The Good, the Bad, and the Ugly (Best Poster Award)
[arXiv]
Conference / Journal Proceedings:
(**) Shuli Jiang, Qiuyi (Richard) Zhang, Gauri Joshi
Optimized Tradeoffs for Private Majority Ensembling
(To Appear) Transactions on Machine Learning Research (TMLR 2024)
(**) Shuli Jiang, Pranay Sharma, Gauri Joshi
Correlation Aware Sparsified Mean Estimation Using Random Projection
The Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023)
[arXiv]
[Code]
(**) Shuli Jiang, Robson Leonardo Ferreira Cordeiro, Leman Akoglu
D.MCA: Outlier Detection with Explicit Micro-Cluster Assignments
The Twenty-second IEEE International Conference on Data Mining (ICDM 2022)
[Link]
[arXiv]
[Code]
(αβ) Shuli Jiang, Hai Thanh Pham, David P. Woodruff, Qiuyi (Richard) Zhang
Optimal Sketching for Trace Estimation
The Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021 Spotlight)
[Link]
[arXiv]
[Code]
(αβ) Shuli Jiang, Dongyu Li, Irene Mengze Li, Arvind V. Mahankali, David P. Woodruff
Streaming and Distributed Algorithms for Robust Column Subset Selection
The Thirty-eighth International Conference on Machine Learning (ICML 2021)
[Link]
[arXiv]
[Code]
(**) Bohan Zhang, Dana Van Aken, Justin Wang, Tao Dai, Shuli Jiang, Jacky Lao, Siyuan Sheng, Andrew Pavlo, Geoffrey J. Gordon
A Demonstration of the OtterTune Automatic Database Management System Tuning Service
The VLDB Endowment, Vol. 11, No. 12, 2018
[Link] [Code]
Technical Reports:
Shuli Jiang
Communication Efficient and Differentially Private Optimization
Ph.D. Thesis Proposal, 2024
[Link]
(αβ) Theresa Gebert, Shuli Jiang, Jiaxian Sheng
Characterizing Allegheny County Opioid Overdoses with an Interactive Data Explorer and Synthetic Prediction Tool
HackAuton Best Show Prize, 2018
[arXiv][Code]
Shuli Jiang
Deep Multi-view Clustering Using Local Similarity Graphs
Master's Thesis, 2020
Advisor: Prof. Artur Dubrawski
[Link]
Patent
Inventors: Shuli Jiang, Swanand Kadhe, Yi Zhou, Ling Cai, Nathalie Baracaldo
Title: A System and Method to Defend Against Data Poisoning Attacks Targeting Generative LLMs
Reference number: P202303734US01
Filed by IBM Research
April 2024
Work Experience
Google Research (Laser, Foresight)
Student Researcher
Advisors: Walid Krichene, Nicolas Mayoraz
May 2024 ~ August 2024, Mountain View, CA, USA
Focus: Differential privacy, Recommender systems
Design differentially private learning algorithms specifically for training models (Factorization Machine) for ads prediction and recommender systems. One project is on maximizing the privacy-utility trade-offs by making use of public features in the dataset. The other project is on privatizing an MCMC-sampling algorithm to train Factorization Machines.
IBM Research (Almaden)
Research Summer Intern (AI Security and Privacy Solutions)
Advisor: Swanand Kadhe, Manager: Nathalie Baracaldo
May 2023 ~ August 2023, Almaden, CA, USA
Focus: Large Language Model (LLM) security
Investigate security vulnerabilities of large language models (LLMs) in terms of data poisoning attacks targeting natural language generation (NLG) tasks, including text summarization, text completion, table-to-text generation, etc. Design and develop defense strategies to counter these types of security threats to LLMs.
Morgan Stanley
Technology Analyst (Application Development)
June 2018 ~ August 2018, New York City, NY, USA
Focus: Outlier detection system
Develop a data quality management system which collects real-time trading data from multiple source databases, detects potential anomalies to ensure data quality and visualizes anomalous data.
PreSenso Ltd.
Software Engineer
June 2017 ~ August 2017, Haifa, Israel
Focus: Outlier detection benchmark
Develop an anomaly detection benchmark for evaluating and comparing the performance of different anomaly detection algorithms on various patterns of anomalies.
Talks
CMU CyLab Partners Conference
Topic: Differentially Private Incremental Gradient (IG) Methods with Public Data
September 2024
NSF CPS Frontier Annual Review Lightning Talk (3-min)
Topic: Distributed Vector Mean Estimation
February 2024
AI-EDGE Students and Postdocs gathering for AI Research and Knowledge Sharing (SPARKS)
Topic: Federated Learning and Distributed Vector Mean Estimation
September 2023
CMU Robotics Institute Ph.D. Speaking Qualifier Public Talk
Topic: Differential Privacy and Private Majority Ensembling
May 2023
Service
Reviewer:
Conference: SODA 2022, SIGKDD 2023, NeurIPS 2023, ICLR 2024, AISTATS 2024, SDM 2024, ISIT 2024, NeurIPS 2024, ICLR 2025, AISTATS 2025
Workshop: The First Workshop on DL-Hardware Co-Design for AI Acceleration 2023, ICLR Workshop R2-FM 2024, ICML Workshop FM-Wild 2024, AutoML Workshop 2024
Journal: IEEE Transactions on Networking 2023, Data-centric Machine Learning Research (DMLR) 2024
Department Service:
CMU Robotics Institute Ph.D. Admission Committee 2023 and 2024
Teaching Assistantship
16-831 Statistical Techniques in Robotics, Fall 2022
@ Carnegie Mellon University
Instructor: Prof. David Held.
10-725 Convex Optimization, Fall 2020
@ Carnegie Mellon University
Instructor: Prof. Yuanzhi Li.
17-214 Principles of Software Construction, Fall 2017
@ Carnegie Mellon University
Instructor: Prof. Charlie Garrod.
Awards
NeurIPS 2023 Scholar Award, 2023
IEEE ICDM 2022 Student Travel Grant ($700), 2022
Graduate Student Assembly/Provost Conference Travel Grant ($750), 2022
Carnegie Mellon University Undergraduate University Honor, 2019
Carnegie Mellon University Undergraduate Dean's List, 2015 -- 2019
HackAuton Best Show Prize, 2018
Carnegie Mellon University Innovation Scholar, 2017 -- 2019
Buncher Entrepreneurship Award ($10,000), 2017