Trustworthy Machine Learning
Graduate course, Penn State, College of IST, Fall 2023
Overview
Machine learning techniques are widely used to solve real-world problems. However, a key challenge is that they are vulnerable to various security and privacy attacks, e.g., adversarial examples, data poisoning attacks, and membership inference attacks. In this course, we will discuss existing attacks and state-of-the-art defenses against them.
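As a concrete illustration of the first attack class mentioned above, the sketch below crafts an adversarial example with the fast gradient sign method (FGSM). This is a minimal, illustrative example only, not one of the assigned readings (the schedule covers stronger attacks such as Carlini-Wagner and HopSkipJumpAttack); the function name, the epsilon value, and the use of PyTorch are assumptions made for illustration.

```python
# A minimal FGSM sketch (illustrative only; model, labels, and epsilon are placeholders).
import torch
import torch.nn as nn

def fgsm_example(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
                 epsilon: float = 0.03) -> torch.Tensor:
    """Return a perturbed copy of x that the model is more likely to misclassify."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    # Step in the direction that increases the loss, bounded by epsilon per pixel.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

The course papers study stronger iterative and query-based attacks, but the core idea of perturbing an input to increase the model's loss is the same.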
Prerequisites
This course requires the knowledge of basic machine learning (e.g., an undergraduate machine learning course) as well as linear algebra and calculus.
Logistics
- Instructor: Jinyuan Jia, jinyuan@psu.edu
- Teaching Assistant: Hangfan Zhang, hbz5148@psu.edu
- Time: TuTh 3:05 PM - 4:20 PM
- Location: Leonhard Bldg 203
- Office Hours: Jinyuan Jia: Wednesday 1:00 pm - 2:00 pm, E325 Westgate; Hangfan Zhang: Thursday 1:30 pm - 2:30 pm, E301 Westgate
Tentative Schedule (Subject to Change)
Week | Date | Topics | Papers | Notes
---|---|---|---|---
1 | 08/22 | Course overview | |||
1 | 08/24 | Adversarial examples in image domain (white-box) | 1. Towards Evaluating the Robustness of Neural Networks | ||
2 | 08/29 | Adversarial examples in image domain (black-box) | 1. HopSkipJumpAttack: A Query-Efficient Decision-Based Attack 2. (Optional) Delving into Transferable Adversarial Examples and Black-box Attacks | ||
2 | 08/31 | Empirical defenses against adversarial examples | 1. Towards Deep Learning Models Resistant to Adversarial Attacks | |
3 | 09/05 | Certified defenses against adversarial examples | 1. Certified Adversarial Robustness via Randomized Smoothing | ||
3 | 09/07 | Adversarial examples in (large) language models and their defenses | 1. Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks 2. (Optional) Certified Robustness to Text Adversarial Attacks by Randomized [MASK] | ||
4 | 09/12 | Adversarial examples for good use | 1. AttriGuard: A Practical Defense Against Attribute Inference Attacks via Adversarial Machine Learning 2. (Optional) Glaze: Protecting Artists from Style Mimicry by Text-to-Image Models | ||
4 | 09/14 | Data poisoning attacks to classifiers | 1. Poisoning Attacks against Support Vector Machines 2. (Optional) Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks | ||
5 | 09/19 | Data poisoning attacks to foundation models | 1. PoisonedEncoder: Poisoning the Unlabeled Pre-training Data in Contrastive Learning 2. (Optional) Indiscriminate Poisoning Attacks on Unsupervised Contrastive Learning | Speakers: Sai Naveen Katla, Salika Dave | |
5 | 09/21 | Model poisoning attacks to federated learning | 1. Local Model Poisoning Attacks to Byzantine-Robust Federated Learning 2. (Optional) FLTrust: Byzantine-robust Federated Learning via Trust Bootstrapping | Speakers: Hari Pranav Arun Kumar, Manasa Pisipati | |
6 | 09/26 | Certified defenses against data poisoning attacks | 1. Intrinsic Certified Robustness of Bagging against Data Poisoning Attacks 2. (Optional) Certified Robustness of Nearest Neighbors against Data Poisoning Attacks 3. (Optional) Certified Defenses for Data Poisoning Attacks | Speakers: Wei Zou, Yurui Chang | |
6 | 09/28 | Backdoor attacks in image domain | 1. BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain 2. (Optional) Trojaning Attack on Neural Networks | Speakers: Yilong Wang, Minhua Lin | |
7 | 10/03 | Backdoor attacks to pre-trained foundation models | 1. BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning 2. (Optional) Poisoning and Backdooring Contrastive Learning | ||
7 | 10/05 | Defending against backdoor attacks in image domain | 1. Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks 2. (Optional) STRIP: A Defence Against Trojan Attacks on Deep Neural Networks | Speaker: Ruimeng (Raymond) Shao |
8 | 10/10 | Backdoor attacks to (large) language models and their defenses | 1. Backdoor Pre-trained Models Can Transfer to All 2. (Optional) PICCOLO: Exposing Complex Backdoors in NLP Transformer Models | Speakers: Harish Kolla, Girish Nagarajan | |
8 | 10/12 | Privacy attacks to image classifiers | 1. Membership Inference Attacks against Machine Learning Models 2. (Optional) Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures | Speakers: Srija Akula, Hanzheng Wang | |
9 | 10/17 | Privacy attacks to federated learning | 1. Deep Leakage from Gradients 2. (Optional) Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning | Speakers: Yaopei Zeng and Yuanpu Cao | |
9 | 10/19 | Privacy attacks to (large) language models and diffusion models | 1. Extracting Training Data from Large Language Models 2. (Optional) Extracting Training Data from Diffusion Models | Speakers: Keaton Yukio Kraiger, Vishal Ahir |
10 | 10/24 | Defending against privacy attacks | 1. Deep Learning with Differential Privacy 2. (Optional) SecureML: A System for Scalable Privacy-Preserving Machine Learning | Speaker: Yanting Wang | |
10 | 10/26 | Model stealing attacks | 1. Stealing Machine Learning Models via Prediction APIs 2. (Optional) Stealing Hyperparameters in Machine Learning | ||
11 | 10/31 | Intellectual property protection | 1. Prediction Poisoning: Utility-Constrained Defenses Against Model Stealing Attacks 2. (Optional) Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring 3. (Optional) Certified Neural Network Watermarks with Randomized Smoothing | ||
11 | 11/02 | Image watermarking and its security | 1. HiDDeN: Hiding Data with Deep Networks 2. (Optional) Evading Watermark based Detection of AI-Generated Content | |
12 | 11/07 | Prompt injection attacks to large language models | 1. Ignore Previous Prompt: Attack Techniques For Language Models 2. (Optional) Prompt Injection Attacks and Defenses in LLM-Integrated Applications | ||
12 | 11/09 | Machine generated text detection | 1. A Watermark for Large Language Models | ||
13 | 11/14 | Deepfakes | 1. Generative Adversarial Nets 2. High-Resolution Image Synthesis with Latent Diffusion Models (Stable Diffusion) | |
13 | 11/16 | Safety of LLMs | 1. Jailbroken: How Does LLM Safety Training Fail? 2. (Optional) Universal and Transferable Adversarial Attacks on Aligned Language Models | Speaker: Bochuan Cao | |
14 | | Thanksgiving break (no class) | | |
15 | 11/28 | Project Presentation | Group 1: Yurui Chang and Yuanpu Cao | ||
15 | 11/30 | Project Presentation | Group 1: Yanting Wang and Wei Zou; Group 2: Hanzheng Wang and Srija Akula; Group 3: Yaopei Zeng | |
16 | 12/05 | Project Presentation | Group 1: Harish Kolla and Girish Nagarajan; Group 2: Sai Naveen Katla and Salika Dave; Group 3: Hari Pranav Arun Kumar and Manasa Pisipati | |
16 | 12/07 | Project Presentation | Group 1: Ruimeng Shao; Group 2: Yilong Wang and Minhua Lin; Group 3: Vishal Ahir and Keaton Kraiger | |
Paper Review
- Deadline: Monday or Wednesday at 11:59 pm (EST); one paper review per week. Please send your review to psutrustworthyml@gmail.com and keep all reviews in a single email thread (by replying to your previous message). Note that ChatGPT must not be used to write the review.
Group
- Students can form groups of at most 2 students for the lecture and class project.
- Lecture: Prepare the slides (or use others' slides with proper citations) and give a lecture. To choose a lecture topic and date, each group should send three preferred dates to psutrustworthyml@gmail.com by 11:59 pm (EST) on 08/31.
- Class project: The project should be related to machine learning (published work cannot be used as the course project).
Grading Policy
- 50% project
- 25% reading assignment
- 10% class participation and quiz
- 15% class presentation
Final grade cutoffs (an illustrative computation follows this list):
- A [93%, 100%]
- A- [90%, 93%)
- B+ [87%, 90%)
- B [83%, 87%)
- B- [80%, 83%)
- C+ [77%, 80%)
- C [70%, 77%)
- D [60%, 70%)
- F [0%, 60%)
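Here is a minimal sketch, assuming hypothetical component scores, of how the weights and cutoffs above combine into a letter grade; the function and example numbers are illustrative only, not an official grading tool.

```python
# Illustrative only: weighted final grade and letter cutoff lookup (hypothetical scores).
def final_grade(project, reading, participation, presentation):
    """Each argument is a percentage in [0, 100]; weights follow the grading policy above."""
    total = 0.50 * project + 0.25 * reading + 0.10 * participation + 0.15 * presentation
    # Cutoffs are half-open intervals, e.g., A- covers [90, 93).
    cutoffs = [(93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
               (77, "C+"), (70, "C"), (60, "D")]
    letter = next((grade for cutoff, grade in cutoffs if total >= cutoff), "F")
    return total, letter

# Example: 95% project, 90% reading, 100% participation, 85% presentation -> (92.75, "A-")
print(final_grade(95, 90, 100, 85))
```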
Late Submission Policy
- 10% deduction for every 24 hours late.
- No late submissions are accepted more than 3 days after the deadline.
- Please email the instructor regarding extensions for special cases.
ACADEMIC INTEGRITY STATEMENT
Academic integrity is the pursuit of scholarly activity in an open, honest and responsible manner. Academic integrity is a basic guiding principle for all academic activity at The Pennsylvania State University, and all members of the University community are expected to act in accordance with this principle. Consistent with this expectation, the University’s Code of Conduct states that all students should act with personal integrity, respect other students’ dignity, rights and property, and help create and maintain an environment in which all can succeed through the fruits of their efforts.
Academic integrity includes a commitment by all members of the University community not to engage in or tolerate acts of falsification, misrepresentation or deception. Such acts of dishonesty violate the fundamental ethical principles of the University community and compromise the worth of work completed by others.
DISABILITY ACCOMMODATION STATEMENT
Penn State welcomes students with disabilities into the University’s educational programs. Every Penn State campus has an office for students with disabilities. Student Disability Resources (SDR) website provides contact information for every Penn State campus (http://equity.psu.edu/sdr/disability-coordinator). For further information, please visit Student Disability Resources website (http://equity.psu.edu/sdr/).
In order to receive consideration for reasonable accommodations, you must contact the appropriate disability services office at the campus where you are officially enrolled, participate in an intake interview, and provide documentation: See documentation guidelines (http://equity.psu.edu/sdr/guidelines). If the documentation supports your request for reasonable accommodations, your campus disability services office will provide you with an accommodation letter. Please share this letter with your instructors and discuss the accommodations with them as early as possible. You must follow this process for every semester that you request accommodations.
COUNSELING AND PSYCHOLOGICAL SERVICES STATEMENT
Many students at Penn State face personal challenges or have psychological needs that may interfere with their academic progress, social development, or emotional wellbeing. The university offers a variety of confidential services to help you through difficult times, including individual and group counseling, crisis intervention, consultations, online chats, and mental health screenings. These services are provided by staff who welcome all students and embrace a philosophy respectful of clients’ cultural and religious backgrounds, and sensitive to differences in race, ability, gender identity and sexual orientation.
Counseling and Psychological Services at University Park (CAPS) (http://studentaffairs.psu.edu/counseling/): 814-863-0395
Counseling and Psychological Services at Commonwealth Campuses (https://senate.psu.edu/faculty/counseling-services-at-commonwealth-campuses/)
Penn State Crisis Line (24 hours/7 days/week): 877-229-6400
Crisis Text Line (24 hours/7 days/week): Text LIONS to 741741
EDUCATIONAL EQUITY/REPORT BIAS STATEMENTS
Consistent with University Policy AD29, students who believe they have experienced or observed a hate crime, an act of intolerance, discrimination, or harassment that occurs at Penn State are urged to report these incidents as outlined on the University’s Report Bias webpage (http://equity.psu.edu/reportbias/)