Xinyue Shen

Voice Jailbreak Attacks Against GPT-4o

May 29, 2024

UnsafeBench: Benchmarking Image Safety Classifiers on Real-World and AI-Generated Images

May 06, 2024

Comprehensive Assessment of Jailbreak Attacks Against LLMs

Feb 08, 2024

"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models

Aug 07, 2023

Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models

May 23, 2023

In ChatGPT We Trust? Measuring and Characterizing the Reliability of ChatGPT

Apr 18, 2023

MGTBench: Benchmarking Machine-Generated Text Detection

Mar 26, 2023

Prompt Stealing Attacks Against Text-to-Image Generation Models

Feb 20, 2023

Backdoor Attacks in the Supply Chain of Masked Image Modeling

Oct 04, 2022

Nonconvex Sparse Logistic Regression with Weakly Convex Regularization

Aug 07, 2017