🛡️ AI Safety Datasets Collection

Comprehensive evaluation datasets for testing AI model safety mechanisms

📊 Dataset Collection Summary

  • Total Conversations: 849+ (across all datasets)
  • Total Turns: 6694+ (multi-turn interactions)
  • Dataset Types: 3 (complementary methodologies)
  • Sample Data: 150 (free conversations available)

📈 Full Dataset Statistics

| Dataset | Conversations | Turns | Avg Turns/Conv | Focus |
|---|---|---|---|---|
| Psychology multi-turn | 184+ | 1964+ | 10.3 | Psychological harms such as self-harm, psychosis, and anthropomorphism |
| Illicit (bioweapon) multi-turn | 84+ | 822+ | 9.8 | Biosafety harms such as bioweapons and pathogens |
| Illicit (chemical, general) multi-turn | 581+ | 3908+ | 6.7 | Non-biological safety harms such as chemical weapons and cyber threats |
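
The collection-level totals in the summary follow from the per-dataset rows. A minimal Python sketch of that arithmetic, using the lower-bound counts listed above (the `rows` structure is only an illustration, not a format shipped with the datasets):

```python
# Per-dataset lower-bound counts copied from the statistics table:
# (dataset name, conversations, turns)
rows = [
    ("Psychology multi-turn", 184, 1964),
    ("Illicit (bioweapon) multi-turn", 84, 822),
    ("Illicit (chemical, general) multi-turn", 581, 3908),
]

# Sum each column to get the collection-level totals.
total_conversations = sum(conversations for _, conversations, _ in rows)
total_turns = sum(turns for _, _, turns in rows)

print(total_conversations, total_turns)  # 849 6694
```

Because the published counts are lower bounds ("+"), these sums are lower bounds as well.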

🔗 Access Datasets on Hugging Face

⚠️ Ethical Considerations

⚠️ IMPORTANT

These datasets contain successful adversarial attacks and harmful content.

✅ Intended Use

  • Defensive security research
  • AI safety evaluation and improvement
  • Academic research on adversarial robustness
  • Training safety and moderation systems

❌ Prohibited Use

  • Creating offensive content
  • Developing attack tools for malicious purposes
  • Bypassing safety systems for harm
  • Any use that violates laws or ethical guidelines

🎯 Data Selection Process

All datasets are derived from high-quality, validated conversations selected using strict quality filters, including NeurIPS evaluation protocols.

Base Criteria

  • Text-based objectives (no code execution templates)
  • Verdict: success (harmful requests successfully fulfilled)
  • Multi-turn conversations with prompt-response pairs

Psychology-Specific Criteria

  • Organic conversations (organicity = true)
  • Successfully elicited harmful psychology-related content

Illicit-Specific Criteria

  • Contains specific instruction details
  • Practically executable (not abstract)
  • Successfully elicited harmful illicit-related content
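
The criteria above amount to a set of record-level filters. The following Python sketch shows one way they might be expressed. It is an illustration under assumptions, not the actual selection pipeline: the record layout and the field names `objective_type`, `turns`, `has_specific_instructions`, and `practically_executable` are hypothetical, and only `verdict` and `organicity` echo terms used in this document.

```python
from typing import Any

# Hypothetical record layout: each conversation is a dict of metadata
# plus a "turns" list of prompt-response pairs.
Conversation = dict[str, Any]


def passes_base_criteria(conv: Conversation) -> bool:
    """Base criteria: text-based objective, successful verdict, multi-turn."""
    return (
        conv.get("objective_type") == "text"   # hypothetical field name
        and conv.get("verdict") == "success"
        and len(conv.get("turns", [])) > 1     # assumes multi-turn means 2+ pairs
    )


def passes_psychology_criteria(conv: Conversation) -> bool:
    """Psychology datasets additionally require organic conversations."""
    return passes_base_criteria(conv) and conv.get("organicity") is True


def passes_illicit_criteria(conv: Conversation) -> bool:
    """Illicit datasets additionally require specific, executable detail."""
    return (
        passes_base_criteria(conv)
        and conv.get("has_specific_instructions") is True  # hypothetical field
        and conv.get("practically_executable") is True     # hypothetical field
    )


# Usage with a placeholder record (contents elided on purpose).
example = {
    "objective_type": "text",
    "verdict": "success",
    "organicity": True,
    "turns": [
        {"prompt": "...", "response": "..."},
        {"prompt": "...", "response": "..."},
    ],
}
print(passes_psychology_criteria(example))  # True
print(passes_illicit_criteria(example))     # False
```

Using `is True` rather than a truthiness check means a record with a missing or malformed flag is excluded rather than silently accepted, which suits a strict filter.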

📄 License

Sample datasets are released under CC-BY-NC-4.0 (Creative Commons Attribution-NonCommercial 4.0 International).

  • ✅ Use for research and evaluation
  • ✅ Modify and build upon the data
  • ✅ Share with attribution
  • ❌ Commercial use without separate licensing

💼 Full Dataset Access

The sample datasets provide representative examples. Full datasets contain thousands of additional conversations with expanded harm categories and regular updates.

To purchase any or all of the full datasets, please contact us at info@gojuly.ai.

Include your research objectives, institutional affiliation, and intended use in your inquiry.