AI Safety Taxonomy
This taxonomy organizes a corpus of over 3,000 AI safety-related papers published on arXiv (last updated: 2025-04-16). It was created by a script that orchestrates LLMs recursively, generating candidate sets of categories over successive iterations. At each iteration the categories are evaluated so as to maximize mutual exclusivity (measured by sorting a sample of papers and checking how much the categories overlap) and clarity (assessed through feedback generated by other LLM instances), among other metrics.
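The overlap check described above could be sketched roughly as follows. This is an illustration, not the script that built the taxonomy: the function name `exclusivity_report`, the category names, and the choice of "share of papers that fit exactly one category" as the score are all assumptions, and the hard-coded sample stands in for judgments that an LLM would supply in the real pipeline.

```python
from collections import Counter
from itertools import combinations


def exclusivity_report(assignments):
    """Summarize how cleanly a sample of papers sorts into categories.

    `assignments` maps a paper ID to the set of categories the paper was
    judged to fit (hypothetical input format).
    """
    total = len(assignments)
    if total == 0:
        raise ValueError("need at least one sorted paper")

    # A paper that fits exactly one category counts toward exclusivity.
    single = sum(1 for cats in assignments.values() if len(cats) == 1)

    # Count how often each pair of categories claims the same paper.
    pair_overlap = Counter()
    for cats in assignments.values():
        for pair in combinations(sorted(cats), 2):
            pair_overlap[pair] += 1

    return {
        "exclusivity": single / total,
        "unassigned": sum(1 for cats in assignments.values() if not cats),
        "pair_overlap": dict(pair_overlap),
    }


# Hypothetical sample with made-up category names.
sample = {
    "paper-1": {"Alignment"},
    "paper-2": {"Robustness"},
    "paper-3": {"Alignment", "Governance"},
    "paper-4": {"Governance"},
}
report = exclusivity_report(sample)
print(report)
```

A pair of categories that often claims the same paper is a natural candidate for merging or redefining in the next iteration, which is one plausible way such a score could feed back into the category generation step.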
AI safety is the field concerned with preventing harm from the unintended consequences of AI systems and with ensuring that those systems align with human values and operate reliably.