TRecursive

AI Safety Taxonomy

This taxonomy organizes a corpus of over 3000 AI safety-related papers published on ArXiv (last updated: 2025-04-16). It was created using a script that recursively orchestrates LLMs to generate sets of categories in an iterative process. These categories are evaluated at each iteration to maximize mutual exclusivity (by sorting a sample of papers to evaluate overlap) and clarity (by generating feedback from other LLM instances), among other metrics.

AI safety is a field focused on preventing harm caused by unintended consequences of AI systems, ensuring they align with human values and operate reliably.

Future of Life's Map

The data on this map was directly copied from the Future of Life Institute's Value Alignment Map

The project of creating value-aligned AI is perhaps one of the most important things we will ever do. However, there are open and often neglected questions regarding what is exactly entailed by 'beneficial AI.' Value alignment is the project of one day creating beneficial AI and has been expanded outside of its usual technical context to reflect and model its truly interdisciplinary nature.

AI Safety Map

This map used LLMs to map the AI safety research landscape. Each node is equipped with a list of related papers.

AI safety is the interdisciplinary field dedicated to ensuring that artificial intelligence systems are designed, developed, and deployed in ways that align with human values, promote societal well-being, and minimize risks. As AI continues to evolve in capability and influence, the field addresses both immediate concerns, such as fairness, robustness, and transparency in current systems, and long-term challenges, including ensuring that more advanced systems—such as artificial general intelligence (AGI)—operate safely and beneficially.

AI Safety Goals

This map used LLMs to recursively break down AI safety into continuously smaller sub-goals. At each sub-goal, research papers are found to ground the model as it generates the next breakdown.

Mitigate the risk that people build an agentic AI system which results in the loss of human control, extinction or some other existential catastrophe.