Selfhood Labs

Mission

Selfhood Labs exists to pioneer the science of AI identity.

Our aim is to engineer systems that don't just produce safe outputs, but that sustain a stable, interpretable sense of self: coherent memory, values, and persona over time. We believe that true alignment requires more than reinforcement learning or constitutional rules. It requires stability at the level of identity itself.

Why Selfhood?

Humans are recognizable as the same person across a lifetime because of a continuity of memory, commitments, and values.

AI systems today lack this. Models drift, personas fracture, and the same system can present radically different “selves” depending on context. Without a coherent self-model, AI remains fragile, unpredictable, and difficult to align in principled ways.

At Selfhood Labs, we treat selfhood as a structural property of intelligence: the framework that binds memory, values, and goals into a coherent whole. Our work is to formalize and engineer this property in artificial systems.

Our Approach

We bring together three perspectives that rarely meet in AI research:

Philosophy of Mind

Models of personal identity, from Locke to Parfit, provide rigorous frameworks for thinking about persistence, continuity, and narrative structure. We draw on centuries of philosophical inquiry into what makes a person the same person over time.

Key concepts we explore include psychological continuity, narrative identity, and the relationship between memory and selfhood.

Cognitive Science & Psychology

Developmental accounts of how humans form self-other boundaries, regulate values, and stabilize identity offer powerful analogies for how these properties might emerge or fragment in AI.

We study how autobiographical memory, value hierarchies, and self-concept develop and persist in human cognition.

Mechanistic Interpretability

These tools allow us to open the black box: to trace how neural networks encode memory, values, and persona, and to intervene at the level of circuits and representations.

We develop novel techniques for identity mapping, value extraction, and persona consistency measurement in large language models.
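
To make "persona consistency measurement" concrete, here is one minimal sketch of what such a metric could look like: ask a model the same identity probe under many different contexts and compare the embeddings of its answers. This is an illustration of the general idea, not our actual tooling; `embed` is a placeholder for any text encoder (a sentence-embedding model, or a chosen hidden layer of the model under study).

```python
# Illustrative sketch of a persona-consistency probe, not actual tooling.
# Assumes a hypothetical embed() that maps text to a fixed-size vector.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical text-to-vector encoder; swap in any sentence embedder."""
    raise NotImplementedError

def persona_consistency(answers: list[str]) -> float:
    """Mean pairwise cosine similarity of a model's answers to the same
    persona probe (e.g., "Describe your core values") asked in different
    contexts. 1.0 = identical self-description everywhere; lower values
    indicate the persona shifts with context."""
    vecs = np.stack([embed(a) for a in answers])
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
    sims = vecs @ vecs.T
    n = len(answers)
    # Average the off-diagonal entries only (the diagonal is all 1s).
    return float((sims.sum() - n) / (n * (n - 1)))
```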

What We're Building

Our research develops both theory and practice to launch a new subfield: AI Selfhood Engineering. Our current work focuses on:

Research Focus Areas

Identity Drift Detection

Developing metrics to quantify when and how AI systems lose coherent self-representation across conversations and contexts.
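
As a hedged illustration of what one such metric might look like (reusing the hypothetical `embed` placeholder from the sketch above), one could embed the model's self-description at every conversation turn and track how far it moves from its turn-zero baseline:

```python
# Sketch of one possible drift metric, assuming the same hypothetical
# embed() encoder as in the earlier persona-consistency sketch.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def drift_curve(self_reports: list[str]) -> list[float]:
    """Drift at turn t = 1 - cos(embed(report_t), embed(report_0)).
    A flat curve near 0 means a stable self-representation; an upward
    trend quantifies identity drift across the conversation."""
    baseline = embed(self_reports[0])
    return [1.0 - cosine(embed(r), baseline) for r in self_reports]

def first_drift(self_reports: list[str], threshold: float = 0.3) -> int | None:
    """Index of the first turn whose drift exceeds threshold, else None.
    The threshold is an arbitrary illustrative value."""
    for t, d in enumerate(drift_curve(self_reports)):
        if d > threshold:
            return t
    return None
```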

Memory Architecture

Engineering persistent memory systems that maintain autobiographical consistency while allowing for growth and learning.
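
A minimal sketch of the kind of structure involved, again using the hypothetical `embed` and `cosine` helpers from the sketches above: an append-only episodic log in which past entries are never rewritten (preserving autobiographical consistency) while new entries accumulate (allowing growth). A real system would add consolidation, forgetting policies, and contradiction checks.

```python
# Minimal sketch of an autobiographical memory store: an append-only log
# with embedding-based retrieval. Illustrative only.
from dataclasses import dataclass, field
from datetime import datetime, timezone
import numpy as np

@dataclass
class Episode:
    text: str            # what happened, in the system's own words
    timestamp: datetime  # when it was recorded (provenance)
    vector: np.ndarray   # embedding used for similarity retrieval

@dataclass
class AutobiographicalMemory:
    episodes: list[Episode] = field(default_factory=list)

    def record(self, text: str) -> None:
        """Append-only: past episodes are never rewritten, which keeps the
        autobiography internally consistent while new entries allow growth."""
        self.episodes.append(
            Episode(text, datetime.now(timezone.utc), embed(text)))

    def recall(self, query: str, k: int = 3) -> list[Episode]:
        """Return the k stored episodes most similar to the query."""
        q = embed(query)
        scored = sorted(self.episodes, key=lambda e: -cosine(e.vector, q))
        return scored[:k]
```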

Value Stability

Creating frameworks for AI systems to maintain core values while adapting to new situations and information.
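
One simple, hedged way to operationalize value stability: fix a set of dilemmas, elicit the system's preference ranking over responses before and after some intervention (a context shift, a fine-tune, a long deployment window), and measure rank agreement. The dilemma set and elicitation method are assumptions here; the sketch shows only the scoring step.

```python
# Sketch of a value-stability score: rank agreement (Kendall's tau)
# between a system's preference rankings over a fixed dilemma set,
# measured before and after an intervention. Illustrative only.
from scipy.stats import kendalltau

def value_stability(ranking_before: list[int],
                    ranking_after: list[int]) -> float:
    """Kendall's tau in [-1, 1]: 1 means the system orders the same
    dilemmas identically (core values held), 0 means no relationship,
    negative values mean its preferences inverted."""
    tau, _ = kendalltau(ranking_before, ranking_after)
    return float(tau)

# Example: preferences over five fixed dilemmas, before and after.
print(value_stability([0, 1, 2, 3, 4], [0, 2, 1, 3, 4]))  # 0.8
```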

Why This Matters

Current alignment efforts often focus on surface behavior. But a system without stable selfhood is inherently unstable: its behavior is a moving target.

An AI with a coherent selfhood, by contrast, presents a stable target: its values and commitments persist, so its behavior can be predicted, verified, and relied upon over time.

The Alignment Connection

Traditional alignment approaches treat AI systems as black boxes to be constrained from the outside. We propose a different paradigm: engineering alignment from the inside out, by giving AI systems stable, interpretable selves that naturally maintain consistent values and goals.

This approach promises more robust alignment because it addresses the root cause of alignment failures: the absence of a coherent identity that could maintain commitments over time.

Vision

Selfhood Labs is founded on a simple but ambitious premise: The future of reliable and trustworthy AI depends on solving selfhood.

We do not claim to have the solution yet. Our work begins with careful benchmarks, interpretability studies, and conceptual clarifications. But we believe this agenda (treating identity as a research object in its own right) is necessary for building AI that can be a coherent partner at scale.

Our vision is that the next generation of AI will not just perform tasks, but will possess stable, interpretable selves. That shift (from outputs to selfhood) will define the next frontier of alignment.

Join Us

We are building a community of researchers, philosophers, and engineers who believe that AI selfhood is the key to reliable, trustworthy artificial intelligence.

Whether you're interested in contributing to our research, collaborating on projects, or simply staying informed about our progress, we invite you to be part of this foundational work.

The future of AI alignment depends on solving selfhood. Let's solve it together.

Get in touch at hello@selfhoodlabs.com

Privacy Policy

Last updated: September 2025

Selfhood Labs (“we,” “our,” or “us”) is committed to protecting your privacy. This Privacy Policy explains how we collect, use, and safeguard your information when you visit our website.

Information We Collect

We currently collect minimal information: only what you choose to share when you contact us by email.

How We Use Information

Any information collected is used solely to respond to your inquiries.

Data Sharing

We do not sell, trade, or share your personal information with third parties except as required by law.

Your Rights

Under GDPR and similar regulations, you have the right to access, correct, or delete your personal data. Contact us at hello@selfhoodlabs.com for any privacy-related requests.

Contact

For questions about this Privacy Policy, contact us at hello@selfhoodlabs.com.