- Ben's Bites
What if normal people define the rules for AI?
Anthropic and the Collective Intelligence Project ran a public input process to create an AI constitution. They discovered areas of agreement and disagreement with their in-house constitution.
What's going on here?
The public input resulted in a moderately different constitution for AI than Anthropic's internal one.
What does this mean?
There was about 50% overlap between the public and Anthropic constitutions. The public constitution focused more on promoting desired behaviour rather than avoiding undesired behaviour. Some public statements were excluded due to a lack of consensus or being problematic.
A model trained on the public-sourced constitution showed less bias in certain areas than one trained on Anthropic's own. The two models held similar political opinions, and there was no loss in performance.
Why should I care?
This work on sourcing the rules from the public is a great step toward building trust in AI models and what they generate. Giving more people a sense of belonging and control over the models could increase adoption as well. At the same time, the process still involves many judgment calls from Anthropic, such as participant selection, platform, seed statements, moderation, and more.