Discussion about this post

A. Jacobs

What this makes clear is that alignment isn’t just a technical problem; it’s a semantic one. You can’t preserve semantic fidelity to a community’s values with a top-down, single-culture definition of what is acceptable. Meaning drifts the moment a small group tries to encode it for everyone else. Community Notes worked because it distributed interpretation: many perspectives corrected for each other, keeping intent and context aligned.

David Hoze

You're right that no small group can determine what alignment means for everyone, and Taiwan's citizen-deliberation model is genuinely interesting.

But deliberation only works if the participants have something to draw on beyond their individual preferences. Citizen assemblies that surface preferences without a framework for evaluating them reproduce the same problem at a larger scale: whose preferences win? The tradition I work from has a 2,000-year-old answer: you need both broad deliberation (machloket l'shem shamayim, disagreement for the sake of heaven) and a mechanism for binding decision. The Sanhedrin is neither purely top-down nor purely bottom-up. It pairs bottom-up deliberation with ruling authority, and it has a safety valve: a unanimous ruling is automatically suspect, because unanimity signals systemic failure rather than truth. AI governance needs deliberation. It also needs the courage to decide.
