Issue #237: Full List
6 November, 2022
# Epistemic
K-types vs T-types — what priors do you have? // strawberry calm, 8 min
"Normal" is the equilibrium state of past optimization processes // Alex_Altair, 5 min
Gandalf or Saruman? A Soldier in Scout's Clothing // AllAmericanBreakfast, 5 min
Remember to translate your thoughts back again // brook, 3 min
Sequence Reread: Fake Beliefs [plus sequence spotlight meta] // Raemon, 1 min
Sanity-checking in an age of hyperbole // ciprian-elliu-ivanof, 1 min
# AI
Instead of technical research, more people should focus on buying time // akash-wasil, 17 min
Takeaways from a survey on AI alignment resources // DanielFilan, 6 min
Superintelligent AI is necessary for an amazing future, but far from sufficient // So8res, 49 min
Caution when interpreting Deepmind's In-context RL paper // samuel-marks, 5 min
Clarifying AI X-risk // zkenton, 5 min
All AGI Safety questions welcome (especially basic ones) [~monthly thread] // robert-miles, 3 min
Real-Time Research Recording: Can a Transformer Re-Derive Positional Info? // neel-nanda-1, 1 min
publishing alignment research and infohazards // carado-1, 1 min
A Mystery About High Dimensional Concept Encoding // Fabien, 8 min
Threat Model Literature Review // zkenton, 29 min
"Cars and Elephants": a handwavy argument/analogy against mechanistic interpretability // capybaralet, 2 min
The Slippery Slope from DALLE-2 to Deepfake Anarchy // scasper, 13 min
AI X-risk >35% mostly based on a recent peer-reviewed argument // cocoa, 56 min
love, not competition // carado-1, 1 min
Adversarial Policies Beat Professional-Level Go AIs // sanxiyn, 1 min
Mind is uncountable // Filip Sondej
a casual intro to AI doom and alignment // carado-1, 4 min
Ethan Caballero on Broken Neural Scaling Laws, Deception, and Recursive Self Improvement // mtrazzi, 6 min
For ELK, truth is mostly a distraction // ctrout, 25 min
Mechanistic Interpretability as Reverse Engineering (follow-up to "cars and elephants") // capybaralet, 1 min
Why do we post our AI safety plans on the Internet? // Peter S. Park, 13 min
Embedding safety in ML development // zeshen, 20 min
A newcomer’s guide to the technical AI safety field // zeshen, 11 min
AI as a Civilizational Risk Part 3/6: Anti-economy and Signal Pollution // PashaKamyshev, 16 min
Recommend HAIST resources for assessing the value of RLHF-related alignment research // samuel-marks, 3 min
Toy Models and Tegum Products // adam-jermyn, 4 min
Auditing games for high-level interpretability // paul-colognese, 9 min
Are alignment researchers devoting enough time to improving their research capacity? // Carson Jones, 3 min
AI as a Civilizational Risk Part 5/6: Relationship between C-risk and X-risk // PashaKamyshev, 8 min
AI Safety Needs Great Product Builders // goodgravy
AGI and the future: Is a future with AGI and humans alive evidence that AGI is not a threat to our existence? // LetUsTalk, 1 min
When can a mimic surprise you? Why generative models handle seemingly ill-posed problems // david-johnston, 21 min
Can we predict the abilities of future AI? MLAISU W44 // esben-kran, 3 min
What sorts of systems can be deceptive? // inwaves, 8 min
AI as a Civilizational Risk Part 4/6: Bioweapons and Philosophy of Modification // PashaKamyshev, 9 min
AI as a Civilizational Risk Part 2/6: Behavioral Modification // PashaKamyshev, 12 min
Instrumental ignoring AI, Dumb but not useless. // donald-hobson, 2 min
My summary of “Pragmatic AI Safety” // ea-1, 5 min
Interpreting systems as solving POMDPs: a step towards a formal understanding of agency [paper link] // lahwran, 1 min
AI as a Civilizational Risk Part 6/6: What can be done // PashaKamyshev, 5 min
Don't you think RLHF solves outer alignment? // charbel-raphael-segerie, 1 min
On the correspondence between AI-misalignment and cognitive dissonance using a behavioral economics model // Stijn Bruers, 7 min
Announcing: What Future World? - Growing the AI Governance Community // DavidCorfield
Which Issues in Conceptual Alignment have been Formalised or Observed (or not)? // ojorgensen, 1 min
My (naive) take on Risks from Learned Optimization // artkpv, 6 min
WFW?: Opportunity and Theory of Impact // DavidCorfield
# Meta-ethics
Average utilitarianism is non-local // yair-halberstadt, 1 min
How much should we care about non-human animals? // bokov-1, 2 min
# Longevity
Follow up to medical miracle // pktechgirl, 6 min
Why is fiber good for you? // braces, 1 min
# Decision theory
Humans do acausal coordination all the time // adam-jermyn, 3 min
Information Markets // eva_, 15 min
Unpricable Information and Certificate Hell // eva_, 7 min
Is there a good way to award a fixed prize in a prediction contest? // jchan, 1 min
Information Markets 2: Optimally Shaped Reward Bets // eva_, 3 min
Land Speculators Made U.S. // Robin Hanson, 4 min
# Books
[Book] Interpretable Machine Learning: A Guide for Making Black Box Models Explainable // esben-kran, 1 min
# EA
Mildly Against Donor Lotteries // jkaufman, 3 min
# Community
The Rational Utilitarian Love Movement (A Historical Retrospective) // caleb-biddulph
The circular problem of epistemic irresponsibility // Roman Leventov, 10 min
A new place to discuss cognitive science, ethics and human alignment // Hominid Dan
# Culture war
Open Letter Against Reckless Nuclear Escalation and Use // MaxTegmark, 1 min
Housing and Transit Thoughts #1 // Zvi, 19 min
Quickly refactoring the U.S. Constitution // lc, 8 min
Could a Supreme Court suit work to solve NEPA problems? // ChristianKl, 1 min
Highlights From The Comments On My California Ballot // Scott Alexander, 11 min
My California Ballot 2022 // Scott Alexander, 17 min
Moderation Is Different From Censorship // Scott Alexander, 3 min
# Misc
Why Aren't There More Schelling Holidays? // johnswentworth, 1 min
Far-UVC Light Update: No, LEDs are not around the corner (tweetstorm) // Davidmanheim, 4 min
Should we “go against nature”? // jasoncrawford, 2 min
Conversations on Alcohol Consumption // jorge-velez, 10 min
Sacred Pains // Robin Hanson, 1 min
Highlights From The Comments On Jhanas // Scott Alexander, 26 min
# Podcasts
Me (Steve Byrnes) on the “Brain Inspired” podcast // steve2152, 1 min
EP 168 Nate Hagens on Collective Futures // The Jim Rutt Show, 86 min
# Videos of the week
[Video] How having Fast Fourier Transforms sooner could have helped with Nuclear Disarmament - Veritasium // MakoYass