When Freeman Dyson originally said "Dyson sphere" I believe he had a Dyson swarm in mind, so it strikes me as oddly unfair to Freeman Dyson to treat Dyson "spheres" and "swarms" as disjoint. But "swarms" might be better language, just to avoid the misconception that a "Dyson sphere" is supposed to be a single solid structure.
Quoting from a follow-up conversation I had with Buck after this exchange:
__________________________________________________________
Buck: So following up on your Will post: It sounds like you genuinely didn't understand that Will is worried about AI takeover risk and thinks we should try to avert it, including by regulation. Is that right?
I'm just so confused here. I thought your description of his views was a ridiculous straw man, and at first I thought you were just being some combination of dishonest and rhetorically sloppy, but now my guess i...
(I also would have felt dramatically more positive about Will's review if he'd kept everything else unchanged but just added the sentence "I definitely think it will be extremely valuable to have the option to slow down AI development in the future." anywhere in his review. XP If he agrees with that sentence, anyway!)
I definitely think it will be extremely valuable to have the option to slow down AI development in the future.
What are the mechanisms you find promising for causing this to occur? If we all agree on "it will be extremely valuable to have the option to slow down AI development in the future", then I feel silly for arguing about other things; it seems like the first priority should be to talk about ways to achieve that shared goal, whatever else we disagree about.
(Unless there's a fast/easy way to resolve those disagreements, of course.)
banning anyone from having more than 8 GPUs
I assume you know this, but I'll say it out loud since I expect this to be a common misunderstanding: this is a straw man. The book suggests "[more than] eight of the most advanced GPUs from 2024" as a possible threshold at which international monitoring efforts come online and the world starts checking that you aren't using those GPUs to push the world closer to superintelligence, if it's possible to do so.
"More than 8 GPUs" is also potentially confusing because people are likely to anchor to consumer hardware. From... (read more)
I wasn't exclusively looking at that line; I was also assuming that if Will liked some of the book's core policy proposals but disliked others, then he probably wouldn't have expressed such a strong blanket rejection. And I was looking at Will's proposal here:
[IABIED skips over] what I see as the crucial period, where we move from the human-ish range to strong superintelligence[1]. This is crucial because it's both the period where we can harness potentially vast quantities of AI labour to help us with the alignment of the next generation of models, and...
yeah, I left off this part but Nate also said
[people having trouble separating them] does maybe enhance my sense that the whole community is desperately lacking in nate!courage, if so many people have such trouble distinguishing between "try naming your real worry" and "try being brazen/rude". (tho ofc part of the phenomenon is me being bad at anticipating reader confusions; the illusion of transparency continues to be a doozy.)
Nate messaged me a thing in chat and I found it helpful and asked if I could copy it over:
fwiw a thing that people seem to me to be consistently missing is the distinction between what i was trying to talk about, namely the advice "have you tried saying what you actually think is the important problem, plainly, even once? ideally without broadcasting signals of how it's a socially shameful belief to hold?", and the alternative advice that i was not advocating, namely "have you considered speaking to people in a way that might be described as 'brazen' or 'r...
FWIW, as someone who's been working pretty closely with Nate for the past ten years (and as someone whose preferred conversational dynamic is pretty warm-and-squishy), I actively enjoy working with the guy and feel positive about our interactions.
(Considering how little cherry-picking they did.)
From my perspective, FWIW, the endorsements we got would have been surprising even if they had been maximally cherry-picked. You usually just can't find cherries like those.
(That was indeed my first thought when Bernanke said he liked the book; no dice, though.)
Yep. And equally, the blurbs would be a lot less effective if the title were more timid and less stark.
Hearing that a wide range of respected figures endorse a book called If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All is a potential "holy shit" moment. If the same figures were endorsing a book with a vaguely inoffensive title like Smarter Than Us or The AI Crucible, it would spark a lot less interest (and concern).
Yeah, I think people usually ignore blurbs, but sometimes blurbs are helpful. I think strong blurbs are unusually likely to be helpful when your book has a title like If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All.
Aside from the usual suspects (people like Tegmark), we mostly sent the book to people following the heuristic "would an endorsement from this person be helpful?", much more so than "do we know that this person would like the book?". If you'd asked me individually about Church, Schneier, Bernanke, Shanahan, or Spaulding in advance, I'd have put most of my probability on "this person won't be persuaded by the book (if they read it at all) and will come away strongly disagreeing and not wanting to endorse". They seemed worth sharing the book with anyway, and...
Now, how much is that evidence about the correctness of the book? Extremely little!
It might not be much evidence for LWers, who are already steeped in arguments and evidence about AI risk. It should be a lot of evidence for people newer to this topic who start with a skeptical prior. Most books making extreme-sounding (conditional) claims about the future don't have endorsements from Nobel-winning economists, former White House officials, retired generals, computer security experts, etc. on the back cover.
We're still working out some details on the preorder events; we'll have an announcement with more info on LessWrong, the MIRI Newsletter, and our Twitter in the next few weeks.
You don't have to do anything special to get invited to preorder-only events. :) In the case of Nate's LessOnline Q&A, it was a relatively small in-person event for LessOnline attendees who had preordered the book; the main events we have planned for the future will be larger and online, so more people can participate without needing to be in the Bay Area.
(Though we're considerin...
Hopefully a German pre-order from a local bookstore will make a difference.
Yep, this counts! :)
It's a bit complicated, but after looking into it and weighing it against other factors, MIRI and our publisher both think that the best option is for people to just buy it when they think to buy it -- the sooner, the better.
Whether you're buying on Amazon or elsewhere, on net I think it's a fair bit better to buy now than to wait.
Yeah, I think the book is going to be (by a very large margin) the best resource in the world for this sort of use case. (Though I'm potentially biased as a MIRI employee.) We're not delaying; this is basically as fast as the publishing industry goes, and we expected the audience to be a lot smaller if we self-published. (A more typical timeline would have put the book another 3-20 months out.)
If Eliezer and Nate could release it sooner than September while still gaining the benefits of working with a top publishing house, doing a conventional media tour, ...
In my experience, "normal" folks are often surprisingly open to these arguments, and I think the book is remarkably normal-person-friendly given its topic. I'd mainly recommend telling your friends what you actually think, and using practice to get better at it.
Context: One of the biggest bottlenecks on the world surviving, IMO, is the amount (and quality!) of society-wide discourse about ASI. As a consequence, I already thought one of the most useful things most people can do nowadays is to just raise the alarm with more people, and raise the bar on the q...
There's a professional Russian translator lined up for the book already, though we may need volunteer help with translating the online supplements. I'll keep you (and others who have offered) in mind for that -- thanks, Tapatakt. :)
Yep! This is the first time I'm hearing the claim that hardcover matters more for bestseller lists; but I do believe hardcover preorders matter a bit more than audiobook preorders (which matter a bit more than ebook preorders). I was assuming the mechanism for this is that they provide different amounts of evidence about print demand, and thereby influence the print run a bit differently. AFAIK all the options are solidly great, though; mostly I'd pick the one(s) that you actually want the most.
I feel pretty frustrated at how rarely people actually bet or make quantitative predictions about existential risk from AI. EG my recent attempt to operationalize a bet with Nate went nowhere. Paul trying to get Eliezer to bet during the MIRI dialogues also went nowhere, or barely anywhere--I think they ended up making some random bet about how long an IMO challenge would take to be solved by AI. (feels pretty weak and unrelated to me. lame. but huge props to Paul for being so ready to bet, that made me take him a lot more seriously.)
This paragrap...
If I was misreading the blog post at the time, how come it seems like almost no one at the time explicitly predicted that these particular problems would be trivial for systems at or below human-level intelligence?!?
Quoting the abstract of MIRI's "The Value Learning Problem" paper (emphasis added):
Autonomous AI systems' programmed goals can easily fall short of programmers' intentions. Even a machine intelligent enough to understand its designers' intentions would not necessarily act as intended. We discuss early ideas on how one might design smarte...
But the benefit of a Pause is that you use the extra time to do something in particular. Why wouldn't you want to fiscally sponsor research on problems that you think need to be solved for the future of Earth-originating intelligent life to go well?
MIRI still sponsors some alignment research, and I expect we'll sponsor more alignment research directions in the future. I'd say MIRI leadership didn't have enough aggregate hope in Agent Foundations in particular to want to keep supporting it ourselves (though I consider its existence net-positive).
My mo...
one positive feature it does have, it proposes to rely on a multitude of "limited weakly-superhuman artificial alignment researchers" and makes a reasonable case that those can be obtained in a form factor which is alignable and controllable.
I don't find this convincing. I think the target "dumb enough to be safe, honest, trustworthy, relatively non-agentic, etc., but smart enough to be super helpful for alignment" is narrow (or just nonexistent, using the methods we're likely to have on hand).
Even if this exists, verification seems extraordinarily difficult: how do we know that the system is being honest? Separately, how do we verify that its solutions are correct? Checking answers is sometimes easier than generating them, but only to a limited degree, and alignment seems like a case where ch...
As a start, you can prohibit sufficiently large training runs. This isn't a necessary-and-sufficient condition, and doesn't necessarily solve the problem on its own, and there's room for debate about how risk changes as a function of training resources. But it's a place to start, when the field is mostly flying blind about where the risks arise; and choosing a relatively conservative threshold makes obvious sense when failing to leave enough safety buffer means human extinction. (And when algorithmic progress is likely to reduce the minimum dangerous train...
Alternatively, they either don't buy the perils or believe there's a chance the other chance may not?
If they "don't buy the perils", and the perils are real, then Leopold's scenario is falsified and we shouldn't be pushing for the USG to build ASI.
If there are no perils at all, then sure, Leopold's scenario and mine are both false. I didn't mean to imply that our two views are the only options.
Separately, Leopold's model of "what are the dangers?" is different from mine. But I don't think the dangers Leopold is worried about are dramatically easier to und...
Why? 95% risk of doom isn't certainty, but seems obviously more than sufficient.
For that matter, why would the USG want to build AGI if they considered it a coinflip whether this will kill everyone or not? The USG could choose the coinflip, or it could choose to try to prevent China from putting the world at risk without creating that risk itself. "Sit back and watch other countries build doomsday weapons" and "build doomsday weapons yourself" are not the only two options.
Leopold's scenario requires that the USG come to deeply understand all the perils and details of AGI and ASI (since they otherwise don't have a hope of building and aligning a superintelligence), and then choose to gamble its hegemony, its very existence, and the lives of all its citizens on a half-baked mad-science initiative, when it could simply work with its allies to block the tech's development and maintain the status quo at minimal risk.
Success in this scenario requires a weird combination of USG prescience with self-destructiveness: enough...
Responding to Matt Reardon's point on the EA Forum:
Leopold's implicit response as I see it:
- Convincing all stakeholders of high p(doom) such that they take decisive, coordinated action is wildly improbable ("step 1: get everyone to agree with me" is the foundation of many terrible plans and almost no good ones)
- Still improbable, but less wildly, is the idea that we can steer institutions towards sensitivity to risk on the margin and that those institutions can position themselves to solve the technical and other challenges ahead
Maybe the key insight is that
As is typical for Twitter, we also signal-boosted a lot of other people's takes. Some non-MIRI people whose social media takes I've recently liked include Wei Dai, Daniel Kokotajlo, Jeffrey Ladish, Patrick McKenzie, Zvi Mowshowitz, Kelsey Piper, and Liron Shapira.
The stuff I've been tweeting doesn't constitute an official MIRI statement -- e.g., I don't usually run these tweets by other MIRI folks, and I'm not assuming everyone at MIRI agrees with me or would phrase things the same way. That said, some recent comments and questions from me and Eliezer:
Every protest I've witnessed seemed to be designed to annoy and alienate its witnesses, making it as clear as possible that there was no way to talk to these people, that their minds were on rails. I think most people recognize that as cult shit and are alienated by that.
In the last year, I've seen a Twitter video of an AI risk protest (I think possibly in continental Europe?) that struck me as extremely good: calm, thoughtful, accessible, punchy, and sensible-sounding statements and interview answers. If I find the link again, I'll add it here as a model...
Could we talk about a specific expert you have in mind, who thinks this is a bad strategy in this particular case?
AI risk is a pretty weird case, in a number of ways: it's highly counter-intuitive, not particularly politically polarized / entrenched, seems to require unprecedentedly fast and aggressive action by multiple countries, is almost maximally high-stakes, etc. "Be careful what you say, try to look normal, and slowly accumulate political capital and connections in the hope of swaying policymakers long-term" isn't an unconditionally good strategy, i...
Two things:
Note that "everyone will be killed (or worse)" is a different claim from "everyone will be killed"! (And see Oliver's point that Ryan isn't talking about mistreated brain scans.)
Some of the other things you suggest, like future systems keeping humans physically alive, do not seem plausible to me.
I agree with Gretta here, and I think this is a crux. If MIRI folks thought it likely that AI will leave a few humans biologically alive (as opposed to information-theoretically revivable), I don't think we'd be comfortable saying "AI is going to kill everyone". (I encourage other MIRI folks to chime in if they disagree with me about the counterfactual.)
I also personally have maybe half my probability mass on "the AI just doesn't stor...
FWIW I do think "don't trust this guy" is warranted; I don't know that he's malicious, but I think he's just exceptionally incompetent relative to the average tech reporter you're likely to see stories from.
Like, in 2018 Metz wrote a full-length article on smarter-than-human AI that included the following frankly incredible sentence:
During a recent Tesla earnings call, Mr. Musk, who has struggled with questions about his company's financial losses and concerns about the quality of its vehicles, chastised the news media for not focusing on the deaths that a...
FWIW, Cade Metz was reaching out to MIRI and some other folks in the x-risk space back in January 2020, and I went to read some of his articles and came to the conclusion that he's one of the least competent journalists -- like, most likely to misunderstand his beat and emit obvious howlers -- that I'd ever encountered. I told folks as much at the time, and advised against talking to him just on the basis that a lot of his journalism is comically bad and you'll risk looking foolish if you tap him.
This was six months before Metz caused SSC to shu...
Sounds like a lot of political alliances! (And "these two political actors are aligned" is arguably an even weaker condition than "these two political actors are allies".)
At the end of the day, of course, all of these analogies are going to be flawed. AI is genuinely a different beast.
It's pretty sad to call all of these end states you describe "alignment", as alignment is an extremely natural word for "actually terminally has good intentions".
Aren't there a lot of clearer words for this? "Well-intentioned", "nice", "benevolent", etc.
(And a lot of terms, like "value loading" and "value learning", that are pointing at the research project of getting good intentions into the AI.)
To my ear, "aligned person" sounds less like "this person wishes the best for me", and more like "this person will behave in the right ways".
If I hear that Russia an...
"Should" in order to achieve a certain end? To meet some criterion? To boost a term in your utility function?
In the OP: "Should" in order to have more accurate beliefs/expectations. E.g., I should anticipate (with high probability) that the Sun will rise tomorrow in my part of the world, rather than it remaining night.
Why would the laws of physics conspire to vindicate a random human intuition that arose for unrelated reasons?
We do agree that the intuition arose for unrelated reasons, right? There's nothing in our evolutionary history, and no empirical observation, that causally connects the mechanism you're positing and the widespread human hunch "you can't copy me".
If the intuition is right, we agree that it's only right by coincidence. So why are we desperately searching for ways to try to make the intuition right?
It also doesn't force us to believe that a bunch of w...
You're missing the bigger picture and pattern-matching in the wrong direction. I am not saying the above because I have a need to preserve my "soul" due to misguided intuitions. On the contrary, the reason for my disagreement is that I believe you are not staring into the abyss of physicalism hard enough. When I said I'm agnostic in my previous comment, I said it because physics and empiricism lead me to consider reality as more "unfamiliar" than you do (assuming that my model of your beliefs is accurate). From my perspective, your post and your conc...
I'm interpreting "realize" colloquially, as in, "be aware of". I don't think the people discussed in the post just haven't had it occur to them that pre-singularity wealth doesn't matter because a win singularity society very likely wouldn't care much about it. Instead someone might, for example...
Aren't there a lot of clearer words for this? "Well-intentioned", "nice", "benevolent", etc.
Fair enough. I guess it just seems somewhat incongruous to say: "Oh yes, the AI is aligned. Of course it might desperately crave murdering all of us in its heart (we certainly haven't ruled this out with our current approach), but it is aligned because we've made it so that it wouldn't get away with it if it tried."
I think you've substantially misunderstood what Will is talking about. He's not making a recommendation that people rush through things. He's noting what he believes (and I mostly agree) to be huge weaknesses in the book's argument.
Similarly, he's not saying labs have alignment in the bag. He's just noting holes in the book's arguments that extreme catastrophic misalignment is overwhelmingly likely.
All of this together makes me extremely confused if his real view is basically just "I agree with most of MIRI's policy proposals but I think we shouldn't rush...