Posts

Should we maximize the Geometric Expectation of Utility? 2024-04-17T10:37:24.759Z
Nash Bargaining between Subagents doesn't solve the Shutdown Problem 2024-01-25T10:47:11.877Z
A Pedagogical Guide to Corrigibility 2024-01-17T11:45:06.075Z
A Land Tax For Britain 2024-01-06T15:52:14.942Z
Will 2024 be very hot? Should we be worried? 2023-12-29T11:22:50.200Z
A Question about Corrigibility (2015) 2023-11-27T12:05:51.659Z
UK Government publishes "Frontier AI: capabilities and risks" Discussion Paper 2023-10-26T13:55:16.841Z
Optimized for Something other than Winning or: How Cricket Resists Moloch and Goodhart's Law 2023-07-05T12:33:07.166Z
Alignment as Function Fitting 2023-05-06T11:38:04.245Z
Is Constructor Theory a useful tool for AI alignment? 2022-11-29T12:35:07.462Z

Comments

Comment by A.H. (AlfredHarwood) on Should we maximize the Geometric Expectation of Utility? · 2024-04-19T20:50:44.198Z · LW · GW

Well, you can't have some states as "avoid at all costs" and others as "achieve at all costs", because having them in the same lottery leads to nonsense, no matter what averaging you use. And allowing only one of the two seems arbitrary. So it seems cleanest to disallow both.

Fine. But the purpose of exploring different averaging methods is to see whether doing so expands the range of behaviour we can describe. The point is that using arithmetic averaging is a choice which limits the kind of behaviour we can get. Maybe we want to describe behaviours which can't be described under expected utility. Having an 'avoid at all costs' state is one such behaviour: it has a natural description using non-arithmetic averaging, but can't be described in more typical VNM terms.

If your position is 'I would never want to describe normative ethics using anything other than expected utility' then that's fine, but some people (like me) are interested in looking at what alternatives to expected utility might be. That's why I wrote this post. As it stands, I didn't find geometric averaging very satisfactory (as I wrote in the post), but I think things like this are worth exploring.

But geometric averaging wouldn't let you do that either, or am I missing something?

You are right. Geometric averaging on its own doesn't allow violations of independence. But some other protocol for deciding over lotteries does. It's described more in the Garrabrant post linked above.

Comment by A.H. (AlfredHarwood) on Should we maximize the Geometric Expectation of Utility? · 2024-04-19T20:02:12.990Z · LW · GW

(apologies for taking a couple of days to respond, work has been busy)

I think your robot example nicely demonstrates the difference between our intuitions. As cubefox pointed out in another comment, what representation you want to use depends on what you take as basic.

There are certain types of preferences/behaviours which cannot be expressed using arithmetic averaging. These are the ones which violate VNM, and I think violating VNM axioms isn't totally crazy. I think it's worth exploring these VNM-violating preferences and seeing what they look like when more fleshed out. That's what I tried to do in this post.

If I wanted a robot that violated one of the VNM axioms, then I wouldn't be able to describe it by 'nailing down the averaging method to use ordinary arithmetic averaging and assigning goodness values'. For example, if there were certain states of the world which I wanted to avoid at all costs (and thus violate the continuity axiom), I could assign zero utility to them and use geometric averaging. I couldn't do this with arithmetic averaging and any finite utilities [1].

A better example is Scott Garrabrant's argument regarding abandoning the VNM axiom of independence. If I wanted to program a robot which sometimes preferred lotteries to any definite outcome, I wouldn't be able to program the robot using arithmetic averaging over goodness values.

I think that these examples show that there is at least some independence between averaging methods and utility/goodness. 

  1. ^

    (ok, I guess you could assign 'negative infinity' utility to those states if you wanted. But once you're doing stuff like that, it seems to me that geometric averaging is a much more intuitive way to describe these preferences. )
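To make the zero-utility point concrete, here is a minimal sketch in Python. The lotteries, the utility numbers and the function names are all made up for illustration, and it assumes utilities are non-negative. It only shows that a state with zero utility and non-zero probability drags the geometric expectation of the whole lottery to zero, while the arithmetic expectation barely notices it.

```python
import math


def arithmetic_expectation(lottery):
    """Probability-weighted arithmetic mean of utilities."""
    return sum(p * u for p, u in lottery)


def geometric_expectation(lottery):
    """Probability-weighted geometric mean: the product of u ** p.

    Assumes non-negative utilities. Any outcome with utility 0 and
    non-zero probability makes the whole product 0.
    """
    return math.prod(u ** p for p, u in lottery)


# Hypothetical lotteries, written as (probability, utility) pairs.
safe = [(1.0, 1.0)]                      # a certain, mediocre outcome
risky = [(0.999, 1000.0), (0.001, 0.0)]  # small chance of the 'avoid' state

print("arithmetic:", arithmetic_expectation(safe), arithmetic_expectation(risky))
print("geometric: ", geometric_expectation(safe), geometric_expectation(risky))
```

Under arithmetic averaging the risky lottery wins comfortably; under geometric averaging it is never preferred to the safe one, however small the probability of the zero-utility state.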

Comment by A.H. (AlfredHarwood) on Should we maximize the Geometric Expectation of Utility? · 2024-04-18T09:16:27.380Z · LW · GW

Thanks for pointing this out, I missed a word. I have added it now.

Comment by A.H. (AlfredHarwood) on Should we maximize the Geometric Expectation of Utility? · 2024-04-17T11:52:51.948Z · LW · GW

Without wishing to be facetious: how much (if any) of the post did you read?  If you disagree with me, that's fine, but I feel like I'm answering questions which I already addressed in the post!

Are you arguing that we ought to (1) assign some "goodness" values to outcomes, and then (2) maximize the geometric expectation of "goodness" resulting from our actions?

I'm not arguing that we ought to maximize the geometric expectation of "goodness" resulting from our actions. I'm exploring what it might look like if we did. In the conclusion (and indeed, many other parts of the post), I'm pretty ambivalent.

But then wouldn't any argument for (2) depend on the details of how (1) is done? For example, if "goodnesses" were logarithmic in the first place, then wouldn't you want to use arithmetic averaging?

I don't think so. I think you could have a preference ordering over 'certain' world states and then you are still left with choosing a method for deciding between lotteries where the outcome is uncertain. I explain this position in the section titled 'Geometric Expectation ≠ Logarithmic Utility'.

Is there some description of how we should assign goodnesses in (1) without a kind of firm ground that VNM gives?

This is what philosophers of normative ethics do! People disagree on how exactly to do it, but that doesn't stop them from trying! My post tries to be agnostic as to what exactly it is we care about and how we assign utility to different world states, since I'm focusing on the difference between averaging methods.

Comment by A.H. (AlfredHarwood) on Should we maximize the Geometric Expectation of Utility? · 2024-04-17T11:02:43.993Z · LW · GW

The word 'utility' can be used in two different ways: normative and descriptive. 

You are describing 'utility' in the descriptive sense. I am using it in the normative sense. These are explained in the first paragraph of the Wikipedia page for 'utility'.

As I explained in the opening paragraph, I'm using the word 'utility' to mean the goodness/desirability/value of an outcome. This is normative: if an outcome is 'good' then there is the implication that you ought to pursue it.

Comment by AlfredHarwood on [deleted post] 2024-04-05T13:12:08.016Z

This looks like an exact duplicate of something you posted a month ago.

Is this intentional?

Comment by A.H. (AlfredHarwood) on A Pedagogical Guide to Corrigibility · 2024-01-18T22:23:26.579Z · LW · GW

Thanks for the comment. Naively, I agree that this sounds like a good idea, but I need to know more about it.

Do you know if anyone has explicitly written down the value learning solution to the corrigibility problem and treated it a bit more rigorously?

Comment by A.H. (AlfredHarwood) on A Land Tax For Britain · 2024-01-12T11:05:07.995Z · LW · GW

Thanks for this comment. It explains your view very clearly and I understand what you are getting at now.

I think it's a fair criticism. I've added footnotes within the post, linking people to your comment.

Comment by A.H. (AlfredHarwood) on A Land Tax For Britain · 2024-01-07T09:55:39.572Z · LW · GW

I still think it's a problem that this argument rests on the idea that investors are irrationally not renting land they own, but you don't provide any evidence for that.

I disagree. Firstly, even if they were renting out their land, this would still be bad, for reasons described in the article (landlords extract land rent without doing anything productive etc.)

The section of the post which argues about empty homes rests on the fact that there are empty homes and a land tax would reduce them. I then provide evidence that there are, indeed, a significant number of empty homes in the UK. I do not speculate about the rationality/irrationality of the people holding them because it is irrelevant to the argument. Do you disagree on the 700,000 figure? 

Is your view something like 'I find it hard to believe that people would leave houses empty because they are leaving money on the table. Therefore I disbelieve the 700,000 empty homes figure.'? If so, I guess I'm not super interested in disputing the government figures.

Or is your view something like 'The argument in the post hinges on the fact that people are irrationally leaving homes empty and the author of the post needs to explain why they aren't behaving rationally in order to make the argument work.'? If so, hopefully I explained why the argument rests on the fact that empty homes exist, not the rationality/irrationality of the people holding them.

For what it's worth, I find it pretty easy to believe that people are leaving homes empty. People often behave 'irrationally' when it comes to money: a lot of people gamble and most people have their savings in a low-interest account. I know at least two middle class families who own a flat in a city where they don't live. They visit it maybe once a year and occasionally let friends visit. They aren't super interested in squeezing it for every penny it's worth, they don't want to sell it for sentimental reasons, and they know the price will go up, so they keep it as is. They think that renting it out would cause them too much stress, they don't feel that they need any more money, and they enjoy visiting it, so they don't rent it out. Is this irrational? Maybe from the point of view of optimizing their income, but they are just optimizing other aspects of their life.

Comment by A.H. (AlfredHarwood) on A Land Tax For Britain · 2024-01-06T20:12:23.625Z · LW · GW

The second link is to a Scottish political campaign that doesn't claim to know way the houses are empty (at least on this page) and doesn't contain the 700,000 number in the link text (the linked political campaign claims 46,000 in Scotland and doesn't seem to say anything about the UK).

The phrase '700,000 empty homes throughout the UK' has a different link on each word: one each for England, Northern Ireland, Scotland, and Wales. If you follow the link on 700,000, you will be taken to this page which gives a figure of 676,304 empty homes in England. Add this to the Scotland, Northern Ireland and Wales figures and you get a total over 700,000. As explained in the link for English data (which makes up most of the total), the figure comes from council tax data (council taxes are paid by the owners of a property and are charged at different rates depending on whether the home is occupied or unoccupied).

Hope this helps!

Comment by A.H. (AlfredHarwood) on Will 2024 be very hot? Should we be worried? · 2024-01-06T16:49:30.273Z · LW · GW

Interesting, thanks for sharing! I hadn't heard of this.

From Wikipedia:

An El Nino during the winter of 1998 produced above-average rainfall, which enabled extensive growth of underbrush and vegetation in the state's forests. In early April, however, the rains came to an abrupt halt, and the ensuing drought lasted until July.[2] These months of continuing dry conditions saw the drought index rise to 700 (out of 800), indicating wildfire potential similar to that usually found in western states.

I would assume that the drought was also exacerbated by El Nino, but it's interesting that the main contributor is implied to be the rainfall in winter, rather than the heat the next summer.

Comment by A.H. (AlfredHarwood) on Will 2024 be very hot? Should we be worried? · 2023-12-30T11:05:01.884Z · LW · GW

Eyeballing it, doesn't it imply that while 2024 will be hotter than 2023, the difference between 2024 and 2023 will be smaller than the difference between 2023 and 2022? Because the slope of the various lines is decreasing and in no case increasing?

Yeah, that sounds right I think.

Or is the y-axis measuring YoY impact rather than impact-relative-to-some-fixed-beginning? If so then I'm confused why the global warming section looks the way it does.

I agree, I don't think that YoY interpretation makes sense. I realise now that it's not made completely clear, but I think it's impact relative to some past value. That's the only way that I can make sense of the man-made global warming section being a straight line.

Comment by A.H. (AlfredHarwood) on Will 2024 be very hot? Should we be worried? · 2023-12-29T17:46:12.898Z · LW · GW

Yeah, thanks for highlighting this. I started writing about it but realised I was out of my depth (even further out of my depth than for the rest of the post!) so I scrapped it. 

Thanks for clarifying with Robert Rohde!

I reached roughly the conclusion you did. When water vapour is injected into the troposphere (the lowest level of the atmosphere) it is quickly rained out, as you point out. However, the power of the Hunga-Tonga explosion meant that the water vapour was injected much higher, into the stratosphere (what the diagram calls the 'upper atmosphere'). For some reason, water vapour in the stratosphere doesn't move back down and get rained out as easily so it sits there. Which is why 'upper atmosphere' water vapour levels are still elevated almost two years after the explosion.

Comment by A.H. (AlfredHarwood) on Enhancing intelligence by banging your head on the wall · 2023-12-13T21:27:13.798Z · LW · GW

plenty of people are very good at math but never produce any technical writing on scientific journals

Fair enough! It's just that, unless they produce technical results, pass graduate exams or do something else tangible, it's quite hard to distinguish people who are very good at math from people who are not.

his story seems to strongly imply that his past self wouldn't have been able to pass those math classes

Obviously it's hard to tell from that interview, but he seems to suggest that the reason he didn't pass his classes was that he spent time partying, bodybuilding and 'chasing girls' rather than studying. It doesn't necessarily seem like he would have been unable to pass the classes, just unwilling to put in the work. Even after he became interested in math, he still admitted to struggling with some of the classes, but he had the willpower to put in the work to understand it.

I think that your description of it being a change in 'math attitude' is a good one. It seems like his attitude (and willingness to persevere) changed, but not necessarily his ability.

Just to be clear: I think it's super interesting that someone can have this kind of a change and it is interesting to study it! I'm just not convinced that it is a change in math ability.

Comment by A.H. (AlfredHarwood) on Enhancing intelligence by banging your head on the wall · 2023-12-13T13:27:21.471Z · LW · GW

Sorry to be a party pooper, but I find the story of Jason Padgett (the guy who 'banged his head and became a math genius') completely unconvincing. From the video that you cite, here is the 'evidence' that he is a 'math genius':

  • He tells us, with no context, 'the inner boundary of pi is f(x)=x sin(pi/x)'. Ok!
  • He makes 'math inspired' drawings (some of which admittedly are pretty cool but they're not exactly original) and sells them on his website
  • He claims that a physicist (who is not named or interviewed) saw him drawing in the mall, and, on the basis of this, suggested that he study physics.
  • He went to 'school' and studied math and physics. He says he started with basic algebra and calculus and apparently 'aced all the classes', but doesn't tell us what level he reached. Graduate? Post-graduate?
  • He was 'doing integrals with triangles instead of integrals with rectangles' 
  • He tells us 'every shape in the universe is a fractal'
  • Some fMRI scans were done on his brain which found 'he had conscious access to parts of the brain we don't normally have access to'.

As far as I can tell, he hasn't published any technical math/physics writings (peer-reviewed or otherwise). He wrote a book but as far as I can tell, this is mostly a memoir, with a bit of pop-math thrown in. From his website, this is what he's working on:

His [sic] is currently studying how all fractals arise from limits and how E=MC2 is itself a fractal.

...

His drawing of E=MC^2 is based on the structure of space time at the quantum level and is based on the concept that there is a physical limit to observation which is the Planck length and the geometry of Hawking Radiation at the quantum level and its possible connection to describing the Holographic Universe Principle. It also shows and agrees with the holographic principle that at the smallest level, the structure of space time is a fractal.

Yep, those certainly are physics-y words! Good luck to him making progress in this area!

It's suspicious to me that he doesn't have any writings on his mathematical works, so we are not able to judge what he's doing. (Even time-cube man posted his writings on the internet)

This is my summary of the story:

  • He was hit on the head and experienced some changes after this. (Very believable)
  • These changes were significant enough to be visible on fMRI scans. (Also very believable)
  • He experienced OCD-like symptoms and other personality changes after the injury (I'm not a neuroscientist, but this seems plausible)
  • He experienced seeing visual distortions and 'fractal-like' images after the injury. (Again, I'm not a neuroscientist, but this seems plausible)
  • The visual hallucinations and personality changes caused him to be more interested in fractals, geometric designs, art, and math. 
  • He took a few entry-level math classes and did ok in them.
  • The mugging took place in 2002 but apparently he hasn't produced any technical writing in the subjects of math and physics.
  • In 2015, he was running futon stores in Washington State

This all leads me to believe that he is not a 'math genius'. I am agnostic about whether he is delusional about his abilities or whether he is a con-man.

Comment by A.H. (AlfredHarwood) on Enhancing intelligence by banging your head on the wall · 2023-12-13T12:19:45.222Z · LW · GW

I don't find it convincing that what you experienced has any relation to sudden savant syndrome. It sounds like you had a waking dream where you believed you could play the piano.

You did not actually play the piano and produce music though, right?

I have had dreams where I have believed I could do all kinds of things (play the guitar, lift heavy weights, fly etc.), but they didn't overflow in any way to real life. (I've even had dreams where I've thought to myself 'I know that I am dreaming, but this is definitely going to work when I wake up')

If I ask you to imagine a beautiful painting of a mountain, you could probably conjure up a fairly vivid mental image of one. But if I then gave you brushes and paints and asked you to recreate the picture on canvas, you would probably struggle, unless you were already an experienced painter. In dreams, the distinction between imagining and doing doesn't exist so strongly. If you can dream/imagine a beautiful painting, you can also dream/imagine putting a paintbrush in your hands, waving it over a canvas and producing the painting. In a dream, these experiences are equally convincing to the dreamer. But sadly, in my experience, real life doesn't work like that :(

Comment by A.H. (AlfredHarwood) on A Question about Corrigibility (2015) · 2023-12-03T14:47:29.865Z · LW · GW

Yes, I too am more concerned from a 'maybe this framing isn't super useful as it fails to capture important distinctions between corrigible and non-corrigible' point of view rather than a 'we might outlaw some good actions' point of view.

Thanks for the links, they look interesting!

Comment by A.H. (AlfredHarwood) on What is wrong with this "utility switch button problem" approach? · 2023-09-28T16:49:05.380Z · LW · GW

For , this is the policy that is optimal when  which has . Then 

 

Please could you explain how you get  when ?

Possibly a dumb question but I don't have a good intuition for what it means to differentiate an expected value with respect to an expected value.

I can see that this is the case when  is positive (as expected for a utility function) and uncorrelated with , but is it true in general? Even when  is strongly correlated (or anti-correlated) with ? They would presumably be correlated in some way  since they both depend on the policy pursued.  

Also, how would this work for a utility function A which is negative? In theory we should be able to apply an affine shift to a utility function so that it has a negative expected value. If A was uncorrelated with K but has a negative expected value then  , right? Can we do a similar affine shift to B to ensure that there won't necessarily be a value of q where 

Comment by A.H. (AlfredHarwood) on Vegan Nutrition Testing Project: Interim Report · 2023-09-22T13:31:14.447Z · LW · GW

I find that the walkinlabs.com domain does not give any results. I think the correct url is www.walkinlab.com (no 's' in the url). Is this the one you used?

Comment by A.H. (AlfredHarwood) on Do agents with (mutually known) identical utility functions but irreconcilable knowledge sometimes fight? · 2023-08-23T22:41:51.520Z · LW · GW

Good point! Notably, some of your examples are 'one-way': one party updated while the other did not. In the case of Google/Twitter and the museum, you updated but they didn't, so this sounds like standard Bayesian updating, not specifically Aumann-like (though maybe this distinction doesn't matter, as the latter is a special case of the former).

When I wrote the answer, I guess I was thinking about Aumann updating where both parties end up changing their probabilities (ie. Alice starts with a high probability of some proposition P and Bob starts with a low probability for P and, after discussing their disagreement, they converge to a middling probability). This didn't seem to me to be as common among humans. 

In the example with your Dad, it also seems one-way: he updated and you didn't. However, maybe the fact he didn't know there was a flood would have caused you to update slightly, but this update would be so small that it was negligible. So I guess you are right and that would count as an Aumann agreement!

Your last paragraph is really good. I will ponder it...

Comment by A.H. (AlfredHarwood) on Do agents with (mutually known) identical utility functions but irreconcilable knowledge sometimes fight? · 2023-08-23T10:50:45.748Z · LW · GW

even when all parties are acting in good faith, they know that they wont be able to reconcile about certain disagreements, and it may seem to make sense, from some perspectives, to try to just impose their own way, in those disputed regions.

Aumann's agreement theorem, which is discussed in the paper 'Are Disagreements Honest?' by Hanson and Cowen, suggests that perfectly rational agents (updating via Bayes' theorem) should not disagree in this fashion, even if their life experiences were different, provided that their opinions on all topics are common knowledge and they have common priors. This is often framed as saying that such agents cannot 'agree to disagree'.

I'm a bit hazy on the details, but broadly, two agents with common priors but different evidence (ie. different life experiences or expertise) can share their knowledge and mutually update based on their different knowledge, eventually converging on an agreed probability distribution.
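As a toy illustration of that last paragraph, here is a hedged sketch in Python of the simplest possible case: two agents with a common prior over a coin's bias, who each see different flips and then pool their raw evidence. The names, the Beta(1, 1) prior and the flip counts are all made up for illustration. Note that this is a simplification. Aumann's theorem is stronger, since it only requires the agents' posteriors to be common knowledge and does not need them to hand over their evidence directly.

```python
from fractions import Fraction


def posterior_mean(prior_heads, prior_tails, heads, tails):
    """Mean of the Beta posterior over a coin's bias after observing flips."""
    return Fraction(prior_heads + heads,
                    prior_heads + prior_tails + heads + tails)


# Common prior: Beta(1, 1), i.e. uniform over the coin's bias (an assumption).
PRIOR = (1, 1)

# Hypothetical private evidence as (heads, tails); the flips are assumed
# independent and non-overlapping.
alice_evidence = (8, 2)
bob_evidence = (1, 9)

alice_before = posterior_mean(*PRIOR, *alice_evidence)
bob_before = posterior_mean(*PRIOR, *bob_evidence)

# After discussion, both agents condition on the pooled evidence.
pooled = (alice_evidence[0] + bob_evidence[0],
          alice_evidence[1] + bob_evidence[1])
alice_after = posterior_mean(*PRIOR, *pooled)
bob_after = posterior_mean(*PRIOR, *pooled)

print("before sharing:", alice_before, bob_before)
print("after sharing: ", alice_after, bob_after)
```

Alice starts high, Bob starts low, and because they share a prior and end up with the same evidence, they land on the same middling probability.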

Of course, humans are not perfectly rational so this rarely happens (this is discussed in the Hanson/Cowen paper). There are some results which seem to suggest you can relax some of the assumptions of Aumann's theorem to make them more realistic and still get similar results. Scott Aaronson showed that Aumann's theorem holds (to a high degree) even when the agreement of agents over priors isn't perfect and the agents can exchange only limited amounts of information.

Maybe the agents who are alive in the future will not be perfectly rational, but I guess we can hope that they might be rational enough to converge close enough to agreement that they don't fight on important issues.

Comment by A.H. (AlfredHarwood) on Underwater Torture Chambers: The Horror Of Fish Farming · 2023-07-26T09:34:55.591Z · LW · GW

Oh damn, you're right. That was a stupid mistake. 

Yes, so the 3-8 billion fish per day does overstate the number of farmed fish killed. The real number of farmed fish killed per day is somewhere between 0.1 billion and 0.5 billion, which is a lot less than the wild fish killed per day.
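For anyone checking the arithmetic, here is a quick sketch in Python. It takes the annual estimate of 50 to 170 billion farmed fish quoted in my earlier comment and divides by 365; the variable names are just for illustration.

```python
DAYS_PER_YEAR = 365

# Annual farmed-fish estimates quoted in the earlier comment.
annual_low = 50e9
annual_high = 170e9

daily_low = annual_low / DAYS_PER_YEAR
daily_high = annual_high / DAYS_PER_YEAR

print(f"{daily_low / 1e9:.2f} to {daily_high / 1e9:.2f} billion farmed fish per day")
```

That gives roughly 0.14 to 0.47 billion per day, which is where the 0.1 to 0.5 billion range comes from.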

Comment by A.H. (AlfredHarwood) on Underwater Torture Chambers: The Horror Of Fish Farming · 2023-07-26T09:19:14.142Z · LW · GW

EDIT: AS POINTED OUT BY LOCALDEITY THIS COMMENT IS WRONG - I CONFUSED ANNUAL AND DAILY FISH DEATHS. HOWEVER, IT IS THE CASE THAT THIS POST OVERSTATES THE NUMBER OF FISH KILLED FROM FISH FARMS. SEE COMMENTS BELOW FOR CLARIFICATION.

I was going to call you out for a bit of a bait-and-switch in the paragraph starting 'Lewis Bollard notes...'

Lewis Bollard notes “The fishing industry alone kills 3-8 billion animals every day, most by slow suffocation, crushing, or live disemboweling.” So roughly the same number of fish are killed in horrifying, inhumane ways every few days as there are people on earth.

because the 3-8 billion number is for *wild* fish killed each year, not farmed fish (Lewis Bollard cites this article which explicitly states 'Marine invertebrates and farmed fish are not included in this estimate'). I can think of several reasons why eating wild caught fish is less bad than farmed fish (eg. the fish were not brought into existence by demand for them, wild conditions are less bad (?) than farm conditions, they were going to die anyway and dying a 'natural' death for a wild fish is plausibly no better than dying after being caught etc.)

However, after a cursory bit of background research it seems like the 3-8 billion figure massively *understates* your case. Two estimates that I found here and here suggest that the actual number of farmed fish killed per year is somewhere between 50 and 170 billion.

Your paragraph using the 3-8 billion figure is still misleading, but the truth only strengthens your case.

Comment by A.H. (AlfredHarwood) on How do low level hypotheses constrain high level ones? The mystery of the disappearing diamond. · 2023-07-12T13:57:04.859Z · LW · GW

I see, thanks for taking the time to explain!

Comment by A.H. (AlfredHarwood) on Optimized for Something other than Winning or: How Cricket Resists Moloch and Goodhart's Law · 2023-07-11T23:15:16.597Z · LW · GW

sometimes the batter near the bowler starts to run before the bowler has actually thrown.

Yes!

In the rules of cricket, that gives the bowler the chance to get them out instead of throwing the ball like they normally would.

There is a line drawn on the floor known as the 'crease', about a metre past the stumps. If the batter has run past this line while the bowler still has the ball, the bowler can tap the stumps with the ball and get the batter out.

It's in the spirit of cricket for the bowler to say "hey, if you do that I'm gonna try to get you out". It's not in the spirit of cricket for the bowler to simply get them out without warning.

Yes, the bowler would say something like 'you ran too far and were out of the crease. I could have got you out then.'

How common is this run-before-throw thing? Is it deliberate, or careless?

Running or walking as the bowler is about to bowl is deliberate and common (happens pretty much every ball) and is known as 'backing up'. Batters do it both to get a head start and to be 'on their toes' and ready to run. However, if you run too far and step out of the crease, this is careless.

Does the batter ever try to run even after being warned, and if so, how often do they survive?

Yes, sometimes, and often they do not survive. I don't know the numbers, but this non-comprehensive list gives two examples where they got out after being warned.

Would it be in the spirit of cricket for the bowler to warn before every throw, or are they expected to tolerate a certain amount of run-before-throw?

As a bowler, you tolerate it provided that the batter is not running out of his crease. As a bowler, you provide a warning by stopping your run up, and pointing out to the batsman that he has run out of his crease. If you cared and were interested in Mankading (for example, because you felt that the batter was getting an unfair advantage by running too far) you could warn them the first (or second or third...) time you noticed them running out of the crease.

If exactly one team took every possible opportunity to Mankad, would that give them an advantage, or would people simply stop giving them the opportunity?

In the short term, they would have an advantage. Then (I guess) other teams would adapt their playstyles (for example, they would be more cautious when backing up, and probably as a result running between the wickets less) and the advantage would be negated. On the other hand, if a team knew that their opponents would *not* Mankad, regardless of how egregious their backing up was, they would be able to exploit this, by running far down the wicket before the bowler had bowled.

Comment by A.H. (AlfredHarwood) on How do low level hypotheses constrain high level ones? The mystery of the disappearing diamond. · 2023-07-11T22:30:36.943Z · LW · GW

Thanks, that makes sense! And to be clear, would an 'explanation' be a program which could generate the data 3,1,4,1,5,9? And a good explanation would be one which took up fewer bits of information than just the list 3,1,4,1,5,9? 
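To check my understanding, here is a rough sketch in Python of what I have in mind. The 'explanation' is a short program that generates digits of pi (I've used Gibbons' spigot algorithm, which is my choice and not something from your post), and 'hard-coding' is just writing the digits out. I'm using character counts as a crude stand-in for bits, which is an assumption.

```python
# Source text of a candidate 'explanation': a generator for the digits of pi
# (Gibbons' unbounded spigot algorithm). Kept as a string so it can be measured.
PROGRAM = '''
def pi_digits():
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4*q + r - t < n*t:
            yield n
            q, r, n = 10*q, 10*(r - n*t), (10*(3*q + r))//t - 10*n
        else:
            q, r, t, k, n, l = (q*k, (2*q + r)*l, t*l, k + 1,
                                (q*(7*k) + 2 + r*l)//(t*l), l + 2)
'''

namespace = {}
exec(PROGRAM, namespace)  # define pi_digits from the measured source text


def first_digits(count):
    """Return the first `count` decimal digits of pi as a list of ints."""
    gen = namespace["pi_digits"]()
    return [next(gen) for _ in range(count)]


def hard_coded_length(count):
    """Crude cost of writing the digits out directly: one character each."""
    return count


print(first_digits(6))
for count in (6, 1000):
    print(count, "digits: hard-coded", hard_coded_length(count),
          "chars vs program", len(PROGRAM), "chars")
```

On this crude measure, the program is a bad explanation of just 3,1,4,1,5,9 (hard-coding six digits is far shorter), but it becomes a good one once the data is long enough. Is that the right picture?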

Comment by A.H. (AlfredHarwood) on How do low level hypotheses constrain high level ones? The mystery of the disappearing diamond. · 2023-07-11T20:25:54.376Z · LW · GW

This seems very interesting but I'm having trouble understanding something. Can you specify what is meant by:

An explanation is good if it is smaller than just hard-coding the answer.

What does 'just hard-coding the answer' mean and look like?

Comment by A.H. (AlfredHarwood) on Optimized for Something other than Winning or: How Cricket Resists Moloch and Goodhart's Law · 2023-07-07T19:13:02.488Z · LW · GW

The purpose of participating in a game is to maximize performance, think laterally, exploit mistakes, and do everything you can, within the explicit rules, to win. Doing that is what makes games fun to play. Watching other people do that, at a level that you could never hope to reach is what makes spectator sports fun to watch.

I don't know if you read the rest of the piece, but the point I was trying to make is that sometimes this isn't true! Sometimes if each team does everything within the rules to win then the game becomes less fun to watch and play (you may disagree, but many sports fans feel this way). I already gave some examples where this happens in other sports, so I don't see the need for your list of hypotheticals (and I feel like they are strawmen anyway).

For what it's worth, I agree with you on Bairstow/Carey but which side you take on it is irrelevant (though I can see you are quite passionate about it!). The piece was about the 'meta' aspects of games which try to address these kinds of issues.

Comment by A.H. (AlfredHarwood) on Optimized for Something other than Winning or: How Cricket Resists Moloch and Goodhart's Law · 2023-07-06T14:35:11.393Z · LW · GW

I think the two-player-game-but-player2-gets-to-modify-the-rules is not a fair analogy here. Like I said it's the cricket-loving public that decides, not player 2.

Broadly, I agree with Richard Ngo's characterisation. You are right that the 'cricket loving public' plays some part in determining what counts as 'within the spirit' but it is the decision of the players themselves that often is most important.

How is this different from games with a referee? A foul is what the referee says it is; the spirit of cricket is what the cricket-lovers say it is. In both cases a savvy optimizer would start modelling the relevant humans and predicting what they would and wouldn't judge illegal.

I agree that different rules or optimization targets have different complexity levels, and the spirit of cricket seems more complicated than ordinary fouls which are more complicated than "did the ball hit the pegs."

I agree with you that the complexity is an important factor. I think you are correct that in principle this can still be Goodharted, but in practice it doesn't seem to happen, as it is much harder than Goodharting the written rules of the game, due to the increased complexity. There is nothing to prevent a superintelligent player from brainwashing the opposing team and general public into agreeing that their actions are legitimate. It's just that doing this is a lot harder than normal ways of 'gaming the system'. This is why I used the term 'resists Goodhart's law' as opposed to 'defeats Goodhart's law' or something similar.

Comment by A.H. (AlfredHarwood) on Optimized for Something other than Winning or: How Cricket Resists Moloch and Goodhart's Law · 2023-07-05T13:44:30.018Z · LW · GW

Interesting, thanks for sharing! It's cool to see how different games manage the conflict between coming up with innovative tactics (which for me is all part of the fun of sports) and exploiting the rules in a way that makes the game boring.

Also thanks for the link to David Sirlin. I haven't heard of him and the website looks interesting!

Comment by A.H. (AlfredHarwood) on What money-pumps exist, if any, for deontologists? · 2023-06-29T20:35:34.239Z · LW · GW

Not a money pump unless there's some path back to "trust me enough that I can extort you again", but that's unlikely related to ethical framework.

I don't understand this. Why would paying out to an extortionist once make you disbelieve them when they threatened you a second time?

Comment by A.H. (AlfredHarwood) on What money-pumps exist, if any, for deontologists? · 2023-06-29T20:28:39.313Z · LW · GW

The "give me money otherwise I'll kill you" money pump is arguably not a money pump

I'm not sure how you mean this. I think that it is a money pump when combined with the assumption that you want to stay alive. You pay money to end up in the same position you started in (presuming you want to stay alive). When back in the position you started, someone can then threaten you again in the same way and get more money from you. It just has fewer steps than the standard money pump. Sure, you could reject the 'I want to stay alive' assumption but then you end up dead, which I think is worse than being money-pumped.

it's waaaaaay more of a problem for consequentialists than deontologists.

Interesting. How so?

Comment by A.H. (AlfredHarwood) on What money-pumps exist, if any, for deontologists? · 2023-06-29T09:03:07.232Z · LW · GW

Aren't you susceptible to the "give me money otherwise I'll kill you" money pump in a way that you wouldn't be if the person threatening you knew that there was some chance you would retaliate and kill them?

If I was some kind of consequentialist, I might say that there is a point at which keeping some amount of money is more valuable than the life of the person who is threatening me, so it would be consistent to kill them to prevent the loss.

This is only true if it is public knowledge that you will never kill anyone. It's a bit like a country having an army (or nuclear weapons) and publicly saying that it will never use them to fight.

Comment by A.H. (AlfredHarwood) on Geometric Rationality is Not VNM Rational · 2023-05-04T10:27:08.330Z · LW · GW

I am confused about something. You write that a preference ordering  is geometrically rational if.

This is compared to VNM rationality which favours  if and only if 

Why, in the definition of geometric rationality, do we have both the geometric average and the arithmetic average? Why not just say "an ordering is geometrically rational if it favours  if and only if  "?

As I understand it, this is what Kelly betting does. It doesn't favour lotteries over either outcome, but it does reject the VNM continuity axiom, rather than the independence axiom.
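
A small numerical sketch of what I mean by Kelly betting maximizing a geometric average. The setup is an assumption of mine and not something from the post: a repeated even-odds bet that wins with probability 0.6, where you stake a fraction `f` of your wealth each round. The function names are hypothetical.

```python
import math


def arithmetic_expectation(f, p):
    """Expected wealth multiplier after one even-odds bet staking fraction f."""
    return p * (1 + f) + (1 - p) * (1 - f)


def geometric_expectation(f, p):
    """Geometric expectation of the wealth multiplier, exp(E[log multiplier]).

    If an outcome with positive probability leaves zero wealth, the result is 0.
    """
    outcomes = [(p, 1 + f), (1 - p, 1 - f)]
    if any(prob > 0 and wealth == 0 for prob, wealth in outcomes):
        return 0.0
    return math.exp(sum(prob * math.log(wealth) for prob, wealth in outcomes))


p = 0.6  # assumed win probability
fractions = [i / 100 for i in range(101)]  # candidate stakes from 0% to 100%

best_arithmetic = max(fractions, key=lambda f: arithmetic_expectation(f, p))
best_geometric = max(fractions, key=lambda f: geometric_expectation(f, p))

if __name__ == "__main__":
    print("stake maximizing the arithmetic expectation:", best_arithmetic)
    print("stake maximizing the geometric expectation:", best_geometric)
```

Under these assumptions the arithmetic expectation is maximized by staking everything, while the geometric expectation is maximized at the Kelly fraction 2p - 1 = 0.2. Staking everything has geometric expectation zero, because the losing outcome leaves zero wealth however large the upside is.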

Comment by A.H. (AlfredHarwood) on [New] Rejected Content Section · 2023-05-04T08:24:31.855Z · LW · GW

I think this is a good idea, thanks for implementing!

Very minor but the link lesswrong.com/moderation#rejected-comments just goes to the same page as lesswrong.com/moderation#rejected-posts (the written address is correct but the hyperlink goes to the wrong page)

Comment by A.H. (AlfredHarwood) on Harsanyi's Social Aggregation Theorem and what it means for CEV · 2023-04-23T18:18:41.557Z · LW · GW

The link to Harsanyi's paper doesn't work for me. Here is a link that does, if anyone is looking for one: 

https://hceconomics.uchicago.edu/sites/default/files/pdf/events/Harsanyi_1955_JPE_v63_n4.pdf 

Comment by A.H. (AlfredHarwood) on The Geometric Expectation · 2023-02-16T19:02:32.807Z · LW · GW

The infinite non-uniform discrete case is not much more difficult. If  is a finite or countably infinite set,  assigns a nonnegative value to each , and  is a probability distribution on , then 

Very minor, but shouldn't this read " is a probability distribution on " not ?

Comment by A.H. (AlfredHarwood) on On expected utility, part 2: Why it can be OK to predictably lose · 2022-12-24T12:58:08.030Z · LW · GW

Thanks for writing this. I think that the arguments in parts III and IV are particularly compelling and well-written.

Comment by A.H. (AlfredHarwood) on Is Constructor Theory a useful tool for AI alignment? · 2022-11-30T10:38:40.337Z · LW · GW

Thanks for writing this. I wanted to write something about how Deutsch performs a bit of a motte-and-bailey argument (motte: 'there are some problems in physics which are hard to solve using the dynamical laws approach'; bailey: 'these problems can be solved using constructor theory specifically, rather than other approaches'). Your comment does a good job of making this case. In the end I didn't include it, as the piece was already too long. I just wrote the sentence

Pointing out problems in the dynamical laws approach to physics and trying to find solutions is useful, even if constructor theory turns out not to be the best solution to them. 

and left it at that.

Comment by AlfredHarwood on [deleted post] 2022-08-11T14:29:59.891Z

I didn’t use ‘modal’ because that is used to refer to logical possibility/impossibility, whereas I am interested in referring to physical possibility/impossibility. Depending on your philosophical views, those two things may or may not be the same.


 

Comment by AlfredHarwood on [deleted post] 2022-08-11T14:27:32.048Z

The form of a counterfactual law ("your perpetual motion machine won't work even if you make that screw longer or do anything else different") seems to be "A, no matter which parameter you change".


I don’t think this is right. As I am using it, ‘counterfactual’ refers to a statement about whether something is possible or impossible. Statements of the form "A, no matter which parameter you change" are not always like this. For example, take A = ‘this ball has a mass of 10kg’. This is not a statement about what is possible or impossible. You could frame it as ‘it is impossible for this ball to have a mass other than 10kg, no matter which parameter you change’, but this doesn’t give us any new information compared to the original statement.

Another important feature is that the impossibility/possibility is not restricted to specific dynamical laws. In your example ‘F=ma, even if the frictionless sphere is blue’, this statement is only true when Newton’s laws apply. But the statement ‘it is impossible to build a perpetual motion machine’ refers, in principle, to all dynamical laws (even ones we haven’t discovered yet), which is why principles like this may help guide our search for new laws.

Comment by AlfredHarwood on [deleted post] 2022-08-10T09:55:15.766Z

Glad that confusion is removed!

I think that it is the best word to use. When used as an adjective, Collins defines 'counterfactual' as 'expressing what has not happened but could, would, or might under differing conditions'. I think that this fits the way I was talking about it (eg. when referring to 'counterfactual laws'). In the first post, I talk about whether the lamp 'could, would, or might' have been in a different state. In this post, we talk about whether a perpetual motion machine 'could, would, or might' work if it was made using a different configuration. (Maybe some of the confusion comes from using 'counterfactual' as both an adjective and a noun?)

Though if you have any suggestions on other words that might be clearer, let me know.

Comment by AlfredHarwood on [deleted post] 2022-08-09T14:49:59.227Z

Hi, thanks for the question. I am using the term 'counterfactual' (admittedly somewhat loosely) to describe facts that refer to whether things are possible or impossible, regardless of whether they actually happen. 

In the first post, I claimed that it is only meaningful to say that the lamp transmits information if it is possible for the lamp to be in a different state. Conversely, if the lamp was broken, then it is impossible for the lamp to be in a different state, and information does not get transmitted. If you just describe the system in terms of what actually happens (ie. 'the lamp is on'), you miss out on this fact. In the first post, I called statements about what actually happens in the system 'factual statements', and statements about what is possible/impossible 'counterfactual statements'.

Similarly, in the case of the perpetual motion machine, you can make a factual statement about what actually happens (ie. some gears turn around and eventually the machine stops moving, failing to achieve perpetual motion), or you can make a counterfactual statement (that it is impossible to make a perpetual motion machine, regardless of the specifications of that machine). In this post, I again claimed that just making the factual statement misses out on the important counterfactual claim.

Of course, in the first post, when the lamp is broken, and we say it is 'impossible' to send another signal, this is specified by the parameters of the thought experiment, rather than the laws of physics (in practice, the laws of physics might not prevent you from fixing the lamp, for example). Whereas in this post, when we say it is 'impossible' to build a perpetual motion machine, the restriction does come from the laws of physics.

Hope this helps clear things up!

Comment by AlfredHarwood on [deleted post] 2022-08-09T11:04:57.910Z

I think this was well-written and clear, so good job there! I also happen to disagree with the contents.

Thanks for your comment!

First off, I'm highly suspicious of any definition of a "prevailing conception" of physics that excludes the second law of thermodynamics! It seems like in actual practice, sometimes people make predictions by simulation, (the "PC") sometimes they make predictions by generalizing about the character of physical law (the quantum gravity example), and sometimes they do something in between those things and make abstractions/generalizations but then treat those abstractions as tools to do simulation (condensed matter theorists I see you).

Yeah, the term 'prevailing conception' is Deutsch's. It refers specifically to formulating things in terms of initial conditions and dynamical law. I agree it's not a great term, as it implies that all current physics comes under its umbrella, which, as you pointed out, is not true.

And so what does it mean to recast physics in a different picture? Does this mean people are going to be rendered unable to do simple simulations about what actually happens when you shoot a particle at a barrier?

The idea isn't to throw away the dynamical laws picture, but to provide a different angle of attack on some problems that seem intractable when expressed in the PC.

So then does it mean that we're going to be able to make new exciting arguments that physicists were unable to make before?

That's the hope!

I mean, I'd love it if this were true, but I'm skeptical. My cynical side expects that there will be few new sorts of arguments, but plenty of flag-planting on old sorts of arguments.

Fair enough. I'm skeptical as well. Constructor theory has produced a couple of interesting results, but as far as I can see, nothing world-changing yet. But I am still convinced that the problems described here (eg. the incompatibility of reversible dynamics with irreversibility of the 2nd law) are real problems. Even if counterfactuals/constructor theory don't work (who knows?), we will need something new to address them!

Comment by AlfredHarwood on [deleted post] 2022-08-04T18:14:46.440Z

I think I disagree with your characterisation of the split between 'objective' Shannon information and information as meaning, which requires interpretation.

As you point out at the end of your comment, Shannon information requires you to know the probability distribution from which your data is drawn. And probabilities are reflections of your own state of knowledge, which is subjective. (Or at least subjectively objective; if you are using 'objective' in that sense, then I guess I agree.) For example, if Alice sends Bob a string '11111', we might be tempted to say that she has sent Bob 5 bits of information, but if Bob knows that Alice can only send two possible strings, '00000' or '11111', then he would say that she has only sent one bit. All signals, not just what you call 'information as meaning', require some degree of interpretation. And this interpretation, I argue, requires knowing the possible signals that could be sent, even if they are not actually sent. These possible signals are what I am calling counterfactuals.
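
The Alice and Bob example can be checked with a few lines of arithmetic. This is only a sketch under an assumption I am adding: in each case Bob treats all the strings he considers possible as equally likely. The function name `surprisal_bits` is hypothetical.

```python
import math
from itertools import product


def surprisal_bits(message, distribution):
    """Information content of a message in bits: -log2 of its probability."""
    return -math.log2(distribution[message])


# Case 1: Bob thinks any of the 32 five-bit strings could have been sent.
all_strings = {"".join(bits): 1 / 32 for bits in product("01", repeat=5)}

# Case 2: Bob knows Alice can only send '00000' or '11111'.
two_strings = {"00000": 0.5, "11111": 0.5}

bits_case_1 = surprisal_bits("11111", all_strings)
bits_case_2 = surprisal_bits("11111", two_strings)

if __name__ == "__main__":
    print(bits_case_1, bits_case_2)
```

The string received is identical in both cases. Only the set of strings that could have been sent differs, and that alone changes the answer from 5 bits to 1 bit.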

I'm not sure I understand your point about conditionals vs counterfactuals.

AFAICS, that's just a special case of the inverse relationship between probability and (Shannon) information. If the lamp is stuck "on", the probability of an "on" signal is 1.000 and the information content is 0.000. So it's not fundamentally about counterfactuals at all.

I kind of agree with this, but it doesn't tell the whole story. Consider the case where, instead of being stuck 'on', the lamp flickers randomly and is on 50% of the time and off 50% of the time. In this case, you would not be able to use the lamp to send information, even though the probability of an 'on' signal is 0.5 and, in one sense, the Shannon entropy would be maximal. To send information requires that it is possible for you to change the signal sent by the lamp. This is what I was trying to get at in this post. Another way of thinking about it is to say that you must have a causal effect on the state of the signal. In both the case where the lamp is stuck on and the case where it is flickering uncontrollably, you have no causal link to the state of the signal. I tried to explain the link between counterfactuals, information and causality in the subsequent post.
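
One rough way to put numbers on the three lamps is to compare the entropy of the lamp's state with the mutual information between what the sender wants to signal and what the lamp does. This is a sketch under assumptions of mine: the sender's intended bit is a fair coin, the names below are hypothetical, and mutual information stands in only loosely for the causal link I have in mind.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def marginals(joint):
    """Marginal distributions of a joint distribution {(x, y): probability}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return px, py


def mutual_information(joint):
    """Mutual information in bits between the two coordinates of a joint distribution."""
    px, py = marginals(joint)
    return sum(
        p * math.log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )


# Joint distributions over (sender's intended bit, lamp state).
lamps = {
    "controllable": {(0, "off"): 0.5, (1, "on"): 0.5},
    "stuck on": {(0, "on"): 0.5, (1, "on"): 0.5},
    "flickering": {(0, "off"): 0.25, (0, "on"): 0.25,
                   (1, "off"): 0.25, (1, "on"): 0.25},
}

results = {
    name: (entropy(marginals(joint)[1]), mutual_information(joint))
    for name, joint in lamps.items()
}

if __name__ == "__main__":
    for name, (lamp_entropy, info) in results.items():
        print(f"{name}: lamp entropy = {lamp_entropy} bits, information sent = {info} bits")
```

Under these assumptions the flickering lamp has the same 1 bit of entropy as the controllable lamp, but the mutual information with the sender's intent is zero, just as it is for the lamp that is stuck on.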

Comment by AlfredHarwood on [deleted post] 2022-08-04T17:39:13.303Z

I agree with your example and think that it touches on something important. However, in this post, I did not claim that the counterfactual condition was the only condition required for information transfer. You are correct to say that the lamp signal would not constitute information to someone who was unaware of the plan. But this is because, in that situation, there are other conditions that have not been met. Since the other person seeing the lamp signal would not react differently to the different signals, there is no causal link between the signal and them. This is also required for information transfer. I tried to explain this idea a bit more in my subsequent post.

If you don't like the idea of information being physical, rather than epistemological, then maybe you can think of this post as asking the question 'what are the physical conditions that a system must satisfy in order to transmit epistemological information?'

Comment by AlfredHarwood on [deleted post] 2022-08-02T14:50:25.080Z

Hi, thanks for your question. I have a big piece covering all of this in more detail which I plan to post in a couple of days once I've finished writing it. In the meantime, please accept this 'teaser' of a few problems in the prevailing conception (PC):

  1. Dealing with hybrid systems. If we are operating in a regime where there are two contradictory sets of dynamical laws, we do not know what kind of evolution the system will follow. An example of such a system is one where both gravity (as governed by general relativity) and quantum mechanics are relevant. In such cases, under the PC, it is difficult to make any predictions of what kind of behaviour systems will exhibit, since we lack the dynamical laws governing the system. However, by appealing to general counterfactual principles (the interoperability principle and the principle of locality), which cannot be stated in the PC, we can make predictions about such systems, even if we don't know the form of the dynamical laws.
  2. The 2nd Law of Thermodynamics. Under the PC, the 2nd law is difficult to express precisely, since all dynamical laws are reversible in time, but the 2nd law implies irreversible dynamics. This is normally dealt with by introducing some degree of imprecision or anthropocentrism (eg. through averaging or coarse graining, or describing the 2nd law in terms of our state of knowledge of the system). However, the 2nd law can be stated precisely as a counterfactual statement along the lines of 'it is impossible to engineer a cyclic process which converts heat entirely into work'.
  3. The initial state problem. Under the PC, the state of a system can be explained in terms of its evolution, according to dynamical laws, from a previous state at an earlier time. This makes it difficult to explain early states of the universe: if a state can only be explained in terms of earlier states, then either the universe has an initial state, which we cannot explain (since there are no earlier states), or the universe does not have an initial state and we have an infinite regress, explaining each state in terms of earlier states, going on forever. Neither of these options seems satisfactory.

Comment by AlfredHarwood on [deleted post] 2022-08-02T08:56:22.647Z

I presume you are talking about the post What is Evidence?

Yes, the ideas in that post are closely related to this one. I think that the main difference is that, in that post, Eliezer is interested in epistemology, whereas here I am interested in how the process works from a physics point of view. For example, in my post, there is no need for 'information' (as I am using the word) to be correlated with true beliefs.