Posts

Misalignment Harms Can Be Caused by Low Intelligence Systems 2022-10-11T13:39:18.674Z
DialecticEel's Shortform 2021-05-04T23:03:16.240Z

Comments

Comment by DialecticEel on DialecticEel's Shortform · 2023-07-29T18:04:54.405Z · LW · GW

Global Wealth Redistribution to Mitigate Arms Races

Is it irrational for North Korea to try to build nuclear weapons? Maybe. However, if your country is disenfranchised and in poverty, building them does seem like one route to having a say in global affairs and a better life. There are certainly other routes, and South Korea offers an example of what countries can achieve. But since the world has no version of a 'safety net' for poor countries, there remains some incentive to race for power. In other words: if you are not confident that those in power are looking out for your interests, it can make sense to start seeking power through one mechanism or another.

So the case I would state is this: by lacking convincing mechanisms to make the world fair in terms of justice and economic opportunity, we make the global situation far more dangerous for all actors. If we are talking about extinction in the face of arms races, then global enfranchisement and global wealth redistribution are worth seriously considering as ways to take the edge off those races: to reassure everyone that their interests will be considered, that their needs will be met, and that it isn't just about who wins (even if winning destroys humanity and whoever 'won').

Comment by DialecticEel on Mr. Meeseeks as an AI capability tripwire · 2023-05-19T18:50:36.573Z · LW · GW

This is in itself a relatively benign failure mode, no? Though in practice, if this happened, the system might just be re-tried until it fails in a different mode, or it might fail catastrophically on the first try.

Comment by DialecticEel on Can we, in principle, know the measure of counterfactual quantum branches? · 2022-12-19T15:43:17.329Z · LW · GW

Hmm, I mean that when we are talking about these kinds of counterfactuals, we obviously aren't working with the wavefunction directly, but that's an interesting point. Do you have a link to any writings on that specifically?

We can perform counterfactual reasoning about the result of a double-slit experiment, including predicting the wavefunction, but perhaps that isn't quite what you mean.

Comment by DialecticEel on Can we, in principle, know the measure of counterfactual quantum branches? · 2022-12-18T22:38:11.792Z · LW · GW

An interesting point here is that when talking about future branches, I think you mean that they are probabilities conditioned on the present. However, as a pure measure of existence, I don't see why it would need to be conditioned on the present at all. The other question is then: what would count as WW2? A planetary conflict that has occurred after another planetary conflict? A conflict called World War 2?

Perhaps you are talking about branches conditioned on a specific point in the past, i.e. the end of WW1 as it happened in our past. In that case, I don't see why you couldn't estimate those probabilities, though you would be applying estimates to a hugely complex and chaotic system, so best to take them with a pinch of Bayesian salt imo.

Comment by DialecticEel on Misalignment Harms Can Be Caused by Low Intelligence Systems · 2022-10-13T19:05:51.741Z · LW · GW

"They are a tiny part of the search and development of an intervention."

I agree that there is complexity in healthcare that is not explained by a simple statistical model; my point is that the final layer is often a simple statistical model that drives much of the complexity and the outcomes. Making a drug is much more complex than deciding which drug to give, but that decision ultimately drives the outcomes.

"Also incorrect. It would almost certainly require a complex model to find/create that content, possibly anew for each susceptible human."

Same point as above. If there were a piece of content that worked for all humans, then a simple model would suffice (a multi-armed bandit, for example; see the sketch below), and if the content doesn't exist, people are incentivised to create it.
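
A minimal sketch of the kind of simple final layer I mean, here an epsilon-greedy multi-armed bandit choosing which content to serve. The content items and the reward signal are hypothetical placeholders, not any real system:

```python
import random

def epsilon_greedy_bandit(contents, get_reward, rounds=10_000, epsilon=0.1):
    """Choose which content to serve using an epsilon-greedy bandit.

    contents:   list of content identifiers (hypothetical items)
    get_reward: callable content -> float, e.g. observed engagement
    """
    counts = {c: 0 for c in contents}    # times each content was served
    values = {c: 0.0 for c in contents}  # running mean reward per content
    for _ in range(rounds):
        if random.random() < epsilon:
            choice = random.choice(contents)        # explore
        else:
            choice = max(contents, key=values.get)  # exploit best so far
        reward = get_reward(choice)
        counts[choice] += 1
        # incremental update of the mean reward estimate
        values[choice] += (reward - values[choice]) / counts[choice]
    return values

# Hypothetical usage: three pieces of content with different true engagement rates.
if __name__ == "__main__":
    true_rates = {"a": 0.02, "b": 0.05, "c": 0.11}
    estimates = epsilon_greedy_bandit(
        contents=list(true_rates),
        get_reward=lambda c: 1.0 if random.random() < true_rates[c] else 0.0,
    )
    print(estimates)  # "c" should end up with the highest estimated value
```

The point is that nothing in this loop is intelligent, yet if harmful content exists in the pool, this loop will find and amplify it.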

Comment by DialecticEel on Alignment is hard. Communicating that, might be harder · 2022-09-04T09:50:19.303Z · LW · GW

I see, that's a great point, thanks for your response. It does seem realistic that it would become political, and it's clear that a co-ordinated response is needed.

On that note, I think it's a mistake to neglect that our epistemic infrastructure optimises for profit, which is an obvious misalignment right now. Facebook and Google are already optimising for profit at the expense of civil discourse; they are already misaligned and causing harm. Focusing only on the singularity allows tech companies to become even more harmful, with the vague promise that they'll play nice once they are about to create superintelligence.

Both are clearly important, and the control problem specifically deserves a tonne of dedicated resources, but in addition it would be good to put some effort into getting approximate alignment now, or at least something better than profit maximisation. This obviously wouldn't make progress on the control problem itself, but it might help move society to a state where it is more likely to do so.

Comment by DialecticEel on Alignment is hard. Communicating that, might be harder · 2022-09-01T19:03:17.070Z · LW · GW

"The EA consensus is roughly that being blunt about AI risks in the broader public would cause social havoc."

I find this odd and patronising to the general public. Why would this not also apply to climate change? Climate change is also a not-initially-obvious threat, yet the bulk of the public now has a reasonable understanding of it, and that understanding has driven a lot of change.

Or would nuclear weapons be a better analogy? At least public understanding of nuclear weapons brought gravity to the conversation. Or could part of the motive for avoiding public awareness be avoiding the weight of that kind of responsibility on our consciences? If the public is clueless, we appear proactive. If the public is knowledgeable, we appear unprepared and the field of AI reckless, which we are and it is.

Also, LessWrong is a public forum. Eliezer's dying with dignity post was definitely newsworthy, for example. Is it even accurate to suggest that we have significant control over the spread of these ideas in the public consciousness at the moment, given how little attention is on them, and given that we don't control the sorting functions of these media platforms?

Comment by DialecticEel on Robert Long On Why Artificial Sentience Might Matter · 2022-08-29T11:45:15.465Z · LW · GW

I agree, it still wouldn't be strong evidence for or against. No offence to any present or future sentient machines out there, but self-honesty isn't really clearly defined for AIs just yet.

My personal feeling is that LSTMs, and transformers with attention on past states, would by definition explicitly have a form of self-awareness (see the sketch below). I then think this bears ethical significance according to something like the compression ratio of the inputs.
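
A minimal numpy sketch of what I mean by "attention on past states": the current hidden state queries a buffer of the model's own earlier states. The dimensions and the single softmax read are illustrative assumptions, not any particular architecture:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend_to_past(current_state, past_states):
    """One attention read over the model's own previous hidden states.

    current_state: (d,) vector, the present hidden state (the 'query')
    past_states:   (t, d) matrix of earlier hidden states (keys and values)
    Returns a summary of the past, weighted by relevance to the present.
    """
    d = current_state.shape[0]
    scores = past_states @ current_state / np.sqrt(d)  # similarity to each past state
    weights = softmax(scores)                          # attention distribution
    return weights @ past_states                       # weighted recollection

# Illustrative usage with random states.
rng = np.random.default_rng(0)
past = rng.standard_normal((8, 16))  # eight past states of dimension 16
now = rng.standard_normal(16)
print(attend_to_past(now, past).shape)  # (16,)
```

The sense in which this is "self"-awareness is simply that the model's computation at time t explicitly conditions on representations of its own earlier states.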

As a side note, I enjoy Iain M. Banks's depiction of how AIs could communicate emotions alongside language in future: by changing colour across a rich field of hues. This doesn't attempt a direct analogy to our emotions, and in that sense makes the problem clearer as, in effect, a clustering of internal states.

Comment by DialecticEel on DialecticEel's Shortform · 2021-05-01T19:17:58.106Z · LW · GW

I've given a fuller background to this idea in a presentation here https://docs.google.com/presentation/d/1VLUdV8ZFvS_GJdfQC-k7-kMhUrF0kzvm6y-HLEaHoCU and I am trying to work it through more rigorously. The essential point is to consider mutualistic agency as a potentially desired, even critical, feature of systems that could be considered 'friendly' to humans, and self-determination as an important form of agency that lends itself to mathematical analysis via conditional transfer entropy (defined below).

This is very much an early-stage analysis. What I do think is that our capacity to affect the world is increasing much faster than the precision with which we understand what it is we want. In some sense I think it's necessary to understand very precisely the things that are already really obviously important to all humans. Otherwise, in our industrial exuberance, it seems quite likely we will end up engineering worlds that literally no-one wants.
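
For concreteness, here is the standard definition of conditional transfer entropy from a source process $X$ to a target process $Y$ given a third process $Z$, shown with history length one for simplicity. The reading of $X$ as an agent and $Y$ as its environment is my proposed interpretation, not established usage:

$$T_{X \to Y \mid Z} \;=\; I(Y_{t+1};\, X_t \mid Y_t, Z_t) \;=\; \sum_{y_{t+1},\, y_t,\, x_t,\, z_t} p(y_{t+1}, y_t, x_t, z_t) \,\log \frac{p(y_{t+1} \mid y_t, x_t, z_t)}{p(y_{t+1} \mid y_t, z_t)}$$

Intuitively, it measures how much knowing $X$'s state reduces uncertainty about $Y$'s next state, beyond what $Y$'s own past and the conditioning process $Z$ already explain: one candidate way to quantify how much 'say' one process has over another.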