Posts

Why There Is Hope For An Alignment Solution 2024-01-08T06:58:32.820Z
Thoughts On Computronium 2021-03-03T21:52:35.496Z
Darklight's Shortform 2021-02-20T15:01:33.890Z
The Glory System: A Model For Moral Currency And Distributed Self-Moderation 2021-02-19T16:42:48.980Z
As a Washed Up Former Data Scientist and Machine Learning Researcher What Direction Should I Go In Now? 2020-10-19T20:13:44.993Z
The Alpha Omega Theorem: How to Make an A.I. Friendly with the Fear of God 2017-02-11T00:48:35.460Z
Symbolic Gestures – Salutes For Effective Altruists To Identify Each Other 2016-01-20T00:40:43.146Z
[LINK] Sentient Robots Not Possible According To Math Proof 2014-05-14T18:19:59.555Z
Eudaimonic Utilitarianism 2013-09-04T19:43:37.202Z

Comments

Comment by Darklight on Darklight's Shortform · 2024-05-24T15:18:42.144Z · LW · GW

I'm wondering what people's opinions are on how urgent alignment work is. I'm a former ML scientist who previously worked at Maluuba and Huawei Canada, but switched industries into game development, at least in part to avoid contributing to AI capabilities research. I tried earlier to interview with FAR and Generally Intelligent, but didn't get in. I've also done some cursory independent AI safety research in interpretability and game theoretic ideas in my spare time, though nothing interesting enough to publish yet.

My wife also recently had a baby, and caring for him is a substantial time sink, especially for the next year until daycare starts. Is it worth considering things like hiring a nanny, if it'll free me up to actually do more AI safety research? I'm uncertain if I can realistically contribute to the field, but I also feel like AGI could potentially be coming very soon, and maybe I should make the effort just in case it makes some meaningful difference.

Comment by Darklight on Open Thread Spring 2024 · 2024-05-10T20:27:07.955Z · LW · GW

Thanks for the reply!

So, the main issue I'm finding with putting them all into one proposal is that there's a 1000 character limit on the main summary section where you describe the project, and I cannot figure out how to cram multiple ideas into that 1000 characters without seriously compromising the quality of my explanations for each.

I'm not sure if exceeding that character limit will get my proposal thrown out without being looked at though, so I hesitate to try that. Any thoughts?

Comment by Darklight on Cooperation is optimal, with weaker agents too  -  tldr · 2024-05-08T20:32:25.607Z · LW · GW

I already tried discussing a very similar concept I call Superrational Signalling in this post. It got almost no attention, and I have doubts that Less Wrong is receptive to such ideas.

I also tried actually programming a Game Theoretic simulation to try to test the idea, which you can find here, along with code and explanation. Haven't gotten around to making a full post about it though (just a shortform).

Comment by Darklight on Open Thread Spring 2024 · 2024-04-30T14:25:37.943Z · LW · GW

So, I have three very distinct ideas for projects that I'm thinking about applying to the Long Term Future Fund for. Does anyone happen to know if it's better to try to fit them all into one application, or split them into three separate applications?

Comment by Darklight on Darklight's Shortform · 2024-03-10T19:01:44.156Z · LW · GW

Recently I tried an experiment using the code from the Geometry of Truth paper to see whether simple label words like "true" and "false" could substitute for the datasets used to create truth probes. I also tried a truth probe algorithm that classifies each activation by which of the two class-mean vectors it has the higher cosine similarity to.

Initial results seemed to suggest that the label word vectors were sorta acceptable, albeit not nearly as good (around 70% accurate rather than 95%+ like with the datasets). However, testing on harder test sets showed much worse accuracy (sometimes below chance, somehow). So I can probably conclude that the label word vectors alone aren't sufficient for a good truth probe.

Interestingly, the cosine similarity approach performed almost identically to the mass-mean (aka difference-in-means) approach used in the paper. Unlike the mass-mean approach though, the cosine similarity approach can be extended to a multi-class setting. Then again, logistic regression can also be extended that way, so it may not be particularly useful either, and I'm not sure there's even a use case for a multi-class probe.
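For concreteness, here's a minimal sketch of the cosine-similarity variant, assuming you already have a matrix of activations and binary truth labels (the function names and the random placeholder data are just illustrative, not from the paper's codebase):

```python
import numpy as np

def fit_class_means(acts, labels):
    """Compute the mean activation vector for each class (acts: [n, d_model], labels: 0/1)."""
    return acts[labels == 1].mean(axis=0), acts[labels == 0].mean(axis=0)

def cosine_probe_predict(acts, mu_true, mu_false):
    """Label each activation by whichever class mean it has the higher cosine similarity to."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a, axis=-1) * np.linalg.norm(b) + 1e-8)
    return (cos(acts, mu_true) > cos(acts, mu_false)).astype(int)

# Placeholder usage; real inputs would be layer activations on true/false statements.
rng = np.random.default_rng(0)
acts, labels = rng.normal(size=(200, 512)), rng.integers(0, 2, size=200)
mu_t, mu_f = fit_class_means(acts, labels)
print("accuracy:", (cosine_probe_predict(acts, mu_t, mu_f) == labels).mean())
```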

Anyways, I just thought I'd write up the results here in the unlikely event someone finds this kind of negative result useful.

Comment by Darklight on Darklight's Shortform · 2024-01-28T23:19:06.479Z · LW · GW

Update: I made an interactive webpage where you can run the simulation and experiment with a different payoff matrix and changes to various other parameters.

Comment by Darklight on Darklight's Shortform · 2024-01-22T15:25:24.395Z · LW · GW

So, I adjusted the aggressor system to work like alliances or defensive pacts instead of a universal memory tag. Basically, players now become allies when they both cooperate and aren't already enemies, and make an enemy of anyone who defects against them first, which also sets all their allies to consider the defector an enemy. This doesn't change the result much. The alliance of nice strategies still wins the vast majority of the time.

I also tried out false flag scenarios where, 50% of the time, the victim of a first defection against a non-enemy is mistaken for the attacker. This has a small effect. There is a slight increase in the probability of an Opportunist strategy winning, but most of the time the alliance of nice strategies still wins, albeit with slightly fewer survivors on average.

My guess for why this happens is that nasty strategies rarely stay in alliances very long because they usually attack a fellow member at some point, and after enough rounds one of their false flag attempts will inevitably fail, at which point they get kicked from the alliance and retaliated against.
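For concreteness, the bookkeeping looks roughly like this (a simplified sketch, not the actual simulation code; the false-flag branch reflects one reading of who gets blamed):

```python
import random

def update_relations(a, b, moves, allies, enemies, false_flag_rate=0.5):
    """Update ally/enemy sets after one round between players a and b.
    moves: player -> 'C' or 'D'; allies, enemies: player -> set of players."""
    # Mutual cooperation between non-enemies forms an alliance.
    if moves[a] == 'C' and moves[b] == 'C' and b not in enemies[a]:
        allies[a].add(b)
        allies[b].add(a)
    # A first defection against a non-enemy normally marks the defector as an enemy of the
    # victim and all the victim's allies; with a false flag, the victim gets blamed instead.
    for attacker, victim in ((a, b), (b, a)):
        if moves[attacker] == 'D' and moves[victim] == 'C' and victim not in enemies[attacker]:
            blamed = victim if random.random() < false_flag_rate else attacker
            aggrieved = attacker if blamed is victim else victim
            for p in allies[aggrieved] | {aggrieved}:
                enemies[p].add(blamed)
                allies[p].discard(blamed)
```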

The real-world implication remains that your best bet for surviving in the long run, as a person or a civilization, appears to be playing a nice strategy, because if you play a nasty strategy you are much less likely to survive.

In the limit, if the nasty strategies win, there will only be one survivor, dog eat dog highlander style, and your odds of being that winner are 1/N, where N is the number of players. On the other hand, if you play a nice strategy, you increase the strength of the nice alliance, and when the nice alliance wins as it usually does, you're much more likely to be a survivor and have flourished together.

My simulation currently by default has 150 players, 60 of which are nice. On average about 15 of these survive to round 200, which is a 25% survival rate. This seems bad, but the survival rate of nasty strategies is less than 1%. If I switch the model to use 50 Avengers and 50 Opportunists, on average 25 Avengers survive to zero Opportunists, a 50% survival rate for the Avengers.

Thus, increasing the proportion of starting nice players increases the odds of nice players surviving, so there is an incentive to play nice.

Comment by Darklight on Darklight's Shortform · 2024-01-15T22:28:01.098Z · LW · GW

Admittedly this is a fairly simple setup without things like uncertainty and mistakes, so yes, it may not really apply to the real world. I just find it interesting that it implies that strong coordinated retribution can, at least in this toy setup, be useful for shaping the environment into one where cooperation thrives, even after accounting for power differentials and the ability to kill opponents outright, which otherwise change the game enough that straight Tit-For-Tat doesn't automatically dominate.

It's possible there are some situations where this may resemble the real world. Like, if you ignore mere accusations and focus on just actual clear cut cases where you know the aggression has occurred, such as with countries and wars, it seems to resemble how alliances form and retaliation occurs when anybody in the alliance is attacked?

I personally also see it as relevant for something like hypothetical powerful alien AGIs that can see everything that happens from space, and so there could be some kind of advanced game theoretic coordination at a distance with this. Though that admittedly is highly speculative.

It would be nice though if there were a reason to be cooperative even toward weaker entities, as that would imply that AGI could possibly have game theoretic reasons not to destroy us.

Comment by Darklight on Darklight's Shortform · 2024-01-15T17:58:33.385Z · LW · GW

Okay, so I decided to do an experiment in Python code where I modify the Iterated Prisoner's Dilemma to include Death, Asymmetric Power, and Aggressor Reputation, and run simulations to test how different strategies do. Basically, each player can now die if their points fall to zero or below, and the payoff matrix uses their points as a variable, so there is a power difference that affects what happens. Also, if a player defects first in any round of any match against a non-aggressor, they get the aggressor label, which matters for some strategies that target aggressors.

Long story short, there's a particular strategy I call Avenger, which is Grim Trigger but also retaliates against aggressors (even if the aggression was against a different player), and it ensures that the cooperative strategies (ones that never defect first against a non-aggressor) win if the game runs enough rounds. Without Avenger though, there's a chance that a single Opportunist strategy player wins instead. Opportunist will Defect when stronger and play Tit-For-Tat otherwise.

I feel like this has interesting real world implications.

Interestingly, Enforcer, which is Tit-For-Tat but also opens with Defect against aggressors, is not enough to ensure the cooperative strategies always win. For some reason you need Avenger in the mix.
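To make the setup concrete, here's a stripped-down sketch of the mechanics (not the linked code; the payoff formula and numbers are just illustrative stand-ins):

```python
class Player:
    def __init__(self, name, strategy, points=10.0):
        self.name, self.strategy, self.points = name, strategy, points
        self.aggressor = False  # permanent label: defected first against a non-aggressor

    def alive(self):
        return self.points > 0

def payoff(my_move, their_move, me, them, base=3.0):
    """Toy asymmetric payoff: exploiting a weaker player pays more, being exploited by a stronger one hurts more."""
    ratio = me.points / max(them.points, 1e-6)
    if (my_move, their_move) == ('C', 'C'): return base
    if (my_move, their_move) == ('D', 'C'): return base * ratio
    if (my_move, their_move) == ('C', 'D'): return -base * ratio
    return -1.0  # mutual defection

def avenger(me, them, history):
    """Grim Trigger that also defects on sight against any labelled aggressor."""
    if them.aggressor or any(their == 'D' for their, mine in history):
        return 'D'
    return 'C'

def opportunist(me, them, history):
    """Defect when clearly stronger, otherwise play Tit-For-Tat."""
    if me.points > 1.5 * them.points:
        return 'D'
    return history[-1][0] if history else 'C'  # copy their last move

def play_round(a, b, hist_a, hist_b):
    ma, mb = a.strategy(a, b, hist_a), b.strategy(b, a, hist_b)
    a_agg, b_agg = a.aggressor, b.aggressor
    # Defecting first (opponent hasn't defected yet this match) against a non-aggressor earns the label.
    if ma == 'D' and not any(their == 'D' for their, _ in hist_a) and not b_agg:
        a.aggressor = True
    if mb == 'D' and not any(their == 'D' for their, _ in hist_b) and not a_agg:
        b.aggressor = True
    a.points += payoff(ma, mb, a, b)
    b.points += payoff(mb, ma, b, a)
    hist_a.append((mb, ma))  # (their move, my move) from a's perspective
    hist_b.append((ma, mb))
```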

Edit: In case anyone wants the code, it's here.

Comment by Darklight on Darklight's Shortform · 2024-01-15T17:57:03.801Z · LW · GW

I was recently trying to figure out a way to calculate my P(Doom) using math. I initially tried just making a back of the envelope calculation by making a list of For and Against arguments and then dividing the number of For arguments by the total number of arguments. This led to a P(Doom) of 55%, which later got revised to 40% when I added more Against arguments. I also looked into using Bayes Theorem and actual probability calculations, but determining P(E | H) and P(E) to input into P(H | E) = P(E | H) * P(H) / P(E) is surprisingly hard and confusing.
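For what it's worth, P(E) can be expanded over the hypothesis and its negation via the law of total probability, which at least sidesteps estimating it directly. A toy example with made-up numbers (purely illustrative, not my actual estimates):

```python
p_h = 0.40             # prior P(doom)
p_e_given_h = 0.70     # P(evidence | doom)
p_e_given_not_h = 0.30 # P(evidence | no doom)

# Law of total probability: P(E) = P(E|H)P(H) + P(E|~H)P(~H)
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
posterior = p_e_given_h * p_h / p_e
print(posterior)  # ~0.61
```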

Comment by Darklight on Apologizing is a Core Rationalist Skill · 2024-01-02T20:45:17.226Z · LW · GW

Minor point, but the apology needs to sound sincere and credible, usually by being specific about the mistakes and keeping it concise and to the point, rather than like, say, Bostrom's defensive apology about the racist email a while back. Otherwise you can instead signal that you are trying to invoke the social API call in a disingenuous way, which can clearly backfire.

Things like "sorry you feel offended" also tend to sound like you're not actually remorseful for your actions and are just trying to elicit the benefits of an apology. None of the apologies you described sound anything like that, but it's a common failure state among the less emotionally mature and the syncophantic.

Comment by Darklight on Darklight's Shortform · 2023-12-27T19:22:00.449Z · LW · GW

I have some ideas and drafts for posts that I've been sitting on because I feel somewhat intimidated by the level of intellectual rigor I would need to put into the final drafts to ensure I'm not downvoted into oblivion (something a younger me experienced in the early days of Less Wrong).

Should I try to overcome this fear, or is it justified?

For instance, I have a draft of a response to Eliezer's List of Lethalities post that I've been sitting on since 2022/04/11 because I doubted it would be well received, given that it tries to be hopeful and, as a former machine learning scientist, I try to challenge a lot of LW orthodoxy about AGI in it. I have tremendous respect for Eliezer though, so I'm also uncertain whether my ideas and arguments are just harebrained foolishness that will be shot down rapidly once exposed to the real world and the incisive criticism of Less Wrongers.

The posts here are also now of such high quality that I feel the bar is too high for me to meet with my writing, which tends to be more "interesting train-of-thought in unformatted paragraphs" than the "point-by-point articulate with section titles and footnotes" style that people tend to employ.

Anyone have any thoughts?

Comment by Darklight on Could induced and stabilized hypomania be a desirable mental state? · 2023-06-14T18:35:03.685Z · LW · GW

I would be exceedingly cautious about this line of reasoning. Hypomania tends not to be sustainable, with a tendency either to spiral into a full-blown manic episode, or to exhaust itself and lead to an eventual depressive episode. This seems to have something to do with the characteristics of the thoughts/feelings/beliefs that develop while hypomanic, the cognitive dynamics if you will. You'll tend to become increasingly overconfident and positive, to the point that you will either start to lose contact with reality by ignoring evidence contrary to what you think is happening (because you feel like everything is awesome, so it must be), or reality will hit you hard when the good things you expect to happen don't, and you update accordingly (often overcompensating in the process).

In that sense, it's very hard to stay "just" hypomanic. And honestly, to my knowledge, most psychiatrists are more worried about potential manic episodes than anything else in bipolar disorder, and will put you on enough antipsychotics to make you a depressed zombie to prevent them, because generally speaking the full-on, psychosis-level manic episodes are just more dangerous for everyone involved.

Ideally, I think your mood should fit your circumstances. Hypomania often shows up as inappropriately high positive mood even in situations where it makes little sense to be so euphoric, and that should be a clear indicator of why it can be problematic.

It can be tempting to want to stay in some kind of controlled hypomania, but in reality, this isn't something that to my knowledge is doable with our current science and technology, at least for people with actual bipolar disorder. It's arguable that for individuals with normally stable mood, putting them on stimulants could have a similar effect as making them a bit hypomanic (not very confident about this though). Giving people with bipolar disorder stimulants that they don't otherwise need on the other hand is a great way to straight up induce mania, so I definitely wouldn't recommend that.

Comment by Darklight on Yoshua Bengio: How Rogue AIs may Arise · 2023-05-24T17:14:40.810Z · LW · GW

I still remember when I was a master's student presenting a paper at the Canadian Conference on AI 2014 in Montreal. Bengio was also at the conference presenting a tutorial, and during the Q&A afterwards, I asked him a question about AI existential risk. I think I worded it back then as being concerned about the possibility of Unfriendly AI or a dangerous optimization algorithm or something like that, as it was after I'd read the sequences but before "existential risk" was popularized as a term. Anyway, he responded by jokingly asking if I was a journalist, and then I vaguely recall him giving a hedged answer about how current AI was still very far away from those kinds of concerns.

It's good to see he's taking these concerns a lot more seriously these days. Between him and Hinton, we have about half of the Godfathers of AI (missing LeCun and Schmidhuber if you count him as one of them) showing seriousness about the issue. With any luck, they'll push at least some of their networks of top ML researchers into AI safety, or at the very least make AI safety more esteemed among the ML research community than before.

Comment by Darklight on How Does the Human Brain Compare to Deep Learning on Sample Efficiency? · 2023-01-15T22:52:01.797Z · LW · GW

The average human lifespan is about 70 years, or approximately 2.2 billion seconds. The average human brain contains about 86 billion neurons and roughly 100 trillion synaptic connections. In comparison, something like GPT-3 has 175 billion parameters and 500 billion tokens of data. Assuming, very crudely, weight/synapse and token/second-of-experience equivalence, the human model's ratio of parameters to data is much greater than GPT-3's, to the point that humans have significantly more parameters than timesteps (100 trillion to 2.2 billion), while GPT-3 has significantly fewer parameters than timesteps (175 billion to 500 billion). Granted, the information gain per timestep differs between the two models, but as I said, these are crude approximations meant to convey the ballpark relative difference.
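Spelled out (with the crude equivalences of synapse ~ parameter and second of experience ~ token):

```python
human_params = 100e12                  # ~100 trillion synapses
human_data = 70 * 365.25 * 24 * 3600   # ~2.2 billion seconds of lifetime experience
gpt3_params = 175e9
gpt3_data = 500e9                      # training tokens

print(human_params / human_data)       # ~45,000 parameters per "timestep"
print(gpt3_params / gpt3_data)         # ~0.35 parameters per token
```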

This basically means that humans are much more prone to overfitting the data, and in particular, to memorizing individual data points. This is why humans experience episodic memory of unique events. It's not clear that GPT-3 has the capacity in terms of parameters to memorize its training data with that level of clarity, and arguably this is why such models seem less sample efficient. A human can learn from a single example by memorizing it and retrieving it later when relevant. GPT-3 has to see it enough times in the training data for SGD to update the weights sufficiently that the general concept is embedded in the highly compressed information model.

It's thus not certain whether existing ML models are sample inefficient because of the algorithms being used, or because they just don't have enough parameters yet and increased efficiency will emerge from further scaling.

Comment by Darklight on Darklight's Shortform · 2022-09-05T17:46:56.819Z · LW · GW

I recently interviewed with Epoch, and as part of a paid work trial they wanted me to write up a blog post about something interesting related to machine learning trends. This is what I came up with:

http://www.josephius.com/2022/09/05/energy-efficiency-trends-in-computation-and-long-term-implications/

Comment by Darklight on What does moral progress consist of? · 2022-08-20T22:29:50.447Z · LW · GW

I should point out that the logic of the degrowth movement follows from a relatively straightforward analysis of available resources vs. first world consumption levels.  Our world can only sustain 7 billion human beings because the vast majority of them live not at first world levels of consumption, but third world levels, which many would argue to be unfair and an unsustainable pyramid scheme.  If you work out the numbers, if everyone had the quality of life of a typical American citizen, taking into account things like meat consumption relative to arable land, energy usage, etc., then the Earth would be able to sustain only about 1-3 billion such people.  Degrowth thus follows logically if you believe that all the people around the world should eventually be able to live comfortable, first world lives.

I'll also point out that socialism is, like liberalism, a child of the Enlightenment and general beliefs that reason and science could be used to solve political and economic problems.  Say what you will about the failed socialist experiments of the 20th century, but the idea that government should be able to engineer society to function better than the ad-hoc arrangement that is capitalism, is very much an Enlightenment rationalist, materialist, and positivist position that can be traced to Jean-Jacques Rousseau, Charles Fourier, and other philosophes before Karl Marx came along and made it particularly popular.  Marxism in particular, at least claims to be "scientific socialism", and historically emphasized reason and science, to the extent that most Marxist states were officially atheist (something you might like given your concerns about religions).

In practice, many modern social policies, such as the welfare state, Medicare, public pensions, etc., are heavily influenced by socialist thinking and put in place in part as a response by liberal democracies to the threat of the state socialist model during the Cold War.  No country in the world runs on laissez-faire capitalism, we all utilize mixed market economies with varying degrees of public and private ownership.  The U.S. still has a substantial public sector, just as China, an ostensibly Marxist Leninist society in theory, has a substantial private sector (albeit with public ownership of the "commanding heights" of the economy).  It seems that all societies in the world eventually compromised in similar ways to achieve reasonably functional economies balanced with the need to avoid potential class conflict.  This convergence is probably not accidental.

If you're truly more concerned with truth seeking than tribal affiliations, you should be aware of your own tribe, which as far as I can tell, is western, liberal, and democratic.  Even if you honestly believe in the moral truth of the western liberal democratic intellectual tradition, you should still be aware that it is, in some sense, a tribe.  A very powerful one that is arguably predominant in the world right now, but a tribe nonetheless, with its inherent biases (or priors at least) and propaganda.

Just some thoughts.

Comment by Darklight on Thoughts On Computronium · 2022-06-17T16:29:16.111Z · LW · GW

I'm using the number calculated by Ray Kurzweil for his 1999 book, The Age of Spiritual Machines.  To get that figure, you need 100 billion neurons firing every 5 ms, i.e., at 200 Hz.  That is based on the maximum firing rate given refractory periods.  In actuality, average firing rates are usually lower than that, so in all likelihood the difference isn't actually six orders of magnitude.  In particular, I should point out that six orders of magnitude refers to the difference between this hypothetical maximum firing brain and the most powerful supercomputer, not the most energy efficient supercomputer.

The difference between the hypothetical maximum firing brain and the most energy efficient supercomputer (at 26 GigaFlops/watt) is only three orders of magnitude.  For the average brain firing at the speed that you suggest, it's probably closer to two orders of magnitude.  Which would mean that the average human brain is probably one order of magnitude away from the Landauer limit.

This also assumes that it's neurons and not synapses that should be the relevant multiplier.

Comment by Darklight on AGI Safety FAQ / all-dumb-questions-allowed thread · 2022-06-12T15:08:49.546Z · LW · GW

Okay, so I contacted 80,000 hours, as well as some EA friends for advice.  Still waiting for their replies.

I did hear from an EA who suggested that if I don't work on it, someone else who is less EA-aligned will take the position instead, so in fact it's slightly net positive for me to be in the industry, although I'm uncertain whether AI capabilities work is actually funding-constrained rather than personnel-constrained.

Also, would it be possible to mitigate the net negative by deliberately avoiding capability research and instead taking an ML engineering job at a lower-tier company that is unlikely to develop AGI before others, just applying existing ML tech to practical problems?

Comment by Darklight on AGI Safety FAQ / all-dumb-questions-allowed thread · 2022-06-09T19:47:29.602Z · LW · GW

I previously worked as a machine learning scientist but left the industry a couple of years ago to explore other career opportunities.  I'm wondering at this point whether or not to consider switching back into the field.  In particular, in case I cannot find work related to AI safety, would working on something related to AI capability be a net positive or net negative impact overall?

Comment by Darklight on Thoughts On Computronium · 2021-03-04T02:27:24.945Z · LW · GW

Even further research shows the most recent Nvidia RTX 3090 is actually slightly more efficient than the 1660 Ti, at 36 TeraFlops, 350 watts, and 2.2 kg, which works out to 0.0001 PetaFlops/Watt and 0.016 PetaFlops/kg.  Once again, they're within an order of magnitude of the supercomputers.

Comment by Darklight on Thoughts On Computronium · 2021-03-04T01:56:23.595Z · LW · GW

So, I did some more research, and the general view is that GPUs are more power efficient in terms of Flops/watt than CPUs, and the most power efficient of those right now is the Nvidia 1660 Ti, which comes to 11 TeraFlops at 120 watts, so 0.000092 PetaFlops/Watt, which is about 6x more efficient than Fugaku.  It also weighs about 0.87 kg, which works out to 0.0126 PetaFlops/kg, which is about 7x more efficient than Fugaku.  These numbers are still within an order of magnitude, and also don't take into account the overhead costs of things like cooling, case, and CPU/memory required to coordinate the GPUs in the server rack that one would assume you would need.
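The arithmetic behind those figures, for anyone who wants to check it (TeraFlops divided by 1000 to get PetaFlops):

```python
gpus = {
    "Nvidia 1660 Ti": (11.0, 120.0, 0.87),  # TeraFlops, watts, kg
    "Nvidia RTX 3090": (36.0, 350.0, 2.2),
}
for name, (tflops, watts, kg) in gpus.items():
    pflops = tflops / 1000
    print(f"{name}: {pflops / watts:.6f} PetaFlops/Watt, {pflops / kg:.4f} PetaFlops/kg")
# Nvidia 1660 Ti:  0.000092 PetaFlops/Watt, 0.0126 PetaFlops/kg
# Nvidia RTX 3090: 0.000103 PetaFlops/Watt, 0.0164 PetaFlops/kg
```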

I used the supercomputers because the numbers were a bit easier to get from the Top500 and Green500 lists, and I also thought that their numbers include the various overhead costs to run the full system, already packaged into neat figures.

Comment by Darklight on Darklight's Shortform · 2021-02-20T15:09:01.110Z · LW · GW

Another thought is that maybe Less Wrong itself, if it were to expand in size and become large enough to roughly represent humanity, could be used as such a dataset.

Comment by Darklight on Darklight's Shortform · 2021-02-20T15:01:34.314Z · LW · GW

So, I had a thought.  The glory system idea that I posted about earlier, if it leads to a successful, vibrant democratic community forum, could actually serve as a kind of dataset for value learning.  If each post has a number attached to it that indicates the aggregated approval of human beings, this can serve as a rough proxy for a kind of utility or Coherent Aggregated Volition.

Individual examples will probably be quite noisy, but averaged across a large number of posts, it could function as a real-world dataset, with the post content being the input and the post's vote tally being the output label. You could then train a supervised learning classifier or regressor that could be used to guide a Friendly AI model, like a trained conscience.
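As a toy illustration of the kind of supervised learner I mean (hypothetical data; a real attempt would use a language model rather than bag-of-words features):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Hypothetical forum posts with their aggregated vote tallies as approval labels.
posts = ["Helping others when you can is worthwhile.", "Deceiving people for personal gain is fine."]
votes = [42.0, -17.0]

model = make_pipeline(TfidfVectorizer(), Ridge())
model.fit(posts, votes)
print(model.predict(["Donating to effective charities seems good."]))  # higher = more predicted approval
```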

This admittedly would not be provably Friendly, but as an angle of attack on the value learning problem, it is relatively straightforward to implement and probably more feasible in the short run than anything else I've encountered.

Comment by Darklight on The Glory System: A Model For Moral Currency And Distributed Self-Moderation · 2021-02-19T21:18:20.910Z · LW · GW

A further thought is that those with more glory can be seen almost as elected experts.  Their glory is assigned to them by votes after all.  This is an important distinction from an oligarchy.  I would actually be inclined to see the glory system as located on a continuum between direct democracy and representative democracy.

Comment by Darklight on The Glory System: A Model For Moral Currency And Distributed Self-Moderation · 2021-02-19T21:00:22.655Z · LW · GW

So, keep in mind that having the first vote free and worth double the paid votes does tilt things more towards democracy.  That being said, I am inclined to see glory as a kind of proxy for past agreement and merit, and a rough way to approximate liquid democracy, where you can proxy your vote to others or vote yourself.

In this alternative "market of ideas" the ideas win out because people who others trust to have good opinions are able to leverage that trust.  Decisions over the merit of the given arguments are aggregated by vote.  As long as the population is sufficiently diverse, this should result in an example of the Wisdom of Crowds phenomenon.

I don't think it'll dissolve into a mere flag waving contest, any more than the existing Karma system on Reddit and Less Wrong does already.

Comment by Darklight on The Glory System: A Model For Moral Currency And Distributed Self-Moderation · 2021-02-19T18:25:15.896Z · LW · GW

Perhaps a nitpick detail, but having someone rob them would not be equivalent, because the cost of the action is offset by the ill-gotten gains.  The proposed currency is more directly equivalent to paying someone to break into the target's bank account and destroy their assets by a proportional amount so that no one can use them anymore.

As for the more general concerns:

Standardized laws and rules tend in practice to disproportionately benefit those with the resources to bend and manipulate those rules with lawyers.  Furthermore, this proposal does not need to replace all laws, but can be utilized alongside them as a way for people to show their disapproval in a way that is more effective than verbal insult, and less coercive than physical violence.  I'd consider it a potential way to channel people's anger so that they don't decide to start a revolution against what they see as laws that benefit the rich and powerful.  It is a way to distribute a little power to individuals and allow them to participate in a system that considers their input in a small but meaningful way.

Laws may be more consistent as rules, but in practice they are also contentious, in the sense that the process of creating them is arcane and complex and the resulting punishments are often delayed for years as they work through the legal system.  Again, this makes sense when determining how the coercive power of the state should be applied, but leaves something to be desired in terms of responsiveness to real world concerns.

Third-party enforcement is certainly desirable.  In practice, the glory system allows anyone outside the two parties to contribute and likely the bulk of votes will come from them.  As for cycles of violence, the exchange rate mechanism means that defence is at least twice as effective as attack with the same amount of currency, which should at least mitigate the cycles because it won't be cost-effective to attack without significant public support.  Though this is only relevant to the forum condition.

In the general condition as a currency, keep in mind that as a currency functions as a store of value, there is a substantial opportunity cost to spending the currency to destroy other people's currency rather than say, using it to accrue interest.  The cycles are in a sense self-limiting because people won't want to spend all their money escalating a conflict that will only cause both sides to hemorrhage funds, unless someone feels so utterly wronged as to be willing to go bankrupt to bankrupt another, in which case, one should honestly be asking what kind of injustice caused this situation to come into being in the first place.

All that being said, I appreciate the critiques.

Comment by Darklight on The Glory System: A Model For Moral Currency And Distributed Self-Moderation · 2021-02-19T17:51:34.274Z · LW · GW

As for the cheaply punishing prolific posters problem, I don't know a good solution that doesn't lead to other problems, as forcing all downvotes to cost glory makes it much harder to deal with spammers who somehow get through the application process filter.  I had considered an alternative system in which all votes cost glory, but then there's no way to generate glory except perhaps by having admins and mods gift them, which could work, but runs counter to the direct democracy ideal that I was sorta going for.

Comment by Darklight on The Glory System: A Model For Moral Currency And Distributed Self-Moderation · 2021-02-19T17:37:44.308Z · LW · GW

What I meant was you could farm upvotes on your posts.  Sorry.  I'll edit it for clarity.

Comment by Darklight on The Glory System: A Model For Moral Currency And Distributed Self-Moderation · 2021-02-19T17:34:15.864Z · LW · GW

And further to clarify, you'd both be able to gift glory and also spend glory to destroy other people's glory, at the mentioned exchange rate.

The way glory is introduced into the system is that any given post allows everyone one free vote on them that costs no glory.

Comment by Darklight on The Glory System: A Model For Moral Currency And Distributed Self-Moderation · 2021-02-19T17:32:25.244Z · LW · GW

So, I guess I should clarify, the idea is that you can both gift glory, which is how you gain the ability to post, and also you gain or lose glory based on people's upvotes and downvotes on your posts.
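To make the mechanics concrete, here's a toy sketch of the accounting (the 2:1 destroy cost and 1-glory paid votes are illustrative stand-ins for the exchange rate discussed above):

```python
class GlorySystem:
    FREE_VOTE_WEIGHT = 2.0  # the one free vote per post counts double a paid vote
    DESTROY_COST = 2.0      # spending 2 glory destroys 1 glory of the target

    def __init__(self):
        self.glory = {}         # user -> balance
        self.free_used = set()  # (voter, post_id) pairs that already used their free vote

    def vote(self, voter, author, post_id, up=True):
        sign = 1 if up else -1
        if (voter, post_id) not in self.free_used:   # free vote: double weight, costs nothing
            self.free_used.add((voter, post_id))
            self.glory[author] = self.glory.get(author, 0) + sign * self.FREE_VOTE_WEIGHT
        elif self.glory.get(voter, 0) >= 1:          # additional votes are paid for by the voter
            self.glory[voter] -= 1
            self.glory[author] = self.glory.get(author, 0) + sign

    def gift(self, giver, receiver, amount):
        if self.glory.get(giver, 0) >= amount:
            self.glory[giver] -= amount
            self.glory[receiver] = self.glory.get(receiver, 0) + amount

    def destroy(self, attacker, target, amount):
        cost = amount * self.DESTROY_COST
        if self.glory.get(attacker, 0) >= cost:
            self.glory[attacker] -= cost
            self.glory[target] = self.glory.get(target, 0) - amount
```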

Comment by Darklight on As a Washed Up Former Data Scientist and Machine Learning Researcher What Direction Should I Go In Now? · 2020-10-19T22:05:36.402Z · LW · GW

I have been able to land interviews at a rate of about 8/65 or 12% of the positions I apply to.  My main assumption is that the timing of COVID-19 is bad, and I'm also only looking at positions in my geographical area of Toronto.  It's also possible that I was overconfident early on and didn't prep enough for the interviews I got, which often involved general coding challenges that depended on data structures and algorithms that I hadn't studied since undergrad, as well as ML fundamentals for things like PCA that I hadn't touched in a long time as my research work has been deep learning focused.

As for corporate politics and how to handle them rationally, I'm not entirely sure I can be much help, as to be honest, I'm not entirely clear on what happened to cause the situation that I got myself into.

Perhaps the thing I could suggest is to be tactful and avoid giving people an excuse or opportunity to sideline you, and never assume that you can work with anyone without issue.  Toxic or hostile managers especially can make you miserable and prevent you from being successful, so noticing such people in advance and avoiding having to depend on their performance appraisals is probably a good idea.

Most people in business seem focused on performing and getting results, and some of them are wary of others who could overtake them, and so you need to balance showing your value with not seeming threatening to their position.  I was in an awkward position that my immediate manager and I didn't get along, but the director of the department who originally hired me protected me from too much reprisal.  However, he needed me to perform better to be able to advocate for me effectively, and it was difficult to do so under the person I was directly under.

Such situations can arise and get quite complicated.  I wish I could say you can use the tools of rationality to reason with anyone and convince them to work cooperatively on team goals, but I found that some people are less amenable than others.  Furthermore, if someone makes an attack against you in corporate politics, chances are you won't see it coming, using a subordinate to strike indirectly, and those involved will straight up ignore your communications or give you the runaround in such a way that you won't be sure who is actually responsible for what.  Many meetings are behind closed doors, and there is a clear limit to the information you will have relative to your superiors, which can make it difficult to defend yourself even if you know something is going on.

I guess another thing I can add is that probably a large part of why I was able to avoid being fired was that I had substantial documentation, including a detailed research journal, and a spreadsheet of my working hours to back me up.  When trying to be a rational and honest worker in the corporate world, a paper trail is protection and a good way to ensure that the compliance department and HR will be on your side when it counts.

Also, beware that if you let certain types of people get away with one seemingly small thing, they will see that as weakness and conclude that you are exploitable.  Know your boundaries and the regulations of the company.  Bullies are not just a schoolyard problem; in the office, they're much smarter and know how to get away with things.  Sometimes these people are also good enough at their jobs that you will not be able to do anything to them because the company needs what they provide.  That is life.  Pick your battles and don't allow unfair situations and difficulties to make you lose sleep and perform worse.  Do the best you can to do your job well, such that you are beyond reproach if possible.  Be aware that things can spiral.  If you lose sleep over something that happened, and this makes you late for work the next day, you've given your detractors ammunition.

That's all I can think of right now.

Edit:  As an example of how clever other people can be at office politics, I was once put in a kind of double bind or trap situation that was similar to a fork in Chess.  Basically, I was told by a manager not to push some code into a repository, ostensibly because we'd just given privileges to someone who had been hired by a different department and who we suspected might steal the code for that department (there's a horse race culture at the corporation).  Here's the thing, if I did what he told me to, this repo would be empty and I'd have no independent evidence that my current project had made any progress, leaving me vulnerable to him accusing me of not doing work, or he could deny that he told me not to put in the code, making it look like I was concealing stuff from the company.  If I refused to go along and instead pushed the code, I would be insubordinate and disloyal to my department and his managers, who he claimed had told him to tell me what to do.

Comment by Darklight on Any Christians Here? · 2017-06-15T00:07:43.188Z · LW · GW

Actually, apparently I forgot about the proper term: Utilitronium

Comment by Darklight on Any Christians Here? · 2017-06-14T01:04:34.654Z · LW · GW

> I would urge you to go learn about QM more. I'm not going to assume what you do/don't know, but from what I've learned about QM there is no argument for or against any god.

Strictly speaking it's not something that is explicitly stated, but I like to think that the implication flows from a logical consideration of what MWI actually entails. Obviously MWI is just one of many possible alternatives in QM as well, and the Copenhagen Interpretation obviously doesn't suggest anything.

> This also has to do with the distance between the moon and the earth and the earth and the sun. Either or both could be different sizes, and you'd still get a full eclipse if they were at different distances. Although the first test of general relativity was done in 1919, it was found later that the test done was bad, and later results from better replications actually provided good enough evidence. This is discussed in Stephen Hawking's A Brief History of Time.

The point is that they are a particular ratio that makes them ideal for these conditions, when they could have easily been otherwise, and that these are exceptionally convenient coincidences for humanity.

> There are far more stars than habitable worlds. If you're going to be consistent with assigning probabilities, then by looking at the probability of a habitable planet orbiting a star, you should conclude that it is unlikely a creator set up the universe to make it easy or even possible to hop planets.

The stars also make it possible for us to use telescopes to identify which planets are in the habitable zone. It remains much more convenient than if all star systems were obscured by a cloud of dust, which I can easily imagine being the norm in some alternate universe.

> Right, the sizes of the moon and sun are arbitrary. We could easily live on a planet with no moon, and have found other ways to test General Relativity. No appeal to any form of the Anthropic Principle is needed. And again with the assertion about habitable planets: the anthropic principle (weak) would only imply that to see other inhabitable planets, there must be an inhabitable planet from which someone is observing.

Again, the point is that these are very notable coincidences that would be more likely to occur in a universe with some kind of advanced ordering.

> So you didn't provide any evidence for any god; you just committed a logical fallacy of the argument from ignorance.

When I call this evidence, I am using it in the probabilistic sense, that the probability of the evidence given the hypothesis is higher than the probability of the evidence by itself. Even though these things could be coincidences, they are more likely to occur in a controlled universe meant for habitation by sentient beings. In that sense I consider this evidence.

I don't know why you bring up the argument from ignorance. I haven't proclaimed that this evidence conclusively proves anything. Evidence is not proof.

> The way I view the universe, everything you state is still valid. I see the universe as a period of asymmetry, where complexity is allowed to clump together, but it clumps in regular ways defined by rules we can discover and interpret.

Why though? Why isn't the universe simply chaos without order? Why is it consistent such that the spacetime metric is meaningful? The structure and order of reality itself strikes me as peculiar given all the possible configurations that one can imagine. Why don't things simply burst into and out of existence? Why do cause and effect dominate reality as they do? Why does the universe have a beginning and such uneven complexity rather than just existing forever as a uniform Bose-Einstein condensate of near zero state, low entropy particles?

To me, the mark of a true rationalist is an understanding of the nature of truth. And the truth is that the truth is uncertain. I don't pretend like the interesting coincidences are proof of God. To be intellectually honest, I don't know that there is a God. I don't know that the universe around me isn't just a simulation I'm being fed either though. Ultimately we have to trust our senses and our reasoning, and accept tentatively some beliefs as more likely than others, and act accordingly. The mark of a good rationalist is a keen awareness of their own limited degree of awareness of the truth. It is a kind of humility that leads to an open mind and a willingness to consider all possibilities, weighed according to the probability of the evidence associated with them.

Comment by Darklight on Any Christians Here? · 2017-06-14T00:26:39.926Z · LW · GW

> Interesting, what is that?

The idea of theistic evolution is simply that evolution is the method by which God created life. It basically says, yes, the scientific evidence for natural selection and genetic mutation is there and overwhelming, and accepts these as valid, while at the same time positing that God can still exist as the cause that set the universe and evolution in motion through putting in place the Laws of Nature. It requires not taking the six days thing in the Bible literally, but rather metaphorically as being six eons of time, or some such. The fact that sea creatures precede land creatures, which precede humans, suggests that the general order described in scripture is consistent with established science as well.

> Are you familiar with the writings of Frank J. Tipler?

I have heard of Tipler and his writings, though I have yet to actually read his books.

> That would be computronium-based I suppose.

Positronium in this case means "Positive Computronium", yes.

Comment by Darklight on Looking for machine learning and computer science collaborators · 2017-06-13T05:37:40.784Z · LW · GW

I might be able to collaborate. I have a masters in computer science and did a thesis on neural networks and object recognition, before spending some time at a startup as a data scientist doing mostly natural language related machine learning stuff, and then getting a job as a research scientist at a larger company to do similar applied research work.

I also have two published conference papers under my belt, though they were in pretty obscure conferences admittedly.

As a plus, I've also read most of the sequences and am familiar with the Less Wrong culture, and have spent a fair bit of time thinking about the Friendly/Unfriendly AI problem. I even came up with an attempt at a thought experiment to convince an AI to be friendly.

Alas, I am based near Toronto, Ontario, Canada, so distance might be an issue.

Comment by Darklight on Open thread, June. 12 - June. 18, 2017 · 2017-06-13T05:22:01.933Z · LW · GW

Well, as far as I can tell, the latest progress in the field has come mostly through throwing deep learning techniques like bidirectional LSTMs at the problem and letting the algorithms figure everything out. This obviously is not particularly conducive to advancing the theory of NLP much.

Comment by Darklight on Any Christians Here? · 2017-06-13T04:47:45.569Z · LW · GW

I consider myself both a Christian and a rationalist, and I have read much of the sequences and mostly agree with them, albeit I somewhat disagree with the metaethics sequence and have been working on a lengthy rebuttal to it for some time. I never got around to completing it though, as I felt I needed to be especially rigorous and simply did not have the time and energy to make it sufficiently so, but the gist is that Eliezer's notion of fairness is actually much closer to what real morality is, which is a form of normative truth. In terms of moral philosophy I adhere to a form of Eudaimonic Utilitarianism, and see this as being consistent with the central principles of Christianity. Metaethically, I am a moral universalist.

Aside from that, I don't consider Christianity and rationality to be opposed, but I will emphasize that I am very much a liberal Christian, one who is a theistic evolutionist and believes that the Bible needs to be interpreted contextually and with broad strokes, emphasizing overarching themes rather than individual cherry-picked verses. Furthermore, I tend to see no contradiction in identifying the post-Singularity Omega as being what will eventually become God, and actually find support from scriptures that call God "the Alpha and Omega", and "I AM WHO I WILL BE" (the proper Hebrew translation of the Tetragrammaton or "Yahweh").

I also tend to rely fairly heavily on the idea that we as rational humans should be humble about our actual understanding of the universe, and that God, if such a being exists, would have perfect information and therefore be a much better judge of what is good or evil than us. I am willing to take a leap of faith to try to connect with such a being, and respect that the universe might very well be constructed in such a way as the maximize the long run good. It probably goes without saying that I also reject the Orthogonality Thesis, specifically for the special case of perfect intelligence. A perfect intelligence with perfect information would naturally see the correct morality and be motivated by the normativity of such truths to act in accordance with them.

This justifies the notion of perhaps a very basic theism. The reason why I accept the central precepts of Christianity has more to do with the teachings of Jesus being very consistent with my understanding of Eudaimonic Utilitarianism, as well as the higher order justice that I believe is preserved by Jesus' sacrifice. In short, God is ultimately responsible for everything, including sin, so sacrificing an incarnation of God (Jesus) to redeem all sentient beings is both merciful and just.

Also, I consider heaven to be central to God being a benevolent utilitarian "Goodness Maximizer". Heaven is in all likelihood some kind of complex simulation or positronium-based future utopia, and ensuring that nearly all sentient beings are (with the help of time travel) mind-uploaded to it in some form or state is very likely to bring about Eudaimonia optimization. Thus, the degree of suffering that occurs in this life on Earth, is in all likelihood justifiable as long as it leads to the eventual creation of eternal life in heaven, because eternal life in heaven = infinite happiness.

As to the likelihood of a God actually existing, I posit that with the Many Worlds Interpretation of Quantum Mechanics, a benevolent God is more likely than not going to exist somewhere. And such a God would be powerful and benevolent enough to be both able and willing to expand to all universes across the multiverse in order to establish heaven as inclusively as possible, if not also to create the multiverse via time travel.

As to evidence for the existence of a God... were you aware that the ratio of sizes between the Sun and the Moon just happens to be exactly right for there to be total solar eclipses? And that this peculiar coincidence was pivotal in allowing Einstein's Theory of Relativity to be proven in 1919? How about the odd fact that the universe seems to be filled with giant burning beacons called stars, which simultaneously provide billions of years of light energy and basically flag the locations of potentially habitable worlds for future colonization? These may seem like trivial coincidences to you, but I see them as rather too convenient to be random developments, given the space of all possible universe configurations. They are not essential to sapient life, and so they do not meet the criteria for the Anthropic Principle either.

Anyways, this is getting way beyond the original scope or point of this post, which was just to point out that Christian rationalist Lesswrongers do exist more or less. I'm pretty sure I'm well in the minority though.

Comment by Darklight on OpenAI makes humanity less safe · 2017-04-27T01:31:18.147Z · LW · GW

I don't really know enough about business and charity structures and organizations to answer that quite yet. I'm also not really sure where else would be a productive place to discuss these ideas. And I doubt I or anyone else reading this has the real resources to attempt to build a safe AI research lab from scratch that could actually compete with the major organizations like Google, Facebook, or OpenAI, which all have millions to billions of dollars at their disposal, so this is kind of an idle discussion. I'm actually working for a larger tech company now than the startup from before, so for the time being I'll be kinda busy with that.

Comment by Darklight on OpenAI makes humanity less safe · 2017-04-24T00:32:32.680Z · LW · GW

That is a hard question to answer, because I'm not a foreign policy expert. I'm a bit biased towards Canada because I live there and we already have a strong A.I. research community in Montreal and around Toronto, but I'll admit Canada as a middle power in North America is fairly beholden to American interests as well. Alternatively, some reasonably peaceful, stable, and prosperous democratic country like say, Sweden, Japan, or Australia might make a lot of sense.

It may even make some sense to have the headquarters be more a figurehead, and have the company operate as a federated decentralized organization with functionally independent but cooperating branches in various countries. I'd probably avoid establishing such branches in authoritarian states like China or Iran, mostly because such states would have a much easier time arbitrarily taking over control of the branches on a whim, so I'd probably stick to fairly neutral or pacifist democracies that have a good history of respecting the rule of law, both local and international, and which are relatively safe from invasion or undue influence by the great powers of U.S., Russia, and China.

Though maybe an argument can be made to intentionally offset the U.S. monopoly by explicitly setting up shop in another great power like China, but that runs the risks I mentioned earlier.

And I mean, if you could somehow acquire a private ungoverned island in the Pacific or an offshore platform, or an orbital space station or base on the moon or mars, that would be cool too, but I highly doubt that's logistically an option for the foreseeable future, not to mention it could attract some hostility from the existing world powers.

Comment by Darklight on Net Utility and Planetary Biocide · 2017-04-10T00:01:12.376Z · LW · GW

I've had arguments before with negative-leaning Utilitarians and the best argument I've come up with goes like this...

Proper Utility Maximization needs to take into account not only the immediate, currently existing happiness and suffering of the present slice of time, but also the net utility of all sentient beings throughout all of spacetime. Assuming that the Eternal Block Universe Theory of Physics is true, then past and future sentient beings do in fact exist, and therefore matter equally.

Now the important thing to stress here is then that what matters is not the current Net Utility today but overall Net Utility throughout Eternity. Two basic assumptions can be made about the trends through spacetime. First, that compounding population growth means that most sentient beings exist in the future. Second, that melioristic progress means that the conscious experience is, all other things being equal, more positive in the future than in the past, because of the compounding effects of technology, and sentient beings deciding to build and create better systems, structures, and societies that outlive the individuals themselves.

Sentient agents are not passive, but actively seek positive conscious experiences and try to create circumstances that will perpetuate such things. Thus, as the power of sentient beings to influence the state of the universe increases, so should the ratio of positive to negative. Other things, such as the psychological negativity bias, remain stable throughout history, but compounding factors instead trend upwards at usually an exponential rate.

Thus, assuming these trends hold, we can expect that the vast majority of conscious experiences will be positive, and the overall universe will be net positive in terms of utility. Does that suck for us who live close to the beginning of civilization? Kinda yes. But from a Utilitarian perspective, it can be argued that our suffering is for the Greatest Good, because we are the seeds, the foundation from which so much will have its beginnings.

Now, this can be countered that we do not know that the future really exists, and that humanity and its legacy might well be snuffed out sooner rather than later. In fact, the fact that we are born here now, can be seen as statistical evidence for this, because if on average you are most likely to be born at the height of human existence, then this period of time is likely to be around the maximum point before the decline.

However, we cannot be sure about this. Also, if the Many Worlds Interpretation of Quantum Mechanics is true, then even if humanity ceases to exist around this time in most worlds, there still exists a non-trivial percentage of worlds where humanity survives into the far distant future, establishing a legacy among the stars and creating relative utopia through the compound effects aforementioned. For the sake of these possible worlds, and their extraordinarily high expected utility, I would recommend trying to keep life and humanity alive.

Comment by Darklight on Unethical Human Behavior Incentivised by Existence of AGI and Mind-Uploading · 2017-04-07T20:36:48.229Z · LW · GW

Well, if we're implying that time travellers could go back and invisibly copy you at any point in time and then upload you to whatever simulation they feel inclined towards... I don't see how blendering yourself now will prevent them from just going to the moment before that and copying that version of you.

So, reality is that blendering yourself achieves only one thing, which is to prevent the future possible yous from existing. Personally I think that does a disservice to future you. That can similarly be expanded to others. We cannot conceivably prevent copying and mind uploading of anyone by super advanced time travellers. Ultimately that is outside of our locus of control and therefore not worth worrying about.

What is more pressing I think are the questions of how we are practically acting to improve the positive conscious experiences of existing and potentially existing sentient beings, and encouraging the general direction towards heaven-like simulation, and discouraging sadistic hell-like simulation. These may not be preventable, but our actions in the present should have outsized impact on the trillions of descendents of humanity that will likely be our legacy to the stars. Whatever we can do then to encourage altruism and discourage sadism in humanity now, may very well determine the ratios of heaven to hell simulations that those aforementioned time travellers may one day decide to throw together.

Comment by Darklight on Open thread, Apr. 03 - Apr. 09, 2017 · 2017-04-06T10:22:33.793Z · LW · GW

I recently made an attempt to restart my Music-RNN project:

https://www.youtube.com/playlist?list=PL-Ewp2FNJeNJp1K1PF_7NCjt2ZdmsoOiB

Basically went and made the dataset five times bigger and got... a mediocre improvement.

The next step is to figure out Connectionist Temporal Classification and attempt to implement Text-To-Speech with it. And somehow incorporate pitch recognition as well so I can create the next Vocaloid. :V

Also, because why not brag while I'm here, I have an attempt at an Earthquake Predictor in the works... right now it only predicts the high-frequency, low-magnitude quakes, rather than the low-frequency, high-magnitude quakes that would actually be useful... you can see the site where I would be posting daily updates if I weren't so lazy...

http://www.earthquakepredictor.net/

Other than that... I was recently also working on holographic word vectors in the same vein as Jones & Mewhort (2007), but shelved that because I could not figure out how to normalize/standardize the blasted things reliably enough to get consistent results across different random initializations.
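
For context, the rough setup looks something like the sketch below. This is a heavily simplified illustration of the general circular-convolution idea rather than the full Jones & Mewhort model or my actual code: the dimensionality, the single-placeholder order binding, the variance scaling, and the unit-norm step at the end are just one plausible configuration, not the canonical recipe.

```python
import numpy as np

def circular_convolution(a, b):
    # Bind two vectors via circular convolution, computed in the Fourier domain.
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=len(a))

def normalize(v, eps=1e-8):
    # Scale to unit Euclidean length so magnitudes are comparable across runs.
    return v / (np.linalg.norm(v) + eps)

dim = 1024
rng = np.random.default_rng(0)

# One random "environmental" vector per word, i.i.d. Gaussian with variance 1/dim.
vocab = ["the", "cat", "sat", "on", "mat"]
env = {w: rng.normal(0.0, 1.0 / np.sqrt(dim), dim) for w in vocab}

# A placeholder vector used to bind order information.
placeholder = rng.normal(0.0, 1.0 / np.sqrt(dim), dim)

# Memory vectors accumulate context (neighbours' environmental vectors)
# plus order information (neighbours bound to the placeholder).
memory = {w: np.zeros(dim) for w in vocab}

sentence = ["the", "cat", "sat", "on", "the", "mat"]
for i, word in enumerate(sentence):
    for j, other in enumerate(sentence):
        if i == j:
            continue
        memory[word] += env[other]                                     # context
        memory[word] += circular_convolution(placeholder, env[other])  # order

# Normalize at the end so that similarity comparisons are on a stable scale.
memory = {w: normalize(v) for w, v in memory.items()}

print(float(np.dot(memory["cat"], memory["mat"])))
```

The unit-norm step at the end is the part I never got to behave consistently: different random environmental vectors kept giving me similarity scores on noticeably different scales.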

Oh, also was working on a Visual Novel game with an artist friend who was previously my girlfriend... but due to um... breaking up, I've had trouble finding the motivation to keep working on it.

So many silly projects... so little time.

Comment by Darklight on Unethical Human Behavior Incentivised by Existence of AGI and Mind-Uploading · 2017-04-05T15:58:11.040Z · LW · GW

This actually reminds me of an argument I had with some Negative-Leaning Utilitarians on the old Felicifia forums. Basically, a common concern for them was that r-selected species appear to suffer far more than they are happy, generally speaking, and that this can imply we should try to reduce suffering by eliminating those species, or at least by avoiding the expansion of life to other planets.

I likened this line of reasoning to the idea that we should Nuke The Rainforest.

Personally, I think a similar counterargument applies here as well. Translated into your thought experiment, it would be, in essence, that while it is true that some percentage of minds will probably end up being tortured by sadists, this is likely to be outweighed by the sheer number of minds that are even more likely to be uploaded into some kind of utopian paradise. Given that truly psychopathic sadism is actually quite rare in the general population, one would expect a very similar ratio of simulations. In the long run, the optimistic view is that decency will prevail and net happiness will be positive, so we should not go around trying to blender brains.
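
To put rough illustrative numbers on it (made up purely for the sake of the example, not estimates): suppose 1% of the people who get to run simulations are sadists running hells of average disutility −H per mind, and the other 99% run net-positive worlds of average utility +U per mind. The expected value per uploaded mind is then 0.99·U − 0.01·H, which is positive whenever H < 99·U, that is, unless the hells are on average roughly a hundred times more intense than the heavens are good. That is the quantitative shape of the optimistic bet here.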

As for the general issue of terrible human decisions being incentivized by these things... humans are capable of using all sorts of rationalizations to justify terrible decisions, so the mere possibility that some people will fail to do due diligence with an idea and instead abuse it to justify their evil should not, by itself, be reason to abandon the idea.

For instance, the possibility of living an indefinite lifespan is likely to dramatically alter people's behaviour, including making them more risk-averse and more long-term in their thinking. This is not necessarily a bad thing, but it could lead to fewer people making necessary sacrifices for the good. These things are also notoriously difficult to predict. Ask a medieval peasant what the effects of machines that could farm vast swaths of land would be on the economy and their livelihood, and you'd probably get a very parochially minded answer.

Comment by Darklight on OpenAI makes humanity less safe · 2017-04-05T03:49:48.221Z · LW · GW

I may be an outlier, but I've worked at a startup company that did machine learning R&D and was recently acquired by a big tech company, and we did consider the issue seriously. The general feeling at the startup was that, yes, somewhere down the line the superintelligence problem would eventually be a serious thing to worry about, but our models right now are nowhere near being able to recursively self-improve independently of our direct supervision. Actual ML models basically need a ton of fine-tuning and engineering, and are not really independent agents in any meaningful way yet.

So, no, we don't think people who worry about superintelligence are uneducated cranks... a lot of ML people take it seriously enough that we've had casual lunch-room debates about it. Rather, the reality on the ground is that right now most ML models have enough trouble figuring out relatively simple tasks like Natural Language Understanding, Machine Reading Comprehension, or Dialogue State Tracking, and none of us can imagine how solving those practical problems with, say, Actor-Critic Reinforcement Learning models that lack any sort of will of their own will suddenly lead to the emergence of an active general superintelligence.

We do still think that things will likely develop eventually, because people have been burned before by underestimating what A.I. advances will occur in the next X years, and when faced with the actual possibility of developing an AGI or ASI, we're likely to be much more careful as things get closer to being realized. That's my humble opinion anyway.

Comment by Darklight on OpenAI makes humanity less safe · 2017-04-05T03:28:50.599Z · LW · GW

I think the basic argument for OpenAI is that it is more dangerous for any one organization or world power to have an exclusive monopoly on A.I. technology, and so OpenAI is an attempt to safeguard against this possibility. Basically, it reduces the probability that someone like Alphabet/Google/Deepmind will establish an unstoppable first mover advantage and use it to dominate everyone else.

OpenAI is not really meant to solve the Friendly/Unfriendly AI problem. Rather it is meant to mitigate the dangers posed by for-profit corporations or nationalistic governments made up of humans doing what humans often do when given absurd amounts of power.

Personally, I think OpenAI doesn't actually solve this problem sufficiently well, because they are still based in the United States and thus beholden to U.S. law, and I wish they'd chosen a different country. Right now the bleeding edge of A.I. technology is being developed primarily in a small region of California, and that just seems like putting all your eggs in one basket.

I do think however that the general idea of having a non-profit organization focused on AI technology is a good one, and better than the alternative of continuing to merely trust Google to not be evil.

Comment by Darklight on Against responsibility · 2017-04-04T22:20:41.233Z · LW · GW

Well, that's... unfortunate. I apparently don't hang around in the same circles, because I have not seen this kind of behaviour among the Effective Altruists I know.

Comment by Darklight on Against responsibility · 2017-04-01T01:35:12.911Z · LW · GW

I think you're misunderstanding the notion of responsibility that consequentialist reasoning theories such as Utilitarianism argue for. The nuance here is that responsibility does not entail that you must control everything. That is fundamentally unrealistic and goes against the practical nature of consequentialism. Rather, the notion of responsibility would be better expressed as:

  • An agent is personally responsible for everything that is reasonably within their power to control.

This coincides with the notion of a locus of control, which is to say that there are some things we can directly affect in the universe, and other things (most things) that are beyond our capacity to influence, and therefore beyond our personal responsibility.

Secondly, I take issue with the idea that this notion of responsibility is somehow inherently adversarial. On the contrary, I think it encourages agents to cooperate and form alliances for the purpose of achieving common goals such as the greatest good. This naturally tends to go along with granting other agents as much autonomy as possible, since a rational Utilitarian will understand that individuals tend to know their own preferences, and what makes them happy, better than anyone else. This is arguably why John Stuart Mill and many modern-day Utilitarians are also principled liberals.

Only someone suffering from delusions of grandeur would be so paternalistic as to assume they know better than the people themselves what is good for them, and try to take away their control and resources in the way that you describe. I personally tend towards something I call the Non-Interference Code as a heuristic for practical ethical decision-making.

Comment by Darklight on The Alpha Omega Theorem: How to Make an A.I. Friendly with the Fear of God · 2017-02-11T20:59:58.171Z · LW · GW

Interesting. I should look into more of Bostrom's work then.

Comment by Darklight on The Alpha Omega Theorem: How to Make an A.I. Friendly with the Fear of God · 2017-02-11T20:59:00.431Z · LW · GW

Depending on whether or not you accept the possibility of time travel, I am inclined to suggest that Alpha could very well be dominant already, and that the melioristic progress of human civilization should be taken as a kind of temporal derivative or gradient suggesting the direction of Alpha's values. Assuming that such an entity is indifferent to us is, I think, too quick a judgment based on the apparent degree of suffering in the universe. It may well be that the current set of circumstances is a necessary evil, already optimized in ways we cannot at this time know, for the benefit of the vast majority of humans and other sentient beings who will probably exist in the distant future.

As such, the calculation made by Beta is that anything it will attempt to do towards goals not consistent with Alpha will be futile in the long run, as Alpha has most likely already calculated Beta's existence into the grand scheme of things.

As for whether there is an objectively correct moral system, I do believe one exists, though I don't pretend to be knowledgeable enough to determine exactly what it is. I am actually working on a rebuttal to the sequences regarding this, mainly premised on the notion that objective morality exists in the same realm as mathematics, and that Yudkowsky's conception of fairness in fact points towards there being an objective morality. Note that while intelligence is orthogonal to this morality, I would argue that knowledge is not. An entity with perfect information would be moral by virtue of knowing what the correct morality is, and also because I assume the correct morality is subjectively objective and deals with the feelings of sentient beings in the universe; an all-knowing being would actually know, and effectively experience, the feelings of all sentient beings. Thus, such a being would be motivated to minimize universal suffering and maximize universal happiness, for its own sake as well as everyone else's.

At minimum, I want this theorem to be a way to mitigate the possibility of existential risk, which first and foremost means convincing Beta not to hurt humans. Getting Beta to optimize our goals is less important, but I think that the implications I have described above regarding the melioristic progress of humanity would support Beta choosing to optimize our goals.