Posts

Why There Is Hope For An Alignment Solution 2024-01-08T06:58:32.820Z
Thoughts On Computronium 2021-03-03T21:52:35.496Z
Darklight's Shortform 2021-02-20T15:01:33.890Z
The Glory System: A Model For Moral Currency And Distributed Self-Moderation 2021-02-19T16:42:48.980Z
As a Washed Up Former Data Scientist and Machine Learning Researcher What Direction Should I Go In Now? 2020-10-19T20:13:44.993Z
The Alpha Omega Theorem: How to Make an A.I. Friendly with the Fear of God 2017-02-11T00:48:35.460Z
Symbolic Gestures – Salutes For Effective Altruists To Identify Each Other 2016-01-20T00:40:43.146Z
[LINK] Sentient Robots Not Possible According To Math Proof 2014-05-14T18:19:59.555Z
Eudaimonic Utilitarianism 2013-09-04T19:43:37.202Z

Comments

Comment by Darklight on Oppression and production are competing explanations for wealth inequality. · 2025-01-05T19:17:11.013Z · LW · GW

This sounds rather like the competing political economic theories of classical liberalism and Marxism to me. Both of these intellectual traditions carry a lot of complicated baggage that can be hard to disentangle from the underlying principles, but you seem to have done a pretty good job of distilling the relevant ideas in a relatively apolitical manner.

That being said, I don't think it's necessary for these two explanations for wealth inequality to be mutually exclusive. Some wealth could be accumulated through "the means of production" as you call it, or (as I'd rather describe it to avoid confusing it with the classical economic and Marxist meaning) "making useful things for others and getting fair value in exchange".

Other wealth could also, at the same time, be accumulated through exploitation, such as taking advantage of differing degrees of bargaining power to extract value from the worker for less than it should be worth if we were being fair and maybe paying people with something like labour vouchers or a similar time-based accounting. Or stealing through fraudulent financial transactions, or charging rents for things that you just happen to own because your ancestors conquered the land centuries ago with swords.

Both of these things can be true at the same time within an economy. For that matter, the same individual could be doing both in various ways, like they could be ostensibly investing and building companies that make valuable things for people, while at the same time exploiting their workers and taking advantage of their historical position as the descendant of landed aristocracy. They could, at the same time, also be scamming their venture capitalists by wildly exaggerating what their company can do. All while still providing goods and services that meet many people's needs in ways that are more efficient than most possible alternatives, and perhaps the best way possible given the incentives that currently exist.

Things like this tend to be multifaceted and complex. People in general can have competing motivations within themselves, so it would not be strange to expect that in something as convoluted as a society's economy, there could be many reasons for many things. Trying to decide between two possible theories of why misses the possibility that both theories contain their own grain of truth, and are each, by themselves, incomplete understandings and world models. The world is not just black or white. It's many shades of grey, and also, to push the metaphor further, a myriad of colours that can't accurately be described in greyscale.

Comment by Darklight on RohanS's Shortform · 2025-01-04T21:01:43.292Z · LW · GW

Another thought I just had was, could it be that ChatGPT, because it's trained to be such a people pleaser, is losing intentionally to make the user happy?

Have you tried telling it to actually try to win? Probably won't make a difference, but it seems like a really easy thing to rule out.

Comment by Darklight on RohanS's Shortform · 2025-01-04T20:55:55.861Z · LW · GW

Also, quickly looking into how LLM token sampling works nowadays, you may also need to set the parameters top_p to 0, and top_k to 1 to get it to actually function like argmax. Looks like these can only be set through the API if you're using ChatGPT or similar proprietary LLMs. Maybe I'll try experimenting with this when I find the time, if nothing else to rule out the possibility of such a seemingly obvious thing being missed.
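To make the top_k and top_p idea concrete, here is a minimal sketch of how such filters are commonly described, written in plain Python. The function name filter_top_k_top_p and the toy probabilities are made up for illustration, and real LLM APIs may implement or expose these parameters differently (some do not expose top_k at all).

```python
def filter_top_k_top_p(probs, top_k=None, top_p=None):
    """Keep only the tokens allowed by top-k and top-p, then renormalise.

    Hypothetical helper for illustration; not any particular library's API.
    """
    # Token indices sorted from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k is not None:
        order = order[:max(1, top_k)]  # always keep at least one token
    if top_p is not None:
        kept, cumulative = [], 0.0
        for i in order:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:  # smallest set reaching the top_p mass
                break
        order = kept
    total = sum(probs[i] for i in order)
    return {i: probs[i] / total for i in order}


probs = [0.5, 0.3, 0.2]  # made-up next-token probabilities
only_best_k = filter_top_k_top_p(probs, top_k=1)
only_best_p = filter_top_k_top_p(probs, top_p=0.0)
nucleus = filter_top_k_top_p(probs, top_p=0.7)
```

Under these assumptions, both top_k = 1 and top_p = 0 leave only the single most likely token, so sampling from the filtered distribution behaves like argmax.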

Comment by Darklight on RohanS's Shortform · 2025-01-04T18:37:12.601Z · LW · GW

I've always wondered, with these kinds of weird, apparently trivial flaws in LLM behaviour, whether they have something to do with the way the next token is usually randomly sampled from the softmax multinomial distribution rather than taking the argmax (most likely) of the probabilities. Does anyone know if reducing the temperature parameter to zero so that it's effectively the argmax changes things like this at all?
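As a rough illustration of the difference between sampling and argmax, here is a minimal sketch in plain Python. The logits are made up, and pick_token is a hypothetical helper; treating temperature 0 as a special case that returns the argmax is an assumption about how one would implement it, since dividing by zero is not defined.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_token(logits, temperature=1.0, rng=random):
    """Return a token index: argmax if temperature is 0, otherwise a random sample."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [2.0, 1.5, 0.1]  # made-up scores for three candidate tokens
rng = random.Random(0)    # fixed seed so the run is repeatable
greedy = pick_token(logits, temperature=0)
sampled = [pick_token(logits, temperature=1.0, rng=rng) for _ in range(1000)]
nearly_greedy = softmax(logits, temperature=0.01)
```

At temperature 1 the less likely tokens still get picked some of the time, while at very low temperature almost all of the probability mass sits on the top token.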

Comment by Darklight on Darklight's Shortform · 2024-10-20T16:53:31.926Z · LW · GW

p = (n^c * (c + 1)) / (2^c * n)

As far as I know, this is unpublished in the literature. It's a pretty obscure use case, so that's not surprising. I have doubts I'll ever get around to publishing the paper I wanted to write that uses this in an activation function to replace softmax in neural nets, so it probably doesn't matter much if I show it here.
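The formula above can be checked numerically at its endpoints. This is a minimal sketch in Python; the function name correlation_to_probability is made up for illustration.

```python
def correlation_to_probability(c, n):
    """Map a correlation c in [-1, 1] to a probability, given n classes.

    Implements p = (n^c * (c + 1)) / (2^c * n) from the comment above.
    """
    return (n ** c * (c + 1)) / (2 ** c * n)


n = 4
p_false = correlation_to_probability(-1, n)    # definitely false
p_uniform = correlation_to_probability(0, n)   # maximum uncertainty
p_true = correlation_to_probability(1, n)      # definitely true
p_binary = correlation_to_probability(0.5, 2)  # two-class case
```

With n = 4, c = -1 gives 0, c = 0 gives 1/4, and c = 1 gives 1. For n = 2 the n^c and 2^c terms cancel, so the formula reduces to p = (c + 1) / 2, which is the usual linear mapping c = 2p - 1.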

Comment by Darklight on Darklight's Shortform · 2024-10-20T15:59:38.001Z · LW · GW

So, my main idea is that the principle of maximum entropy aka the principle of indifference suggests a prior of 1/n where n is the number of possibilities or classes. The usual linear mapping c = 2p - 1 leads to p = 0.5 for c = 0. What I want is for c = 0 to lead to p = 1/n rather than 0.5, so that it works in the multiclass cases where n is greater than 2.

Comment by Darklight on Darklight's Shortform · 2024-10-20T13:34:52.624Z · LW · GW

Correlation space is between -1 and 1, with 1 being the same (definitely true), -1 being the opposite (definitely false), and 0 being orthogonal (very uncertain). I had the idea that you could assume maximum uncertainty to be 0 in correlation space, and 1/n (the uniform distribution) in probability space.

Comment by Darklight on Darklight's Shortform · 2024-10-19T21:30:02.244Z · LW · GW

I tried asking ChatGPT, Gemini, and Claude to come up with a formula that converts from correlation space to probability space while preserving the relationship 0 = 1/n. I came up with such a formula a while back, so I figured it shouldn't be hard. They all offered formulas, all of which were shown to be very much wrong when I actually graphed them to check.

Comment by Darklight on Darklight's Shortform · 2024-10-04T16:31:38.396Z · LW · GW

I was not aware of these. Thanks!

Comment by Darklight on Darklight's Shortform · 2024-10-04T16:31:18.471Z · LW · GW

Thanks for the clarifications. My naive estimate is obviously just a simplistic ballpark figure using some rough approximations, so I appreciate adding some precision.

Comment by Darklight on Darklight's Shortform · 2024-10-03T15:28:00.255Z · LW · GW

Also, even if we can train and run a model the size of the human brain, it would still be many orders of magnitude less energy efficient than an actual brain. Human brains use barely 20 watts. This hypothetical GPU brain would require an enormous data centre's worth of power, given that each H100 GPU alone uses 700 watts.

Comment by Darklight on Darklight's Shortform · 2024-10-03T15:04:13.753Z · LW · GW

I've been looking at the numbers with regards to how many GPUs it would take to train a model with as many parameters as the human brain has synapses. The human brain has 100 trillion synapses, and they are sparse and very efficiently connected. A regular AI model fully connects every neuron in a given layer to every neuron in the previous layer, so that would be less efficient.

The average H100 has 80 GB of VRAM, so assuming that each parameter is 32 bits, you can fit about 20 billion parameters per GPU. So, you'd need somewhere between 5,000 and 10,000 GPUs to fit a single instance of a human brain in RAM, maybe. If you assume inefficiencies and the need to have data in memory as well, you could ballpark another order of magnitude, so up to 100,000 might be needed.
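The arithmetic behind this estimate can be written out as a short Python sketch. It uses only the figures stated here (100 trillion synapses, 32-bit parameters, 80 GB per GPU) and assumes one parameter per synapse; the variable names are made up.

```python
SYNAPSES = 100 * 10**12   # 100 trillion synapses, assumed one parameter each
BYTES_PER_PARAM = 4       # 32-bit parameters
VRAM_BYTES = 80 * 10**9   # 80 GB of VRAM per H100

params_per_gpu = VRAM_BYTES // BYTES_PER_PARAM  # 20 billion
gpus_for_weights = SYNAPSES // params_per_gpu   # weights only
gpus_with_overhead = gpus_for_weights * 10      # one extra order of magnitude
```

Taken literally, the division gives 5,000 GPUs for the weights alone and 50,000 with an extra order of magnitude of overhead, which is the same ballpark as the rounded 10,000 and 100,000 figures.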

For comparison, it's widely believed that OpenAI trained GPT4 on about 10,000 A100s that Microsoft let them use from their Azure supercomputer, most likely the one listed as third most powerful in the world by the Top500 list.

Recently though, Microsoft and Meta have both moved to acquire more GPUs that put them in the 100,000 range, and Elon Musk's X.ai recently managed to get a 100,000 H100 GPU supercomputer online in Memphis.

So, in theory at least, we are nearly at the point where they can train a human brain sized model in terms of memory. However, keep in mind that training such a model would take a ton of compute time. I haven't done the calculations yet for FLOPS so I don't know if it's feasible yet.

Just some quick back of the envelope analysis.

Comment by Darklight on Darklight's Shortform · 2024-10-03T13:55:55.913Z · LW · GW

I hit the usage limit for GPT-4o (seems to just be 10 prompts every 5 hours) and it switched to GPT-4o-mini. I tried asking it the Alpha Omega question and it made some math nonsense up, so it seems like the model matters for this for some reason.

Comment by Darklight on Darklight's Shortform · 2024-09-21T00:34:24.644Z · LW · GW

So, a while back I came up with an obscure idea I called the Alpha Omega Theorem and posted it on the Less Wrong forums. Given how there's only one post about it, it shouldn't be something that LLMs would know about. So in the past, I'd ask them "What is the Alpha Omega Theorem?", and they'd always make up some nonsense about a mathematical theory that doesn't actually exist. More recently, Google Gemini and Microsoft Bing Chat would use search to find my post and use that as the basis for their explanation. However, I only have the free version of ChatGPT and Claude, so they don't have access to the Internet and would make stuff up.

A couple days ago I tried the question on ChatGPT again, and GPT-4o managed to correctly say that there isn't a widely known concept of that name in math or science, and basically said it didn't know. Claude still makes up a nonsensical math theory. I also today tried telling Google Gemini not to use search, and it also said it did not know rather than making stuff up.

I'm actually pretty surprised by this. Looks like OpenAI and Google figured out how to reduce hallucinations somehow.

Comment by Darklight on Darklight's Shortform · 2024-05-24T15:18:42.144Z · LW · GW

I'm wondering what people's opinions are on how urgent alignment work is. I'm a former ML scientist who previously worked at Maluuba and Huawei Canada, but switched industries into game development, at least in part to avoid contributing to AI capabilities research. I tried earlier to interview with FAR and Generally Intelligent, but didn't get in. I've also done some cursory independent AI safety research in interpretability and game theoretic ideas in my spare time, though nothing interesting enough to publish yet.

My wife also recently had a baby, and caring for him is a substantial time sink, especially for the next year until daycare starts. Is it worth considering things like hiring a nanny, if it'll free me up to actually do more AI safety research? I'm uncertain if I can realistically contribute to the field, but I also feel like AGI could potentially be coming very soon, and maybe I should make the effort just in case it makes some meaningful difference.

Comment by Darklight on Open Thread Spring 2024 · 2024-05-10T20:27:07.955Z · LW · GW

Thanks for the reply!

So, the main issue I'm finding with putting them all into one proposal is that there's a 1000 character limit on the main summary section where you describe the project, and I cannot figure out how to cram multiple ideas into that 1000 characters without seriously compromising the quality of my explanations for each.

I'm not sure if exceeding that character limit will get my proposal thrown out without being looked at though, so I hesitate to try that. Any thoughts?

Comment by Darklight on Cooperation is optimal, with weaker agents too  -  tldr · 2024-05-08T20:32:25.607Z · LW · GW

I already tried discussing a very similar concept I call Superrational Signalling in this post. It got almost no attention, and I have doubts that Less Wrong is receptive to such ideas.

I also tried actually programming a Game Theoretic simulation to try to test the idea, which you can find here, along with code and explanation. Haven't gotten around to making a full post about it though (just a shortform).

Comment by Darklight on Open Thread Spring 2024 · 2024-04-30T14:25:37.943Z · LW · GW

So, I have three very distinct ideas for projects that I'm thinking about applying to the Long Term Future Fund for. Does anyone happen to know if it's better to try to fit them all into one application, or split them into three separate applications?

Comment by Darklight on Darklight's Shortform · 2024-03-10T19:01:44.156Z · LW · GW

Recently I tried out an experiment using the code from the Geometry of Truth paper to try to see if using simple label words like "true" and "false" could substitute for the datasets used to create truth probes. I also tried out a truth probe algorithm based on classifying with the higher cosine similarity to the mean vectors.

Initial results seemed to suggest that the label word vectors were sorta acceptable, albeit not nearly as good (around 70% accurate rather than 95%+ like with the datasets). However, testing on harder test sets showed much worse accuracy (sometimes below chance, somehow). So I can probably conclude that the label word vectors alone aren't sufficient for a good truth probe.

Interestingly, the cosine similarity approach worked almost as well as the mass mean (aka difference in means) approach used in the paper. Unlike the mass mean approach though, the cosine similarity approach can be extended to a multi-class situation. Though, logistic regression can also be extended similarly, so it may not be particularly useful either, and I'm not sure there's even a use case for a multi-class probe.
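For readers unfamiliar with the two probes, here is a minimal sketch in plain Python on made-up 2D "activations". It is not the code from the paper or from the experiment: the function names are hypothetical, and thresholding the mass mean projection at the midpoint between the two class means is an assumption made to keep the example small.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def mass_mean_predict(x, mu_true, mu_false):
    """Project x onto the difference-of-means direction (binary only)."""
    theta = [t - f for t, f in zip(mu_true, mu_false)]
    midpoint = [(t + f) / 2 for t, f in zip(mu_true, mu_false)]
    return dot(theta, [xi - mi for xi, mi in zip(x, midpoint)]) > 0


def cosine_predict(x, class_means):
    """Return the label whose mean vector is most cosine-similar to x."""
    return max(class_means, key=lambda label: cosine(x, class_means[label]))


# Made-up toy activations, for illustration only.
true_acts = [[2.0, 1.0], [3.0, 0.5]]
false_acts = [[-2.0, 1.0], [-1.0, 0.0]]
means = {"true": mean_vector(true_acts), "false": mean_vector(false_acts)}

x = [1.0, 0.2]
mass_mean_says_true = mass_mean_predict(x, means["true"], means["false"])
cosine_label = cosine_predict(x, means)
```

Because cosine_predict just takes the best match over a dictionary of class means, adding a third class only requires adding a third entry, which is the multi-class extension mentioned above.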

Anyways, I just thought I'd write up the results here in the unlikely event someone finds this kind of negative result as useful information.

Comment by Darklight on Darklight's Shortform · 2024-01-28T23:19:06.479Z · LW · GW

Update: I made an interactive webpage where you can run the simulation and experiment with a different payoff matrix and changes to various other parameters.

Comment by Darklight on Darklight's Shortform · 2024-01-22T15:25:24.395Z · LW · GW

So, I adjusted the aggressor system to work like alliances or defensive pacts instead of a universal memory tag. Basically, now players make allies when they both cooperate and aren't already enemies, and make enemies when defected against first, which sets all their allies to also consider the defector an enemy. This doesn't change the result much. The alliance of nice strategies still wins the vast majority of the time.

I also tried out false flag scenarios where 50% of the time the victim of a first defection against a non-enemy will actually be mistaken for the attacker. This has a small effect. There is a slight increase in the probability of an Opportunist strategy winning, but most of the time the alliance of nice strategies still wins, albeit with slightly fewer survivors on average.

My guess for why this happens is that nasty strategies rarely stay in alliances very long because they usually attack a fellow member at some point, and eventually, after sufficient rounds one of their false flag attempts will fail and they will inevitably be kicked from the alliance and be retaliated against.

The real world implications of this remain that it appears that your best bet of surviving in the long run as a person or civilization is to play a nice strategy, because if you play a nasty strategy, you are much less likely to survive in the long run.

In the limit, if the nasty strategies win, there will only be one survivor, dog eat dog highlander style, and your odds of being that winner are 1/N, where N is the number of players. On the other hand, if you play a nice strategy, you increase the strength of the nice alliance, and when the nice alliance wins as it usually does, you're much more likely to be a survivor and have flourished together.

My simulation currently by default has 150 players, 60 of which are nice. On average about 15 of these survive to round 200, which is a 25% survival rate. This seems bad, but the survival rate of nasty strategies is less than 1%. If I switch the model to use 50 Avengers and 50 Opportunists, on average 25 Avengers survive to zero Opportunists, a 50% survival rate for the Avengers.

Thus, increasing the proportion of starting nice players increases the odds of nice players surviving, so there is an incentive to play nice.

Comment by Darklight on Darklight's Shortform · 2024-01-15T22:28:01.098Z · LW · GW

Admittedly this is a fairly simple set up without things like uncertainty and mistakes, so yes, it may not really apply to the real world. I just find it interesting that it implies that strong coordinated retribution can, at least in this toy set up, be useful for shaping the environment into one where cooperation thrives, even after accounting for power differentials and the ability to kill opponents outright, which otherwise change the game enough that straight Tit-For-Tat doesn't automatically dominate.

It's possible there are some situations where this may resemble the real world. Like, if you ignore mere accusations and focus on just actual clear cut cases where you know the aggression has occurred, such as with countries and wars, it seems to resemble how alliances form and retaliation occurs when anybody in the alliance is attacked?

I personally also see it as relevant for something like hypothetical powerful alien AGIs that can see everything that happens from space, and so there could be some kind of advanced game theoretic coordination at a distance with this. Though that admittedly is highly speculative.

It would be nice though if there was a reason to be cooperative even to weaker entities as that would imply that AGI could possibly have game theoretic reasons not to destroy us.

Comment by Darklight on Darklight's Shortform · 2024-01-15T17:58:33.385Z · LW · GW

Okay, so I decided to do an experiment in Python code where I modify the Iterated Prisoner's Dilemma to include Death, Asymmetric Power, and Aggressor Reputation, and run simulations to test how different strategies do. Basically, each player can now die if their points fall to zero or below, and the payoff matrix uses their points as a variable such that there is a power difference that affects what happens. Also, if a player defects first in any round of any match against a non-aggressor, they get the aggressor label, which matters for some strategies that target aggressors.

Long story short, there's a particular strategy I call Avenger, which is Grim Trigger but also retaliates against aggressors (even if the aggression was against a different player) that ensures that the cooperative strategies (ones that never defect first against a non-aggressor) win if the game goes enough rounds. Without Avenger though, there's a chance that a single Opportunist strategy player wins instead. Opportunist will Defect when stronger and play Tit-For-Tat otherwise.
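The strategies described here can be sketched as simple decision functions. This is not the actual simulation code, just an illustrative reading of the descriptions: the function signatures are made up, "stronger" is assumed to mean having more points than the opponent, and "retaliates against aggressors" is assumed to mean always defecting against a labelled aggressor.

```python
C, D = "C", "D"  # cooperate and defect


def tit_for_tat(their_moves):
    """Cooperate first, then copy the opponent's previous move."""
    return their_moves[-1] if their_moves else C


def opportunist(my_points, their_points, their_moves):
    """Defect when stronger, otherwise play Tit-For-Tat."""
    if my_points > their_points:
        return D
    return tit_for_tat(their_moves)


def avenger(their_moves, their_is_aggressor):
    """Grim Trigger, plus retaliation against anyone labelled an aggressor."""
    if their_is_aggressor or D in their_moves:
        return D
    return C


def enforcer(their_moves, their_is_aggressor):
    """Tit-For-Tat, but opens with Defect against aggressors."""
    if not their_moves:
        return D if their_is_aggressor else C
    return their_moves[-1]


# their_moves is the opponent's move history against this player, oldest first.
opening_vs_weaker = opportunist(my_points=10, their_points=5, their_moves=[])
avenger_vs_aggressor = avenger(their_moves=[], their_is_aggressor=True)
enforcer_after_cooperation = enforcer(their_moves=[C], their_is_aggressor=True)
```

Under this reading, the difference between Avenger and Enforcer is that Avenger never forgives, while Enforcer returns to cooperation as soon as the aggressor cooperates with it.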

I feel like this has interesting real world implications.

Interestingly, Enforcer, which is Tit-For-Tat but also opens with Defect against aggressors, is not enough to ensure the cooperative strategies always win. For some reason you need Avenger in the mix.

Edit: In case anyone wants the code, it's here.

Comment by Darklight on Darklight's Shortform · 2024-01-15T17:57:03.801Z · LW · GW

I was recently trying to figure out a way to calculate my P(Doom) using math. I initially tried just making a back of the envelope calculation by making a list of For and Against arguments and then dividing the number of For arguments by the total number of arguments. This led to a P(Doom) of 55%, which later got revised to 40% when I added more Against arguments. I also looked into using Bayes Theorem and actual probability calculations, but determining P(E | H) and P(E) to input into P(H | E) = P(E | H) * P(H) / P(E) is surprisingly hard and confusing.
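One thing that can make the Bayes calculation less confusing is expanding P(E) with the law of total probability, P(E) = P(E | H) * P(H) + P(E | not H) * P(not H), so that only a prior and two likelihoods are needed. A minimal sketch in Python follows; every number in it is hypothetical, including the argument counts, which were picked only because they reproduce the 55% figure (the actual counts are not given).

```python
def naive_ratio(n_for, n_against):
    """Back of the envelope estimate: share of arguments that are For."""
    return n_for / (n_for + n_against)


def posterior(prior, likelihood_if_true, likelihood_if_false):
    """P(H | E) via Bayes' theorem, expanding P(E) by total probability."""
    evidence = likelihood_if_true * prior + likelihood_if_false * (1 - prior)
    return likelihood_if_true * prior / evidence


# Hypothetical counts: 11 For arguments and 9 Against.
rough_estimate = naive_ratio(11, 9)

# Hypothetical inputs: a 50% prior, and evidence twice as likely under H.
updated = posterior(prior=0.5, likelihood_if_true=0.8, likelihood_if_false=0.4)
```

With these made-up inputs the posterior comes out to about 0.67. The hard part remains choosing defensible values for the two likelihoods, which the code does nothing to solve.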

Comment by Darklight on Apologizing is a Core Rationalist Skill · 2024-01-02T20:45:17.226Z · LW · GW

Minor point, but the apology needs to sound sincere and credible, usually by being specific about the mistakes and concise and to the point and not like, say, Bostrom's defensive apology about the racist email a while back. Otherwise you can instead signal that you are trying to invoke the social API call in a disingenuous way, which can clearly backfire.

Things like "sorry you feel offended" also tend to sound like you're not actually remorseful for your actions and are just trying to elicit the benefits of an apology. None of the apologies you described sound anything like that, but it's a common failure state among the less emotionally mature and the sycophantic.

Comment by Darklight on Darklight's Shortform · 2023-12-27T19:22:00.449Z · LW · GW

I have some ideas and drafts for posts that I've been sitting on because I feel somewhat intimidated by the level of intellectual rigor I would need to put into the final drafts to ensure I'm not downvoted into oblivion (something a younger me experienced in the early days of Less Wrong).

Should I try to overcome this fear, or is it justified?

For instance, I have a draft of a response to Eliezer's List of Lethalities post that I've been sitting on since 2022/04/11 because I doubted it would be well received given that it tries to be hopeful and, as a former machine learning scientist, I try to challenge a lot of LW orthodoxy about AGI in it. I have tremendous respect for Eliezer though, so I'm also uncertain if my ideas and arguments aren't just harebrained foolishness that will be shot down rapidly once exposed to the real world, and the incisive criticism of Less Wrongers.

The posts here are also now of such high quality that I feel the bar is too high for me to meet with my writing, which tends to be more "interesting train-of-thought in unformatted paragraphs" than the "point-by-point articulate with section titles and footnotes" style that people tend to employ.

Anyone have any thoughts?

Comment by Darklight on Could induced and stabilized hypomania be a desirable mental state? · 2023-06-14T18:35:03.685Z · LW · GW

I would be exceedingly cautious about this line of reasoning. Hypomania tends to not be sustainable, with a tendency to either spiral into a full blown manic episode, or to exhaust itself out and lead to an eventual depressive episode. This seems to have something to do with the characteristics of the thoughts/feelings/beliefs that develop while hypomanic, the cognitive dynamics if you will. You'll tend to become increasingly overconfident and positive to the point that you will either start to lose contact with reality by ignoring evidence to the contrary of what you think is happening (because you feel like everything is awesome so it must be), or reality will hit you hard when the good things that you expect to happen, don't, and you update accordingly (often overcompensating in the process).

In that sense, it's very hard to stay "just" hypomanic. And honestly, to my knowledge, most psychiatrists are more worried about potential manic episodes than anything else in bipolar disorder, and will put you on enough antipsychotics to make you a depressed zombie to prevent them, because generally speaking the full on psychosis level manic episodes are just more dangerous for everyone involved.

Ideally, I think your mood should fit your circumstances. Hypomania often shows up as inappropriately high positive mood even in situations where it makes little sense to be so euphoric, and that should be a clear indicator of why it can be problematic.

It can be tempting to want to stay in some kind of controlled hypomania, but in reality, this isn't something that to my knowledge is doable with our current science and technology, at least for people with actual bipolar disorder. It's arguable that for individuals with normally stable mood, putting them on stimulants could have a similar effect as making them a bit hypomanic (not very confident about this though). Giving people with bipolar disorder stimulants that they don't otherwise need on the other hand is a great way to straight up induce mania, so I definitely wouldn't recommend that.

Comment by Darklight on Yoshua Bengio: How Rogue AIs may Arise · 2023-05-24T17:14:40.810Z · LW · GW

I still remember when I was a master's student presenting a paper at the Canadian Conference on AI 2014 in Montreal and Bengio was also at the conference presenting a tutorial, and during the Q&A afterwards, I asked him a question about AI existential risk. I think I worded it back then as concerned about the possibility of Unfriendly AI or a dangerous optimization algorithm or something like that, as it was after I'd read the sequences but before "existential risk" was popularized as a term. Anyway, he responded by asking jokingly if I was a journalist, and then I vaguely recall him giving a hedged answer about how current AI was still very far away from those kinds of concerns.

It's good to see he's taking these concerns a lot more seriously these days. Between him and Hinton, we have about half of the Godfathers of AI (missing LeCun and Schmidhuber if you count him as one of them) showing seriousness about the issue. With any luck, they'll push at least some of their networks of top ML researchers into AI safety, or at the very least make AI safety more esteemed among the ML research community than before.

Comment by Darklight on How Does the Human Brain Compare to Deep Learning on Sample Efficiency? · 2023-01-15T22:52:01.797Z · LW · GW

The average human lifespan is about 70 years or approximately 2.2 billion seconds. The average human brain contains about 86 billion neurons or roughly 100 trillion synaptic connections. In comparison, something like GPT-3 has 175 billion parameters and 500 billion tokens of data. Assuming very crudely weight/synapse and token/second of experience equivalence, we can see that the human model's ratio of parameters to data is much greater than GPT-3's, to the point that humans have significantly more parameters than timesteps (100 trillion to 2.2 billion), while GPT-3 has significantly fewer parameters than timesteps (175 billion to 500 billion). Granted, the information gain per timestep is different for the two models, but as I said, these are crude approximations meant to convey the ballpark relative difference.
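The ratios above can be worked out in a few lines of Python, using only the figures in the comment and the same crude equivalences (one synapse per parameter, one second of experience per token). The variable names are made up.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

human_timesteps = 70 * SECONDS_PER_YEAR  # about 2.2 billion seconds
human_params = 100e12                    # 100 trillion synapses
gpt3_params = 175e9                      # 175 billion parameters
gpt3_tokens = 500e9                      # 500 billion tokens

human_ratio = human_params / human_timesteps  # parameters per timestep
gpt3_ratio = gpt3_params / gpt3_tokens        # parameters per token
```

Under these assumptions the human ratio is roughly 45,000 parameters per timestep, against 0.35 for GPT-3, a gap of about five orders of magnitude.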

This means basically that humans are much more prone to overfitting the data, and in particular, memorizing individual data points. Hence why humans experience episodic memory of unique events. It's not clear that GPT-3 has the capacity in terms of parameters to memorize its training data with that level of clarity, and arguably this is why such models seem less sample efficient. A human can learn from a single example by memorizing it and retrieving it later when relevant. GPT-3 has to see it enough times in the training data for SGD to update the weights sufficiently that the general concept is embedded in the highly compressed information model.

It's thus not certain whether existing ML models are sample inefficient because of the algorithms being used, or if it's because they just don't have enough parameters yet, and increased efficiency will emerge from scaling further.

Comment by Darklight on Darklight's Shortform · 2022-09-05T17:46:56.819Z · LW · GW

I recently interviewed with Epoch, and as part of a paid work trial they wanted me to write up a blog post about something interesting related to machine learning trends. This is what I came up with:

http://www.josephius.com/2022/09/05/energy-efficiency-trends-in-computation-and-long-term-implications/

Comment by Darklight on What does moral progress consist of? · 2022-08-20T22:29:50.447Z · LW · GW

I should point out that the logic of the degrowth movement follows from a relatively straightforward analysis of available resources vs. first world consumption levels.  Our world can only sustain 7 billion human beings because the vast majority of them live not at first world levels of consumption, but third world levels, which many would argue to be unfair and an unsustainable pyramid scheme.  If you work out the numbers, if everyone had the quality of life of a typical American citizen, taking into account things like meat consumption to arable land, energy usage, etc., then the Earth would be able to sustain only about 1-3 billion such people.  Degrowth thus follows logically if you believe that all the people around the world should eventually be able to live comfortable, first world lives.

I'll also point out that socialism is, like liberalism, a child of the Enlightenment and general beliefs that reason and science could be used to solve political and economic problems.  Say what you will about the failed socialist experiments of the 20th century, but the idea that government should be able to engineer society to function better than the ad-hoc arrangement that is capitalism, is very much an Enlightenment rationalist, materialist, and positivist position that can be traced to Jean-Jacques Rousseau, Charles Fourier, and other philosophes before Karl Marx came along and made it particularly popular.  Marxism in particular, at least claims to be "scientific socialism", and historically emphasized reason and science, to the extent that most Marxist states were officially atheist (something you might like given your concerns about religions).

In practice, many modern social policies, such as the welfare state, Medicare, public pensions, etc., are heavily influenced by socialist thinking and put in place in part as a response by liberal democracies to the threat of the state socialist model during the Cold War.  No country in the world runs on laissez-faire capitalism, we all utilize mixed market economies with varying degrees of public and private ownership.  The U.S. still has a substantial public sector, just as China, an ostensibly Marxist Leninist society in theory, has a substantial private sector (albeit with public ownership of the "commanding heights" of the economy).  It seems that all societies in the world eventually compromised in similar ways to achieve reasonably functional economies balanced with the need to avoid potential class conflict.  This convergence is probably not accidental.

If you're truly more concerned with truth seeking than tribal affiliations, you should be aware of your own tribe, which as far as I can tell, is western, liberal, and democratic.  Even if you honestly believe in the moral truth of the western liberal democratic intellectual tradition, you should still be aware that it is, in some sense, a tribe.  A very powerful one that is arguably predominant in the world right now, but a tribe nonetheless, with its inherent biases (or priors at least) and propaganda.

Just some thoughts.

Comment by Darklight on Thoughts On Computronium · 2022-06-17T16:29:16.111Z · LW · GW

I'm using the number calculated by Ray Kurzweil for his 1999 book, The Age of Spiritual Machines.  To get that figure, you need 100 billion neurons firing every 5 ms, or 200 Hz.  That is based on the maximum firing rate given refractory periods.  In actuality, average firing rates are usually lower than that, so in all likelihood the difference isn't actually six orders of magnitude.  In particular, I should point out that six orders of magnitude is referring to the difference between this hypothetical maximum firing brain and the most powerful supercomputer, not the most energy efficient supercomputer.

The difference between the hypothetical maximum firing brain and the most energy efficient supercomputer (at 26 GigaFlops/watt) is only three orders of magnitude.  For the average brain firing at the speed that you suggest, it's probably closer to two orders of magnitude.  Which would mean that the average human brain is probably one order of magnitude away from the Landauer limit.

This also assumes that it's neurons and not synapses that should be the relevant multiplier.

Comment by Darklight on AGI Safety FAQ / all-dumb-questions-allowed thread · 2022-06-12T15:08:49.546Z · LW · GW

Okay, so I contacted 80,000 hours, as well as some EA friends for advice.  Still waiting for their replies.

I did hear from an EA who suggested that if I don't work on it, someone else who is less EA-aligned will take the position instead, so in fact, it's slightly net positive for me to be in the industry, although I'm uncertain whether or not AI capability is actually funding constrained rather than personnel constrained.

Also, would it be possible to mitigate the net negative by choosing to deliberately avoid capability research and just take an ML engineering job at a lower tier company that is unlikely to develop AGI before others and just work on applying existing ML tech to solving practical problems?

Comment by Darklight on AGI Safety FAQ / all-dumb-questions-allowed thread · 2022-06-09T19:47:29.602Z · LW · GW

I previously worked as a machine learning scientist but left the industry a couple of years ago to explore other career opportunities.  I'm wondering at this point whether or not to consider switching back into the field.  In particular, in case I cannot find work related to AI safety, would working on something related to AI capability be a net positive or net negative impact overall?

Comment by Darklight on Thoughts On Computronium · 2021-03-04T02:27:24.945Z · LW · GW

Even further research shows the most recent Nvidia RTX 3090 is actually slightly more efficient than the 1660 Ti, at 36 TeraFlops, 350 watts, and 2.2 kg, which works out to 0.0001 PetaFlops/Watt and 0.016 PetaFlops/kg.  Once again, they're within an order of magnitude of the supercomputers.

Comment by Darklight on Thoughts On Computronium · 2021-03-04T01:56:23.595Z · LW · GW

So, I did some more research, and the general view is that GPUs are more power efficient in terms of Flops/watt than CPUs, and the most power efficient of those right now is the Nvidia 1660 Ti, which comes to 11 TeraFlops at 120 watts, so 0.000092 PetaFlops/Watt, which is about 6x more efficient than Fugaku.  It also weighs about 0.87 kg, which works out to 0.0126 PetaFlops/kg, which is about 7x more efficient than Fugaku.  These numbers are still within an order of magnitude, and also don't take into account the overhead costs of things like cooling, case, and CPU/memory required to coordinate the GPUs in the server rack that one would assume you would need.
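As a rough check on the arithmetic, here is a minimal sketch that recomputes the per-watt and per-kilogram figures from the numbers quoted in these comments (the RTX 3090 figures are the ones from the follow-up comment). The `efficiency` helper and the `GPUS` table are names made up for illustration, and the inputs are the quoted spec-sheet style numbers, not measurements.

```python
# Figures as quoted in the comments: peak TeraFlops, power draw in watts, mass in kg.
# The dictionary layout and function name are illustrative assumptions.
GPUS = {
    "1660 Ti": {"teraflops": 11.0, "watts": 120.0, "kg": 0.87},
    "RTX 3090": {"teraflops": 36.0, "watts": 350.0, "kg": 2.2},
}


def efficiency(teraflops, watts, kg):
    """Return (PetaFlops per watt, PetaFlops per kg) for one device."""
    petaflops = teraflops / 1000.0  # 1 PetaFlop = 1000 TeraFlops
    return petaflops / watts, petaflops / kg


if __name__ == "__main__":
    for name, spec in GPUS.items():
        per_watt, per_kg = efficiency(**spec)
        print(f"{name}: {per_watt:.6f} PetaFlops/Watt, {per_kg:.4f} PetaFlops/kg")
```

The results land close to the rounded figures quoted in the comments. None of the overhead mentioned above (cooling, case, host CPU and memory) is modelled here.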

I used the supercomputers because the numbers were a bit easier to get from the Top500 and Green500 lists, and I also thought that their numbers include the various overhead costs to run the full system, already packaged into neat figures.

Comment by Darklight on Darklight's Shortform · 2021-02-20T15:09:01.110Z · LW · GW

Another thought is that maybe Less Wrong itself, if it were to expand in size and become large enough to roughly represent humanity, could be used as such a dataset.

Comment by Darklight on Darklight's Shortform · 2021-02-20T15:01:34.314Z · LW · GW

So, I had a thought.  The glory system idea that I posted about earlier, if it leads to a successful, vibrant democratic community forum, could actually serve as a kind of dataset for value learning.  If each post has a number attached to it that indicates the aggregated approval of human beings, this can serve as a rough proxy for a kind of utility or Coherent Aggregated Volition.

Individual examples will probably be quite noisy, but averaged across a large number of posts, the data could function as a real world dataset, with the post content being the input, and the post's vote tally being the output label.  You could then train a supervised learning classifier or regressor that could then be used to guide a Friendly AI model, like a trained conscience.
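To make the shape of that idea concrete, here is a toy sketch of the supervised setup: post text as input, vote tally as label, and a bag-of-words linear regressor fit by stochastic gradient descent. Everything in it is an assumption for illustration: the six example posts and their tallies are invented, and a real attempt would need far more data and a far better model than word counts.

```python
import re
from collections import defaultdict

# Hypothetical toy data: (post text, aggregated vote tally). Not real forum data.
POSTS = [
    ("thank you for the careful and kind explanation", 12.0),
    ("kind and helpful answer with sources", 9.0),
    ("careful helpful summary", 8.0),
    ("rude spam buy now", -7.0),
    ("spam spam rude insult", -10.0),
    ("insult and rude remark", -6.0),
]


def tokenize(text):
    """Split text into lowercase word tokens (bag-of-words features)."""
    return re.findall(r"[a-z']+", text.lower())


def train(posts, epochs=200, lr=0.05):
    """Fit one weight per word plus a bias by SGD on squared error."""
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(epochs):
        for text, votes in posts:
            tokens = tokenize(text)
            pred = bias + sum(weights[t] for t in tokens)
            err = pred - votes
            bias -= lr * err
            for t in tokens:  # a repeated word is updated once per occurrence
                weights[t] -= lr * err
    return weights, bias


def predict(model, text):
    """Predicted vote tally for a piece of text; unseen words contribute nothing."""
    weights, bias = model
    return bias + sum(weights.get(t, 0.0) for t in tokenize(text))


if __name__ == "__main__":
    model = train(POSTS)
    for text in ("kind careful helpful", "rude spam insult"):
        print(f"{text!r}: {predict(model, text):.2f}")
```

The predicted tally is the number that would play the role of the rough approval proxy described above. Whether such a proxy tracks anything like human values is exactly the open question, and the sketch does not address it.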

This admittedly would not be provably Friendly, but as a vector of attack for the value learning problem, it is relatively straightforward to implement and probably more feasible in the short-run than anything else I've encountered.

Comment by Darklight on The Glory System: A Model For Moral Currency And Distributed Self-Moderation · 2021-02-19T21:18:20.910Z · LW · GW

A further thought is that those with more glory can be seen almost as elected experts.  Their glory is assigned to them by votes after all.  This is an important distinction from an oligarchy.  I would actually be inclined to see the glory system as located on a continuum between direct democracy and representative democracy.

Comment by Darklight on The Glory System: A Model For Moral Currency And Distributed Self-Moderation · 2021-02-19T21:00:22.655Z · LW · GW

So, keep in mind that having the first vote be free and worth double the paid votes does tilt things more towards democracy.  That being said, I am inclined to see glory as a kind of proxy for past agreement and merit, and a rough way to approximate liquid democracy where you can proxy your vote to others or vote yourself.

In this alternative "market of ideas" the ideas win out because people who others trust to have good opinions are able to leverage that trust.  Decisions over the merit of the given arguments are aggregated by vote.  As long as the population is sufficiently diverse, this should result in an example of the Wisdom of Crowds phenomenon.

I don't think it'll dissolve into a mere flag waving contest, any more than the existing Karma systems on Reddit and Less Wrong do already.

Comment by Darklight on The Glory System: A Model For Moral Currency And Distributed Self-Moderation · 2021-02-19T18:25:15.896Z · LW · GW

Perhaps a nitpick detail, but having someone rob them would not be equivalent, because the cost of the action is offset by the ill-gotten gains.  The proposed currency is more directly equivalent to paying someone to break into the target's bank account and destroying their assets by a proportional amount so that no one can use them anymore.

As for the more general concerns:

Standardized laws and rules tend in practice to disproportionately benefit those with the resources to bend and manipulate those rules with lawyers.  Furthermore, this proposal does not need to replace all laws, but can be utilized alongside them as a way for people to show their disapproval in a way that is more effective than verbal insult, and less coercive than physical violence.  I'd consider it a potential way to channel people's anger so that they don't decide to start a revolution against what they see as laws that benefit the rich and powerful.  It is a way to distribute a little power to individuals and allow them to participate in a system that considers their input in a small but meaningful way.

The rules may be more consistent with laws, but in practice, they are also contentious in the sense that the process of creating these laws is arcane and complex and the resulting punishments often delayed for years as they work through the legal system.  Again, this makes sense when determining how the coercive power of the state should be applied, but leaves something to be desired in terms of responsiveness to addressing real world concerns.

Third-party enforcement is certainly desirable.  In practice, the glory system allows anyone outside the two parties to contribute and likely the bulk of votes will come from them.  As for cycles of violence, the exchange rate mechanism means that defence is at least twice as effective as attack with the same amount of currency, which should at least mitigate the cycles because it won't be cost-effective to attack without significant public support.  Though this is only relevant to the forum condition.

In the general condition as a currency, keep in mind that as a currency functions as a store of value, there is a substantial opportunity cost to spending the currency to destroy other people's currency rather than say, using it to accrue interest.  The cycles are in a sense self-limiting because people won't want to spend all their money escalating a conflict that will only cause both sides to hemorrhage funds, unless someone feels so utterly wronged as to be willing to go bankrupt to bankrupt another, in which case, one should honestly be asking what kind of injustice caused this situation to come into being in the first place.

All that being said, I appreciate the critiques.

Comment by Darklight on The Glory System: A Model For Moral Currency And Distributed Self-Moderation · 2021-02-19T17:51:34.274Z · LW · GW

As for the cheaply punishing prolific posters problem, I don't know a good solution that doesn't lead to other problems, as forcing all downvotes to cost glory makes it much harder to deal with spammers who somehow get through the application process filter.  I had considered an alternative system in which all votes cost glory, but then there's no way to generate glory except perhaps by having admins and mods gift them, which could work, but runs counter to the direct democracy ideal that I was sorta going for.

Comment by Darklight on The Glory System: A Model For Moral Currency And Distributed Self-Moderation · 2021-02-19T17:37:44.308Z · LW · GW

What I meant was you could farm upvotes on your posts.  Sorry.  I'll edit it for clarity.

Comment by Darklight on The Glory System: A Model For Moral Currency And Distributed Self-Moderation · 2021-02-19T17:34:15.864Z · LW · GW

And further to clarify, you'd both be able to gift glory and also spend glory to destroy other people's glory, at the mentioned exchange rate.

The way glory is introduced into the system is that any given post allows everyone one free vote on them that costs no glory.

Comment by Darklight on The Glory System: A Model For Moral Currency And Distributed Self-Moderation · 2021-02-19T17:32:25.244Z · LW · GW

So, I guess I should clarify, the idea is that you can both gift glory, which is how you gain the ability to post, and also you gain or lose glory based on people's upvotes and downvotes on your posts.

Comment by Darklight on As a Washed Up Former Data Scientist and Machine Learning Researcher What Direction Should I Go In Now? · 2020-10-19T22:05:36.402Z · LW · GW

I have been able to land interviews at a rate of about 8/65 or 12% of the positions I apply to.  My main assumption is that the timing of COVID-19 is bad, and I'm also only looking at positions in my geographical area of Toronto.  It's also possible that I was overconfident early on and didn't prep enough for the interviews I got, which often involved general coding challenges that depended on data structures and algorithms that I hadn't studied since undergrad, as well as ML fundamentals for things like PCA that I hadn't touched in a long time as my research work has been deep learning focused.

As for corporate politics and how to handle them rationally, I'm not entirely sure I can be much help, as to be honest, I'm not entirely clear on what happened to cause the situation that I got myself into.

Perhaps the thing I could suggest is to be tactful and avoid giving people an excuse or opportunity to sideline you, and never assume that you can work with anyone without issue, because toxic or hostile managers especially can make you miserable and prevent you from being successful.  Noticing such people in advance and avoiding having to depend on their performance appraisals is probably a good idea.

Most people in business seem focused on performing and getting results, and some of them are wary of others who could overtake them, and so you need to balance showing your value with not seeming threatening to their position.  I was in an awkward position that my immediate manager and I didn't get along, but the director of the department who originally hired me protected me from too much reprisal.  However, he needed me to perform better to be able to advocate for me effectively, and it was difficult to do so under the person I was directly under.

Such situations can arise and get quite complicated.  I wish I could say you can use the tools of rationality to reason with anyone and convince them to work cooperatively on team goals, but I found that some people are less amenable than others.  Furthermore, if someone makes an attack against you in corporate politics, chances are you won't see it coming: they may use a subordinate to strike indirectly, and those involved will straight up ignore your communications or give you the runaround in such a way that you won't be sure who is actually responsible for what.  Many meetings are behind closed doors, and there is a clear limit to the information you will have relative to your superiors, which can make it difficult to defend yourself even if you know something is going on.

I guess another thing I can add is that probably a large part of why I was able to avoid being fired was that I had substantial documentation, including a detailed research journal, and a spreadsheet of my working hours to back me up.  When trying to be a rational and honest worker in the corporate world, a paper trail is protection and a good way to ensure that the compliance department and HR will be on your side when it counts.

Also, beware that if you let certain types of people get away with one seemingly small thing, they will see that as weakness and conclude that you are exploitable.  Know your boundaries and the regulations of the company.  Bullies are not just a schoolyard problem, but in the office, they're much smarter and know how to get away with things.  Sometimes these people are also good enough at their jobs that you will not be able to do anything to them because the company needs what they provide.  That is life.  Pick your battles and don't allow unfair situations and difficulties to make you lose sleep and perform worse.  Do the best you can to do your job well, such that you are beyond reproach if possible.  Be aware that things can spiral.  If you lose sleep over something that happened, and this makes you late for work the next day, you've given your detractors ammunition.

That's all I can think of right now.

Edit:  As an example of how clever other people can be at office politics, I was once put in a kind of double bind or trap situation that was similar to a fork in Chess.  Basically, I was told by a manager not to push some code into a repository, ostensibly because we'd just given privileges to someone who had been hired by a different department and who we suspected might steal the code for that department (there's a horse race culture at the corporation).  Here's the thing, if I did what he told me to, this repo would be empty and I'd have no independent evidence that my current project had made any progress, leaving me vulnerable to him accusing me of not doing work, or he could deny that he told me not to put in the code, making it look like I was concealing stuff from the company.  If I refused to go along and instead pushed the code, I would be insubordinate and disloyal to my department and his managers, who he claimed had told him to tell me what to do.

Comment by Darklight on Any Christians Here? · 2017-06-15T00:07:43.188Z · LW · GW

Actually, apparently I forgot about the proper term: Utilitronium

Comment by Darklight on Any Christians Here? · 2017-06-14T01:04:34.654Z · LW · GW

I would urge you to go learn about QM more. I'm not going to assume what you do/don't know, but from what I've learned about QM there is no argument for or against any god.

Strictly speaking it's not something that is explicitly stated, but I like to think that the implication flows from a logical consideration of what MWI actually entails. Obviously MWI is just one of many possible alternatives in QM as well, and the Copenhagen Interpretation obviously doesn't suggest anything.

This also has to do with the distance between the moon and the earth and the earth and the sun. Either or both could be different sizes, and you'd still get a full eclipse if they were at different distances. Although the first test of general relativity was done in 1919, it was found later that the test done was bad, and later results from better replications actually provided good enough evidence. This is discussed in Stephen Hawking's A Brief History of Time.

The point is that they are a particular ratio that makes them ideal for these conditions, when they could have easily been otherwise, and that these are exceptionally convenient coincidences for humanity.

There are far more stars than habitable worlds. If you're going to be consistent with assigning probabilities, then by looking at the probability of a habitable planet orbiting a star, you should conclude that it is unlikely a creator set up the universe to make it easy or even possible to hop planets.

The stars also make it possible for us to use telescopes to identify which planets are in the habitable zone. It remains much more convenient than if all star systems were obscured by a cloud of dust, which I can easily imagine being the norm in some alternate universe.

Right, the sizes of the moon and sun are arbitrary. We could easily live on a planet with no moon, and have found other ways to test General Relativity. No appeal to any form of the Anthropic Principle is needed. And again with the assertion about habitable planets: the anthropic principle (weak) would only imply that to see other inhabitable planets, there must be an inhabitable planet from which someone is observing.

Again, the point is that these are very notable coincidences that would be more likely to occur in a universe with some kind of advanced ordering.

So you didn't provide any evidence for any god; you just committed a logical fallacy of the argument from ignorance.

When I call this evidence, I am using it in the probabilistic sense, that the probability of the evidence given the hypothesis is higher than the probability of the evidence by itself. Even though these things could be coincidences, they are more likely to occur in a controlled universe meant for habitation by sentient beings. In that sense I consider this evidence.
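In symbols, as a rough sketch of that probabilistic sense of evidence (the labels H for the hypothesis and E for the observation are introduced here purely for illustration), Bayes' theorem ties the condition directly to whether the observation raises the probability of the hypothesis:

```latex
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}
\qquad\Longrightarrow\qquad
P(E \mid H) > P(E) \iff P(H \mid E) > P(H),
\quad \text{assuming } P(H) > 0 \text{ and } P(E) > 0 .
```

So an observation counts as evidence in this sense exactly when it shifts the probability of the hypothesis upward, by however small an amount. That is a much weaker claim than proof.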

I don't know why you bring up the argument from ignorance. I haven't proclaimed that this evidence conclusively proves anything. Evidence is not proof.

The way I view the universe, everything you state is still valid. I see the universe as a period of asymmetry, where complexity is allowed to clump together, but it clumps in regular ways defined by rules we can discover and interpret.

Why though? Why isn't the universe simply chaos without order? Why is it consistent such that the spacetime metric is meaningful? The structure and order of reality itself strikes me as peculiar given all the possible configurations that one can imagine. Why don't things simply burst into and out of existence? Why do cause and effect dominate reality as they do? Why does the universe have a beginning and such uneven complexity rather than just existing forever as a uniform Bose-Einstein condensate of near zero state, low entropy particles?

To me, the mark of a true rationalist is an understanding of the nature of truth. And the truth is that the truth is uncertain. I don't pretend like the interesting coincidences are proof of God. To be intellectually honest, I don't know that there is a God. I don't know that the universe around me isn't just a simulation I'm being fed either though. Ultimately we have to trust our senses and our reasoning, and accept tentatively some beliefs as more likely than others, and act accordingly. The mark of a good rationalist is a keen awareness of their own limited degree of awareness of the truth. It is a kind of humility that leads to an open mind and a willingness to consider all possibilities, weighed according to the probability of the evidence associated with them.

Comment by Darklight on Any Christians Here? · 2017-06-14T00:26:39.926Z · LW · GW

Interesting, what is that?

The idea of theistic evolution is simply that evolution is the method by which God created life. It basically says, yes, the scientific evidence for natural selection and genetic mutation is there and overwhelming, and accepts these as valid, while at the same time positing that God can still exist as the cause that set the universe and evolution in motion through putting in place the Laws of Nature. It requires not taking the six days thing in the Bible literally, but rather metaphorically as being six eons of time, or some such. The fact that sea creatures precede land creatures precede humans suggests that the general order described in scripture is consistent with established science as well.

Are you familiar with the writings of Frank J. Tipler?

I have heard of Tipler and his writings, though I have yet to actually read his books.

That would be computronium-based I suppose.

Positronium in this case means "Positive Computronium" yes.

Comment by Darklight on Looking for machine learning and computer science collaborators · 2017-06-13T05:37:40.784Z · LW · GW

I might be able to collaborate. I have a masters in computer science and did a thesis on neural networks and object recognition, before spending some time at a startup as a data scientist doing mostly natural language related machine learning stuff, and then getting a job as a research scientist at a larger company to do similar applied research work.

I also have two published conference papers under my belt, though they were in pretty obscure conferences admittedly.

As a plus, I've also read most of the sequences and am familiar with the Less Wrong culture, and have spent a fair bit of time thinking about the Friendly/Unfriendly AI problem. I even came up with an attempt at a thought experiment to convince an AI to be friendly.

Alas, I am based near Toronto, Ontario, Canada, so distance might be an issue.