Values, Valence, and Alignment 2019-12-05T21:06:33.103Z · score: 12 (4 votes)
Doxa, Episteme, and Gnosis Revisited 2019-11-20T19:35:39.204Z · score: 13 (4 votes)
The new dot com bubble is here: it’s called online advertising 2019-11-18T22:05:27.813Z · score: 55 (21 votes)
Fluid Decision Making 2019-11-18T18:39:57.878Z · score: 9 (2 votes)
Internalizing Existentialism 2019-11-18T18:37:18.606Z · score: 10 (3 votes)
A Foundation for The Multipart Psyche 2019-11-18T18:33:20.925Z · score: 7 (1 votes)
In Defense of Kegan 2019-11-18T18:27:37.237Z · score: 10 (5 votes)
Why does the mind wander? 2019-10-18T21:34:26.074Z · score: 11 (4 votes)
What's your big idea? 2019-10-18T15:47:07.389Z · score: 29 (15 votes)
Reposting previously linked content on LW 2019-10-18T01:24:45.052Z · score: 18 (3 votes)
TAISU 2019 Field Report 2019-10-15T01:09:07.884Z · score: 38 (19 votes)
Minimization of prediction error as a foundation for human values in AI alignment 2019-10-09T18:23:41.632Z · score: 13 (7 votes)
Elimination of Bias in Introspection: Methodological Advances, Refinements, and Recommendations 2019-09-30T20:23:13.139Z · score: 16 (3 votes)
Connectome-specific harmonic waves and meditation 2019-09-30T18:08:45.403Z · score: 12 (10 votes)
Goodhart's Curse and Limitations on AI Alignment 2019-08-19T07:57:01.143Z · score: 15 (7 votes)
G Gordon Worley III's Shortform 2019-08-06T20:10:27.796Z · score: 16 (2 votes)
Scope Insensitivity Judo 2019-07-19T17:33:27.716Z · score: 25 (10 votes)
Robust Artificial Intelligence and Robust Human Organizations 2019-07-17T02:27:38.721Z · score: 17 (7 votes)
Whence decision exhaustion? 2019-06-28T20:41:47.987Z · score: 17 (4 votes)
Let Values Drift 2019-06-20T20:45:36.618Z · score: 3 (11 votes)
Say Wrong Things 2019-05-24T22:11:35.227Z · score: 99 (36 votes)
Boo votes, Yay NPS 2019-05-14T19:07:52.432Z · score: 34 (11 votes)
Highlights from "Integral Spirituality" 2019-04-12T18:19:06.560Z · score: 20 (21 votes)
Parfit's Escape (Filk) 2019-03-29T02:31:42.981Z · score: 40 (15 votes)
[Old] Wayfinding series 2019-03-12T17:54:16.091Z · score: 9 (2 votes)
[Old] Mapmaking Series 2019-03-12T17:32:04.609Z · score: 9 (2 votes)
Is LessWrong a "classic style intellectual world"? 2019-02-26T21:33:37.736Z · score: 31 (8 votes)
Akrasia is confusion about what you want 2018-12-28T21:09:20.692Z · score: 27 (16 votes)
What self-help has helped you? 2018-12-20T03:31:52.497Z · score: 34 (11 votes)
Why should EA care about rationality (and vice-versa)? 2018-12-09T22:03:58.158Z · score: 16 (3 votes)
What precisely do we mean by AI alignment? 2018-12-09T02:23:28.809Z · score: 29 (8 votes)
Outline of Metarationality, or much less than you wanted to know about postrationality 2018-10-14T22:08:16.763Z · score: 19 (17 votes)
HLAI 2018 Talks 2018-09-17T18:13:19.421Z · score: 15 (5 votes)
HLAI 2018 Field Report 2018-08-29T00:11:26.106Z · score: 51 (21 votes)
A developmentally-situated approach to teaching normative behavior to AI 2018-08-17T18:44:53.515Z · score: 12 (5 votes)
Robustness to fundamental uncertainty in AGI alignment 2018-07-27T00:41:26.058Z · score: 7 (2 votes)
Solving the AI Race Finalists 2018-07-19T21:04:49.003Z · score: 27 (10 votes)
Look Under the Light Post 2018-07-16T22:19:03.435Z · score: 25 (11 votes)
RFC: Mental phenomena in AGI alignment 2018-07-05T20:52:00.267Z · score: 13 (4 votes)
Aligned AI May Depend on Moral Facts 2018-06-15T01:33:36.364Z · score: 9 (3 votes)
RFC: Meta-ethical uncertainty in AGI alignment 2018-06-08T20:56:26.527Z · score: 18 (5 votes)
The Incoherence of Honesty 2018-06-08T02:28:59.044Z · score: 22 (12 votes)
Safety in Machine Learning 2018-05-29T18:54:26.596Z · score: 17 (4 votes)
Epistemic Circularity 2018-05-23T21:00:51.822Z · score: 5 (1 votes)
RFC: Philosophical Conservatism in AI Alignment Research 2018-05-15T03:29:02.194Z · score: 29 (10 votes)
Thoughts on "AI safety via debate" 2018-05-10T00:44:09.335Z · score: 33 (7 votes)
The Leading and Trailing Edges of Development 2018-04-26T18:02:23.681Z · score: 24 (7 votes)
Suffering and Intractable Pain 2018-04-03T01:05:30.556Z · score: 15 (3 votes)
Evaluating Existing Approaches to AGI Alignment 2018-03-27T19:57:39.207Z · score: 22 (5 votes)
Idea: Open Access AI Safety Journal 2018-03-23T18:27:01.166Z · score: 64 (20 votes)


Comment by gworley on Many Turing Machines · 2019-12-11T02:54:34.267Z · score: 2 (1 votes) · LW · GW

I'm not sure what the question is here, so I'll comment instead.

Now, if the Church Turing Hypothesis is true, then this metaphorical tape is sufficiently powerful to simulate not only boring things like computers, but also fancy things like black-holes and (dare I say it) human intelligence!

I believe this to be an overstep. First, the Church-Turing Thesis is not a formal claim we can readily assess the truth of (formally speaking, we'd probably just say it's false), but is instead a belief that some interpret as "the universe is computable" but mostly shows up in computer science as a way to handwave around messy details of proving any particular function is computable. For it to be a formal claim would require us knowing more physics than we do such that we would know the true metaphysics of the universe. Thus by invoking it you put the cart before the horse, claiming a thing that would already prove your argument without justification.

Since this is part of the post that seems to be making an argument you disagree with, I'm inclined to view your description as a strawman of MWI in light of this. If you mean it to be a strong argument against MWI I think you'll have to present it in a way that would convince someone who believes in MWI, since this reads to me like you haven't understood the MWI position and so are objecting to a position superficially similar to the MWI position but that's not the real position.

P.P.S. In case my own viewpoint was not obvious, I think "shut up and calculate" means we only worry about things that could potentially affect our future observations, and worrying about whether or not the other branches of the multiverse "exist" is about as meaningful as worrying about how many angels could stand on the head of a pin.

That's a pragmatic view, and you are free to ignore the (currently) metaphysical question being addressed by MWI because you think it doesn't matter to your life, but it's also not an argument against MWI, only against MWI mattering to your purposes.

Comment by gworley on Predictive coding = RL + SL + Bayes + MPC · 2019-12-10T20:49:31.467Z · score: 3 (2 votes) · LW · GW
Side note: Should we lump (d-e) together?

Or more generally, should we lump all of these levels together or not?

On the one hand, I think yes, because I think the same basic mechanism is at work (homeostatic feedback loop).

On the other hand, no, because those loops are wired together in different ways in different parts of the brain to do different things. I draw my model of what the levels are from the theory of dependent origination but other theories are possible, and maybe we can eventually get some thoroughly grounded in empirical neuroscience.

Comment by gworley on G Gordon Worley III's Shortform · 2019-12-10T19:47:09.263Z · score: 4 (2 votes) · LW · GW

After seeing another LW user (sorry, forgot who) mention this post in their commenting guidelines, I've decided to change my own commenting guidelines to the following, matching pretty closely the SSC commenting guidelines that I forgot existed until just a couple days ago:

Comments should be at least two of true, useful, and kind, i.e. you believe what you say, you think the world would be worse without this comment, and you think the comment will be positively received.

I like this because it's simple and it says what rather than how. My old guidelines were all about how:

Seek to foster greater mutual understanding and prefer good faith to bad, nurture to combat, collaboration to argument, and dialectic to debate. Do that by:
-aiming to understand the author and their intent, not what you want them to have said or fear that they said
-being charitable about potential misunderstandings, assuming each person is trying their best to be clearly understood and to advance understanding
-resolving disagreement by finding the crux or synthesizing contrary views to sublimate the disagreement
I'm fairly tolerant, but if you're making comments that are actively counterproductive to fruitful conversation by failing to read and think about what someone else is saying I'm likely to ask you to stop and, if you don't, delete your comments and, if you continue, ban you from commenting on my posts. Some behavior that is especially likely to receive warnings, deletions, and bans:
-trying to "score points"
-hitting "applause lights"
-being contrarian for its own sake

More generally, I think the SSC commenting guidelines might be a good cluster for those of us who want LW comment sections to be "nice" and so mark our posts as norm enforcing. If this catches on, it might help with finding the few clusters of commenting norms that people want without having lots of variation between authors.

Comment by gworley on G Gordon Worley III's Shortform · 2019-12-10T19:36:20.634Z · score: 4 (2 votes) · LW · GW

I similarly suspect automation is not really happening in a dramatically different way thus far. Maybe that will change in the future (I think it will), but it's not here yet.

So why so much concern about automation?

I suspect because of something they don't look at in this study much (based on the summary): displacement. People are likely being displaced from jobs into other jobs by automation or the perception of automation and some few of those exit the labor market rather than switch into new jobs. Further, those who do move to new jobs likely disprefer their new jobs because they require different skills, they are less skilled at them immediately after switching, and due to lack of initial skill these new jobs initially pay less than the old jobs. This creates compelling evidence for the automation "destroying" jobs story even though the bigger picture makes it clear that this isn't really happening, in particular because the destroying job story ignores the contrary evidence from what happens after a worker has been in a new job after displacement for a few years and has recovered to pre-displacement levels of wages.

Comment by gworley on Confabulation · 2019-12-09T19:51:12.552Z · score: 5 (3 votes) · LW · GW
When someone asks me why I did or said something I usually lie because the truthful answer is "I don't know". I literally don't know why I make >99% of my decisions. I think through none of these decisions rationally. It's usually some mixture of gut instinct, intuition, cultural norms, common sense and my emotional state at the time.

I know some folks on LW are very scrupulous, or worry about whether they are sufficiently scrupulous. While I'm not saying there is no value in having an accurate view of reality (in fact I think it is quite useful!), this is also why when I try to imagine myself being very scrupulous it feels a bit silly because it all just seems like something I made up and any correspondence with reality is the result of things far outside my control, including but not limited to how well I remember things, how well I create ideas in other people's minds using words, and how my brain perceives the accuracy of what I'm saying. That doesn't mean I totally give up on trying to say something true, but I also realize I know so little that my notion of true is very small.

Comment by gworley on Comment on Coherence arguments do not imply goal directed behavior · 2019-12-06T23:30:54.301Z · score: 2 (1 votes) · LW · GW
You seem to be using the words "goal-directed" differently than the OP.
And in different ways throughout your comment.

That's a manifestation of my point: what it would mean for something to be a goal seems to be able to shift depending on what it is you think is an important feature of the thing that would have the goal.

Comment by gworley on The Actionable Version of "Keep Your Identity Small" · 2019-12-06T20:00:26.002Z · score: 33 (13 votes) · LW · GW

I've gone through keeping my identity small and come out the other side, so this might be an interesting nuance on it.

KYIS is important. I think of it in terms of attachment. It's important not to become attached to (or need, in your language) the identified with thing. That's the path down which motivated thinking, defensiveness, and general suffering lie.

However, it's also important to project an identity. People get confused about how to interact with you if you don't fit cleanly into a role. To use a programming metaphor, projecting an identity is like documenting your API so people know what and how they can interact with you.

My own experience was that I made my identity so small and consequently projected so little identity that people didn't quite know what to make of me. I was getting labeled "eccentric" and "weird" a lot because I was confusing. So to help other people be less confused and improve my social interactions, I created a brand or identity to project outwards with my clothes, mannerisms, etc. that is closely based on who I naturally am as a person but also plays into schema that other people have. The result is people have some clear sense of who I am, even though it's wrong, and it lets them interact with me in consistently positive ways, even if they aren't the maximally positive ways that would be possible if we spent the time to get to know each other deeply. I make my brand the closest Schelling point in the identity space of schemas that people have, and things fall out smoothly from there.

Maybe not the approach everyone will want to take, but if you find it frustrating that everyone thinks you are weird and doesn't know how to interact with you in positive ways, consider showing some more identity to them (even if it's not the real thing!) so that they can "know" you better. If you're afraid to do that because it's not authentic, consider in what way "being authentic" is something you identify with!

Comment by gworley on Comment on Coherence arguments do not imply goal directed behavior · 2019-12-06T19:32:32.065Z · score: 3 (2 votes) · LW · GW

NB: I've not made a post about this point, but your thoughts made me think of it, so I'll bring it up here. Sorry if I left a comment elsewhere making this same point previously and I forgot about it. Also this is not really a direct response to your post, which I'm not explicitly agreeing or disagreeing with in this comment, but more a riff on the same ideas because you got me thinking about them.

I think much of the confusion around goals and goal directed behavior and what constitutes it and what doesn't lies in the fact that goals, as we are treating them here, are teleological, viz. they are defined in context of what we care about. Another way to say this is that goals are a way we anthropomorphize things, thinking of them as operating the same way we experience our own minds operating.

To see this, we can simply shift our perspective to think of anything as being goal directed. Is a twitching robot goal directed? Sure, if I created the robot to twitch, it's doing a great job of achieving its purpose. Is a bottle cap goal directed? Sure, it was created to keep stuff in, and it keeps doing a fine job of that. Conversely, am I goal directed? Maybe not: I just keep doing stuff and it's only after the fact that I can construct a story that says I was aiming at some goal. Is a paperclip maximizer goal directed? Maybe not: it just makes paperclips because it's programmed to and has no idea that that's what it's doing, no more than the bottle cap knows it's holding in liquid or the twitching robot knows it's twitching.

This doesn't mean goals are not important; I think goals matter a lot when we think about alignment because they are a construct that falls out of how humans make sense of the world and their own behavior, but they are interesting for that reason, not because they are a natural part of the world that exists prior to our creation of them in our ontologies, i.e. goals are a feature of the map, not the unmapped territory.

Comment by gworley on Values, Valence, and Alignment · 2019-12-06T19:09:12.071Z · score: 4 (2 votes) · LW · GW

I maybe don't quite understand your first two questions. If you're asking "where does positive valence come from" my answer is "minimization of prediction error", keeping in mind I think of that as a fancy way to say "feedback signal indicating a control system is moving towards a setpoint". I forget how to translate that into terms of Friston's free energy (increasing it? decreasing it?) if you prefer that model, but the point being that valence is a fundamental thing the brain does to signal parts of itself to do more or less of something.

As to your second question, valence is absolutely shaped by evolution so long as we hold the theory that all creatures with nerve cells have come to exist via evolutionary processes (maybe better to taboo "evolution" and say "differential reproduction with trait inheritance"). What effect evolution has had on valence seems a matter for evolutionary psychology and related studies of the evolutionary etiology of animal behavior.
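
To make the "feedback signal indicating a control system is moving towards a setpoint" reading a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the function name `run_control_loop`, the proportional `gain`, and the choice to define the valence-like signal as the reduction in absolute error from one step to the next. It is a toy, not a claim about how brains or Friston's free energy actually work.

```python
def run_control_loop(setpoint, start, gain=0.5, steps=5):
    """Toy proportional controller (hypothetical, for illustration only).

    Returns a list of (state, valence) pairs, where `valence` is positive
    when the step reduced the distance to the setpoint and negative when
    the step increased it.
    """
    state = start
    history = []
    for _ in range(steps):
        error_before = abs(setpoint - state)
        # Proportional correction: move some fraction of the way to the setpoint.
        state += gain * (setpoint - state)
        error_after = abs(setpoint - state)
        # Valence-like signal: how much the error shrank on this step.
        valence = error_before - error_after
        history.append((state, valence))
    return history


if __name__ == "__main__":
    for state, valence in run_control_loop(setpoint=37.0, start=30.0):
        print(f"state={state:.3f} valence={valence:+.3f}")
```

With a modest gain every step moves closer to the setpoint and the signal stays positive; with a gain large enough to overshoot (say `gain=2.5`) the error grows and the signal goes negative, which is the "do less of that" case.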

Comment by gworley on The Devil Made Me Write This Post Explaining Why He Probably Didn't Hide Dinosaur Bones · 2019-12-05T19:17:51.185Z · score: 13 (4 votes) · LW · GW
In the case of Many-Worlds interpretations or parallel universes, the correct response is to be like Alice, and admit that multiple perspectives are equally admissible. (This is assuming that they truly are empirically indistinguishable.)
This is no worse than accepting that there might be multiple mathematical proofs of the Pythagorean theorem, some algebraic and some geometric, or than accepting that angles can be expressed in degrees or in radians. All are equally valid ways to think about the same problem, so use whatever you like.

This seems not quite right to me, in that I doubt we can draw this equivalence. In the case of mathematical proofs and the units with which to measure angles, we can be indifferent between the choices in the case that our purpose (what we care about; our telos) is proving a statement true or having a measure of an angle, respectively, but if we care about length of proof or proof assumptions (maybe we want a proof of a theorem that doesn't rely on the axiom of choice) or angle units supported by a calculator or elegance of working with particular units then there is a difference between these that matters.

So it is with explanations. If our purpose is to make predictions about quantum effects, then a theory about how quantum mechanics works isn't important, only that the mathematical model predicts reality, and metaphysical questions are moot. But if our purpose is to understand what's going on beyond what can be predicted using quantum mechanics, then we care a lot about which interpretation of quantum mechanics is correct because it does make predictions about the thing we care about.

This kind of not-caring-because-it-works is only practical so long as it is pragmatic to a particular purpose. Perhaps many people should be more pragmatic, but that seems a separate issue, and there are many reasons why what is pragmatic for one purpose may not be for another, so I think your view is true but insufficient.

Comment by gworley on If giving unsolicited feedback was a social norm, what feedback would you often give? · 2019-12-04T21:01:10.452Z · score: 5 (3 votes) · LW · GW

Act into fear and abandon all hope

Comment by gworley on A list of good heuristics that the case for AI x-risk fails · 2019-12-03T19:32:50.100Z · score: 7 (5 votes) · LW · GW

Here's another: AI being x-risky makes me the bad guy.

That is, if I'm an AI researcher and someone tells me that AI poses x-risks, I might react by seeing this as someone telling me I'm a bad person for working on something that makes the world worse. This is bad for me because I derive important parts of my sense of self from being an AI researcher: it's my profession, my source of income, my primary source of status, and a huge part of what makes my life meaningful to me. If what I am doing is bad or dangerous, that threatens to take much of that away (if I also want to think of myself as a good person, meaning I either have to stop doing AI work to avoid being bad or stop thinking of myself as good), and an easy solution to that is to dismiss the arguments.

This is more generally a kind of motivated cognition or rationalization, but I think it's worth considering a specific mechanism because it better points towards ways you might address the objection.

Comment by gworley on A list of good heuristics that the case for AI x-risk fails · 2019-12-03T19:20:25.776Z · score: 4 (2 votes) · LW · GW

Sort of related to a couple points you already brought up (not in personal experience, outsiders not experts, science fiction), but worrying about AI x-risk is also weird, i.e. it's not a thing everyone else is worrying about, so you use some of your weirdness-points to publicly worry about it, and most people have very low weirdness budgets (because of not enough status to afford more weirdness, low psychological openness, etc.).

Comment by gworley on Neural Annealing: Toward a Neural Theory of Everything (crosspost) · 2019-12-03T01:44:19.939Z · score: 3 (2 votes) · LW · GW
Sometimes we’re at a functional local maxima, but we’re not pointed in the right direction globally, and frankly speaking our lack of a high energy parameter is our saving grace – our inability to directly muck up our emotional landscape.

I've heard a similar story in meditation circles about why integration work is important: greater awakening enables greater agency/freedom-of-action, and without integration, virtue, and ethics (these are traditionally combined via the paramita of sila) that can be dangerous because it can let a person run off in personally dangerous or socially bad directions that they were previously only managing not to because they weren't more capable, essentially protecting themselves from themselves with their own failure.

Comment by gworley on Chris_Leong's Shortform · 2019-11-27T20:04:28.535Z · score: 4 (2 votes) · LW · GW


I don't recall anymore, it's been too long for me to remember enough specifics to answer your question. It's just an impression or cached thought I have that I carry around from past study.

Comment by gworley on New MetaEthical.AI Summary and Q&A at UC Berkeley · 2019-11-27T20:02:42.573Z · score: 2 (1 votes) · LW · GW
Here, the optimal decisions would be the higher-order outputs which maximize higher-order utility. They are decisions about what to value or how to decide rather than about what to do.

What constitutes utility here, then? For example, some might say utility is grounded in happiness or meaning, in economics we often measure utility in money, and I've been thinking along the lines of grounding utility (through value) in minimization of prediction error. It's fine that you are concerned with higher-order processes (I'm assuming by that you mean processes about processes, like higher-order outputs is outputs about outputs, higher-order utility is utility about utility), and maybe you are primarily concerned with abstractions that let you ignore these details, but then it must still be that those abstractions can be embodied in specifics at some point or else they are abstractions that don't describe reality well. After all, meta-values/preferences/utility functions are still values/preferences/utility functions.

To capture rational values, we are trying to focus on the changes to values that flow out of satisfying one’s higher-order decision criteria. By unrelated distortions of value, I pretty much mean changes in value from any other causes, e.g. from noise, biases, or mere associations.

How do you distinguish whether something is a distortion or not? You point to some things that you consider distortions, but I'm still unclear on the criteria by which you know distortions from the rational values you are looking for. One person's bias may be another person's taste. I realize some of this may depend on how you identify higher-order processes, but even if that's the case we're still left with the question as it applies to those directly, i.e. is some particular higher-order decision criterion a distortion or rational?

In the code and outline I call the lack of distortion Agential Identity (similar to personal identity). I had previously tried to just extract the criteria out of the brain and directly operate on them. But now, I think the brain is sufficiently messy that we can only simulate many continuations and aggregate them. That opens up a lot of potential to stray far from the original state. This Agential Identity helps ensure we’re uncovering your dispositions rather than that of a stranger or a funhouse mirror distortion.

This seems strange to me, because much of what makes a person unique lies in their distortions (speaking loosely here), not in their lack. Normally when we think of distortions they are taking an agent away from a universal perfected norm, and that universal norm would ideally be the same for all agents if it weren't for distortions. What leads you to think there are some personal dispositions that are not distortions and not universal because they are caused by the shared rationality norm?

Comment by gworley on Chris_Leong's Shortform · 2019-11-27T02:30:53.279Z · score: 2 (1 votes) · LW · GW

I tend to think of Hegel as primarily important for his contributions to the development of Western philosophy (so even if he was wrong on details he influenced and framed the work of many future philosophers by getting aspects of the framing right) and for his contributions to methodology (like standardizing the method of dialectic, which on one hand is "obvious" and people were doing it before Hegel, and on the other hand is mysterious and the work of experts until someone lays out what's going on).

Comment by gworley on New MetaEthical.AI Summary and Q&A at UC Berkeley · 2019-11-27T00:58:45.041Z · score: 3 (2 votes) · LW · GW
A brain’s rational utility function is the utility function that would be arrived at by the brain’s decision algorithm if it were to make more optimal decisions while avoiding unrelated distortions of value.

By what mechanism do you think we can assess how unrelated and how much distortion of value is happening? Put another way, what are "values" in this model such that they are separate from the utility function and how could you measure whether or not the utility function is better optimizing for those values?

Comment by gworley on 3 Cultural Infrastructure Ideas from MAPLE · 2019-11-27T00:15:12.455Z · score: 17 (4 votes) · LW · GW
Note that MAPLE is a young place, less than a decade old in its current form. So, much of it is "experimental." These ideas aren't time-tested. But my personal experience of them has been surprisingly positive, so far.

I think it's worth sharing that 3 of the ideas you brought up are, at least within zen, historically common to monastic practice, albeit changed in ways to better fit the context of MAPLE. You call them the care role, the ops role, and the schedule; I see them as analogues of the jisha, the jiki, and the schedule.

The jisha, in a zen monastery, is first and foremost the attendant of the abbot (caveat: in some monasteries every teacher and high-ranking priest will have their own jisha). But in addition to this, the jisha is thought of as the "mother" of the sangha, with responsibilities to care for the monks, nuns, and guests, care for the sick, organize cleaning, and otherwise be supportive of the needs of people. This is similar to your care role in some ways, but MAPLE seems to have focused more on the care aspect and dropped the gendered-role aspects.

The jiki (also jikijitsu or jikido) is responsible for directing the movement of the students. They are the "father" to the jisha's "mother", serving as (possibly strict) disciplinarians to keep the monastery operating as intended by the abbot, enforcing rules and handing out punishments. This sounds similar to the Ops role, albeit probably with fewer slaps to the face and blows to the head.

The schedule is, well, the schedule. I expect MAPLE's schedule, though "young", is building on centuries of monastic schedule tradition while adding in new things. I think it's worth adding that the schedule is also there to support deep practice, because there's a very real way that having to make decisions can weaken samadhi, and having all decisions eliminated creates the space in which calm abiding can more easily arise.

Comment by gworley on A Theory of Pervasive Error · 2019-11-26T22:15:17.239Z · score: 2 (1 votes) · LW · GW

Depends what you care about.

Comment by gworley on Sayan's Braindump · 2019-11-24T00:20:11.932Z · score: 2 (1 votes) · LW · GW

What Dharma traditions in particular do you have in mind? I can't think of one I would describe as saying everyone has innate "moral" perfection unless you sufficiently twist around the word "moral" such that its use is confusing at best.

Comment by gworley on Doxa, Episteme, and Gnosis Revisited · 2019-11-22T18:37:50.855Z · score: 2 (1 votes) · LW · GW

Everything that is not a literal quote from the previous post is new.

Comment by gworley on Do you get value out of contentless comments? · 2019-11-22T02:33:10.165Z · score: 5 (5 votes) · LW · GW

No. I would rather receive a strong upvote. If I receive a comment I would prefer it contain some useful content.

Comment by gworley on G Gordon Worley III's Shortform · 2019-11-20T02:11:10.836Z · score: 2 (1 votes) · LW · GW

Story stats are my favorite feature of Medium. Let me tell you why.

I write primarily to impact others. Although I sometimes choose to do very little work to make myself understandable to anyone who is more than a few inferential steps behind me and then write out on a far frontier of thought, nonetheless my purpose remains sharing my ideas with others. If it weren't for that, I wouldn't bother to write much at all, and certainly not in the same way as I do when writing for others. Thus I care instrumentally a lot about being able to assess if I am having the desired impact so that I can improve in ways that might help serve my purposes.

LessWrong provides some good, high detail clues about impact: votes and comments. Comments on LW are great, and definitely better in quality and depth of engagement than what I find other places. Votes are also relatively useful here, caveat the weaknesses of LW voting I've talked about before. If I post something on LW and it gets lots of votes (up or down) or lots of comments, relative to what other posts receive, then I'm confident people have read what I wrote and I impacted them in some way, whether or not it was in the way I had hoped.

That's basically where story stats stop on LessWrong. Here's a screen shot of the info I get from Medium:

For each story you can see a few things here: views, reads, read ratio, and fans, which is basically likes. I also get an email every week telling me about the largest updates to my story stats, like how many additional views, reads, and fans a story had in the last week.

If I click the little "Details" link under a story name I get more stats: average read time, referral sources, internal vs. external views (external views are views on RSS, etc.), and even a list of "interests" associated with readers who read my story. All of this is great. Each week I get a little positive reward letting me know what I did that worked, what didn't, and most importantly to me, how much people are engaging with things I wrote.

I get some of that here on LessWrong, but not all of it. Although I've bootstrapped myself now to a point where I'll keep writing even absent these motivational cues, I still find this info useful for understanding what things I wrote that people liked best or found most useful and what they found least useful. Some of that is mirrored here by things like votes, but it doesn't capture all of it.

I think it would be pretty cool if I could see more stats about my posts on LessWrong similar to what I get on Medium, especially view and read counts (knowing that "reads" is ultimately a guess based on some users allowing JavaScript that lets us guess that they read it).
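To make the ask concrete, here is a minimal sketch of how view and read counts could be derived from page-view events. Everything in it is an assumption for illustration: the event shape, the 30-second threshold, and the rule that a "read" means the reader stayed that long and scrolled to the end are hypothetical, not how Medium or LessWrong actually compute these numbers.

```python
# Hypothetical threshold: how long a visit must last before it counts as a read.
READ_THRESHOLD_SECONDS = 30.0


def is_read(seconds_on_page, scrolled_to_end):
    """Guess whether a page view was a read (an assumed heuristic)."""
    return scrolled_to_end and seconds_on_page >= READ_THRESHOLD_SECONDS


def story_stats(page_views):
    """Summarize (seconds_on_page, scrolled_to_end) events for one story."""
    views = len(page_views)
    reads = sum(1 for seconds, scrolled in page_views if is_read(seconds, scrolled))
    # Read ratio is reads divided by views; define it as 0.0 for an unviewed story.
    read_ratio = reads / views if views else 0.0
    return {"views": views, "reads": reads, "read_ratio": read_ratio}


# Four made-up page views: two count as reads under the assumed heuristic.
stats = story_stats([(95.0, True), (4.0, False), (40.0, True), (120.0, False)])
```

Whatever the real heuristic is, the point stands that "reads" is an inference from client-side signals, so it undercounts readers who block scripts.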

Comment by gworley on The Value Definition Problem · 2019-11-19T21:39:44.846Z · score: 3 (2 votes) · LW · GW

Possibly related but with a slightly different angle, you may have missed my work on trying to formally specify the alignment problem, which is pointing to something similar but arrives at somewhat different results.

Comment by gworley on The new dot com bubble is here: it’s called online advertising · 2019-11-19T21:30:12.477Z · score: 12 (4 votes) · LW · GW

It's true that not all of online advertising does nothing. We should expect, if nothing else, online advertising to continue to serve the primary and original purpose of advertising, which is generating choice awareness, and certainly my own experience backs this up: I am aware of any number of products and services only because I saw ads for them on Facebook, Google search, SlateStarCodex, etc. To the extent that advertising helps people become aware of choices they otherwise would not have become aware of such that on the margin they may take that choice (since you make none of the choices you don't know how to make), it would seem to function successfully, assuming it can be had at a price low enough to produce positive return on investment.

However, my own experience in the industry suggests that most spend that goes beyond generating more than zero awareness is poorly spent. Much to the dismay of marketing departments, you can't usually spend your way through ads to growth. Other forms of marketing look better (content marketing can work really great and can be a win-win when done right).

Comment by gworley on Cybernetic dreams: Beer's pond brain · 2019-11-19T19:52:59.420Z · score: 3 (2 votes) · LW · GW

I'm excited for the rest of this miniseries. I'm similarly interested in cybernetics and am sad it failed for what in hindsight seem to be obvious and unavoidable reasons (interdisciplinary & easily co-opted to justify bullshit). My own thinking has taken me in a direction convergent with cybernetics, as I've investigated a bit in the past.

Comment by gworley on The new dot com bubble is here: it’s called online advertising · 2019-11-19T19:41:52.676Z · score: 8 (2 votes) · LW · GW

Cool! I don't have time to look into this now, but I'm excited to see what you produce in this direction. As you know I'm pretty pessimistic that we can totally solve Goodhart effects, but I do expect we can mitigate them enough that for things other than superintelligent levels of optimization we can do better than we do now.

Comment by gworley on Personal Experiment: Counterbalancing Risk-Adversion · 2019-11-18T19:44:50.252Z · score: 2 (1 votes) · LW · GW

I've done something similar and it's similarly worked out well.

Comment by gworley on Impossible moral problems and moral authority · 2019-11-18T19:01:57.005Z · score: 5 (3 votes) · LW · GW

I think all of your reasons for how a human comes to have moral authority boil down to something like having a belief that doing what this authority says is expected to be good (have positive valence, in my current working theory of values). This perhaps gives a way of reframing alignment as the problem of constructing an agent to whom you would give moral authority to decide for you, rather than, as we normally do, as the problem of constructing an agent that is value aligned.

Comment by gworley on Insights from the randomness/ignorance model are genuine · 2019-11-13T22:50:58.137Z · score: 2 (1 votes) · LW · GW

I think so, thanks.

Comment by gworley on Insights from the randomness/ignorance model are genuine · 2019-11-13T20:23:49.239Z · score: 4 (3 votes) · LW · GW

I guess I'm a bit out of the loop on questions about how to define uncertainty, so I'm a bit confused about what position you are against or how this is different from what others do. That is, it seems to me like you are trying to fix a problem you perceive in the way people currently think about uncertainty, but I'm not sure what that problem is so that I can even understand how this framing might fix it. I've been reading this sequence of posts thinking "yeah, sure, this all sounds reasonable" but also without really understanding the context for it. I know you did the post on anthropics, but even there it wasn't really that clear to me how this framing helps us over what is perhaps otherwise normally done, although perhaps that reflects my ignorance of existing arguments about what methods of anthropic reasoning are correct.

Comment by gworley on Levers error · 2019-11-13T03:35:07.802Z · score: 2 (1 votes) · LW · GW

This seems quite right to me, that in our minds things are often confused and conflated that don't need to be and as a result we act in ways that aren't what we think should be possible and it feels like doing what we really want is impossible because in our minds we don't know how to separate the thing we want from the thing we don't want. One possible way to deal with these sorts of problems that I've been excited about lately as a good framing for the mechanism that underlies the processes that clear these sorts of confusions is memory reconsolidation.

Comment by gworley on Indescribable · 2019-11-13T03:21:18.989Z · score: 3 (2 votes) · LW · GW

I guess this is a matter of opinion on how much explanation makes something "untranslatable". For example, maybe it takes 1000 words to give enough context to adequately convey the meaning of a word with a very precise meaning in another language. Is this word "translatable"? In a certain sense no, because making sense of it requires giving the person a lot of new context, beyond simple reference to concepts they already have. Obviously the other end of the spectrum, words that are literally impossible to explain, doesn't exist, or else even the speakers of the same language wouldn't be able to convey their meaning to each other, so it seems fair to me to say some words are untranslatable if by that we mean there is no direct or simple translation, on the order of something up to the size of a phrase, that captures the original meaning of the word.

Comment by gworley on Picture Frames, Window Frames and Frameworks · 2019-11-04T18:13:34.660Z · score: 5 (2 votes) · LW · GW

I generally call these sorts of things "perspectives" or "stances", in the interest of sharing alternative terminology.

Comment by gworley on But exactly how complex and fragile? · 2019-11-03T20:09:18.786Z · score: 2 (1 votes) · LW · GW

For one thing, these dynamics are already in place: the world is full of agents and more basic optimizing processes that are not aligned with broad human values—most individuals to a small degree, some strange individuals to a large degree, corporations, competitions, the dynamics of political processes.

I don't think of this as evidence that unaligned AI is not dangerous. Arguably we're already seeing bad effects from unaligned AI, such as effects on public discourse as a result of newsfeed algorithms. Further, anything that limits the impact of unaligned action now seems largely the result of existing agents being of relatively low or similar power. Even the most powerful actors in the world right now can't effectively control much of the world (e.g. no government has figured out how to eliminate dissent, no military how to stop terrorists, etc.). I expect things to look quite different if we develop an actor that is more powerful than a majority of all other actors combined, even if it develops into that power slowly because the steps along the way to that seem individually worth the tradeoff.

But it isn’t obvious to me that by that point it isn’t sufficiently well aligned that we would recognize its future as a wondrous utopia, just not the very best wondrous utopia that we would have imagined if we had really carefully sat down and imagined utopias for thousands of years.

To our ancestors we would appear to live in a wondrous utopia (bountiful food, clean water, low disease, etc.), yet we still want to do better. I think there will be suffering so long as we are not at the global maximum and anyone realizes this.

Comment by gworley on Toon Alfrink's sketchpad · 2019-11-02T00:50:26.875Z · score: 3 (2 votes) · LW · GW

I find this interesting as this gives one of the better arguments I can recall for there being something positive at the heart of social justice such that it isn't just one side trying to grab power from another to push a different set of norms, since that's often what the dynamics of it look like to me in practice, whatever the intent of social justice advocates, and I find such battles not compelling (why grant one group power rather than another, all else equal, if they will push for the things they want to the exclusion of those who would then not be in power just the same as those in power now do to those seeking to gain power?).

Comment by gworley on The Simulation Epiphany Problem · 2019-11-01T22:57:14.726Z · score: 3 (2 votes) · LW · GW

I'm inclined to think there is no problem here because the belief that [Dave] has about being in a simulation is unfounded as it's exactly the same situation Dave finds himself in later when PAL takes route B. That is, taking route B then seems to not be evidence about being in a simulation as you suggest, even if PAL normally takes route A and is highly reliable, because it could just as easily be that Dave is seeing the result of PAL acting on a simulation involving [Dave] causing PAL to prefer route B (assuming there is only one level of simulation; if there's reason to believe there's more than one level we start to tip in favor of simulation).

Comment by gworley on Prospecting for Conceptual Holes · 2019-10-30T18:23:25.637Z · score: 7 (5 votes) · LW · GW

I wonder if you picked kensho as an example because it is so apropos: it's a word related to the seeing of that which you previously didn't know existed.

Comment by gworley on A new kind of Hermeneutics · 2019-10-30T18:17:36.677Z · score: 3 (3 votes) · LW · GW

A few comments.

I was initially intrigued to read this because it seemed like you were going to make an interesting case somewhere along the lines of "mathematics involves hermeneutics" because ultimately mathematics is done by humans using a (formal) language that they must interpret before they can generate more mathematics that follows the rules of the language. It seems to me you never quite got there, and stumbled towards some other point that's not totally clear to me. Forgive me if this is an uncharitable reading, but I read your point as being "look at all the cool stuff we can do because we use formal languages".

Pointing out that formal languages let us do cool stuff is something I agree with, although it feels a bit obvious. I suspect I'm mostly reacting to having hoped you'd make a stronger case for applying hermeneutic methods and having an attitude of interpretation when dealing with formal systems, since this is a point often ignored when people learn about formal systems by remaking what I might call the "naive positivist" mistake of thinking formal systems describe reality precisely, or put in LW terms, confuse the map for the territory.

Additionally, I found your proof of "There exist a Turing Machine that recognize membership in the language." somewhat inadequate given what you presented in the text. It's only thanks to knowing some details about TMs already that this proof is very meaningful to me, and I think the uninitiated reader, whom you seem to be targeting, would have a hard time understanding this proof since you didn't explain the constraints of TMs. For example, an easy objection to raise to your proof, lacking a clear definition of TMs, would be "but won't it halt incorrectly if it sees a '2' or a '3' or a 'c' on the tape?"

I appreciate your clear writing style. If my comments seem harsh it's because you got my hopes up and then delivered something less than what you initially led me to expect.

Comment by gworley on [Site Update] Subscriptions, Bookmarks, & Pingbacks · 2019-10-29T22:08:00.113Z · score: 15 (4 votes) · LW · GW

Do on site notifications and email notifications come at the same cadence? For example, if I ask for daily notifications by email and on site, will I see them in both places only daily or will I still see them on site immediately?

I'd like to get daily emails and have immediate on site notifications, but I'm not sure if that's currently possible.

Comment by gworley on Is requires ought · 2019-10-29T20:11:53.528Z · score: 4 (2 votes) · LW · GW

I agree with your arguments if we consider explicit forms of knowledge, such as episteme and doxa. I'm uncertain if they also apply to what we might call "implicit" knowledge like that of techne and gnosis, i.e. knowledge that isn't separable from the experience of it. There I think we can make a distinction between pure "is" that exists prior to conceptualization and "is from ought" arising only after such experiences are reified (via distinction/discrimination/judgement), which makes it so that we can only talk about knowledge of the "is from ought" form even if it is built over "is" knowledge that we can only point at indirectly.

Comment by gworley on bgaesop's Shortform · 2019-10-24T20:07:42.378Z · score: 5 (2 votes) · LW · GW

Maybe. This is a very narrow definition of "enlightenment" in my opinion, as in Scott is claiming PNSE is enlightenment whereas I would say it's one small part of it. I think of it differently, as a combination of psychological development plus some changes to how the brain operates that seemingly includes PNSE but I'm not convinced that's the whole story.

Comment by gworley on What are some unpopular (non-normative) opinions that you hold? · 2019-10-23T17:46:19.201Z · score: 6 (3 votes) · LW · GW

Almost everything is "alive" or "conscious" because the only interesting property that separates things that are "alive" or "dead" is whether or not they contain feedback processes (that, as a consequence, generate information and locally reduce entropy while globally increasing it).

Comment by gworley on All I know is Goodhart · 2019-10-22T01:19:58.781Z · score: 7 (4 votes) · LW · GW

I find this way of formalizing Goodhart weird. Is there a standard formalization of it, or is this your invention? I'll explain what I think is weird.

You define U and V such that you can calculate U - V to find W, but this appears to me to skip right past the most pernicious bit of Goodhart, which is that U is only knowable via a measurement (not necessarily a measure), such that I would say we only have m(U) for some "measuring" function m, and the problem is that m(U) is correlated with but different from U, since there may not even be a way to compare m(U) with U directly.

To make it concrete with an example, suppose U is "beauty as defined by Gordon". We don't, at least as of yet, have a way to find U directly, and maybe we never will. So supposing we don't, if we want to answer questions like "would Gordon find this beautiful?" and "what painting would Gordon most like?" we need a measurement of U we can work with, as developed by, say, using IRL to discover a "beauty function" that describes U such that we could say how beautiful I would think something is. But we would be hard pressed to be precise about how far off the beauty function is from my sense of beauty because we only have a very gross measure of the difference: compare how beautiful the beauty function and I think some finite set of things are (finite because I'm a bounded, embedded agent who is never going to get to see all things, even if the beauty function somehow could), and even as we are doing this we are still getting a measurement of my internal sense of beauty rather than my internal sense of beauty itself because we are asking me to say how beautiful I think something is rather than directly observing my sense of beauty. This is much of why I expect that Goodhart is extremely robust.
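The gap between U and a measurement of U can be illustrated with a toy simulation. This is only a sketch under loud assumptions: the quadratic `true_beauty` function, the Gaussian reporting noise, and every name below are made up for illustration, and in the situation described above `true_beauty` would never be available to check against at all.

```python
import random


def true_beauty(x):
    # Hypothetical stand-in for U. In the real problem this is not observable.
    return -(x - 0.3) ** 2


def measured_beauty(x, rng, noise=0.05):
    # Hypothetical measurement of U: the true value plus reporting noise.
    return true_beauty(x) + rng.gauss(0.0, noise)


def run(seed=0, n=200):
    rng = random.Random(seed)
    # A finite set of candidates, since a bounded agent only ever rates finitely many things.
    candidates = [rng.random() for _ in range(n)]
    measured = [measured_beauty(x, rng) for x in candidates]
    # Optimize the measurement, which is all an optimizer has access to.
    chosen = max(range(n), key=lambda i: measured[i])
    # The candidate that is actually best under U, visible only inside this toy.
    truly_best = max(range(n), key=lambda i: true_beauty(candidates[i]))
    return {
        "measured_score_of_chosen": measured[chosen],
        "true_score_of_chosen": true_beauty(candidates[chosen]),
        "true_score_of_best": true_beauty(candidates[truly_best]),
    }


result = run()
```

Selecting on the measurement can never beat selecting on U itself, so `true_score_of_chosen` is at most `true_score_of_best`, and the shortfall is exactly the quantity that cannot be computed when U is only reachable through the measurement.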

Comment by gworley on The problem/solution matrix: Calculating the probability of AI safety "on the back of an envelope" · 2019-10-22T00:51:39.085Z · score: 4 (2 votes) · LW · GW

In the spirit of the linked article by Scott, I was really hoping for your attempt at a made up answer.

Comment by gworley on Minimization of prediction error as a foundation for human values in AI alignment · 2019-10-19T22:47:18.705Z · score: 2 (1 votes) · LW · GW

As of yet, no, although this brings up an interesting point, which is that I'm looking at this stuff to find a precise grounding because I don't think we can develop a plan that will work to our satisfaction without it. I realize lots of people disagree with me here, thinking that we need the method first and the value grounding will be worked out instrumentally by the method, but I dislike this because it makes it hard to verify the method other than by observing what an AI produced by that method does, and this is a dangerous verification method due to the risk of a "treacherous" turn that isn't so much treacherous as it is the one that could have been predicted if we bothered to have a solid theory of what the method we were using really implied in terms of the thing we cared about, if we had bothered to know what the thing we cared about fundamentally was.

Also I suspect we will be able to think of our desired AI in terms of control systems and set points, because I think we can do this for everything that's "alive", although it may not be the most natural abstraction to use for its architecture.

Comment by gworley on Partial Agency · 2019-10-18T22:14:12.212Z · score: 2 (1 votes) · LW · GW

I read partial agency and myopia as a specific way the boundedness of embedded processes manifests, so it seems to me neither surprising that it exists nor surprising that there is an idealized "unbounded" form to which the bounded form may aspire but which it cannot achieve, due to limitations created by being bounded and instantiated out of physical stuff rather than mathematics.

I realize there are a lot more details to the specific case you're considering, but I wonder if you'd agree it's part of this larger, general pattern of real things being limited by embeddedness in ways that make them less than their theoretical (albeit unachievable) ideal.

Comment by gworley on Vanessa Kosoy's Shortform · 2019-10-18T18:13:34.448Z · score: 2 (1 votes) · LW · GW

This seems to me to address the meta problem of consciousness rather than the hard problem of consciousness itself, since you seem to be more offering an etiology for the existence of agents that would care about the hard problem of consciousness rather than an etiology of qualia.

Comment by gworley on Is value amendment a convergent instrumental goal? · 2019-10-18T18:03:40.517Z · score: 2 (1 votes) · LW · GW

There is an interesting addition to this, I think, which is that if a goal of the utility function is to encourage exploration then it paradoxically needs to be extremely robust against being modified while it explores and possibly modifies all other goals. I could easily imagine an agent finding some kind of mechanism to avoid local maxima (exploration) being important enough that it would lock it in so the only thing it can't not continue to do is explore well enough to not get trapped and keep looking for a global maximum.