Yeah, this works.
I'm a bit torn on where fairness should be properly placed. For example, if Alice is the only one deciding and Bob has no power to punish her at all, it seems like fairness should still come into Alice's consideration. So maybe it should be encoded into the utility function, not into the strategic behavior running on top of it. But that would mean we need to take actions that benefit random aliens while going about our daily lives, and I'm not sure we want that.
Yeah, that training took some time, but it worked. I can now write melodies and chords from imagination pretty easily. Have had this skill for a while now. It's very useful, though of course not a golden ticket.
My current challenge in music is just coming up with interesting stuff; I think this challenge isn't gonna run out anytime soon.
I think in music you quickly learn to hear and play the right notes, and then it's just never a problem anymore. The real difficulty is having something to say. In writing and visual arts I think it's also like that: you learn a basic level of skill, and then it's all about what you say.
It is fun, but there is only a certain amount I can use my hands before I get RSI on any given day.
Might be worth experimenting a bit with finding more comfortable ways to play. Lots of people (including me) can play the guitar for many hours every day with no problems. But it's hard for a teacher to tell from outside what's crampy and what isn't, you need to rely on your feelings for this.
Cool. If you go with it, I'd be super interested to know how it went, and lmk if you need any help or elaboration on the idea.
Last year I had an idea for a debate protocol which got pretty highly upvoted.
I think once you're past a certain basic level in math, it's feasible to continue learning pretty much by yourself: just download problem sets and go through them. But it's a bit lonely. Going to classes lets you meet other people who are into the same thing as you! Shared love for something is what makes communities happen; I got this idea from Jim Butcher of all places. And the piece of paper itself is also quite nice, it can come in handy unexpectedly, and getting it now is probably easier than getting it later. So on the whole I'd lean toward getting the degree.
About the philosophical stuff, I think "the world and/or my life will be over soon anyway" is kind of a nasty idea, because it makes you feel like nothing's worth doing. That's no way for a human being to be! You're not a potato! Hence it's better to act on the assumption that neither the world nor your life will be over anytime soon.
Also check out "personalized pagerank", where the rating shown to each user is "rooted" in what kind of content this user has upvoted in the past. It's a neat solution to many problems.
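For concreteness, here's a toy sketch of the idea (numpy, all names and numbers made up): the restart distribution puts its mass on items the user has upvoted, so the resulting scores are "rooted" in that user's history.

```python
import numpy as np

def personalized_pagerank(adj, restart, damping=0.85, iters=100):
    """adj[i, j] = 1 if there's an edge from node j to node i;
    restart: (n,) probability distribution over the user's upvoted items."""
    # Column-normalize the adjacency matrix into a transition matrix.
    out = adj.sum(axis=0)
    out[out == 0] = 1.0  # avoid division by zero for dangling nodes
    M = adj / out
    r = restart.copy()
    for _ in range(iters):
        # With probability `damping` follow a link, else jump back
        # to one of this user's upvoted items.
        r = damping * (M @ r) + (1 - damping) * restart
    return r

# 4 items in a chain; this user upvoted item 0, so restart mass sits there.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
restart = np.array([1.0, 0.0, 0.0, 0.0])
scores = personalized_pagerank(adj, restart)
# Items near the user's upvotes score higher than distant ones.
```

The same graph with a different restart vector gives each user their own ranking, which is the point.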
Crosstalk is definitely a problem, e-drums and pads have it too. But are you sure the tradeoff is inescapable? Imagine the tines sit on separate pads, or on the same pad but far from each other. (Or close to each other, but with deep grooves between them, so that the distance through the connecting material is large.) This thought experiment shows that damping and crosstalk can be small at the same time. So maybe you can reduce damping but not increase crosstalk, by changing the instrument's shape or materials.
Reading a book, or even watching a movie, is less stimulating than ancestral activities like hunting or fighting. So maybe stimulation by itself isn't the problem, and instead of "superstimuli" we should be worried about activities that are low effort and/or fruitless. From that perspective, reading a book can be both difficult and fruitful (depending on the book - reading Dostoevsky or Fitzgerald isn't the same as reading a generic romance or young adult novel). And creativity is both difficult and fruitful. So we shouldn't put these things on par with watching TikTok.
Maybe you could reduce the damping, so that when muting you can feel your finger stopping the vibration? It seems to me that more feedback of this kind is usually a good thing for the player. Also the vibration could give you a continuous "envelope" signal to be used later.
I think Diffractor's post shows that logical induction does hit a certain barrier, which isn't quite diagonalization, but seems to me about as troublesome:
As the trader goes through all sentences, its best-case value will be unbounded, as it buys up larger and larger piles of sentences with lower and lower prices. This behavior is forbidden by the logical induction criterion... This doesn't seem like much, but it gets extremely weird when you consider that the limit of a logical inductor, P_inf, is a constant distribution, and by this result, isn't a logical inductor! If you skip to the end and use the final, perfected probabilities of the limit, there's a trader that could rack up unboundedly high value!
For similar instruments, you've seen the array mbira, right?
For marking the ends of notes, to me the most intuitive solution would be muting with fingers, but I'm not sure how that translates to electronics.
If a student is genuinely acting in bad faith—attending a class and ruining it for their peers—then they should be removed from the class and sent to a counselor/social worker.
The number of such students is larger than you think. But the more important question is what the social worker would do with the student - what tools would be available to them. Because by default the student will just disrupt another class tomorrow and so on. There isn't any magic method to make disruptive students non-disruptive; schools would love to have access to such magic if it existed.
Sorry for a maybe-naive question. Which other behaviors X could be defeated by this technique of "find n instructions that induce X and n that don't"? Would it work for X=unfriendliness, X=hallucination, X=wrong math answers, X=math answers that are wrong in one specific way, and so on?
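To make sure I understand the technique correctly, here's a toy numpy sketch of it (everything below is invented for illustration): collect hidden activations on the n X-inducing and n non-X-inducing instructions, and take the difference of the means as the "X direction".

```python
import numpy as np

rng = np.random.default_rng(0)

d = 16  # hypothetical hidden dimension
true_direction = rng.normal(size=d)
true_direction /= np.linalg.norm(true_direction)

# Fake "activations": X-inducing instructions carry an extra component
# along the planted direction, non-X-inducing ones don't.
acts_with_x = rng.normal(size=(50, d)) + 3.0 * true_direction
acts_without_x = rng.normal(size=(50, d))

# Difference of means recovers a steering direction for X,
# which could then be subtracted at inference time to suppress X.
steer = acts_with_x.mean(axis=0) - acts_without_x.mean(axis=0)
steer /= np.linalg.norm(steer)

# On this toy data the recovered direction roughly matches the planted one.
cosine = float(steer @ true_direction)
```

Presumably the question of which X this works for comes down to whether X really corresponds to such a direction in activation space.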
Orwell is one of my personal heroes, 1984 was a transformative book to me, and I strongly recommend Homage to Catalonia as well.
That said, I'm not sure making theories of art is worth it. Even when great artists do it (Tolkien had a theory of art, and Oscar Wilde, and Flannery O'Connor, and almost every artist if you look close enough), it always seems to be the kind of theory which suits that artist and nobody else. Would advice like "good prose is like a windowpane" or "efface your own personality" improve the writing of, say, Hunter S. Thompson? Heck no, his writing is the opposite of that and charming for it! Maybe the only possible advice to an artist is to follow their talent, and advising anything more specific is as likely to hinder as help.
I think for good emotions the feel-it-completely thing happens naturally anyway.
To me it's less about thoughts and more about emotions. And not about doing it all the time, but only when I'm having some intense emotion and need to do something about it.
For example, let's say I'm angry about something. I imagine there's a knob in my mind: make the emotion stronger or weaker. (Or between feeling it less, and feeling it more.) What I usually do is turn the knob up. Try to feel the emotion more completely and in more detail, without trying to push any of it away. What usually happens next is the emotion kinda decides that it's been heard and goes away: a few minutes later I realize that whatever I was feeling is no longer as intense or urgent. Or I might even forget it entirely and find my mind thinking of something else.
It's counterintuitive but it's really how it works for me; been doing it for over a decade now. It's the closest thing to a mental cheat code that I know.
There's an amazing HN comment that I mention every time someone links to this essay. It says don't do what the essay says, you'll make yourself depressed. Instead do something a bit different, and maybe even opposite.
Let's say for example you feel annoyed by the fat checkout lady. DFW advises you to step over your annoyance, imagine the checkout lady is caring for her sick husband, and so on. But that kind of approach to your own feelings will hurt you in the long run, and maybe even seriously hurt you. Instead, the right thing is to simply feel annoyed at the checkout lady. Let the feeling come and be heard. After it's heard, it'll be gone by itself soon enough.
Here's the whole comment, to save people the click:
DFW is perfect towards the end, when he talks about acceptance and awareness— the thesis ("This is water") is spot on. But the way he approaches it, as a question of choosing what to think, is fundamentally, tragically wrong.
The Mindfulness-Based Cognitive Therapy folks call that focusing on cognition rather than experience. It's the classic fallacy of beginning meditators, who believe the secret lies in choosing what to think, or in fact choosing not to think at all. It makes rational sense as a way to approach suffering; "Thinking this way is causing me to suffer. I must change my thinking so that the suffering stops."
In fact, the fundamental tenet of mindfulness is that this is impossible. Not even the most enlightened guru on this planet can not think of an elephant. You cannot choose what to think, cannot choose what to feel, cannot choose not to suffer.
Actually, that is not completely true. You can, through training over a period of time, teach yourself to feel nothing at all. We have a special word to describe these people: depressed.
The "trick" to both Buddhist mindfulness and MBCT, and the cure for depression if such a thing exists, lies in accepting that we are as powerless over our thoughts and emotions as we are over our circumstances. My mind, the "master" DFW talks about, is part of the water. If I am angry that an SUV cut me off, I must experience anger. If I'm disgusted by the fat woman in front of me in the supermarket, I must experience disgust. When I am joyful, I must experience joy, and when I suffer, I must experience suffering. There is no other option but death or madness— the quiet madness that pervades most people's lives as they suffer day in and day out in their frantic quest to avoid suffering.
Experience. Awareness. Acceptance. Never thought— you can't be mindful by thinking about mindfulness, it's an oxymoron. You have to just feel it.
There's something indescribably heartbreaking in hearing him come so close to finding the cure, to miss it only by a hair, knowing what happens next.
[Full disclosure: My mother is a psychiatrist who dabbles in MBCT. It cured her depression, and mine.]
And another comment from a different person making the same point:
Much of what DFW believed about the world, about himself, about the nature of reality, ran counter to his own mental wellbeing and ultimately his own survival. Of the psychotherapies with proven efficacy, all seek to inculcate a mode of thinking in stark contrast to Wallace's.
In this piece and others, Wallace encourages a mindset that appears to me to actively induce alienation in the pursuit of deeper truth. I believe that to be deeply maladaptive. A large proportion of his words in this piece are spent describing that his instinctive reaction to the world around him is one of disgust and disdain.
Rather than seeking to transmute those feelings into more neutral or positive ones, he seeks to elevate himself above what he sees as his natural perspective. Rather than sit in his car and enjoy the coolness of his A/C or the feeling of the wheel against his skin or the patterns the sunlight makes on his dash, he abstracts, he retreats into his mind and an imagined world of possibilities. He describes engaging with other people, but it's inside his head, it's intellectualised and profoundly distant. Rather than seeing the person in the SUV in front as merely another human and seeking to accept them unconditionally, he seeks a fictionalised narrative that renders them palatable to him.
He may have had some sort of underlying chemical or structural problem that caused his depression, but we have no real evidence for that, we have no real evidence that such things exist. What we do know is that the patterns of cognition he advocated run contrary to the basic tenets of the treatment for depression with the best evidence base - CBT and its variants.
Wow, it's worse than I thought. Maybe the housing problem is "government-complete" and resists all lower level attempts to solve it.
What if you build your school-as-social-service, and then one day find that the kids are selling drugs to each other inside the school?
Or that the kids are constantly interfering with each other so much that the minority who want to follow their interests can't?
I think any theory of school that doesn't mention discipline is a theory of dry water. What powers and duties would the 1-supervisor-per-12-kids have? Can they remove disruptive kids from rooms? From the building entirely? Give detentions?
I sometimes had this feeling from Conway's work; in particular, combinatorial game theory and surreal numbers to me feel closer to mathematical invention than mathematical discovery. These kinds of things are also often "leaf nodes" on the tree of knowledge, not leading to many followup discoveries, so you could say their counterfactual impact is low for that reason.
In engineering, the best example I know is vulcanization of rubber. It has had a huge impact on today's world, but Goodyear developed it by working alone for decades, when nobody else was looking in that direction.
You're saying governments can't address existential risk, because they only care about what happens within their borders and term limits. And therefore we should entrust existential risk to firms, which only care about their own profit in the next quarter?!
Yeah, the trapped priors thing is pretty worrying to me too. But I'm confused about the opposing interventions thing. Do charter cities, or labor unions, rely on donations that much? Is it really so common for donations to cancel each other out? I guess advocacy donations (for example, pro-life vs pro-choice) do cancel each other out, so maybe we could all agree that advocacy isn't charity.
If the housing crisis is caused by low-density rich neighborhoods blocking redevelopment of themselves (as seems the consensus on the internet now), could it be solved by developers buying out an entire neighborhood or even town in one swoop? It'd require a ton of money, but redevelopment would bring even more money, so it could be win-win for everyone. Does it not happen only due to coordination difficulties?
I don't know about others, but to me these approaches sound like "build a bureaucracy from many well-behaved agents", and it seems to me that such a bureaucracy wouldn't necessarily behave well.
I mean, one of the participants wrote: "getting comments that engage with what I write and offer a different, interesting perspective can almost be more rewarding than money". Others asked us for feedback on their non-winning entries. It feels to me that interaction between more and less experienced folks can be really desirable and useful for both, as long as it's organized to stay within a certain "lane".
I have maybe a naive question. What information is needed to find the MSP image within the neural network? Do we have to know the HMM to begin with? Or could it be feasible someday to inspect a neural network, find something that looks like an MSP image, and infer the HMM from it?
For example, if there were certain states of the world which I wanted to avoid at all costs (and thus violate the continuity axiom), I could assign zero utility to it and use geometric averaging. I couldn’t do this with arithmetic averaging and any finite utilities.
Well, you can't have some states as "avoid at all costs" and others as "achieve at all costs", because having them in the same lottery leads to nonsense, no matter what averaging you use. And allowing only one of the two seems arbitrary. So it seems cleanest to disallow both.
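To spell out the asymmetry with toy numbers (nothing here is from the original post): under geometric averaging, any lottery that puts positive probability on a zero-goodness outcome is itself worth zero, which is exactly the "avoid at all costs" behavior; arithmetic averaging with finite utilities can't express that.

```python
import math

def arithmetic_exp(lottery):
    # lottery: list of (probability, goodness) pairs
    return sum(p * u for p, u in lottery)

def geometric_exp(lottery):
    # prod of u^p; a zero-goodness outcome with any p > 0 forces the value to 0
    return math.prod(u ** p for p, u in lottery)

# The dreaded outcome gets goodness 0.
lottery = [(0.999, 100.0), (0.001, 0.0)]
g = geometric_exp(lottery)   # any risk of the dreaded outcome is fatal
a = arithmetic_exp(lottery)  # a tiny risk barely matters
```

But note there's no symmetric way to encode "achieve at all costs" with finite goodness values under either averaging, which is the arbitrariness I mean.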
If I wanted to program a robot which sometimes preferred lotteries to any definite outcome, I wouldn’t be able to program the robot using arithmetic averaging over goodness values.
But geometric averaging wouldn't let you do that either, or am I missing something?
Sent the form.
What do you think about combining teaching and research? Similar to the Humboldt idea of the university, but it wouldn't have to be as official or large-scale.
When I was studying math in Moscow long ago, I was attending MSU by day, and in the evenings sometimes went to the "Independent University", which wasn't really a university. Just a volunteer-run and donation-funded place with some known mathematicians teaching free classes on advanced topics for anyone willing to attend. I think they liked having students to talk about their work. Then much later, when we ran the AI Alignment Prize here on LW, I also noticed that the prize by itself wasn't too important; the interactions between newcomers and old-timers were a big part of what drove the thing.
So maybe if you're starting an organization now, it could be worth thinking about this kind of generational mixing, research/teaching/seminars/whatnot. Though there isn't much of a set curriculum on AI alignment now, and teaching AI capability is maybe not the best idea :-)
Yeah, that might be a good idea in case any rich employers stumble on this :-)
In terms of goals, I like making something, having many people use it, and getting paid for it. I'm not as motivated by meaning, probably different from most EAs in that sense.
In terms of skillset, I'd say I'm a frontend-focused generalist. The most fun programming experience in my life was when I built an online map just by myself - the rendering of map data to png tiles, the serving backend, the javascript for dragging and zooming, there weren't many libraries back then - and then it got released and got hundreds of thousands of users. The second most fun was when I made the game - coming up with the idea, iterating on the mechanics, graphic design, audio programming, writing text, packaging for web and mobile, the whole thing - and it got quite popular too. So that's the prototypical good job for me.
I don't really understand your approach yet. Let's call your decision theory CLDT. You say counterfactuals in CLDT should correspond to consistent universes. For example, the counterfactual "what if a CLDT agent two-boxed in Newcomb's problem" should correspond to a consistent universe where a CLDT agent two-boxes on Newcomb's problem. Can you describe that universe in more detail?
Done! I didn't do it at first because I thought it'd have to be in person only, but then clicked around in the form and found that remote is also possible.
Besides math and programming, what are your other skills and interests?
Playing and composing music is the main one.
I have an idea for a puzzle game, not sure if it would be good or bad; I haven't done even a prototype. So if anyone is interested, feel free to try it.
Yeah, you're missing out on all the fun in game-making :-) You must build the prototype yourself, play with it yourself, tweak the mechanics, and at some moment the stars will align and something will just work and you'll know it. There's no way anyone else can do it but you.
Yeah. My point was, we can't even be sure which behavior-preserving optimizations (of the kind done by optimizing compilers, say) will preserve consciousness. It's worrying because these optimizations can happen innocuously, e.g. when your upload gets migrated to a newer CPU with fancier heuristics. And yeah, when self-modification comes into the picture, it gets even worse.
I think there's a pretty strong argument to be more wary about uploading. It's been stated a few times on LW, originally by Wei Dai if I remember right, but maybe worth restating here.
Imagine the uploading goes according to plan, the map of your neurons and connections has been copied into a computer, and simulating it leads to a person who talks, walks in a simulated world, and answers questions about their consciousness. But imagine also that the upload is being run on a computer that can apply optimizations on the fly. For example, it could watch the input-output behavior of some NN fragment, learn a smaller and faster NN fragment with the same input-output behavior, and substitute it for the original. Or it could skip executing branches that don't make a difference to behavior at a given time.
Where do we draw the line which optimizations to allow? It seems we cannot allow all behavior-preserving optimizations, because that might lead to a kind of LLM that dutifully says "I'm conscious" without actually being so. (The p-zombie argument doesn't apply here, because there is indeed a causal chain from human consciousness to an LLM saying "I'm conscious" - which goes through the LLM's training data.) But we must allow some optimizations, because today's computers already apply many optimizations, and compilers even more so. For example, skipping unused branches is pretty standard. The company doing your uploading might not even tell you about the optimizations they use, given that the result will behave just like you anyway, and the 10x speedup is profitable. The result could be a kind of apocalypse by optimization, with nobody noticing. A bit unsettling, no?
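A toy illustration of the kind of optimization I mean (all details invented): an "optimizer" watches a fragment's input-output behavior and substitutes a faster lookup table that behaves identically on the inputs it has seen. Behavior is preserved; whatever internal process produced the outputs is gone.

```python
def slow_fragment(x):
    # Stand-in for some expensive sub-network: computes 1 + 2 + ... + x.
    total = 0
    for i in range(1, x + 1):
        total += i
    return total

def learn_substitute(fragment, observed_inputs):
    # Record input-output behavior and replace the fragment with a table.
    table = {x: fragment(x) for x in observed_inputs}
    return lambda x: table[x]  # fast, input-output identical on this domain

inputs = range(100)
fast_fragment = learn_substitute(slow_fragment, inputs)
# Indistinguishable from the outside on every input it was trained on.
assert all(fast_fragment(x) == slow_fragment(x) for x in inputs)
```

Nobody inspecting only the outputs could tell the substitution happened, which is the unsettling part.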
The key point of this argument isn't just that some optimizations are dangerous, but that we have no principled way of telling which ones are. We thought we had philosophical clarity with "just upload all my neurons and connections and then run them on a computer", but that doesn't seem enough to answer questions like this. I think it needs new ideas.
Yeah, that seems to agree with my pessimistic view - that we are selfish animals, except we have culture, and some cultures accidentally contain altruism. So the answer to your question "are humans fundamentally good or evil?" is "humans are fundamentally evil, and only accidentally sometimes good".
I don't think altruism is evolutionarily connected to power as you describe. Caesar didn't come to power by being better at altruism, but by being better at coordinating violence. For a more general example, the Greek and other myths don't give many examples of compassion (though they give many other human values); it seems the modern form of compassion only appeared with Jesus, which is too recent for any evolutionary explanation.
So it's possible that the little we got of altruism and other nice things are merely lucky memes. Not even a necessary adaptation, but more like a cultural peacock's tail, which appeared randomly and might fix itself or not. While our fundamental nature remains that of other living creatures, who eat each other without caring much.
Guilty as charged - I did read your post as arguing in favor of geometric averaging, when it really wasn't. Sorry.
The main point still seems strange to me, though. Suppose you were programming a robot to act on my behalf, and you asked me to write out some goodness values for outcomes, to program them into the robot. Then before writing out the goodnesses I'd be sure to ask you: which method would the robot use for evaluating lotteries over outcomes? Depending on that, the goodness values I'd write for you (to achieve the desired behavior from the robot) would be very different.
To me it suggests that the goodness values and the averaging method are not truly independent degrees of freedom. So it's simpler to nail down the averaging method, to use ordinary arithmetic averaging, and then assign the goodness values. We don't lose any ability to describe behavior (as long as it's consistent), and we remain with only the degree of freedom that actually matters.
That makes me even more confused. Are you arguing that we ought to (1) assign some "goodness" values to outcomes, and then (2) maximize the geometric expectation of "goodness" resulting from our actions? But then wouldn't any argument for (2) depend on the details of how (1) is done? For example, if "goodnesses" were logarithmic in the first place, then wouldn't you want to use arithmetic averaging? Is there some description of how we should assign goodnesses in (1) without a kind of firm ground that VNM gives?
This seems misguided.
The normal VNM approach is to start with an agent whose behavior satisfies some common sense conditions: can't be money pumped and so on. From that we can prove that the agent behaves as if maximizing the expectation of some function on outcomes, which we call the "utility function". That function is not unique, you can apply an affine transform and obtain another utility function describing the same behavior. The behavior is what's real; utility functions are merely our descriptions of it.
From that perspective, it makes no sense to talk about "maximizing the geometric expectation of utility". Utility is, by definition, the function whose (ordinary, not geometric) expectation is maximized by your behavior. That's the whole reason for introducing the concept of utility.
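The affine-transform point is easy to check with toy numbers (nothing here is specific to any real agent): a utility function and any positive affine transform of it rank all lotteries identically, so they describe the same behavior.

```python
def expected(u, lottery):
    # lottery: list of (probability, outcome) pairs; u maps outcomes to utility
    return sum(p * u[outcome] for p, outcome in lottery)

u1 = {"apple": 0.0, "banana": 1.0, "cherry": 5.0}
u2 = {o: 2 * v + 3 for o, v in u1.items()}  # positive affine transform of u1

lottery_a = [(0.5, "apple"), (0.5, "cherry")]
lottery_b = [(1.0, "banana")]

# Both utility functions prefer lottery_a over lottery_b: same behavior,
# two equally valid descriptions of it.
same_ranking = (expected(u1, lottery_a) > expected(u1, lottery_b)) == \
               (expected(u2, lottery_a) > expected(u2, lottery_b))
```

Applying a nonlinear transform (like a logarithm) would break this equivalence, which is why "geometric expectation of utility" stops being about the same behavior.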
The mistake is a bit similar to how people talk about "caring about other people's utility, not just your own". You cannot care about other people's utility at the expense of your own, it's a misuse of terms. If your behavior is consistent, then the function that describes it is called "your utility".
I thought employers (and more generally the elite, who are net buyers of labor) would be happy with a remote work revolution. But they don't seem to be, hence my confusion.
Your post mentions what seems to me the biggest economic mystery of all: why didn't outsourcing, offshoring and remote work take over everything? Why do 1st world countries keep having any non-service jobs at all? Why does Silicon Valley keep hiring programmers who live in Silicon Valley, instead of equally capable and much cheaper programmers available remotely? There are no laws against that, so is it just inertia? Would slightly better remote work tech lead to a complete overturn of the world labor market?
This seems like good news about alignment.
To me it sounds like alignment will do a good job of aligning AIs to money. Which might be ok in the short run, but bad in the longer run.
Sure, but there's an important economic subtlety here: to the extent that work is goal-aligned, it doesn't need to be paid. You could do it independently, or as partners, or something. Whereas every hour worked doing the employer's bidding, and every dollar paid for it, must be due to goals that aren't aligned or are differently weighted (for example, because the worker cares comparatively more about feeding their family). So it makes more sense to me to view every employment relationship, to the extent it exists, as transactional: the employer wants one thing, the worker another, and they exchange labor for money. I think it's a simpler and more grounded way to think about work, at least when you're a worker.
I think all AI research makes AGI easier, so "non-AGI AI research" might not be a thing. And even if I'm wrong about that, it also seems to me that most harms of AGI could come from tool AI + humans just as well. So I'm not sure the question is right. Tbh I'd just stop most AI work.
Interesting, your comment follows the frame of the OP, rather than the economic frame that I proposed. In the economic frame, it almost doesn't matter whether you ban sexual relations at work or not. If the labor market is a seller's market, workers will just leave bad employers and flock to better ones, and the problem will solve itself. And if the labor market is a buyer's market, employers will find a way to extract X value from workers, either by extorting sex or by other ways - you're never going to plug all the loopholes. The buyer's market vs seller's market distinction is all that matters, and all that's worth changing. The great success of the union movement was because it actually shifted one side of the market, forcing the other side to shift as well.