tahp

Posts

Example of GPU-accelerated scientific computing with PyTorch 2025-01-01T23:01:04.606Z

Everything you care about is in the map 2024-12-17T14:05:36.824Z

Experiments are in the territory, results are in the map 2024-12-06T15:44:50.412Z

Tahp's Shortform 2024-11-14T18:04:42.536Z

Thoughts after the Wolfram and Yudkowsky discussion 2024-11-14T01:43:12.920Z

What does "the universe is quantum" actually mean? 2024-07-22T11:52:49.479Z

Metastrategy get-started guide 2024-06-25T15:04:11.542Z

What is space? What is time? 2024-06-07T22:15:55.951Z

Some perspectives on the discipline of Physics 2024-05-20T18:19:22.429Z

Comments

Comment by Tahp on Renormalization Redux: QFT Techniques for AI Interpretability · 2025-01-20T01:58:52.020Z · LW · GW

I consider the lattice to be a regulator as well, but, semantics aside, thank you for the example.

Comment by Tahp on Renormalization Redux: QFT Techniques for AI Interpretability · 2025-01-19T22:04:14.670Z · LW · GW

Field theorist here. You talk about renormalization as a thing which can smooth over unimportant noise, which basically matches my understanding, but you haven't explicitly named your regulator. A regulator may be a useful concept to have in interpretability, but I have no idea if it is common in the literature.

In QFT, our issue is that we go to calculate things that are measurable and finite, but we calculate horrible infinities. Obviously those horrible infinities don't match reality, and they often seem to be coming from some particular thing we don't care about that much in our theory, so we find a way to poke it out of the theory. (To be clear, this means that our theories are wrong, and we're going to modify them until they work.) The tool by which you remove irrelevant things which cause divergences is called a regulator. A typical regulator is a momentum cutoff. You go to do the integral over all real momenta which your Feynman diagram demands, and you find that it's infinite, but if you only integrate the momenta up to a certain value, the integral is finite. Of course, now you have a bunch of weird constants sitting around which depend on the value of the cutoff. This is where renormalization comes in. You notice that there are a bunch of parameters, which are generally coupling constants, and these parameters have unknown values which you have to go out into the universe and measure. If you cleverly redefine those constants to be some "bare constant" added to a "correction" which depends on the cutoff, you can do your cutoff integral and set the "correction" to be equal to whatever it needs to be to get rid of all the terms which depend on your cutoff. (edit for clarity: This is the thing that I refer to when I say "renormalization." Cleverly redefining bare parameters to get rid of unphysical effects of a regulator.) By this two-step dance, you have taken your theoretical uncertainty about what happens at high momenta and found a way to wrap it up in the values of your coupling constants, which are the free parameters which you go and measure in the universe anyway. Of course, now your coupling constants are different if you choose a different regulator or a different renormalization scheme to remove it, but physicists have gotten used to that.
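To make the two-step dance concrete, here is a minimal toy sketch in Python. It is not a real QFT calculation: the integral, the function names, and the numbers are all assumptions chosen so that the closed form is easy to check. The "loop integral" is int_0^cutoff k dk / (k^2 + m^2), which grows like the log of the cutoff, and the "coupling" is a bare constant added to that integral. Fixing the bare constant from one measurement at a reference mass makes the prediction at a different mass stop caring about the cutoff.

```python
import math

def cutoff_integral(cutoff, mass):
    """Closed form of the toy integral int_0^cutoff k dk / (k^2 + mass^2).

    It diverges logarithmically as the cutoff is taken to infinity.
    """
    return 0.5 * math.log(1.0 + (cutoff / mass) ** 2)

def toy_amplitude(g_bare, cutoff, mass):
    """Hypothetical 'prediction': a bare coupling plus the regulated loop term."""
    return g_bare + cutoff_integral(cutoff, mass)

def renormalized_amplitude(g_measured, ref_mass, mass, cutoff):
    """Choose the bare coupling so the amplitude equals g_measured at ref_mass,
    then predict the amplitude at a different mass using the same cutoff."""
    g_bare = g_measured - cutoff_integral(cutoff, ref_mass)
    return toy_amplitude(g_bare, cutoff, mass)

if __name__ == "__main__":
    for cutoff in (1e2, 1e4, 1e6):
        raw = cutoff_integral(cutoff, 1.0)
        fixed = renormalized_amplitude(0.3, ref_mass=1.0, mass=2.0, cutoff=cutoff)
        print(f"cutoff={cutoff:.0e}  raw integral={raw:.4f}  renormalized={fixed:.6f}")
```

The raw integral keeps growing with the cutoff, while the renormalized prediction settles toward g_measured - ln(2) in this toy, because all of the cutoff dependence has been pushed into the value of the bare constant.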

So you can't just renormalize, you need to define a regulator first. You can even justify your regulator. It is a typical justification for a momentum cutoff that you're using a perturbative theory which is only valid at low energy scales. So what's the regulator for AI interpretability? Why are you justified in regulating in this way? It seems like you might be pointing at regulators when you talk about 1/w and d/w, but you might also be talking about orders in a perturbation expansion, which is a different thing entirely.

Comment by Tahp on Tahp's Shortform · 2025-01-04T16:14:36.656Z · LW · GW

A decision-theoretic case for a land value tax.

You can basically only take income tax by threatening people. "Give me 40% of your earnings or I put you in prison." It is the nicest type of threatening! Stable governments have a stellar reputation for only doing it once per year and otherwise not escalating the extortion. You gain benefit from the stable civilization supported by such stable governments because they use your taxes to pay for it. But there's no reason for the government to put you in prison except for the fact that they expect you to give them money not to. By participating, you are showing that you will respond to threats, which is an incentive to extract more wealth from you. If enough people understood decision theory and were dissatisfied by the uses the government put their money to, they could refuse to pay and the prison system wouldn't be big enough to deal with it. Oops, it's time to overthrow the government.

Under a better land value tax, the consequence for not paying your taxes is that the government takes the land away and gives it to someone else. They aren't threatening you, they're just reassigning their commitment to protect the interests of the person who uses the land over to a user who will pay them for the service. Of course, people can still all refuse to do it if they don't like the uses to which government puts their money, and from the point of view of the person paying taxes, it's still pretty much a case of "pay up or something bad will happen to you," so some would argue that the difference is mostly academic. That said, I really prefer to have a government which does not have "devise ways to make people miserable for the purpose of making them miserable" (you know, prison as a threat) as a load-bearing element of its mechanisms of perpetuating itself.

This argument flagrantly stolen from planecrash: https://www.projectlawful.com/replies/1721794#reply-1721794 Of course planecrash also offers an argument for what gives a hypothetical government the right to claim ownership for the land: https://www.projectlawful.com/replies/1773744#reply-1773744 I was inspired to write this by Richard Ngo's definition of unconditional love at https://x.com/richardmcngo/status/1872107000479568321 and the context of that post.

Comment by Tahp on The Intelligence Curse · 2025-01-03T21:16:32.910Z · LW · GW

I think your point has some merit in the world where AI is useful and intelligent enough to overcome the sticky social pressure to employ humans but hasn't killed us all yet. That said, I think AI will most likely kill us all in that 1-5 year window after becoming cheaper, faster, and more reliable than humans at most economic activity, and I think you have to convince me that I'm wrong about that before I start worrying about humans not hiring me because AI is smarter than I am. However, I want to complain about this particular point you made because I don't think it's literally true:

Powerful actors don’t care about you out of the goodness of their heart.

One of the reasons why AI alignment is harder than people think is that they say stuff like this and think AI doesn't care about people in the way that powerful actors don't care about people. This is generally not true. You cannot in general pay a legislator $400 to kill a person who pays no taxes and doesn't vote. That is impressive when you think about it. You can argue that they fear reputational damage or going to prison, but I truly think that if you took away the consequences, $400 would not be enough money to make most legislators overcome their distaste for killing another human being with their bare hands. Some of them really truly want to make society better, even if they aren't very effective at it. Call it noblesse oblige if you want, but it's in their utility function to do things which aren't just give the state more money or gain more personal power. The people who steer large organizations have goodness in their hearts, however little, and thus the organizations they steer do too, even if only a little. Moloch hasn't won yet. America the state is willing to let a lot of elderly people rot, but America wasn't in fact willing to let Covid rip, even though that might have stopped the collapse of many tax-generating businesses, and most people who generate taxes would have survived. I don't think that's because the elderly people who overwhelmingly would have been killed by that are an important voting constituency for the party which pushed hardest for lockdown.

AI which knows it won't get caught and literally only cares about tax revenue and power will absolutely kill anyone who isn't useful to it for $400. That's $399 worth of power it didn't have before if killing someone costs $1 of attention. I don't particularly want to live in a world where 1% of people are very wealthy and everyone else is dying of poverty because they've been replaced by AI, but that's a better world than the one I expect where literally every human is killed because, for example, those so-called "reliable" AIs doing all of the work humans used to do as of yesterday liked paperclips more than we thought and start making them today.

Comment by Tahp on Lucius Bushnaq's Shortform · 2025-01-02T12:47:41.508Z · LW · GW

Thank you. As a physicist, I wish I had an easy way to find papers which say "I tried this kind of obvious thing you might be considering and nothing interesting happened."

Comment by Tahp on The Field of AI Alignment: A Postmortem, and What To Do About It · 2024-12-29T17:44:04.531Z · LW · GW

My current job is only offered to me on the condition that I am doing physics research. I have some flexibility to do other things at the same time though. The insights and resources you list seem useful to me, so thank you.

Comment by Tahp on The Field of AI Alignment: A Postmortem, and What To Do About It · 2024-12-27T01:24:22.759Z · LW · GW

I am a physics PhD student. I study field theory. I have a list of projects I've thrown myself at with inadequate technical background (to start with) and figured out. I've convinced a bunch of people at a research institute that they should keep giving me money to solve physics problems. I've been following LessWrong with interest for years. I think that AI is going to kill us all, and would prefer to live for longer if I can pull it off. So what do I do to see if I have anything to contribute to alignment research? Maybe I'm flattering myself here, but I sound like I might be a person of interest for people who care about the pipeline. I don't feel like a great candidate because I don't have any concrete ideas for AI research topics to chase down, but it sure seems like I might start having ideas if I worked on the problem with somebody for a bit. I'm apparently very ok with being an underpaid gofer to someone with grand theoretical ambitions while I learn the material necessary to come up with my own ideas. My only lead to go on is "go look for something interesting in MATS and apply to it" but that sounds like a great way to end up doing streetlight research because I don't understand the field. Ideally, I guess I would have whatever spark makes people dive into technical research in a pretty low-status field for no money for long enough to produce good enough research which convinces people to pay their rent while they keep doing more, but apparently the field can't find enough of those that it's unwilling to look for other options.

I know what to do to keep doing physics research. My TA assignment effectively means that I have a part-time job teaching teenagers how to use Newton's laws so I can spend twenty or thirty hours a week coding up quark models. I did well on a bunch of exams to convince an institution that I am capable of the technical work required to do research (and, to be fair, I provide them with 15 hours per week of below-market-rate intellectual labor which they can leverage into tuition that more than pays my salary), so now I have a lot of flexibility to just drift around learning about physics I find interesting while they pay my rent. If someone else is willing to throw 30,000 dollars per year at me to think deeply about AI and get nowhere instead of thinking deeply about field theory to get nowhere, I am not aware of them. Obviously the incentives are perverse to just go around throwing money at people who might be good at AI research, so I'm not surprised that I've only found one potential money spigot for AI research, but I had so many to choose from for physics.

Comment by Tahp on Everything you care about is in the map · 2024-12-20T13:37:35.781Z · LW · GW

The simulation is not reality, so it can have hidden variables, it just can't simulate in-system observers knowing about the hidden variables. I think quantum mechanics experiments should still have the same observed results within the system as long as you use the right probability distributions over on-site interactions. You could track Everett branches if you want to have many possible worlds, but the idea is just to get one plausible world, so it's not relevant to the thought experiment.

The point is that I have every reason to believe that a single-level ruleset could produce a map which all of our other maps could align with to the same degree as the actual territory. I agree that my approach is reductionist. I'm not ready to comment on LDSL.

Comment by Tahp on Everything you care about is in the map · 2024-12-18T14:07:13.694Z · LW · GW

From the inside, it feels like I want to know what's going on as a terminal value. I have often compared my desire to study physics to my desire to understand how computers work. I was never satisfied by the "it's just ones and zeros" explanation, which is not incorrect, but also doesn't help me understand why this object is able to turn code into programs. I needed to have examples of how you can build logic gates into adders and so on and have the tiers of abstraction that go from adders, etc. to CPU instructions to compilers to applications, and I had a nagging confusion about using computers for years until I understood that chain at least a little bit. There is a satisfaction which comes with the dissolution of that nagging confusion which I refer to as joy.

There's a lot to complain about when it comes to public education in the United States, but I at least felt like I got a good set of abstractions with which to explain my existence, which was a chain that went roughly Newtonian mechanics on top of organs on top of cells on top of proteins on top of DNA on top of chemistry on top of electromagnetism and quantum mechanics, the latter of which wasn't explained at all. I studied physics in college, and the only things I got out of it were a new toolset and an intuitive understanding for how magnets work. In graduate school, I actually completed the chain of atoms on top of standard model on top of field theory on top of quantum mechanics in a way that felt satisfying. Now I have a few hanging threads, which include that I understand how matter is built out of fields on top of spacetime, but I don't understand what spacetime actually is, and also the universe is full of dark matter which I don't have an explanation for.

Comment by Tahp on Everything you care about is in the map · 2024-12-18T00:03:52.481Z · LW · GW

I'm putting in my reaction to your original comment as I remember it in case it provides useful data for you. Please do not search for subtext or take this as a request for any sort of response; I'm just giving data at the risk of oversharing because I wonder if my reaction is at all indicative of the people downvoting.

I thought about downvoting because your comment seemed mean-spirited. I think the copypasta format and possibly the flippant use of an LLM made me defensive. I mostly decided I was mistaken about it being mean spirited because I don't think that you would post a mean comment on a post like this based on my limited-but-nonzero interaction with you. At that point, I either couldn't see what mixing epistemology in with the pale blue dot speech added to the discussion, or it didn't resonate with me, so I stopped thinking about it and left the comment alone.

Comment by Tahp on Everything you care about is in the map · 2024-12-17T19:34:27.252Z · LW · GW

I think I see what you're saying, let me try to restate it:

If the result you are predicting is coarse-grained enough, then there exist models which give a single prediction with probability so close to one that you might as well just take the model as truth.

Comment by Tahp on Everything you care about is in the map · 2024-12-17T19:29:13.106Z · LW · GW

I appreciate your link to your posts on Linear Diffusion of Sparse Lognormals. I'll take a look later. My responses to your other points are essentially reductionist arguments, so I suspect that's a crux.

That said, I'm using "quantum mechanics" to mean "some generalization of the standard model" in many places. In practice, the actual experimental predictions of the standard model are something like probability distributions over the starting and ending momentum states of particles before and after they interact at the same place at the same time, so I don't think you can actually run a raw standard model simulation of the solar system which makes sense at all. To make my argument more explicit, I think you could run a lattice simulation of the solar system far above the Planck scale and full of classical particles (with proper masses and proper charges under the standard model) which all interact via general relativity, so at each time slice you move each particle to a new lattice site based on its classical momentum and the gravitational field in the previous time slice. Then you run the standard model at each lattice site which has more than one particle on it to destroy all of the input particles and generate a new set of particles according to the probabilistic predictions of the standard model, and the identities and momenta of the output particles according to a sample of that probability distribution will be applied in the next time slice. I might be making an obvious particle physics mistake, but modulo my own carelessness, almost all lattice sites would have nothing on them, many would have photons, some would have three quarks, fewer would have an electron on them, and some tiny, tiny fraction would have anything else. 
If you interpreted sets of sites containing the right number of up and down quarks as nucleons, interpreted those nucleons as atoms, used nearby electrons to recognize molecules, interpreted those molecules as objects or substances doing whatever they do in higher levels of abstraction, and sort of ignored anything else until it reached a stable state, then I think you would get a familiar world out of it if you had the utterly unobtainable computing power to do so.
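The update loop described above can be sketched in a few lines of Python. This is only a cartoon under loud assumptions: the lattice is one-dimensional and periodic, gravity is left out entirely, "particles" are just integer momenta, and `sample_interaction` is a hypothetical stand-in for "run the standard model at this site" which merely conserves total momentum. None of the names come from a real physics library.

```python
import random
from collections import defaultdict

def sample_interaction(incoming, rng):
    """Stand-in for the on-site standard model step (a made-up toy rule).

    Destroys the incoming particles and emits two new ones whose momenta are
    sampled at random but sum to the total incoming momentum.
    """
    total = sum(incoming)
    first = rng.randint(-2, 2)
    return [first, total - first]

def step(particles, size, rng):
    """Advance one time slice; `particles` is a list of (site, momentum) pairs."""
    # 1. Classical propagation: each particle moves by its momentum.
    occupancy = defaultdict(list)
    for site, momentum in particles:
        occupancy[(site + momentum) % size].append(momentum)
    # 2. Sites holding more than one particle get a sampled interaction outcome.
    new_particles = []
    for site, momenta in occupancy.items():
        if len(momenta) > 1:
            momenta = sample_interaction(momenta, rng)
        new_particles.extend((site, p) for p in momenta)
    return new_particles

if __name__ == "__main__":
    rng = random.Random(0)
    size = 50
    particles = [(rng.randrange(size), rng.randint(-2, 2)) for _ in range(20)]
    for _ in range(100):
        particles = step(particles, size, rng)
    print(len(particles), "particles, total momentum", sum(p for _, p in particles))
```

The point of the sketch is only the structure: propagate classically, then resample wherever particles coincide, keeping one sampled outcome per interaction rather than tracking every branch.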

Comment by Tahp on Everything you care about is in the map · 2024-12-17T18:02:21.421Z · LW · GW

In what way? I find myself disagreeing vehemently, so I would appreciate an example.

Maps are territory in the sense that the territory is the substrate on which minds with maps run, but one of my main points here is that our experience is all map, and I don't think any human has ever had a map which remotely resembles the substrate on which we all run.

Comment by Tahp on Everything you care about is in the map · 2024-12-17T17:52:35.269Z · LW · GW

This is tangential to what I'm saying, but it points at something that inspired me to write this post. Eliezer Yudkowsky says things like the universe is just quarks, and people say "ah, but this one detail of the quark model is wrong/incomplete" as if it changes his argument when it doesn't. His point, so far as I understand it, is that the universe runs on a single layer somewhere, and higher-level abstractions are useful to the extent that they reflect reality. Maybe you change your theories later so that you need to replace all of his "quark" and "quantum mechanics" words with something else, but the point still stands about the relationship between higher-level abstractions and reality.

I'm not sure I understand your objection, but I will write a response that addresses it. I suspect we are in agreement about many things. The point of my quantum mechanics model is not to model the world, it is to model the rules of reality which the world runs on. Quantum mechanics isn't computationally intractable, but modeling quantum mechanical systems at large scales is. That is a statement about the amount of compute we have, not about quantum mechanics. We have every reason to believe that if we simulated a spacetime background which ran on general relativity and threw a bunch of quarks and electrons into it which run on the standard model and start in a (somehow) known state of the Earth, Moon, and Sun, then we would end up with a simulation which gives a plausible world-line for Earth. The history would diverge from reality due to things we left out (some things rely on navigation by starlight, cosmic rays from beyond the solar system cause bit flips which affect history, asteroid collisions have notable effects on Earth, gravitational effects from other planets probably have some effect on the ocean, etc.) and we would have to either run every Everett branch or constantly keep only one of them at random and accept slight divergences due to that. In spite of that, the simulation should produce a totally plausible Earth, although people would wonder where all the stars went. There do not exist enough atoms on Earth to build a computer which could actually simulate that, but that isn't a weakness in the ability of the model to explain the base-level of reality.

Comment by Tahp on Is AI alignment a purely functional property? · 2024-12-15T23:48:54.325Z · LW · GW

It may be that generating horrible counterfactual lines of thought for the purpose of rejecting them is necessary for getting better outcomes. To the extent that you have a real dichotomy here, I would say that the input/output mapping is the thing that matters. I want all humans to not end up worse off for inventing AI.

That said, humans may end up worse off by our own metrics if we make AI that is itself suffering terribly based off of its internal computation or it is generating ancestor torture simulations or something. Technically that is an alignment issue, although I worry that most humans won't care if the AI is suffering if they don't have to look at it suffer and it generates outputs that humans like aside from that hidden detail.

Comment by Tahp on Write Good Enough Code, Quickly · 2024-12-15T15:51:39.338Z · LW · GW

I'm doing a physics PhD, and you're making me feel better about my coding practices. I appreciate your explicit example as well, as I'm interested in trying my hand at ML research and curious about what it looks like in terms of toolsets and typical sort-of-thing-one-works-on. I want to chime in down here in the comments to assure people that at least one horrible coder in a field which has nothing to do with machine learning (most of the time) thinks that the sentiment of this post is true. I admit that I'm biased by having very little formal CS training, so proper functional programming is more difficult for me than writing whatever has worked for me in the past writing ad-hoc Bash scripts. My sister is a professional software developer, and she winces horribly at my code. However, you point out that it is often the case that any particular piece of research code you are running has a particular linear set of tasks to achieve, and so:

You don't need to worry much about resilient code which handles weird edge cases.
It is often better to have everything in one place where you can see it than to have a bunch of broken up functions scattered across a folder full of files.
Nobody else will need to use the code later, including yourself, so legibility is less important.

As an example of the Good/Good-enough divide, here's a project I'm working on. I'm doing something which requires speed, so I'm using C++ code built on top of old code someone else wrote. I'm extremely happy that the previous researcher did not follow your advice, at least when they cleaned up the code for publishing, because it makes life easier for me to have most of the mechanics of my code hidden away out of view. Their code defines a bunch of custom types which rather intuitively match certain physical objects. They wrote a function which parses arg files so that you don't need to recompile the code to rerun a calculation with different physical parameters. Then there's my code which uses all of that machinery: My main function that I have written is sort of obviously a nest of loops over discrete tasks which could easily be separate functions, but I just throw them all together into one file, and I rewrite the whole file for different research questions so I have a pile of "main" files which reuse a ton of structure. As an example of a really ugly thing I did, I hard-code indices corresponding to momenta I want to study into the front of my program instead of making a function which parses momenta and providing an argument file listing the sets I want. I might have done that for the sake of prettiness, but I needed to provide a structure which lets me easily find momenta of opposite parity. Hard-coding the momenta let me keep the structure I was using at front of mind when I created the four other subtasks in the code which exploited that structure to let me construct subtasks which needed to easily find objects of opposite parity.
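For readers who want a picture of the hard-coded-momenta trick, here is a hypothetical sketch. It is written in Python rather than the C++ of the actual project, and the momentum values, the pairing convention, and every name in it are invented for illustration. The idea is that the list layout itself encodes which entries are parity partners, so later subtasks can find the opposite-parity momentum with index arithmetic instead of a search.

```python
# Hypothetical hard-coded momentum list in integer lattice units, laid out so
# that entry 2*i + 1 is the parity partner (p -> -p) of entry 2*i.
MOMENTA = [
    (0, 0, 1), (0, 0, -1),
    (0, 1, 1), (0, -1, -1),
    (1, 1, 1), (-1, -1, -1),
]

def parity_partner(index):
    """Index of the opposite-parity momentum: flip the lowest bit of the index."""
    return index ^ 1

def check_layout(momenta):
    """Fail loudly if the list breaks the pairing convention the code relies on."""
    for i, p in enumerate(momenta):
        if tuple(-c for c in p) != momenta[parity_partner(i)]:
            raise ValueError(f"entry {i} is not paired with its parity partner")

if __name__ == "__main__":
    check_layout(MOMENTA)
    for i in range(0, len(MOMENTA), 2):
        print(MOMENTA[i], "<->", MOMENTA[parity_partner(i)])
```

A parsed argument file could provide the same list, but then the pairing convention would have to be enforced by the parser. Hard-coding keeps the convention visible at the top of the file, which is the "good enough" trade described above.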

Comment by Tahp on Doing Research Part-Time is Great · 2024-11-23T03:07:03.221Z · LW · GW

I'd agree with you, because I'm a full-time student, but I'm doing research part-time in practice because I'm losing half my time to working as a TA to pay my rent. Part of me wonders if I could find a real job and slow-roll the PhD.

Comment by Tahp on Thoughts after the Wolfram and Yudkowsky discussion · 2024-11-15T15:16:37.305Z · LW · GW

I think we're both saying the same thing here, except that the thing I'm saying implies that I would bet for Eliezer being pessimistic about this. My point was that I have a lot of pessimism that people would code something wrong even if we knew what we were trying to code, and this is where a lot of my doom comes from. Beyond that, I think we don't know what it is we're trying to code up, and you give some evidence for that. I'm not saying that if we knew how to make good AI, it would still fail if we coded it perfectly. I'm saying we don't know how to make good AI (even though we could in principle figure it out), and also current industry standards for coding things would not get it right the first time even if we knew what we were trying to build. I feel like I basically understand the second thing, but I don't have any gears-level understanding for why it's hard to encode human desires beyond a bunch of intuitions from monkey's-paw things that go wrong if you try to come up with creative disastrous ways to accomplish what seem like laudable goals.

I don't think Eliezer is a DOOM rock, although I think a DOOM rock would be about as useful as Eliezer in practice right now because everyone making capability progress has doomed alignment strategies. My model of Eliezer's doom argument for the current timeline is approximately "programming smart stuff that does anything useful is dangerous, we don't know how to specify smart stuff that avoids that danger, and even if we did we seem to be content to train black-box algorithms until they look smarter without checking what they do before we run them." I don't understand one of the steps in that funnel of doom as well as I would like. I think that in a world where people weren't doing the obvious doomed thing of making black-box algorithms which are smart, he would instead have a last step in the funnel of "even if we knew what we need a safe algorithm to do we don't know how to write programs that do exactly what we want in unexpected situations," because that is my obvious conclusion from looking at the software landscape.

Comment by Tahp on Thoughts after the Wolfram and Yudkowsky discussion · 2024-11-14T23:41:18.244Z · LW · GW

I might as well check out the panel discussion. I didn't know about it.

I think I listened to the Hotz debate. The highlight of that one was when Hotz implied that he was using an LLM to drive a car, Yudkowsky freaks out a bit, and Hotz clarifies that he means the architecture for his learning algorithm is basically the same as an LLM.

I suspect the Destiny discussion is qualitatively similar to the Dwarkesh one.

At this point, maybe I should just read old MIRI papers.

Comment by Tahp on Tahp's Shortform · 2024-11-14T18:27:39.036Z · LW · GW

I think that our laws of physics are in part a product of our perception, but I need to clarify what I mean by that. I doubt space or time are fundamental pieces in whatever machine code runs our universe, but that doesn't mean that you can take perception-altering drugs and travel through time. I think that somehow the fact that human intelligence was built on the evolutionary platform of DNA means that any physics we come up with has to build up to atoms which have the chemical properties that make DNA work. Physics doesn't have to describe everything, it just needs to describe the things relevant to DNA, which is in fact a lot! DNA can code the construction of things which react to electromagnetic fields correlated with all sorts of physical processes.

This leads me to the question of what would it look like to see an alien that runs on different physics on the same universe platform through our physics. As an example which I haven't thought through rigorously, you can formulate non-relativistic quantum mechanics with momentum and position operators, but you move back and forth between them with Fourier transforms which only differ by a sign flip. You could make a self-consistent physics by just exchanging all of the momentum and position operators with each other. Maybe you could end up with localized atoms which are near each other and interacting in momentum space but diffuse nonsense in our native position space. If you build life in that universe, maybe it doesn't have localized structure in ours, and maybe it just acts like diffuse energy or something to us.
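The position/momentum duality mentioned above can be illustrated numerically. The sketch below is a toy under stated assumptions: a 64-point periodic grid stands in for space, a hand-written discrete Fourier transform stands in for the continuum one, and all function names are made up for this example. It shows two things: the forward and inverse transforms are the same code with the sign in the exponent flipped, and a state that is tightly localized in one representation is spread out in the other.

```python
import cmath
import math

def dft(values, sign):
    """Unitary discrete Fourier transform with exponent sign `sign` (-1 or +1)."""
    n = len(values)
    norm = math.sqrt(n)
    return [
        sum(v * cmath.exp(sign * 2j * math.pi * j * k / n) for j, v in enumerate(values)) / norm
        for k in range(n)
    ]

def centred(index, n):
    """Map a periodic grid index to a coordinate in [-n/2, n/2)."""
    return index if index < n // 2 else index - n

def spread(amplitudes):
    """Standard deviation of the grid coordinate under the weights |amplitude|^2."""
    n = len(amplitudes)
    weights = [abs(a) ** 2 for a in amplitudes]
    total = sum(weights)
    mean = sum(w * centred(i, n) for i, w in enumerate(weights)) / total
    second = sum(w * centred(i, n) ** 2 for i, w in enumerate(weights)) / total
    return math.sqrt(second - mean ** 2)

def gaussian(n, width):
    """Wave packet localized around coordinate 0 on an n-point periodic grid."""
    return [math.exp(-centred(i, n) ** 2 / (2 * width ** 2)) for i in range(n)]

if __name__ == "__main__":
    psi_x = gaussian(64, 1.5)
    psi_p = dft(psi_x, -1)   # position -> momentum
    back = dft(psi_p, +1)    # momentum -> position: same code, flipped sign
    print("spread in position:", spread(psi_x))
    print("spread in momentum:", spread(psi_p))
    print("round-trip error:", max(abs(a - b) for a, b in zip(psi_x, back)))
```

Swapping which array you call "position" and which you call "momentum" changes nothing in the code except the sign passed to `dft`, which is the symmetry the thought experiment leans on.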

Comment by Tahp on Tahp's Shortform · 2024-11-14T18:04:42.647Z · LW · GW

This is unoriginal, but any argument that smart AI is dangerous by default is also an argument that aliens are dangerous by default. If you want to trade with aliens, you should preemptively make it hard enough to steal all of your stuff so that gains from trade are worthwhile even if you meet aliens that don't abstractly care about other sentient beings.

Comment by Tahp on Thoughts after the Wolfram and Yudkowsky discussion · 2024-11-14T17:56:48.362Z · LW · GW

I don't think you're being creative enough about solving the problem cheaply, but I also don't think this particular detail is relevant to my main point. Now you've made me think more about the problem, here's me making a few more steps toward trying to resolve my confusion:

The idea with instrumental convergence is that smart things with goals predictably go hard with things like gathering resources and increasing odds of survival before the goal is complete which are relevant to any goal. As a directionally-correct example for why this could be lethal, humans are smart enough to do gain-of-function research on viruses and design algorithms that predict protein folding. I see no reason to think something smarter could not (with some in-lab experimentation) design a virus that kills all humans simultaneously at a predetermined time, and if you can do that without affecting any of your other goals more than you think humans might interfere with your goals, then sure, you kill all the humans because it's easy and you might as well. You can imagine somehow making an AI that cares about humans enough not to straight up kill all of them, but if humans are a survival threat, we should expect it to find some other creative way to contain us, and this is not a design constraint you should feel good about.

In particular, if you are an algorithm which is willing to kill all humans, it is likely that humans do not want you to run, and so letting humans live is bad for your own survival if you somehow get made before the humans notice you are willing to kill them all. This is not a good sign for humans' odds of being able to get more than one try to get AI right if most things are concerned with their own survival, even if that concern is only implicit in having any goal whatsoever.

Importantly, none of this requires humans to make a coding error. It only requires a thing with goals and intelligence, and the only apparent way to get around it is to have the smart thing implicitly care about literally every thing that humans care about to the same relative degrees that humans care about them. It's not a formal proof, but maybe it's the beginning of one. Parenthetically, I guess it's also a good reason to have a lot of military capability before you go looking for aliens, even if you don't intend to harm any.

Comment by Tahp on Thoughts after the Wolfram and Yudkowsky discussion · 2024-11-14T14:42:44.040Z · LW · GW

Oops, I meant cellular, and not molecular. I'm going to edit that.

I can come up with a story in which AI takes over the world. I can also come up with a story where obviously it's cheaper and more effective to disable all of the nuclear weapons than it is to take over the world, so why would the AI do the second thing? I see a path where instrumental convergence leads anything going hard enough to want to put all of the atoms on the most predictable path it can dictate. I think the thing that I don't get is what principle it is that makes anything useful go that hard. Something like (for example, I haven't actually thought this through) "it is hard to create something with enough agency/creativity to design and implement experiments toward a purpose without also having it notice and try to fix things in the world which are suboptimal to the purpose."

Comment by Tahp on What is space? What is time? · 2024-06-09T21:52:05.977Z · LW · GW

Be careful. Physics seems to be translation invariant, but space is not. You can drop the ball in and out of the cave and its displacement over time will be the same, but you can definitely tell whether it is in the cave or out of the cave. You can set your zero point anywhere, but that doesn’t mean that objects in space move when you change your zero point. Space is isotropic. There’s no discernible difference between upward, sideways, or diagonal, but if you measure the sideways distance between two houses to be 40 meters, a person who called your “sideways” their “up” will measure the distance between the houses to be 40 meters up and down. You can do everything here as you can do there, but here is not there. In the absence of any reference point, no point in space is different from any other point, but in the absence of any reference point there’s no need for physics, because if there was anything to describe with physics, you could use it as a reference point.

I suppose you could try to define space as the thing you can move around in without changing your physics, but the usual strategy is to define physics and derive conservation of momentum from the fact that your physics is translation invariant.
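
As a sketch of that usual strategy, assume (my choice of setup, for illustration) a classical system of $N$ particles with coordinates $q_i$ and a Lagrangian $L(q_1, \dots, q_N, \dot q_1, \dots, \dot q_N)$ which does not change when every particle is shifted by the same amount $\epsilon$:

```latex
\begin{align}
0 = \frac{dL}{d\epsilon}\bigg|_{q_i \to q_i + \epsilon}
  &= \sum_{i=1}^{N} \frac{\partial L}{\partial q_i}
  && \text{(translation invariance)} \\
\frac{d}{dt}\frac{\partial L}{\partial \dot q_i}
  &= \frac{\partial L}{\partial q_i}
  && \text{(Euler--Lagrange equations)} \\
\Rightarrow\quad \frac{d}{dt} \sum_{i=1}^{N} p_i
  &= \sum_{i=1}^{N} \frac{\partial L}{\partial q_i} = 0,
  \qquad p_i \equiv \frac{\partial L}{\partial \dot q_i}.
\end{align}
```

The total momentum is conserved because the physics doesn't care where the zero point is, which is the derivation running in the direction I described: physics first, conservation law second.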

Comment by Tahp on Smoke without fire is scary · 2022-10-08T01:41:29.242Z · LW · GW

The main idea seems good: if you're in a situation where you think you might be in the process of being deceived by an AI, do not relax when the AI provides great evidence that it is not deceiving you. The primary expected outputs of something really good at deception should be things which don't look like deception.

Some of the things in the post don't seem general enough to me, so I want to try to restate them.

Test 1 I like. If you understand all of the gears, you should understand the machine.

Test 2 I like. Tweak the model in a way that should make it worse at process A, and then check the outputs of process A to see whether they change in ways that reveal biases you can triangulate with past outputs in order to catch deception.

Test 3 scares me. This seems to me to be a test for the very specific form of deception "hide capabilities while in testing but not when deployed" which seems like a small domain in deception space.

Test 4 I kind of like. One expected behavior from a deceptive AI is to hide capabilities that scare the people studying it. If you see scary capabilities getting less scary, you should take that as a sign of deception unless you have good reason to believe those capabilities should get worse with scale. Thus it is a good idea to find out which things should get worse with scale ahead of time. I do worry that this paradigm relies too much on AI which improves via "more dakka" (e.g. more GPUs, larger datasets, better processors, etc.) rather than via algorithm improvements or something, in which case I don't know that people will have a good handle on what capabilities will get worse. The "scaling helps" section also worries me for this reason.

In the section "deceptive models know this" you suggest "deciding on a level of deceptive capabilities that’s low enough that we trust models not to be deceptively aligned". Won't that just optimize on things which start deceiving well earlier? I think I may be misinterpreting what you mean by "deceptive capabilities" here. Maybe your "deceptive capabilities" are "smoke" and actual deception is "fire", but I'm not sure what deceptive capabilities that aren't deception are.

Comment by Tahp on Book review: The Age of Surveillance Capitalism · 2022-02-21T01:53:54.453Z · LW · GW

The ad market amounts to an auction for societal control. An advertisement is an instrument by which an entity attempts to change the future behavior of many other entities. Generally it is an instrument for a company to make people buy their stuff. There is also political advertising, which is an instrument to make people take actions in support of a cause or person seeking power. Advertising of any type is not known for making reason-based arguments. I recall in an interview with the author that this influence/prediction market was a major objection to the new order. If there is to be a market where companies and political-power-seekers bid for the ability to change the actions of the seething masses according to their own goals, the author felt that the seething masses should have some say in it.

To me, the major issue here is that of consent. It may very well be that I would happily trade some of my attention to Google for excellent file-sharing and navigation tools. It may very well be that I would trade my attention to Facebook for a centralized place to get updates about people I know. In reality, I was never given the option to do anything else. Google effectively owns the entire online ad market which is not Facebook. Any site which is not big enough to directly sell ads against itself has no choice but to surrender the attention of its readers to Google or not have ads. According to parents I know, Facebook is the only place parents are organizing events for their children, so you need a Facebook page if you want to participate in your community. In the US, Facebook marketplace is a necessity for anyone trying to buy and sell things on the street. I often want to look up information on a local restaurant, only to find that the only way to do so is on their Instagram page, and I don't have an account, so I can't participate in that part of my community. The tools which are holding society together are run by a handful of private companies such that I can't participate in my community without subjecting myself to targeted advertising which is trying to make me do things I don't want to do. I find this disturbing.

Comment by Tahp on Some thoughts on vegetarianism and veganism · 2022-02-14T15:54:53.028Z · LW · GW

There’s also timeless decision theory to consider. A rational agent should take other rational agents into consideration when choosing actions. If I choose to go vegan, it stands to reason that similarly acting moral agents would also choose that course. If many (but importantly not all) people want to be vegan, then demand for vegan foods goes up. If demand for vegan food goes up, then suppliers make more vegan food and have an incentive to make it cheaper and tastier. If vegan food is cheaper and tastier, then more people who were on the fence about veganism can make the switch. It’s a virtuous cycle. Just in the four years since I went vegan, I’ve noticed that packaged vegan food is much easier to find in the grocery store I’ve been using for 5 years. My demand contributed to that change.

I’m not sure whether there’s a moral case against animal suffering anymore, but I still think plant farming is net better than animal farming for other reasons. Mass antibiotic use risks super-bugs, energy use is much higher for non-chicken farming than for plants, and the meat-processing industry has more amputation in its worker base than I like. I would like to incentivize readily available plant based food.
