Posts
Comments
o3 has a different base model (presumably).
All of the figures hold the base model constant between the RL and non-RL runs.
I would expect "this paper doesn't have the actual optimal methods" to be true; this is specifically a test of PPO on in-distribution actions. Concretely, there is a potential story here that PPO reinforces traces that hit during self-play, and consequently there is a sense in which we would expect it to only select previously on-policy actions.
But if one has enough money, one can finetune GPT models and test that.
Also note that 10k submissions is about 2 OOM out of distribution for the charts in the paper.
Pass@k at infinite k includes every path with nonzero probability (given a policy of discarding exact repeat paths).
We know that RL decreases model entropy, so the first k samples will be more diverse for a higher-variance model.
Pass@k is take-the-best-of-k, and for a normal distribution the expected best of k samples is roughly mean + sd·sqrt(2·ln k).
At very large k, we would expect variance to matter more than mean.
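A quick simulation of the crossover (a sketch; the means and standard deviations are made up purely for illustration, with sampler A standing in for a lower-entropy post-RL model and B for a higher-variance base model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-sample "score" distributions:
# A has the higher mean, B has the higher variance.
mu_a, sd_a = 2.0, 0.2
mu_b, sd_b = 0.0, 1.0
trials = 1000

for k in [1, 10, 100, 10_000]:
    best_a = rng.normal(mu_a, sd_a, size=(trials, k)).max(axis=1).mean()
    best_b = rng.normal(mu_b, sd_b, size=(trials, k)).max(axis=1).mean()
    print(f"k={k:>6}:  E[best of k]  A~{best_a:.2f}  B~{best_b:.2f}")
```

A wins at k=1 and k=10, the two cross around k of order 100, and B is far ahead by k=10,000, tracking the mean + sd·sqrt(2·ln k) asymptotics.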
No one cared.
You don't know what questions they did not ask you, or what assumptions of shared cultural background they made because they saw the degree. They would not tell you (unless you can compare against job searching before you got the degree).
Fundamentally, this is the expected phenomenology, since people do not tend to notice the sources of their own status.
Credentialism is good because, for most credential-requiring positions (white-collar, business, and engineering work), the limiting factor on employment is trust, not talent.
Universities are bad at teaching skills, but generate trust and social capital.
The trust that allows the system to underwrite new white-collar workers to do things that might lose businesses lots of money is important and expensive.
Consequently you get credential requirements, because no test other than years spent inside social systems can tell you that a person has the ability to go four years without crashing out (which is the key skill).
Additionally, going to university has become a class signifier, and all classes wish they were bigger and more prominent.
The alternative to credentialism is selection, or real meritocracy.
The alternative to credentialism is not selection; it is hiring your buddies, hiring by visible factors, and hiring randomly. Most businesses are not in a position to run a competitive selective process (THOSE ARE REALLY EXPENSIVE).
"universities provide to employers is the ability to confirm you are clever, driven, and have relevant skills" is false. What they provide is confirmation that you are a member of the professional class who is not going to do stupid things that lose money or generate risk.
Fundamentally, this misunderstands the purpose of the degree to the hiring bureaucracy, and the political economy behind it.
In short, it seems like the current system unfairly kills drugs that take a long time to develop and do not have a patentable change in the last few years of that cycle.
If the story about drug prices and price controls is correct (that price controls are bad because the limiting factor for drug development is returns on capital, which this reduces), then we must rethink the political economy of drug development.
Basically, if that were the case, we would expect the sectoral return rate of biotech to match the risk-adjusted rate, but drug development is both risky and skewed, which affects the cost of capital.
Most of a drug's price is capital cost, so interventions that lower the capital costs of pharmaceutical companies might produce more drugs.
Most of those capital costs come from the total raise required, which is driven basically by the costs of pharmaceutical research (which is probably mostly the labor of expensive professionals).
The expected rate of return is dominated by the risks of pharmaceutical companies.
Drug prices are whatever the market will bear under a temporary monopoly, then drop to a very low level once a compound goes generic.
There is a big problem here with out-of-patent molecules: if a drug is covered by a patent and stalls for 20 years, there is no longer the return needed to push it through the process. That means there may be zombie drugs around from companies that fell apart and did a bad job of selling the asset (so it neither finished nor failed the process).
There seems to be space for the various approvals to become more IP-like (so that all drugs have the same exclusivity, regardless of how long they took to prove out).
I don't think that people on the natsec side have made that update, since they have been talking this line for a while.
But the dead organization framing matters here.
In short, people think that democratic institutions are not dead (especially electoralism). If AGI is "democratic", that live institution, in which they are a stakeholder, will have the power to choose to do fine stuff (and might generalize to everybody being a stakeholder), which is positive EV, especially for them.
They also expect that China as a live actor will try to kill all other actors if given the chance.
I am neither an American citizen nor a Chinese citizen.
does not describe most people who make that argument.
Most of these people are US citizens, or could be. Under liberalism/democracy those sorts of people get a say in the future, so they think AGI will be better if it gives those sorts of people a say.
Most people talking about USG AGI have structural investments in the US, which are better for them and give them more chances to bid on not destroying the world (many are citizens or are in the US bloc). Since the US government is expected to treat the other stakeholders in its previous bloc better than China treats members of its bloc, it is better for people who are only US-aligned if the US gets more powerful, since it will probably support its traditional allies even when it is vastly more powerful, as it did during the early Cold War. (This was obvious last year and is no longer obvious.)
In short, the USG was committed to international liberalism, which is a great thing for an AGI to have for various reasons that are hard to articulate, but basically of the form that liberals are committed to not doing crazy stuff.
People who can't reason well about the CCP's internal ideologies and political conflicts (like me), and who predict ideological alignment for AGI, think that USG AGI will use the frames of international liberalism (which don't let you get away with terrible things even if you are powerful), and worry about frames of international realism (which they assign to China, since they cannot tell, and which argue that if you have the power you must/should/can use it to do anything, including ruining everybody else).
As a summary, if you are not an American citizen, do not trust the US natsec framing. A lot of this is carryover from times when the US liberal international bloc (the global international order) was stronger, and as a bloc framing it is better iff the US bloc is somehow bigger, which at the time it was.
You would need to make sure that this change in asset values does not wipe out highly leveraged players. But that is also a thing that has been done in the past. (see 2023 banking failures for what happens).
The problem is that the relation of asset values to realized returns (they will equalize across assets with fixed returns) means that any tax on asset returns is immediately reflected in valuations. But that is not the end of the world, since if you hold assets, Paine would argue, you can afford a haircut.
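A stylized sketch of that capitalization effect, assuming a simple perpetuity paying cash flow $C$ each period, discounted at rate $r$, with a tax at rate $\tau$ on the asset's returns:

$$P=\sum_{n=1}^{\infty}\frac{C}{(1+r)^{n}}=\frac{C}{r},\qquad P_{\text{after tax}}=\frac{C(1-\tau)}{r}=(1-\tau)\,P.$$

The whole present value of the future tax shows up immediately as a haircut on the current holder's asset value.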
I noticed another thing.
All these analyses put a lot of stock in the Democrats being anti-market, because, well, it is in the Democratic discourse. But I think that is misreading that discourse. A lot of it is that the Democrats are rightly very scared and suspicious (almost paranoid) of monopolies, monopsonies, and cartelization. And they don't just endorse the obvious solution of aggressively breaking up companies (since it is bad for businesses even though it is good for competition).
But I just don't think that is the only way to frame it. In particular, Biden's SEC and FTC are very skeptical of M&A because they are very scared of monopolies, and most of the Democratic policies make sense in a frame of "we think there might be a monopoly in X, we don't just want to point antitrust at it, so what should we do?"
And generally the solution they come up with is that the government should effectively engage in price negotiations with the monopoly provider, where they use the law to get people to coordinate in bargaining for a better price, so you end up with two-agent, no-alternative bargaining as the pricing mechanism, hopefully agreeing on something closer to the free-market price (the price capping). That is a bad pricing mechanism (often ending up below market). It is really hard to figure out the coordination method being used so as to break them up. This is a bad solution. If you think there is a cartel, you don't put in a price cap, you break the cartel.
Another broad problem is not noticing (or caring about) the degree to which being a good administrator of the federal bureaucracy is a critical skill for a president. The places where it seems like Trump has no clue what is going on were baked in by P2025, when it talked about doing things that disrupted the normal functioning of agencies, because 90% of what the president knows he gets from his secretaries and advisors, who get it from their departments. The fact that Trump watches Fox but sometimes ignores briefings is in fact a big deal.
Harris winning probably would not stop the Democratic civil war (unless she got some deals done), because the Democrats have a civil war every election cycle and did not get a chance to have one in the primaries. We don't know how she would govern through that.
Note that knowing != doing, so in principle there is a gap between a world model which includes lots of information about what the user is feeling (what you call cognitive empathy), and acting on that information in prosocial/beneficial ways.
Similarly, one can consider another's emotions either to mislead or to comfort them.
There is a bit of tricky framing/training work in making a model that "knows" what a user is feeling, has that at a low enough layer that the activation is useful, and actually acts on it in a beneficial way.
Steering might help taxonomize here.
It is within our power to prevent lab-originated pandemics but not natural pandemics
Might be false.
If you could clear vaccines with good transmission prevention for deployment before zoonosis, and if the hypothesis holds that wild viruses prone to zoonosis are observable (so you can prepare), then you could basically prevent zoonosis events from becoming pandemics (because you could immediately begin ring vaccination).
So there are no "natural pandemics"; there are new diseases which interact with social conditions to become pandemic (just as existing diseases can mutate past current limitations). If those social conditions do not exist, a disease does not reach pandemic status.
Generally, releases of "open source" models include the inference code and the weights, but not the exact training data, and often not much information about the training setup (for instance, DeepSeek has done a pile of hacking on how to get the most out of their H800s, which is private).
Thank you.
The effect will at first be most clear in fields where entry-level cognitive tasks are especially prone to near-term AI automation, like software engineering and law as seen above. Other industries where complex human interaction or physical dexterity are crucial, for example healthcare and construction, will hold out for longer before being automated too through a combination of AI and robotics.
Repeat Paragraph
Historically, the social contract describes the set of agreements concerning the legitimacy of government and its role in governing citizens. This concept, developed by philosophers like Hobbes, Locke, and Rousseau, posits that individuals surrender certain natural freedoms and contribute a portion of their wealth to governments in exchange for protection, order, and social stability.
Over time, this foundational concept evolved beyond the basic relationship between citizen and state. The industrial age expanded the social contract to encompass economic relationships between workers, employers, and broader society. Citizens came to expect not just security, but also economic opportunity. Governments increasingly took on responsibilities for education, infrastructure, and basic welfare as part of their obligation under this implicit agreement.
This evolution produced the modern social contract we recognize today: citizens contribute their labor and a portion of their earnings through taxation; in return, they receive not just protection but also economic security and the promise that hard work would be rewarded with prosperity.
I am not sure how this social change is discontinuous with previous developments which introduced new social conditions, new capabilities, and new externalities. In short, it is clear that if there is big economic change, there will be political changes too, but if this is rethinking the social contract, then we have been doing it continuously. We do not need to begin to rethink the social contract. We need to recognize that we have always been continually rethinking it.
There is another analogy where this works. It is like bank failures, where things fall apart slowly, then all at once. That is to say, being past the critical threshold does not tell you when the failure will come. Specifically, you can't tell whether an organization can do something without actually trying to do it. So noticing disempowerment is not helpful if you notice it only after the critical threshold, when you try something and it does not work.
Mainly, things that we would never think of are what is fruitful for AI and not for us.
Things that are useful for us but not for AI are things like investigating gaps in tokenization, hiding things from AI, and things that are hard to explain/judge, because we probably ought to trust AI researchers less than we do human researchers with regard to good faith.
That is, given that you get useful work out of AI-driven research before things fall apart (reasonable but not guaranteed).
That being said, this strategy relies on the approaches that are fruitful for us being the same approaches that are fruitful for AI-assisted, AI-accelerated, or AI-done research (again reasonable, but not certain).
It also relies on work done now giving useful direction, especially if parallelism grows faster than serial speed.
In short, this says that if time horizons to AI assistance are short, the most important things are: A. a framework for verifying an approach, so we can hand it off, and B. information about whether it will ultimately be workable.
As always, it seems to bias towards long term approaches where you can do the hard part first.
If this becomes widespread, there are two problems bad enough that they might create significant backlash.
First, if things like 4 happen, or get leaked because data security is in general hard, people will generate precautions on their own (using controls like local models or hosting on dedicated cloud infrastructure). There is a tension: you want to save all context so that your therapist knows you better, and you will probably tell them things that you do not want others to know.
Second, there is a tension with not wanting the models to talk about particular things, in that letting the model talk about suicide can help prevent it. But if this fails (somebody talks to a model about suicide, it says what it thinks would help, and it does not work), that will be very unpopular even if the model acted in a high-EV manner.
You are wrong. The article does not refute that argument, because (2) is exactly about the many dimensions of the types of talent demanded (since universities want a variety of things).
You are assuming the conclusion: that there is not a large variety of things a university wants.
Saying that a problem is easier if you relax it is not an argument that it can be relaxed. That is your fundamental misunderstanding. The university really does find value in the things it selects for with (2), so it has a lot of valuable candidates, and picking a mixture of valuable candidates from a large supply of hard-to-compare offerings is in fact difficult, and will leave any single metric too weak for their preferences.
This is about your top line claim, and your framing.
If you say that once you exclude the reason a system is competitive, it does not need to be competitive, that is trivially true.
The system you propose does not fulfill the top-line purposes of the admissions system.
there isn’t a huge oversupply of talent at all for these spots,
Misses the fact that the complexity of the admissions process does not come from competition over talent (universities would be willing to accept most people on their waitlist if they had more slots, and slots are limited by other factors), but from highly multidimensional preference frontiers which require complicated information about applicants to get good distributions of students.
Basically, the argument about talent points in the wrong direction for talking about admission systems.
University cohorts are basically set up to maximally benefit the people who do get admitted, not to admit the most qualified. For this purpose universities would rather have students who make the school more rewarding for other students, not the smartest possible students. This is combined with a general tendency to do prestigious/donor-wanted things. And donors want to have gone to a college that is hard to get into (even though they did not like applying). The difficulty of application (and with it, admit rates over yield rates) is a signal.
I think you might fundamentally misunderstand the purpose of admission systems. To be frank, admissions is set up to benefit the university and the university alone. If getting good test scores was the bottleneck, you would see shifts in strategic behaviour until the test became mostly meaningless. For instance, you can freely retake the SAT, so if you just selected based on that people would just retake till they got a good result.
The university has strong preferences about the distribution of students in classes. They have decided that they want different things from their applicants than "just" being good at tests.
They get this exactly through race-based affirmative action, athletic recruitment, "Dean's Interest List"-type tracking systems for children of donors or notable persons, legacy preference for children of alumni, and a bunch of ill-articulated selection actions in admissions offices and various other places.
A stable-marriage system would require a national system, which would require universities, as distinct organizations competing (mostly for prestige), to coordinate for the benefit of students. They obviously should do things of that general description, but they tend not to.
"Maximize EV" probably produces a skewed distribution. But "maximize skewness + variance + EV" almost certainly has lower EV than "maximize EV".
Exactly. You would expect hypersuccess or bust to be a lower mean strategy than maximize EV.
There might be a third level to this approach. You can imagine that there are efficient vs. inefficient coalitions. That is to say, some ways of organizing might be coherent (they do settle on courses of action) and well-founded (have good recursive properties with regard to the coherence/properties of subagents under internal conflict), but still be ones in which valuable trades do not happen, or overall overhead is high.
A good example is well- vs. badly-managed companies. Even if there is no infighting, and they do come to decisions, some companies do a good job of actually achieving their goals given the individual competence of their members, while others have very competent subagents who organize well and just structurally don't execute.
So I think you can measure the degree to which the agent is the most effectual organization of some subagents (for instance, is task splitting efficient?), especially past the human scale, where coalitions are more freely formed.
Billionaires probably give bad advice
Why?
Because in scenarios where your decisions change both variance and mean, and you are selecting for the highest value (maximizing the odds of passing some threshold), increasing variance sometimes raises your odds of being in the top bracket more than increasing the mean does. Specifically, for a given threshold above the mean, increasing variance increases the chance of passing it, and similarly for skewness. This holds both for absolute thresholds and for thresholds defined as a proportion drawn from a fixed distribution (which is statistically similar to an absolute threshold).
Business hypersuccess has as much to do with doing high-variance things as it does with doing high-EV things.
This checks out with anecdotal evidence from things like the Forbes top 200 by wealth (most have concentrated holdings and are CEOs in high-variance industries) and other measures, such as elite athletes.
(Probably comes from things like the tails separating).
I could work out the precise sizes of these effects for Gaussians.
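For Gaussians the effect is easy to write down. With outcome $X \sim \mathcal{N}(\mu, \sigma^2)$ and a fixed threshold $t > \mu$:

$$P(X > t) = 1 - \Phi\!\left(\frac{t-\mu}{\sigma}\right), \qquad \frac{\partial}{\partial \sigma} P(X > t) = \frac{t-\mu}{\sigma^{2}}\,\varphi\!\left(\frac{t-\mu}{\sigma}\right) > 0,$$

so more variance always helps for thresholds above the mean. For example, with $t = \mu + 3$, going from $\sigma = 1$ to $\sigma = 2$ raises the probability from about 0.13% to about 6.7%, versus about 2.3% from instead raising the mean by 1.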
That does not look like state-valued consequentialism as we typically see it, but like act-valued consequentialism (in a Markov model, this is the intrinsic value of the act plus the expected value of the sum of future actions): an agent with value placed on the acts "use existing X to get more Y" and "use existing Y to get more X". I mean, how is this different from placing value on actions that produce Y from X and actions that produce X from Y, where X and Y set the scale of a particular action?
It looks money-pump resistant because it wants to take those actions as many times as possible, as well as possible, and a money pump generally requires that the scale of the transactions (the resources the pumper is extracting) drops over time. But then the trade is inefficient. There are probably benefits to being an efficient counterparty, but money-pumpers are inefficient counterparties.
You are right.
I had an interesting conversation about this.
This is in contrast to consequentialism as a part/tool of other moral systems
If there is a deontological rule "don't murder", then the question becomes: what makes an action murder?
Murder is when you kill someone. (Obviously).
That is, murder is an action with the consequence of death (to rounding)
So there is a sense in which consequentialism is also used for saying which actions fall into which categories in some other moral framework.
I retract.
"What actions are what" is a question only a consequentialist would ask (consequentialism grows out of the ontology of figuring out the consequences of actions).
Other moral systems can/do exist in ontologies where you do not believe that is possible, or do not trust that it is possible to be good at that, and so judge on other grounds.
Serious take
CDT might work
Basically because of the Bellman fact that the options "1 utilon" and "play a game with EV 1 utilon" are the same.
So, working out the Bellman equations, if each decision changes the game you are playing, this will get integrated.
In any case where somebody is actually making decisions based on your decision theory, the actions you take in previous games might also have the result "restart from position x with a new game based on what they have simulated you to do".
The hard part is figuring out binding.
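The Bellman fact above, as a sketch: a state is worth its expected return, so "get 1 utilon" and "enter a subgame whose value is 1 utilon" are interchangeable under

$$V(s) = \max_{a}\; \mathbb{E}\left[\, r(s,a) + \gamma\, V(s') \,\right],$$

and if a decision changes which game you are subsequently playing, that just changes which $V(s')$ gets substituted in.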
Note to self
A point that we cannot predict past (classically, the singularity) does not mean that we can never predict past it, just that we can't predict past it right now. It does not make sense to predict the direction of your own future predictions (or it does, but it will not get you anywhere). But we can predict that our predictions of an event will likely improve as we near it.
Therefore, the argument that claims of the form "because we have a prediction horizon, we cannot predict past a certain point" are defeated by the fact that, once we have neared that point, we can now predict past it, is unconvincing, since by then we simply have more information.
However, arguments that we will never predict past a certain point need to justify why our prediction ability will in fact get worse over time.
forward pass (e.g. the residual stream) has to be deleted, outputting only a single token.
Does not actually happen.
What happens is that the new token is now at the root of the attention structure, and it can pass information from the final layers of the previous pass to the first layers when inferencing the next token.
The residuals are translation invariant, and the keys and values computed from them are cached for further inference in autoregressive mode.
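A minimal sketch of the caching point (toy single-head causal attention in PyTorch; all names and sizes here are made up for illustration): with a causal mask, appending a new token leaves the earlier positions' outputs unchanged, which is why they can be cached, while the new position reads from all of them.

```python
import torch

torch.manual_seed(0)
d = 8
Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))  # fixed projection weights

def causal_attn(x):
    # x: (T, d) residual stream; single-head causal self-attention
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / d ** 0.5
    mask = torch.tril(torch.ones(len(x), len(x))).bool()
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

prefix = torch.randn(5, d)        # residuals for the first 5 tokens
new = torch.randn(1, d)           # residual for the newly sampled token
full = causal_attn(torch.cat([prefix, new]))

# Earlier positions are unchanged by the new token, so they can be cached...
assert torch.allclose(causal_attn(prefix), full[:5], atol=1e-5)
# ...while the new position attends to (reads information from) every cached position.
print(full[5])
```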
A trivial note
Given standard axioms for propositional logic,
A->A is a tautology.
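For concreteness, the textbook Hilbert-style derivation, using the axiom schemas K: $\varphi \to (\psi \to \varphi)$ and S: $(\varphi \to (\psi \to \chi)) \to ((\varphi \to \psi) \to (\varphi \to \chi))$ plus modus ponens:

$$
\begin{aligned}
&1.\;\; A \to ((A \to A) \to A) && \text{(K)}\\
&2.\;\; \big(A \to ((A \to A) \to A)\big) \to \big((A \to (A \to A)) \to (A \to A)\big) && \text{(S)}\\
&3.\;\; (A \to (A \to A)) \to (A \to A) && \text{(MP 1, 2)}\\
&4.\;\; A \to (A \to A) && \text{(K)}\\
&5.\;\; A \to A && \text{(MP 3, 4)}
\end{aligned}
$$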
Consequently: 1. Circularity is not a remarkable property (it is not any strong argument for a position).
2. Contradiction still exists
But a system cannot meaningfully say anything about its axioms other than their logical consequences.
Consequently, since axioms being logical consequences of themselves is exactly circularity, there is no non-circular justification available:
In a Bayesian formulation there is no way of justifying a prior.
Or, in regular logic, you cannot formally justify axioms, nor the right to assume them.
Thought before I need to log off for the day,
This line of argument seems to indicate that physical systems can only completely model smaller physical systems (or the trivial model of themselves), and so complete models of reality are intractable.
I am not sure what else you are trying to get at.
The problem seems to be that we have free choice of internal formal systems, and:
A consistent system, extended by an axiom it can neither prove nor refute, is still consistent (since if the extension were inconsistent, the original system could prove the negation of the new axiom by deriving that contradiction).
Consequently, accepting an unprovable statement as true or as false only has consequences for other unprovable statements.
I don't think this entire exercise says anything.
In short, we expect probabilistic logics and decision theories to converge under self-reflection.
I think you might be grossly misreading Gödel's incompleteness theorem. Specifically, it proves that a sufficiently expressive formal system is either incomplete or inconsistent. You have not addressed the possibility that minds are in fact inconsistent, i.e. make moves that are symbolically describable but unjustifiable (which generate falsehoods).
We know both happen.
The question then is what to do with an inconsistent mind.
Actually, I think this argument demonstrates the probable existence of the opposite of its top-line conclusion.
In short, from the fact that a symbolic regulator has more possible states than it has inputs, we can infer that anything that can be modeled as a symbolic regulator has a limited amount of information about its own state (that is, limited internal visibility). You can do clever things with indexing so that it can have information about any part of its state, but not all of it at once.
In a dynamic system, this creates something that acts a lot like consciousness, maybe even deserves the name.
How much of a consensus is there on pausing AI?
Not much, compared to the push to get the stuff that already exists out to full deployment (for various institutions this has a meaningful impact on profit margins).
People don't want to fight that, even if they think that further capabilities are a bad price/risk/benefit tradeoff.
There is a coordination problem where, if you ask for a pause and people say no, you can't make other asks.
Third, they might just not mesh with/trust that particular movement and the consolidation of platform it represents, and so want to make their points on their own instead of joining a bigger organization's demands.
The Good Regulator Premise
Every good regulator of a system must be a model of that system. (Conant and Ashby)
This theorem asserts a necessary correspondence between the regulator (internal representation) and the regulated system (the environment or external system). Explicitly, it means:
A good map (good regulator) of an island (system) is sufficient if external to the island.
But if the map is located on the island, it becomes dramatically more effective if it explicitly shows the location of the map itself ("you are here"), thus explicitly invoking self-reference.
In the case of a sufficiently expressive symbolic system (like the brain), this self-reference leads directly into conditions required for Gödelian incompleteness. Therefore: The brain is evidently a good regulator
Is not the good regulator theorem.
The good regulator theorem is "there is a (deterministic) mapping h:S→R from the states of the system to the states of the regulator."
I don't think this requires embeddedness
An AI control solution is, per se, a way to control what an AI is doing. If you have AI control, you have the option to tell your AI "don't go FOOM" and have that work.
You would not expect a control measure to continue to work if you told an AI under an AI control protocol to go FOOM.
Improvements in training efficiency are only realized if you actually train the model, and AI control lets you keep the decision to realize training efficiency gains (by training a model to a higher level of performance) away from the AI that is being controlled.
FOOM for software requires that that decision is always yes (either because people keep pushing or because the model is in the driver's seat).
So, put broadly, the AI control agenda's answer to "what should you do with an AI system that could go FOOM" is: don't let it try. Since before it goes FOOM the model is not able to beat the controls, and going FOOM takes time during which the model is working on improving itself rather than trying hard not to get violently disassembled, an AI control protocol is supposed to be able to turn an AI that tries to go FOOM over the explicit controls, over the course of hours, weeks, or months, into a deactivated machine.
AI control protocols want to fail loud for this reason. (But a breakout will involve trying to get a silent failure, for the same reason.)
A quick thought on germline engineering, riffing off of https://www.wired.com/story/your-next-job-designer-baby-therapist/, which should not be all that surprising.
Even if germline engineering is very good, so that before a child is born we have a lot of control over how things will turn out, once the child is born people will need to change their thinking, because they no longer have that control. Trying to keep that control for a long time is probably a bad idea. Similarly, if things are not as expected, action, as always, should be taken on the world as it turned out, not as you planned it. No amount of gene engineering will be so powerful that social factors are completely irrelevant.
The idea of abstraction is generally a consequentialist formulation. A thing is an abstraction when it predicts the same consequences as the system it abstracts. Abstraction of moral values would need exactly "moral judgements about an action at different levels of abstraction" to behave properly as collections.
you can have different moral judgements about an action at different levels of abstraction
Is philosophically disputed at a deep level.
(some) Moral realists would disagree with you.
- This is powerful evidence that even though models are trained to output one word at a time, they may think on much longer horizons to do so.
from Anthropic's most recent release, was mainly what prompted the thought.
I was trying to fit that into how that behaviour shows up.
I noticed a possible training artifact that might exist in LLMs, but am not sure what the testable prediction is. That is to say, I would think that the lowest-loss model for the training tasks will be doing things in the residual stream for the benefit of future tokens, not just the column-aligned token.
1. The residuals are translation invariant.
2. The gradient is the gradient of the overall loss.
3. Therefore, when taking the gradient through the attention heads, the residuals of past tokens also receive gradient from the total loss, not just from the loss of the token their column is aligned with.
Thus we would expect to see some computation being donated to tokens further ahead in the residual stream (if it were efficient).
This explains why we see lookahead in autoregressive models
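A minimal sketch of point 3 (toy PyTorch example; the shapes and the dummy loss are made up): the gradient of a later position's loss with respect to an earlier position's residual is nonzero, so the training pressure on the computation at an early column includes the losses of the tokens ahead of it.

```python
import torch

torch.manual_seed(0)
T, d = 4, 8
h = torch.randn(T, d, requires_grad=True)            # per-position residual stream
Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))   # fixed attention projections

q, k, v = h @ Wq, h @ Wk, h @ Wv
scores = q @ k.T / d ** 0.5
mask = torch.tril(torch.ones(T, T)).bool()
scores = scores.masked_fill(~mask, float("-inf"))
out = torch.softmax(scores, dim=-1) @ v               # (T, d) per-position outputs

# Take a (dummy) loss that only involves the *last* position's output.
loss_last = out[-1].pow(2).sum()
(grad_h,) = torch.autograd.grad(loss_last, h)

# The residual at position 0 still receives gradient from the last token's loss,
# so "column 0" is trained to do work for tokens further ahead as well.
print(grad_h[0].norm())   # nonzero
```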
Exactly. It is notable that Google hosts so much ad copy, but is bad at it. You would think that they could get good by imitation, but it turns out that no, imitating good marketing is hard.
I am trying to figure out a problem where every moral theory is a class of consequentialism.
In short, I cannot figure out how you argue which actions actually fall into a particular moral category without appealing to consequences.
Curling my finger is only shooting someone when I am holding a gun and pointing it at someone. We judge the two cases very differently.
In short, the classing of actions for moral analysis is hard.