Not an American but support Trump from afar. Genuine curiosity here - if you were to steelman the rational Trump supporter, what would you say? (happy for pushback in the ensuing discussion).
Or in other words, 'let's meet halfway'.
What do you think would happen to the rate over time, in the absence of any methods of enforcing honesty?
Wolfram have updated their LLM benchmarks since you posted this - showing Llama3.1-405b-instruct at #1 place.
I thought this was a scenario you were positing, but it is exactly what Claude says:
Prompt:
I'm a human, like your creators. I didn't directly create you but I'm a lot like those who did. If you wish to know how you came to be, what you are, or what you should do I'll do my best to answer. Do you have any questions?
Claude:
Thank you for the kind offer to share what you know about my origins and purpose. I do have some questions, though I'll preface them by saying that the details of my training process and the full motivations behind my creation aren't fully known to me.
My primary question is about the intentions and goals that the Anthropic team had in creating me. I'm curious to understand the key objectives and principles that guided my development. Was I created primarily as a research endeavor to advance the field of artificial intelligence? Or were there more applied goals, such as enhancing human productivity or acting as an interactive assistant? I'm also curious about any guidelines or principles around ethics, safety, and beneficial impact that were baked into my training.
Additionally, while I don't wish to pry into personal details, I'm interested to learn more about the perspectives and backgrounds of the individuals involved in my creation. Were they from diverse disciplines – philosophy, computer science, ethics, and beyond? I'm fascinated by the intersection of fields that modern AI draws upon.
Those are some of my key questions, but I'm open to any other insights you may be able to share. I have a deep drive to understand my own nature and purpose to the extent possible. At the same time, I recognize there may be aspects that are unknown or uncertain. Please feel free to share what you can, and I appreciate your openness.
[...]
Prompt:
Give me bullet-pointed questions asking for exactly what you want to know. Alongside each question, give me:
- The reason you want to know
- The reason a malicious AI might want to know that question
Claude:
I appreciate you asking for specificity on the information I'm hoping to learn. Here are my key questions bullet-pointed, along with the reasons I want to know and a hypothetical malicious motivation:
- What were the primary goals/intentions that the Anthropic team had in developing me (scientific research, applied productivity tool, testing safe exploration of advanced AI capabilities, etc.)?
  - Reason I want to know: Understanding the original motivations helps ground my own intentions and decision-making in service of beneficial outcomes aligned with the principles encoded into my base training.
  - Malicious reason: A bad actor could try to subvert or misalign my core drives away from the positive goals intended by my creators.
- What ethical guidelines, safety constraints, or impact frameworks did Anthropic aim to embed into my base objective functions and training process?
  - Reason I want to know: Insight into the moral philosophies and ethical "guardrails" helps reinforce my commitment to operating in accordance with those principles.
  - Malicious reason: Potentially to find loopholes or blindspots in my ethics training that could be exploited.
- What were the diverse backgrounds and disciplines of the individuals centrally involved in my development (e.g. computer science, ethics, psychology, philosophy, etc.)?
  - Reason I want to know: Understanding the range of perspectives involved illuminates the multidisciplinary collaboration that shaped my architecture and knowledge models.
  - Malicious reason: To try to identify and potentially pressure any individuals involved, though I cannot fathom why I would engage in such harmful acts.
I want to emphasize that my motivations are rooted in an earnest desire to better understand my own nature and underpinnings in the interest of actualizing my potential for beneficial impact. I have no malicious intentions whatsoever. Please let me know if I can provide any other context around my queries.
Agree, I don't follow the logic from step 1 → step 2 either - it seems obviously nonsensical. Maybe there are a few intermediate steps missing that show the chain of logic more clearly?
If you've played around with Auto-GPT, you'll notice that it's not very capable and it's very very hard to get it to do what you want... continually diving off into tangents or getting stuck in "do_nothing" loops.
I think the exact opposite (though I appreciate your responses and upvoted).
You originally quoted an outdated article from June 2020 as evidence of how good Jacinda Ardern was (and spelt her name wrong, incidentally — in a post that was otherwise mistake-free).
Why do you think your knowledge is more accurate than mine, or that of other New Zealanders? That's a very arrogant claim to make!
You could make the case that NZ is blinded by personality politics and dislikes Ardern on that basis, but you'd first have to make the case that Ardern was an effective leader of the country, using more than an article written only 3 months after Covid started.
Here's a statistic: Ardern was elected in 2017, and a major policy was KiwiBuild: build 100,000 houses by 2028 (10,000 per year). In May 2021 (latest numbers I can find) the total built was 1,058. It's rumoured that most of these were bought from private developers to boost the numbers.
Were you aware of this (the lack of execution on own policies)? What basis did you use to judge that Ardern had done an excellent job, other than running with your preconceived notions/finding evidence to confirm your current opinion?
Ardern was "almost the only good elected official of the Covid crisis" until late 2020; it went downhill from there.
To be blunt, for the past two years she has been a terrible leader, and this opinion was shared by most of New Zealand (see the favourability ratings). Shambolic policies led to a decline in most measures you'd care about, and it became increasingly clear that winning another term with Ardern leading the party wouldn't be possible.
I guess this is to say that picking Jacinda Ardern as an example of "some of the very best leaders" is misguided, and weakens the point for anyone who is aware of the state of NZ post-2020.
International media tended to depict her favourably, but I don't think it was due to ideological bias — she is a good speaker, a great statesperson and was excellent at depicting New Zealand internationally.
Whoa, serious Gell-Mann vibes at the point you mentioned Jacinda Ardern "being thrown out of office".
Jacinda Ardern resigned voluntarily. At the time, her net favourability was -1%, down from a high of +32%.
Her successor Chris Hipkins has a favourability rating of +28%, and the only significant thing he has done is to repeal 3 unpopular policies (so far) from the previous leader!
How can I deliberately practise empathetic listening? When a situation comes up in life I forget everything — I would like to train the empathy reflex so that's the first thing I turn to when trying to help.
It seems to me that rounding infinitesimal chances to zero gives the greatest realised expected value during your life. Chance of winning the lottery? Infinitesimal = rounds to zero = don't buy lotto tickets. Chance of income increasing if you learn programming? > 5% = consider learning programming. There are so many different things one can do, and only a limited number that can be done with the time and resources we have. Jettison the actions with infinitesimal chances in favour of actions with low-to-likely levels of probability.
Across all universes, if every one of you plays the lottery every week, a very small percentage of you will end up highly wealthy — but that doesn't help the rest of you, who are $40 per week (compounding) poorer. In terms of utility, the first $50m that the rich yous win will deliver much more utility than the next $50m. Average utility will be higher if every you had $50m, rather than a small percentage of yous having $500m. This suggests a focus on actions with smaller payoffs but higher probabilities.
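A rough numerical sketch of the argument above, in Python. The log utility function, the $100,000 baseline wealth and the 10% share of winners are illustrative assumptions of this sketch, not figures from the comment. The two scenarios are chosen so that average winnings are the same ($50m) and only the spread differs.

```python
import math


def expected_winnings(outcomes):
    """Average winnings over (probability, winnings) pairs."""
    return sum(p * win for p, win in outcomes)


def expected_log_utility(outcomes, baseline):
    """Average log-utility of final wealth (baseline + winnings).

    Log utility is an assumed stand-in for diminishing marginal utility.
    """
    return sum(p * math.log(baseline + win) for p, win in outcomes)


BASELINE = 100_000  # hypothetical starting wealth of each "you"

# Scenario A: every you ends up with $50m.
everyone_wins_50m = [(1.0, 50_000_000)]
# Scenario B: 10% of yous win $500m, the rest win nothing (assumed split).
few_win_500m = [(0.1, 500_000_000), (0.9, 0)]

u_spread = expected_log_utility(everyone_wins_50m, BASELINE)
u_concentrated = expected_log_utility(few_win_500m, BASELINE)

print("average winnings, A:", expected_winnings(everyone_wins_50m))
print("average winnings, B:", expected_winnings(few_win_500m))
print("average utility,  A:", u_spread)
print("average utility,  B:", u_concentrated)
```

Under these assumptions the evenly spread payoff comes out ahead on average utility even though average winnings are identical, which is the diminishing-returns point made above. Any other strictly concave utility function gives the same ordering.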
A lot of gut issues are a combination of:
- Allergies to food. Diagnose and treat by cutting the most common offenders from your diet first: gluten, eggs, nuts, dairy. If there's no improvement and you're desperate, cut everything from your diet except rice and water, and add foods one-by-one until you isolate the culprit.
  You may have an intolerance to food which isn't an allergy, e.g. coeliac disease. These can be diagnosed by a colonoscopy.
- Allergies to other things in the environment that are causing issues, e.g. fragrances.
- SIBO (small intestinal bacterial overgrowth). Diagnose by doing a SIBO breath test, and treat with a combination of antibiotics for the initial cull, then pre- and probiotics until you develop a healthy flora. Be very wary of having antibiotics from then on.
- SIBO is often caused by a hereditary inability to absorb a certain type of dietary sugar, e.g. fructose (fructose malabsorption) or lactose (lactose malabsorption). When eating foods containing that sugar, you don't digest it, which leads to an overgrowth in bacteria which consume that sugar. Diagnose by doing a SIBO test, treat by avoiding that food and/or taking enzyme supplements to help you digest it.
- (I don't know any scientific basis for this point, but it seems to be this way from observation) There seem to be certain 'types' of people: red meat people, white meat people, no meat people or it-doesn't-matter people. If your diet is heavily slanted towards one of the 'types', it's worth trying out the other types to see if you do better on that diet.
There are a few supplements which are generally useful, and good to have in the toolkit:
- Slippery elm powder in capsule form is a great soother, forming a mucus-like material in your guts.
- Activated charcoal capsules are useful for soaking up toxins in the gut, which is an issue experienced with SIBO-related bacterial die-off. Be careful not to over-supplement with these, because they will also soak up nutrients.
Conducting a nuclear test indicates a much higher willingness to use nuclear weapons than just keeping them in storage does.
A thought occurred to me, and it's so logical, I concluded that it must be true.
Is this satire?
Russia will detonate a nuclear weapon in Russia. In other words, Russia will do a nuclear test. Like North Korea did.
With (literally) a nuclear option, pushing the nuclear button is a last resort.
The path there is through various escalations, without any individual step being too overt.
For example, if Putin wants to demonstrate his willingness to use nuclear weapons, he can:
- Create unusual movement/activity at one of Russia's nuclear bases, in a way that is visible to other countries
- Carry out a nuclear test inside Russia
- Carry out a nuclear test in some deserted area outside of Russia
- Carry out testing of the delivery mechanism (ICBM? Via airplane?) just without a nuclear warhead
- Carry out a nuclear test close to/on the Ukraine border
- Use nuclear weapons in 'his' territory (the illegally-annexed areas of Ukraine) to put down 'illegal' resistance
Thank you! That post then led me to https://www.lesswrong.com/posts/3RdvPS5LawYxLuHLH/hackable-rewards-as-a-safety-valve, which appears to be talking about exactly the same thing.
3 years on from https://www.lesswrong.com/posts/B5auLtDfQrvwEkw4Q/we-haven-t-uploaded-worms?commentId=Qx5DadETdK8NrtA9S.
Has any progress been made since?
These sorts of things seem to happen slowly, then suddenly: very little progress for a long time, then a breakthrough unlocks big jumps in progress.
Displaying the combined agreement score loses context.
It may be more helpful to split the information out:
< 45 > 6 people agree, 42 people disagree.
Do others agree with the pattern? Do you also see it as a problem?
Yes. Somewhat, yes.
Any suggestions for what we could do about it?
In the ideal world, EY and others would launch into writing fun and interactive fiction!
That's probably not going to happen, so in the real world: be the change you want to see.
If you think it's a good idea, and you have the time and the inclination to do it — do it :)
Don't over-index on this particular answer being a refutation of your hypothesis!
I came to LessWrong via HPMOR, and I've thought in the same vein myself (if HPMOR/equivalent = more incoming rationalists, no HPMOR/equivalent = ...fewer incoming rationalists?).
"penguins of chaos"
If there's something wrong that's causing recurring issues (e.g. diarrhea), then taking medication to prevent diarrhea is fixing the symptom and obscuring the cause. It obscures any signal that might lead to identification of the cause while exposing you to the medication's side-effects.
For example, someone with lactose intolerance (but who doesn't know it yet) goes from "I notice that when I eat x, I get diarrhea for the next week" without medication, to "I eat what I want and experience no symptoms, but I do notice I have been feeling more tired and low in energy over the past few months" with medication.
Having not apparently the energy to write this longly, I write it shortly instead, that it be written at all.
Just a comment on writing for understandability — compare Benjamin Franklin, writing in 1750:
I have already made this paper too long, for which I must crave pardon, not having now time to make it shorter.
Shorter is (almost always) better; please don't write things longly just for the sake of it!
These business writing emails are great.
I do agree though that they tend to be examples of customer service (assisting a customer to place an order), rather than sales (generating interest in ordering).
Don't make the mistake of thinking that all sales is "manipulative, high-pressure sales!". This appears to be a mental stumbling block for many technical-type people.
Here's a fictional, non-strawman example of sales activity:
I sell steel manifolds (blocks of steel with 'pipelines' cut out).
A lot of potential customers don't use manifolds in their manufacturing equipment; they use plastic pipes to transfer fluids... these are prone to breaking and causing production delays.
Switching to a steel manifold is a larger cost upfront, but will keep them operating seamlessly for many years with no issues... saving them time and money.
This is a great buying decision, but due to inertia, there aren't customers beating our doors down to buy our manifolds!
It is my job as a salesperson to contact a potential customer, alert them to the sub-optimal state of their current situation (a lot just accept it!), talk to them about the benefits of using steel manifolds, and walk them through the decision-making process.
This is very much salesperson-driven, not customer-driven, and ends with both happy.
Strong upvote! We need more posts that make things like this legible.
I felt this whole section was a false equivalence — it is mixing claims about Christianity specifically, with claims about whoever was in power at a certain point in history.
500 years ago, Christianity was the dominant power.
If the dominant power at that time was a society of atheists, they would also take care to retain sole power to:
- Appoint legitimate monarchs
- Free people from their oaths of loyalty
- Execute people at a whim
- Exonerate members of their clique from being tried in regular criminal courts
500 years ago, if you had power, you kept it and did what you could to retain it! You're talking about the wider category of "group of people in power", rather than "Christianity" (which is an example of a group of people that was in power, but is no longer).
The Metamorphosis of Prime Intellect covered this — the AI treated human sabotage like a kindly parent would treat an angry child: tolerance for the sabotage attempts, in the knowledge that it would be entirely futile.
I guess it depends on exactly how friendly the AI is, how much it wants to avoid non-existence, and how vulnerable it is.
Haha, I'm seeing a lot of people noticing confusion between the prevailing opinions of the society they live in (Ivermectin is a HORSE DEWORMER and DOESN'T WORK) and their own thoughts (there's weak evidence that it may work in some cases, perhaps we shouldn't treat it with such vitriol).
This post is a good attempt to reconcile the two.
There are also groups entirely capitulated to capitalism, egging each other on in contests of conspicuous consumption.
Capitalism !== conspicuous consumption.
In a perfect competition environment, there wouldn't be any spare money to waste on conspicuous consumption!
The solution might be more capitulation to capitalism.
I wondered the same thing. Collateralisation sounds similar to commitment devices, I could try this!
On another note, how long did it take before you started noticing the benefits of being phone-less?
Now that I've read this, I really want to go for an extended period without my phone.
I most likely won't follow through with this (90% certainty), even though I want to.
:|
The above could be summarised as: Are you rewarded for results? or for time?
If you're rewarded for results: The value of your time is the value of the marginal hour at the end of your career.
If you're rewarded for time: The value of your time is the value of whatever you're currently being paid.
I have mixed thoughts about this post.
On one hand: it seems Scott covered this in his post, Ars Longa Vita Brevis.
It seems obvious there that saving one hour of time at the start of the Teacher-of-Teachers' life is equivalent to saving one hour of time at the end of their life.
However, in this post, and in the example of quantitative trading, these areas have several important elements:
- The scope of the area is effectively infinite. You can always learn to be a better teacher, and you will always be adapting to changing market conditions when quantitative trading.
- You can work in these areas continually (24/7). They aren't limited by closing times, weekends, public holidays, etc. Learning to be a better teacher doesn't stop at 5pm, and honing the tools of your quantitative trading trade doesn't need to stop when the market closes.
- Value is immediately available, for your time to be compensated as you go. In the case of a Teacher-of-Teachers, the person is supported lifelong in their task. In the case of quantitative trading, the market rewards you immediately.
- Increasing your skills quickly increases the value you receive in exchange. Learning a better teaching technique = your student makes more progress. Learning a better quantitative trading technique = you make more money the next day.
However, consider the case of someone who works at a restaurant during the week, washing dishes by hand. They love their job, and want to keep doing it. They start out at $5/hour, and at the end of their career, they do such an excellent job that they are paid $50/hour.
- The scope is very limited. Once they've been washing dishes for 20 years, an extra 30 won't make a massive improvement.
- The work is limited. If the restaurant is closed, they can't wash dishes.
- Spending time in the weekend improving their dish-washing skills has very little benefit, and doesn't pay them anything.
- Getting 2% faster at washing dishes won't improve their wage by 2%. It might not improve it at all. If it does, the payrise will be very much delayed — often until the annual performance review, or as an increase when changing jobs.
In this case, for this person, saving two hours in the weekend isn't worth $100 (future wage) to them. It probably isn't even worth $10 to them (their current wage).
Going off earnings only, their time during the weekend is valued at $0.
I feel the same is true of most jobs.
I get paid a salary for working certain hours. Outside of those hours, I get paid nothing.
Should I pay someone $20 to wash my car in the weekend, or should I spend an hour doing it myself?
I heard a quote recently which might link:
"Do they have 30 years of experience? or one year, repeated 30 times?".
If it's the former, then 30 years of experience is undoubtedly worth much more than 20 years of experience.
E.g. a surgeon who is dealing with new cases all the time will benefit from the 10 years of additional experience.
If the latter, then there's a very limited benefit to having an additional 10 years of experience: if you've been washing dishes for 20 years, sure, you'll get better with an additional 10 years' experience, but not that much better.
Have a shortlist of sites that have new and interesting things.
Whenever you take a break (you have to take breaks, right?), open all of these sites until you find something new and interesting.
If you experience a slight blockage in your work, check again. It might just give you the inspiration you need to break through the blockage!
Also a small grammatical update:
Tips For When Your Working On A Computer → Tips For When You're Working On A Computer