A math textbook leaving certain results as an exercise for the reader?
I think this is usually actually one of (1) the author not wanting to write out the proof (because it's boring/tedious) or (2) a proof that would make a good exercise because it is easy enough if you understand the big ideas (and coming up with good exercises is not always easy).
Gradual/Sharp
I recently came across Backpack Language Models and wanted to share it in case any AI interpretability people have not seen it. (I have yet to see this posted on LessWrong.)
The main difference between a backpack model and an LLM is that it enforces a much stricter rule to map inputs' embeddings to output logits. Most LLMs allow the output logits to be an arbitrary function of the inputs' embeddings; a backpack model requires the output logits to be a linear transformation of a linear combination of the input embeddings. The weights for this linear combination are parameterized by a transformer.
The nice thing about backpack models is that they are somewhat easier to interpret/edit/control: The output logits are a linear combination of the inputs' embeddings, so you can directly observe how changing the embeddings changes the outputs.
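Here is a minimal sketch of that linearity, simplified to match the description above rather than the full published architecture. The names `backpack_logits`, `weights`, and `output_matrix` are hypothetical, and the weights are hard-coded where a real backpack model would have a transformer produce them.

```python
import random

def linear_combination(weights, embeddings):
    """Weighted sum of embedding vectors: sum_j weights[j] * embeddings[j]."""
    dim = len(embeddings[0])
    return [sum(w * e[d] for w, e in zip(weights, embeddings)) for d in range(dim)]

def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def backpack_logits(weights, embeddings, output_matrix):
    """Logits as a linear map applied to a linear combination of embeddings."""
    return matvec(output_matrix, linear_combination(weights, embeddings))

rng = random.Random(0)
seq_len, dim, vocab = 4, 3, 5
embeddings = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(seq_len)]
output_matrix = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(vocab)]
# Assumption: fixed stand-in weights; a real model computes these with a transformer.
weights = [0.1, 0.2, 0.3, 0.4]

logits = backpack_logits(weights, embeddings, output_matrix)

# Edit one input embedding while holding the weights fixed.
delta = [1.0, 0.0, -1.0]
edited = [list(e) for e in embeddings]
edited[2] = [a + b for a, b in zip(edited[2], delta)]
edited_logits = backpack_logits(weights, edited, output_matrix)

# By linearity, the change in logits is weights[2] * (output_matrix @ delta).
predicted_change = [weights[2] * x for x in matvec(output_matrix, delta)]
actual_change = [b - a for a, b in zip(logits, edited_logits)]
```

With the weights held fixed, `predicted_change` and `actual_change` agree up to floating-point error, which is the sense in which you can read off how an embedding edit moves the logits.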
Most students, 48 percent, claimed to be Native American on their application....
According to Intelligent.com Managing Editor Kristen Scatton, the prevalence of applicants who claim Native American ancestry is possibly due to the popular narrative that for many Americans, a small percentage of their DNA comes from a Native American tribe.
Maybe these students are purposely misinterpreting "Native American" to be someone who was born and raised in the United States, perhaps with ancestors born and raised in the US as well. This is actually the older sense of the term "Native American", found, for example, in the name of the Native American Party back in the mid-1800s.
It is written in More Dakka:
If something is a good idea, you need a reason to not try doing more of it.
Taken at face value, it implies the contrapositive:
If something is a bad idea, you need a reason to not try doing less of it.
This is not the contrapositive. It is not even the opposite.
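To spell this out (a sketch; the labels $G$ and $R$ are introduced here and are not from the quoted post), let $G$ be "it is a good idea" and $R$ be "you need a reason to not try doing more of it":

$$
\text{original: } G \Rightarrow R, \qquad \text{contrapositive: } \neg R \Rightarrow \neg G, \qquad \text{inverse: } \neg G \Rightarrow \neg R
$$

The contrapositive would read "if you don't need a reason to not try doing more of it, it is not a good idea." The sentence about bad ideas swaps in a different consequent (about doing less), so it matches neither form.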
Parfit's Hitchhiker: You're stranded in the desert and Omega comes up. It will give you a ride out of the desert iff it predicts you'd give it 10,000 dollars upon reaching civilization again. You get a ride. When in civilization again, do you go over to the bank and withdraw some money? Well, policies which pay up in this specific situation get (value of a life - 10,000 dollars) more than policies which don't pay in this specific situation, which just die.
Why is this called Parfit's Hitchhiker? Who is the Parfit it is referring to? Where was this scenario first written up? (I'm trying to dig up the original reference.)
Not for Mormons. They don't believe in an omnipresent God.
Well, what are your actual steps? Or is this just advertisement?
Do you still live in Utah?
Did your family cut you off?
Do you know about [r/exmormon](https://old.reddit.com/r/exmormon/)?
Maybe try controlling for age? I think young people are both less likely to have signed up for cryonics (because they have less money and are less likely to die) and also have higher probabilities of cryonics working for them (because cryonics will improve by the time they need it).
This graph seems to match the rise of the internet. Here's my alternate hypothesis: Most people are irrational, and now it's more reasonable to call them crazy/stupid/fools because they have much greater access to knowledge that they are refusing/unable to learn from. I think people are just about as empathetic as they used to be, but incorrect people are less reasonable in their beliefs.
The trick here is that both equations contain which is the hardest to calculate, and that number drops out when we divide the equations.
You have a couple typos here. The first centered equation should not have a $P(\bar H H | X)$ but instead have $P(\bar H | X)$, and the inline expression should be $P(D | X)$, not $P(D | H)$.
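For reference, here is a sketch of the division being described, assuming the two equations are Bayes' theorem applied to $H$ and to $\bar H$:

$$
P(H | D X) = \frac{P(H | X)\, P(D | H X)}{P(D | X)}, \qquad
P(\bar H | D X) = \frac{P(\bar H | X)\, P(D | \bar H X)}{P(D | X)}
$$

Dividing the first by the second cancels $P(D | X)$:

$$
\frac{P(H | D X)}{P(\bar H | D X)} = \frac{P(H | X)}{P(\bar H | X)} \cdot \frac{P(D | H X)}{P(D | \bar H X)}
$$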
A few things to note:
- GPT-4's release was delayed by ~8 months because they wanted to do safety testing before releasing it. If you take this into account your graph looks much less steep.
- The employees at OpenAI know about prediction markets.
- They also have incentives to manipulate them to look like GPT-5 will come out later than it actually will. They don't want to set off an AI arms race.
I think most people view "All people are equal" as a pronouncement of a moral belief they hold, not as a statement of fact. When they say, "All people are equal", they mean they believe "all people should be treated equally", or "everyone should have to obey the same laws" or "everyone's needs have equal importance".
This moral pronouncement is also consistent with a utilitarian pronouncing "All people are equal to me", as in that all people's lives hold equal weight in his utility function.
I think the old meaning of "bigot" is very close to this. From the 1828 Webster's Dictionary:
BIG'OT, noun
1. A person who is obstinately and unreasonably wedded to a particular religious creed, opinion, practice or ritual. The word is sometimes used in an enlarged sense, for a person who is illiberally attached to any opinion, or system of belief; as a bigot to the Mohammedan religion; a bigot to a form of government.
2. A Venetian liquid measure containing the fourth part of the amphor, or half the boot.
How much more advantageous would this be than a "head only" option? To get to the brain, wouldn't you have to cut open the head anyways?
In case it's useful for others, a more direct link is https://podcasters.spotify.com/pod/show/planecrash/episodes/How-to-Read-Glowfic-e21k2pq.
I think it really depends on your reading speed. If you can read at 500 wpm, then it's probably faster for you to just read the book than search around for a podcast and then listen to said podcast. I do agree, though, that reading a summary or a blog about the topic is often a good replacement for reading an entire book.
I think robotics was (and still is) mostly bottlenecked on the algorithms side of things. It's not too expensive to build a robot, and the software is good enough that a hobbyist could hack something together easily enough in a day or two. The issue is that it's really hard to make a robot do what you want it to do. Even if you have a robot that can stand up, run around, and do back flips, how do you make it go rescue people from burning buildings? Most of the tasks robots could be useful for are messy, complicated things, and robots don't yet know how to do that.
Modern machine learning is solving this problem, but it's still not all the way there. I think one promising area of research is using large language models to plan out actions, and I expect this will be the way of the future.
I noticed that you listed "Salamander" as rationalist/rationalist adjacent fiction. I've never heard of it before, and Google doesn't seem to know either. What is this?
Lying is a social lubricant. The classic defence of lying here- if someone asks you: "Does my bum look too big in this dress?", you don't want to be honest and respond: "Yes, you look like a whale who has swallowed another, much larger whale."
That's not being honest--that's just being mean. If you really want to present an uncharitable view of honesty, maybe at least make the statements you claim to be honest actually true? For example, the response "No, it's your fat that does it," is also rather unkind but has the advantage of maybe being true.
Did you and your friend only communicate via text messages/email? I think that would make a better comparison to asking ChatGPT for help than having your friend in the same room as you give instructions based off of what they see.
I can't really think of a word that describes this. Maybe "dogmatic", "fanatic", "blind faith", or "convicted"?
You should probably also put up a sign/sticky note saying "free books" so people know they're free :)
The current premise is that, by locally monitoring factors, such as the MAC and IP address a user is connected to, we can prevent others signing onto the same device. Essentially, one account may be accessed via multiple devices, however, only one account may be accessed per device. In theory, this should minimise the incentive to create multiple accounts, as there is presently no explicit way to circumvent the issue.
Why can't someone spoof their MAC/IP address? Or even easier, buy two devices?
Could you please elaborate? Why is it bad to publicly specify these things?
Is there a reason you want to take classes instead of self-study? If you're interested in self-studying, MIT OpenCourseWare has a lot of useful classes. I'd also check out https://www.cs.cmu.edu/~10715-f18/.
Hi! I'm a current MIT student. Here's how it works at MIT. Feel free to reply back for more information:
MIT is great in terms of classes. Getting out of prereqs is pretty easy. You just talk to the professor and get permission to take their class. I've done this in two classes so far (this is just my first semester here!) and they approved me without a problem. I also took many concurrent enrollment classes in high school at a local university and the process was much the same. My experience has been that professors are very willing to let ambitious students take their classes, even if they're uncertain about those students' abilities to succeed (though they may caution against it). You'll probably see the same at whatever university you choose to attend.
On the other hand, fulfilling general institute requirements (the general education classes at MIT) is a pain here at MIT. MIT offers advanced standing exams (ASEs) to get out of some of these, but they're only for the most introductory classes. There is only one computer science ASE, for example, and it's basically a test of "Have you seen Python before?" If your goal isn't to graduate, this isn't really much of a problem. If you do hope to graduate, on the other hand, it's hard to get out of classes for which you know the material.
In terms of Alignment clubs here at MIT: I haven't been, but I've heard there is an MIT/Harvard Alignment club. There's also a branch of EA out here, and I've attended one meeting. I do think it's much bigger on the west coast though.
Overall, MIT is a great place, especially for people wanting to go into math/CS. I think you'd enjoy MIT a lot, and I definitely recommend at least applying. MIT's application is really easy--I did it all in one day (the due date for Early Action)--and rather different from other colleges'. For example, MIT doesn't have any essays (there's just lots of short answers), and many of their prompts are optional. I think you would be a great fit for MIT, and I'd be excited to see you come!
I think that if you are going to aggressively advertise your company/guild/cult by posting over 25 links to your company website, you should at least be up front that the product you're selling isn't free. For this reason, I've strongly downvoted this post.
Sorry, I meant . And yes, that should eliminate the term that causes the incorrect initialization to decay. Doesn't that cause the learning to be in the correct direction from the start?
Have you experimented with subtracting from the loss? It seems to me that doing so would get rid of the second term and allow the model to learn the correct vectors from the beginning.
I thought master's theses were supposed to be about new research (and maybe bachelor's theses too?). Is this not the case?
Is this serious? I find it somewhat ironic that your deontology is completely closed-minded on its belief about narrow-mindedness.
Fact check: Mormons don't go on missions until they are at least 18 for men and 19 for women.
Missionaries can be single men between the ages of 18 and 25, single women over the age of 19 or retired couples. Missionaries work with a companion of the same gender during their mission, with the exception of couples, who work with their spouse. Single men serve missions for two years and single women serve missions for 18 months.
See https://news-pg.churchofjesuschrist.org/topic/missionary-program.
Also, ever since the most recent transfer of power, Mormons have decided they want to be called "members of the Church of Jesus Christ of Latter-day Saints" instead of "Mormons".
When referring to Church members, the terms “members of The Church of Jesus Christ of Latter-day Saints,” “Latter-day Saints,” “members of the Church of Jesus Christ” and “members of the restored Church of Jesus Christ” are preferred. We ask that the term “Mormons” and “LDS” not be used.
See https://newsroom.churchofjesuschrist.org/style-guide.
Also, could I seriously advise not mimicking the Mormon missionary program? Mormon missionaries are basically cut off from everyone and everything except the Mormon church. Until about three years ago, they weren't even allowed to call home more than twice a year. Apparently it's also so stressful that about half of them return home early, where they're further shamed for not meeting the exacting expectations of their church. It's basically human trafficking in the name of religion. You can read all kinds of mission horror stories on (the admittedly terribly biased) https://www.reddit.com/r/exmormon.
Your outline has a lot of beliefs you expect your students to walk away with, but basically zero skills. If I was one of your prospective students, this would look a lot more like cult indoctrination than a genuine course where I would learn something.
What skills do you hope your students walk away with? Do you hope that they'll know how to avoid overfitting models? That they'll know how to detect trojaned networks? That they'll be able to find circuits in large language models? I'd recommend figuring this out first, and then working backwards to figure out what to teach.
Also, don't underestimate just how smart smart 15- and 16-year-olds can be. At my high school, for example, there were at least a dozen students who knew calculus at this age, and many more who knew how to program. And this was just a relatively normal public high school.
No, I don't. The resources I saw on a quick Google search were rather poor as well.
Have you heard about pseudoentropy? The pseudoentropy of a distribution is equal to the highest entropy among all computationally indistinguishable distributions. I think this might be similar to what you're looking for.
Also by buying off or convincing those who think they have concentrated benefits that they are wrong and should stand down, as even they get more benefit from ending the diffuse costs.
This really doesn't seem like a good way to get politics done. Is this even legal? And if it is, do you really think it makes the government better to have people effectively bribing politicians?
What Marc Andreessen has been reading. I am envious of those who get to read this many books, let alone Tyler Cowen levels of reading books. No idea how to make the time for it.
Have you considered reading Twitter less and replacing that with books?
Could I point out that avoiding head injuries might not be the only reason you wouldn't want your children to play football? You might also not want your child to adopt the culture that a lot of high school football teams have (partying, not caring about school, self-centered), which can happen quite easily if they're around football kids 3 hours/day 6 days/week.
I don't think so. I've also done the foobar challenge in the last year or so, and got nada from them.
I remember reading about a startup that is basically using LLMs to let you navigate through websites quicker. I'll edit this comment if I remember what it is.
I know you're joking, but I'd like to clarify that Jesus actually said "Let he who is without sin cast the first stone," in case some future archeologist who doesn't know anything about 21st century religions uncovers this article. Nukes didn't exist in the first century A.D.
With Obsidian, I think you can get the Excalidraw plugin to draw images, though it's not inline (it opens a new pane).
You can also use Numba to speed up loops. It's still slower than C, but it's much better than plain Python code, and it's really easy to implement (just `import numba` and put a `@numba.njit()` before your function).
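The Numba suggestion above can be sketched as follows. The function `sum_of_squares` is a made-up example, and the `try`/`except` fallback is an addition so the snippet still runs (without any speedup) when Numba isn't installed.

```python
try:
    import numba
    njit = numba.njit
except ImportError:
    # Assumption: fall back to a no-op decorator if Numba is unavailable.
    def njit(*args, **kwargs):
        def wrap(func):
            return func
        return wrap

@njit()
def sum_of_squares(n):
    # A plain Python loop; Numba compiles it to machine code on first call.
    total = 0.0
    for i in range(n):
        total += i * i
    return total

result = sum_of_squares(1000)
```

The first call includes compilation time, so any speedup only shows up on later calls or on large inputs.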
Why did you decide to only use rotation matrices instead of any invertible matrix? If you're trying to find a new basis to work in, wouldn't any invertible matrix work just as well?
I agree with your fundamental claim that there are lots of top tier students going to non-top schools, but I think you focused too much on SAT scores and GPA. Right now, there are so many kids getting top scores (about 5,500 students every year get a 36 on the ACT, and about 4,500 students get at least a 1570 on the SAT) that test scores just aren't enough to determine who gets in. Instead, admissions officers use a "holistic" approach, which seems rather noisy, but does factor in other real accomplishments, like getting to the IMO or starting a million dollar business.
My opinion is that we need harder standardized tests. (Maybe we on LessWrong could create one!) Until that occurs, though, I don't think SAT scores are enough to decide that "The 25th percentile of students at University of Maryland, College Park are as good as the 75th percentile of students at Harvard".