LessWrong 2.0 Reader
Thank you for this. I'm not eligible for it but I will send it to my sister who is. She needs emergency dental work but the health insurance plan offered through her employer doesn't cover it so she's just been suffering through the pain. So really, thank you. She will be so glad.
martinsq on Martín Soto's Shortform
Claude learns across different chats. What does this mean?
I was asking Claude 3 Sonnet "what is a PPU" in the context of this thread. For that purpose, I pasted part of the thread.
Claude automatically assumed that OA meant Anthropic (instead of OpenAI), which was surprising.
I opened a new chat, copying the exact same text, but with OA replaced by GDM. Even then, Claude assumed GDM meant Anthropic (instead of Google DeepMind).
This seemed like interesting behavior, so I started toying around (in new chats) with more tweaks to the prompt to check its robustness. But from then on Claude always correctly assumed OA was OpenAI, and GDM was Google DeepMind.
In fact, even when I pasted the exact same original prompt (which had elicited Claude to take OA to be Anthropic) into a new chat, the mistake no longer happened, neither after many retries nor across many different new chats.
Does this mean Claude somehow learns across different chats (inside the same user account)?
If so, this might not happen through a process as naive as "append previous chats as the start of the prompt, with a certain indicator that they are different", but instead some more effective distillation of the important information from those chats.
Do we have any information on whether and how this happens?
(A different hypothesis is not that the later queries had access to the information from the previous ones, but rather that they were for some reason "more intelligent" and were able to catch up to the real meanings of OA and GDM, where the previous queries were not. This seems way less likely.)
I've checked for cross-chat memory explicitly (telling it to remember some information in one chat, and asking about it in another), and it acts as if it doesn't have it.
Claude also explicitly states it doesn't have cross-chat memory, when asked about it.
Might something happen like "it does have some chat memory, but it's told not to acknowledge this fact, but it sometimes slips"?
More nuanced experiments are probably in order. Note, though, that this may only happen in the chat webapp, and not when accessing the model through the API.
tigerlily on How would you navigate a severe financial emergency with no help or resources?
Thank you for the thoughtful suggestions. Aella is exemplary but camgirling strikes me as a nightmare.
I have considered making stuff, like custom glasses/premium drinkware, and selling on Etsy but the market seems saturated and I've never had the money to buy the equipment to learn the skills required to do this kind of thing.
I am certified in Salesforce and could probably get hired helping to manage the Salesforce org for my tribe (Cherokee Nation) but would have to move to Oklahoma.
I've applied for every grant I can find that I'm eligible for, but there's not much out there and the competition is stiff.
We will figure out something, I'm sure. If we don't, there's nothing standing between us and homelessness and that reality fills me with anger and despair.
I feel like there's nothing society wants from me, so there's no way for me to convince society that I deserve anything from it.
It's so hard out here.
martinsq on William_S's Shortform
What's PPU?
crissman on LessOnline Festival Updates Thread
Health and longevity blogger from Unaging.com here. I've submitted talks on optimal diet, optimal exercise, how to run sub 3:30 for your first marathon, and sugar is fine -- fight me!
Looking forward to extended, rational health discussions!
remmelt-ellen on "Open Source AI" is a lie, but it doesn't have to be
Although the training process, in theory, can be wholly defined by source code, this is generally not practical, because doing so would require releasing (1) the methods used to train the model, (2) all data used to train the model, and (3) so called “training checkpoints” which are snapshots of the state of the model at various points in the training process.
Exactly. Without the data, the model design cannot be trained again, and you end up fine-tuning a black box (the "open weights").
Thanks for writing this.
Interesting...
Wouldn't I expect the evidence to come out in a few big chunks, e.g. OpenAI releasing a new product?
thomas-kwa on Thomas Kwa's Shortform
You should update by ±1% on AI doom surprisingly frequently
This is just a fact about how stochastic processes work. If your p(doom) is Brownian motion in 1% steps starting at 50% and stopping once it reaches 0 or 1, then there will be about 50^2 = 2500 steps of size 1% in expectation. This is a lot! If we get all the evidence for whether humanity survives or not uniformly over the next 10 years (roughly 520 weeks), then you should make a 1% update 4-5 times per week. In practice there won't be as many, because the distribution of update sizes is heavy-tailed and you don't start at 50%. But I do believe that evidence is coming in every week such that ideal market prices should move by 1% on maybe half of weeks, and it is not crazy for your probabilities to shift by 1% during many weeks if you think about it.
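To make the arithmetic concrete, here is a minimal sketch that models p(doom) as a symmetric random walk on whole percentage points, which is a discrete stand-in for the Brownian motion described above. The function names and the 52-weeks-per-year convention are assumptions of this sketch, not part of the original comment. It uses the standard gambler's-ruin result that a walk started at k with absorbing barriers at 0 and N takes k(N - k) steps on average, and includes a small seeded simulation as a sanity check.

```python
import random


def expected_steps(start: int, upper: int) -> int:
    """Expected number of +-1 steps before a symmetric random walk started at
    `start` is absorbed at 0 or `upper` (gambler's-ruin result k * (N - k))."""
    return start * (upper - start)


def simulate_steps(start: int, upper: int, rng: random.Random) -> int:
    """Run one walk in 1-point steps and count steps until it hits 0 or `upper`."""
    position, steps = start, 0
    while 0 < position < upper:
        position += 1 if rng.random() < 0.5 else -1
        steps += 1
    return steps


def mean_simulated_steps(start: int, upper: int, trials: int, seed: int = 0) -> float:
    """Average absorption time over `trials` independent seeded walks."""
    rng = random.Random(seed)
    return sum(simulate_steps(start, upper, rng) for _ in range(trials)) / trials


def updates_per_week(start: int, upper: int, years: float) -> float:
    """Average number of 1% updates per week if the expected number of steps
    is spread uniformly over `years` (assuming 52 weeks per year)."""
    return expected_steps(start, upper) / (years * 52)


if __name__ == "__main__":
    print("expected steps:", expected_steps(50, 100))
    print("updates per week over 10 years:", round(updates_per_week(50, 100, 10), 2))
    print("simulated mean steps:", mean_simulated_steps(50, 100, trials=2000))
```

Starting away from 50% lowers the count: `expected_steps(20, 100)` is 1600 rather than 2500, which matches the comment's caveat that the real number of updates will be smaller.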
keltan on Increasing IQ is trivial
Very quick Google search. But something like this link for the “targeted NIR interference therapy”?
florian_dietz on Mechanistic Interpretability Workshop Happening at ICML 2024!
It just seems intuitively like a natural fit: Everyone in mech interp needs to inspect models. This tool makes it easier to inspect models.
Does it need to be more specific than that?
One thing that comes to mind: The tool allows you to categorize different training steps and records them separately, and you can define categories arbitrarily. This can be used to compare what the network does internally in two different scenarios of interest. E.g. the categories could be "the race of the character in the story" or some other real-life condition you would want to know the impact of.
The tool will then allow you to quickly compare KPIs of tensors all across the network for these categories. It's less about testing a specific hypothesis and more about quickly getting an overview and intuition, and finding anomalies.
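As a rough illustration of the workflow described above (not the actual tool or its API), the sketch below records one summary statistic per tensor for each user-defined category and then compares categories. The class name `CategoryRecorder`, its methods, and the choice of mean absolute value as the KPI are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


class CategoryRecorder:
    """Hypothetical recorder: keeps one KPI per tensor per recorded step,
    grouped by an arbitrary user-defined category label."""

    def __init__(self):
        # category -> tensor name -> list of KPI values, one per recorded step
        self._records = defaultdict(lambda: defaultdict(list))

    def record(self, category, tensor_name, values):
        """Store the KPI (here: mean absolute value) of one tensor for one step."""
        kpi = mean(abs(v) for v in values)
        self._records[category][tensor_name].append(kpi)

    def summary(self, category):
        """Average KPI per tensor across all steps recorded for `category`."""
        return {name: mean(kpis) for name, kpis in self._records[category].items()}

    def compare(self, category_a, category_b):
        """KPI difference (a minus b) for every tensor seen in both categories."""
        a, b = self.summary(category_a), self.summary(category_b)
        return {name: a[name] - b[name] for name in a.keys() & b.keys()}


if __name__ == "__main__":
    recorder = CategoryRecorder()
    # Toy activations standing in for real tensors from two scenarios.
    recorder.record("scenario_a", "layer1", [1.0, -1.0])
    recorder.record("scenario_b", "layer1", [3.0, -3.0])
    print(recorder.compare("scenario_a", "scenario_b"))
```

A large difference for one tensor relative to the rest of the network would be the kind of anomaly worth a closer look.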