Posts

Agentic Language Model Memes 2020-08-01T18:03:30.844Z
How well can the GPT architecture solve the parity task? 2020-07-11T19:02:07.730Z
AvE: Assistance via Empowerment 2020-06-30T22:07:50.220Z
The Economic Consequences of Noise Traders 2020-06-14T17:14:59.343Z
Facebook AI: A state-of-the-art open source chatbot 2020-04-29T17:21:25.050Z
Are there any naturally occurring heat pumps? 2020-04-13T05:24:16.572Z
Can we use Variolation to deal with the Coronavirus? 2020-03-18T14:40:35.090Z
FactorialCode's Shortform 2019-07-30T22:53:24.631Z

Comments

Comment by FactorialCode on Far-UVC Light Update: No, LEDs are not around the corner (tweetstorm) · 2022-11-03T15:56:39.977Z · LW · GW

Another alternative is to use a 440nm light source and a frequency doubling crystal https://phoseon.com/wp-content/uploads/2019/04/Stable-high-efficiency-low-cost-UV-C-laser-light-source-for-HPLC.pdf although the efficiency is questionable, there are also other variations based on frequency quadrupling https://opg.optica.org/oe/fulltext.cfm?uri=oe-29-26-42485&id=465709.

Comment by FactorialCode on FactorialCode's Shortform · 2020-11-30T03:59:54.414Z · LW · GW

That said, you can hide it in your user-settings.

This solves my problem, thank you. Also it does look just like the screenshot, no problems other than what I brought up when you click on it.

Comment by FactorialCode on FactorialCode's Shortform · 2020-11-30T03:14:04.506Z · LW · GW

This might just be me, but I really hate the floating action button on LW. It's an eyesore on what is otherwise a very clean website. The floating action button was designed to "Represent the primary action on a screen" and draw the user's attention to itself. It does a great job at it, but since "ask us anything, or share your feedback" is not the primary thing you'd want to do, it's distracting.

Not only does it do that, but it also give the impression that this is another cobbled together Meteor app and therefore my brain instantly makes me associate it with all the other crappy Meteor apps.

The other thing is that when you click on it, it's doesn't fit in with the rest of the site theme. LW has this great black-grey-green color scheme, but if you click on the FAB, you are greeted with a yellow waving hand, and when you close it, you get this ugly red (1) in the corner of your screen.

It also kinda pointless since the devs and mods on this website are all very responsive and seem to be aware of everything that gets posted.

I could understand it at the start of LW 2.0 when everything was still on fire, but does anyone use it now?

/rant

Comment by FactorialCode on FactorialCode's Shortform · 2020-11-30T03:13:44.567Z · LW · GW
Comment by FactorialCode on FactorialCode's Shortform · 2020-11-30T03:12:54.580Z · LW · GW
Comment by FactorialCode on Signalling & Simulacra · 2020-11-16T00:22:44.262Z · LW · GW

I bet this is a side effect of having a large pool of bounded rational agents that all need to communicate with each other, but not necessarily frequently. When two agents only interact briefly, neither agent has enough data to work out what the "meaning" of the other's words. Each word could mean too many different things. So you can probably show that under the right circumstances, it's beneficial for agents in a pool to have a protocol that maps speech-acts to inferences the other party should make about reality (amongst other things, such as other actions). For instance, if all agents have shared interests, but only interact briefly with limited bandwidth, both agents would have an incentive to implement either side of the protocol. Furthermore, it makes sense for this protocol to be standardized, because the more standard the protocol, the less bandwidth and resources the agents will need to spend working out the quirks of each others protocol.

This is my model of what languages are.

Now that you have a well defined map from speech-acts to inferences, the notion of lying becomes meaningful. Lying is just when you use speech acts and the current protocol to shift another agents map of reality in a direction that does not correspond to your own map of reality.

Comment by FactorialCode on It's hard to use utility maximization to justify creating new sentient beings · 2020-10-20T00:51:35.518Z · LW · GW

I personally think that something more akin to minimum utilitarianism is more inline with my intuitions. That is, to a first order approximation, define utility as (soft)min(U(a),U(b),U(c),U(d)...) where a,b,c,d... are the sentients in the universe. This utility function mostly captures my intuitions as long as we have reasonable control over everyone's outcomes, utilities are comparable, and the number of people involved isn't too crazy.

Comment by FactorialCode on As a Washed Up Former Data Scientist and Machine Learning Researcher What Direction Should I Go In Now? · 2020-10-19T23:34:13.706Z · LW · GW

Money makes the world turn and it enables research, be it academic or independent. I would just focus on getting a bunch of that. Send out 10x to 20x more resumes than you already have, expand your horizons to the entire planet, and put serious effort into prepping for interviews.

You could also try getting a position at CHAI or some other org that supports AI alignment PhDs, but it's my impression that those centres are currently funding constrained and already have a big list of very high quality applicants, so your presence or absence might not make that much of a difference.

Other than that, you could also just talk directly with the people working on alignment. Send them emails, and ask them about their opinion on what kind of experiments they'd like to know the result of but don't have time to run. Then turn those experiments into papers. Once you've gotten a taste for it, you can go and do your own thing.

Comment by FactorialCode on What are some beautiful, rationalist artworks? · 2020-10-18T13:40:52.024Z · LW · GW

Comment by FactorialCode on Is Stupidity Expanding? Some Hypotheses. · 2020-10-17T18:37:34.731Z · LW · GW

I'd put my money on lowered barriers to entry on the internet and eternal September effects as the primary driver of this. In my experience the people I interact with IRL haven't really gotten any stupider. People can still code or solve business problems just as well as they used to. The massive spike in stupidity seems to have occurred mostly on the internet.

I think this is because of 2 effects that reinforce each other in a vicious cycle.

  1. Barriers to entry on the internet have been reduced. A long time ago you needed technical know how to even operate a computer, then thing got easier but you still needed a PC, and spending any amount of time on the internet was still the domain of nerds. Now anyone with a mobile phone can jump on twitter and participate.

  2. Social media platforms are evolving to promote ever dumber means of communication. If they don't they're out competed by the ones that do. For example, compare a screenshot of the reddit UI back when it started vs now. As another example, the forums of old made it fairly easy to write essays going back and forth arguing with people. Then you'd have things like facebook where you can still have a discussion, but it's more difficult. Now you have TikTok and Instagram, where the highest form of discourse comes down to a tie between a girl dancing with small text popups and an unusually verbose sign meme. You can forget about rational discussion entirely.

So I hypothesize that you end up with this death spiral, where technology lowers barriers to entry, causing people who would otherwise have been to dumb effectively to participate, causing social media companies to further modify their platforms to appeal to the lowest common denominator, causing more idiots to join... and so on and so forth. To top it off, I've found myself and other people I would call "smart" disconnecting from the larger public internet. So you end up with evaporative cooling on top of all the other aforementioned effects.

The end result is what you see today, I'm sure the process is continuing, but I've long ago checked out of the greater public internet and started hanging out in the cozyweb or outside.

Comment by FactorialCode on The Solomonoff Prior is Malign · 2020-10-15T03:36:19.803Z · LW · GW

At its core, this is the main argument why the Solomonoff prior is malign: a lot of the programs will contain agents with preferences, these agents will seek to influence the Solomonoff prior, and they will be able to do so effectively.

Am I the only one who sees this much less as a statement that the Solomonoff prior is malign, and much more a statement that reality itself is malign? I think that the proper reaction is not to use a different prior, but to build agents that are robust to the possibility that we live in a simulation run by influence seeking malign agents so that they don't end up like this.

Comment by FactorialCode on The Treacherous Path to Rationality · 2020-10-15T03:04:21.335Z · LW · GW

Hmm, at this point it might be just a difference of personalities, but to me what you're saying sounds like "if you don't eat, you can't get good poisoning". "Dual identity" doesn't work for me, I feel that social connections are meaningless if I can't be upfront about myself.

That's probably a good part of it. I have no problem hiding a good chunk of my thoughts and views from people I don't completely trust, and for most practical intents and purposes I'm quite a bit more "myself" online than IRL.

But in any case there will many subnetworks in the network. Even if everyone adopt the "village" model, there will be many such villages.

I think that's easier said than done, and that a great effort needs to be made to deal with effects that come with having redundancy amongst villages/networks. Off the top of my head, you need to ward against having one of the communities implode after their best members leave for another:

Many of our best and brightest leave, hollowing out and devastating their local communities, to move to Berkeley, to join what they think of as The Rationalist Community.

Likewise, even if you do keep redundancy in rationalist communities, you need to ensure that there's a mechanism that prevents them from seeing each other as out-groups or attacking each other when they do. This is especially important since one group viewing the other as their out-group, but not vice versa can lead to the group with the larger in-group getting exploited.

Comment by FactorialCode on The Treacherous Path to Rationality · 2020-10-15T02:29:21.262Z · LW · GW

So first of all, I think the dynamics of surrounding offense are tripartite. You have the the party who said something offensive, the party who gets offended, and the party who judges the others involved based on the remark. Furthermore, the reason why simulacra=bad in general is because the underlying truth is irrelevant. Without extra social machinery, there's no way to distinguish between valid criticism and slander. Offense and slander are both symmetric weapons.

This might be another difference of personalities...you can try to come up with a different set of norms that solves the problem. But that can't be Crocker's rules, at least it can't be only Crocker's rules.

I think that's a big part of it. Especially IRL, I've taken quite a few steps over the course of years to mitigate the trust issues you bring up in the first place, and I rely on social circles with norms that mitigate the downsides of Crocker's rules. A good combination of integrity+documentation+choice of allies makes it difficult to criticize someone legitimately. To an extent, I try to make my actions align with the values of the people I associate myself with, I keep good records of what I do, and I check that the people I need either put effort into forming accurate beliefs or won't judge me regardless of how they see me. Then when criticism is levelled against myself and or my group, I can usually challenge it by encouraging relevant third parties to look more closely at the underlying reality, usually by directly arguing against what was stated. That way I can ward off a lot of criticism without compromising as much on truth seeking, provided there isn't a sea change in the values of my peers. This has the added benefit that it allows me and my peers to hold each other accountable to take actions that promote each others values.

The other thing I'm doing that is both far easier to pull off and way more effective, is just to be anonymous. When the judging party can't retaliate because they don't know you IRL and the people calling the shots on the site respect privacy and have very permissive posting norms, who cares what people say about you? You can take and dish out all the criticism you want and the only consequence is that you'll need to sort through the crap to find the constructive/actionable/accurate stuff. (Although crap criticism can easily be a serious problem in and of itself.)

Comment by FactorialCode on The Treacherous Path to Rationality · 2020-10-11T14:03:30.963Z · LW · GW

I'm breaking this into a separate thread since I think it's a separate topic.

Second, specifically regarding Crocker's rules, I'm not their fan at all. I think that you can be honest and tactful at the same time, and it's reasonable to expect the same from other people.

So I disagree. Obviously you can't impose Croker's rules on others, but I find it much easier and far less mentally taxing to communicate with people I don't expect to get offended. Likewise, I've gained a great deal of benefit from people very straightforwardly and bluntly calling me out when I'm dropping the ball, and I don't think they would have bothered otherwise since there was no obvious way to be tactful about it. I also think that there are individuals out there that are both smart and easily offended, and with those individuals tact isn't really an option as they can transparently see what you're trying to say, and will take issue with it anyways.

I can see the value of "getting offended" when everyone is sorta operating on simulacra level 3 and factual statements are actually group policy bids. However, when it comes to forming accurate beliefs, "getting offended" strikes me as counter productive, and I do my best to operate in a mode where I don't do it, which is basically Croker's rules.

Comment by FactorialCode on The Treacherous Path to Rationality · 2020-10-11T13:45:36.110Z · LW · GW

First, when Jacob wrote "join the tribe", I don't think ey had anything as specific as a rationalist village in mind? Your model fits the bill as well, IMO. So what you're saying here doesn't seem like an argument against my objection to Zack's objection to Jacob.

So my objection definitely applies much more to a village than less tightly bound communities, and Jacob could have been referring to anything along that spectrum. But I brought it up because you said:

Moreover, the relationships between them shouldn't be purely impersonal and intellectual. Any group endeavour benefits from emotional connections and mutual support.

This is where the objection begins to apply. The more interdependent the group becomes, the more susceptible it is to the issues I brought up. I don't think it's a big deal in an online community, especially with pseudonyms, but I think we need to be careful when you get to more IRL communities. With a village, treating it like an experiment is good first step, but I'd definitely be in the group that wouldn't join unless explicit thought had been put in to deal with my objections, or the village had been running successfully for long enough that I become convinced I was wrong.

Third, sure, social and economic dependencies can create problems, but what about your social and economic dependencies on non-rationalists? I do agree that dilution is a real danger (if not necessarily an insurmountable one).

So in this case individual rationalists can still be undermined by their social networks, but theres a few reasons this is a more robust model. 1) You can have a dual-identity. In my case most of the people I interact with don't know what a rationalist is, I either introduce someone to the ideas here without referencing this place, or I introduce them to this place after I've vetted them. This makes it harder for social networks to put pressure on you or undermine you. 2) A group failure of rationality is far less likely to occur when doing so requires affecting social networks in New York, SF, Singapore, Northern Canada, Russia, etc., then when you just need to influence in a single social network.

Comment by FactorialCode on Covid 10/8: October Surprise · 2020-10-11T02:56:42.345Z · LW · GW

IMO, F*** or F!#@, I feel like it has more impact that way. Since it means you went out of your way to censor yourself, and it's not just a verbal habit, as would be the case with either fuck or a euphemism.

Comment by FactorialCode on The Treacherous Path to Rationality · 2020-10-11T00:01:14.539Z · LW · GW

I mean some of us made buckets of money off of the chaos, so theres that.

Comment by FactorialCode on The Treacherous Path to Rationality · 2020-10-10T23:38:44.287Z · LW · GW

So full disclosure, I'm on the outskirts of the rationality community looking inwards. My view of the situation is mostly filtered through what I've picked up online rather than in person.

With that said, in my mind the alternative is to keep the community more digital, or something that you go to meetups for, and to take advantage of societies' existing infrastructure for social support and other things. This is not to say we shouldn't have strong norms, the comment box I'm typing this in is reminding me of many of those norms right now. But the overall effect is that rationalists end up more diffuse, with less in common other than the shared desire for whatever it is we happen to be optimizing for. This in contrast to building something more like a rationalist community/village, where we create stronger interpersonal bonds and rely on each other for support.

The reason I say this is because as I understood it, the rationalist (at least the truth seeking side) came out of a generally online culture, where disagreement is (relatively) cheap, and individuals in the group don't have much obvious leverage over one another. That environment seems to have been really good for allowing people to explore and exchange weird ideas, and to follow logic and reason wherever it happens to go. It also allows people to more easily "tell it like it is".

When you create a situation where a group of rats become interdependent socially or economically, most of what I've read and seen indicates that you can gain quite a bit in terms of quality of life and group effectiveness, but I feel it also opens up the door to the kind of "catastrophic social failure" I'd mentioned earlier. Doubly so if the community starts to build up social or economic capital that other agents who don't share the same goals might be interested in.

Comment by FactorialCode on The Treacherous Path to Rationality · 2020-10-10T13:37:31.761Z · LW · GW

Sure, tribes also carry dangers such as death spirals and other toxic dynamics. But the solution isn't disbanding the tribe, that's throwing away the baby with the bathwater.

I think we need to be really careful with this and the dangers of becoming a "tribe" shouldn't be understated w.r.t our goals. In a community focused on promoting explicit reason, it becomes far more difficult to tell apart those who are carrying out social cognition from those who are actually carrying out the explicit reason, since the object level beliefs and their justifications of those doing social cognition and those using explicit reason will be almost identical. Likewise, it becomes much easier to slip back into the social cognition mode of thought while still telling yourself that your still reasoning.

IMO, if we don't take additional precautions, this makes us really vulnerable to the dynamics described here. Doubly so the second we begin to rack up any kind of power, influence or status. Initially everything looks good and everyone around you seems to be making their way along The Path^T^M. But slowly you build up a mass of people who all agree with you on the object level but who acquired their conclusions and justifications by following social cues. Once the group reaches critical mass, you might get into a disagreement with a high status individual or group, and instead of using reason and letting the chips fall where they may, standard human tribal coordination mechanisms are used to strip you of your power and status. Then you're expelled from the tribe. From there whatever mission the tribe had is quickly lost to the usual status games.

Personally, I haven't seen much discussion of mechanisms for preventing this and other failure modes, so I'm skeptical of associating myself or supporting any IRL "rationalist community/village".

Comment by FactorialCode on Open Communication in the Days of Malicious Online Actors · 2020-10-08T13:14:53.697Z · LW · GW

Another option not discussed is to control who your message reaches in the first place, and in what medium. I'll claim, without proof or citation, that social media sites like twitter are cesspits that are effectively engineered to prevent constructive conversation and to exploit emotions to keep people on the website. Given that, a choice that can mitigate these kind of situations is to not engage with these social media platforms in the first place. Post your messages on a blog under your own control or a social media platform that isn't designed to hijack your reward circuitry.

Comment by FactorialCode on Open Communication in the Days of Malicious Online Actors · 2020-10-08T13:02:35.942Z · LW · GW

I think you're missing an option, though. You can specifically disavow and oppose the malicious actions/actors, and point out that they are not part of your cause, and are actively hurting it. No censorship, just clarity that this hurts you and the cause. Depending on your knowledge of the perpetrators and the crimes, backing this up by turning them or actively thwarting them may be in scope as well.

There is a practical issue with this solution in the era of modern social media. Suppose you have malicious actors who go on to act in your name, but you never would have associated yourself with them under normal circumstances because they don't represent your values. If you tell them to stand down or condemn them, then you've associated yourself with them, and that condemnation can be used against you.

Comment by FactorialCode on Brainstorming positive visions of AI · 2020-10-08T01:55:13.274Z · LW · GW

Assuming we can solve the relevant ethical dilemmas. There is exactly one thing I want:

Waifus. Sentient optimized and personalized romantic partners.

Comment by FactorialCode on Honoring Petrov Day on LessWrong, in 2020 · 2020-09-28T19:52:54.713Z · LW · GW

I haven't actually figured that out yet, but several people in this thread have proposed takeaways. I'm leaning towards "social engineering is unreasonably effective". That or something related to keeping a security mindset.

Comment by FactorialCode on Honoring Petrov Day on LessWrong, in 2020 · 2020-09-28T06:56:33.501Z · LW · GW

I personally feel that the fact that it was such an effortless attempt makes it more impressive, and really hammers home the lesson we need to take away from this. It's one thing to put in a great deal of effort to defeat some defences. It's another to completely smash through them with the flick of a wrist.

Comment by FactorialCode on Honoring Petrov Day on LessWrong, in 2020 · 2020-09-26T18:10:45.539Z · LW · GW

Props to whoever petrov_day_admin_account was for successfully red-teaming lesswrong.

Comment by FactorialCode on The rationalist community's location problem · 2020-09-26T06:01:10.766Z · LW · GW

As much as I hate to say it, I don't think that it makes much sense for the main hub of the rationalist movement to move away from Berkeley and the Bay Area. There are several rationalist adjacent organizations that are firmly planted in Berkely. The ones that are most salient to me are the AI and AI safety orgs. You have OpenAI, MIRI, CHAI, BAIR, etc. Some of these could participate in a coordinated move, but others are effectively locked in place due to their tight connections with larger institutions.

I think that more creative options need to be brainstormed and explored to deal with the situation in Berkley.

Comment by FactorialCode on The rationalist community's location problem · 2020-09-26T05:25:36.408Z · LW · GW

Ehh, Singapore is a good place to do business and live temporarily. But mandatory military service for all male citizens and second gen permanent residents, along with the work culture make it unsuitable as a permanent location to live. Not to mention that there's a massive culture gap between the rats and the Singaporeans.

Comment by FactorialCode on AI Advantages [Gems from the Wiki] · 2020-09-23T06:42:21.306Z · LW · GW

I think the cooperative advantages mentioned here have really been overlooked when it comes to forecasting AI impacts, especially in slow takeoff scenarios. A lot of forecasts, like what WFLL, mainly posit AI's competing with each other. Consequently Molochian dynamics come into play and humans easily lose control of the future. But with these sorts of cooperative advantages, AIs are in an excellent position to not be subject to those forces and all the strategic disadvantages they bring with them. This applies even if an AI is "merely" at the human level. I could easily see an outcome that from a human perspective looks like a singleton taking over, but is in reality a collective of similar/identical AI's working together with superhuman coordination capabilities.

I'll also add source-code-swapping and greater transparency to the list of cooperative advantages at an AI's disposal. Different AIs that would normally get stuck in a multipolar traps might not stay stuck for long if they can do things analogous to source code swap prisoners dilemmas.

Comment by FactorialCode on Where is human level on text prediction? (GPTs task) · 2020-09-21T08:55:50.665Z · LW · GW

Just use bleeding edge tech to analyze ancient knowledge from the god of information theory himself.

This paper seems to be a good summary and puts a lower bound on entropy of human models of english somewhere between 0.65 and 1.10 BPC. If I had to guess, the real number is probably closer 0.8-1.0 BPC as the mentioned paper was able to pull up the lower bound for hebrew by about 0.2 BPC. Assuming that regular english compresses to an average of 4* tokens per character, GPT-3 clocks in at 1.73/ln(2)/4 = 0.62 BPC. This is lower than the lower bound mentioned in the paper.

So, am I right in thinking that if someone took random internet text and fed it to me word by word and asked me to predict the next word, I'd do about as well as GPT-2 and significantly worse than GPT-3?

That would also be my guess. In terms of data entropy, I think GPT-3 is probably already well into the superhuman realm.

I suspect this is mainly because GPT-3 is much better at modelling "high frequency" patterns and features in text that account for a lot of the entropy, but that humans ignore because they have low mutual information with the things humans care about. OTOH, GPT-3 also has extensive knowledge of pretty much everything, so it might be leveraging that and other things to make better predictions than you.

This is similar to what we see with autoregressive image and audio models, where high frequency features are fairly well modelled, but you need a really strong model to also get the low frequency stuff right.

*(ask Gwern for details, this is the number I got in my own experiments with the tokenizer)

Comment by FactorialCode on Most Prisoner's Dilemmas are Stag Hunts; Most Stag Hunts are Schelling Problems · 2020-09-15T14:49:11.270Z · LW · GW

I'm OOTL, can someone send me a couple links that explain the game theory that's being referenced when talking about a "battle of the sexes"? I have a vague intuition from the name alone, but I feel this is referencing a post I haven't read.

Edit: https://en.wikipedia.org/wiki/Battle_of_the_sexes_(game_theory)

Comment by FactorialCode on How much can surgical masks help with wildfire smoke? · 2020-08-21T16:09:19.721Z · LW · GW

I'm gonna go with barely, if at all. When you wear a surgical mask and you breath in, a lot of air flows in from the edges, without actually passing through the mask, so the mask doesn't have very good opportunity to filter the air. At least with N95 and N99 mask, you have a seal around your face, and this forces the air through the filter. Your probably better off wearing a wet bandana or towel that's been tied in such a way as to seal around your face, but that might make it hard to breath.

I found this, which suggests that they're generally ineffective. https://www.cdph.ca.gov/Programs/EPO/Pages/Wildfire Pages/N95-Respirators-FAQs.aspx

Comment by FactorialCode on Money creation and debt · 2020-08-13T06:14:02.028Z · LW · GW

Yeah, I'll second the caution to draw any conclusions from this. Especially because this is macroeconomics.

Comment by FactorialCode on Money creation and debt · 2020-08-12T22:00:44.591Z · LW · GW

https://en.wikipedia.org/wiki/Sectoral_balances

It is my understanding that this is broadly correct. It is also my understanding that this is not common knowledge.

Comment by FactorialCode on Generalizing the Power-Seeking Theorems · 2020-07-28T18:44:24.303Z · LW · GW

One hypothesis I have is that even in the situation where there is no goal distribution and the agent has a single goal, subjective uncertainty makes powerful states instrumentally convergent. The motivating real world analogy being that you are better able to deal with unforeseen circumstances when you have more money.

Comment by FactorialCode on Open & Welcome Thread - July 2020 · 2020-07-25T06:20:42.958Z · LW · GW

I've gone through a similar phase. In my experience you eventually come to terms with those risks and they stop bothering you. That being said, mitigating x and s-risks has become one of my top priorities. I now spend a great deal of my own time and resources on the task.

I also found learning to meditate helps with general anxiety and accelerates the process of coming to terms with the possibility of terrible outcomes.

Comment by FactorialCode on Alignment As A Bottleneck To Usefulness Of GPT-3 · 2020-07-23T00:38:26.770Z · LW · GW

The way I was envisioning it is that if you had some easily identifiable concept in one model, e.g. a latent dimension/feature that corresponds to the log odd of something being in a picture, you would train the model to match the behaviour of that feature when given data from the original generative model. Theoretically any loss function will do as long as the optimum corresponds to the situation where your "classifier" behaves exactly like the original feature in the old model when both of them are looking at the same data.

In practice though, we're compute bound and nothing is perfect and so you need to answer other questions to determine the objective. Most of them will be related to why you need to be able to point at the original concept of interest in the first place. The acceptability of misclassifying any given input or world-state as being or not being an example of the category of interest is going to depend heavily on things like the cost of false positives/negatives and exactly which situations get misclassified by the model.

The thing about it working or not working is a good point though, and how to know that we've successfully mapped a concept would require a degree of testing, and possibly human judgement. You could do this by looking for situations where the new and old concepts don't line up, and seeing what inputs/world states those correspond to, possibly interpreted through the old model with more human understandable concepts.

I will admit upon further reflection that the process I'm describing is hacky, but I'm relatively confident that the general idea would be a good approach to cross-model ontology identification.

Comment by FactorialCode on Alignment As A Bottleneck To Usefulness Of GPT-3 · 2020-07-22T22:52:22.304Z · LW · GW

I think you can loosen (b) quite a bit if you task a separate model with "delineating" the concept in the new network. The procedure does effectively give you access to infinite data, so the boundary for the old concept in the new model can be as complicated as your compute budget allows. Up to and including identifying high level concepts in low level physics simulations.

Comment by FactorialCode on Alignment As A Bottleneck To Usefulness Of GPT-3 · 2020-07-22T21:31:35.025Z · LW · GW

I think the eventual solution here (and a major technical problem of alignment) is to take an internal notion learned by one model (i.e. found via introspection tools), back out a universal representation of the real-world pattern it represents, then match that real-world pattern against the internals of a different model in order to find the "corresponding" internal notion.

Can't you just run the model in a generative mode associated with that internal notion, then feed that output as a set of observations into your new model and see what lights up in it's mind? This should work as long as both models predict the same input modality. I could see this working pretty well for matching up concepts between the latent spaces of different VAEs. Doing this might be a bit less obvious in the case of autoregressive models, but certainly not impossible.

Comment by FactorialCode on $1000 bounty for OpenAI to show whether GPT3 was "deliberately" pretending to be stupider than it is · 2020-07-22T06:06:27.129Z · LW · GW

I think this is pretty straight forward to test. GPT-3 gives joint probabilities of string continuations given context strings.

Step 1: Give it 2 promps, one suggesting that it is playing the role of a smart person, and one where it is playing the roll of a dumb person.

Step 2: Ask the "person" a question that demonstrates that persons intelligence. (something like a math problem or otherwise)

Step 2: Write continuations where the person answers correctly and incorrectly

Step 3: Compare the relative probabilities GPT-3 assigns to each continuation given the promps and questions.

If GPT-3 is sandbagging itself, it will assign a notably higher probability to the correct answer when conditioned on the smart person prompt than when conditioned on the dumb person prompt. If it's not, it will give similar probabilities in both cases.

Step 4: Repeat the experiment with problems of increasing difficulty and plot the relative probability gap. This will show the limits of GPT-3's reflexive intelligence. (I say reflexive because it can be instructed to solve problems it otherwise couldn't with the amount of serial computations at it's disposal by carrying out an algorithm as part of its output, as is the case with parity)

This is an easy $1000 for anyone who has access to the beta API.

Comment by FactorialCode on Collection of GPT-3 results · 2020-07-19T04:24:19.149Z · LW · GW

Hypothesis: Unlike the language models before it and ignoring context length issues, GPT-3's primary limitation is that it's output mirrors the distribution it was trained on. Without further intervention, it will write things that are no more coherent than the average person could put together. By conditioning it on output from smart people, GPT-3 can be switched into a mode where it outputs smart text.

Comment by FactorialCode on Collection of GPT-3 results · 2020-07-19T02:11:12.976Z · LW · GW

According to Gwern, it fails the Parity Task.

Comment by FactorialCode on The New Frontpage Design & Opening Tag Creation! · 2020-07-09T18:00:58.395Z · LW · GW

Huh.

I did not believe you so went and checked the internet archive. Sure enough, all the old posts with a ToC are off center. I did not notice until now.

Comment by FactorialCode on AI Research Considerations for Human Existential Safety (ARCHES) · 2020-07-09T16:59:18.789Z · LW · GW

Nitpick, is there a reason why the margins are so large?

Comment by FactorialCode on The New Frontpage Design & Opening Tag Creation! · 2020-07-09T16:37:54.235Z · LW · GW

The content on the front page is noticeably off center to the right on 1440x900 monitors.

https://imgur.com/VhPQsv6

Edit: The content is noticeably off center to the right in general.

https://imgur.com/015ewvd

Comment by FactorialCode on What should we do about network-effect monopolies? · 2020-07-07T16:35:44.879Z · LW · GW

On the standardization and interoperability side of things. There's been effort to develop decentralized social media platforms and protocols. Most notably being the various platforms of the Fediverse. Together with opensource software, this let's people build large networks that keep the value of network effects while removing monopoly power. I really like the idea of these platforms, but due to the network monopoly of existing social media platforms I think they'll have great difficulty gaining traction.

Comment by FactorialCode on [Crowdfunding] LessWrong podcast · 2020-07-06T06:19:02.744Z · LW · GW

Yeah, that's pretty pricy. Google is telling me that they can do 1 million characters/month for free using a wavenet. That might be good enough.

Comment by FactorialCode on [Crowdfunding] LessWrong podcast · 2020-07-05T15:35:30.813Z · LW · GW

What's the going rate for audio recordings on Fiverr?

Comment by FactorialCode on FactorialCode's Shortform · 2020-06-23T19:18:13.138Z · LW · GW

With the ongoing drama that is currently taking place. I'm worried that the rationalist community will find itself inadvertently caught up in the culture war. This might cause a large influx of new users who are more interested in debating politics than anything else on LW.

It might be a good idea to put a temporary moratorium/barriers on new signups to the site in the event that things become particularly heated.

Comment by FactorialCode on SlateStarCodex deleted because NYT wants to dox Scott · 2020-06-23T16:34:19.058Z · LW · GW

Organizations, and entire nations for that matter, can absolutely be made to "feel fear". The retaliation just needs to be sufficiently expensive for the organization. Afterwards, it'll factor in the costs of that retaliation when deciding how to act. If the cost is large enough, it won't do things that will trigger retaliation.

Comment by FactorialCode on Image GPT · 2020-06-21T18:05:50.613Z · LW · GW

There is no guarantee that it is learning particularly useful representations just because it predicts pixel-by-pixel well which may be distributed throughout the GPT,

Personally, I felt that that wasn't really surprising either. Remember that this whole deep learning thing started with exactly what OpenAI just did. Train a generative model of the data, and then fine tune it to the relevant task.

However, I'll admit that the fact that theres an optimal layer to tap into, and that they showed that this trick works specifically with transformer autoregressive models is novel to my knowledge.