Comments

Comment by muggingblaise on Against Almost Every Theory of Impact of Interpretability · 2023-08-19T19:03:26.480Z · LW · GW

Emulating GPT-4 using LLMs like GPT-3 as different submodules that send messages written in plain English to each other before outputting the next token. If the neural network had deceptive thoughts, we could see them in these intermediate messages.

This doesn't account for the possibility that there's still steganography involved. Plain English coming from an LLM may not be so plain, given:

33. Alien Concepts: “The AI does not think like you do” There may not necessarily be a humanly understandable explanation for cognition done by crunching numbers through matrix products.

Considering that current language models can create their own "language" to communicate with each other without context (hit or miss, admittedly), who's to say a deceptive model couldn't find a way to hide misaligned thoughts in human language, like puzzles that spell a message using the first letter of every fourth word in a sentence? There could be some arbitrarily complicated algorithm (e.g., https://twitter.com/robertskmiles/status/1663534255249453056) to hide the subversive message in the "plain English" statement.
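
To make the "every fourth word" puzzle concrete, here is a minimal Python sketch of how a reader who knows the trick would decode it. The function name `extract_hidden`, the `step` parameter, and the cover sentence are all made up for illustration; nothing here is taken from the post or the linked tweet, and a real scheme could be far harder to spot.

```python
import re


def extract_hidden(text: str, step: int = 4) -> str:
    """Join the first letters of every `step`-th word (words step, 2*step, ...).

    Hypothetical decoder for the toy scheme described above; it assumes
    words are runs of letters, optionally containing apostrophes.
    """
    words = re.findall(r"[A-Za-z][A-Za-z']*", text)
    # Slice starts at index step - 1 so that the 4th, 8th, 12th... words are used.
    return "".join(word[0] for word in words[step - 1::step]).lower()


# An innocuous-looking cover sentence (invented for this example).
cover = "We should all remain calm and stay united while we do nothing"
print(extract_hidden(cover))
```

The cover sentence reads as ordinary advice, yet its 4th, 8th, and 12th words ("remain", "united", "nothing") carry a second message. Someone monitoring the intermediate messages would only catch it if they already suspected this exact encoding.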

Comment by muggingblaise on The U.S. is becoming less stable · 2023-08-19T18:11:03.264Z · LW · GW

I think you can see this happening more today. States like Florida and Texas are marketing themselves as bastions for those on the right to escape overly invasive federal/state policies, particularly those tied to the LGBTQ/gender/racial culture war talking points. On the left, you have states like California, Washington, Oregon, and Colorado leading the charge against overly invasive federal/state policies that restrict bodily autonomy with respect to drug usage and now abortion.

I'd like to know why you term this "devolution," as this seems to be the intended modus operandi of the United States. My only hope is that this "devolution" continues further, to the point that local, municipal politics becomes most important. I view this as the fastest way out of the political gridlock we find ourselves in. If Americans can view neighbors as neighbors again, working together to make their communities better for one another, rather than as enemies trying to overthrow the government or prosecute political opponents, I think tensions will ratchet down dramatically.

I'm not sure how we can take steps to change the culture to view local, municipal politics as most important. I think it's a very difficult path. I think community-based news and dialogue are key to start. Local news outlets are disappearing, with corporate consolidation favoring a "cookie-cutter" / "assembly line" approach to journalism that prioritizes national news stories. On top of this, I believe social media timelines (particularly Reddit, Twitter, and Facebook) lead users to engage with national or international news stories, as those are most likely to go viral (they're "bigger" or "more impactful"). This idea that these stories are "more impactful" is a red herring, in my opinion. They really don't have an impact on one's day-to-day life quite like the story about what new zoning laws your city council discussed in last Thursday's meeting.