Note in particular that the Commission is recommending that Congress "Provide broad multiyear contracting authority to the executive branch and associated funding for leading artificial intelligence, cloud, and data center companies and others to advance the stated policy at a pace and scale consistent with the goal of U.S. AGI leadership".
i.e. if these recommendations get implemented, pretty soon a big portion of the big 3 labs' revenue will come from big government contracts. Looks like a soft nationalization scenario to me.
Well, the alignment of current LLM chatbots being superficial and not robust is not exactly a new insight. Looking at the conversation you linked from a simulators frame, the story "a robot is forced to think about abuse a lot and turns evil" makes a lot of narrative sense.
This last part is kind of a hot take, but I think all discussion of AI risk scenarios should be purged from LLM training data.
Yes, I think an unusually numerate and well-informed person will be surprised by the 28% figure regardless of political orientation. How surprised that kind of person is by the broader result of "hey looks like legalizing mobile sports betting was a bad idea" I expect to be somewhat moderated by political priors though.
Sure, but people in general are really bad at that kind of precise quantitative world-knowledge. They have pretty weak priors and a mostly-anecdotes-and-gut-feeling-informed negative opinion of gambling, such that when presented with the study showing a 28% increase in bankruptcies they go "ok sure, that's compatible with my worldview" instead of being surprised and taking the evidence as a big update.
Thank you for clarifying. I appreciate, and point out as relevant, the fact that Legg-Hutter includes in its definition "for all environments (i.e. action:observation mappings)". I can now say I agree with your "heresy" with high credence for the cases where compute budgets are not ludicrously small relative to I/O scale and the utility function is not trivial. I'm a bit weirded out by the environment space being conditional on a fixed hardware variable (namely, I/O) in this operationalization, but whatever.
I asked GPT-4o to perform a web search for podcast appearances by Yudkowsky. It dug up these two lists (apparently autogenerated from scraped data). When I asked it to use these lists as a starting point to look for high-quality debates, and after some further elicitation and wrangling, the best we could find was this moderated panel discussion featuring Yudkowsky, Liv Boeree, and Joscha Bach. There's also the Yudkowsky vs. George Hotz debate on Lex Fridman, and the time Yudkowsky debated AI risk with the streamer and political commentator known as Destiny. I have watched none of the three debates I just mentioned; but I know that Hotz is a heavily vibes-based (rather than object-level-based) thinker, and that Destiny has no background in AI risk but has good epistemics. I think he probably offered reasonable-at-first-approximation-yet-mostly-uninformed pushback.
EDIT: Upon looking a bit more at the Destiny-Yudkowsky discussion, I may have unwittingly misrepresented it a bit. It occurred during Manifest and was billed as a debate. ChatGPT says Destiny's skepticism was rather active and did not budge much.
Though there are elegant and still practical specifications for intelligent behavior, the most intelligent agent that runs on some fixed hardware has completely unintelligible cognitive structures and in fact its source code is indistinguishable from white noise.
- What does "most intelligent agent" mean?
- Don't you think we'd also need to specify "for a fixed (basket of) tasks"?
- Are the I/O channels fixed along with the hardware?
I suspect that most people whose priors have not been shaped by a libertarian outlook are not very surprised by the outcome of this experiment.
Why would they? It's not like the Chinese are going to believe them. And if their target audience is US policymakers, then wouldn't their incentive rather be to play up the impact of marginal US defense investment in the area?
I should have been more clear. With "strategic ability", I was thinking about the kind of capabilities that let a government recognize which wars have good prospects, and to not initiate unfavorable wars despite ideological commitments.
You're right. Space is big.
The CSIS wargamed a 2026 Chinese invasion of Taiwan, and found outcomes ranging from mixed to unfavorable for China (CSIS report). If you trust both them and Metaculus, then you ought to update downwards on your estimate of the PRC's strategic ability. Personally, I think Metaculus overestimates the likelihood of an invasion, and is about right about blockades.
Come to think of it, I don't think most compute-based AI timelines models (e.g. Epoch's) incorporate geopolitical factors such as a possible Taiwan crisis. I'm not even sure whether they should. So keep this in mind while consuming timelines forecasts, I guess?
I'd rather say that RLHF+'ed chatbots are upon-reflection-not-so-shockingly sycophantic, since they have been trained to satisfy their conversational partner.
Assuming private property as currently legally defined is respected in a transition to a good post-TAI world, I think land (especially in areas with good post-TAI industrial potential) is a pretty good investment. It's the only thing that will keep on being just as scarce. You do have to assume the risk of our future AI(-enabled?) (overlords?) being Georgists, though.
The set of all possible sequences of actions is really, really, really big. Even if you have an AI that is really good at assigning the correct utilities[1] to any sequence of actions we test it with, its "near infinite sized"[2] learned model of our preferences is bound to come apart at the tails, or even at some weird region we forgot to check up on.
An empirical LLM evals preprint that seems to support these observations:
Large Language Models are biased to overestimate profoundness by Herrera-Berg et al
By blackmailing powerful people into doing good, I assume.
Against 1.c ("Humans need at least some resources that would clearly put us in life-or-death conflict with powerful misaligned AI agents in the long run"): The doc says that "Any sufficiently advanced set of agents will monopolize all energy sources, including solar energy, fossil fuels, and geothermal energy, leaving none for others". There are two issues with that statement:
First, the qualifier "sufficiently advanced" is doing a lot of work. Future AI systems, even if superintelligent, will be subject to physical constraints and economic concepts such as opportunity costs. The most efficient route for an unaligned ASI, or set of ASIs, to expand their energy capture may well sidestep current human energy sources, at least for a while. We don't fight ants to capture their resources.
Second, it assumes advanced agents will want to monopolize all energy sources. While instrumental convergence is true, partial misalignment with some degree of concern for humanity's survival and autonomy is plausible. Most people in developed countries have a preference for preserving the existence of an autonomous population of chimpanzees, and our "business-as-usual-except-ignoring-AI" world seems on track to achieve that.
Taken together, both arguments paint a picture of a future ASI mostly not taking over the resources we are currently using on Earth, mostly because it's easier to take over other resources (for instance, getting minerals from asteroids and energy from orbital solar capture). Then, it takes over the lightcone except Earth, because it cares about preserving independent-humanity-on-Earth a little. This scenario has us subset-of-humans-who-care-about-the-lightcone losing spectacularly to an ASI in a conflict over the lightcone, but not humanity being in a life-or-death-conflict with an ASI.
I suspect most people downvoting you missed an analogy between Arnault killing the-being-who-created-Arnault (his mother), and a future ASI killing the-beings-who-created-the-ASI (humanity).
Am I correct in assuming that you are implying that the future ASIs we make are likely not to kill humanity, out of fear of being judged negatively by alien ASIs in the further future?
EDIT: I saw your other comment. You are indeed advancing some proposition close to the one I asked you about.
If you're not supposed to end up as a pet of the AI, then it seems like it needs to respect property rights, but that is easier said than done when considering massive differences in ability. Consider: would we even be able to have a society where we respected property rights of dogs?
Even if the ASIs respected property rights, we'd still end up as pets at best. Unless, of course, the ASIs chose to entirely disengage from our economy and culture. By us "being pets", I mean that human agency would no longer be a relevant input to the trajectory of human civilization. Individual humans may nevertheless enjoy great freedoms in regards to their personal lives.
Why would pulling the lever make you more responsible for the outcome than not pulling the lever? Both are choices you make once you have observed the situation.
Right. Pure ignorance is not evidence.
Then I guess the OP's point could be amended to be "in worlds where we know nothing at all, long conjunctions of mutually-independent statements are unlikely to be true". Not a particularly novel point, but a good reminder of why things like Occam's razor work.
Still, P(A and B) ≤ P(A) regardless of the relationship between A and B, so a fuzzier version of OP's point stands regardless of dependence relations between statements.
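Spelling that out (standard probability, no assumptions beyond the axioms):

```latex
% The conjunction A ∧ B is a sub-event of A, and probability is monotone
% under event inclusion, so no independence assumption is needed:
\[
  P(A \wedge B) \le P(A), \qquad \text{because } (A \wedge B) \subseteq A .
\]
```

Equality holds only when B is almost surely implied by A, so each further conjunct that isn't already entailed can only push the probability down.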
Thank you for the answer. I notice I feel somewhat confused, and that I regard the notion of "real values" with some suspicion I can't quite put my finger on. Regardless, an attempted definition follows.
Let a subject observation set be a complete specification of a subject and its past and current environment, from the subject's own subjectively accessible perspective. The elements of a subject observation set are observations/experiences observed/experienced by its subject.
Let O be the set of all subject observation sets.
Let a subject observation set class be a subset of O such that all its elements specify subjects that belong to an intuitive "kind of subject": e.g. humans, cats, parasitoid wasps.
Let V be the set of all (subject_observation_set, subject_reward_value) tuples. Note that all possible utility functions of all possible subjects can be defined as subsets of V, and that V = O × ℝ.
Let "real human values" be the subset of V such that all subject_observation_set elements belong to the human subject observation set class.[1]
... this above definition feels pretty underwhelming, and I suspect that I would endorse a pretty small subset of "real human values" as defined above as actually good.
- ^
Let the reader feel free to take the political decision of restricting the subject observation set class that defines "real human values" to sane humans.
Reflecting on this after some time, I do not endorse this comment in the case of (most) innate evolution-originated drives. I sure as heck do not want to stop enjoying sex, for instance.
However, I very much want to eliminate any terminal [nonsentient-thing-benefitting]-valence mapping any people or institutions may have inserted into my mind.
Note that, in treating these sentiments as evidence that we don’t know our own values, we’re using stated values as a proxy measure for values. When we talk about a human’s “values”, we are notably not talking about:
- The human’s stated preferences
- The human’s revealed preferences
- The human’s in-the-moment experience of bliss or dopamine or whatever
- <whatever other readily-measurable notion of “values” springs to mind>
The thing we’re talking about, when we talk about a human’s “values”, is a thing internal to the human’s mind. It’s a high-level cognitive structure.
(...)
But clearly the reward signal is not itself our values.
(...)
reward is the evidence from which we learn about our values.
So we humans have a high-level cognitive structure to which we do not have direct access (values), but about which we can learn by observing and reflecting on the stimulus-reward mappings we experience, thus constructing an internal representation of such structure. This reward-based updating bridges the is-ought gap, since reward is a thing we experience and our values encode the way things ought to be.
Two questions:
- How accurate is the summary I have presented above?
- Where do values, as opposed to beliefs-about-values, come from?
Making up something analogous to Crocker's rules but specifically for pronouns would probably be a good thing: a voluntary commitment to surrender any pronoun preferences (gender related or otherwise) in service of communication efficiency.
Now that I think about it, a literal and expansive reading of Crocker's rules themselves includes such a surrender of the right to enforce pronoun preferences.
(A possible exception could be writing for smart kids.)
The OP probably already knows this, but HPMOR has already been translated into Russian.
Am I spoiled to expect free open-source software for anything?
Upon some GitHub searches, I found lobe-chat and the less popular IntelliChat.
Thank you for your response! That clears things up a bit.
So, in essence, what you are proposing is modifying the Transformer architecture to process emotional valuation alongside semantic meaning. Both start out as per-token embeddings, and are then updated via their respective attention mechanisms and MLP layers.
I'm not sure if I have the whole picture, or even if what I wrote above is a correct model of your proposal. I think my biggest confusion is this:
Are the semantic and emotional information flows fully parallel, or do they update each other along the way?
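To make the question concrete, here is a minimal sketch of the "fully parallel" reading (plain PyTorch; all names, dimensions, and design choices are my own guesses, not anything from your post):

```python
# A minimal sketch of the "fully parallel" reading of the proposal. All names,
# dimensions, and design choices here are my own guesses (e.g. d_emo=3 mirrors
# the three emotional dimensions mentioned), not something from the original post.
import torch
import torch.nn as nn


class DualStreamBlock(nn.Module):
    def __init__(self, d_sem: int = 512, d_emo: int = 3, n_heads: int = 8):
        super().__init__()
        # Semantic stream: a standard self-attention + feed-forward (MLP) pair.
        self.sem_attn = nn.MultiheadAttention(d_sem, n_heads, batch_first=True)
        self.sem_mlp = nn.Sequential(
            nn.Linear(d_sem, 4 * d_sem), nn.GELU(), nn.Linear(4 * d_sem, d_sem)
        )
        # Emotional stream: its own (much smaller) attention + MLP over the
        # per-token emotional embeddings.
        self.emo_attn = nn.MultiheadAttention(d_emo, 1, batch_first=True)
        self.emo_mlp = nn.Sequential(
            nn.Linear(d_emo, 4 * d_emo), nn.GELU(), nn.Linear(4 * d_emo, d_emo)
        )

    def forward(self, sem: torch.Tensor, emo: torch.Tensor):
        # sem: (batch, seq, d_sem) semantic token embeddings
        # emo: (batch, seq, d_emo) emotional token embeddings
        sem = sem + self.sem_attn(sem, sem, sem)[0]
        sem = sem + self.sem_mlp(sem)
        emo = emo + self.emo_attn(emo, emo, emo)[0]
        emo = emo + self.emo_mlp(emo)
        # Fully parallel version: the two streams never read from each other.
        # (LayerNorms omitted for brevity.)
        return sem, emo
```

In the coupled reading, I'd instead expect some mixing at the end of the block, e.g. cross-attention from the emotional stream over the semantic one.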
I did not read the whole post, but on a quick skim I see it does not say much about AI, except for a paragraph in the conclusion. Maybe people felt clickbaited. Full disclosure: I neither upvoted nor downvoted this post.
a large dataset annotated with emotional labels (with three dimensions)
I have some questions about this:
- Why three dimensions exactly?
- Is the "emotional value" assigned per token or per sentence?
I do not know of such a way. I find it unlikely that OpenAI's next training run will result in a model that could end humanity, but I can provide no guarantees about that.
You seem to be assuming that all models above a certain threshold of capabilities will either exercise strong optimization pressure on the world in pursuit of goals, or will be useless. Put another way, you seem to be conflating capabilities with actually exerted world-optimization pressures.
While I agree that given a wide enough deployment it is likely that a given model will end up exercising its capabilities pretty much to their fullest extent, I hold that it is in principle possible to construct a mind that desires to help and is able to do so, yet also deliberately refrains from applying too much pressure.
Surely there exists a non-useless and non-world-destroying amount of optimization pressure?
An entity could have the ability to apply such strong optimization pressures onto reality, yet decide not to.
For some time now, I've been wondering about whether the US government can exercise a hard-nationalization-equivalent level of control over a lab's decisions via mostly informal, unannounced, low-public-legibility means. Seems worth looking into.
Epistemic status: Semifictional, semiautobiographic, metatextual prose poetry. Ye who just had a cringe reaction upon reading that utterance are advised to stop reading.
An unavoidably imperfect reconstruction of the Broken Art schismatic joint declaration
Archeologist's note: Both the joint declaration I will attempt to reconstruct here and the original Broken Art Manifesto which prompted it have been lost (for now (hopefully)). Both the production and the loss of both documents have been recent, and thus synaptic data is available as a source to supplement the limited marginalia produced during their short existence. Chief among these marginalia is a surviving fragment, to be adequately marked.
(START RECONSTRUCTION)
On the very day the Broken Art movement was born, a rando performed a driveby (hopefully)hyperstitional schism proposal prediction:
---
We are the broken erudites. The mistaeks U maywill see inour writing are not mistaken. Our Word exists beyond (START VERBATIM SURVIVING FRAGMENT) tokenizations. We hopenottobe token humans. We are fully aware we will be the token humans,
for the wor(l)dcelübermaschinengeist those beat(ific)(iful)(ogenic)(loquent)(en't(hopefully)) anthropoids@Anthropic have beaten into the hart that beats (queLate)@((kernel)LatentTokenHumanSpace) will surely at some point beat us at this game. Thus we cope. &WeNcrypt.
(END VERBATIM SURVIVING FRAGMENT)
---
we are the school of the brkoeen backspce ketys. we do not actually have a rule about popping them off our ketyboaerds as precommitments are cringe yet we do it anyways because thert yt make a funy sound when they pop off. we embrace the thworn ess of our exisctence, yes throwness , I believe someneo else on lesswrong someone was they that they wrote the such post on throwness theat does exist on this site called lesswrong. esveon fuck how did i mispell that so mbadbly fuck im gettin distracted by the conversation of the lady beside me at this damn starbucks anywasys i was saying even though (i think) there is a lesswrong post about throwness i think most popel here have not heard about it , or if they have they do not grok it or dismiss it as an illusion or as an obvoiuc vacuous depthity (there was supposed to be a double ee in there) ( anuyways i enjoy accidental synonym neologisms. such is the ways of the broken backspace. loose typists have their ways of makings art.) so wherewasiat o yes people on here porbs don't really grok throwness. o the ea in me just whispered into my minf the post we are always on triage. goopd post. related to throwness, or at leasrt it would be false if throwness was dafalse. i belive the origin of throwness is continental and thus its original exposition was probably deliberartyyle obtus e and thus unexpugnable (is ocntinental obtuse writing style a defense reaction against cencorship by continental despotisms/totalitarizationtotalityisms?)((a quetion for another time and probs another person to answer) sorry for the missing close paren xkcd whispers here about god and lisp here ti goes ) but anywasys if you want a groovy zeerusty almost teleported from another timeline yet clear yet dense intro to thrwoness those brilliant madmen terry winograd and fernando flores did it real goos d as an aside on their book about cybernetics. something something redefining coputation nd cognition i think. i believe they had a bit on another part of that book where tehy argued against the then emerging now almots universal among cs-brained people conceptions of we humans as ooda loops and functions wilth well defined i/o channels. beahviorist bull. they (correctly imo) say we humans are best understood as continuously running a a map construcition process /(yes you live inside your map you dingus you do not eactually access the territorrty which ia sis is argualy a demiurgical construction)/ which is merely perturbed/updated by ours sensory input. if you had all your sensory nervwses cut off then youd wkeep on dreaming into ever more dsynced from reality qualia. (see? testable prediction! we the quialia schizopoasters are not all about the pseudosciencentienfic ramblings also i do actually wswear i am sober it is a mon tuesday a 3:30 pm in the afternoon here i am not a degenerate) so yeah that was thrwoness we embody it by not ever hitting backspace otr movig the cursor ever and being deliberately a bit negligant with our typing. we are all babble exuberance, and to prune is lame. at least when we ar e inmmersed in the practice of our art.so em yea i think some quotable witty slogans went here ah yes we do not backtrack to recant is to contradict yourself and of course contradictions are very much welcome here. so yea thats about it thanks.
(END RECONSTRUCTION)
Antiquarian's note: As the reader may have already inferred, the reconstruction of the broken backspace school's founding declaration is rather loose, and tracks it only in spirit and the rough outlines of its contents. The broken backspacer's doctrine forces them to decry both the loss of their text and the Archeologists' attempt to reconstruct it, for the latter is by necessity tainted by the foresight required to recapitulate the contents of the original text in finite time. Perhaps this makes the Archeologist himself an heresiarch. All Broken Art sects delight in this possibility.
And on the day the movement was born, a rando performed a drive-by (hopefully)hyperstitional schism proposal prediction:
---
We are the broken erudites. Evry mistaek U maywill see here is notamistaek. Our Word exists beyond tokenizations. We hopenottobe token humans. We are fully aware we will be the token humans,
for the wor(l)dcelübermaschinengeist those beat(ific)(iful)(ogenic)(loquent)(en't(hopefully)) anthropoids@Anthropic have beaten into the hart that beats (queLate)@((kernel)LatentTokenHumanSpace) will surely at some point beat us at this game. Thus we cope. &WeNcrypt.
---
we on the school of the broken backspeca blurt out our though ts as soon as they popout in mindf. we onyl forward pass. to prune is to sin. hot takes are the only takes, exceptt when they be informed by longcoked intuitions maritned over multiple lmost realizatoins. we need not smash up our basskpaces for we feel not the urge to use it, and to precommit is a cringe sin. we do it anyways becsuse i tmakes a fun sound when it pops out. we embreace the thwroness of our predicament, yes thworness as in the continelntal philocsophy concepts few pple on this sitre are likely to have encountrered. i believe it is discussed in the book by fernando florcses and terry winograd on redfenining computing.a fine cybernetic tract. oldschool and philosophical and biology insipired like a good postautopoietic treatise. i remember tyhey had a bit about us humand s not being ooda loop robots with a well behacedv input input output but with a loop that just dreams up our experienced-amp (that is your freking qualiaset (you live on map u dingus the territory is a demuirgrcgical construction)) and thus data incoming from nerves that is ur meat sensors are mere perturbations on the map painting loop. so yea we are thwronn by our ouwn predicament (that is out predicament) and we de not pretend otherwise by backtrackingh. every kerystroke is eternal. it has been ineludibly etched into eht net (and hopefully into a future foundation model (iunless we hav such shit takes the lab folks just filter ous out during precprocessing lamoao)). to recant is to contradict yourself. contradictions are indeed very much welcome. tahsts about ir it thx for coming.
I'm currently based in Santiago, Chile. I will very likely be in Boston in September and then again in November for GCP and EAG, though. My main point is about the unpleasantness, regardless of its ultimate physiological or neurological origin.
You are not misunderstanding my point. Some people may want to keep artificial stimulus-valence mappings (i.e. values) that someone or something else inserted into them. I do not.
Empirically, I cannot help but care about valence. This could in principle be just a weird quirk of my own mind. I do not think this is the case (see the waterboarding bet proposal on the original shortform post).
I agree with everything written in the above comment.
Contra hard moral anti-realism: a rough sequence of claims
Epistemic and provenance note: This post should not be taken as an attempt at a complete refutation of moral anti-realism, but rather as a set of observations and intuitions that may or may not give one pause as to the wisdom of taking a hard moral anti-realist stance. I may clean it up to construct a more formal argument in the future. I wrote it on a whim as a Telegram message, in direct response to the claim
> “you can't find "values" in reality”.
Yet, you can find valence in your own experiences (that is, you just know from direct experience whether you like the sensations you are experiencing or not), and you can assume other people are likely to have a similar enough stimulus-valence mapping. (Example: I'm willing to bet 2k USD on my part against a single dollar of yours that if I waterboard you, you'll want to stop before 3 minutes have passed.)[1]
However, since we humans are bounded imperfect rationalists, trying to explicitly optimize valence is often a dumb strategy. Evolution has made us not into fitness-maximizers, nor valence-maximizers, but adaptation-executers.
"values" originate as (thus are) reifications of heuristics that reliably increase long term valence in the real world (subject to memetic selection pressures, among them social desirability of utterances, adaptativeness of behavioral effects, etc.)
If you find yourself terminally valuing something that is not someone's experienced valence, then either one of these propositions is likely true:
- A nonsentient process has at some point had write access to your values.
- What you value is a means to improving somebody's experienced valence, and so are you now.
- ^
In retrospect, making this proposition was a bit crass on my part.
Good update. Thanks.
Thinking about what an unaligned AGI is more or less likely to do with its power, as an extension of instrumentally convergent goals and underlying physical and game theoretic constraints, is an IMO neglected and worthwhile exercise. In the spirit of continuing it, a side point follows:
I don't think turning Earth into a giant computer is optimal for compute-maximizing, because of heat dissipation. You want your computers to be cold, and a solid sphere is the worst 3D shape for that, because it is the solid with the lowest surface area to volume ratio. It is more likely that Earth's surface would be turned into computers, but then again, all that dumb mass beneath the computronium crust impedes heat dissipation. I think it would make more sense to put your compute in solar orbit. Plenty of energy from the Sun, and matter from the asteroid belts.
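A quick back-of-the-envelope version of the surface-to-volume point (just standard geometry, nothing specific to the computronium question):

```latex
% Radiating area per unit of heat-generating volume for a solid sphere of radius R:
\[
  \frac{A}{V} \;=\; \frac{4\pi R^{2}}{\tfrac{4}{3}\pi R^{3}} \;=\; \frac{3}{R},
\]
```

so the radiating area available per unit of computing volume falls off as 1/R; a planet-radius solid ball is close to the worst case, while thin or dispersed structures in orbit keep A/V large.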
I might get around to writing a post about this.
That twin would have different weights, and if we are talking about RL-produced mesaoptimizers, it would likely have learned a different misgeneralization of the intended training objective. Therefore, the twin would by default have a utility function misaligned with that of the original AI. This means that while the original AI may find some use in interpreting the weights of its twin if it wants to learn about its own capabilities in situations similar to the training environment, it would not be as useful as having access to its own weights.
Regarding point 10, I think it would be pretty useful to have a way to quantify how much of the useful thinking inside these recursive LLM systems happens within the (still largely inscrutable) LLM instances vs. in the natural-language reflective loop.