If you are assuming Software works well you are dead

post by Johannes C. Mayer (johannes-c-mayer) · 2024-05-04T12:54:17.675Z · LW · GW · 12 comments

Suno Version

I say this because I can hardly use a computer without constantly getting distracted. Even when I actively try to ignore how bad software is, the suggestions keep coming.

Seriously Obsidian? You could not come up with a system where links to headings can't break? This makes you wonder what is wrong with humanity. But then I remember that humanity is building a god without knowing what they will want.

So for those of you who need to hear this: I feel you. It could be so much better. But right now, can we really afford to make the ultimate <programming language/text editor/window manager/file system/virtual collaborative environment/interface to GPT/...>?

Can we really afford to do this while our god software looks like...

May this find you well.

12 comments

Comments sorted by top scores.

comment by Dagon · 2024-05-04T17:23:16.938Z · LW(p) · GW(p)

Agreed, but it's not just software. It's every complex system, anything which requires detailed coordination of more than a few dozen humans and has efficiency pressure put upon it. Software is the clearest example, because there's so much of it and it feels like it should be easy.

Replies from: Viliam
comment by Viliam · 2024-05-04T22:15:40.010Z · LW(p) · GW(p)

Consider the pressures and incentives. Adding new features can help you sell the software to more users. Fixing bugs... unless the application is practically falling apart, it does not make much of a difference. After all, the bugs will only get noticed by people who already use your application, i.e. they already paid for it.

For the artificial intelligence, I assume the "killer app" will be its integration with SharePoint.

Replies from: Dagon
comment by Dagon · 2024-05-07T03:10:55.347Z · LW(p) · GW(p)

In theory, competition should counteract a lot of those incentives. Since software generally has low marginal costs, the ones with better functionality for passing users should get more market share, and investing in becoming/staying best will be rewarded.

For a lot of it, noise and short-term metrics overwhelm the quality drive, unfortunately. That’s likely because most software is too cheap (because many customers prefer inexpensive crap, so good things don’t get made).

Replies from: Viliam
comment by Viliam · 2024-05-07T06:42:34.964Z · LW(p) · GW(p)

In software, network effects are strong. A solution people are already familiar with has an advantage. A solution integrated with other solutions you already use has an advantage (and it is easier to do the integration when both solutions are made by you). You can further lock the users in by e.g. creating a marketplace where people can sell plugins to your solution. Compared to all of this, things like "nice to use" remain merely wishes.

comment by Ustice · 2024-05-04T13:27:50.338Z · LW(p) · GW(p)

I don’t know about making god software, but human software is a lot of trial and error. I have been writing code for close to 40 years. The best I can do is write automated tests to anticipate the kinds of errors I might get. My imagination just isn’t as strong as reality.

There is provably no way to fully predict how a software system of sufficient complexity will behave. With careful organization it becomes easier to reason about and predict, but unless you are writing provable software (it’s a very slow and complex process, I hear), that’s the best you get.

I feel you on being distracted by software bugs. I’m one of those guys that reports them, or even code change suggestions (GitHub Pull Requests).

Replies from: johannes-c-mayer
comment by Johannes C. Mayer (johannes-c-mayer) · 2024-05-04T14:42:29.848Z · LW(p) · GW(p)

I don’t know about making god software, but human software is a lot of trial and error. I have been writing code for close to 40 years. The best I can do is write automated tests to anticipate the kinds of errors I might get. My imagination just isn’t as strong as reality.

I think it is incorrect to say that testing things fully formally is the only alternative to whatever the heck we are currently doing. I mean there is property-based testing as a first step (which maybe you also refer to with automated tests but I would guess you are probably mainly talking about unit tests).
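To make "property-based testing" concrete, here is a minimal sketch in Python using only the standard library. Instead of checking a few hand-picked inputs, it generates many random lists and checks properties that any correct sort must satisfy. The names `my_sort` and `check_sort_properties` are made up for illustration; a real project would more likely use a dedicated library such as QuickCheck (Haskell) or Hypothesis (Python).

```python
import random
from collections import Counter


def my_sort(xs):
    """Stand-in for the function under test (hypothetical)."""
    return sorted(xs)


def check_sort_properties(sort_fn, trials=200, seed=0):
    """Return the first counterexample found, or None if all properties held."""
    rng = random.Random(seed)  # fixed seed so failures are reproducible
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        out = sort_fn(list(xs))
        # Property 1: the output is in non-decreasing order.
        ordered = all(a <= b for a, b in zip(out, out[1:]))
        # Property 2: the output has exactly the same elements as the input.
        same_items = Counter(out) == Counter(xs)
        if not (ordered and same_items):
            return xs
    return None
```

Calling `check_sort_properties(my_sort)` returns `None` when no counterexample turns up, while a subtly wrong sort (say, one that silently drops duplicates) gets caught without anyone having to think of that case in advance.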

Maybe try Haskell or even better Idris? The Haskell compiler is very annoying until you realize that it loves you. Each time it annoys you with compile errors it actually says "Look I found this error here that I am very very sure you'd agree is an error, so let me not produce this machine code that would do things you don't want it to do".

It's very bad at communicating this though, so its words of love usually are blurted out like this:

Don't bother understanding the details, they are not important.

So maybe Haskell's greatest strength, being a very "noisy" compiler, is also its downfall. Nobody likes being told that they are wrong, well at least not until you understand that your goals and the compiler's goals are actually aligned. And the compiler is just better at thinking about certain kinds of things that are harder for you to think about.

In Haskell, you don't really ever try to prove anything about your program in your program. All of this you get by just using the language normally. You can then go one step further with Agda, Idris2, or Lean, and start to prove things about your programs, which easily can get tedious.

But even then when you have dependent types you can just add a lot more information to your types, which makes the compiler able to help you better. Really we could see it as an improvement to how you can tell the compiler what you want.

But again, you know what you can do in dependent type theory? NOT use dependent type theory! You can use Haskell-style code in Idris whenever that is more convenient.

And by the way, I totally agree that all of these languages I named are probably only ghostly images of what they could truly be. But I guess humanity cares more about making javascript run REALLY REALLY FAST.

And I got to be careful not to go there.

I feel you on being distracted by software bugs. I’m one of those guys that reports them, or even code change suggestions (GitHub Pull Requests).

Yeah, I also do this. My experience so far generally has been very positive. It's really cool when I make an issue with "I would think it would be nice if this program does X", and then have it do X in 1 or 2 weeks.

I don't know where to open an issue, though, saying that I think it would be better not to build a god we don't comprehend. Maybe I haven't looked hard enough.

Replies from: faul_sname
comment by faul_sname · 2024-05-04T19:25:37.905Z · LW(p) · GW(p)

Haskell is a beautiful language, but in my admittedly limited experience it's been quite hard to reason about memory usage in deployed software (which is important because programs run on physical hardware. No matter how beautiful your abstract machine, you will run into issues where the assumptions that abstraction makes don't match reality).

That's not to say more robust programming languages aren't possible. IMO Rust is quite nice, and easily interoperable with a lot of existing code, which is probably a major factor in why it's seeing much higher adoption.

But to echo and build off what @ustice said earlier [LW(p) · GW(p)]:

The hard part of programming isn't writing a program that transforms simple inputs with fully known properties into simple outputs that meet some known requirement. The hard parts are finding or creating a mostly-non-leaky abstraction that maps well onto your inputs, and determining what precise machine-interpretable rules produce outputs that look like the ones you want.

Most bugs I've seen come at the boundaries of the system, where it turns out that one of your assumptions about your inputs was wrong, or that one of your assumptions about how your outputs will be used was wrong.

I almost never see bugs like this

  • My sort(list, comparison_fn) function fails to correctly sort the list
  • My graph traversal algorithm skips nodes it should have hit
  • My pick_winning_poker_hand() function doesn't always recognize straights

Instead, I usually see stuff like

  • My program assumes that when the server receives an order_received webhook, and then hits the server to fetch the order details from the vendor's API for the order identified in the webhook payload, the vendor's API will return the order details and not a 404 Not Found
  • My server returns nothing at all when fetching the user's bill for this month, because while the logic is correct (determine the amount due for each order and sum), this particular user had 350,000 individual orders this month so the endpoint takes >30 seconds, times out, and returns nothing.
  • The program takes satellite images along with metadata that includes the exact timestamp, which satellite took the picture, and how the satellite was angled. It identifies locations which match a specific feature, and spits out a latitude, longitude, label, and confidence score. However, when viewing the locations on a map, they appear to be 100-700 meters off, but only for points within the borders of China (because the programmer didn't know about GCJ-02)

Programming languages that help you write code that is "correct" mostly help prevent the first type of bug, not the second.
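As a rough illustration of the first boundary bug above, here is a sketch in Python of a webhook handler that does not assume the vendor's API will return the order. Everything here is hypothetical: `handle_order_received`, `OrderResult`, and the `fetch_order` callable (which stands in for the real HTTP call and returns a status code plus an optional body) are invented for the example, and treating a 404 as "retry later" is an assumption about the vendor, not a known fact.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class OrderResult:
    status: str                    # "ok", "retry_later", or "failed"
    details: Optional[dict] = None


def handle_order_received(
    order_id: str,
    fetch_order: Callable[[str], Tuple[int, Optional[dict]]],
) -> OrderResult:
    """Fetch order details for a webhook, without assuming the fetch succeeds."""
    code, body = fetch_order(order_id)
    if code == 200 and body is not None:
        return OrderResult("ok", body)
    if code == 404:
        # Assumption: the webhook can arrive before the order is readable,
        # so a 404 is treated as "try again later" rather than as a crash.
        return OrderResult("retry_later")
    # Any other response is surfaced explicitly instead of being ignored.
    return OrderResult("failed")
```

Note that nothing in the type system forced the 404 branch to exist. Someone had to know, or find out the hard way, that the vendor behaves like this.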

Replies from: johannes-c-mayer
comment by Johannes C. Mayer (johannes-c-mayer) · 2024-05-05T10:32:52.740Z · LW(p) · GW(p)

Another thing that Haskell would not help you with at all is making your application good. Haskell would not force Obsidian to have unbreakable references.

comment by Johannes C. Mayer (johannes-c-mayer) · 2024-05-04T13:00:53.028Z · LW(p) · GW(p)

"If you are assuming Software works well you are dead" because:

  • If you assume this you will be shocked by how terrible software is every moment you use a computer, and your brain will constantly try to fix it wasting your time.
  • You should not assume that humanity has it in it to make the god software without your intervention.
  • When making god software: Assume the worst.

comment by Nevin Wetherill (nevin-wetherill) · 2024-05-04T17:26:10.168Z · LW(p) · GW(p)

I have been contemplating Connor Leahy's Cyborgism [LW · GW] and what it would mean for us to improve human workflows enough that aligning AGI looks less like:

Sisyphus attempting to roll a 20 tonne version of The One Ring To Rule Them All into the caldera of Mordor while blindfolded and occasionally having to bypass vertical slopes made out of impossibility proofs that have been discussed by only 3 total mathematicians ever in the history of our species - all before Sauron destroys the world after waking up from a restless nap of an unknown length.

I think this is what you meant by "make the ultimate <programming language/text editor/window manager/file system/virtual collaborative environment/interface to GPT/...>"

Intuitively, the level I'm picturing is:

A suite of tools that can be booted up from a single icon on the home screen of a computer which then allows anyone who has decent taste in software to create essentially any program they can imagine up to a level of polish that people can't poke holes in even if you give a million reviewers 10 years of free time.

Can something at this level be accomplished?

Well, what does coding look like currently? It seems to look like a bunch of people with dark circles under their eyes reading long strings of characters in something basically the equivalent to an advanced text editor, with a bunch of additional little windows of libraries and graphics and tools.

This is not a domain where human intelligence performs with as much ease as in other domains like spearfishing or bushcraft.

If you want to build Cyborgs, I am pretty sure where you start is by focusing on building software that isn't god-machines, throwing out the old book of tacit knowledge, and starting over with something that makes each step as intuitive as possible. You probably also focus way more on quality over quantity/speed.

So, plaintext instructions on what kind of software you want to build, or a code repository and a plaintext list of modifications? Like, start with an elevator pitch, see the raw/AI generated material, critique in a step-by-step organized fashion where debugging/feature analysis checklists are generated and scored on whether they included everything you would have thought of/stuff that is valid which you didn't think of.

I think the point in this post is valid, though a bit more in the realm of "angsty shower-thought" rather than a ready-to-import heuristic for analysing the gap between competence-in-craft and unleashed-power-levels.

There is a bit of a bootstrapping problem with Cyborgism. I don't think you get the One Programming Suite To Rule Them All by plugging in a bunch of different LLMs fine tuned to do one part of the process really well - then packaging the whole thing up and running it on 6 gaming GPUs. That is the level of super-software that seems in reach, and it just seems doomed to be full of really hard-to-perceive holes like a super high-dimensional block of swiss cheese.

Does that even get better if we squeeze the weights of LLMs to get LLeMon juice:

Python script that does useful parts of the cool smart stuff LLMs do without all the black box fuzzy embedding-spaces/vectors + filter the juice for pulp/seeds (flaws in the specific decoded algorithm that could cause errors via accidental or deliberate adversarial example) + sweeten it (make the results make sense/be understandable to humans)

... then plug a bunch of that LLeMonade into a programming suite such that the whole thing works with decently competent human programmer(s) to reliably make stuff that actually just works & alert people ahead of time of the exhaustive set of actual issues/edge cases/gaps in capability of a program?

This problem does seem difficult - and probably the whole endeavor just actually won't work well enough IRL, but it seems worth trying?

Like, what does it look like to throw everything and the kitchen sink at Alignment? It probably looks at least a little like the Apollo program, and if you're doing NASA stuff properly, then you end up making some revolutionary products for everyone else.

I think those products - the random externalities of a healthy Alignment field - look more like tools that work simply and reliably, rather than the giant messy LLM hairballs AI labs keep coughing up and dumping on people.

Maybe all of this helps flesh out and make more useful the flinchy heuristic of "consumer software works terribly -> ... -> AI destroys the future."

Alignment as a field goes out ahead of the giant rolling gooey hairball of spaghetti-code Doom - untangles it and weaves it into a beautiful textile - or we are all dead.

comment by RetepRennelk (reteprennelk) · 2024-05-04T13:33:33.624Z · LW(p) · GW(p)

FWIW, you can right-click on a heading and click rename. Afterwards the link will be renamed globally.

Replies from: johannes-c-mayer
comment by Johannes C. Mayer (johannes-c-mayer) · 2024-05-05T10:29:38.200Z · LW(p) · GW(p)

Yes, but now try moving the heading to a different file.