Hallucinations Are a Feature, Not a Bug

One thing that becomes hard to ignore with generative AI tools once you get past the initial wave of amazement is their tendency to hallucinate. Inaccuracies in answers and artifacts in images reveal the AI’s true lack of understanding. Remember how much trouble they had generating hands early last year? Money and engineering effort are pouring into the space to address this issue so that these tools can be used for more critical applications, with code generation being a major focus area.

But striving to eliminate hallucinations from LLMs misses the point; these quirks are where the real transformative potential lies. To really take advantage of the potential productivity gains in software development, we need to build our programs so that whatever code the LLM produces is working code, rather than tirelessly tweaking the LLM to output exactly what we have in mind. In entertainment, a 99% perfect image or video is fine; any imperfections add character if they’re noticed at all. In programming, the program still needs to accomplish a goal, yet embracing the unpredictable output of LLMs can lead to innovative solutions at orders of magnitude higher productivity than traditional approaches.

A Bit of Programming History

One way we can look back at the history of software development is as a pendulum swinging back and forth between a precise, scientific, formal approach and a more informal, interpretationist, hermeneutic approach. Each approach has its strengths and weaknesses, and for exactly this reason the dominant approach in the industry changes back and forth as new capabilities are discovered and the strengths of one side are favored over the other. 

If you’d like to go further into these two schools of thought, Avdi Grimm has a wonderful talk called The Soul of Software where he does a great job of introducing the concepts. To go even further, he references the book Object Thinking by David West, which goes into a lot more detail.

If we start our brief history at structured programming in the 70s we start more on the formal side of the spectrum. We can then look at the rise of the “fuzzy intuition” approach of Object Oriented programming and Smalltalk as the pendulum swinging the other direction. This enabled the industry to really explore the capabilities of the increased performance of our computers, thanks in part to Moore’s Law, and the invention of the GUI for consumer and business applications. 

Over time, as useful techniques and patterns were discovered, the pendulum swung back to the formal side and we saw C++ and Java take center stage for a while. These languages and their frameworks systematized the looser approaches of their predecessors, reducing the variation in implementation and enabling teams to grow larger and move quicker.

Then right around the mid-2000s we had another swing back to the informal. This swing coincided with the rise of Web 2.0 – Ruby and JavaScript were the big drivers here. Smaller teams were able to do more, more quickly, and we saw an explosion of ideas and experimentation on how best to interact with this new paradigm of fast internet, cheap storage, and interactive web apps. It took roughly until the mid-to-late 2010s for things to stabilize again; it is no coincidence that TypeScript, Rust, and Go really rose in popularity when they did.

So here we are again in early 2024 with the industry over on the formal side, while at the same time LLMs and generative AI provide a brand new territory to explore.

Horseshoes and Hand Grenades, or “Close Enough”

So, how exactly do we embrace the hallucinations of generative AI in software in an informalist way while still building working software? There are two main principles we’ve found so far to operate this way that we’ll be digging into in this post. 

First, from a UX perspective – you need to design with the expectation that things will be at least a little wrong. This means presenting approvals to users and giving them the ability to easily edit and correct anything that comes back.

One great example of this is the Meals.chat Telegram bot by James Potter. This chatbot lets you send it a photo of what you’re eating and it estimates the calories, macros, and most likely ingredients. Built into the experience is a prompt from the bot that asks whether anything looks wrong and, if it does, lets the user reply with any changes to be made.
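To make the "expect it to be a little wrong" idea concrete, here is a minimal sketch of that approve-and-correct loop in Ruby. Everything in it is hypothetical: the `MealEstimate` struct, its fields, and the numbers are made up for illustration and say nothing about how Meals.chat is actually built.

```ruby
# Hypothetical sketch: every name here is illustrative, not taken from Meals.chat.
MealEstimate = Struct.new(:calories, :protein_g, :ingredients, keyword_init: true) do
  # Text shown to the user, ending with an invitation to correct it.
  def confirmation_message
    "I see #{ingredients.join(', ')} (about #{calories} kcal, #{protein_g}g protein). " \
      "Does anything look wrong?"
  end

  # Returns a new estimate with the user's corrections applied,
  # ignoring any keys that are not fields of the estimate.
  def with_corrections(corrections)
    known = corrections.select { |key, _| members.include?(key) }
    self.class.new(**to_h.merge(known))
  end
end

# Stand-in for whatever the model returned; assume it may be a little wrong.
estimate = MealEstimate.new(calories: 650, protein_g: 30, ingredients: %w[rice chicken broccoli])
puts estimate.confirmation_message

# The user replies that it was tofu, not chicken, so we patch the result instead of retrying.
corrected = estimate.with_corrections(ingredients: %w[rice tofu broccoli], protein_g: 22)
puts corrected.confirmation_message
```

The design choice worth noticing is that the correction is a cheap patch on top of the model's answer rather than a request for the model to try again.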

For dealing with hallucinations from a code perspective, we can take cues from Postel’s Law: “be conservative in what you do, be liberal in what you accept from others”. More dynamic languages like Ruby make following this principle a lot easier. 

One example of this in action is this Ruby module named synonllm. When working with code from an LLM, you can generally be pretty confident in it intending to do what you asked it to do, but the trouble starts when you get into the fine details. Sometimes it doesn’t get method names right, like giving you something camelCased instead of snake_cased or just using the wrong method name completely.

Ruby gives us easy-to-use hooks that allow us to be liberal in what we accept from the LLM. We’re able to introspect on the action we’re trying to perform and figure out if there’s some way for us to accomplish the task the LLM wants to accomplish even if it isn’t exactly right.

For the programmers in the audience, you can see it in action below (GitHub Gist):

[Image: Ruby code using a module named "synonllm", calling methods by synonyms of methods that exist on the class, with the LLM taking care of finding and calling the correct method.]

This post is already getting pretty long, but I have plenty more examples of how we’re doing this in code; this is just the tip of the iceberg. Think of how much more fun we can have with method_missing, or maybe with monkey patching.
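As a small taste of the method_missing idea, here is a minimal sketch that forgives the camelCase-instead-of-snake_case mistake mentioned earlier. To be clear, this is not how synonllm works and involves no LLM at all; the `LenientNames` module and the `Greeter` class are hypothetical names invented for this example.

```ruby
# Hypothetical sketch: a mixin that forgives camelCase method names.
# This is NOT the synonllm implementation; it only illustrates the hook.
module LenientNames
  # Convert "fullName" or "FullName" to :full_name.
  def self.snake_case(name)
    name.to_s
        .gsub(/([a-z\d])([A-Z])/, '\1_\2')
        .downcase
        .to_sym
  end

  def method_missing(name, *args, **kwargs, &block)
    candidate = LenientNames.snake_case(name)
    # Only redirect when the normalized name is a different, real method.
    if candidate != name && respond_to?(candidate)
      public_send(candidate, *args, **kwargs, &block)
    else
      super
    end
  end

  # Keep respond_to? consistent with what method_missing will accept.
  def respond_to_missing?(name, include_private = false)
    candidate = LenientNames.snake_case(name)
    (candidate != name && respond_to?(candidate, include_private)) || super
  end
end

class Greeter
  include LenientNames

  def say_hello(to)
    "Hello, #{to}!"
  end
end

greeter = Greeter.new
puts greeter.sayHello("world") # the kind of name an LLM might emit
puts greeter.respond_to?(:sayHello)
```

Names that cannot be mapped to a real method still raise NoMethodError, so we stay liberal in what we accept without silently swallowing genuine mistakes.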

Changing Perceptions and the Path Forward

In “Programming Extremism”, Michael McCormick calls the formal versus informal split I mentioned above “an ancient, sociological San Andreas Fault that runs under the software community”. I see it as an important tension and valuable back and forth depending on what is needed at the moment. And in the moment we’re in right now, I see a lot of opportunity in taking a more informal approach when building software with LLMs. Opportunity in making applications simpler to implement, easier to modify, and inventing new types of applications we haven’t been able to imagine before.

For those of you who are drawn to this approach, we’re building a place for you to share, experiment, and grow. Join our community of like-minded programmers who are exploring the untapped potential of generative AI. Whether you’re just starting out or have been a programmer through some of the pendulum swings I mentioned, we’d love to have you join us and share your insights and experiences. Come join our Promptable Architecture Discord and keep the conversation going!

Is Software Engineering Dead?

The question I get asked the most about Sublayer is “so do you think we won’t need programmers anymore?”

There’s a lot baked into this question depending on who is asking. For the purposes of this post, I’ll focus on the software engineer, wondering what role they have to play when LLMs can write code faster and cheaper than any human possibly could.

For those who are asking because they want to build products without all those expensive programmers, I don’t have great news for you. The “good news” is that yes, the engineering cost of building your product is going to fall close to $0. The bad news is that it falls close to $0 for the rest of the world as well. A few screenshots and prompts copied from X or Reddit or LinkedIn are all it will take for thousands of competitors to spring up in a day. If you thought there were a lot of Groupon clones, just wait.

But back to programmers. If LLMs are able to write code lightning fast, what place do we have in this new world?

The answer lies in developing and honing your sense of taste.

First, Some Background

It can be instructive to look at other creative fields where something very similar has happened: Music and Photography.

Synthesizers have been available since the 70s. GarageBand is available for free on your Mac. You can buy Fruity Loops for $150. If you want to take it further you can download incredible software-defined instruments from Native Instruments. 

None of these have stopped people from creating music. If anything it has opened music creation up to more people – you no longer have to spend years practicing the same scales and drills over and over to be able to create the music that is in your head. The only thing to really develop is a sense of taste for what sounds good.

Will Suno be the end of humans creating music? No, the same way that Native Instruments wasn’t. But I am pretty excited to hear the new genres that talented musicians create now that the tool is available, the same way they did with Fruity Loops and dubstep.

Have digital cameras or Photoshop killed photography? Not even close. What they have ended is the mechanical skill barrier to entry. You don’t have to fuss around with lenses or mixing the chemicals in a darkroom to develop your film anymore if you don’t want to.

You can focus on the medium and tell a story, focus on capturing unique perspectives and interesting compositions and subjects. None of the other things have to get in the way anymore.

What all of these technologies have done is shrink what Ira Glass calls “The Gap”: the gap between your taste and your ability to make things that you think are any good. For some who have put the time and effort into learning the mechanics of a particular discipline, there may be a sense that the newcomers have it too easy and aren’t paying their dues learning the intricacies of the tools before being able to make great art. But the flipside is that more people than ever before are going to be able to participate in that art form and take it in brand new directions with new perspectives that have never been seen before.

To me that’s very exciting.

But What is “taste” in Software Engineering?

When I was entering undergrad for computer science I distinctly remember the conversation at the time: Java had won, it was going to take over the industry, and all the US software engineer jobs were going to be outsourced. If there were going to be jobs for us when we graduated, the best we could hope for was working for a place like Initech with Bill Lumbergh and our other managers coming around asking us to file TPS reports.

That happened for some, but by and large the future didn’t end up looking like that. Why not?

Software engineering can be broken down into two core activities: building abstractions and using abstractions. Few professions have the capabilities (much less the mandate) to build tools to make them more efficient at their jobs. Even fewer have done it as quickly, as widely, and as recursively as the programming world. [Aside: I have a Moore’s Law-like analogy in my head that I can’t exactly get to a place I like. Software is not only eating the world, but it also constantly eats and re-eats itself]

Programmers are constantly becoming more and more capable of building bigger and more complex programs with the same or even less effort than before. Building new abstractions for you and others to use is what makes this all possible, and each one makes it possible to go even further. This is what people in the early 2000s got wrong about Java – the idea that “now that we’ve found The One True Abstraction, that’s the end of things”. We all know how that played out.

Building simple, powerful abstractions has always been a core aspect of software engineering, but as Richard Gabriel notes in Patterns of Software, this aspect has mostly been the domain of expert “programmer poets” inventing new words for concepts for the rest of us to use. The very best abstractions enable us to operate on larger and larger chunks of functionality without needing to concern ourselves with the implementation details. It is really nice to be able to operate on a string as a string and not have to think about it as a character array.

Where do LLMs Fit?

Which brings us to LLMs and how they fit into all of this. LLMs are great users of abstractions. They will mostly accept the state of the world you create for them, and then operate inside of it. You can invent anything you’d like and they’ll try to work with it.

This is the same thing that drives the conversation every time “the end of programming” comes up. The abstractions from Java were straightforward to *teach*. And so large numbers of people were taught Java, its idiomatic abstractions, and how to apply them in a systematic way to business use cases. We saw a similar phenomenon around the end of the last decade with the rise of coding bootcamps, teaching great numbers of people how to apply the abstractions from Ruby on Rails or React.

Don’t get me wrong – this is a good thing! It’s just not an end state. For a long time now, the economy has desperately demanded more users of abstractions than the available supply, and every so often a new abstraction is created that is so powerful that it unlocks even *more* demand than before. This imbalance has driven the massive growth in the industry, and it is what LLMs are positioned to solve: the economy will finally be able to match the supply of users of abstractions to the demand (it just won’t be with people).

What Now?

So, programmer asking whether “software engineering” is dead: if you’re currently comfortable as an abstraction user, yes, that side of programming is going to be taken over by generative AI. LLMs are cheaper, faster, and have more energy to apply abstraction after abstraction than you do.

But there’s still hope! Your taste and ability to create good, usable abstractions will still be in demand and, crucially, take on an even greater importance. Luckily, the feedback loop has also been shortened to almost instant. Previously, you had to just believe that certain principles for good code would be rewarded. Now, you will be able to know almost immediately whether your abstractions hold up, because your LLM will either give you working code or it won’t.

I’m sure there will be all new types of patterns and abstractions coming out as more people explore programming with LLMs, and like the new genres of music that technology has enabled, I couldn’t be more excited for them.

Why We Need a New Product Management App

Since the first days of building Sublayer, we’ve been asked why we decided to build a new product management app instead of just integrating with the existing ones. It wasn’t a decision we took lightly – the first proof of concept, built as we were exploring the idea, was actually integrated with Pivotal Tracker (demo video here). But the further we took the idea, the more it became clear to us that even though we’re starting with the product management interface metaphor, where we end up is going to look much different. We needed the freedom to go in directions that just weren’t available through other services’ APIs.

But Why Start With Product Management?

The simplest answer is that the UX of the product management app is already a pretty optimized UX for building and changing software over time. You can even think of it as a prompt builder – the person writing the stories or specs fills in the state of the world, some motivation and description for the change, and the expected output, sometimes in the form of acceptance criteria. If you’re a product manager, you’re already a prompt engineer; you’re just prompting humans to change your software.
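To make the prompt-builder comparison concrete, here is a minimal Ruby sketch of a story rendered as a prompt. The `Story` struct, its field names, the section labels, and the example story are all assumptions made up for illustration; this is not a description of how our product represents stories.

```ruby
# Hypothetical sketch: field names and prompt layout are assumptions for illustration.
Story = Struct.new(:context, :motivation, :description, :acceptance_criteria, keyword_init: true) do
  # Render the story as a plain-text prompt, one labeled section per field.
  def to_prompt
    criteria = Array(acceptance_criteria).map { |c| "- #{c}" }.join("\n")
    <<~PROMPT
      Current state of the world:
      #{context}

      Why this change is needed:
      #{motivation}

      Requested change:
      #{description}

      Acceptance criteria:
      #{criteria}
    PROMPT
  end
end

# An invented example story, written the way a PM might write it for a human team.
story = Story.new(
  context: "Users sign in with an email and password.",
  motivation: "Support tickets show people forget their passwords often.",
  description: "Add a 'Forgot password?' link that emails a reset link.",
  acceptance_criteria: ["Reset link expires after one hour", "Old password stops working after reset"]
)

puts story.to_prompt
```

Nothing in the story had to change to become a prompt; the same fields a human engineer reads are the ones a model would need.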

We believe the way that we’re going to collaborate with AI to build software is going to feel a lot like product management does today, with the PM role shifting to be a little closer to engineering and the engineer role shifting to become something a lot closer to PM than it is today. But if that’s all it was, the existing solutions out there would easily be up to the task of integrating generative AI and there would be no need for anything new.

Engineering as the Bottleneck

As Goldratt taught us in The Goal – any improvements made anywhere besides the bottleneck are an illusion. PMs writing stories faster, designers designing faster, and customer interviews summarized faster are all great, but none of it will lead to increased output or increased revenue. The current system is so ingrained in the way we think about building product that until someone breaks the bottleneck of engineering, it’s hard to guess where it is going to move to next. Engineering is the only step in the process that matters right now.

In most, if not all, product organizations, engineering is the bottleneck. It is the most expensive and slowest step in the process of delivering software. Everything in these organizations is set up around the delivery cadence and capacity of this bottleneck. With preliminary numbers on Copilot usage showing task completion speeds increased by over 50% [1], we’re only seeing the tip of the iceberg for how this is going to transform the work of software engineering. With the ideas we’ve shared around Promptable Architecture, we see the possibility of breaking the bottleneck of engineering completely, and are excited to see where it moves to next and get to work tackling that one.

An Opinionated Process

The other thing we touched on in Promptable Architecture is that the way we organize is going to need to change. The current products on the market are flexible and allow their customers to bring whichever process they like to developing products, and while we all laugh at the iterated waterfall / agilefall monstrosities that some companies have created, the variation in process at the current level just won’t exist in the future. When the risk of getting it wrong means a task taking orders of magnitude more time and money, the winning workflow tools will be the ones that make it hard, if not impossible, to do things the wrong way.

This is why we need a new product management application: to guide and teach users on the best ways to get the massive benefits available in using generative AI.

At Sublayer we’re obsessed with the product development process, and are extremely excited by what we’ve already seen with our product. If this post has reached you, sign up and try our product out, or join our Discord and say hi!