Waste Inferences!

Back in the 1970s, Caltech professor Carver Mead suggested that, given the implications of Moore’s Law (a term he coined!), we should embrace the growing abundance of transistors and “waste” them. Computing power was becoming cheaper at an exponential rate, which meant we should work to create more powerful, flexible, and innovative designs, even if they seemed inefficient at first glance.

Just a couple weeks ago Ethan Mollick released a newsletter (What Just Happened, What is Happening Next) that started off with the line “The current best estimates of the rate of improvement in Large Language Models show capabilities doubling every 5 to 14 months”. Not only are LLMs getting more powerful at an incredible rate, but the costs of using them are decreasing at a similar pace.

While many things are different this generation, the implication of the exponential growth of computing power and reduction in cost remains the same. We should embrace it and use it to create more powerful, flexible, and innovative designs even if it means they seem inefficient at first glance. In other words, we should be “wasting inferences” – using the abundant computing power of LLMs and plummeting costs to experiment with novel applications and designs, even if they don’t seem optimally efficient initially.

Thin Wrappers

The main way to dismiss anything anyone was building on LLMs over the last year was “isn’t this just a thin wrapper on top of GPT-4?”. My answer was always some variation of “Well, isn’t Salesforce just a thin wrapper on top of a relational database?”. It wasn’t until I read Venkatesh Rao’s A Camera, Not an Engine that I realized I wasn’t taking that line of thinking far enough. In the newsletter, he makes the case that we should think of these generative AI models as a discovery rather than an invention. Jeff Bezos made a similar observation right around the same time.

If we think about these models as a discovery rather than an invention, building thin wrappers over them is exactly what should be done! So many useful things in our daily lives are thin wrappers on top of discoveries. Lightbulbs, toasters, and air conditioners are all just thin wrappers on top of electricity. Building wrappers that provide new interfaces and uses for discoveries is how those discoveries make meaningful change in people’s lives. This principle applies not only to physical inventions but also to groundbreaking digital discoveries like LLMs.

Beyond Conversational

If your exposure to LLMs since the release of ChatGPT has been limited to various chatbots, you might find the comparison between LLMs and the discovery of electricity to be an exaggeration. However, the potential applications of LLMs extend far beyond conversational interfaces. Once you start building software with them, and you realize that what you have access to is a near-universal string-to-string function, you start to grasp how transformative these things truly are.

Obie Fernandez wrote about this recently in his post The Future of Ruby and Rails in the Age of AI. In the post he describes a component he’s building for his AI-powered consultants platform Olympia and ends the section with “The internal API for this component is plain text. I’m literally taking what would have been dozens or hundreds of lines of code and letting AI handle the job in a black-box fashion. Already. Today.”

Things that previously would have required teams of people, taken multiple sprints, or even quarters worth of work, are now an API call away. Obie Fernandez’s example demonstrates how LLMs can significantly reduce development time and effort. As more developers recognize the potential of “wasting inferences” on innovative applications, we’ll likely see a surge in powerful, AI-driven solutions across many different domains.

Where To Start

Ok so if you’re still with me, you may be thinking: “Where do I start? How can I waste inferences and make more thin wrappers?” Well I’m glad you asked! My recommendation is to start small. At Sublayer we’re building a Ruby AI framework that works with all available models to do just that. 

At its simplest, the way you work with language models can be summed up as:
Gather information -> Send information to an LLM -> Do something with the LLM’s output
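That three-step flow can be sketched in a few lines of Ruby. This is a hand-rolled illustration, not Sublayer’s actual API – the `FakeLLM` client stands in for a real model call:

```ruby
# A minimal sketch of the gather -> send -> act flow.
# FakeLLM stands in for a real model client (in practice this would be
# an API call to your provider of choice); its reply here is canned.
class FakeLLM
  def complete(prompt)
    "SUMMARY: #{prompt.lines.last.split.first(3).join(' ')}..."
  end
end

def summarize_file(path, llm: FakeLLM.new)
  text = File.read(path)                        # 1. Gather information
  reply = llm.complete("Summarize:\n#{text}")   # 2. Send it to an LLM
  puts reply                                    # 3. Do something with the output
  reply
end
```

Swapping `FakeLLM` for a real client is the only change needed to turn this into a working tool – the shape of the program stays the same.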

Our framework is designed to help you build programs with that flow as simply as possible, and we have tutorials up for how to build simple things like a TDD bot with an LLM writing code for you, a simple voice chat application in Rails, and, coming soon, a tutorial on how we built CLAG, a command line program that generates command line commands for you. We’re constantly making new tutorials, so keep an eye out, and if you build anything with it, let us know; we’d love to help spread the word!

These tutorials are just examples though. They’re meant to show how quickly you’re able to create new applications and “waste inferences” on powerful, flexible, and innovative things that may not seem the most efficient at first glance.

Make sure to also check out our docs site that has interactive code generation built right into it to make getting started even faster.

Learn More

Ready to learn more? 

We spend most of our time in the Sublayer Discord, so if you have questions, requests for more tutorials or features, or just want to learn more about how to “waste inferences”, come join us! You’ll have the opportunity to collaborate with like-minded developers, get support for your projects, and stay up to date with the latest advancements in AI-powered development. We’d love to meet you and push the limits of what these models are capable of together!

There’s also a larger, more general community of Rubyists forming around AI that we’re a part of. Join the Ruby AI Builders Discord to connect with developers who are exploring various applications of AI in the Ruby ecosystem. It’s a great place to exchange ideas, share your projects, and learn from the experiences of others.

In a future post we’ll go into more detail about why we think Ruby is a sleeping giant in the world of AI application development and is perfect for “Wasting Inferences”.

Introducing Blueprints: A New Approach to AI-Assisted Coding

Today, we’re excited to officially announce the release of Blueprints! A new, open-source (and soon, model-agnostic!) approach to AI-assisted coding that helps you leverage patterns in your existing codebase for code generation, personalized to you and your team’s unique style.

Introduction

There is a lot of excitement these days around AI programming assistants like GitHub Copilot – the promise of an AI auto-completing the code you’re writing has massive potential to speed up development work. And they absolutely deliver on speed. But there is one downside to this speed: if you’re working with or writing bad or unclear code, AI assistants will simply make it faster to produce more bad code. So, it’s no surprise that we’re seeing reports that programming copilots are putting downward pressure on code quality.

Unlike traditional AI programming assistants that amplify errors and are trained on existing code no matter the quality, Blueprints introduces a novel, model-agnostic approach, leveraging your own code’s patterns to ensure speed without sacrificing the quality and uniqueness of your team’s coding style.

Introducing Blueprints

Blueprints are those patterns already in your code base that you look to when working on a new problem. Today we’re releasing a Blueprints Server you can run locally, along with text editor plugins for Vim, VSCode, IntelliJ IDEA, and Sublime Text to get you started.

We enable you to capture these patterns right inside your editor and store them. Behind the scenes, your blueprints server generates a description of the code and indexes it using vector embeddings. The next time you need to work on something similar, you simply describe it, and the vector search finds the closest match and uses it to generate the solution to your problem. As context windows for models expand, this technique will only become more powerful: we can add more examples, making the models even better at generating custom code for you.
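To make the retrieval step concrete, here’s a toy sketch of the describe-then-match idea. It is not the actual Blueprints implementation – real embeddings come from an embedding model, while here a simple bag-of-words vector and cosine similarity stand in so the example runs anywhere:

```ruby
# Toy stand-in for vector embeddings: count the words in a description.
def embed(text)
  text.downcase.scan(/\w+/).tally
end

# Cosine similarity between two word-count vectors.
def cosine(a, b)
  dot = a.sum { |word, n| n * b.fetch(word, 0) }
  mag = ->(v) { Math.sqrt(v.values.sum { |n| n * n }) }
  denom = mag.(a) * mag.(b)
  denom.zero? ? 0.0 : dot / denom
end

# Stored blueprints: description => captured code pattern (illustrative).
BLUEPRINTS = {
  "index page listing records in a table" => "<index view pattern>",
  "form for creating a new record"        => "<new view pattern>"
}.freeze

# Find the stored blueprint whose description best matches the request.
def closest_blueprint(request)
  q = embed(request)
  BLUEPRINTS.max_by { |description, _| cosine(q, embed(description)) }
end
```

The matched blueprint’s code would then be handed to the model as an example when generating the new solution.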

We have plenty more demo videos coming, but you can see it in action here, where I use it to generate a new index page in a Rails app using my own custom styles and the new view library Phlex (with only a single example!).

The Inspiration: Leverage How We Already Code Today

When starting work on a new feature, one of the first things you do is look through your code base for similar patterns that already exist – places you’ve already solved similar problems – to use as a starting point for the new work. You open these files, learn how they work, maybe copy and paste some parts, and repurpose past work to help solve the problem in a way that is consistent with how your team does things. This helps us maintain consistency, work efficiently, and make sure we don’t miss anything unique to this particular program.

With Blueprints, we’ve taken this process of leveraging existing code patterns and automated it for you.

What’s Next?

What we’re releasing today is just the beginning. There are many different directions to take this, and we’ve just scratched the surface of what’s possible. We’re hard at work building more of this out, and over the next few releases you should expect to see:

Sharable Packages 

Download packages of blueprints from our site to use in your blueprints server. These packages can be anything from niche, library-specific patterns like Daisy UI Phlex components, to NextJS SaaS starter kits, to iOS Swift games, and everything in between. With these, you won’t always need to start from scratch when building something new, and you’ll see firsthand what makes good, promptable code. This will also make it easier to share patterns between teammates and allow for community creation and contribution of new packages.

Blueprints-of-Blueprints

Not only is there usually a similar pattern to use as reference in your code base when working on something new, but there are similar steps you take to build on those features, and the Blueprints method works just as well in those cases. As we build out more packages of individual blueprints, expect to see the ability to chain these together at a higher level and generate new step-by-step instructions based on the steps you’ve defined in the past.

Auto-discover Blueprints

We’re heavily focused on what makes a good blueprint and pattern for the LLM to generate from. As the principles emerge, we’ll soon be able to detect patterns that already exist in your codebase that are good candidates to capture and present them to you for storing. We’ll also be able to highlight areas where the code isn’t as promptable and is likely to become a liability once AI is generating more code.

We’re also very interested in hearing any feature requests you have! Come join us in our Discord and let us know; we’d love to chat and hear about what you’re building.

Our Vision: Augmenting, Not Replacing



In “Is Software Engineering Dead?”, we talked about how we see the role of software engineers changing in this post-LLM world, and Blueprints is our first step toward that future state. Software engineering doesn’t go away, but the focus of the role shifts to designing abstractions that AI can more effectively leverage. Through tools like this, engineers will be able to focus more on creating and refining these high-level abstractions, rather than getting bogged down in lower-level implementation details.  

Get Started!

We’re thrilled to launch Blueprints today. You can get started by going to blueprints.sublayer.com or directly from the GitHub repo. Finally, you can have an AI assistant that produces high-quality code in the style your team already writes in. We’ve been working toward this moment for a long time and are so excited to see what you build with it!

Hallucinations Are a Feature, Not a Bug

One thing that becomes hard to ignore with generative AI models, once you get past the initial wave of amazement, is their tendency to hallucinate. Inaccuracies in answers and artifacts in images reveal the AI’s true lack of understanding. Remember how much trouble they had generating hands early last year? Money and engineering effort are pouring into the space to address this issue so that these tools can be used for more critical applications, with code generation being a major focus area.

But striving to eliminate hallucinations from LLMs misses the point; these quirks are where the real transformative potential lies. To really take advantage of the potential productivity gains in software development, we need to build our programs so that whatever code the LLM produces is working code, rather than tirelessly tweaking the LLM to output exactly what we have in mind. In entertainment, a 99% perfect image or video is fine; any imperfections add character, if they’re noticed at all. In programming, the program still needs to accomplish its goal, but embracing the unpredictable output of LLMs can lead to innovative new solutions at orders of magnitude higher productivity than traditional approaches.

A Bit of Programming History

One way we can look back at the history of software development is as a pendulum swinging back and forth between a precise, scientific, formal approach and a more informal, interpretationist, hermeneutic approach. Each approach has its strengths and weaknesses, and for exactly this reason the dominant approach in the industry changes back and forth as new capabilities are discovered and the strengths of one side are favored over the other. 

If you’d like to go further into these two schools of thought, Avdi Grimm has a wonderful talk called The Soul of Software where he does a great job of introducing the concepts. To go even further, he references a book Object Thinking by David West that goes into a lot more detail.

If we start our brief history at structured programming in the 70s, we start more on the formal side of the spectrum. We can then look at the rise of the “fuzzy intuition” approach of object-oriented programming and Smalltalk as the pendulum swinging the other direction. This enabled the industry to really explore the increased performance of our computers (thanks in part to Moore’s Law) and the invention of the GUI for consumer and business applications.

Over time as useful techniques and patterns were discovered, the pendulum swung back to the formal side and we saw C++ and Java take center stage for a while. These languages and their frameworks systematized the looser approaches of their predecessors, reducing the variation in implementation and enabling teams to grow larger and move quicker for a while.

Then right around the mid-2000s we had another swing back to the informal. This swing coincided with the rise of Web 2.0 – Ruby and JavaScript were the big drivers here. Smaller teams were able to do more, more quickly, and we saw an explosion of ideas and experimentation on how best to interact with this new paradigm of fast internet, cheap storage, and interactive web apps. It took roughly until the mid-to-late 2010s for things to stabilize again; it is no coincidence that TypeScript, Rust, and Go really rose in popularity when they did.

So here we are again in early 2024 with the industry over on the formal side, while at the same time LLMs and generative AI provide a brand new territory to explore.

Horseshoes and Hand Grenades, or “Close Enough”

So, how exactly do we embrace the hallucinations of generative AI in an informalist way while still building working software? We’ve found two main principles so far for operating this way, and we’ll dig into both in this post.

First, from a UX perspective – you need to design with the expectation that outputs will be at least a little wrong. This means presenting approvals to users and giving them the ability to easily edit and correct anything that comes back.

One great example of this is the Meals.chat Telegram bot by James Potter. This chatbot lets you send it a photo of what you’re eating, and it estimates the calories, macros, and most likely ingredients. Built into the experience is a prompt from the bot asking whether anything looks wrong and, if it does, letting the user reply with any changes to be made.

For dealing with hallucinations from a code perspective, we can take cues from Postel’s Law: “be conservative in what you do, be liberal in what you accept from others”. More dynamic languages like Ruby make following this principle a lot easier. 

One example of this in action is a Ruby module named synonllm. When working with code from an LLM, you can generally be confident that it intends to do what you asked; the trouble starts when you get into the fine details. Sometimes it doesn’t get method names right, giving you something camelCased instead of snake_cased, or using the wrong method name completely.

Ruby gives us easy-to-use hooks that let us be liberal in what we accept from the LLM. We’re able to introspect on the action we’re trying to perform and figure out if there’s some way to accomplish the task the LLM wants to accomplish, even if the call isn’t exactly right.

For the programmers in the audience, you can see it in action below (GitHub Gist):

(Gist: Ruby code using a module named “synonllm”, showing methods being called by synonyms of methods that exist on the class, with the LLM taking care of finding and calling the correct method.)
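The gist itself isn’t reproduced here, but the hook it relies on is easy to demonstrate. The sketch below is a hypothetical illustration, not the actual synonllm code: where synonllm would ask an LLM which existing method the caller meant, a snake_case normalization plus a hard-coded synonym table stand in for the model:

```ruby
# Hypothetical illustration of the synonllm idea using method_missing.
# SYNONYMS is a stand-in for asking an LLM to pick the intended method.
module Synonymish
  SYNONYMS = { "fetch_user" => "get_user", "retrieve_user" => "get_user" }.freeze

  def method_missing(name, *args, &block)
    target = resolve(name)
    target ? send(target, *args, &block) : super
  end

  def respond_to_missing?(name, include_private = false)
    !resolve(name).nil? || super
  end

  private

  # Normalize camelCase to snake_case, then consult the synonym table.
  def resolve(name)
    snake = name.to_s.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
    target = SYNONYMS.fetch(snake, snake)
    self.class.method_defined?(target) ? target : nil
  end
end

class Store
  include Synonymish

  def get_user(id)
    "user-#{id}"
  end
end
```

With this in place, `Store.new.getUser(7)` and `Store.new.fetch_user(7)` both end up calling `get_user` – liberal in what we accept, conservative in what we do.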

This post is already getting pretty long, but I have plenty more examples of how we’re doing this in code; this is just the tip of the iceberg. Think of how much more fun we can have with method_missing, or maybe with monkey patching.

Changing Perceptions and the Path Forward

In “Programming Extremism”, Michael McCormick calls the formal versus informal split I mentioned above “an ancient, sociological San Andreas Fault that runs under the software community”. I see it as an important tension and valuable back and forth depending on what is needed at the moment. And in the moment we’re in right now, I see a lot of opportunity in taking a more informal approach when building software with LLMs. Opportunity in making applications simpler to implement, easier to modify, and inventing new types of applications we haven’t been able to imagine before.

For those of you who are drawn to this approach, we’re building a place for you to share, experiment, and grow. Join our community of like-minded programmers who are exploring the untapped potential of generative AI. Whether you’re just starting out or have been a programmer through some of the pendulum swings I mentioned, we’d love to have you join us and share your insights and experiences. Come join our Promptable Architecture Discord and keep the conversation going!

Is Software Engineering Dead?

The question I get asked the most about Sublayer is “so do you think we won’t need programmers anymore?”

There’s a lot baked into this question depending on who is asking. For the purposes of this post, I’ll focus on the software engineer, wondering what role they have to play when LLMs can write code faster and cheaper than any human possibly could.

For askers who are asking because they want to build products without all those expensive programmers, I don’t have great news for you. The “good news” is that yes, the engineering cost of building your product is going to fall close to $0. The bad news is that it falls close to $0 for the rest of the world as well. A few screenshots and prompts copied from X or Reddit or LinkedIn are all it will take for thousands of competitors to spring up in a day. If you thought there were a lot of Groupon clones, just wait.

But back to programmers. If LLMs are able to write code lightning fast, what place do we have in this new world?

The answer lies in developing and honing your sense of taste.

First, Some Background

It can be instructive to look at other creative fields where something very similar has happened: Music and Photography.

Synthesizers have been available since the 70s. GarageBand is available for free on your Mac. You can buy Fruity Loops for $150. If you want to take it further you can download incredible software-defined instruments from Native Instruments. 

None of these have stopped people from creating music. If anything, they have opened music creation up to more people – you no longer have to spend years practicing the same scales and drills over and over to be able to create the music that is in your head. The only thing left to develop is a sense of taste for what sounds good.

Will Suno be the end of humans creating music? No, the same way that Native Instruments wasn’t. But I am pretty excited to hear the new genres that talented musicians create now that the tool is available, the same way Fruity Loops did with dubstep.

Have digital cameras or Photoshop killed photography? Not even close. What they have ended is the mechanical skill barrier to entry. You don’t have to fuss around with lenses or mixing the chemicals in a darkroom to develop your film anymore if you don’t want to.

You can focus on the medium and tell a story, focus on capturing unique perspectives and interesting compositions and subjects. None of the other things have to get in the way anymore.

What all of these technologies have done is shrink what Ira Glass calls “The Gap”. The gap between your taste and your ability to make things that you think are any good. For some, who have put the time and effort into learning the mechanics of a particular discipline there may be a sense of the newcomers having it too easy and not paying their dues learning the intricacies of the tools before being able to make great art. But the flipside is that more people than ever before are going to be able to participate in that art form and take it in brand new directions with new perspectives that have never been seen before.

To me that’s very exciting.

But What is “taste” in Software Engineering?

When I was entering undergrad for computer science I distinctly remember the conversation at the time: Java had won, it was going to take over the industry, and all the US software engineer jobs were going to be outsourced. If there were going to be jobs for us when we graduated, the best we could hope for was working for a place like Initech with Bill Lumbergh and our other managers coming around asking us to file TPS reports.

That happened for some, but by and large the future didn’t end up looking like that. Why not?

Software engineering can be broken down into two core activities: building abstractions and using abstractions. Few professions have the capability (much less the mandate) to build tools that make them more efficient at their jobs. Even fewer have done it as quickly, as widely, and as recursively as the programming world. [Aside: I have a Moore’s Law-like analogy in my head that I can’t quite get to a place I like. Software is not only eating the world, it is also constantly eating and re-eating itself.]

Programmers are constantly becoming more capable of building bigger and more complex programs with the same or even less effort than before. Building new abstractions for you and others to use is what makes this all possible, and each one makes it possible to go even further. This is what people in the early 2000s got wrong about Java – the idea that “now that we’ve found The One True Abstraction, that’s the end of things”. We all know how that played out.

Building simple, powerful abstractions has always been a core aspect of software engineering, but like Richard Gabriel notes in Patterns of Software, this aspect has mostly been the domain of expert “programmer poets” inventing new words for concepts for the rest of us to use. The very best abstractions enable us to operate on larger and larger chunks of functionality without needing to concern ourselves with the implementation details. It is really nice to be able to operate on a string as a string and not have to think about it as a character array.

Where do LLMs Fit?

Which brings us to LLMs and how they fit into all of this. LLMs are great users of abstractions. They will mostly accept the state of the world you create for them, and then operate inside of it. You can invent anything you’d like and they’ll try to work with it.

This is the same thing that drives the conversation every time “the end of programming” comes up. The abstractions from Java were straightforward to *teach*. And so large numbers of people were taught Java, its idiomatic abstractions, and how to apply them in a systematic way to business use cases. We saw a similar phenomenon around the end of the last decade with the rise of coding bootcamps, teaching great numbers of people how to apply the abstractions from Ruby on Rails or React.

Don’t get me wrong – this is a good thing! It’s just not an end state. For a long time now, the economy has desperately demanded more users of abstractions than the available supply—and every so often a new abstraction is created that is so powerful that it unlocks even *more* demand than before. This has led to the massive growth in the industry, and what LLMs are positioned to solve: the economy will finally be able to match the supply of users of abstraction to the demand (it just won’t be with people).

What Now?

So, programmer asking whether “software engineering” is dead: if you’re currently comfortable as an abstraction user, yes that side of programming is going to be taken over by generative AI. LLMs are cheaper, faster, and have more energy to apply abstraction after abstraction than you do.

But there’s still hope! Your taste and ability to create good, usable abstractions will still be in demand and—crucially—take on an even greater importance. Luckily, the feedback loop has also been shortened to almost instant now as well. Previously, you had to just believe that certain principles for good code would be rewarded. Now, you will be able to know almost immediately whether your abstractions hold up, because your LLM will either give you working code or it won’t.

I’m sure there will be all new types of patterns and abstractions coming out as more people explore programming with LLMs, and like the new genres of music that technology has enabled, I couldn’t be more excited for them.

Why We Need a New Product Management App

Since the first days of building Sublayer, we’ve been asked why we decided to build a new product management app instead of just integrating with the existing ones. It wasn’t a decision we took lightly – the first proof of concept as we were exploring the idea was actually integrated with Pivotal Tracker (demo video here) – but the further we took the idea, the clearer it became that even though we’re starting with the product management interface metaphor, where we were going to end up would look much different. We needed the freedom to go down directions that just weren’t available through other services’ APIs.

But Why Start With Product Management?

The simplest answer is that the product management app is already a pretty optimized UX for building and changing software over time. You can even think of it as a prompt builder – the person writing the stories or specs fills in the state of the world, some motivation and description for the change, and the expected output, sometimes in the form of acceptance criteria. If you’re a product manager, you’re already a prompt engineer; you’re just prompting humans to change your software.
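As a quick illustration of that framing, here’s how a story’s fields could map onto the sections of a prompt. The field names and template are hypothetical, not Sublayer’s actual format:

```ruby
# Hypothetical mapping from user-story fields to prompt sections.
Story = Struct.new(:context, :motivation, :acceptance_criteria, keyword_init: true)

def story_to_prompt(story)
  <<~PROMPT
    Current state of the world:
    #{story.context}

    Motivation for the change:
    #{story.motivation}

    The change is complete when:
    #{story.acceptance_criteria.map { |c| "- #{c}" }.join("\n")}
  PROMPT
end
```

Whether the “engineer” reading that prompt is a person or a model, the story format carries the same information.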

We believe the way we’re going to collaborate with AI to build software is going to feel a lot like product management does today, with the PM role shifting a little closer to engineering and the engineer role shifting to become something a lot closer to PM than it is today. But if that’s all it was, the existing solutions out there would easily be up to the task of integrating generative AI and there would be no need for anything new.

Engineering as the Bottleneck

As Goldratt taught us in The Goal – any improvements made anywhere besides the bottleneck are an illusion. PMs writing stories faster, designers designing faster, customer interviews summarized faster are all great, but none of it will lead to increased output or increased revenue. The current system is so ingrained in the way we think about building product that until someone breaks the bottleneck of engineering, it’s hard to guess where it is going to move next. This is the only step in the process that matters right now.

In most, if not all, product organizations, engineering is the bottleneck. It is the most expensive and slowest step in the process of delivering software. Everything in these organizations is set up around the delivery cadence and capacity of this bottleneck. With preliminary numbers around usage of Copilot showing over 50% faster task completion [1], we’re only seeing the tip of the iceberg of how this is going to transform the work of software engineering. With the ideas we’ve shared around Promptable Architecture, we see the possibility of breaking the bottleneck of engineering completely, and we’re excited to see where it moves next and get to work tackling that one.

An Opinionated Process

The other thing we touched on in Promptable Architecture is that the way we organize is going to need to change. The current products on the market are flexible and allow their customers to bring whichever process they like to developing products, and while we all laugh at the iterated waterfall / agilefall monstrosities that some companies have created, the variation in process at the current level just won’t exist in the future. When the risk of getting it wrong means a task taking orders of magnitude more time and money, the winning workflow tools will be the ones that make it hard, if not impossible, to do things the wrong way.

This is why we need a new product management application: to guide and teach users the best ways to get the massive benefits available in using generative AI.

At Sublayer we’re obsessed with the product development process, and we’re extremely excited by what we’ve already seen with our product. If this post has reached you, sign up and try our product out, or join our Discord and say hi!