Exploitation vs. Exploration

Building a Bridge

Imagine that the San Francisco Golden Gate Bridge has been rendered unusable by the California Wildfires and the years of Bay Area traffic. The government has tasked you with rebuilding the bridge. You have little, if any, background in bridge building, but this task is really important because you will be building something that should last generations. How would you go about this task?

Well, it probably depends on the timeline of the task. If the deadline to finish the bridge is next week, you’d better get started right away. If it’s before you die (assuming you’re fairly young), you can take a more measured approach. You’d be better off strolling to the library and picking up some books on architecture, talking to a few relevant experts, and waiting until you have a better idea of the task at hand before starting.

It also depends on your personality. Given a reasonable timeline, some people would dive straight into cutting steel while others would head for the library.

There’s some wiggle room as to what works, but clearly the very extremes of this exploration-exploitation tradeoff do not work. If you spend all your time reading about bridges (exploration) but not building them, you won’t build much of a bridge. Conversely, if you spend all your time building the bridge (exploitation), you’ll find yourself reinventing the wheel over and over again.

This brings us to wondering what the ideal approach might be. I suspect it might look something like this:

You want to spend a reasonable amount of time learning about what makes certain bridges great and consulting experts. You might also want to build out a few bridges to test the models that you’ve learned and get some real-world practice. After you feel like you have a good grasp, you can finally start building a great bridge. You do a fair amount of exploration but not too much and then you execute.
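To make the "explore a bit, then execute" idea concrete, here is a toy sketch of an explore-then-commit strategy. Everything in it is an assumption made up for illustration: the `exploreThenCommit` function, the idea that each approach has a fixed payoff per day, and the payoff numbers themselves. It is a simplified model, not a claim about how real projects behave.

```typescript
// Toy model (all names and numbers are hypothetical):
// each approach has a fixed payoff per day of work, and you only
// learn that payoff by spending one day trying the approach.
function exploreThenCommit(
  payoffs: number[],
  days: number,
  exploreDays: number
): number {
  // Exploration phase: try the first few approaches, one day each.
  const tried = payoffs.slice(0, Math.min(exploreDays, days, payoffs.length));
  // With no exploration at all, commit to the first approach sight unseen.
  const best = tried.length > 0 ? Math.max(...tried) : (payoffs[0] ?? 0);
  const explored = tried.reduce((sum, p) => sum + p, 0);
  // Exploitation phase: spend every remaining day on the best approach seen.
  return explored + (days - tried.length) * best;
}

const payoffs = [1, 3, 2, 8, 4, 5, 2, 1];
// Ten days total, with 0, 4, and 8 days of exploration respectively.
const results = [0, 4, 8].map((k) => exploreThenCommit(payoffs, 10, k));
console.log(results); // [ 10, 62, 42 ]
```

In this made-up setup, no exploration gets stuck on a poor approach, full exploration leaves too little time to execute, and the middle option comes out ahead.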

A Simple Diagram

Imagine you have the Start state and the Goal state. You want to reach the Goal as soon as you can. Each of your attempts is logged as a vector that puts you somewhere else, ideally closer to the Goal state.

Starting by over-optimizing for exploitation might look like this:

Over Exploiting

Eventually, your path towards the Goal may look like this.

Eventual Over Exploiting

It’s not great because you’re taking a lot of detours.

Over-optimizing for exploration, on the other hand, might look like this:

Over Exploring

Your vectors are too small because you’re not doing enough “exploiting.”

Ideally, finding a healthy balance between the two might yield something like this:

Balance

You find a path that’s short but not the shortest, and you’ll reach the Goal fastest this way.

I’ve spent the past day or so thinking about the exploration-exploitation tradeoff and how it applies to my life.

Personal Reflections

This tradeoff has applied to my past experiences, particularly when it comes to writing software. I’ve observed that I have a tendency to over-exploit before exploring, and as a consequence, my end results sometimes don’t turn out great.

Here’s an example: I was once building out a product for a healthtech venture during my sophomore year in college. The user-facing product was an app written in React Native, which I had to learn from scratch. I went into the task with the short-term mentality of “build and ship as quickly as possible,” so I didn’t do my due diligence: properly going through the docs, reading other people’s work, and so on.

As you can imagine, the end result was a disaster and the product took longer to ship. We improved over time, but I can’t help but think that had I taken the time to explore more, we could’ve avoided a lot of mistakes with the product.

In our very first internal release, I didn’t consider that components might behave differently across operating systems and screen sizes. The result was a bunch of text and buttons running off the screen.

In our following release, I chose to use Immutable.js to manage our Redux state updates, but it ended up being really heavy. Had I taken a few minutes to learn about Immer or hooks like useReducer, we might not have had the app crashing from memory leaks every couple of sessions because of crazy tree mutations. I was too impatient and went with what I already knew, which was a small subset of the available options.
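For a sense of what the lighter-weight route could have looked like, here is a minimal sketch of a plain reducer that updates nested state without mutating it. The state shape, action names, and `reducer` function are hypothetical and are not the app's actual code. A function with this `(state, action) => state` shape is the kind of reducer that both Redux and React's `useReducer` accept; React itself is left out here so the snippet runs on its own.

```typescript
// Hypothetical state shape and actions, invented for illustration only.
interface User {
  id: string;
  displayName: string;
  notes: string[];
}

interface AppState {
  users: Record<string, User>;
}

type Action =
  | { type: "addNote"; userId: string; note: string }
  | { type: "rename"; userId: string; displayName: string };

// A pure reducer: it never mutates `state`, and it copies only the
// objects along the path that actually changed.
function reducer(state: AppState, action: Action): AppState {
  const user = state.users[action.userId];
  if (user === undefined) return state; // unknown id: return state unchanged

  switch (action.type) {
    case "addNote":
      return {
        ...state,
        users: {
          ...state.users,
          [user.id]: { ...user, notes: [...user.notes, action.note] },
        },
      };
    case "rename":
      return {
        ...state,
        users: {
          ...state.users,
          [user.id]: { ...user, displayName: action.displayName },
        },
      };
  }
}

const initial: AppState = {
  users: { u1: { id: "u1", displayName: "Ada", notes: [] } },
};
const next = reducer(initial, { type: "addNote", userId: "u1", note: "hello" });
console.log(next.users["u1"]?.notes, initial.users["u1"]?.notes);
```

The spread-based copies get verbose as state gets deeper, which is the problem Immer is meant to ease; whether either option would have fixed the crashes described above is something I can only guess at.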

This tendency towards exploitation over exploration continued. I only learned about React Hooks midway through building [poisson.us](http://poisson.us), and about important concepts like ORMs, the LLVM compiler backend, and React Contexts while building contour.so. By setting ambitious deadlines, I was writing more code but learning less and building lower-quality things.

How to Find the Healthy Balance

This raises the question: how do we know when to start? In the case of software, how do we know when to stop preparing and start building?

I think a few relevant factors come into play:

Deadlines. If the deadline is tight and you already have a good idea of how to execute, you should optimize for starting soon.
Background knowledge. If you have a good mental model, there’s less of a need to “explore.” On the other hand, if you’re in unfamiliar territory, it certainly pays dividends to learn what’s part of the landscape.
Extensibility. Are you building using frameworks that you’re interested in using in the future as well? Is it valuable for you to be good at building what you’re currently building/planning to build?

I strongly believe that if you’re building something valuable, which usually takes a long time to build, exploration pays compounding dividends. For these types of projects, it’s really important to take the time to know what you’re doing, because poor early decisions can cost you hundreds of hours down the line.

That is not to dismiss the value of shipping things quickly and seeing how customers react. This approach works well when you don’t have a clear idea of what the end product should look like, especially when that depends on how users will react to it.

With this, I’m ending this post with a newfound optimism to learn more and write less code. I want to optimize for more exploration, and hopefully the results will speak for themselves.