Chunk Honey is a more accessible machine learning model

I never set out to create a machine learning application. Sometimes, as an engineer, you end up making a thing, simply because a thing needs to exist, it doesn’t exist, and you end up being the person who conjures a thing into existence. For me, Chunk Honey is that thing. It was a labor of love, frustration, consternation, and constant headaches.

So how did I end up here?

In November of 2017, an email landed in my inbox covering Amazon’s new machine learning platform, SageMaker. There are plenty of upsides to SageMaker, and Amazon covers them well on their website, so I won’t rehash them here. But there’s one important note, and I owe it to my dad, who impressed an idea on me that has stuck with me for my entire life.

Computers aren’t an arms race to the top. “Human Factors” is a particular branch of computer science that looks at how human beings actually use computers and get the most value from them. As a society, we tend to fixate on the highest specifications of the newest devices and software. There will always be a better camera on the new iPhone, and new GPUs will be able to render 8K games at 150 frames per second, blowing your mind with detail, shading and light.

For me, I have always been more interested in what value we can derive from the lowest specifications. How can a Raspberry Pi or an Arduino do time-saving tasks? How can a simple application make the most impact, with the quickest deployment time? In the past year alone, I have written two extremely dirty (and effective) applications that have delivered millions of dollars of value to small businesses in Make Message and Scan Dem Digits. Both were deployed in weeks, and then iterated until they were polished around the edges.

The point of Chunk Honey was to make machine learning more accessible for small and medium-sized businesses. I’m not trying to sound overly broad here, but machine learning hasn’t exactly been accessible to the masses. In fact, I don’t really interact in many machine learning engineer circles, because… uhm… I think sometimes when you have a lot of people who are super-insular, they get intimidated by ideas outside of their comfort zone. The Reddit page for machine learning is cool sometimes, but other times I just scratch my head and wonder what we’re doing with this powerful technology.

Don’t get me wrong. I love music videos. I’ve made… quite a few in my life. But come on, man. A recent (and very popular) post on the board was about how we can help indie bands be more creative and cool. The (appropriately named) TRASH APP is a machine learning tool for making sick music videos, dawg. No offense, but I spent 14 years in the music business, and this app is dead on arrival. Anyway…

Business tools are boring. I get it. It’s way cooler to say “I built this sick music video stitching tool” than “I can help your business do better lead acquisition by modeling your customers and their various psychographic profiles” — trust me. I know. Like I said, I worked a long time in music. The origins of Chunk Honey actually started when I was still working full time in music. I had this idea that I would figure out which listeners were most likely to buy merch and concert tickets. Then I said to myself…

“You know, if I’m already doing this predictive model, why not use it for something that’s more useful, for more people, across more business types?”

The first iteration of Chunk Honey was called Drake, named after the rapper and singer, Drake. (As a joke, I even got Drake tattooed on me. That was fun, right?) I had this hypothesis that there was no such thing as a “fan of a certain genre”, but rather we had types, and those types were a spectrum of total available hours you could listen to music in a day. Basically, the number of hours you can listen to music in a single day is your “ear inventory”, and that can also be populated by podcasts, and music videos. I wanted to figure out how to divide up that inventory, and if there were any correlations between different “listener types” among “inventory” trends. I know it sounds strange, but what you listen to is inventory.

(I was going to try to create a way to visualize the data, but realized I’d end up looking like this…)

That’s where it all started coming together for me. Everything we consume is inventory. If I wanted to create an Excel spreadsheet of your total “annual listening hours” — we’d be talking about inventory. It’s a model of consumption, and that’s when I had that “YEAH!” moment in my head. My machine learning model needed to model inventory trends. How inventory is added, how inventory is depleted, and which inventory is the most popular.
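To make the inventory idea concrete, here is a minimal sketch of that kind of bookkeeping. It is not Chunk Honey’s actual code. The function name, the event format, and the sample numbers are all made up for illustration, and I’m assuming “most popular” simply means “most hours consumed.”

```python
from collections import defaultdict

def inventory_summary(events):
    """Summarize a stream of (item, hours_delta) events.

    Hypothetical format: a positive delta adds hours to the inventory,
    a negative delta depletes (consumes) them.
    """
    added = defaultdict(float)
    depleted = defaultdict(float)
    for item, delta in events:
        if delta >= 0:
            added[item] += delta
        else:
            depleted[item] += -delta
    # Assumption: the most popular item is the one consumed the most.
    most_popular = max(depleted, key=depleted.get) if depleted else None
    return dict(added), dict(depleted), most_popular

# Made-up sample data: hours added to and consumed from an "ear inventory".
events = [
    ("music", 3.0),
    ("podcasts", 1.0),
    ("music", -2.0),
    ("podcasts", -0.5),
    ("music", 1.0),
]
added, depleted, most_popular = inventory_summary(events)
```

The three return values line up with the three questions above: how inventory is added, how it is depleted, and which inventory is the most popular.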

Then I wanted to know what types of people had the same inventory use profiles. Do I learn something about a person when I see what they’re consuming? Can I make inferences from that information?
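One simple way to ask “who has the same inventory use profile?” is to compare people’s consumption as vectors. The sketch below uses plain cosine similarity, which is a standard technique I’m swapping in for illustration and not necessarily what Chunk Honey does. The names, categories, and hours are invented.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two {category: hours} profiles."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile is treated as similar to nothing
    return dot / (norm_a * norm_b)

def most_similar(target, profiles):
    """Return the name of the profile closest to `target` (excluding itself)."""
    others = {name: p for name, p in profiles.items() if name != target}
    return max(others, key=lambda name: cosine_similarity(profiles[target], others[name]))

# Hypothetical listeners and their weekly hours per category.
profiles = {
    "ana": {"music": 3.0, "podcasts": 1.0},
    "ben": {"music": 6.0, "podcasts": 2.0},
    "cy": {"podcasts": 4.0, "videos": 1.0},
}
closest = most_similar("ana", profiles)
```

Here “ana” and “ben” split their hours in the same proportions, so they come out as the closest match even though “ben” listens twice as much. That is the point of comparing profiles instead of raw totals.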

Allow me to help you understand what I mean here. It’s about to get a little more detailed, but I think you’ll get it.

Say you walk into a grocery store, and you go to the chip aisle. You have potato chips, corn chips, chips in tubes, pork rinds, pretzels, little bags of chips, and within those categories, each one has flavors and sizes.

If you were to watch that chip aisle with a camera, you’d witness us all do the same thing: Be overwhelmed with options, and yet still come to a conclusion. Because the reality is, we all like something different, even though many of the products are fundamentally similar in price, value, and taste. The point of Chunk Honey is to learn about you, based upon what chips you buy, what juice flavors you like best, what cheese you pick out, or your favorite bread, metaphorically speaking.

To quote famed Bay Area rapper E-40, “Everybody got choices.” The choices we make are pictures of who we are. And truly, we’re all different, sure — but you’d be surprised how all of our shared experiences make us the same. That’s why we are able to laugh at the same jokes; we’ve all had an idiot boss who drives us nuts, or laughed at some idiot family member oversharing on social media in the most cringey way.

Eight weeks ago, I created the first widely successful deployment of Chunk Honey for a small business in the concrete sector. I had to immerse myself in everything related to concrete, poring over YouTube videos, brochures, and how-to videos. Before I created a learning model, I had to learn about what I was supposed to find.

That’s where it comes full circle, back to Human Factors. Machine learning applications are still only as smart as the person programming them. Someone has to inject the humanity into the code, and make these models learn the correct information. You can’t just set machine learning loose in the ether, and hope it works out. Chunk Honey is smaller and more modular than other models, and that’s a function of human factors. Chunk Honey needs a little more input than other models. Making your own framework is harder, sure, but that’s why SageMaker was so attractive to me as a basis for everything. SageMaker’s “BYOF” (bring your own framework) made sense in my head, because I didn’t need someone else’s solutions. I needed my own, and yeah… it took me two years to get it right.

(Long pause. Holy shit, well, not quite. But it has been like, one year and ten months getting this thing right. Wow.)

Chunk Honey (as a framework and machine learning model) is something that will likely remain small, and it’s an application that will likely only see internal use within Bee Morris Group. It’s not going to end up on GitHub, because I’m not inclined to give away almost two years of my work. Even as a Linux, open-source dude, I know that if I released what I have made, it would likely become something I compete against, and I’d rather maintain my competitive advantage.

If you think this all sounds cool, I’d love to tell you more about it.