LLMs: great for business but bad business

The true value proposition of LLMs lies in their ability to convert unstructured data from sources like websites and documents into structured information with reasonably high accuracy. Yet the real profit lies in the products built on top of LLM technology. Each year, approximately 4 million books are published worldwide. On average, a book contains fewer than 120,000 words, which translates to fewer than 160,000 tokens in LLM (Large Language Model) terms. Imagine if every single one of these books were generated by GPT-4: it would amount to an astounding 640 billion tokens. At $5 per million tokens, generating all these books would tally up to about $3.2 million! Let's say the book market represents only about 1% of the total LLM text generation opportunity. Even then, the total addressable market of LLM text generation is approximately $300 million annually, a modest figure when compared to AWS, which raked in $90 billion in 2023 as the cloud market leader.
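The arithmetic behind that estimate is easy to check. Here is a minimal sketch in Python that uses only the figures quoted above. The function and constant names are made up for illustration, and the inputs are rough upper bounds rather than measured data.

```python
# Back-of-the-envelope check of the figures quoted in the excerpt.
# All names here are illustrative; the inputs are the excerpt's own estimates.

BOOKS_PER_YEAR = 4_000_000          # books published worldwide each year
TOKENS_PER_BOOK = 160_000           # upper bound for a ~120,000-word book
PRICE_PER_MILLION_TOKENS_USD = 5.0  # assumed price per million generated tokens
BOOK_SHARE_PERCENT = 1              # books assumed to be ~1% of all LLM text generation
AWS_REVENUE_2023_USD = 90e9         # AWS revenue figure quoted for 2023


def total_tokens(books: int, tokens_per_book: int) -> int:
    """Tokens needed if every book were machine-generated."""
    return books * tokens_per_book


def generation_cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost of generating `tokens` tokens at a flat per-million price."""
    return tokens / 1_000_000 * price_per_million


def addressable_market_usd(book_cost: float, book_share_percent: float) -> float:
    """Scale the book-only cost up to the whole text-generation market."""
    return book_cost * 100 / book_share_percent


tokens = total_tokens(BOOKS_PER_YEAR, TOKENS_PER_BOOK)
book_cost = generation_cost_usd(tokens, PRICE_PER_MILLION_TOKENS_USD)
market = addressable_market_usd(book_cost, BOOK_SHARE_PERCENT)

print(f"Tokens per year:    {tokens:,}")
print(f"Cost for all books: ${book_cost:,.0f}")
print(f"Total market:       ${market:,.0f}")
print(f"AWS 2023 revenue is {AWS_REVENUE_2023_USD / market:.0f}x the market estimate")
```

With these inputs the market works out to $320 million, which the excerpt rounds to roughly $300 million. Changing any single assumption by 10x still leaves the total far below AWS's revenue.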

When to commit generated code to version control

Ideally, generated code should not be committed to version control. Committing it can sometimes speed up testing and code generation, but it is a design smell. It is better to cache generated code via CI caching. The worst part of committing generated code is that it is hard to even detect what changed. However, there are a few specific circumstances where committing generated code/config/data to version control is worth it. ...

Some data on podcasting

A few years back, I scraped data on podcasters from iTunes. The data was a bit underwhelming and made me realize that podcasters can't be a potential market. The data is a bit dated, but I believe it is still relevant.

Timing

Two cryptocurrency exchanges came out of Y Combinator early on, one in 2012 and the other in 2013. One returned 1500X to early investors. The other ceased to exist after 2 years. What happened?

Real vs Theoretical Engineering Productivity

Some engineering productivity is real. Some is theoretical.

Play-to-earn games

With Axie Infinity, there has been a sudden rise of play-to-earn games. As I see it, the core idea is that the financial benefits of in-game purchases accrue to the network instead of just the game studio. But most casual games have a very short shelf life of a few years.

Too much documentation is harmful

As code changes, documentation becomes stale over time. This happens at big companies. This happens at small companies. Unlike code, documentation is not compiled or tested. The code is executed. If the code execution fails or produces incorrect results, it is fixed with much higher urgency.

Engineering stack

Most startups think of the engineering stack as if it is a single cohesive thing. However, I believe that there are three different engineering stacks that are loosely coupled to each other.

Engineering Guardrails

Guardrails are meant to protect us from tripping over. The same can be said about engineering guardrails.

VCs are anti-personas for a B2C startup

The early adopters of Instagram were not VCs.