Docker: Be careful about the scratch image

After I wrote my previous post, some suggested that I can cut down the image size further by using a “scratch” image. And that’s true, “scratch i”s a reserved 0-sized image with nothing in it. And utilizing a scratch binary image did cut down the size of the final Docker image from 13MB to 7.5MB. Pretty good, right? Except the image cannot do an SSL cert verification because of the missing SSL certs!!!

Failed to reach google.com: Get https://google.com: x509: certificate signed by unknown authority

Read More

Indeterminate Progress bar is an inferior UX design

60 milliseconds is when we notice something isn’t immediate. Any user interaction, that involves sending data over the network or doing heavy computation on it, usually takes way longer than 60 milliseconds. So, we end with a progress bar. There are two broad categories of progress bars, one that shows the absolute/relative progress, a determinate progress bar, and one that does not an indeterminate progress bar.


Fancy but will this ever finish?

Now, here’s what’s the state transition looks like,
task started -> task progressed -> task finished/failed.

So, ideally, a software engineer should be implementing all these states. Most don’t. Some push the error handling to a later point. And then forget.

In such a situation, the indeterminate progress bar just stays there, spinning and spinning. While a determinate progress bar actually stops. So, a determinate progress bar not only communicates the speed of progress to the user but also, communicates if the progress has stalled.

So, why aren’t they more popular? Probably because it is a bit more work, you have to decide what the unit of work is. And if you are too off, the progress can look arbitrary. In my belief, even if the progress looks like a car navigating a bunch of speed breakers, it is superior since it communicates more actionable information to the end-user.


Slow towards the end but the user can see the rate of progress and if the job is stuck

How many source-code repositories should a startup have

Recently, this question came up during the discussion. “How many source-code repositories should a startup have?”

There are two extreme answers, a single monorepo for all the code or repository for each library/microservice. Uber, for example, had 8000 git repositories with only 200 engineers!

I think both extremes are wrong. Too many repositories make it hard to find code and one single repository makes it harder to do simple things like testing, bisecting (to find buggy commit), deciding repository owners.

Here’s what I have seen works best.

  1. Backend code – ideally, one single repository.
  2. Frontend (web) code – one single repository. This can be separately tested and even outside contractors can access this. The web code is anyways shipped to the client, so, leaking it isn’t that big of a worry.
  3.  Mobile code – one repository per platform (Android, iOS, etc.) if there is no code dependency. Or one single repository if a common codebase like React Native or Flutter is being used. Again, those who access the mobile code won’t need access to the backend code. And given that this code in compiled form is sent to the users, there is less worry around leaking it.
  4. Open-source code – One repository per open-source package.

As opposed to a monorepo setup, this setup makes it harder to make simultaneous changes to backend and frontend, or backend and mobile apps. And that should be the right behavior anyways. Since backend services and frontend can be, or eventually will be, deployed independently of each other. Mobile apps are deployed independently anyways, so, the backend has to provide backward-incompatibility for that.

The two-step approach to big code modifications

We all have to make significant code changes from time to time. Most of these code changes are large. Consider the scenario that you merged one such significant change, and then other team members made a few more changes on top. Then a major bug is detected. You desperately make the fix. It makes it in. You declare a victory, and a few hours later, your colleague notices another bug/crash/performance regression. Your commit cannot be reverted. It isn’t just about you. Many others have built on top of the change you made—the code sloths along in this broken state for a few days before you eventually fix it. Everyone has faced this issue at some point or the other.

If the code change is small, this is a non-issue, you can revert it and fix it at your own pace. Therefore, when making significant code changes, always try to do it in two different commits (pull requests). In the commit, you add the new code, add a switch (command-line flag or a constant) to turn it on, and keep that switch off by default. In the second commit, turn the switch on. Now, if there is a problem, you immediately turn the switch off and start working on a fix. No one else will deal with the broken code. It would be even better if the switch is a command-line flag since you can turn on the flag for 10% of the machines (or users) and see the behavior for a few days before rolling it out to 100%. It big teams, it is usually good to add a comment to switch mentioning when it should expire or else you will end with Uber scale problem.

There are a few cases where this cannot be done, for example, during big code refactoring. I think big code refactoring touching code written by multiple teams is almost rarely justified in a single commit.

Examples of some cases where this is useful

  1. Switching over from consuming data via v1 of some API to v2
  2. Switching over from returning computed results to stored results (for faster response time but possibly inaccurate outcome)
  3. Switching over from JSON to Protocol Buffer/Thrift
  4. Switching over from VMs to Kubernetes (don’t delete the old code yet, you might regret it)

Incremental testing: save time and money on CI for monorepo

To use monorepo or not is an eternal debate. Each has its pros and cons. Let’s say you decide to go with monorepo, one major issue you will face over time is slow testing. Imagine a monorepo, consisting of an Android app, an iOS app, some backend code, some web frontend code. In only very few occasions will someone modify more than one of those simultaneously.

Further, add to the fact that most of these projects confined to their directories would be using different build systems as well, for example, gradle for Android, yarn/npm for Javascript, go/rust/java/npm for the backend. The total build time, as well as test time, will only grow over time. It annoys developers making small modifications to their part of the codebase. And it slows down the development velocity drastically.

Read More

How to deploy side projects as web services for free

In 2020, the web is still the most accessible permission-less platform. For the past few months, I have been playing and building side-projects to simplify my life. I started with a Calendar Bot for scheduling events, DeckSaver for downloading decks from Docsend, AutoSnoozer for email management, and StayInTouch for maintaining follow-ups.

When I started on this journey, I had the following in my mind.

  1. Cost of domain ~ 12$ a year or 1$ a month
  2. Cost of a VM ~ 10$ a month

Read More

Docker 101: A basic web-server displaying hello world

A basic webserver

Docker containers are small OS images in themselves which one can deploy and run without worrying about dependencies or interoperability. All the dependencies are packed in the same container file. And the docker runtime takes care of the interoperability. You are not tied to using a single language or framework. You can write code in Python, Go, Java, Node.js, or any of your favorite languages and pack it in a container.

Consider a simple example of a Go-based webserver

Read More

The first two statements of your BASH script should be…

#!/usr/bin/env bash
set -euo pipefail

The first statement is a Mac, GNU/Linux, and BSD portable way of finding the location of the bash interpreter. The second statement combines

    1. “set -e” which ensures that your script stops on first command failure. By default, when a command fails, BASH executes the next command. Looking at the logs, you might feel that the script executed successfully while some commands might have failed. Caveat: Be careful about applying it to existing scripts.
    2. “set -u” which ensures that your script exits on the first unset variable encountered. Otherwise, bash replaces the unset variables with empty default values.
    3. “set -o pipefail” which ensures that if any command in a set of piped commands failed, the overall exit status is the status of the failed command. Otherwise, the exit status is the status of the last command.

References:

  1. Unofficial Bash strict mode
  2. ExplainShell

Keep your dotfiles bug-free with Continuous Integration

Update: As of April 2020, I have switched over to GitHub Actions. Travis CI has become buggy and flaky over time and I got tired of trying to keep the builds green. My GitHub action scripts can be seen here.

Just like many software engineers, I maintain my config files for GNU/Linux and Mac OS in a git repository. Given that, I wrote a fair bit of them in interpreted code, notably, Bash, it is a bit hard to ensure that it is bug-free. The other problem I face is that packages on homebrew, the Mac OS package manager becomes obsolete and gets deleted from time to time.

I added CI testing on Travis CI to prevent these breakages and to ensure that my dotfiles are always in good shape for installation. The great thing about Travis CI is that it is entirely free for open-source repositories even for testing on Mac OS containers.

Read More

Subscribe For Latest Updates

Signup for our newsletter and get member-only articles




-->