If you are running Python in production, you will almost certainly have to decide which web framework to use.

Let’s consider a rudimentary “Hello, World!”-based test comparing the performance of two popular Python web frameworks: FastAPI and Flask.

I will intentionally use Docker for benchmarking as most deployments today will explicitly or implicitly rely on Docker.

For Flask, I will use this Dockerfile:
# Build: docker buildx build -t python-flask -f Dockerfile_python .
# Size: docker image inspect python-flask --format='{{.Size}}' | numfmt --to=iec-i
# Run: docker run -it --rm --cpus=1 --memory=100m -p 8001:8001 python-flask
FROM python:3.13-slim AS base

WORKDIR /app
RUN pip3 install --no-cache-dir flask gunicorn
SHELL ["/bin/bash", "-c"]
RUN echo -e "\
from flask import Flask\n\
app = Flask(__name__)\n\
\
@app.get('/')\n\
def root():\n\
  return 'Hello, World!'\n\
" > /app/web_server.py

ENTRYPOINT ["gunicorn", "web_server:app", "--bind=0.0.0.0:8001", "--workers=4", "--threads=32"]

And for FastAPI, I will use this Dockerfile:
# Build: docker buildx build -t python-fastapi -f Dockerfile_python .
# Size: docker image inspect python-fastapi --format='{{.Size}}' | numfmt --to=iec-i
# Run: docker run -it --rm --cpus=1 --memory=100m -p 8002:8002 python-fastapi
FROM python:3.13-slim AS base

WORKDIR /app
RUN pip3 install --no-cache-dir fastapi gunicorn uvicorn
SHELL ["/bin/bash", "-c"]
RUN echo -e "\
from fastapi import FastAPI\n\
app = FastAPI()\n\
@app.get('/')\n\
async def root():\n\
    return {'message': 'Hello World'}\
" > /app/web_server.py

ENTRYPOINT ["gunicorn", "web_server:app", "--bind=0.0.0.0:8002", \
  "-k", "uvicorn.workers.UvicornWorker", "--workers=4", "--threads=32"]

As you will notice, the only material difference is the framework (plus the matching uvicorn worker class and port for FastAPI). The rest of the setup is exactly the same.
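Part of the gap comes from the serving model rather than the frameworks alone: gunicorn's default workers speak WSGI, dedicating a worker thread to each request, while the uvicorn worker speaks ASGI and can interleave many requests on one event loop. A stripped-down, stdlib-only sketch of the two interfaces that Flask and FastAPI respectively implement (not the frameworks' actual internals):

```python
# WSGI (what Flask implements): a synchronous callable; the server
# parks a worker/thread on it for the duration of the request.
def wsgi_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello, World!"]

# ASGI (what FastAPI implements): an async callable; the server can
# run many of these concurrently on one event loop while they await I/O.
async def asgi_app(scope, receive, send):
    await send({"type": "http.response.start", "status": 200,
                "headers": [(b"content-type", b"text/plain")]})
    await send({"type": "http.response.body", "body": b"Hello, World!"})
```

Real servers (gunicorn, uvicorn) drive these callables for you; the point is only that the ASGI side is cooperative, which is one reason the async stack sustains higher throughput here.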

Comparing the results, we can see FastAPI is about 3x faster on these basic requests.

Bash
$ h2load --h1 -n1000 -c40 'http://localhost:8001'  # Flask
...
                      min         max         mean         sd        +/- sd
time for request:      406us     83.37ms     12.56ms     22.46ms    86.70%
...
req/s           :      75.98       83.04       79.36        2.08    60.00%

$ h2load --h1 -n1000 -c40 'http://localhost:8002'  # FastAPI results
...
                      min         max         mean         sd        +/- sd
time for request:      825us     29.78ms      4.07ms      6.13ms    91.70%
req/s           :     231.07      256.41      241.98        8.86    52.50%
...
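As a sanity check on the headline number, take the ratio of the mean per-client req/s figures reported above:

```python
flask_rps = 79.36     # mean req/s per client from the Flask run above
fastapi_rps = 241.98  # mean req/s per client from the FastAPI run above
print(f"{fastapi_rps / flask_rps:.2f}x")  # prints "3.05x"
```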

So, in terms of throughput, I am fairly convinced that no one should choose Flask over FastAPI for new projects.

Even if query-serving time is only 20% of the end-to-end latency, one gets about a 15% throughput improvement by choosing FastAPI over Flask.
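The arithmetic behind that estimate is Amdahl's law applied to latency; a quick sketch using the 20% share and 3x speedup from above:

```python
def overall_speedup(fraction_affected: float, local_speedup: float) -> float:
    """Amdahl's law: overall speedup when only a fraction of the
    end-to-end latency benefits from a local speedup."""
    return 1.0 / ((1.0 - fraction_affected) + fraction_affected / local_speedup)

# Serving is 20% of end-to-end latency; FastAPI is ~3x faster at it.
speedup = overall_speedup(0.20, 3.0)
print(f"{(speedup - 1) * 100:.0f}%")  # prints "15%"
```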

Even if you decide to use Flask, remember that its built-in development server handles only one request at a time by default and is not meant for production; run it behind gunicorn, the way I did above.
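If you do run Flask under gunicorn, the CLI flags above can equivalently live in a config file; a minimal sketch, using gunicorn's default config file name `gunicorn.conf.py`:

```python
# gunicorn.conf.py: equivalent of the CLI flags used in the Flask Dockerfile
bind = "0.0.0.0:8001"
workers = 4   # separate processes, sidestepping the GIL for parallelism
threads = 32  # threads per worker; a value > 1 selects the gthread worker
```

With this file in the working directory, `gunicorn web_server:app` picks it up automatically.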

It is definitely concerning, though, that FastAPI, as of 2025, has not yet had a stable 1.0 release.