Measuring Web Performance — TTFB, Page Weight, Image Audit with curl, sips, and Python
TL;DR: You can profile page speed, total weight, and image bloat with a 30-line script — no Lighthouse, no WebPageTest. Drop it into CI and regressions get caught automatically.
Why measure directly
"It feels slow…" is a dead end for prioritization. Three numbers actually matter:
- TTFB (Time To First Byte) — server-to-first-byte latency. The infra / CDN / SSR signal.
- Total page weight — HTML + JS + CSS + images. The raw input to LCP and INP.
- Image weight per page — almost always the culprit on slow sites. (Card thumbnail rendered at 200×200 but downloading the 1024×1024 original.)
Lighthouse is great for the holistic score, but environmental variance (local network, throttling) makes scores wobble each run. The curl + sips approach reports raw bytes and milliseconds, varies only with the network, and is cron/CI-friendly. Use Lighthouse for the big quarterly review and this script for daily audits.
Tools — all built-in or standard
| Tool | How to get it | Used for |
|---|---|---|
| curl | macOS/Linux preinstalled | TTFB, total time, transfer speed, page size |
| sips | macOS built-in (Linux: identify from ImageMagick) | Image dimensions, bytes |
| python3 | macOS/Linux preinstalled | HTML parsing, asset URL extraction, aggregation |
No installs needed on macOS. On Linux, ImageMagick may need installing to get identify.
Part 1 — curl -w for TTFB, total, size
The key: -w (write-out) format string
curl -w prints templated timing and response info after the request finishes. Seven variables matter:
curl -w "
DNS: %{time_namelookup}s
Connect: %{time_connect}s
TLS: %{time_appconnect}s
TTFB: %{time_starttransfer}s
Total: %{time_total}s
Size: %{size_download} bytes
Speed: %{speed_download} bytes/s
" -o /tmp/out.html -s "https://taystudios.com/blog/en/"
-o /tmp/out.html writes the body to file. -s silences the progress bar.
What each variable means
| Variable | What it measures |
|---|---|
| time_namelookup | End of DNS resolution |
| time_connect | End of TCP handshake (includes namelookup) |
| time_appconnect | End of TLS handshake (HTTPS only) |
| time_pretransfer | Just before sending the request |
| time_starttransfer | TTFB: first byte of response received |
| time_total | End of body download |
| size_download | Body bytes |
| speed_download | Average download speed |
Each value is cumulative, so per-phase = next − previous:
DNS: time_namelookup
TCP: time_connect - time_namelookup
TLS: time_appconnect - time_connect
Server: time_starttransfer - time_appconnect ← actual server processing + response start
Transfer: time_total - time_starttransfer ← body transfer
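The subtraction above is easy to script. Here is a minimal sketch, assuming the curl timings have already been parsed into a dict keyed by variable name; `phase_breakdown` and `PHASES` are hypothetical names, not part of curl or of the original script.

```python
# Hypothetical helper: turn curl's cumulative timings (seconds)
# into per-phase durations (milliseconds).
PHASES = [
    ("dns", "time_namelookup"),
    ("tcp", "time_connect"),
    ("tls", "time_appconnect"),
    ("server", "time_starttransfer"),
    ("transfer", "time_total"),
]


def phase_breakdown(timings):
    """Return {phase: milliseconds} using next-minus-previous."""
    result = {}
    previous = 0.0
    for name, key in PHASES:
        current = timings[key]
        # curl reports 0 for phases that did not happen (e.g. TLS on plain HTTP).
        if current == 0:
            result[name] = 0.0
            continue
        result[name] = round((current - previous) * 1000, 1)
        previous = current
    return result


# Values copied from the measurement shown in this article.
sample = {
    "time_namelookup": 0.014286,
    "time_connect": 0.048982,
    "time_appconnect": 0.089744,
    "time_starttransfer": 0.337152,
    "time_total": 0.337994,
}
print(phase_breakdown(sample))
```

Skipping zero-valued phases keeps the subtraction honest for plain-HTTP URLs, where a naive difference would produce a negative TLS time.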
Real measurement (taystudios.com/blog/en/)
DNS: 0.014286s ← 14 ms DNS
Connect: 0.048982s ← 35 ms TCP
TLS: 0.089744s ← 41 ms TLS
TTFB: 0.337152s ← 247 ms server response (GitHub Pages CDN)
Total: 0.337994s ← 0.8 ms body transfer
Size: 21702 bytes ← HTML 22 KB
Reading: the server-wait phase (247 ms) is 73% of total time, so the 337 ms TTFB is dominated by the CDN edge response rather than by DNS, TCP, or TLS. Improvement path: same-edge cache warming. The 22 KB HTML body transferred in 0.8 ms, which is negligible.
Cache-busting — force fresh every time
CDN and proxy caches kick in on repeat requests (curl itself keeps no cache). For baselining, force a cache miss:
# Cache busting: timestamp query string
curl -s "https://taystudios.com/blog/en/?ts=$(date +%s)"
# Or via Cache-Control header
curl -s -H "Cache-Control: no-cache" "https://taystudios.com/blog/en/"
# Both — safest
curl -s -H "Cache-Control: no-cache" "https://taystudios.com/blog/en/?ts=$(date +%s)"
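A dummy ?ts= query changes the cache key on CDNs that include the query string in it, which forces a fresh response. The Cache-Control header asks upstream proxies to revalidate, though a CDN may be configured to ignore it.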
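When the audit is driven from Python rather than the shell, the same timestamp trick can be applied without breaking URLs that already carry a query string or fragment. A small sketch, assuming a hypothetical helper named `cache_busted`; the `ts` parameter name simply mirrors the curl examples above.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busted(url, param="ts", value=None):
    """Return url with a cache-busting query parameter added or replaced."""
    parts = urlsplit(url)
    # Keep existing parameters, dropping any stale copy of the busting one.
    query = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k != param
    ]
    stamp = int(time.time()) if value is None else value
    query.append((param, str(stamp)))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(cache_busted("https://taystudios.com/blog/en/"))
```

Appending `?ts=...` by string concatenation, as the shell version does, is fine for a bare URL but produces a malformed URL when a query string is already present.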
Part 2 — sips for image dimensions and bytes
sips is macOS's built-in image tool. One-liner for dimensions:
sips -g pixelWidth -g pixelHeight cover.jpg
# Output:
# /path/cover.jpg
# pixelWidth: 1024
# pixelHeight: 1024
Bytes from stat:
stat -f %z cover.jpg # macOS
stat -c %s cover.jpg # Linux
Audit a whole folder (Python)
import subprocess
from pathlib import Path

# Walk every cover image and report size, dimensions, and a severity flag
for f in sorted(Path('blog/assets/posts').rglob('cover.*')):
    sz = f.stat().st_size
    out = subprocess.check_output(
        ['sips', '-g', 'pixelWidth', '-g', 'pixelHeight', str(f)],
        stderr=subprocess.DEVNULL
    ).decode()
    # sips prints "pixelWidth: N" and "pixelHeight: N" on separate lines
    dims = [l.split(':')[1].strip() for l in out.split('\n') if 'pixel' in l]
    w, h = dims[0], dims[1]
    # Red above 200 KB, yellow above 50 KB
    flag = '🔴' if sz > 200_000 else '🟡' if sz > 50_000 else '✓'
    print(f'{sz/1024:>7.1f}KB {w}x{h} {flag} {f}')
Real output (this blog's audit, partial)
416.5KB 1332x2088 🔴 aws-summit-2024/cover.jpg ← weird aspect, og:image
375.0KB 1024x1024 🔴 hypothesis-testing/cover.jpg
375.0KB 1024x1024 🔴 uncertainty-variance/cover.jpg
375.0KB 1024x1024 🔴 t-test/cover.jpg
4.5KB 200x200 ✓ career-ktds-newgrad-2021/cover.jpg ← company logo
4.1KB 225x225 ✓ kaist-grad-mech-written-set1/cover.png
Diagnosis: all 7 statistics-series covers are 1024×1024 at 375 KB, but the card thumbnail renders at only 200×200. That is about 26× more pixels than needed. Immediate fix candidate.
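The oversize factor in that diagnosis is a pixel-count ratio, which can be computed per image once the rendered size is known. A minimal sketch, assuming a hypothetical `oversize_ratio` helper; the device-pixel-ratio parameter is an addition not discussed in the article, included because high-density screens legitimately need more than the CSS size.

```python
def oversize_ratio(natural, rendered, dpr=1.0):
    """Ratio of pixels shipped to pixels needed.

    natural  -- (width, height) of the file, e.g. from sips
    rendered -- (width, height) in CSS pixels on the page
    dpr      -- device pixel ratio to budget for (assumption: 1.0 by default)
    """
    natural_w, natural_h = natural
    rendered_w, rendered_h = rendered
    needed = (rendered_w * dpr) * (rendered_h * dpr)
    return (natural_w * natural_h) / needed


# The statistics-series covers: 1024x1024 file shown at 200x200.
print(f"{oversize_ratio((1024, 1024), (200, 200)):.1f}x at 1x density")
print(f"{oversize_ratio((1024, 1024), (200, 200), dpr=2):.1f}x at 2x density")
```

Budgeting for a 2x display shrinks the ratio considerably, so a resize target of roughly twice the rendered size is a common compromise.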
Part 3 — Per-page image weight (HTML parsing + aggregation)
Pull every <img src> cover path from the built HTML and sum the file sizes:
import re
from pathlib import Path

# Per-page cover weight from built HTML
for page in ['blog/en/index.html', 'blog/en/category/data/index.html']:
    h = Path(page).read_text(encoding='utf-8')
    # Matches src="...cover.<ext>" attributes
    imgs = re.findall(r'src="([^"]*cover\.[a-z]+)"', h)
    total = 0
    for src in imgs:
        # Assumes src is a path relative to the blog/ directory
        f = Path('blog') / src
        if f.exists():
            total += f.stat().st_size
    print(f'{page}: {len(imgs)} covers · {total/1024:.1f} KB')
Real output (this blog)
| Page | Covers | Total |
|---|---|---|
| /blog/en/ (home, first page, 10 posts) | 10 | 1524 KB |
| /blog/en/category/data/ (7 statistics posts) | 7 | 2625 KB 🔴 |
| /blog/en/category/reviews/career/ (6 company logos) | 6 | 250 KB ✓ |
The statistics category alone is 2.6 MB. On a 4G mobile connection (~50 Mbps), that's +0.4 s of download — direct LCP hit.
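The download-time figure is plain bandwidth arithmetic: bytes times eight, divided by link speed. A small sketch under that assumption; `transfer_seconds` is a hypothetical name, and the model deliberately ignores latency, TCP slow start, and parallel connections, so treat the result as a lower bound.

```python
def transfer_seconds(size_bytes, mbps):
    """Idealized transfer time: payload bits divided by link bits per second."""
    return size_bytes * 8 / (mbps * 1_000_000)


# Statistics category page: 2625 KB of covers on a ~50 Mbps link.
covers_bytes = 2625 * 1024
print(f"{transfer_seconds(covers_bytes, 50):.2f} s")
```

Rerunning the same function with a lower `mbps` value gives a quick feel for how the same page behaves on a congested or throttled connection.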
💡 "HTML loads fast (TTFB 337 ms), but the page feels slow": the curl + Python combo diagnoses this in under a minute. That's the real value.
Part 4 — Full-page asset weight (live URLs)
When you don't have the built HTML locally (auditing a live URL), curl down the page and pull asset URLs, then size each:
#!/bin/bash
URL="https://taystudios.com/blog/en/"
TMPDIR=$(mktemp -d)
# 1. Fetch HTML
curl -s "$URL?ts=$(date +%s)" -o "$TMPDIR/page.html"
HTML_SIZE=$(stat -f %z "$TMPDIR/page.html")  # macOS; on Linux use: stat -c %s
# 2. Extract image, JS, CSS URLs (resolve to absolute)
python3 -c "
import re, sys
from urllib.parse import urljoin
base = '$URL'
html = open('$TMPDIR/page.html').read()
patterns = [
    r'<img[^>]+src=[\"\']([^\"\']+)[\"\']',
    r'<script[^>]+src=[\"\']([^\"\']+)[\"\']',
    # Only matches <link> tags where href comes before rel
    r'<link[^>]+href=[\"\']([^\"\']+)[\"\'][^>]+rel=[\"\']stylesheet[\"\']',
]
for p in patterns:
    for m in re.findall(p, html):
        if m.startswith(('http','//')):
            # Protocol-relative URLs get an explicit https: scheme
            print(m if not m.startswith('//') else 'https:'+m)
        else:
            print(urljoin(base, m))
" > "$TMPDIR/urls.txt"
# 3. Measure each asset
TOTAL=$HTML_SIZE
while IFS= read -r u; do
    sz=$(curl -o /dev/null -s -w "%{size_download}" "$u")
    echo "$sz $u"
    # Treat a failed fetch (empty size) as 0 bytes
    TOTAL=$((TOTAL + ${sz:-0}))
done < "$TMPDIR/urls.txt"
echo
echo "HTML: $((HTML_SIZE / 1024)) KB"
echo "Total: $((TOTAL / 1024)) KB ($(echo "scale=2; $TOTAL/1048576" | bc) MB)"
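Drop this into CI per PR for quantitative regression detection. One caveat: curl does not request compression unless you pass --compressed, so text assets (HTML, JS, CSS) are measured at their uncompressed size, which overstates what a browser transfers.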
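Regex extraction depends on attribute order and quoting, so a `<link rel="stylesheet" href="...">` tag with rel first can slip through. As a hedged alternative, the sketch below uses the standard library's `html.parser` to collect the same three asset types; `AssetCollector` and `extract_assets` are hypothetical names, and this swaps in a parser-based technique rather than reproducing the script above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AssetCollector(HTMLParser):
    """Collect absolute URLs for images, scripts, and stylesheets."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)  # attribute names arrive lowercased
        if tag in ("img", "script") and attr.get("src"):
            self._add(attr["src"])
        elif tag == "link" and attr.get("href"):
            rel = (attr.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                self._add(attr["href"])

    def _add(self, ref):
        # urljoin resolves relative, root-relative, and //host references.
        url = urljoin(self.base_url, ref)
        # Skip data: URIs and anything else that cannot be fetched over HTTP.
        if url.startswith(("http://", "https://")) and url not in self.urls:
            self.urls.append(url)


def extract_assets(html, base_url):
    collector = AssetCollector(base_url)
    collector.feed(html)
    return collector.urls


demo = (
    '<link rel="stylesheet" href="../assets/site.css">'
    '<img src="/img/a.png">'
    '<script src="//cdn.example.com/a.js"></script>'
)
for url in extract_assets(demo, "https://taystudios.com/blog/en/"):
    print(url)
```

The output is one URL per line, so it can feed the same sizing loop as the regex version. Images referenced only from CSS or `srcset` are still not covered.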
Part 5 — Mapping to Core Web Vitals
| Measurement | CWV metric | Threshold |
|---|---|---|
| TTFB | (part of FCP; not itself a Core Web Vital) | ≤800 ms = good, >1800 ms = poor (web.dev guidance) |
| Page weight | LCP (Largest Contentful Paint) | <2.5 s = good (4G/mobile baseline) |
| Image weight | LCP and INP | LCP is image-driven on most sites |
Rough back-of-envelope (3G slow):
- HTML 50 KB + images 500 KB ≈ 3 s LCP (needs improvement; good is under 2.5 s)
- HTML 50 KB + images 2 MB ≈ 8 s LCP (poor)
Image weight is the primary LCP variable. That's why this methodology emphasizes it.
Part 6 — How this compares to Lighthouse
| | curl + sips audit | Lighthouse |
|---|---|---|
| Output type | Quantitative (bytes, ms) | Composite score (0-100) |
| Environment | Deterministic (only network varies) | Variable (CPU throttling, simulated) |
| Speed | <5 s | 30-60 s/run |
| CI fit | ✓ Easy to automate | △ Requires headless Chrome |
| Catches | Heavy assets, TTFB | + JS execution, CLS, a11y, SEO score |
Use both:
- Daily audits and CI regression: curl + sips
- Quarterly deep-dive: Lighthouse + PageSpeed Insights
CI automation — one-liner
# .github/workflows/perf-check.yml
- name: Cover image size guard
  run: |
    fail=0
    for f in blog/assets/posts/*/cover.*; do
      sz=$(stat -c %s "$f")
      if [ "$sz" -gt 100000 ]; then
        echo "::error::$f is $((sz/1024))KB (>100KB) — optimize"
        fail=1
      fi
    done
    exit $fail
PR turns red when a cover exceeds 100 KB. Regression caught without manual review.
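Because `stat -c` is Linux-only and `stat -f` is macOS-only, the same guard can be written once in Python and run unchanged on a laptop and in CI. A minimal sketch, assuming the `blog/assets/posts/*/cover.*` layout from the workflow above; `oversized_covers` is a hypothetical name.

```python
import sys
from pathlib import Path


def oversized_covers(root, limit_bytes=100_000):
    """Return (path, size) pairs for cover images above the byte limit."""
    offenders = []
    for path in sorted(Path(root).glob("*/cover.*")):
        size = path.stat().st_size
        if size > limit_bytes:
            offenders.append((path, size))
    return offenders


if __name__ == "__main__":
    # Assumed default location; pass another directory as the first argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "blog/assets/posts"
    found = oversized_covers(root)
    for path, size in found:
        # Same annotation format the bash guard emits for GitHub Actions.
        print(f"::error::{path} is {size // 1024}KB (>100KB), optimize")
    sys.exit(1 if found else 0)
```

A non-zero exit code fails the CI step, matching the behavior of `exit $fail` in the shell version.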
Recap
| Step | Tool | One-liner |
|---|---|---|
| TTFB / page time | curl -w | curl -w "%{time_starttransfer}s\n" -o /dev/null -s URL |
| Image audit | sips + find + stat | find . -name cover.\* -exec sips -g pixelWidth -g pixelHeight {} + |
| Per-page weight | Python re.findall | HTML → <img src> set → sum each file's size |
| CI guard | bash + stat -c %s | Auto-check threshold per PR |
The real value: run bash perf-baseline.sh before and after every deploy and you get quantitative before/after comparisons. Debates stop being about feelings and start being about numbers.
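One way to turn two baseline runs into a verdict is to compare them metric by metric and flag anything that grew beyond a tolerance. The sketch below assumes each run has been saved as a flat dict of numbers; the metric names, the `regressions` function, and the 10% tolerance are all hypothetical choices, not something the original script produces.

```python
def regressions(before, after, tolerance=0.10):
    """Return {metric: fractional_growth} for metrics that grew past tolerance."""
    flagged = {}
    for name, old in before.items():
        new = after.get(name)
        # Skip metrics missing from the new run or with no usable baseline.
        if new is None or old <= 0:
            continue
        change = (new - old) / old
        if change > tolerance:
            flagged[name] = round(change, 3)
    return flagged


# Illustrative numbers only, not measurements from the audit.
before = {"ttfb_ms": 337, "total_kb": 1500}
after = {"ttfb_ms": 340, "total_kb": 2625}
print(regressions(before, after))
```

A relative tolerance keeps small network jitter in TTFB from failing a build while still catching a cover image that doubles the page weight.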
📝 The measurement data above is the actual audit result from this blog (taystudios.com/blog/). It surfaced that the statistics-series covers were 1024×1024 PNG at 375 KB each. They were re-encoded to 800×800 WebP, saving ~1 MB+ per page on average.