Animating Camera Physics: Using Manim to Explain FF vs. MFT Aperture

The full-frame vs. Micro Four Thirds debate produces more heat than light online. The part that trips most people up is this: if both systems can shoot at f/2.8, are they actually equivalent? The answer is “it depends on what you mean by equivalent” — and that nuance is almost impossible to convey in a forum post.

So I built three short animations with Manim to make it visual.

What is Manim?

Manim is a Python library originally written by Grant Sanderson for his 3Blue1Brown math videos. The version used here is Manim Community Edition, the community-maintained fork that pip install manim installs. You describe a scene in code (shapes, labels, transformations, timing) and Manim renders it to video. No keyframes, no timeline scrubbing. Just Python.

pip install manim
manim -pqh my_scene.py MyScene   # 1080p60 output

The output is deterministic: tweak a number, re-render, and the rest of the animation is unchanged. For technical explainers where you want pixel-precise control over what the viewer sees, it beats every visual tool I have tried.

The physics, briefly

Two things are true simultaneously and people routinely treat them as contradictory:

  1. f/2.8 means f/2.8 — the f-number is a ratio (focal length ÷ entrance pupil diameter). It defines the light intensity per mm² hitting the sensor. An f/2.8 lens on an MFT body produces the same exposure as an f/2.8 lens on a full-frame body. Same shutter speed needed. ISO for ISO, the exposure is identical.

  2. The depth of field is not the same — to get the same field of view on MFT (2× crop factor) you use roughly half the focal length: 25mm on MFT frames like 50mm on FF. The depth-of-field formula is approximately:

    DoF ∝ (N × c) / f²

    where N is the f-number, c is the circle of confusion (scales with sensor size, so ~½ for MFT), and f is focal length (also ~½ for MFT); the subject distance is held fixed. Plug in the halved values with N unchanged: (1 × ½) / (½)² = (½) / (¼) = 2×. At equivalent framing and the same f-number, MFT gives roughly twice the depth of field of full frame.

The rule of thumb: multiply the f-number by the crop factor to find the FF equivalent for depth of field. MFT f/2.8 ≈ FF f/5.6 for blur. But for exposure they are exactly the same.
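
To check the 2× figure against something more concrete than the proportionality, here is a minimal sketch using the standard thin-lens depth-of-field formulas, with hyperfocal distance H = f²/(N·c) + f. The circle-of-confusion values (0.030 mm for FF, 0.015 mm for MFT), the 3 m subject distance, and the function name dof_mm are assumptions chosen for illustration; they are not taken from the animations.

```python
def dof_mm(focal_mm, f_number, coc_mm, subject_mm):
    """Total depth of field in mm (thin-lens model, subject closer than hyperfocal)."""
    hyperfocal = focal_mm ** 2 / (f_number * coc_mm) + focal_mm
    near = subject_mm * (hyperfocal - focal_mm) / (hyperfocal + subject_mm - 2 * focal_mm)
    far = subject_mm * (hyperfocal - focal_mm) / (hyperfocal - subject_mm)
    return far - near


SUBJECT_MM = 3000  # assumed subject distance: 3 m

ff_f28 = dof_mm(50, 2.8, 0.030, SUBJECT_MM)   # full frame, 50 mm at f/2.8
mft_f28 = dof_mm(25, 2.8, 0.015, SUBJECT_MM)  # MFT, 25 mm at f/2.8 (same framing)
ff_f56 = dof_mm(50, 5.6, 0.030, SUBJECT_MM)   # full frame stopped down to f/5.6

print(f"FF  50 mm f/2.8: {ff_f28 / 1000:.2f} m")
print(f"MFT 25 mm f/2.8: {mft_f28 / 1000:.2f} m")
print(f"FF  50 mm f/5.6: {ff_f56 / 1000:.2f} m")
print(f"MFT/FF ratio at f/2.8: {mft_f28 / ff_f28:.2f}")
```

The ratio comes out close to 2 rather than exactly 2 because the proportionality drops some small terms, and the full-frame f/5.6 figure lands within a few percent of the MFT f/2.8 one, which is the rule of thumb at work.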

Neither system is better — they make different trade-offs. But you need to understand both sides before that means anything.

The animations

I built three scenes. The video below covers all three:

Scene 1: Sensor size comparison

The first scene overlays the MFT sensor (17.3 × 13 mm) on the full-frame rectangle (36 × 24 mm) to make the 2× crop factor concrete.

SCALE = 0.15   # scene units per mm of sensor (1 unit ≈ 6.7 mm)

ff_rect = Rectangle(width=36 * SCALE, height=24 * SCALE, color=BLUE)
ff_rect.set_fill(BLUE, opacity=0.25)

mft_rect = Rectangle(width=17.3 * SCALE, height=13 * SCALE, color=RED)
mft_rect.set_fill(RED, opacity=0.35)
# mft_rect shares the same center as ff_rect by default

# ff_label and corners are built earlier in the scene (not shown here)
self.play(Create(ff_rect), Write(ff_label))
self.play(FadeIn(mft_rect))
self.play(Create(corners))   # corner brackets around the MFT boundary
self.play(Write(Text("2× Crop Factor", color=YELLOW)))

The corner brackets are the detail that makes it land: they show exactly where the MFT sensor sits inside the FF frame without the viewer having to mentally parse it.
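
The excerpt above does not show how the brackets are built, so here is one possible way to compute them, kept free of Manim so it runs on its own. The helper name corner_brackets and the arm length of 0.3 scene units are hypothetical; in a scene, each pair of endpoints (with a z of 0 appended) could be handed to Manim's Line and the results collected in a VGroup.

```python
def corner_brackets(width, height, arm):
    """Return 8 line segments (2 per corner) for a rectangle centred on the origin.

    Each segment is ((x1, y1), (x2, y2)), starting at a corner and running
    `arm` units inward along one edge.
    """
    half_w, half_h = width / 2, height / 2
    segments = []
    for sx in (-1, 1):          # left, right
        for sy in (-1, 1):      # bottom, top
            corner = (sx * half_w, sy * half_h)
            # Arm along the horizontal edge, pointing toward the centre line.
            segments.append((corner, (sx * (half_w - arm), sy * half_h)))
            # Arm along the vertical edge, pointing toward the centre line.
            segments.append((corner, (sx * half_w, sy * (half_h - arm))))
    return segments


SCALE = 0.15  # same scale factor as the scene excerpt
brackets = corner_brackets(17.3 * SCALE, 13 * SCALE, arm=0.3)

for start, end in brackets:
    print(start, "->", end)
```

Because the brackets are derived from the same millimetre values and SCALE as the rectangle, they stay aligned with the MFT boundary if either number changes.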

Scene 2: Light cone and aperture

This scene shows a cross-section of a lens projecting a light cone onto the sensor plane, then swaps the FF sensor for the MFT sensor while keeping the lens and f-number unchanged.

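# rng is the seeded generator described under Learnings
# 100 photon dots placed at random inside the cone
for _ in range(100):
    t = rng.uniform(0.08, 1.0)                           # fraction of the way from lens to sensor
    x = LENS_X + t * (SENSOR_X - LENS_X)
    y_bound = (1 - t) * lens_half_h + t * (image_h / 2)  # cone half-height at t
    y = rng.uniform(-y_bound * 0.97, y_bound * 0.97)     # 3% margin keeps dots off the cone edge
    photon_dots.add(Dot([x, y, 0], radius=0.022, color=YELLOW_A))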

After the MFT sensor appears, the dots outside the smaller sensor’s height are dimmed to grey. The cone itself is unchanged — same f/2.8, same intensity per mm² — but the MFT sensor intercepts less of it in total.

The caption sequence walks through the conclusion in three steps: FF captures the full cone → swap to MFT at the same f/2.8 → cone unchanged, so intensity per mm² (= exposure) is unchanged, but total photons hitting the sensor are fewer.
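
The "fewer photons in total" step can be put into numbers using the sensor dimensions from Scene 1. This is a small sketch with hypothetical helper names; it assumes both sensors see the same intensity per mm², which is what an equal f-number and shutter speed give you.

```python
import math

FF_MM = (36.0, 24.0)    # full-frame sensor, width x height in mm
MFT_MM = (17.3, 13.0)   # Micro Four Thirds sensor, width x height in mm


def area(size):
    """Sensor area in mm²."""
    return size[0] * size[1]


def diagonal(size):
    """Sensor diagonal in mm."""
    return math.hypot(size[0], size[1])


crop_factor = diagonal(FF_MM) / diagonal(MFT_MM)
area_ratio = area(FF_MM) / area(MFT_MM)
stops = math.log2(area_ratio)   # each stop is a doubling of total light

print(f"crop factor (diagonal ratio): {crop_factor:.2f}")
print(f"FF collects {area_ratio:.2f}x the total light of MFT")
print(f"that is about {stops:.1f} stops")
```

At the same exposure the full-frame sensor gathers a little under four times the total light, or just under two stops, which is the same two-stop gap as f/2.8 versus f/5.6 in the depth-of-field rule of thumb.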

Scene 3: Depth of field zone

The final scene is the most visual. A small camera sits at the left edge; a distance axis runs right with objects (trees, a person, coloured circles) at 1.5 m, 2.5 m, 3 m, 3.5 m, and 4.5 m. The focus is locked to 3 m.

In FF mode a green band spanning 0.5 m of depth is drawn, and everything outside it is dimmed to 30% opacity. Then the scene transitions to MFT mode: the band expands to 1.0 m, the two flanking objects (at 2.5 m and 3.5 m) sharpen back to full opacity, and the label updates to “DoF ≈ 1.0 m (2× wider)”.

ff_near, ff_far = 2.75, 3.25   # 0.5 m band
mft_near, mft_far = 2.5, 3.5   # 1.0 m band

# Transition: band grows, objects in the wider zone sharpen
self.play(
    Transform(mode_label, mft_label),
    Transform(ff_band, mft_band),
    Transform(ff_dof_label, mft_dof_label),
    *[obj.animate.set_opacity(1.0)
      for obj, d in zip(scene_objects, distances)
      if mft_near <= d <= mft_far and not (ff_near <= d <= ff_far)],
    run_time=1.8,
)

The simultaneous band expansion and sharpening is the animation doing work that text cannot: the viewer watches the physics happen rather than reading about it.
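
The filter inside that self.play call is easy to check on its own. The sketch below runs the same condition over the five object distances from the scene description, without Manim; the helper in_band and the three list names are hypothetical.

```python
distances = [1.5, 2.5, 3.0, 3.5, 4.5]   # object positions in metres
ff_near, ff_far = 2.75, 3.25            # 0.5 m band
mft_near, mft_far = 2.5, 3.5            # 1.0 m band


def in_band(d, near, far):
    """True if distance d falls inside the in-focus band (inclusive)."""
    return near <= d <= far


# Objects that sharpen during the FF -> MFT transition.
newly_sharp = [d for d in distances
               if in_band(d, mft_near, mft_far) and not in_band(d, ff_near, ff_far)]
# Objects that were already sharp in FF mode.
already_sharp = [d for d in distances if in_band(d, ff_near, ff_far)]
# Objects that stay dimmed in both modes.
still_dimmed = [d for d in distances if not in_band(d, mft_near, mft_far)]

print(newly_sharp)     # [2.5, 3.5]
print(already_sharp)   # [3.0]
print(still_dimmed)    # [1.5, 4.5]
```

The inclusive comparisons matter here: the objects at 2.5 m and 3.5 m sit exactly on the edges of the MFT band, so a strict < would leave them dimmed.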

What Manim is good at

Iteration is cheap. Rendering with manim -pql produces 480p at 15 fps in seconds. I ran each scene 20+ times while adjusting positions, timing, and label text. That feedback loop is fast enough to feel interactive.

Layout is tedious but precise. Manim’s coordinate system is not WYSIWYG. You specify positions in scene-space units, then render and check. Text that looks fine in the code can overlap something else in the rendered frame. The tradeoff is that once it looks right, it stays right regardless of output resolution.

Python is the right abstraction here. The photon dots above are a loop; the DoF band is a function that takes near and far as arguments. The sensor size comparison uses real millimetre values multiplied by a scale factor. The code is the source of truth, not a locked-down project file.

Learnings

A few things that were not obvious going in:

Small font sizes break character spacing. Below roughly font_size=20, Manim’s text renderer produces noticeably awkward spacing — letters bunch or spread in ways that look wrong regardless of output resolution. The fix is to declare the Text at a large size and scale it down with .scale():

# bad — spacing artifacts at small sizes
Text("Full Frame", font_size=24)

# good — render large, scale down
Text("Full Frame", font_size=48).scale(0.5)

The final on-screen size is the same, but the spacing stays clean. Every label in these scenes follows this pattern.

Seed your RNG when randomness is involved. The photon dots in the light cone scene are positioned randomly, but using np.random.default_rng(42) means every re-render produces exactly the same dot layout. Without a seed, a minor code change elsewhere forces you to re-evaluate whether the new dot arrangement still looks good — friction you don’t need.
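
Here is a small sketch of the reproducibility point, runnable without Manim or NumPy. It reuses the sampling logic from the light cone scene but swaps in the standard library's random.Random as a stand-in for np.random.default_rng; the function name and the geometry defaults (lens_x, sensor_x, lens_half_h, image_h) are placeholder assumptions, not the values from the real scene.

```python
import random


def photon_positions(seed, n=100, lens_x=-4.0, sensor_x=3.0,
                     lens_half_h=1.5, image_h=2.0):
    """Sample n (x, y) points inside a cone running from the lens to the sensor."""
    rng = random.Random(seed)   # stand-in for np.random.default_rng(seed)
    points = []
    for _ in range(n):
        t = rng.uniform(0.08, 1.0)
        x = lens_x + t * (sensor_x - lens_x)
        y_bound = (1 - t) * lens_half_h + t * (image_h / 2)
        y = rng.uniform(-y_bound * 0.97, y_bound * 0.97)
        points.append((x, y))
    return points


first = photon_positions(seed=42)
second = photon_positions(seed=42)
other = photon_positions(seed=7)

print(first == second)   # same seed: identical layout on every run
print(first == other)    # different seed: a different layout
```

Creating the generator inside the function also keeps the dot layout independent of any other random draws elsewhere in the scene.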

LaggedStartMap is worth knowing. Animating a collection of objects one-by-one with a tiny offset between each creates motion that feels organic rather than mechanical. The photon dots fading in use lag_ratio=0.008 across 100 objects; the distance axis tick marks use lag_ratio=0.08 across 6. The same call, a different feel depending on the ratio and group size.

self.play(LaggedStartMap(FadeIn, photon_dots, lag_ratio=0.008, run_time=1.5))
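
To get a feel for why the same call behaves so differently at 100 objects and at 6, here is a simplified timing model. It is not Manim's actual implementation. It assumes every sub-animation has the same duration, that each one starts lag_ratio of that duration after the previous one, and that the whole group is then rescaled to fit run_time. The function name lagged_schedule is hypothetical, and the 1.5 s run time for the tick marks is an assumption, since only the photon dots' run time is given above.

```python
def lagged_schedule(n, lag_ratio, run_time, sub_duration=1.0):
    """Return (start, end) times in seconds for n staggered animations.

    Simplified model: equal-length sub-animations, each starting
    lag_ratio * sub_duration after the previous one, rescaled so the
    last one ends at run_time.
    """
    starts = [i * lag_ratio * sub_duration for i in range(n)]
    total = starts[-1] + sub_duration
    scale = run_time / total
    return [(s * scale, (s + sub_duration) * scale) for s in starts]


dots = lagged_schedule(n=100, lag_ratio=0.008, run_time=1.5)
ticks = lagged_schedule(n=6, lag_ratio=0.08, run_time=1.5)  # run_time assumed

for name, sched in (("dots", dots), ("ticks", ticks)):
    stagger = sched[1][0] - sched[0][0]
    duration = sched[0][1] - sched[0][0]
    print(f"{name}: stagger {stagger * 1000:.1f} ms, each fade {duration:.2f} s")
```

Under this model the dots start only a few milliseconds apart and overlap almost completely, so they read as a single wash of light, while the tick marks are staggered by a visibly larger gap and read as a sequence.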

Use Transform to cycle caption text in a single slot. Rather than stacking successive text objects on screen, the light cone scene keeps one annotation object at the bottom edge and Transforms it through a sequence of captions. The old text morphs into the new one in place, which keeps the layout clean and draws the eye without adding clutter.

note = Text("FF captures the full cone ...", ...).to_edge(DOWN, buff=0.5)
self.play(FadeIn(note))
# ...later...
self.play(Transform(note, Text("Cone unchanged → same exposure", ...)))

The full source for all three scenes is on GitHub.

