Two inputs, two neurons, and a curved frontier.
A neural network with three layers and a handful of weights — the smallest contraption that can solve XOR, the problem that famously embarrassed the perceptron for two decades. Drag any edge to change its weight. The decision boundary on the right re-draws itself as you pull. Press step to run one round of backprop on a batch of points and watch the gradients ripple backward, edge by edge.
Why XOR mattered
§ 02 · Notes
Sigmoid squashes its input into (0, 1): friendly but slow to learn at the edges. Tanh is the same shape, recentred on zero. ReLU is a hinge that turns off at zero, easy to train but blind below the elbow.
Implementation notes: forward, backward, and the update rule
The network is a two-layer MLP: h = φ(W₁ x + b₁) followed
by ŷ = σ(W₂ h + b₂), where φ is the chosen
hidden activation and σ is sigmoid at the output for
binary classification. The loss is binary cross-entropy. All matrices
and biases live in plain JavaScript arrays — no library.
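A minimal sketch of that forward pass, using nothing but arrays. The helper names (activations, affine, forward, bce), the row-major weight layout, and the hand-set ReLU weights in xorNet are assumptions made for illustration, not the page's actual source.

```javascript
// The three hidden activations described in the notes.
const activations = {
  sigmoid: (z) => 1 / (1 + Math.exp(-z)),
  tanh: (z) => Math.tanh(z),
  relu: (z) => Math.max(0, z),
};

// Hypothetical helper: W x + b, with W stored as an array of rows.
function affine(W, b, x) {
  return W.map((row, i) => row.reduce((sum, w, j) => sum + w * x[j], b[i]));
}

// Forward pass: h = phi(W1 x + b1), then yHat = sigmoid(W2 h + b2).
function forward(net, x, phi = activations.tanh) {
  const z1 = affine(net.W1, net.b1, x);   // hidden pre-activations
  const h = z1.map(phi);                  // hidden activations
  const z2 = affine(net.W2, net.b2, h);   // single output logit
  const yHat = activations.sigmoid(z2[0]);
  return { z1, h, yHat };
}

// Binary cross-entropy for one example, clamped to avoid log(0).
function bce(yHat, y, eps = 1e-12) {
  const p = Math.min(1 - eps, Math.max(eps, yHat));
  return -(y * Math.log(p) + (1 - y) * Math.log(1 - p));
}

// Hand-set weights (an assumption, not trained) that solve XOR with two ReLU units:
// h1 = relu(x1 + x2), h2 = relu(x1 + x2 - 1), logit = 10*h1 - 20*h2 - 5.
const xorNet = {
  W1: [[1, 1], [1, 1]],
  b1: [0, -1],
  W2: [[10, -20]],
  b2: [-5],
};

const xorInputs = [[0, 0], [0, 1], [1, 0], [1, 1]];
const xorPredictions = xorInputs.map(
  (x) => Math.round(forward(xorNet, x, activations.relu).yHat)
);
console.log(xorPredictions); // [ 0, 1, 1, 0 ]
```

With these weights the logit is −5 on (0, 0) and (1, 1) and +5 on the other two corners, so two hidden units are enough to bend the boundary around XOR.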
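Backprop falls out of one line of calculus. Output error
δ₂ = ŷ − y; hidden error δ₁ = (W₂ᵀ δ₂) ⊙ φ′(z₁), where z₁ = W₁ x + b₁
and ⊙ is the element-wise product. Gradients are ∂L/∂W = δ · aᵀ averaged
over the batch, with a the input to that layer (x for W₁, h for W₂).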
Every step the page animates the per-edge magnitude of those
gradients along the line connecting the two neurons.
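The backward pass and update can be sketched as follows. This assumes tanh hidden units, a single output stored as a flat W2 array, zero-initialised accumulators, and plain gradient descent with a learning rate lr; the page's real update rule and the names forwardOne, gradients, sgdStep and meanLoss are not given in the text and are hypothetical.

```javascript
const sigmoid = (z) => 1 / (1 + Math.exp(-z));
const phi = Math.tanh;                        // hidden activation (assumed)
const dPhi = (z) => 1 - Math.tanh(z) ** 2;    // derivative of tanh

// Forward pass for one example. W2 is flat: one weight per hidden unit.
function forwardOne(net, x) {
  const z1 = net.W1.map((row, i) => row.reduce((s, w, j) => s + w * x[j], net.b1[i]));
  const h = z1.map(phi);
  const yHat = sigmoid(h.reduce((s, hi, i) => s + net.W2[i] * hi, net.b2));
  return { z1, h, yHat };
}

// Batch-averaged gradients of binary cross-entropy.
function gradients(net, batch) {
  const gW1 = net.W1.map((row) => row.map(() => 0));
  const gb1 = net.b1.map(() => 0);
  const gW2 = net.W2.map(() => 0);
  let gb2 = 0;
  const n = batch.length;
  for (const { x, y } of batch) {
    const { z1, h, yHat } = forwardOne(net, x);
    const d2 = yHat - y;                          // output error
    gb2 += d2 / n;
    net.W2.forEach((w2, i) => {
      const d1 = w2 * d2 * dPhi(z1[i]);           // hidden error for unit i
      gW2[i] += (d2 * h[i]) / n;                  // delta times layer input h
      gb1[i] += d1 / n;
      x.forEach((xj, j) => { gW1[i][j] += (d1 * xj) / n; });
    });
  }
  return { gW1, gb1, gW2, gb2 };
}

// One in-place gradient-descent step. Returns the gradients so that a caller
// could, for example, map their magnitudes onto the edges.
function sgdStep(net, batch, lr = 0.5) {
  const g = gradients(net, batch);
  net.W1.forEach((row, i) => row.forEach((_, j) => { row[j] -= lr * g.gW1[i][j]; }));
  net.b1.forEach((_, i) => { net.b1[i] -= lr * g.gb1[i]; });
  net.W2.forEach((_, i) => { net.W2[i] -= lr * g.gW2[i]; });
  net.b2 -= lr * g.gb2;
  return g;
}

// Mean binary cross-entropy over a batch, for monitoring.
function meanLoss(net, batch) {
  const total = batch.reduce((s, { x, y }) => {
    const p = forwardOne(net, x).yHat;
    return s - (y * Math.log(p) + (1 - y) * Math.log(1 - p));
  }, 0);
  return total / batch.length;
}

// The four XOR points as a batch.
const xorBatch = [
  { x: [0, 0], y: 0 },
  { x: [0, 1], y: 1 },
  { x: [1, 0], y: 1 },
  { x: [1, 1], y: 0 },
];
```

Returning the gradients from sgdStep keeps the numbers needed for a per-edge animation separate from the update itself.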
The decision boundary is rendered by sampling the network on an
80 × 80 grid and drawing the result with nearest-neighbour
zoom. At 80 × 80 the recompute is cheap enough to run every frame
while you are dragging a weight or watching training.
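One way the sampling and drawing could be written is sketched below. The plotted range of −0.5 to 1.5 on both axes, the red-to-blue colour ramp, and the names sampleGrid, toRGBA and paint are assumptions; predict stands for any function that maps [x1, x2] to a probability. The paint function uses the browser canvas API and is defined here but not called.

```javascript
// Evaluate predict([x1, x2]) at the centre of every cell of a size x size grid.
// Row 0 is the top of the picture, so x2 decreases as the row index grows.
function sampleGrid(predict, size = 80, lo = -0.5, hi = 1.5) {
  const grid = new Float32Array(size * size);
  const span = hi - lo;
  for (let r = 0; r < size; r++) {
    const x2 = hi - ((r + 0.5) / size) * span;
    for (let c = 0; c < size; c++) {
      const x1 = lo + ((c + 0.5) / size) * span;
      grid[r * size + c] = predict([x1, x2]);
    }
  }
  return grid;
}

// Map each probability to an RGBA pixel: blue near 0, red near 1 (assumed palette).
function toRGBA(grid) {
  const rgba = new Uint8ClampedArray(grid.length * 4);
  for (let i = 0; i < grid.length; i++) {
    const p = grid[i];
    rgba[4 * i] = 255 * p;            // red channel
    rgba[4 * i + 1] = 64;             // constant green
    rgba[4 * i + 2] = 255 * (1 - p);  // blue channel
    rgba[4 * i + 3] = 255;            // opaque
  }
  return rgba;
}

// Browser-only: blow the small image up to the full canvas without smoothing.
function paint(ctx, rgba, size = 80) {
  const small = document.createElement('canvas');
  small.width = size;
  small.height = size;
  small.getContext('2d').putImageData(new ImageData(rgba, size, size), 0, 0);
  ctx.imageSmoothingEnabled = false;  // nearest-neighbour scaling
  ctx.drawImage(small, 0, 0, ctx.canvas.width, ctx.canvas.height);
}
```

At 80 × 80 this is 6,400 forward passes per redraw, which is what keeps the per-frame recompute affordable.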
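Weights initialise with a He-style scale: zero-mean Gaussian draws with
standard deviation √(2/n), where n is the layer's fan-in.
Drag any edge with the mouse to override it. Hidden width is clamped
to between one and eight neurons; with one hidden unit XOR is provably
unsolvable, because a single monotonic unit can only split the plane
along one straight line, which is itself a useful thing to demonstrate.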
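A sketch of that initialisation, assuming n means fan-in, the Gaussian scale is a standard deviation, and biases start at zero (the text does not say how biases are set). The names randn, heMatrix and initNetwork are hypothetical, and Box-Muller is just one convenient way to draw normal samples from Math.random.

```javascript
// Standard normal sample via the Box-Muller transform.
// rand is injectable so a seeded generator could be swapped in.
function randn(rand = Math.random) {
  const u = 1 - rand();   // in (0, 1], so Math.log(u) is finite
  const v = rand();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// rows x cols matrix with entries drawn from N(0, std^2), std = sqrt(2 / fanIn).
// cols is treated as the fan-in (assumed meaning of n).
function heMatrix(rows, cols, rand = Math.random) {
  const std = Math.sqrt(2 / cols);
  return Array.from({ length: rows }, () =>
    Array.from({ length: cols }, () => std * randn(rand))
  );
}

// Build a 2-input network, clamping the hidden width to the range 1..8.
function initNetwork(hidden, inputs = 2, rand = Math.random) {
  const width = Math.min(8, Math.max(1, Math.round(hidden)));
  return {
    W1: heMatrix(width, inputs, rand),
    b1: new Array(width).fill(0),   // zero biases: an assumption
    W2: heMatrix(1, width, rand),
    b2: [0],
  };
}
```

Calling initNetwork(1) gives the single-hidden-unit case, which can be trained to show that the boundary never bends enough to separate XOR.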