1-Parameter Classifier
Progressively reducing the model budget for image-level person classification on EUPE-ViT-B features. Each stage is a deeper reduction or transformation of the previous: the classifier's footprint shrinks stage by stage while the backbone it draws features from is attacked in parallel.
Stage 0: Baseline
A 1-free-parameter image-level person classifier on the frozen EUPE-ViT-B backbone. The classifier reads 20 pre-selected person-positive and 20 pre-selected person-negative feature dimensions, sums the positives, subtracts the negatives, and compares the result to one learned threshold. F1 = 0.889 on COCO val2017 image-level person presence, measured through the live Argus forward pass at 768-pixel input.
See stage_0/ for the classifier config, discovery pipeline, and full characterization of the person axis in the backbone.
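The decision rule above fits in a few lines. This is a minimal sketch: the dimension indices and threshold here are placeholders, not the shipped values (those live in the stage_0/ config):

```python
def one_param_person_classifier(features, pos_dims, neg_dims, threshold):
    """Stage 0 decision rule: sum the person-positive dims, subtract the
    person-negative dims, and compare to the single learned threshold."""
    score = sum(features[i] for i in pos_dims) - sum(features[i] for i in neg_dims)
    return score > threshold

# Hypothetical usage with placeholder indices; the shipped classifier reads
# 20 + 20 pre-selected dims of the 768-D pooled feature vector.
features = [0.0] * 768
for i in range(20):
    features[i] = 2.0          # strong person-positive evidence
print(one_param_person_classifier(features, range(20), range(20, 40), 25.3))
# score = 40.0 - 0.0 = 40.0, which clears the threshold
```

The single free parameter is the threshold; the 40 dim indices are fixed at discovery time, not learned.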
Roadmap
| Stage | Name | What changes | Status | Result |
|---|---|---|---|---|
| 0 | Baseline 1-param classifier | Uses the full EUPE-ViT-B backbone unchanged | shipped | F1 0.889 · 85.64M backbone · 1 free param |
| 1 | Output-channel pruning | Slice the 40 dims the classifier reads; fuse the head | shipped | F1 0.889 (parity) · same backbone · cleaner interface |
| 2 | Attention-head pruning | Ablate heads that do not contribute to those dims | shipped | F1 0.916 (+0.022) at K=10 heads pruned · 1.97M params masked |
| 2b | Structural head removal | Physically shrink qkv/proj tensors, reduce per-block num_heads | shipped | F1 0.9159 preserved · backbone 85.64M → 83.68M (1.97M saved, 2.30 %) |
| 3 | Depth reduction | Drop transformer blocks that do not route signal | shipped | F1 0.876 at K=1 block · F1 collapses at K≥3 · hard ceiling |
| 4 | Specialist backbone | Train a small student that emits only the target dims | shipped | 3.27M-param student · F1 0.717 · proof of concept, gap to baseline |
| 4b | Bigger specialist, cosine loss | 15.67 M student, cosine similarity on full 768-D pooled teacher | shipped | F1 0.726 (+0.009 over Stage 4) · gap to baseline persists |
| 4c | Direct scalar supervision | Same 3.27 M student, MSE on the classifier sum-difference scalar | shipped | F1 0.734 · threshold converges to 25.0 (teacher 25.3) · calibration aligned |
| 5 | Circuit-level synthesis | Synthesize the Stage 0 classifier to gates | shipped | 3,220 gates (1,172 AND + 1,318 NOT + 730 XOR) |
| 5b | Popcount reformulation | Per-dim INT8 threshold → popcount → comparator | shipped | 907 gates (−71 % vs Stage 5 folded), F1 0.876 (−0.008) |
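The Stage 5b row above replaces the signed adder tree with 1-bit votes: each read dim is compared to its own INT8 threshold (with the inequality flipped for negative dims), the votes are popcounted, and the count goes through a single comparator. A sketch under assumed placeholder thresholds, polarities, and cutoff:

```python
def popcount_classifier(int8_feats, thresholds, polarities, cutoff):
    """Stage 5b decision form: per-dim INT8 threshold -> popcount -> comparator.
    Each dim contributes one bit (value vs. its own threshold, inequality
    flipped for negative-polarity dims); bits are summed and compared once."""
    votes = 0
    for v, t, p in zip(int8_feats, thresholds, polarities):
        votes += (v > t) if p > 0 else (v < t)
    return votes > cutoff

# Placeholder example: 40 dims, zero thresholds, 20 positive / 20 negative.
feats = [3] * 20 + [-3] * 20          # positives high, negatives low
print(popcount_classifier(feats, [0] * 40, [1] * 20 + [-1] * 20, 20))
```

In hardware, each per-dim compare is a small magnitude comparator, the popcount is a tree of half/full adders over 40 bits, and the final compare is one more comparator, which is where the gate-count drop versus the signed adder tree comes from.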
Headline numbers
- Stage 2 pruning improves the classifier: removing 10 redundant / noise-injecting attention heads raises F1 from 0.894 (1K-image calibration) to 0.916 on the same calibration pool.
- Stage 3 shows the backbone is depth-critical: only 1 of 12 blocks is cleanly removable.
- Stage 4 specialist student fits the full person-classification pipeline in 3.27M parameters at F1 0.717, 26× smaller than the teacher (full path forward in the stage_4 README).
- Stage 4C's direct scalar supervision on the same 3.27M student lifts F1 to 0.734 at the same footprint, with the student's threshold converging to 25.0 against the teacher's 25.3.
- Stage 5 puts the decision circuit at 3,220 universal gates. Sub-millisecond combinational latency; sub-milliwatt power. Fits as a camera-ISP block.
- Stage 5b's popcount reformulation drops that to 907 gates (−71 %) at F1 0.876, with most of the saving coming from eliminating the signed 8-bit adder tree.
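Stage 4c's supervision target can be sketched as a plain MSE on the classifier's sum-difference scalar; the function and argument names below are illustrative, not taken from the stage_4 code:

```python
def scalar_supervision_loss(student_scalars, teacher_feats, pos_dims, neg_dims):
    """Stage 4c objective sketch: instead of matching the teacher's 768-D
    pooled features, the student regresses the classifier's sum-difference
    scalar computed from the teacher, with a mean-squared error."""
    losses = []
    for s, feats in zip(student_scalars, teacher_feats):
        target = sum(feats[i] for i in pos_dims) - sum(feats[i] for i in neg_dims)
        losses.append((s - target) ** 2)
    return sum(losses) / len(losses)
```

Supervising the scalar directly is what lets the student's threshold settle near the teacher's (25.0 vs. 25.3): the student is trained in the same units the comparator reads, so no separate calibration step is needed.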
Source backbone
EUPE-ViT-B from Meta FAIR (arXiv:2603.22387, Zhu et al., March 2026), distilled from PEcore-G + PElang-G + DINOv3-H+ via a 1.9B proxy teacher. License: FAIR Research License (non-commercial). The 1-parameter classifier is an artifact derived from that backbone's feature geometry.
Model tree for phanerozoic/1-parameter-classifier
Base model: facebook/EUPE-ViT-B