Sourcing is a different question from testing
Before the ladder, one thing worth separating cleanly: where your cells come from is a different question from how you test them, and the two get conflated all the time. The NDAA's procurement provisions, the American Security Drone Act of 2023, the coming U.S. Department of Defense procurement restrictions on batteries from named foreign manufacturers, the FCC Covered List, and the EU Battery Regulation's due-diligence obligations all gate supplier eligibility. They don't tell you anything about how to sample. If you're selling into federal markets, you have to clear those sourcing gates regardless of your testing posture. The rest of this piece is about testing, not sourcing.
How much testing is enough? It's a ladder
The honest answer to how much you should test is neither everything nor nothing. It's a ladder, and the rung you sit on should be determined by who you want to sell to in 24 months. Each rung up costs more and unlocks a larger market, not because a regulation names it, but because the customers at the next tier expect it and the consequences of being caught short are bigger.
The cell-testing ladder. Your rung is set by who you'll sell to in 24 months; where your cells are sourced is a separate axis, gated independently of your testing.
- Rung 1 — Trust the data sheet. Fine for a hobbyist build, a 50-unit commercial-inspection pilot, an internal prototype. Not fine for anything you're shipping at scale to a customer who can call a lawyer.
- Rung 2 — Initial qualification of the cell type at design, then no further cell-level testing. OCV/ACIR for matching at pack assembly. Most small drone OEMs sit here today. It's adequate for state and local law enforcement, fire, EMS, and early commercial work. Sophisticated customers buying small quantities won't push back; high-volume or high-stakes customers will.
- Rung 3 — Initial qualification plus recurring batch sampling against a documented spec, with reject thresholds. Discretionary best practice, not regulation. The point at which you start catching batch-to-batch drift before your customers do, and the point at which a field failure becomes defensible. Increasingly expected by sophisticated commercial buyers and federal civilian customers — and for systems above 2 kWh shipping into the EU after February 2027, the only realistic way to keep the performance and durability data in your battery passport current.
- Rung 4 — Sampling plus 100% incoming OCV/ACIR characterization plus per-pack documentation. Where defense primes and DoD program offices expect their pack suppliers to operate in practice. No single regulation mandates this depth; the customers at this tier do, through contract flow-down. It's also the data foundation underneath any serious pack-level qualification campaign (standards such as MIL-PRF-32565, IEC 62133, and UL 2054).
- Rung 5 — Full incoming characterization plus end-to-end serial-number traceability. Cell SN to pack SN to product SN to field event. What it credibly takes to chase DoD combat programs and Drone Dominance-class platforms. Also what makes a pack-level qualification package for permanently installed aviation batteries defensible — DO-311A is itself a pack qualification and explicitly doesn't require cell-level testing, but the OEMs who pass it on the first try are the ones who can show characterized cells underneath the pack-level data.
- Rung 6 — Lifecycle telemetry correlated back to manufacturing batch. Field operating data closing the loop with the batch the cell came from. The data discipline the most mature aviation and energy-storage programs are working toward; almost nobody is fully there yet.
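To make rung 3 concrete: recurring batch sampling comes down to measuring a sample of cells from each incoming lot, comparing every reading against written limits, and failing the lot when too many cells fall outside them. Here is a minimal Python sketch of that accept/reject step. Everything in it is an assumption for illustration: the names (CellSpec, CellReading, disposition_batch), the open-circuit voltage (OCV) and AC internal resistance (ACIR) limits, and the reject threshold are hypothetical and do not come from any standard or data sheet.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class CellSpec:
    """Hypothetical acceptance limits; real values come from your own documented spec."""
    ocv_min_v: float
    ocv_max_v: float
    acir_max_mohm: float
    max_rejects: int  # out-of-limit cells tolerated in a sample before the lot fails


@dataclass(frozen=True)
class CellReading:
    serial: str
    ocv_v: float
    acir_mohm: float


def cell_in_spec(reading: CellReading, spec: CellSpec) -> bool:
    """True when both OCV and ACIR are inside the documented limits."""
    ocv_ok = spec.ocv_min_v <= reading.ocv_v <= spec.ocv_max_v
    acir_ok = reading.acir_mohm <= spec.acir_max_mohm
    return ocv_ok and acir_ok


def disposition_batch(batch_id: str, sample: list[CellReading], spec: CellSpec) -> dict:
    """Summarize one sampled lot and decide accept/reject against the spec."""
    if not sample:
        raise ValueError("cannot disposition a batch with an empty sample")
    rejects = [r.serial for r in sample if not cell_in_spec(r, spec)]
    return {
        "batch_id": batch_id,
        "sample_size": len(sample),
        "rejects": rejects,
        # Tracking the mean per lot is what exposes batch-to-batch drift over time.
        "mean_acir_mohm": round(mean(r.acir_mohm for r in sample), 2),
        "accepted": len(rejects) <= spec.max_rejects,
    }


# Illustrative numbers only.
example_spec = CellSpec(ocv_min_v=3.55, ocv_max_v=3.65, acir_max_mohm=20.0, max_rejects=0)
example_sample = [
    CellReading("C001", 3.60, 15.0),
    CellReading("C002", 3.61, 16.0),
    CellReading("C003", 3.58, 23.0),  # ACIR above the limit
]
example_result = disposition_batch("LOT-A", example_sample, example_spec)
```

The returned record is the part worth keeping: stored per lot with a date, it is the evidence that the sampling actually happened, and the per-lot mean is what lets you see drift before a hard reject ever fires.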
These boundaries are approximate; the procurement officers and program managers who actually enforce them are the source of truth. But the shape of the ladder is real, and the asymmetry that matters is this: each step up costs more, but it also generates the paper trail that proves you were operating at that rung when the field event happened. You can't backfill that. The day after a contract win is the wrong day to start testing batches you shipped a year ago.
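At rung 5, that paper trail is a chain of serial-number links written down at build time: cell to batch, cell to pack, pack to product. The sketch below shows one way such a ledger could work, assuming a hypothetical in-memory TraceLedger class with made-up method names. A real program would persist these links in a database or MES; nothing here reflects a specific tool or standard.

```python
from dataclasses import dataclass, field


@dataclass
class TraceLedger:
    """Hypothetical in-memory ledger; a real program would persist these links."""
    cell_to_batch: dict[str, str] = field(default_factory=dict)
    cell_to_pack: dict[str, str] = field(default_factory=dict)
    pack_to_product: dict[str, str] = field(default_factory=dict)

    def record_cell(self, cell_sn: str, batch_id: str) -> None:
        """Log an incoming cell against its manufacturing batch."""
        self.cell_to_batch[cell_sn] = batch_id

    def record_pack(self, pack_sn: str, cell_sns: list[str]) -> None:
        """Link cells to a pack; validate everything before writing anything."""
        for sn in cell_sns:
            if sn not in self.cell_to_batch:
                raise KeyError(f"cell {sn} was never received into the ledger")
            if sn in self.cell_to_pack:
                raise ValueError(f"cell {sn} is already built into pack {self.cell_to_pack[sn]}")
        for sn in cell_sns:
            self.cell_to_pack[sn] = pack_sn

    def record_product(self, product_sn: str, pack_sn: str) -> None:
        """Link a finished pack to the product it ships in."""
        self.pack_to_product[pack_sn] = product_sn

    def batches_for_product(self, product_sn: str) -> set[str]:
        """Field event on a product: which cell batches are inside it?"""
        packs = {p for p, prod in self.pack_to_product.items() if prod == product_sn}
        return {self.cell_to_batch[c] for c, p in self.cell_to_pack.items() if p in packs}

    def products_for_batch(self, batch_id: str) -> set[str]:
        """Suspect batch: which shipped products contain its cells?"""
        packs = {
            self.cell_to_pack[c]
            for c, b in self.cell_to_batch.items()
            if b == batch_id and c in self.cell_to_pack
        }
        return {self.pack_to_product[p] for p in packs if p in self.pack_to_product}
```

The two query methods are the reason to bother. One answers "this drone had a field event, which batches are in it?" and the other answers "this batch is suspect, which drones do we pull?" Neither can be answered after the fact if the links were never recorded when the pack was built.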
If your endgame is state and local law enforcement and commercial inspection, rung 2 is fine — there's no need to pretend otherwise. Stay there, organize the data you already produce so that when a recall happens you don't panic, and revisit in 24 months if your customer mix changes. If you want to bid on the bigger contracts — defense and government UAS, DoD programs — the question is where you are on the ladder versus where you need to be, and how long the climb takes. That's the conversation most small drone companies haven't had with themselves yet. It's worth having before the RFP shows up.
Once you've decided which rung you're climbing to, the question becomes how to actually stand the program up — the spec, the test plan, and a place for the data to live. Part 3 covers starting from zero.