PolyCheck and NPoly

Gregory M. Kapfhammer

November 17, 2025

Learning objectives

Learning Objectives for Theoretical Machines

CS-204-1: Use both intuitive analysis and theoretical proof techniques to correctly distinguish between problems that are tractable, intractable, and uncomputable.
CS-204-2: Correctly use one or more variants of the Turing machine (TM) abstraction to both describe and analyze the solution to a computational problem.
CS-204-3: Correctly use one or more variants of the finite state machine (FSM) abstraction to describe and analyze the solution to a computational problem.
CS-204-4: Use a formal proof technique to correctly classify a problem according to whether or not it is in the P, NP, NP-Hard, and/or NP-Complete complexity class(es).
CS-204-5: Apply insights from theoretical proofs concerning the limits of either program feasibility or complexity to the implementation of both correct and efficient real-world Python programs.

PolyCheck and NPoly are about verification!

PolyCheck and NPoly help with learning objectives CS-204-1 and CS-204-4. We will define two complexity classes that capture problems whose solutions are easy to verify even if they’re hard to find. This chapter reveals a surprising result: these two classes, defined in completely different ways, are identical! As proofgrammers, let’s explore verification and nondeterminism to understand what can be efficiently checked. Okay, let’s dive in!

Review: Poly and Expo classes

Poly: Problems solvable in polynomial time \(O(n^k)\)
Expo: Problems solvable in exponential time \(O(2^{n^k})\)
Key relationship: Poly \(\subseteq\) Expo for computational complexity
Tractability: Poly problems are feasible, Expo problems often are not
Examples: Sorting in Poly, factoring in Expo (no known poly algorithm)
Question: What about problems that are hard to solve but easy to check?

The Poly/Expo boundary separates tractable from intractable: but some problems are easy to verify even when hard to solve. For instance, factoring a number is hard, but checking if a number is a factor is easy! This observation leads us to PolyCheck! The word Check connects to “verification”!

Verifiers check proposed solutions

Verifier: Program that checks if a proposed solution is correct
Three inputs: Instance \(I\), proposed solution \(S\), hint \(H\)
Two outputs: correct or unsure (never incorrect!)
Key idea: Verification can be easier than solving
Example: Checking factors is easier than finding them

Verifiers formalize the notion of “easy to check”: given a proposed solution, can we quickly verify its correctness? The asymmetry between positive and negative instances is crucial and connects to the prior concept of nondeterminism!

Verifier for FACTOR problem

Easy to verify: Check if \(m\) divides \(M\) in polynomial time
Hard to find: Finding factors requires searching many candidates
Note: Returns unsure when the “quick check” is not clear

Why `unsure` not `incorrect`?

Two reasons for failure:
- Proposed solution \(S\) is actually incorrect
- Hint \(H\) was wrong (even if \(S\) is correct)
Cannot distinguish: No way to tell which case occurred
Asymmetry: Positive instances verifiable, negative instances never are
Connection: Similar to nondeterministic computation!

The unsure output reflects fundamental asymmetry: verifiers can confirm correctness but cannot definitively reject; they are “helpful” but not perfect. This asymmetry mirrors nondeterministic computation where positive solutions are found but negative instances only yield no!

Review of verification

Verifier \(V(I, S, H)\) receives instance \(I\), solution \(S\), and hint \(H\)
Always halts, returning correct or unsure
Positive instances verifiable with some correct \(S\) and \(H\)
Negative instances never verifiable, always return unsure

Diagrammatic view of verification

Top branches: Positive instance with a correct solution
Bottom branch: Negative instance always returns unsure
Asymmetry: Only positive instances can return correct

Hints help verification

Hint string \(H\): Additional information to help verification
Example: For TSPD, hint provides the Hamilton cycle
Efficiency: Without hint, checking all cycles takes exponential time
With hint: Simply verify the given cycle in polynomial time
Key insight: Hints make exponential searches into polynomial checks

Hints are the secret to efficient verification: they tell us where the needle is in the haystack, so we can quickly verify it’s a needle without searching the entire haystack. This makes verification much easier than solving!

Concrete example of hints for TSPD

Instance \(I\): Graph "a,b,1 b,c,2 c,a,3;8" with threshold \(L = 8\)
Hamilton cycle: Path visiting each vertex exactly once and returning to start
TSPD: Asks “yes/no” whether a Hamilton cycle of length \(\leq L\) exists
Question: Does a Hamilton cycle of length \(\leq 8\) exist?
Proposed solution \(S\): "yes"
Hint \(H\): "a,b,c", which is the actual Hamilton cycle, hooray!
Hint helps: TSPD only asks yes/no, but verifying requires the actual cycle
Verification: Check cycle visits all vertices once
Check length: \(1 + 2 + 3 = 6 \leq 8\)
Result: "correct" in \(O(n^2)\) time!
Without hint, need exponential time as there are \(n!\) possible cycles to check

Polytime verifiers run in polynomial time

Polytime verifier: Runs in \(O(n^k)\) time for instance length \(n\)
Crucial detail: Only \(|I|\) matters, not \(|S|\) or \(|H|\)
Protection needed: Must reject excessively long \(S\) and \(H\)
Why?: Without this check, processing long \(S\) or \(H\) could take exponential time, which defeats the purpose of quick verification
Solution: Reject if \(|S| > |I|\) or \(|H| > |I|\)
Let’s see what this would look like for FACTOR!

Polytime verifier for FACTOR

Added check: Reject if \(|S| > |I|\) or \(|H| > |I|\)
Guarantees polynomial time: Bounded by \(|I|\)
Same verification logic: Otherwise identical to basic verifier

Polytime verifiers must protect against long inputs: if \(S\) has length \(2n\), just converting it to an integer takes \(2n\) steps, which is too expensive!

Formal definition of polytime verifier

Verifier \(V(I, S, H)\) must halt in polynomial time as function of \(|I|\)
Lengths of \(S\) and \(H\) are irrelevant to running time bound
This definition enables the PolyCheck complexity class

Can you intuitively see the benefits of verification? Do you have a strong intuitive understanding of the concept of a polytime verifier?

Efficient verification

PolyCheck: Complexity class containing problems whose positive instances can be correctly verified in polynomial time
Definition: Problem \(F \in\) PolyCheck if there exists a polytime verifier for \(F\)
Examples: FACTOR, TSPD, PACKING, SUBSETSUM, and PARTITION
Key insight: Many problems have a verification easier than solving
Practical meaning: Solutions are hard to find but easy to check

PolyCheck captures “easy to verify”: these are problems where checking a proposed solution is polynomial-time, even if finding the solution might be exponential-time. This is the class of problems where verification helps!

Formal definition of PolyCheck

Computational problem \(F\) is in PolyCheck if polytime verifier exists
Verifier can use hint \(H\) to make verification efficient
Examples include many “classic hard problems” like FACTOR and TSPD
Question: How does this relate to nondeterminism in prior classes?

Can you intuitively grasp the definition of PolyCheck? Can you connect this to a non-deterministic Turing machine? Can you guess the benefits?

“Haystack analogy” for PolyCheck

Searching for needles: Finding factors among vast possibilities
Recognizing needles: Quickly verify if something is a needle
PolyCheck problems: Can efficiently verify needles when found
Exponential haystack: More possibilities than atoms in universe
Key point: Verification is not as “costly” as discovery

Finding versus verifying are different tasks: for a 10,000-digit number, there are vastly more non-factors than atoms in the observable universe. PolyCheck means we can glance at a candidate and quickly determine if it’s a factor, without searching the entire space! See the benefits? Now, let’s define the NPoly class!

NPoly: nondeterministic polynomial time

NPoly: Solved by nondeterministic programs in polynomial time
Nondeterministic running time: Time until last thread completes
Parallel search: Launch thread for each possible solution
Key idea: Check all candidates simultaneously in parallel

NPoly uses parallel computation: imagine launching a separate thread for every possible solution and checking them “all at once” in a parallel fashion. Intuitively, if the longest thread runs in polynomial time, the problem is in NPoly!

Nondeterministic running time

Definition: Maximum steps from root to any leaf in computation tree
Parallel perspective: All threads run simultaneously
Polynomial bound: All threads must finish in \(O(n^k)\) time
Key: Count steps in longest thread, not total across all threads
Important questions:
- What problems are in NPoly?
- How does this relate to PolyCheck?

FACTOR is in NPoly

Nondeterministic approach: Launch thread for each possible factor
Number of threads: \(O(M) = O(10^n)\) for \(n\)-digit number
Binary tree: Each thread launches at most two children
Tree depth: \(O(\log M) = O(n)\) levels deep
Each thread: Checks one factor in \(O(n^2)\) time (due to division)
Total nondeterministic time: \(O(n) \times O(n^2) = O(n^3)\)

Exponentially many threads, polynomial time: although we launch \(O(10^n)\) threads, they’re organized in a binary tree of depth \(O(n)\). Each path from root to leaf takes polynomial time, so FACTOR is in NPoly!

The remarkable result: PolyCheck = NPoly

Main theorem: PolyCheck and NPoly are identical
Surprising: Two completely different definitions!
- PolyCheck: Can verify solutions in polynomial time
- NPoly: Can solve with nondeterministic polynomial time
Proof strategy: Show mutual containment
- Part 1: PolyCheck \(\subseteq\) NPoly
- Part 2: NPoly \(\subseteq\) PolyCheck
Implication: Verification and nondeterminism are equivalent!

Efficient verification or nondeterministic solving is the same complexity class!

PolyCheck: Problems with efficient verification
NPoly: Problems solvable with non-determinism

Key result summary

Claim 1: Every NPoly problem is in PolyCheck
Claim 2: Poly is a subset of PolyCheck/NPoly
Claim 3: PolyCheck/NPoly is a subset of Expo

Middle ground: PolyCheck/NPoly

Hierarchy: Poly \(\subseteq\) PolyCheck/NPoly \(\subseteq\) Expo
First inclusion: Every Poly problem is in PolyCheck/NPoly
- Easy to verify by just solving and comparing!
Second inclusion: Every PolyCheck/NPoly problem is in Expo
- Can solve by trying all possible solutions (exponential time)
Open questions: Are these proper subsets?
- Is Poly \(\subset\) PolyCheck/NPoly?
- Is PolyCheck/NPoly \(\subset\) Expo?

Decision problem relationships

Known: Poly \(\subseteq\) PolyCheck/NPoly \(\subseteq\) Expo
Known: Poly \(\subset\) Expo (proper subset)
Unknown: Are all these individual inclusions proper?
This diagram holds for decision problems only

Key slogan about nondeterminism

Nondeterminism does not alter computability, but it is widely believed to improve computational efficiency.

Computability: Nondeterminism does not enhance computability
Efficiency: Nondeterminism likely adds exponential speedup
Open problem: Proving this rigorously is extremely difficult
Importance: Understanding this boundary drives complexity theory
Key beliefs that are as yet unproven:
- P \(\neq\) NP
- NP \(\neq\) Exp

Complexity class relationships

P: Decision problems in Poly
NP: Decision problems in NPoly (equivalent to PolyCheck)
Exp: Decision problems in Expo
Relationships: Can you identify key relationships in this figure?

Key insights for proofgrammers

Verifiers: Formalize “easy to check” for solutions
PolyCheck: Problems with polytime verification
NPoly: Problems with nondeterministic polytime solutions
Equivalence: PolyCheck = NPoly (remember, this is proven!)
Sandwich: Poly \(\subseteq\) PolyCheck/NPoly \(\subseteq\) Expo
P versus NP: Is Poly \(\subset\) PolyCheck/NPoly?

Understanding verification unlocks complexity theory: PolyCheck/NPoly captures problems where solutions are hard to find but easy to verify. This class sits between Poly and Expo, leading to the most famous open question in computer science! We will explore more details soon! Stay tuned!