Playing with Shadows: Threat Modeling Zero-Knowledge Proofs in Production

Zero-knowledge proofs (ZKPs) promise a revolution in privacy-preserving computation, but moving from academic papers to production systems introduces unique threat vectors that few teams are prepared to handle. This comprehensive guide explores the shadow side of ZKPs—the subtle vulnerabilities, trust assumptions, and operational risks that emerge when you actually run these systems at scale. We dissect eight critical areas: from flawed circuit designs and toxic waste management to prover-client side-channel attacks and recursive proof aggregation pitfalls. Through anonymized composite scenarios, we illustrate how teams have accidentally leaked private inputs, failed to verify proof freshness, and misconfigured gas limits on-chain. The article provides actionable threat models, mitigation strategies, and decision frameworks for whether ZKPs are even the right tool for your problem. Whether you are deploying zk-Rollups, private identity systems, or confidential smart contracts, this guide helps you anticipate the shadows before they become outages. Written for senior engineers and architects, it emphasizes practical hardening over theoretical perfection. Last reviewed May 2026.

The Unspoken Risks of Zero-Knowledge Proofs in Production

Zero-knowledge proofs (ZKPs) have moved from cryptographic curiosities to production-grade tools powering scaling solutions, privacy protocols, and identity systems. But as teams rush to integrate ZKPs, a dangerous gap persists: most threat models focus on the protocol layer while ignoring the unique failure modes that emerge when you operate these systems in real-world infrastructure. This article draws on patterns observed across multiple deployments to highlight the often-overlooked attack surfaces that can compromise your ZKP system's integrity, privacy, or availability. We assume you understand basic ZKP concepts; our focus is on the operational shadows—the places where theory meets messy reality.

One core tension is that ZKPs are not monolithic; different schemes (Groth16, PLONK, STARKs) carry different trust assumptions and failure modes. For instance, Groth16 requires a trusted setup for each circuit, and if the secret randomness from that setup is compromised, an attacker can forge proofs that verify under those parameters. In production, teams often share the circuit-independent phase of a setup across multiple circuits, so a compromise of that shared phase widens the blast radius to every circuit built on it. Similarly, recursive proofs (like those in Halo or Nova) introduce new verification complexities that can hide soundness bugs. The key insight is that threat modeling for ZKPs must extend beyond the circuit itself to encompass the entire lifecycle: parameter generation, proof construction, verification, and key management. Many teams treat the ZKP as a black box, only to discover that the real vulnerabilities lie in the interfaces between the proof system and the surrounding application.

Consider a common scenario: a DeFi protocol uses a ZKP to prove solvency without revealing customer balances. The circuit is audited and seems correct. But the production deployment fails to validate that the proof is generated against the latest state root, allowing an attacker to replay a stale proof. This is not a cryptographic failure—it is a systems integration failure. The threat model must account for state synchronization, proof expiration, and the ordering of operations between the prover and verifier. These are the shadows we will illuminate in this guide.

Why Traditional Threat Models Fall Short

Standard threat modeling frameworks like STRIDE or PASTA were designed for conventional software, not cryptographic systems with unique properties like zero-knowledge and soundness. For example, the 'information disclosure' category in STRIDE maps imperfectly to ZKPs: a proof that by design reveals nothing about the witness may still be accompanied by metadata (e.g., proof size, proving time) that leaks information about private inputs statistically. Similarly, 'tampering' threats in ZKPs can manifest as proof malleability, where an attacker converts a valid proof into a different valid proof without knowing the witness. Many production systems do not implement non-malleability checks, assuming the protocol guarantees it. The lesson: you need a domain-specific threat model that considers cryptographic primitives as components with their own failure modes.

Another gap is the assumption that the prover is trusted. In many ZKP applications (e.g., private transactions), the prover is the user, and the verifier is a smart contract. The verifier must assume the prover may be malicious and attempt to forge proofs. But the prover also faces risks: a malicious verifier might try to extract information from the proof transcript or the interaction pattern. In interactive protocols, a verifier could deviate from the protocol to learn more than allowed. These mutual trust assumptions are rarely documented in production threat models. We must shift from a binary 'trusted vs. untrusted' model to a nuanced spectrum where each party has incentives to cheat in specific ways.

Core Frameworks: Deconstructing ZKP Attack Surfaces

To build a robust threat model, we need a systematic way to categorize the attack surfaces unique to ZKPs. We propose a framework organized around four layers: the arithmetic circuit, the proving system, the protocol integration, and the operational environment. Each layer has its own failure modes that can cascade into higher-level vulnerabilities. Let us examine each layer with concrete examples drawn from anonymized composite scenarios.

Arithmetic Circuit Vulnerabilities

The circuit defines the computation to be proved. Bugs here are the most fundamental: if the circuit does not correctly enforce the desired statement, then even a perfectly generated proof is worthless. Common circuit-level issues include under-constrained gates (where a value is not fully determined by the constraints, so a malicious prover can choose it freely and break soundness), over-constrained paths (which reject valid witnesses and cause completeness failures), and incorrect range checks. For instance, a circuit that verifies an age > 21 might accept an out-of-range value because circuit arithmetic is modular: without a range check on the input, a prover can supply a field element that wraps around the modulus and still satisfies the comparison. Another risk is the 'toxic waste' from trusted setups: if the randomness used to generate parameters is not destroyed, an attacker can create fake proofs. In one case, a team used a shared setup ceremony for multiple circuits, and when one circuit was later found to have a vulnerability, the parameters for all circuits were considered compromised. Mitigations include formal verification of circuits, thorough test coverage with edge cases, and using transparent setups (like STARKs) where possible.
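As a minimal sketch of what an under-constrained gate looks like, the toy model below brute-forces an "is zero" gadget over a tiny prime field. The field size, function names, and constraint layout are illustrative assumptions and are not taken from any particular circuit language.

```python
# Toy model of an "is zero" gadget over a small prime field.
# Assumption: P = 97 is chosen only so brute force is instant; real
# circuits use fields of roughly 254 bits.
P = 97


def satisfies(x, inv, out, with_second_constraint=True):
    """Check the gadget's constraints for a candidate witness (x, inv, out)."""
    c1 = (out - (1 - x * inv)) % P == 0  # constraint 1: out = 1 - x * inv
    c2 = (x * out) % P == 0              # constraint 2: x * out = 0
    return c1 and (c2 or not with_second_constraint)


def forged_witnesses(with_second_constraint):
    """Witnesses that claim 'x is zero' (out == 1) for a nonzero x."""
    return [
        (x, inv, 1)
        for x in range(1, P)
        for inv in range(P)
        if satisfies(x, inv, 1, with_second_constraint)
    ]


# Honest witness for x = 5: inv is the field inverse of x, out is 0.
honest = (5, pow(5, -1, P), 0)
```

With both constraints in place, `forged_witnesses(True)` is empty. Dropping the second constraint lets a prover set `inv = 0` and claim that any nonzero `x` is zero, which is exactly the kind of gap an exhaustive or property-based test over a small field can surface before an audit.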

Proving System Weaknesses

The proving system itself (e.g., Groth16, PLONK) can have implementation bugs or protocol-level attacks. For example, Groth16 proofs are malleable: an attacker can take a valid proof and produce a different valid proof for the same statement, which can break applications that rely on proof uniqueness (e.g., to prevent double-spending). Another example: some implementations of PLONK have suffered from 'selector' bugs where the verifier does not correctly check polynomial identities, allowing forged proofs. These are not hypothetical; several high-severity vulnerabilities have been publicly disclosed in popular ZKP libraries. The mitigation is to use well-audited implementations from reputable teams, apply the principle of least privilege to proof systems (e.g., use different parameters for different applications), and never treat the proof bytes as a unique identifier: bind a nonce or nullifier into the public inputs and deduplicate on that value instead.
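The algebra behind Groth16 malleability can be illustrated with a deliberately insecure toy model in which every group element is represented by its discrete log, so the pairing becomes multiplication modulo a prime. All constants and function names below are hypothetical; the sketch only shows why a re-randomized proof still satisfies the verification equation.

```python
import hashlib

# Toy model: each group element is represented by its discrete log mod Q,
# so the pairing e(a, b) becomes a * b mod Q. This is insecure by
# construction and only illustrates the verification equation.
Q = 2**61 - 1  # a Mersenne prime, used here as the toy group order

# Hypothetical verification-key values and public-input term L.
ALPHA, BETA, GAMMA, DELTA, L = 11, 22, 33, 44, 55


def verify(proof):
    """Groth16-style check: e(A,B) == e(alpha,beta) * e(L,gamma) * e(C,delta)."""
    a, b, c = proof
    return (a * b) % Q == (ALPHA * BETA + L * GAMMA + c * DELTA) % Q


def make_proof(a=123456789, b=987654321):
    """Solve for C so the equation holds (possible only in this toy model)."""
    c = (a * b - ALPHA * BETA - L * GAMMA) * pow(DELTA, -1, Q) % Q
    return (a, b, c)


def rerandomize(proof, r=7):
    """Map (A, B, C) to (r*A, B/r, C): the product A*B is unchanged."""
    a, b, c = proof
    return (a * r % Q, b * pow(r, -1, Q) % Q, c)


def proof_id(proof):
    """Hash of the serialized proof, as a naive deduplication key."""
    return hashlib.sha256(repr(proof).encode()).hexdigest()


original = make_proof()
mauled = rerandomize(original)
```

Both `original` and `mauled` pass `verify`, yet their `proof_id` values differ, so a replay filter keyed on the proof hash would treat the mauled proof as new. A filter keyed on a value carried in the public inputs would not.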

Protocol Integration Pitfalls

This is the layer where most production incidents occur. The ZKP is embedded in a larger system (smart contract, API, database), and the interfaces become the weakest link. Common issues include: not verifying that the public inputs used in proof generation match the current on-chain state; accepting proofs that are too old; failing to enforce proof expiration; and not checking that the prover is authorized to generate proofs for a particular statement. For example, a voting system using ZKPs to prove eligibility might accept a proof generated for a voter ID that has been revoked. Another integration risk is the 'replay attack' where an attacker captures a valid proof and resubmits it later. Mitigations include adding nonces or timestamps to the public inputs, using unique proof identifiers, and maintaining a set of used identifiers to detect replays. Additionally, the verifier must be implemented correctly, especially in gas-constrained environments like Ethereum, where verifier contracts are heavily optimized and may skip important checks.
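The integration checks listed above can be sketched as a thin gatekeeper that runs around the cryptographic verifier. The class and field names are hypothetical, and `proof_is_valid` stands in for the result of the real verification call. The sketch assumes the nonce, expiry, and identity commitment are public inputs bound by the circuit; otherwise an attacker could simply change them.

```python
import time
from dataclasses import dataclass, field


@dataclass
class PublicInputs:
    # Hypothetical public inputs that the circuit is assumed to bind.
    voter_id_commitment: str
    nonce: str
    expires_at: float  # unix seconds


@dataclass
class Gatekeeper:
    revoked: set = field(default_factory=set)
    used_nonces: set = field(default_factory=set)

    def accept(self, inputs, proof_is_valid, now=None):
        """Return (ok, reason); proof_is_valid stands in for the real verifier."""
        now = time.time() if now is None else now
        if not proof_is_valid:
            return False, "invalid proof"
        if now > inputs.expires_at:
            return False, "expired"
        if inputs.voter_id_commitment in self.revoked:
            return False, "revoked"
        if inputs.nonce in self.used_nonces:
            return False, "replay"
        # Record the nonce only after every other check has passed.
        self.used_nonces.add(inputs.nonce)
        return True, "ok"
```

Recording the nonce last means a rejected submission cannot burn an honest user's nonce, which is a design choice worth stating explicitly in the threat model.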

Operational Environment Threats

Finally, the infrastructure running the prover and verifier introduces classical security issues: side-channel attacks (timing, power, electromagnetic) that leak information about the witness; tampering with stored proving and verification keys; and network attacks that intercept or modify proofs in transit. For instance, a prover running in a shared cloud environment might leak private inputs via CPU cache timing attacks during proof generation. Similarly, if key material is stored on disk without integrity protection, an attacker with file system access can swap the verification key for one that accepts forged proofs, or corrupt the proving key so that honest proofs fail. In most SNARKs the proving key itself is public, so the secrets to protect are the witness and any setup randomness. Mitigations include pinning and checking digests of key files, using hardware security modules (HSMs) for long-lived secrets, running provers in isolated environments, and applying constant-time programming techniques to prevent timing leaks. The operational threat model should also consider denial-of-service attacks: an attacker could submit many invalid proofs to exhaust verifier resources, or generate extremely large proofs (e.g., in recursive proof systems) to exceed gas limits.

Execution: A Step-by-Step Threat Modeling Process

Now that we have a framework, let us walk through a practical threat modeling exercise for a typical ZKP deployment: a private asset transfer system on a blockchain. The goal is to identify threats, prioritize them, and design mitigations. We will follow a four-phase process: asset inventory, trust boundary mapping, threat enumeration, and risk assessment. This process should be repeated whenever the system changes significantly.

Phase 1: Asset Inventory

First, list all assets that need protection: the private inputs (sender, receiver, amount), the proving keys, the verification keys, the proof transcripts, any secret randomness from the setup, and the system's reputation/availability. For each asset, identify its value and the impact of compromise. For example, if the setup randomness is leaked or the verification key is replaced, an attacker can forge proofs for arbitrary transactions, leading to theft of funds. If private inputs are leaked, user privacy is violated. This inventory drives the entire threat model.

Phase 2: Trust Boundary Mapping

Draw a diagram showing all components: the user's client (prover), the blockchain (verifier), the relayer or sequencer (if any), the key generation ceremony participants, and any off-chain oracles. Mark trust boundaries where data moves from one trust domain to another. For example, the user's client is trusted by the user to handle private inputs, but is untrusted by the verifier. The key generation ceremony is a critical trust anchor: in a multi-party ceremony the setup remains sound as long as at least one participant honestly destroys their secret, but it is compromised if every participant colludes or is breached. The relayer (which forwards proofs to the chain) is a potential censorship point. Mapping these boundaries clarifies where attacks can originate and what assumptions we are making.

Phase 3: Threat Enumeration

For each trust boundary, enumerate threats using a ZKP-specific taxonomy: soundness (can a prover convince a verifier of a false statement?), zero-knowledge (can a verifier learn more than the validity of the statement?), malleability (can a proof be transformed into another valid proof?), and freshness (can a proof be replayed?). Also consider availability (can the system be forced offline?) and privacy from metadata (can an observer infer information from proof sizes, timings, or network patterns?). For each threat, document the attacker's capability, the attack vector, and the potential impact.

Phase 4: Risk Assessment and Mitigation

Assign a risk level (low, medium, high, critical) to each threat based on likelihood and impact. Then design mitigations. For high-risk threats like soundness bugs, mitigations include multiple independent circuit audits, formal verification, and using multiple proof systems (e.g., a STARK for soundness and a SNARK for efficiency). For medium risks like malleability, bind a unique nonce or nullifier into the public inputs and reject any value that has been seen before. For low risks like timing side-channels, use constant-time implementations and run provers in isolated environments. Document all assumptions, especially about trust: for example, 'we assume at least one key generation ceremony participant was honest and destroyed their randomness'. If these assumptions are violated, the threat model must be updated.
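One simple way to keep the risk assessment reproducible is a small threat register with an explicit scoring rule. The 1 to 4 scales, the thresholds, and the example scores below are assumptions chosen for illustration; calibrate them to your own system.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    name: str
    likelihood: int  # 1 (rare) .. 4 (expected)
    impact: int      # 1 (minor) .. 4 (loss of funds or privacy)

    def score(self):
        return self.likelihood * self.impact  # ranges from 1 to 16

    def risk(self):
        """Map the score to a level; thresholds are illustrative."""
        s = self.score()
        if s >= 12:
            return "critical"
        if s >= 8:
            return "high"
        if s >= 4:
            return "medium"
        return "low"


register = [
    Threat("timing side channel", likelihood=1, impact=3),
    Threat("soundness bug in circuit", likelihood=2, impact=4),
    Threat("proof malleability", likelihood=2, impact=2),
]

# Highest-risk threats first, so mitigation work is ordered by priority.
ranked = sorted(register, key=Threat.score, reverse=True)
```

Keeping the register in version control alongside the circuits makes it easy to see when a change silently shifts a threat from one level to another.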

Example Walkthrough: Private Asset Transfer

Consider a system where Alice sends tokens to Bob privately using a ZKP. The circuit proves that the sender has sufficient balance and that the transaction does not create or destroy tokens. Threats include: (1) Alice could double-spend by generating two proofs for the same balance commitment; mitigation: include a unique nullifier in the circuit. (2) An attacker could replay Alice's proof; mitigation: include a timestamp or sequence number. (3) The relayer could censor Alice's transaction; mitigation: allow users to submit directly to the blockchain. (4) A malicious prover could create a valid proof for an invalid transaction if the circuit has a bug; mitigation: rigorous testing and audit. This process should be documented and reviewed by independent security experts.

Tools, Stack, and Maintenance Realities

Choosing the right tools and maintaining them over time is a critical but often underestimated part of ZKP threat modeling. The ecosystem evolves rapidly, and libraries that were secure a year ago may have disclosed vulnerabilities. Teams must adopt a proactive maintenance strategy that includes continuous monitoring of security advisories, regular updates, and periodic re-audits. Let us examine the key components of a production ZKP stack and the maintenance challenges for each.

Proving System Libraries

Popular toolchains include Bellman (Rust, Groth16), gnark (Go, Groth16/PLONK), and Circom with snarkjs (a circuit DSL with a JavaScript prover, Groth16/PLONK). Each has its own security track record. A soundness-affecting bug in a low-level routine such as scalar multiplication, for example, forces every team using that library to update to the patched version immediately. The maintenance burden is non-trivial: you must track all upstream changes, test compatibility with your circuits, and redeploy verifiers (which may be on-chain smart contracts). One strategy is to use a wrapper library that abstracts away the underlying prover, allowing you to swap implementations without changing your circuit code. Another is to pin specific versions and run your own CI that alerts you to new releases and security advisories. Do not rely solely on automated dependency scanning; many ZKP vulnerabilities are protocol-level rather than implementation-level, and scanners may not detect them.

Smart Contract Verifiers

If your verifier runs on Ethereum (or another chain), you face the challenge of upgradability and gas optimization. Many deployed verifier contracts are not upgradeable, meaning that if a vulnerability is found, you cannot fix it without a migration. Some teams use proxy patterns to allow contract upgrades, but this introduces trust assumptions (the upgrade mechanism could be abused). Additionally, gas optimization may lead to verifier contracts that skip some checks; you must verify that the optimized verifier still performs every check the proof system requires. For example, a verifier might omit the check that the proof is fresh (i.e., that it references the latest state root), relying instead on the application layer, but if the application layer forgets, the system is vulnerable. Maintenance includes monitoring the chain for potential replay attacks and updating the verifier if the proving system changes.

Key and Parameter Management

Proving keys and verification keys are sensitive assets, and both must be integrity-protected. In most SNARKs the proving key is public, so leaking it does not let anyone forge proofs, but a tampered proving key can make honest proofs fail, and a tampered verification key can make the verifier accept invalid proofs. In practice, keys are often loaded from configuration files or environment variables without any integrity check, which is insecure. Better approaches: pin a digest of each key file and check it at load time, and keep genuinely secret material (witness data, signing keys) in a key management service (KMS) or hardware security module (HSM), loading it into memory only when needed. For verification keys, store them on-chain (for blockchain applications) or in a signed configuration file. When keys are retired (e.g., after a circuit update), you must remove the old keys from every verifier and rotate to new ones. This process must be automated and logged. Additionally, the trusted setup ceremony produces 'toxic waste', the random values used to generate parameters. These must be destroyed securely (e.g., by a multi-party protocol that ensures no single party retains the secret). If you reuse parameters from a public ceremony, verify that the ceremony was conducted correctly and that the parameters you use match its published transcript.
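Integrity protection for key and parameter files can be as simple as checking a pinned digest before use. The following is a minimal sketch; the function name is hypothetical, and it assumes the expected SHA-256 digest is distributed through a channel you already trust (for example, a signed release or a value committed on-chain).

```python
import hashlib
import hmac


def load_verified(path, pinned_hex):
    """Read a key or parameter file and refuse it if its digest is unexpected."""
    with open(path, "rb") as f:
        data = f.read()
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    if not hmac.compare_digest(actual, pinned_hex.lower()):
        raise ValueError(f"digest mismatch for {path}")
    return data
```

Hashing the bytes that were actually read, rather than hashing the file and then opening it again, avoids a window in which the file could be swapped between the check and the use.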

Monitoring and Incident Response

Even with perfect threat modeling, incidents will happen. You need monitoring to detect attacks in progress: for example, monitor for proofs that fail verification (could indicate attempted forgeries), proofs with unusual sizes (could indicate malformed or malleated submissions), or a spike in proof submissions (could indicate a replay or denial-of-service attempt). Set up alerts for these events. Have an incident response plan that includes: how to halt proof acceptance (e.g., pause the smart contract), how to roll back to a previous safe state, and how to communicate with users. Because many ZKP systems are immutable (especially on-chain), responses may require social consensus or hard forks. Document these procedures and practice them regularly.
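A verification-failure alert can be sketched as a sliding window over recent results. The class name, window size, and thresholds are assumptions for illustration; a production system would more likely feed these counters into its existing metrics stack.

```python
from collections import deque


class FailureRateMonitor:
    """Track the share of failed verifications over the last `window` proofs."""

    def __init__(self, window=100, threshold=0.2, min_samples=20):
        self.results = deque(maxlen=window)  # old results fall off automatically
        self.threshold = threshold
        self.min_samples = min_samples

    def record(self, verified_ok):
        """Store one verification result and report whether to alert."""
        self.results.append(bool(verified_ok))
        return self.should_alert()

    def failure_rate(self):
        if not self.results:
            return 0.0
        return self.results.count(False) / len(self.results)

    def should_alert(self):
        # Require a minimum sample count so one early failure does not page anyone.
        return (
            len(self.results) >= self.min_samples
            and self.failure_rate() > self.threshold
        )
```

The same structure works for submission-rate spikes by recording timestamps instead of booleans.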

Growth Mechanics: Scaling Threat Modeling as Your System Evolves

As your ZKP system gains adoption, the threat landscape expands. New users bring new attack vectors, and system complexity increases the chance of integration mistakes. Threat modeling must be a living practice that evolves with the system. This section covers how to scale your threat modeling efforts, from initial deployment to mature ecosystem with multiple stakeholders.

Versioning and Dependency Management

Every time you update a circuit, proving system, or verifier, you should re-run your threat model. The most common cause of production incidents is a seemingly minor change that introduces a new trust assumption or breaks a security invariant. For example, adding a new public input to a circuit might inadvertently expose private data if the input is not properly blinded. Use semantic versioning for your circuits and proof systems, and maintain a security changelog that documents the impact of each change. When you upgrade a dependency (e.g., a ZKP library), review the release notes for security fixes and test the new version against your threat model. Automate this process as much as possible: include security checks in your CI/CD pipeline that verify that the new version does not break any documented assumptions.
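One check that is easy to automate in CI is a diff of the public inputs between circuit versions, since a newly exposed signal is a common way for private data to leak. The sketch below assumes you can export the list of public signal names for each version; the function name and the example signal names are hypothetical.

```python
def unapproved_public_inputs(old_signals, new_signals, approved=()):
    """Return public signals added in the new version that nobody approved."""
    added = set(new_signals) - set(old_signals)
    return sorted(added - set(approved))


# Hypothetical public signal lists for two circuit versions.
v1 = ["state_root", "nullifier"]
v2 = ["state_root", "nullifier", "amount_bucket"]
```

A CI job could fail the build whenever `unapproved_public_inputs(v1, v2)` is non-empty, forcing the change to be recorded in the security changelog before it ships.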

Expanding Trust Domains

In early stages, you might have a single trusted team managing the entire system. As you grow, you may introduce multiple provers (e.g., user clients), multiple verifiers (e.g., different chains or off-chain services), and third-party integrations. Each new trust domain adds new attack surfaces. For example, if you allow third-party applications to verify your proofs, you must ensure that your verifier smart contract is robust against malicious inputs from those applications. Similarly, if you open-source your circuit, you must consider that attackers will study it for bugs. Document the trust model for each stakeholder: what can they be trusted to do, and what can they not be trusted to do? For instance, user clients can be trusted to generate proofs, but not to protect their own private keys. Use this documentation to guide security reviews and penetration testing.

Handling Multiple Proof Systems

Some systems use multiple proof systems for different purposes (e.g., STARKs for soundness, SNARKs for efficiency). This adds complexity: you must ensure that the proofs are linked correctly and that a vulnerability in one system does not compromise the other. For example, if you have a STARK that produces a proof of correct execution, and a SNARK that compresses that proof, an attacker might forge the SNARK if its circuit is not properly constrained. The threat model should cover the composition: what does each proof prove, and how are they combined? Consider using a 'universal' verifier that checks all proofs together, rather than separate verifiers that an attacker could trick into accepting a mismatch.

Community and Governance

Mature ZKP systems often have a community of users and developers who contribute to the protocol. Governance processes (e.g., DAO votes) can introduce new threats: for example, a malicious proposal could upgrade the verifier to accept forged proofs. The threat model must include governance attacks and require security reviews for all proposals that change the protocol. Additionally, community-run provers or relayers may have incentives to cheat, requiring reputation systems or slashing conditions. Threat modeling for decentralized systems is an active research area, but at minimum, you should document the governance process and the security checks that apply to each action.

Risks, Pitfalls, and Mistakes: Lessons from the Shadows

Even with careful planning, teams make mistakes. This section catalogs the most common pitfalls we have observed in production ZKP deployments, along with concrete mitigations. Learning from others' mistakes is cheaper than making them yourself.

Pitfall 1: Ignoring Proof Freshness

Many systems accept proofs without verifying that they are based on the current state. For example, a private payment system might accept a proof that the sender had a certain balance at some point in the past, even if the balance has since been spent. This enables replay attacks and double-spending. Mitigation: always include a state root or timestamp in the public inputs, and verify it against current state. For blockchain applications, require that the proof references a recent state root from a short, bounded window, since demanding the very latest block hash would reject honest proofs that took a few blocks to generate, and pair this with nullifiers so an already-spent balance cannot be proven twice. For off-chain systems, use a trusted timestamping service.
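Because proof generation takes time, a verifier that only accepts the single newest root can reject honest users. One common compromise is a bounded window of recent roots, sketched below; the class name and window size are assumptions, and the window limits staleness but does not by itself prevent double-spending.

```python
from collections import deque


class RecentRoots:
    """Keep the last `max_age` state roots and accept proofs against any of them."""

    def __init__(self, max_age=8):
        self.roots = deque(maxlen=max_age)  # oldest root is evicted automatically

    def push(self, root):
        """Record the state root of a newly finalized block or batch."""
        self.roots.append(root)

    def is_fresh(self, root):
        """True if the proof's public state root is still inside the window."""
        return root in self.roots
```

The window size is a direct trade-off: a larger value tolerates slower provers, while a smaller value shrinks the period in which a stale proof is still accepted.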

Pitfall 2: Trusting the Prover's Randomness

Many ZKP schemes are interactive protocols made non-interactive via the Fiat-Shamir transform, in which the verifier's random challenges are replaced by a hash of the transcript. If that hash omits part of the statement (for example, the public inputs), an attacker can choose those values after seeing the challenge and forge proofs. Separately, the prover's own blinding randomness must be unpredictable and never reused; otherwise the witness can leak, the same mistake that exposed private keys in early implementations of Schnorr signatures. Mitigation: use a cryptographically secure random number generator, and include the public inputs and all prior prover messages in the Fiat-Shamir hash. Audit the randomness generation and transcript code carefully.
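The importance of hashing the statement into the Fiat-Shamir challenge can be shown with a toy Schnorr proof of knowledge. The group parameters are tiny and chosen only for illustration, and the function names are hypothetical. In the weak variant the challenge depends only on the commitment `R`, so an attacker can fix `R` and `s` first and then solve for a public key `X` that makes the proof verify.

```python
import hashlib

# Toy Schnorr group: P = 2Q + 1 with P and Q prime; G generates the
# order-Q subgroup. These parameters are far too small for real use.
P, Q, G = 2039, 1019, 4


def challenge(*parts):
    """Hash the given transcript parts to a challenge in [0, Q)."""
    data = "|".join(str(x) for x in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def verify(X, R, s, strong):
    """Check G^s == R * X^c, where c is the Fiat-Shamir challenge."""
    c = challenge(G, X, R) if strong else challenge(R)
    return pow(G, s, P) == (R * pow(X, c, P)) % P


def forge_against_weak(k, s=7):
    """Pick R and s first, then solve for a statement X that fits them."""
    R = pow(G, k, P)
    c = challenge(R)
    if c == 0:
        # Degenerate case: X^0 == 1, so any X works once G^s == R.
        return G, R, k % Q
    base = pow(G, s, P) * pow(R, -1, P) % P
    X = pow(base, pow(c, -1, Q), P)  # X^c == base within the order-Q subgroup
    return X, R, s
```

Every tuple returned by `forge_against_weak` passes `verify(..., strong=False)`. With `strong=True` the challenge changes as soon as `X` changes, so the attacker can no longer choose the statement after the challenge is fixed. In this toy the attacker happens to know `k`; the point is the order of operations, not the specific values.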

Pitfall 3: Overlooking Side Channels

Proof generation can leak information about the witness through timing, power consumption, or electromagnetic emissions. In a cloud environment, a co-located attacker might observe cache timing to infer private inputs. Mitigation: use constant-time implementations for all cryptographic operations, avoid data-dependent branches, and run provers in isolated environments (e.g., dedicated hardware). For high-security applications, consider using a trusted execution environment (TEE) for the prover.

Pitfall 4: Insufficient Testing of Edge Cases

Circuits often behave unexpectedly with edge-case inputs (zero, negative numbers, overflow, etc.). For example, a circuit that checks equality might fail if both inputs are zero because of an under-constrained gate. Mitigation: write extensive property-based tests that generate random inputs, and use formal verification tools to prove that the circuit is correct for all inputs. Also test the verifier with invalid proofs to ensure it rejects them.

Pitfall 5: Key Management Sloppiness

Proving and verification keys loaded without integrity checks, secrets committed to source code repositories, and toxic waste not destroyed are common. Mitigation: use a key management policy that covers generation, storage, rotation, and destruction. Automate key rotation and use HSMs where possible. For trusted setup ceremonies, use multi-party computation (MPC) to ensure no single party holds the secret.

Pitfall 6: Assuming the Verifier is Correct

The verifier is the gatekeeper; if it has a bug, all bets are off. Verifier bugs have been found in production systems, such as missing checks in the verifier implementation. Mitigation: formally verify the verifier, or at minimum, have it independently audited. Use multiple verifiers (e.g., on-chain and off-chain) and compare their outputs.

Pitfall 7: Not Planning for Compromise

Assume that at some point, a component will be compromised. Do you have a way to pause the system, revoke keys, or migrate to a new setup? Many systems lack emergency stop mechanisms. Mitigation: design for incident response from day one. Include a pause function (with appropriate access control), a process for key revocation, and a migration plan to a new trusted setup if needed.

Mini-FAQ: Common Questions from Practitioners

This section addresses questions that frequently arise when teams start threat modeling their ZKP systems. The answers are based on patterns observed across multiple projects and are intended to guide your decision-making, not to replace professional security review.

Do I need a trusted setup? Can I avoid it?

Trusted setups are required for many efficient SNARKs (e.g., Groth16). They introduce a single point of failure: if the setup is compromised, all proofs are forgeable. Transparent setups (like STARKs) avoid this but have larger proof sizes and higher verification costs. Your choice depends on your threat model: if you cannot tolerate the risk of a compromised setup, use transparent or updatable setups (like PLONK with a universal setup). If you use a trusted setup, ensure it is conducted via an MPC ceremony with many participants, and that the toxic waste is destroyed. Document the ceremony and have it audited.
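The advice to use many participants can be made concrete under a strong simplifying assumption: each of the $N$ participants independently fails to destroy (or leaks) their secret with probability $p$. The setup is compromised only if all of them fail, so:

```latex
\Pr[\text{setup compromised}] = p^{N},
\qquad \text{e.g. } p = 0.5,\ N = 20
\;\Rightarrow\; p^{N} = 2^{-20} \approx 9.5 \times 10^{-7}.
```

Independence is the assumption to challenge in your threat model: participants who share software, hardware, or an employer can fail together, which makes the real probability higher than this bound suggests.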

How often should I re-audit my circuits?

Re-audit whenever the circuit changes, and at least annually if it does not. Even without changes, new vulnerabilities in the proving system or library may affect your circuit. Also re-audit if the threat model changes (e.g., new use cases, new trust domains). Audits should be performed by independent firms with ZKP expertise. Do not rely solely on internal reviews.

Can I use the same proving key for multiple circuits?

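No, unless the circuits are identical. In circuit-specific schemes such as Groth16, proving and verification keys are tied to a specific circuit structure, so a key generated for one circuit cannot produce or check proofs for another. The practical risk is a configuration mix-up: a verifier loaded with the wrong verification key will reject honest proofs or, worse, accept proofs for a statement you did not intend. If you have multiple circuits, generate separate keys for each and label them unambiguously. If you use a universal setup (like PLONK), you can use the same structured reference string (SRS) for many circuits, but each circuit still has its own proving and verification keys.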

What about privacy from metadata?

ZKPs protect the content of proofs, but metadata (proof size, proving time, network traffic patterns) can leak information. For example, in a private voting system, the number of votes might be inferred from the number of proofs submitted. Consider using constant-size proofs and padding to hide the actual number. Also consider using mix networks or differential privacy to further obfuscate metadata.

How do I handle proof malleability?

Do not rely on the proof bytes being unique. Instead, bind a unique identifier (e.g., a nonce, nullifier, or transaction ID) into the public inputs so that the circuit commits to it, and have the verifier reject any identifier it has already seen. An attacker can still re-randomize a valid proof, but the mauled proof carries the same identifier and is rejected as a duplicate. Also ensure that the verifier checks that the proof is well-formed (e.g., that curve points are on the curve and in the correct subgroup). Use well-tested libraries that document their malleability properties.

Should I use recursion? What are the risks?

Recursive proofs (proofs that verify other proofs) are powerful for scaling but introduce new attack surfaces. The verifier must correctly check that the inner proof is valid, and the recursion circuit must be correct. Bugs in recursion circuits have been found in the past. Start with simple recursion and thoroughly test each layer. Consider using a verified recursion library.

Synthesis and Next Actions: Building a Resilient ZKP System

Threat modeling zero-knowledge proofs in production is not a one-time exercise but an ongoing discipline that must keep pace with system evolution and ecosystem changes. The key takeaway is that ZKPs introduce a new class of vulnerabilities that traditional cybersecurity training does not cover. Teams must invest in specialized knowledge, rigorous processes, and a culture of security that includes everyone from developers to operators. This final section synthesizes the core principles and provides a concrete action plan to strengthen your system.

First, adopt a layered threat model that covers the circuit, the proving system, the integration, and the operational environment. For each layer, document your trust assumptions and verify them through testing, auditing, and formal methods. Do not assume that a library or tool is secure because it is popular; verify its security track record and monitor for security advisories. Second, implement strong key management: integrity-protect proving and verification keys, keep genuine secrets in HSMs or a KMS, rotate keys when circuits change, and retire old keys securely. For trusted setups, use multi-party ceremonies and verify the destruction of toxic waste. Third, design for incident response: have a pause mechanism, a key revocation process, and a migration plan for when things go wrong. Practice these procedures regularly.

Fourth, invest in monitoring and alerting for anomalous proof activity. Set up dashboards that track proof submission rates, verification failure rates, and proof sizes. Use these metrics to detect attacks early. Fifth, engage with the ZKP security community: attend conferences, read security advisories, and share your own findings (anonymized). The ecosystem benefits from collective vigilance. Finally, remember that threat modeling is a team sport. Involve developers, operations, and security engineers in regular threat modeling sessions. Use the framework described in this article as a starting point, but adapt it to your specific context. The shadows are real, but with careful planning, you can play with them safely.

Next steps: schedule a threat modeling workshop for your team, review your current key management practices, and audit your most critical circuits. If you are using a trusted setup, verify that the ceremony was conducted securely. If you are not monitoring proof activity, set up basic logging today. The cost of prevention is far lower than the cost of a breach.

About the Author

This article was prepared by the editorial team for this publication. We focus on practical explanations and update articles when major practices change.

Last reviewed: May 2026
