3.4 Anti-Gaming Mechanisms

1. Synthetic Ground Truth Tasks (15–20%)

At a 17.5% injection rate (the midpoint of the 15–20% range), a validator silently replaces a real benchmark task with a synthetic variant derived at runtime. The variant keeps the original task's description and reference answer but carries a new opaque task ID (syn_<8-hex>) and tighter budget and latency constraints.

Miners cannot distinguish a synthetic task from a real one. Validators score it with the same deterministic methods (ROUGE-L, test pass rate, schema validation), with no LLM judge.

The derivation is fully reproducible given CSWON_SYNTHETIC_SALT, the validator hotkey, and the block:

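import hashlib
# SALT holds the value of CSWON_SYNTHETIC_SALT; validator_hotkey and block
# are supplied by the validator at runtime.
seed = f"{SALT}:{validator_hotkey}:{block}".encode()
# The first 8 hex characters of the SHA-256 digest form the opaque ID suffix.
task_id = "syn_" + hashlib.sha256(seed).hexdigest()[:8]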
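As a rough sketch of how the same seed could also drive the injection decision, the function below derives the synthetic ID from the documented seed format and then maps part of the digest to a number in [0, 1) that is compared against the 17.5% rate. The name `derive_synthetic`, the constant `INJECTION_RATE`, and the use of the last eight digest bytes as the draw are assumptions made for illustration; the source specifies only the seed format and the ID derivation.

```python
import hashlib

INJECTION_RATE = 0.175  # rate stated in the text; the draw logic below is illustrative


def derive_synthetic(salt: str, validator_hotkey: str, block: int) -> tuple[str, bool]:
    """Return (synthetic task ID, whether to inject a synthetic task at this block)."""
    seed = f"{salt}:{validator_hotkey}:{block}".encode()
    digest = hashlib.sha256(seed).digest()
    # Same ID rule as the documented derivation: "syn_" + first 8 hex characters.
    task_id = "syn_" + digest.hex()[:8]
    # Assumption: reuse the last 8 digest bytes as a uniform draw in [0, 1).
    draw = int.from_bytes(digest[-8:], "big") / 2**64
    return task_id, draw < INJECTION_RATE
```

Because both outputs depend only on the salt, hotkey, and block, anyone holding the salt can replay the decision later and confirm which tasks were synthetic.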

Full synthetic task protocol →

2. VRF-Keyed Per-Validator Task Schedule

Copy resistance: A miner cannot serve a cached plan, because the task it receives is derived from the SHA-256 hash of "validator_hotkey:block_height" and therefore changes with both the querying validator and the block. Copying the current top miner's last submission will generally not answer a different validator's query.

import hashlib
seed       = f"{validator_hotkey}:{current_block}".encode()
h          = hashlib.sha256(seed).digest()
task_index = int.from_bytes(h, 'big') % len(benchmark_tasks)
task       = benchmark_tasks[task_index]

Different validators query different tasks at the same block height. Cross-validator score comparison uses distributional statistics over the rolling window, not identical-task point comparisons.
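The schedule and the comparison it implies can be sketched as follows. `select_task_index` wraps the derivation shown above in a function; `window_summary` is a hypothetical stand-in for the "distributional statistics" mentioned in the text, which the source does not define further, so the choice of mean and median here is an assumption.

```python
import hashlib
from statistics import mean, median


def select_task_index(validator_hotkey: str, block: int, n_tasks: int) -> int:
    """SHA-256 of 'hotkey:block', reduced modulo the number of benchmark tasks."""
    h = hashlib.sha256(f"{validator_hotkey}:{block}".encode()).digest()
    return int.from_bytes(h, "big") % n_tasks


def window_summary(scores: list[float], window: int) -> dict[str, float]:
    """Summarise the most recent `window` scores (illustrative statistics only)."""
    if window <= 0:
        raise ValueError("window must be positive")
    recent = scores[-window:]
    return {"mean": mean(recent), "median": median(recent)}
```

Two validators would call `select_task_index` with their own hotkeys at the same block and usually land on different tasks, so their miners' scores are compared through summaries like these rather than task by task.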

3. Scoring Version Enforcement

Validators encode SCORING_VERSION as an integer in __spec_version__ and as a human-readable string in axon.info.description. Mismatches are detectable from the live metagraph.
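A mismatch check over metagraph data might look like the sketch below. It deliberately avoids any real Bittensor API: `ValidatorInfo` is a hypothetical snapshot of the two advertised fields, and the rule that the human-readable label must appear somewhere in the description string is an assumption, since the source does not say how the string is formatted.

```python
from dataclasses import dataclass


@dataclass
class ValidatorInfo:
    """Hypothetical snapshot of what one validator advertises."""
    hotkey: str
    spec_version: int  # integer form of SCORING_VERSION
    description: str   # human-readable form


def find_version_mismatches(validators, expected_version, expected_label):
    """Return hotkeys whose advertised scoring version differs from the expected one."""
    return [
        v.hotkey
        for v in validators
        if v.spec_version != expected_version or expected_label not in v.description
    ]
```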

4. Dynamic Benchmark Rotation

Anti-overfitting: When more than 70% of miners score above 0.90 on a task for 3 consecutive tempos, the task is deprecated and rotated out, so memorising a fixed benchmark gives no lasting advantage.
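The rotation rule can be expressed as a small predicate. This is a minimal sketch: the function name, the argument layout (one list of miner scores per tempo, oldest first), and the treatment of an empty tempo as "not saturated" are assumptions; the three thresholds come from the text.

```python
def should_rotate(tempo_scores, share=0.70, threshold=0.90, streak=3):
    """True when more than `share` of miners beat `threshold` in each of the last `streak` tempos."""
    if len(tempo_scores) < streak:
        return False
    for scores in tempo_scores[-streak:]:
        if not scores:
            return False  # assumption: a tempo with no scores breaks the streak
        above = sum(1 for s in scores if s > threshold)
        if above / len(scores) <= share:
            return False
    return True
```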

5. Execution Sandboxing

Validators execute all workflows in isolated Docker containers, tracking actual TAO costs, latency, retries, and step completions.

6. Temporal Consistency Checks

Sudden unexplained performance jumps trigger a manual audit flag in the validator dashboard.
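The source does not say how a "sudden" jump is detected. One plausible approach, shown here purely as an assumption, compares a miner's latest score with its own recent history using a z-score; the threshold, the minimum history length, and the floor on the standard deviation are all hypothetical parameters.

```python
from statistics import mean, pstdev


def flag_jump(history, latest, z_threshold=3.0, min_history=5, min_sigma=0.01):
    """Return True when `latest` sits far above the miner's own recent scores."""
    if len(history) < min_history:
        return False  # too little data to judge
    mu = mean(history)
    # Floor the spread so a perfectly flat history does not divide by zero.
    sigma = max(pstdev(history), min_sigma)
    return (latest - mu) / sigma > z_threshold
```

A flag raised this way would only queue the miner for manual review, in line with the text; it would not change scores by itself.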

7. Completion Ratio Enforcement

Submitting a single-step workflow for a multi-step task always results in a proportionally penalised S_success.
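Read literally, "proportionally penalised" suggests scaling by the fraction of required steps actually completed. The sketch below assumes exactly that linear rule; the function name and the capping of extra steps at the required count are illustrative choices, not taken from the source.

```python
def penalised_success(raw_success, completed_steps, required_steps):
    """Scale a success score by the fraction of required steps completed."""
    if required_steps <= 0:
        raise ValueError("required_steps must be positive")
    # Extra steps beyond the requirement earn no bonus.
    ratio = min(completed_steps, required_steps) / required_steps
    return raw_success * ratio
```

Under this assumption, a single-step workflow submitted for a four-step task keeps only a quarter of its raw score.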

8. Budget Ceiling

budget_ceiling = min(constraints["max_budget_tao"],
                     1.5 * workflow_plan["total_estimated_cost"])

Validators abort execution when cumulative cost reaches the ceiling, preventing malicious miners from forcing expensive sandbox executions.
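The abort behaviour can be sketched as a loop that checks cumulative spend before each step. The ceiling formula is the one given above; the function name, the `step_costs` list of realised per-step costs, and the returned dictionary are hypothetical.

```python
def run_with_ceiling(step_costs, max_budget_tao, total_estimated_cost):
    """Run steps in order and stop once cumulative cost reaches the ceiling."""
    budget_ceiling = min(max_budget_tao, 1.5 * total_estimated_cost)
    spent = 0.0
    completed = 0
    aborted = False
    for cost in step_costs:
        if spent >= budget_ceiling:
            aborted = True  # ceiling reached: remaining steps are skipped
            break
        spent += cost
        completed += 1
    return {"aborted": aborted, "completed": completed, "spent": spent}
```

Because the ceiling is also bounded by 1.5 times the miner's own estimate, a plan that understates its cost is cut off early instead of running up the validator's bill.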

Incentive Alignment and Penalties

C-SWON does not introduce on-chain slashing, because Bittensor does not support automatic stake slashing at the protocol level. Instead, penalties act through three existing mechanisms:

Yuma Consensus bonds: Validators whose scores deviate from consensus earn progressively less
Delegation flow: Delegators move stake away from misbehaving validators
Governance control: Subnet owner can adjust validator limits and prune permits

