3.4 Anti-Gaming Mechanisms
1. Synthetic Ground Truth Tasks (15–20%)
At a 17.5% injection rate, validators secretly replace a real benchmark task with a synthetic variant derived at runtime. The variant has the same description and reference answer as the original task, but a new opaque task ID (syn_<8-hex>) and tighter budget/latency constraints.
Miners cannot distinguish a synthetic from a real task. Validators score it using the same deterministic methods (ROUGE-L, test pass rate, schema validation) — no LLM judge.
The derivation is fully reproducible given the CSWON_SYNTHETIC_SALT:
import hashlib
seed = f"{SALT}:{validator_hotkey}:{block}".encode()
task_id = "syn_" + hashlib.sha256(seed).hexdigest()[:8]
Full synthetic task protocol →
2. VRF-Keyed Per-Validator Task Schedule
Copy resistance: Miners cannot serve a cached plan because the task they receive is derived from
hash(validator_hotkey + block_height). Copying the current top miner's last submission will not reproduce the correct response to a different validator's query.
import hashlib
seed = f"{validator_hotkey}:{current_block}".encode()
h = hashlib.sha256(seed).digest()
task_index = int.from_bytes(h, 'big') % len(benchmark_tasks)
task = benchmark_tasks[task_index]
Different validators query different tasks at the same block height. Cross-validator score comparison uses distributional statistics over the rolling window, not identical-task point comparisons.
3. Scoring Version Enforcement
Validators encode SCORING_VERSION as an integer in __spec_version__ and as a human-readable string in axon.info.description. Mismatches are detectable from the live metagraph.
4. Dynamic Benchmark Rotation
Anti-overfitting: When >70% of miners score above 0.90 for 3 consecutive tempos, the task is deprecated and rotated. Miners cannot memorise the benchmark.
5. Execution Sandboxing
Validators execute all workflows in isolated Docker containers, tracking actual TAO costs, latency, retries, and step completions.
6. Temporal Consistency Checks
Sudden unexplained performance jumps trigger a manual audit flag in the validator dashboard.
7. Completion Ratio Enforcement
Submitting a single-step workflow for a multi-step task always results in a proportionally penalised S_success.
8. Budget Ceiling
budget_ceiling = min(constraints["max_budget_tao"],
1.5 * workflow_plan["total_estimated_cost"])
Validators abort execution when cumulative cost reaches the ceiling, preventing malicious miners from forcing expensive sandbox executions.
Incentive Alignment and Penalties
C-SWON does not introduce on-chain slashing — Bittensor does not support automatic stake slashing at the protocol level. Instead:
- Yuma Consensus bonds: Validators whose scores deviate from consensus earn progressively less
- Delegation flow: Delegators move stake away from misbehaving validators
- Governance control: Subnet owner can adjust validator limits and prune permits
Navigation
| ← Previous | 3.3 Output Quality Scoring |
| → Next | 3.5 Scoring Version Control |
| Index | Documentation Index |