3.2 Scoring Formula v1.0.0

All scoring is deterministic. No external model is called. Every validator running the same benchmark task produces identical scores for identical miner outputs. This is auditable by any participant.

Low evaluation variance: scores are aggregated over a rolling, equal-weight window of 100 tasks, so noise on individual tasks does not blunt the incentive curve around the top miner.

Composite Score

Every workflow a miner submits is executed by validators. A composite score S ∈ [0, 1] is computed across four dimensions:

S = 0.50 × S_success + 0.25 × S_cost + 0.15 × S_latency + 0.10 × S_reliability
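The weighted sum above can be sketched in Python as follows. This is a minimal illustration, not the validator's code: the names `WEIGHTS` and `composite_score` are hypothetical, and each sub-score is assumed to already lie in [0, 1].

```python
# Weights taken from the v1.0.0 formula; the names here are illustrative only.
WEIGHTS = {
    "success": 0.50,
    "cost": 0.25,
    "latency": 0.15,
    "reliability": 0.10,
}


def composite_score(success: float, cost: float,
                    latency: float, reliability: float) -> float:
    """Return S as the weighted sum of the four sub-scores.

    Each argument is assumed to be in [0, 1]; no clamping is done here.
    """
    return (WEIGHTS["success"] * success
            + WEIGHTS["cost"] * cost
            + WEIGHTS["latency"] * latency
            + WEIGHTS["reliability"] * reliability)
```

Because the four weights sum to 1, S stays within [0, 1] whenever every sub-score does.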

Sub-Dimensions

S_success   = output_quality_score × completion_ratio
              where completion_ratio = steps_completed / total_steps_in_dag

S_cost      = max(0, 1 − actual_tao / max_budget_tao)
              only scored when S_success > 0.7; else S_cost = 0

S_latency   = max(0, 1 − actual_seconds / max_latency_seconds)
              only scored when S_success > 0.7; else S_latency = 0

S_reliability = min(1.0, max(0, 1 − (unplanned_retries × 0.10
                                    + timeouts          × 0.20
                                    + hard_failures     × 0.50)))
              applied regardless of success gate

Where:

unplanned_retries = max(0, actual_retries − declared_retry_budget)
declared_retry_budget = sum of retry_count values in error_handling (0 if absent)
timeouts = step executions that exceeded declared timeout_seconds
hard_failures = steps that terminated after exhausting the declared retry budget
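The sub-dimension formulas and definitions above might be computed along these lines. The `RunMetrics` container and `sub_scores` function are hypothetical names introduced for illustration; the sketch also assumes `total_steps_in_dag`, `max_budget_tao`, and `max_latency_seconds` are all greater than zero, which the formulas do not state explicitly.

```python
from dataclasses import dataclass

SUCCESS_GATE = 0.7  # cost and latency are scored only when S_success exceeds this


@dataclass
class RunMetrics:
    """Hypothetical record of one validator run; fields mirror the formulas."""
    output_quality_score: float
    steps_completed: int
    total_steps_in_dag: int
    actual_tao: float
    max_budget_tao: float
    actual_seconds: float
    max_latency_seconds: float
    actual_retries: int
    declared_retry_budget: int
    timeouts: int
    hard_failures: int


def sub_scores(m: RunMetrics) -> dict:
    """Compute the four sub-scores for a single run."""
    completion_ratio = m.steps_completed / m.total_steps_in_dag
    s_success = m.output_quality_score * completion_ratio

    # Success gate: cost and latency count only above the threshold.
    gate_open = s_success > SUCCESS_GATE
    s_cost = max(0.0, 1 - m.actual_tao / m.max_budget_tao) if gate_open else 0.0
    s_latency = (max(0.0, 1 - m.actual_seconds / m.max_latency_seconds)
                 if gate_open else 0.0)

    # Reliability is applied regardless of the gate.
    unplanned = max(0, m.actual_retries - m.declared_retry_budget)
    penalty = unplanned * 0.10 + m.timeouts * 0.20 + m.hard_failures * 0.50
    s_reliability = min(1.0, max(0.0, 1 - penalty))

    return {
        "success": s_success,
        "cost": s_cost,
        "latency": s_latency,
        "reliability": s_reliability,
    }
```

For example, a fully completed run with quality 0.9, 2 of 10 TAO spent, 30 of 60 seconds used, one unplanned retry, and one timeout would yield sub-scores of roughly 0.9, 0.8, 0.5, and 0.7.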

Performance Dimensions

Dimension	Weight	Formula
Task Success	50%	`output_quality × completion_ratio`
Cost Efficiency	25%	`max(0, 1 − actual/budget)` — gated at S_success > 0.7
Latency	15%	`max(0, 1 − actual_s/max_s)` — gated at S_success > 0.7
Reliability	10%	`min(1.0, max(0, 1 − unplanned×0.1 − timeouts×0.2 − failures×0.5))`

Key Design Decisions

Success-first gating: A workflow that fails the task cannot be considered "good" regardless of how cheap or fast it is. Cost and latency are only scored when S_success > 0.7.

Declared retries are free: A miner who writes "retry_count": 2 in error_handling is declaring defensive intent. Only unplanned retries — attempts beyond the declared budget — are penalized.
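A small sketch of how the declared budget could be derived. The workflow schema is not given in this section, so the layout below (a list of step dictionaries, each with an optional `error_handling` object holding `retry_count`) is an assumption made purely for illustration, as are the function names.

```python
def declared_retry_budget(steps: list) -> int:
    """Sum retry_count over all steps.

    Assumed layout: each step is a dict with an optional "error_handling"
    dict; a step without it (or without retry_count) contributes 0.
    """
    return sum(
        step.get("error_handling", {}).get("retry_count", 0)
        for step in steps
    )


def unplanned_retries(actual_retries: int, steps: list) -> int:
    """Retries beyond the declared budget; never negative."""
    return max(0, actual_retries - declared_retry_budget(steps))


# Hypothetical two-step workflow: only the first step declares retries.
example_steps = [
    {"name": "fetch", "error_handling": {"retry_count": 2}},
    {"name": "summarize"},
]
```

With `example_steps`, two actual retries fall inside the declared budget and cost nothing, while five actual retries leave three unplanned retries, a reliability penalty of 3 × 0.10.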

Partial DAG completion: If a 4-step workflow completes 3 steps before a hard failure, completion_ratio = 0.75. This prevents miners from submitting single-step workflows for multi-step tasks.
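To see how partial completion interacts with the success gate, the 4-step case can be worked through numerically. The output quality of 0.9 is an assumed value chosen for illustration, and the run is assumed to have no unplanned retries or timeouts.

```python
# Illustrative inputs: 4-step DAG, 3 steps completed, then one hard failure.
output_quality_score = 0.9            # assumed value, not from the spec
completion_ratio = 3 / 4              # 0.75
s_success = output_quality_score * completion_ratio   # about 0.675

# 0.675 does not exceed 0.7, so the gate stays closed.
gate_open = s_success > 0.7
s_cost = 0.0
s_latency = 0.0

# One hard failure costs 0.50 of reliability; no other penalties assumed.
s_reliability = min(1.0, max(0.0, 1 - 1 * 0.50))

s = (0.50 * s_success + 0.25 * s_cost
     + 0.15 * s_latency + 0.10 * s_reliability)       # about 0.3875
```

Even with high-quality partial output, the missing step pushes S_success below the gate, so cost and latency contribute nothing and the composite lands near 0.39.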

Score Aggregation

Scores are aggregated over a rolling 100-task window with equal weight per task. No exponential decay is applied: a fixed equal-weight window is simpler to audit and harder to time-exploit. Weight is capped at a maximum of 15% per miner before submission.
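A rolling equal-weight window can be sketched with a bounded deque. The class name `RollingScore` is hypothetical, and two behaviors here are assumptions not covered by this section: a miner with fewer than 100 scored tasks is averaged over the tasks available, and an empty window returns 0.0. The 15% cap is not modeled.

```python
from collections import deque

WINDOW_SIZE = 100  # tasks per miner, per the aggregation rule


class RollingScore:
    """Equal-weight mean over the most recent `window_size` composite scores."""

    def __init__(self, window_size: int = WINDOW_SIZE):
        # deque with maxlen drops the oldest score once the window is full
        self._scores = deque(maxlen=window_size)

    def add(self, composite: float) -> None:
        """Record the composite score S of one newly scored task."""
        self._scores.append(composite)

    def mean(self) -> float:
        """Plain arithmetic mean; 0.0 for an empty window (assumed behavior)."""
        if not self._scores:
            return 0.0
        return sum(self._scores) / len(self._scores)
```

Since every task in the window carries the same weight, a score's influence does not depend on when it was earned within the window, which is what makes the aggregate hard to game through timing.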

