Training-free Efficient Reasoning Online Judge

📝 Your Code

Model:

Dataset:

Implement your method using these functions:

Available methods:
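• probe_new() - Start probing a new branch
  Returns: (answer: str, index: int, is_finish: bool)
  answer: the answer obtained from the current probe
  index: the branch index (pass it to probe_more)
  is_finish: whether the branch has finished

• probe_more(index: int) - Continue probing the specified branch
  Returns: (answer: str, is_finish: bool)
  answer: the answer obtained from the continued probe
  is_finish: whether the branch has finished

• get_new_branch_final_answer() - Get the final answer of a complete branch
  Returns: answer: str - the final answer of the complete branch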

Your code should assign the final answer to result or answer
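
As an illustration of how these functions might fit together, here is a minimal sketch of majority voting over a few branches. The two functions defined at the top are hypothetical local stand-ins with made-up answers, included only so the sketch runs outside the judge; the names n_samples, max_probes, and run_branch_to_end are likewise assumptions, not part of the judge. On the judge itself you would presumably drop the stand-ins and call the provided functions directly.

```python
from collections import Counter

# Hypothetical stand-ins for the judge-provided functions (made-up data),
# present only so this sketch runs locally.
_BRANCHES = [["3", "42", "42"], ["42", "42"], ["7", "41"]]
_state = []  # per-branch position of the most recent probe


def probe_new():
    """Start a new branch: returns (answer, index, is_finish)."""
    index = len(_state)
    _state.append(0)
    steps = _BRANCHES[index % len(_BRANCHES)]
    return steps[0], index, len(steps) == 1


def probe_more(index):
    """Continue branch `index`: returns (answer, is_finish)."""
    steps = _BRANCHES[index % len(_BRANCHES)]
    _state[index] = min(_state[index] + 1, len(steps) - 1)
    pos = _state[index]
    return steps[pos], pos == len(steps) - 1


def run_branch_to_end(max_probes=10):
    """Probe one new branch until it finishes or the probe budget is spent."""
    answer, index, is_finish = probe_new()
    probes = 1
    while not is_finish and probes < max_probes:
        answer, is_finish = probe_more(index)
        probes += 1
    return answer


n_samples = 3  # assumed parameter: number of branches to sample
votes = Counter(run_branch_to_end() for _ in range(n_samples))
result = votes.most_common(1)[0][0]  # final answer goes in `result`
print(result)
```

Lowering max_probes stops each branch earlier, which should trade accuracy for cost; the right setting depends on the model and dataset selected.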

Example Implementations:

Parameter Sweep Configuration:

Configure parameter ranges to automatically evaluate and plot results.
X-axis: Average Cost, Y-axis: Accuracy
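
To make the two axes concrete, the sketch below turns per-question records into one plotted point per parameter value. How the judge actually measures cost is not stated here, so the (cost, is_correct) record format, the sample numbers, and the name sweep_point are all assumptions for illustration.

```python
def sweep_point(records):
    """Reduce a list of (cost, is_correct) records to (avg_cost, accuracy)."""
    n = len(records)
    avg_cost = sum(cost for cost, _ in records) / n
    accuracy = sum(1 for _, ok in records if ok) / n
    return avg_cost, accuracy


# Made-up records for two parameter values, two questions each.
runs = {
    1: [(2, False), (3, True)],
    3: [(6, True), (8, True)],
}

# One (x, y) point per parameter value, sorted by average cost.
points = sorted(sweep_point(records) for records in runs.values())
print(points)
```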

Parameter 1 Name:

Parameter 1 Range:

Min:

Max:

Step:

Enable Parameter 2 (2D sweep)

Parameter 2 Name:

Parameter 2 Range:

Min:

Max:

Step:

Base Code Template:

Use {param1} and {param2} as placeholders for parameters.
Example: n_samples = {param1}
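
The sketch below shows one way the placeholders could be expanded across a sweep. It is a guess at the mechanism, not the judge's actual implementation: the helper names sweep_values and render are hypothetical, and treating Max as inclusive is an assumption.

```python
def sweep_values(lo, hi, step):
    """List lo, lo + step, ... up to hi (hi assumed inclusive)."""
    values = []
    k = 0
    # Multiply instead of accumulating to limit floating-point drift.
    while lo + k * step <= hi + 1e-9:
        values.append(lo + k * step)
        k += 1
    return values


def render(template, p1, p2=None):
    """Fill {param1} (and {param2} if given) by plain text replacement."""
    code = template.replace("{param1}", str(p1))
    if p2 is not None:
        code = code.replace("{param2}", str(p2))
    return code


template = "n_samples = {param1}"
for value in sweep_values(1, 9, 4):
    print(render(template, value))
```

Plain replacement is used rather than str.format so that other braces in the template, such as dict literals, are left untouched.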

Arena Configuration:

Compare two parameter sweep algorithms side by side.
Both algorithms will be evaluated and plotted on the same chart for comparison.

Algorithm 1

Algorithm Name:

Parameter 1 Name:

Parameter 1 Range:

Min:

Max:

Step:

Code Template:

Algorithm 2

Algorithm Name:

Parameter 1 Name:

Parameter 1 Range:

Min:

Max:

Step:

Code Template:

📊 Results

Write your code and click "Evaluate" to see results here.

Test Example Output:

This shows example branch probe results from a sample question.
