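MathShepherdGenerator

Math Shepherd solution generator.

This task generates completions for a given instruction in the format expected by the Math Shepherd Completer task. Its attributes make it flexible enough to use with different kinds of datasets and LLMs, and examples are provided for the GSM8K and MATH datasets used in the original paper. Review the current defaults before modifying them, to make sure the completions are still generated correctly. If no golden solution is available for a problem, this task can generate one; it can also generate candidate solutions to be labeled afterwards by the Math Shepherd Completer. Only one of solutions or golden_solution is generated, depending on the value of M.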
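
The relationship between M and the generated column can be summarized in a few lines. This is only an illustrative sketch of the behavior described above, not the task's implementation; `generated_column` is a hypothetical helper.

```python
def generated_column(M: int = 1) -> str:
    """Return the output column the task is described as filling for a given M.

    Hypothetical helper: it mirrors the documented behavior only.
    """
    if M < 1:
        raise ValueError("M must be a positive integer")
    # M == 1 -> a single reference solution; M > 1 -> a list of candidates.
    return "golden_solution" if M == 1 else "solutions"


print(generated_column())     # golden_solution
print(generated_column(M=5))  # solutions
```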

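Attributes

system_prompt: The system prompt used in the completions. The default prompt has been checked and generates good completions with the 8B and 70B versions of Llama 3.1, but it can be modified to suit the chosen model and dataset. Note that the system prompt is a Jinja2 template with 2 variables, {{extra_rules}} and {{few_shot}}. They are used to inject extra rules (for example, to steer the model towards a specific type of response) and few-shot examples, so the prompt can be adapted to the dataset and model without rewriting it in full.
extra_rules: Extra rules related to the type of dataset. For example, the original paper used the GSM8K and MATH datasets, and this field can hold the rules specific to GSM8K.
few_shots: Few-shot examples that help the model generate the completions. Write them in the solution format your dataset requires.
M: Number of completions to generate for each step. Defaults to 1, which generates the golden_solution; in that case choose a stronger model, since its output is used as the source of truth during labeling. If M is greater than 1, the task generates a list of completions to be labeled by the Math Shepherd Completer task.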
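
To make the role of the two template variables concrete, the sketch below fills a shortened, hypothetical prompt with extra rules and few-shot examples. It uses plain string replacement as a stand-in for Jinja2 rendering so that it only needs the standard library. The template text, the rule, and the few-shot example are all illustrative assumptions, not the task's defaults.

```python
# Hypothetical, shortened stand-in for the default system prompt.
TEMPLATE = (
    "You are a math tutor that solves problems step by step.\n"
    "{{extra_rules}}{{few_shot}}"
)

# Illustrative values; adapt both to your dataset.
EXTRA_RULES = "\n# Rules:\n- Finish with a line starting with 'The answer is:'.\n"
FEW_SHOTS = (
    "\n# Example:\n"
    "Step 1: Janet sells 16 - 3 - 4 = <<16-3-4=9>>9 duck eggs a day.\n"
    "Step 2: She makes 9 * 2 = $<<9*2=18>>18 every day at the farmer's market.\n"
    "The answer is: 18\n"
)


def render_system_prompt(template: str, extra_rules: str = "", few_shots: str = "") -> str:
    """Fill both placeholders; empty strings leave the base prompt unchanged."""
    filled = template.replace("{{extra_rules}}", extra_rules)
    return filled.replace("{{few_shot}}", few_shots)


prompt = render_system_prompt(TEMPLATE, EXTRA_RULES, FEW_SHOTS)
print(prompt)
```

Keeping the rules and the examples in separate variables means a new dataset only needs new values for those two strings, while the base prompt stays untouched.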

Input & Output Columns

graph TD
    subgraph Dataset
        subgraph Columns
            ICOL0[instruction]
        end
        subgraph New columns
            OCOL0[golden_solution]
            OCOL1[solutions]
            OCOL2[model_name]
        end
    end

    subgraph MathShepherdGenerator
        StepInput[Input Columns: instruction]
        StepOutput[Output Columns: golden_solution, solutions, model_name]
    end

    ICOL0 --> StepInput
    StepOutput --> OCOL0
    StepOutput --> OCOL1
    StepOutput --> OCOL2
    StepInput --> StepOutput

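Inputs

instruction (str): The task or instruction.

Outputs

golden_solution (str): The step-by-step solution to the instruction. Generated when M is equal to 1.
solutions (List[List[str]]): A list of possible solutions to the instruction. Generated when M is greater than 1.
model_name (str): The name of the model used to generate the solutions.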
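
In the example output further down this page, golden_solution is shown as a string holding a JSON-encoded list of steps. Assuming that format, the sketch below decodes it and pulls out the final answer. `final_answer` is a hypothetical helper, and the "The answer is:" prefix is taken from that example rather than from a documented contract.

```python
import json
from typing import List, Optional

# Value copied from the M = 1 example output on this page.
golden_solution = (
    '["Step 1: Janet sells 16 - 3 - 4 = <<16-3-4=9>>9 duck eggs a day.", '
    '"Step 2: She makes 9 * 2 = $<<9*2=18>>18 every day at the farmer\\u2019s market.", '
    '"The answer is: 18"]'
)

ANSWER_PREFIX = "The answer is:"


def final_answer(steps: List[str]) -> Optional[str]:
    """Return the text after the answer prefix, or None if no step carries it."""
    for step in reversed(steps):
        if step.startswith(ANSWER_PREFIX):
            return step[len(ANSWER_PREFIX):].strip()
    return None


steps = json.loads(golden_solution)  # str -> list of step strings
print(len(steps))
print(final_answer(steps))
```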

Examples

Generate the solution for a given instruction (prefer a stronger model here)

from distilabel.steps.tasks import MathShepherdGenerator
from distilabel.models import InferenceEndpointsLLM

llm = InferenceEndpointsLLM(
    model_id="meta-llama/Meta-Llama-3.1-70B-Instruct",
    tokenizer_id="meta-llama/Meta-Llama-3.1-70B-Instruct",
    generation_kwargs={
        "temperature": 0.6,
        "max_new_tokens": 1024,
    },
)
task = MathShepherdGenerator(
    name="golden_solution_generator",
    llm=llm,
)

task.load()

result = next(
    task.process(
        [
            {
                "instruction": "Janet’s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?",
            },
        ]
    )
)
# [[{'instruction': "Janet’s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?",
# 'golden_solution': '["Step 1: Janet sells 16 - 3 - 4 = <<16-3-4=9>>9 duck eggs a day.", "Step 2: She makes 9 * 2 = $<<9*2=18>>18 every day at the farmer\u2019s market.", "The answer is: 18"]'}]]

Generate M completions for a given instruction (using structured output generation)

from distilabel.steps.tasks import MathShepherdGenerator
from distilabel.models import InferenceEndpointsLLM

llm = InferenceEndpointsLLM(
    model_id="meta-llama/Meta-Llama-3.1-8B-Instruct",
    tokenizer_id="meta-llama/Meta-Llama-3.1-8B-Instruct",
    generation_kwargs={
        "temperature": 0.7,
        "max_new_tokens": 2048,
    },
)
task = MathShepherdGenerator(
    name="solution_generator",
    llm=llm,
    M=2,
    use_default_structured_output=True,
)

task.load()

result = next(
    task.process(
        [
            {
                "instruction": "Janet’s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?",
            },
        ]
    )
)
# [[{'instruction': "Janet’s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?",
# 'solutions': [["Step 1: Janet sells 16 - 3 - 4 = <<16-3-4=9>>9 duck eggs a day. -", "Step 2: She makes 9 * 2 = $<<9*2=18>>18 every day at the farmer\u2019s market.", "The answer is: 18"], ["Step 1: Janets ducks lay 16 eggs per day, and she uses 3 + 4 = <<3+4=7>>7 for eating and baking. +", "Step 2: So she sells 16 - 7 = <<16-7=9>>9 duck eggs every day. +", "Step 3: Those 9 eggs are worth 9 * $2 = $<<9*2=18>>18.", "The answer is: 18"]]}]]
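
With M greater than 1, solutions holds one inner list of steps per completion. A minimal sketch for inspecting that structure, using the output shown above; `iter_steps` is a hypothetical helper and not part of distilabel.

```python
from typing import Iterator, List, Tuple

# Copied from the example output above (M = 2).
solutions: List[List[str]] = [
    [
        "Step 1: Janet sells 16 - 3 - 4 = <<16-3-4=9>>9 duck eggs a day. -",
        "Step 2: She makes 9 * 2 = $<<9*2=18>>18 every day at the farmer\u2019s market.",
        "The answer is: 18",
    ],
    [
        "Step 1: Janets ducks lay 16 eggs per day, and she uses 3 + 4 = <<3+4=7>>7 for eating and baking. +",
        "Step 2: So she sells 16 - 7 = <<16-7=9>>9 duck eggs every day. +",
        "Step 3: Those 9 eggs are worth 9 * $2 = $<<9*2=18>>18.",
        "The answer is: 18",
    ],
]


def iter_steps(candidates: List[List[str]]) -> Iterator[Tuple[int, int, str]]:
    """Yield (solution_index, step_index, step) for every step of every solution."""
    for solution_index, solution in enumerate(candidates):
        for step_index, step in enumerate(solution):
            yield solution_index, step_index, step


rows = list(iter_steps(solutions))
print(len(solutions))                # number of completions
print([len(s) for s in solutions])   # steps per completion
for solution_index, step_index, step in rows:
    print(solution_index, step_index, step)
```

Flattening to rows like this makes it easy to eyeball each candidate step before passing the solutions on for labeling.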

References

Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations