
EvolInstruct

Evolve instructions using an LLM.

WizardLM: Empowering Large Language Models to Follow Complex Instructions

Attributes

  • num_evolutions: The number of evolutions to be performed.

  • store_evolutions: Whether to store all the evolutions or just the last one. Defaults to False.

  • generate_answers: Whether to generate answers for the evolved instructions. Defaults to False.

  • include_original_instruction: Whether to include the original instruction in the evolved_instructions output column. Defaults to False.

  • mutation_templates: The mutation templates to be used to evolve the instructions. Defaults to the ones provided in the utils.py file.

  • seed: The seed to be set for numpy in order to randomly pick a mutation method. Defaults to 42.

Runtime Parameters

  • seed: The seed to be set for numpy in order to randomly pick a mutation method.
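The core loop implied by these attributes — pick a mutation template with a seeded RNG, ask the LLM to rewrite the instruction, and repeat `num_evolutions` times — can be sketched without any LLM at all. The templates and the `evolve` helper below are hypothetical simplifications (the real templates are the WizardLM-style prompts in distilabel's utils.py, and distilabel seeds numpy rather than the stdlib), but they show why a fixed seed makes the sequence of mutation methods reproducible:

```python
import random

# Hypothetical, simplified mutation templates for illustration only;
# the real ones are WizardLM-style evolution prompts shipped in utils.py.
MUTATION_TEMPLATES = {
    "CONSTRAINTS": "Add one more constraint to: {instruction}",
    "DEEPENING": "Increase the depth and breadth of: {instruction}",
    "CONCRETIZING": "Replace general concepts with more specific ones in: {instruction}",
}

def evolve(instruction: str, num_evolutions: int, seed: int = 42) -> list[str]:
    """Chain `num_evolutions` mutations, each picked by a seeded RNG."""
    # distilabel seeds numpy for this pick; random.Random keeps the sketch
    # dependency-free while demonstrating the same reproducibility property.
    rng = random.Random(seed)
    evolutions = []
    for _ in range(num_evolutions):
        template = rng.choice(list(MUTATION_TEMPLATES.values()))
        # In the real task, the formatted prompt is sent to the LLM and the
        # response becomes the instruction evolved in the next iteration.
        instruction = template.format(instruction=instruction)
        evolutions.append(instruction)
    return evolutions
```

With store_evolutions=True the whole list is kept; otherwise only the final element is returned as evolved_instruction.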

Input & Output Columns

graph TD
    subgraph Dataset
        subgraph Columns
            ICOL0[instruction]
        end
        subgraph New columns
            OCOL0[evolved_instruction]
            OCOL1[evolved_instructions]
            OCOL2[model_name]
            OCOL3[answer]
            OCOL4[answers]
        end
    end

    subgraph EvolInstruct
        StepInput[Input Columns: instruction]
        StepOutput[Output Columns: evolved_instruction, evolved_instructions, model_name, answer, answers]
    end

    ICOL0 --> StepInput
    StepOutput --> OCOL0
    StepOutput --> OCOL1
    StepOutput --> OCOL2
    StepOutput --> OCOL3
    StepOutput --> OCOL4
    StepInput --> StepOutput

Inputs

  • instruction (str): The instruction to evolve.

Outputs

  • evolved_instruction (str): The evolved instruction if store_evolutions=False.

  • evolved_instructions (List[str]): The evolved instructions if store_evolutions=True.

  • model_name (str): The name of the LLM used to evolve the instructions.

  • answer (str): The answer to the evolved instruction if generate_answers=True and store_evolutions=False.

  • answers (List[str]): The answers to the evolved instructions if generate_answers=True and store_evolutions=True.
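Which of these columns actually appears depends on the two flags. As a quick sanity check, the mapping can be sketched as a small helper (hypothetical, not part of distilabel's API; column order is illustrative):

```python
def output_columns(store_evolutions: bool, generate_answers: bool) -> list[str]:
    """New columns EvolInstruct adds, per the flag combinations above."""
    # store_evolutions switches between the singular and list-valued column.
    cols = ["evolved_instructions" if store_evolutions else "evolved_instruction"]
    if generate_answers:
        cols.append("answers" if store_evolutions else "answer")
    cols.append("model_name")  # always present
    return cols
```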

Examples

Evolve an instruction using an LLM

from distilabel.steps.tasks import EvolInstruct
from distilabel.models import InferenceEndpointsLLM

# Consider this as a placeholder for your actual LLM.
evol_instruct = EvolInstruct(
    llm=InferenceEndpointsLLM(
        model_id="mistralai/Mistral-7B-Instruct-v0.2",
    ),
    num_evolutions=2,
)

evol_instruct.load()

result = next(evol_instruct.process([{"instruction": "common instruction"}]))
# result
# [{'instruction': 'common instruction', 'evolved_instruction': 'evolved instruction', 'model_name': 'model_name'}]

Keep the iterations of the evolutions

from distilabel.steps.tasks import EvolInstruct
from distilabel.models import InferenceEndpointsLLM

# Consider this as a placeholder for your actual LLM.
evol_instruct = EvolInstruct(
    llm=InferenceEndpointsLLM(
        model_id="mistralai/Mistral-7B-Instruct-v0.2",
    ),
    num_evolutions=2,
    store_evolutions=True,
)

evol_instruct.load()

result = next(evol_instruct.process([{"instruction": "common instruction"}]))
# result
# [
#     {
#         'instruction': 'common instruction',
#         'evolved_instructions': ['initial evolution', 'final evolution'],
#         'model_name': 'model_name'
#     }
# ]

Generate answers for the instructions in a single step

from distilabel.steps.tasks import EvolInstruct
from distilabel.models import InferenceEndpointsLLM

# Consider this as a placeholder for your actual LLM.
evol_instruct = EvolInstruct(
    llm=InferenceEndpointsLLM(
        model_id="mistralai/Mistral-7B-Instruct-v0.2",
    ),
    num_evolutions=2,
    generate_answers=True,
)

evol_instruct.load()

result = next(evol_instruct.process([{"instruction": "common instruction"}]))
# result
# [
#     {
#         'instruction': 'common instruction',
#         'evolved_instruction': 'evolved instruction',
#         'answer': 'answer to the instruction',
#         'model_name': 'model_name'
#     }
# ]

References

  • WizardLM: Empowering Large Language Models to Follow Complex Instructions