EvolInstruct¶
Evolve instructions using an LLM.
WizardLM: Empowering Large Language Models to Follow Complex Instructions
Attributes¶
- `num_evolutions`: The number of evolutions to perform.
- `store_evolutions`: Whether to store all the evolutions or just the last one. Defaults to `False`.
- `generate_answers`: Whether to generate answers for the evolved instructions. Defaults to `False`.
- `include_original_instruction`: Whether to include the original instruction in the `evolved_instructions` output column. Defaults to `False`.
- `mutation_templates`: The mutation templates used to evolve the instructions. Defaults to the templates provided in the `utils.py` file.
- `seed`: The seed set for `numpy` to randomly pick a mutation method. Defaults to `42`.
Runtime Parameters¶
- `seed`: The seed set for `numpy` to randomly pick a mutation method.
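To illustrate why seeding matters here, the following minimal sketch shows reproducible random selection of a mutation method per evolution. It uses Python's standard `random` module as a stand-in for the `numpy` generator the task actually seeds, and the template names are hypothetical placeholders for the ones defined in `utils.py`:

```python
import random

# Hypothetical mutation template names, standing in for those in utils.py.
MUTATION_TEMPLATES = ["CONSTRAINTS", "DEEPENING", "CONCRETIZING", "REASONING"]

def pick_mutations(seed: int, num_evolutions: int) -> list:
    """Pick one mutation method per evolution, reproducibly for a given seed."""
    rng = random.Random(seed)  # the real task seeds numpy instead
    return [rng.choice(MUTATION_TEMPLATES) for _ in range(num_evolutions)]

# The same seed always yields the same sequence of mutation methods,
# so an evolution run can be reproduced exactly.
assert pick_mutations(42, 3) == pick_mutations(42, 3)
```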
Input & Output Columns¶
graph TD
subgraph Dataset
subgraph Columns
ICOL0[instruction]
end
subgraph New columns
OCOL0[evolved_instruction]
OCOL1[evolved_instructions]
OCOL2[model_name]
OCOL3[answer]
OCOL4[answers]
end
end
subgraph EvolInstruct
StepInput[Input Columns: instruction]
StepOutput[Output Columns: evolved_instruction, evolved_instructions, model_name, answer, answers]
end
ICOL0 --> StepInput
StepOutput --> OCOL0
StepOutput --> OCOL1
StepOutput --> OCOL2
StepOutput --> OCOL3
StepOutput --> OCOL4
StepInput --> StepOutput
Inputs¶
- `instruction` (`str`): The instruction to evolve.
Outputs¶
- `evolved_instruction` (`str`): The evolved instruction, if `store_evolutions=False`.
- `evolved_instructions` (`List[str]`): The list of evolved instructions, if `store_evolutions=True`.
- `model_name` (`str`): The name of the LLM used to evolve the instructions.
- `answer` (`str`): The answer to the evolved instruction, if `generate_answers=True` and `store_evolutions=False`.
- `answers` (`List[str]`): The answers to the evolved instructions, if `generate_answers=True` and `store_evolutions=True`.
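The flag combinations above can be summarized with a small helper. Note that this function is written purely for illustration on this page and is not part of the distilabel API:

```python
def expected_output_columns(store_evolutions: bool, generate_answers: bool) -> list:
    """Return the output columns EvolInstruct produces for a given configuration,
    per the table above: singular columns when only the last evolution is kept,
    plural (list-valued) columns when all evolutions are stored."""
    cols = ["evolved_instructions" if store_evolutions else "evolved_instruction"]
    if generate_answers:
        cols.append("answers" if store_evolutions else "answer")
    cols.append("model_name")
    return cols

# With the defaults (both False), only the last evolved instruction
# and the model name are produced.
assert expected_output_columns(False, False) == ["evolved_instruction", "model_name"]
```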
Examples¶
Evolve an instruction using an LLM¶
from distilabel.steps.tasks import EvolInstruct
from distilabel.models import InferenceEndpointsLLM
# Consider this as a placeholder for your actual LLM.
evol_instruct = EvolInstruct(
llm=InferenceEndpointsLLM(
model_id="mistralai/Mistral-7B-Instruct-v0.2",
),
num_evolutions=2,
)
evol_instruct.load()
result = next(evol_instruct.process([{"instruction": "common instruction"}]))
# result
# [{'instruction': 'common instruction', 'evolved_instruction': 'evolved instruction', 'model_name': 'model_name'}]
Keep the iterations of the evolutions¶
from distilabel.steps.tasks import EvolInstruct
from distilabel.models import InferenceEndpointsLLM
# Consider this as a placeholder for your actual LLM.
evol_instruct = EvolInstruct(
llm=InferenceEndpointsLLM(
model_id="mistralai/Mistral-7B-Instruct-v0.2",
),
num_evolutions=2,
store_evolutions=True,
)
evol_instruct.load()
result = next(evol_instruct.process([{"instruction": "common instruction"}]))
# result
# [
# {
# 'instruction': 'common instruction',
# 'evolved_instructions': ['initial evolution', 'final evolution'],
# 'model_name': 'model_name'
# }
# ]
Generate answers for the instructions in a single step¶
from distilabel.steps.tasks import EvolInstruct
from distilabel.models import InferenceEndpointsLLM
# Consider this as a placeholder for your actual LLM.
evol_instruct = EvolInstruct(
llm=InferenceEndpointsLLM(
model_id="mistralai/Mistral-7B-Instruct-v0.2",
),
num_evolutions=2,
generate_answers=True,
)
evol_instruct.load()
result = next(evol_instruct.process([{"instruction": "common instruction"}]))
# result
# [
# {
# 'instruction': 'common instruction',
# 'evolved_instruction': 'evolved instruction',
# 'answer': 'answer to the instruction',
# 'model_name': 'model_name'
# }
# ]