EvolQuality¶

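Evolve the quality of responses using an LLM.

The EvolQuality task evolves the quality of a response to a given prompt by asking a language model to generate a new, rewritten response. This step implements the evolution quality task from the paper "What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning".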

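Attributes¶

num_evolutions: the number of evolutions to perform on the response.
store_evolutions: whether to store all the evolved responses or only the last one. Defaults to False.
include_original_response: whether to include the original response among the evolved responses. Defaults to False.
mutation_templates: the mutation templates used to evolve the response.
seed: the seed to set for numpy so that the mutation method is picked at random in a reproducible way. Defaults to 42.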
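
The interaction between num_evolutions, store_evolutions and include_original_response can be easier to follow in code. The sketch below is a plain-Python illustration under assumptions, not distilabel's implementation: evolve_response and fake_rewrite are hypothetical names, and fake_rewrite stands in for the LLM call that rewrites a response.

```python
from typing import Callable, Dict, List, Union


def evolve_response(
    response: str,
    rewrite: Callable[[str], str],
    num_evolutions: int,
    store_evolutions: bool = False,
    include_original_response: bool = False,
) -> Dict[str, Union[str, List[str]]]:
    """Hypothetical helper: apply `rewrite` repeatedly and pick the output column."""
    # Optionally seed the history with the untouched original response.
    history: List[str] = [response] if include_original_response else []
    current = response
    for _ in range(num_evolutions):
        # Each evolution rewrites the result of the previous one.
        current = rewrite(current)
        history.append(current)
    if store_evolutions:
        # Keep every intermediate response.
        return {"evolved_responses": history}
    # Keep only the final response.
    return {"evolved_response": current}


def fake_rewrite(text: str) -> str:
    """Stand-in for the LLM call (assumption): appends a marker per evolution."""
    return text + "+"


print(evolve_response("a response", fake_rewrite, num_evolutions=2))
print(
    evolve_response(
        "a response",
        fake_rewrite,
        num_evolutions=2,
        store_evolutions=True,
        include_original_response=True,
    )
)
```

In this sketch, the first call returns only the last evolution under evolved_response, while the second returns the original response followed by both evolutions under evolved_responses.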

Runtime Parameters¶

seed: the seed to set for numpy so that the mutation method is picked at random in a reproducible way.
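
To illustrate why a fixed seed matters, here is a minimal sketch of seeded template selection. It uses Python's standard random module as a stand-in for numpy, and the template names and the pick_templates helper are hypothetical placeholders rather than the task's actual mutation templates or API.

```python
import random
from typing import List

# Hypothetical placeholder names; the real task defines its own mutation templates.
TEMPLATE_NAMES = ["template_a", "template_b", "template_c"]


def pick_templates(seed: int, num_evolutions: int) -> List[str]:
    """Pick one template name per evolution using a seeded generator."""
    rng = random.Random(seed)  # same seed -> same sequence of choices
    return [rng.choice(TEMPLATE_NAMES) for _ in range(num_evolutions)]


first_run = pick_templates(seed=42, num_evolutions=5)
second_run = pick_templates(seed=42, num_evolutions=5)
print(first_run == second_run)  # True: the selection is reproducible
```

Changing the seed would generally yield a different sequence of templates, which is one way to vary the evolutions between runs.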

Input & Output Columns¶

graph TD
    subgraph Dataset
        subgraph Columns
            ICOL0[instruction]
            ICOL1[response]
        end
        subgraph New columns
            OCOL0[evolved_response]
            OCOL1[evolved_responses]
            OCOL2[model_name]
        end
    end

    subgraph EvolQuality
        StepInput[Input Columns: instruction, response]
        StepOutput[Output Columns: evolved_response, evolved_responses, model_name]
    end

    ICOL0 --> StepInput
    ICOL1 --> StepInput
    StepOutput --> OCOL0
    StepOutput --> OCOL1
    StepOutput --> OCOL2
    StepInput --> StepOutput

Inputs¶

instruction (str): the instruction that was used to generate the responses.
response (str): the response to be rewritten.

Outputs¶

evolved_response (str): the evolved response, if store_evolutions=False.
evolved_responses (List[str]): the list of evolved responses, if store_evolutions=True.
model_name (str): the name of the LLM used to evolve the responses.

Examples¶

Evolve the quality of the responses given a prompt¶

from distilabel.steps.tasks import EvolQuality
from distilabel.models import InferenceEndpointsLLM

# Consider this as a placeholder for your actual LLM.
evol_quality = EvolQuality(
    llm=InferenceEndpointsLLM(
        model_id="mistralai/Mistral-7B-Instruct-v0.2",
    ),
    num_evolutions=2,
)

evol_quality.load()

result = next(
    evol_quality.process(
        [
            {"instruction": "common instruction", "response": "a response"},
        ]
    )
)
# result
# [
#     {
#         'instruction': 'common instruction',
#         'response': 'a response',
#         'evolved_response': 'evolved response',
#         'model_name': 'mistralai/Mistral-7B-Instruct-v0.2'
#     }
# ]

References¶

What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning