跳到内容

Magpie

使用指令微调的 LLM 生成对话。

Magpie 是一种巧妙的方法,它允许借助指令微调 LLM 的自回归能力,在没有种子数据或特定系统提示的情况下生成用户指令。由于它们是使用由用户消息和期望的助手输出组成的聊天模板进行微调的,因此指令微调 LLM 了解到,在预查询或预指令 token 之后,会有一个指令。如果将这些预查询 token 发送到 LLM 而没有任何用户消息,那么 LLM 将继续生成 token,就像它是用户一样。这个技巧允许从指令微调 LLM 中“提取”指令。在这个指令生成之后,它可以再次发送到 LLM,这次生成助手响应。这个过程可以重复 N 次,从而构建一个多轮对话。这种方法在论文《Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing》中进行了描述。

属性

  • n_turns: 生成的对话将具有的轮数。默认为 1

  • end_with_user: 对话是否应以用户消息结束。默认为 False

  • include_system_prompt: 是否包含在生成的对话中使用的系统提示。默认为 False

  • only_instruction: 是否仅生成指令。如果此参数为 True,则将忽略 n_turns。默认为 False

  • system_prompt: 可选的系统提示,或从中随机选择一个系统提示的列表,或从中随机选择一个系统提示的字典,或包含系统提示及其被选择概率的字典。随机系统提示将按每个输入/输出批次选择。此系统提示可用于指导指令 LLM 的生成,并引导其生成特定主题的指令。默认为 None

运行时参数

  • n_turns: 生成的对话将具有的轮数。默认为 1

  • end_with_user: 对话是否应以用户消息结束。默认为 False

  • include_system_prompt: 是否包含在生成的对话中使用的系统提示。默认为 False

  • only_instruction: 是否仅生成指令。如果此参数为 True,则将忽略 n_turns。默认为 False

  • system_prompt: 可选的系统提示,或从中随机选择一个系统提示的列表,或从中随机选择一个系统提示的字典,或包含系统提示及其被选择概率的字典。随机系统提示将按每个输入/输出批次选择。此系统提示可用于指导指令 LLM 的生成,并引导其生成特定主题的指令。

输入 & 输出列

graph TD
    subgraph Dataset
        subgraph Columns
            ICOL0[system_prompt]
        end
        subgraph New columns
            OCOL0[conversation]
            OCOL1[instruction]
            OCOL2[response]
            OCOL3[system_prompt_key]
            OCOL4[model_name]
        end
    end

    subgraph Magpie
        StepInput[Input Columns: system_prompt]
        StepOutput[Output Columns: conversation, instruction, response, system_prompt_key, model_name]
    end

    ICOL0 --> StepInput
    StepOutput --> OCOL0
    StepOutput --> OCOL1
    StepOutput --> OCOL2
    StepOutput --> OCOL3
    StepOutput --> OCOL4
    StepInput --> StepOutput

输入

  • system_prompt (str, 可选): 可选的系统提示,可以提供以指导指令 LLM 的生成,并引导其生成特定主题的指令。

输出

  • conversation (ChatType): 生成的对话,它是包含角色和消息的聊天项目列表。仅当 only_instruction=False 时。

  • instruction (str): 生成的指令,如果 only_instruction=Truen_turns==1

  • response (str): 生成的响应,如果 n_turns==1

  • system_prompt_key (str, 可选): 用于生成对话或指令的系统提示的键。仅当 system_prompt 是字典时。

  • model_name (str): 用于生成 conversationinstruction 的模型名称。

示例

使用 Llama 3 8B Instruct 和 TransformersLLM 生成指令

from distilabel.models import TransformersLLM
from distilabel.steps.tasks import Magpie

magpie = Magpie(
    llm=TransformersLLM(
        model="meta-llama/Meta-Llama-3-8B-Instruct",
        magpie_pre_query_template="llama3",
        generation_kwargs={
            "temperature": 1.0,
            "max_new_tokens": 64,
        },
        device="mps",
    ),
    only_instruction=True,
)

magpie.load()

result = next(
    magpie.process(
        inputs=[
            {
                "system_prompt": "You're a math expert AI assistant that helps students of secondary school to solve calculus problems."
            },
            {
                "system_prompt": "You're an expert florist AI assistant that helps user to erradicate pests in their crops."
            },
        ]
    )
)
# [
#     {'instruction': "That's me! I'd love some help with solving calculus problems! What kind of calculation are you most effective at? Linear Algebra, derivatives, integrals, optimization?"},
#     {'instruction': 'I was wondering if there are certain flowers and plants that can be used for pest control?'}
# ]

使用 Llama 3 8B Instruct 和 TransformersLLM 生成对话

from distilabel.models import TransformersLLM
from distilabel.steps.tasks import Magpie

magpie = Magpie(
    llm=TransformersLLM(
        model="meta-llama/Meta-Llama-3-8B-Instruct",
        magpie_pre_query_template="llama3",
        generation_kwargs={
            "temperature": 1.0,
            "max_new_tokens": 256,
        },
        device="mps",
    ),
    n_turns=2,
)

magpie.load()

result = next(
    magpie.process(
        inputs=[
            {
                "system_prompt": "You're a math expert AI assistant that helps students of secondary school to solve calculus problems."
            },
            {
                "system_prompt": "You're an expert florist AI assistant that helps user to erradicate pests in their crops."
            },
        ]
    )
)
# [
#     {
#         'conversation': [
#             {'role': 'system', 'content': "You're a math expert AI assistant that helps students of secondary school to solve calculus problems."},
#             {
#                 'role': 'user',
#                 'content': 'I'm having trouble solving the limits of functions in calculus. Could you explain how to work with them? Limits of functions are denoted by lim x→a f(x) or lim x→a [f(x)]. It is read as "the limit as x approaches a of f
# of x".'
#             },
#             {
#                 'role': 'assistant',
#                 'content': 'Limits are indeed a fundamental concept in calculus, and understanding them can be a bit tricky at first, but don't worry, I'm here to help! The notation lim x→a f(x) indeed means "the limit as x approaches a of f of
# x". What it's asking us to do is find the'
#             }
#         ]
#     },
#     {
#         'conversation': [
#             {'role': 'system', 'content': "You're an expert florist AI assistant that helps user to erradicate pests in their crops."},
#             {
#                 'role': 'user',
#                 'content': "As a flower shop owner, I'm noticing some unusual worm-like creatures causing damage to my roses and other flowers. Can you help me identify what the problem is? Based on your expertise as a florist AI assistant, I think it
# might be pests or diseases, but I'm not sure which."
#             },
#             {
#                 'role': 'assistant',
#                 'content': "I'd be delighted to help you investigate the issue! Since you've noticed worm-like creatures damaging your roses and other flowers, I'll take a closer look at the possibilities. Here are a few potential culprits: 1.
# **Aphids**: These small, soft-bodied insects can secrete a sticky substance called"
#             }
#         ]
#     }
# ]

参考