ComplexityScorer

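Score instructions based on their complexity using an LLM.

ComplexityScorer is a pre-defined task that ranks a list of instructions by their complexity. It implements the complexity scoring task from the paper "What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning".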

Attributes

_template: the Jinja2 template used to format the input for the LLM.
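
As a rough illustration of what formatting a list of instructions into a single prompt can look like, here is a minimal sketch. It uses plain string formatting as a stand-in for Jinja2, and both the `build_prompt` helper and the prompt wording are hypothetical; they are not the template shipped with the task.

```python
from typing import List


def build_prompt(instructions: List[str]) -> str:
    """Number each instruction so the LLM can return one score per index.

    Hypothetical wording; the task's real Jinja2 template may differ.
    """
    header = (
        "Rank the following instructions by complexity "
        f"(1 to {len(instructions)}).\n"
    )
    body = "\n".join(
        f"[{index}] {text}" for index, text in enumerate(instructions, start=1)
    )
    return header + body


print(build_prompt(["plain instruction", "highly complex instruction"]))
```

Numbering the instructions in the prompt gives the model a stable index to attach each score to.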

Input & Output Columns

graph TD
    subgraph Dataset
        subgraph Columns
            ICOL0[instructions]
        end
        subgraph New columns
            OCOL0[scores]
            OCOL1[model_name]
        end
    end

    subgraph ComplexityScorer
        StepInput[Input Columns: instructions]
        StepOutput[Output Columns: scores, model_name]
    end

    ICOL0 --> StepInput
    StepOutput --> OCOL0
    StepOutput --> OCOL1
    StepInput --> StepOutput

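Inputs

instructions (List[str]): the list of instructions to be scored.

Outputs

scores (List[float]): the score assigned to each instruction.
model_name (str): the name of the model used to generate the scores.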
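
To make the relationship between the raw LLM text and the `scores` column concrete, the sketch below parses one score per instruction from a free-text reply. The line format `[1] Score: 3` and the `parse_scores` helper are assumptions made for illustration; the task's own parsing logic may differ.

```python
import re
from typing import List, Optional

# Assumed reply format, one line per instruction: "[1] Score: 3".
SCORE_LINE = re.compile(r"\[\d+\]\s*score:\s*(\d+(?:\.\d+)?)", re.IGNORECASE)


def parse_scores(raw_output: str, n_instructions: int) -> List[Optional[float]]:
    """Extract one score per instruction; missing scores become None."""
    scores: List[Optional[float]] = []
    for line in raw_output.splitlines():
        match = SCORE_LINE.search(line)
        if match:
            scores.append(float(match.group(1)))
    # Keep the output aligned with the input list: truncate extras, pad gaps.
    scores = scores[:n_instructions]
    scores += [None] * (n_instructions - len(scores))
    return scores


print(parse_scores("[1] Score: 1\n[2] Score: 5", n_instructions=2))
```

Padding with `None` keeps `scores` the same length as `instructions`, so a malformed reply does not shift scores onto the wrong instruction.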

Examples

Evaluate the complexity of your instructions

from distilabel.steps.tasks import ComplexityScorer
from distilabel.models import InferenceEndpointsLLM

# Consider this as a placeholder for your actual LLM.
scorer = ComplexityScorer(
    llm=InferenceEndpointsLLM(
        model_id="mistralai/Mistral-7B-Instruct-v0.2",
    )
)

scorer.load()

result = next(
    scorer.process(
        [{"instructions": ["plain instruction", "highly complex instruction"]}]
    )
)
# result
# [{'instructions': ['plain instruction', 'highly complex instruction'], 'model_name': 'test', 'scores': [1, 5], 'distilabel_metadata': {'raw_output_complexity_scorer_0': 'output'}}]
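
Since the task is meant for ranking, a typical next step is to order the instructions by their score. The sketch below works on a plain dictionary shaped like one row of the result shown above; `rank_by_complexity` is a hypothetical helper, not part of distilabel.

```python
from typing import Any, Dict, List, Tuple


def rank_by_complexity(row: Dict[str, Any]) -> List[Tuple[str, float]]:
    """Pair each instruction with its score and sort, most complex first."""
    pairs = [
        (instruction, score)
        for instruction, score in zip(row["instructions"], row["scores"])
        if score is not None  # skip instructions that could not be scored
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# A row with the same shape as the example result above.
row = {
    "instructions": ["plain instruction", "highly complex instruction"],
    "scores": [1, 5],
}
ranked = rank_by_complexity(row)
print(ranked)
```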

Generate structured output with the default schema

from distilabel.steps.tasks import ComplexityScorer
from distilabel.models import InferenceEndpointsLLM

# Consider this as a placeholder for your actual LLM.
scorer = ComplexityScorer(
    llm=InferenceEndpointsLLM(
        model_id="mistralai/Mistral-7B-Instruct-v0.2",
    ),
    use_default_structured_output=True,
)

scorer.load()

result = next(
    scorer.process(
        [{"instructions": ["plain instruction", "highly complex instruction"]}]
    )
)
# result
# [{'instructions': ['plain instruction', 'highly complex instruction'], 'model_name': 'test', 'scores': [1, 2], 'distilabel_metadata': {'raw_output_complexity_scorer_0': '{ \n  "scores": [\n    1, \n    2\n  ]\n}'}}]
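
With structured output, the raw reply in `distilabel_metadata` is a JSON document, so it can be checked with the standard library alone. The sketch below assumes the payload has a single `"scores"` key, as in the example result above; `scores_from_structured_output` is a hypothetical helper.

```python
import json
from typing import List


def scores_from_structured_output(raw_output: str, n_instructions: int) -> List[float]:
    """Parse the JSON reply and check it holds one score per instruction."""
    payload = json.loads(raw_output)
    scores = payload["scores"]
    if len(scores) != n_instructions:
        raise ValueError(
            f"expected {n_instructions} scores, got {len(scores)}"
        )
    return [float(score) for score in scores]


# Raw output copied from the example result above.
raw = '{ \n  "scores": [\n    1, \n    2\n  ]\n}'
print(scores_from_structured_output(raw, n_instructions=2))
```

Failing loudly on a length mismatch is one possible design choice; padding with a placeholder value is another if partial results are acceptable.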

References

What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning