APIGenSemanticChecker¶
Generate queries and answers for the given functions in JSON format.

The APIGenGenerator is inspired by the APIGen pipeline, which was designed to generate verifiable and diverse function-calling datasets. The task generates a set of diverse queries and corresponding answers for the given functions in JSON format.
Attributes¶
- `system_prompt`: the system prompt to use for the task. Has a default one.
- `exclude_failed_execution`: whether to exclude failed executions (won't run on rows where the `keep_row_after_execution_check` column, which comes from running `APIGenExecutionChecker`, is False). Defaults to True.
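The effect of `exclude_failed_execution` can be pictured as a plain-Python sketch (the column name is the real one produced by `APIGenExecutionChecker`; the filtering helper itself is hypothetical, not distilabel's internal implementation):

```python
# Hypothetical sketch: with exclude_failed_execution=True, only rows whose
# "keep_row_after_execution_check" column is True reach the semantic check.

def rows_to_check(rows, exclude_failed_execution=True):
    """Return the subset of rows the semantic checker would evaluate."""
    if not exclude_failed_execution:
        return rows
    return [r for r in rows if r.get("keep_row_after_execution_check", True)]

rows = [
    {"query": "q1", "keep_row_after_execution_check": True},
    {"query": "q2", "keep_row_after_execution_check": False},
]
print([r["query"] for r in rows_to_check(rows)])  # only "q1" survives
```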
Input & Output Columns¶
```mermaid
graph TD
    subgraph Dataset
        subgraph Columns
            ICOL0[func_desc]
            ICOL1[query]
            ICOL2[answers]
            ICOL3[execution_result]
        end
        subgraph New columns
            OCOL0[thought]
            OCOL1[keep_row_after_semantic_check]
        end
    end

    subgraph APIGenSemanticChecker
        StepInput[Input Columns: func_desc, query, answers, execution_result]
        StepOutput[Output Columns: thought, keep_row_after_semantic_check]
    end

    ICOL0 --> StepInput
    ICOL1 --> StepInput
    ICOL2 --> StepInput
    ICOL3 --> StepInput
    StepOutput --> OCOL0
    StepOutput --> OCOL1
    StepInput --> StepOutput
```
Inputs¶
- `func_desc` (`str`): description of what the function should do.
- `query` (`str`): instruction from the user.
- `answers` (`str`): JSON-encoded list with the arguments to be passed to the function/API. Should be loaded with `json.loads`.
- `execution_result` (`str`): result of executing the function/API.
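Since the `answers` column arrives JSON-encoded, a quick round trip with the standard library shows the expected shape (the function name here is the one from the examples below, used for illustration):

```python
import json

# The "answers" column holds a JSON-encoded list of function calls.
answers = json.dumps(
    [{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}]
)

# Decode with json.loads to get the list of calls back.
calls = json.loads(answers)
print(calls[0]["name"])       # get_breed_information
print(calls[0]["arguments"])  # {'breed': 'Maine Coon'}
```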
Outputs¶
- `thought` (`str`): reasoning for keeping or discarding this output.
- `keep_row_after_semantic_check` (`bool`): True or False, can be used later to filter the data.
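A minimal sketch of filtering on the new column afterwards (the result rows here are made up for illustration):

```python
# Keep only the rows that passed the semantic check.
results = [
    {"query": "q1", "keep_row_after_semantic_check": True},
    {"query": "q2", "keep_row_after_semantic_check": False},
]
kept = [r for r in results if r["keep_row_after_semantic_check"]]
print([r["query"] for r in kept])  # ['q1']
```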
Examples¶
Semantic checker for generated function calls (original implementation)¶
```python
import json

from distilabel.steps.tasks import APIGenSemanticChecker
from distilabel.models import InferenceEndpointsLLM

llm = InferenceEndpointsLLM(
    model_id="meta-llama/Meta-Llama-3.1-70B-Instruct",
    generation_kwargs={
        "temperature": 0.7,
        "max_new_tokens": 1024,
    },
)
semantic_checker = APIGenSemanticChecker(
    use_default_structured_output=False,
    llm=llm,
)
semantic_checker.load()

res = next(
    semantic_checker.process(
        [
            {
                "func_desc": "Fetch information about a specific cat breed from the Cat Breeds API.",
                "query": "What information can be obtained about the Maine Coon cat breed?",
                "answers": json.dumps([{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}]),
                "execution_result": "The Maine Coon is a big and hairy breed of cat",
            }
        ]
    )
)
res
# [{'func_desc': 'Fetch information about a specific cat breed from the Cat Breeds API.',
# 'query': 'What information can be obtained about the Maine Coon cat breed?',
# 'answers': [{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}],
# 'execution_result': 'The Maine Coon is a big and hairy breed of cat',
# 'thought': '',
# 'keep_row_after_semantic_check': True,
# 'raw_input_a_p_i_gen_semantic_checker_0': [{'role': 'system',
# 'content': 'As a data quality evaluator, you must assess the alignment between a user query, corresponding function calls, and their execution results.\nThese function calls and results are generated by other models, and your task is to ensure these results accurately reflect the user’s intentions.\n\nDo not pass if:\n1. The function call does not align with the query’s objective, or the input arguments appear incorrect.\n2. The function call and arguments are not properly chosen from the available functions.\n3. The number of function calls does not correspond to the user’s intentions.\n4. The execution results are irrelevant and do not match the function’s purpose.\n5. The execution results contain errors or reflect that the function calls were not executed successfully.\n'},
# {'role': 'user',
# 'content': 'Given Information:\n- All Available Functions:\nFetch information about a specific cat breed from the Cat Breeds API.\n- User Query: What information can be obtained about the Maine Coon cat breed?\n- Generated Function Calls: [{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}]\n- Execution Results: The Maine Coon is a big and hairy breed of cat\n\nNote: The query may have multiple intentions. Functions may be placeholders, and execution results may be truncated due to length, which is acceptable and should not cause a failure.\n\nThe main decision factor is wheather the function calls accurately reflect the query\'s intentions and the function descriptions.\nProvide your reasoning in the thought section and decide if the data passes (answer yes or no).\nIf not passing, concisely explain your reasons in the thought section; otherwise, leave this section blank.\n\nYour response MUST strictly adhere to the following JSON format, and NO other text MUST be included.\n```\n{\n "thought": "Concisely describe your reasoning here",\n "pass": "yes" or "no"\n}\n```\n'}]},
# 'model_name': 'meta-llama/Meta-Llama-3.1-70B-Instruct'}]
```
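The prompt above asks the model for a JSON object with `thought` and `pass` fields; a sketch of how such a response maps onto the task's output columns (the parsing helper is hypothetical, not distilabel's internal one):

```python
import json

def parse_semantic_check(raw: str) -> dict:
    """Hypothetical parser: map the model's {"thought", "pass"} JSON
    response to the task's two output columns."""
    data = json.loads(raw)
    return {
        "thought": data.get("thought", ""),
        "keep_row_after_semantic_check": data.get("pass") == "yes",
    }

raw = '{"thought": "", "pass": "yes"}'
print(parse_semantic_check(raw))
# {'thought': '', 'keep_row_after_semantic_check': True}
```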
Semantic checker for generated function calls (structured output)¶
```python
import json

from distilabel.steps.tasks import APIGenSemanticChecker
from distilabel.models import InferenceEndpointsLLM

llm = InferenceEndpointsLLM(
    model_id="meta-llama/Meta-Llama-3.1-70B-Instruct",
    generation_kwargs={
        "temperature": 0.7,
        "max_new_tokens": 1024,
    },
)
semantic_checker = APIGenSemanticChecker(
    use_default_structured_output=True,
    llm=llm,
)
semantic_checker.load()

res = next(
    semantic_checker.process(
        [
            {
                "func_desc": "Fetch information about a specific cat breed from the Cat Breeds API.",
                "query": "What information can be obtained about the Maine Coon cat breed?",
                "answers": json.dumps([{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}]),
                "execution_result": "The Maine Coon is a big and hairy breed of cat",
            }
        ]
    )
)
res
# [{'func_desc': 'Fetch information about a specific cat breed from the Cat Breeds API.',
# 'query': 'What information can be obtained about the Maine Coon cat breed?',
# 'answers': [{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}],
# 'execution_result': 'The Maine Coon is a big and hairy breed of cat',
# 'keep_row_after_semantic_check': True,
# 'thought': '',
# 'raw_input_a_p_i_gen_semantic_checker_0': [{'role': 'system',
# 'content': 'As a data quality evaluator, you must assess the alignment between a user query, corresponding function calls, and their execution results.\nThese function calls and results are generated by other models, and your task is to ensure these results accurately reflect the user’s intentions.\n\nDo not pass if:\n1. The function call does not align with the query’s objective, or the input arguments appear incorrect.\n2. The function call and arguments are not properly chosen from the available functions.\n3. The number of function calls does not correspond to the user’s intentions.\n4. The execution results are irrelevant and do not match the function’s purpose.\n5. The execution results contain errors or reflect that the function calls were not executed successfully.\n'},
# {'role': 'user',
# 'content': 'Given Information:\n- All Available Functions:\nFetch information about a specific cat breed from the Cat Breeds API.\n- User Query: What information can be obtained about the Maine Coon cat breed?\n- Generated Function Calls: [{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}]\n- Execution Results: The Maine Coon is a big and hairy breed of cat\n\nNote: The query may have multiple intentions. Functions may be placeholders, and execution results may be truncated due to length, which is acceptable and should not cause a failure.\n\nThe main decision factor is wheather the function calls accurately reflect the query\'s intentions and the function descriptions.\nProvide your reasoning in the thought section and decide if the data passes (answer yes or no).\nIf not passing, concisely explain your reasons in the thought section; otherwise, leave this section blank.\n'}]},
# 'model_name': 'meta-llama/Meta-Llama-3.1-70B-Instruct'}]
```