
LlamaCppEmbeddings

LlamaCpp library implementation for embedding generation.

Attributes

  • model_name: contains the name of the GGUF quantized model, compatible with the installed version of the llama.cpp Python bindings.

  • model_path: contains the path to the GGUF quantized model, compatible with the installed version of the llama.cpp Python bindings.

  • repo_id: the Hugging Face Hub repository ID.

  • verbose: whether to print verbose output. Defaults to False.

  • n_gpu_layers: number of layers to run on the GPU. Defaults to -1 (use the GPU if available).

  • disable_cuda_device_placement: whether to disable CUDA device placement. Defaults to True.

  • normalize_embeddings: whether to normalize the embeddings. Defaults to False.

  • seed: RNG seed, -1 for random.

  • n_ctx: text context, 0 = from the model.

  • n_batch: maximum batch size for prompt processing.

  • extra_kwargs: additional dictionary of keyword arguments that will be passed to the Llama class of the llama_cpp library. Defaults to {}. See the instantiation sketch after this list.
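
A minimal instantiation sketch showing how these attributes map onto the constructor. The model file name, the local directory, and the n_threads value passed through extra_kwargs are illustrative assumptions:

from distilabel.models.embeddings import LlamaCppEmbeddings

embeddings = LlamaCppEmbeddings(
    model="all-MiniLM-L6-v2-Q2_K.gguf",  # assumed GGUF model file name
    model_path="/path/to/models",  # assumed local directory holding the file
    normalize_embeddings=True,  # normalize each returned embedding vector
    seed=42,  # fixed RNG seed instead of the default -1 (random)
    n_ctx=512,  # text context size; 0 = take it from the model
    extra_kwargs={"n_threads": 4},  # forwarded verbatim to llama_cpp's Llama class
)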

Runtime Parameters

  • n_gpu_layers: the number of layers to use for the GPU. Defaults to -1.

  • verbose: whether to print verbose output. Defaults to False.

  • normalize_embeddings: whether to normalize the embeddings. Defaults to False.

  • extra_kwargs: additional dictionary of keyword arguments that will be passed to the Llama class of the llama_cpp library. Defaults to {}. See the run-time override sketch after this list.
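
Because these are runtime parameters, they can also be overridden per run rather than at instantiation. A hedged sketch, assuming the model is attached to an EmbeddingGeneration step; the default step name "embedding_generation_0" and the nested parameter layout are assumptions based on distilabel's usual runtime-parameter convention:

from distilabel.models.embeddings import LlamaCppEmbeddings
from distilabel.pipeline import Pipeline
from distilabel.steps import EmbeddingGeneration, LoadDataFromDicts

with Pipeline(name="embeddings-pipeline") as pipeline:
    loader = LoadDataFromDicts(data=[{"text": "distilabel is awesome!"}])
    generation = EmbeddingGeneration(
        embeddings=LlamaCppEmbeddings(
            model="all-MiniLM-L6-v2-Q2_K.gguf",
            repo_id="second-state/All-MiniLM-L6-v2-Embedding-GGUF",
        )
    )
    loader >> generation

distiset = pipeline.run(
    parameters={
        # assumed default step name; adjust to your step's actual name
        "embedding_generation_0": {
            "embeddings": {"n_gpu_layers": 0, "normalize_embeddings": True},
        }
    },
)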

Examples

Generate sentence embeddings using a local model

from pathlib import Path
from distilabel.models.embeddings import LlamaCppEmbeddings

# You can follow along with this example by downloading the model with the following
# terminal command, which saves it to the `Downloads` folder:
# curl -L -o ~/Downloads/all-MiniLM-L6-v2-Q2_K.gguf https://hugging-face.cn/second-state/All-MiniLM-L6-v2-Embedding-GGUF/resolve/main/all-MiniLM-L6-v2-Q2_K.gguf

model_path = "Downloads/"
model = "all-MiniLM-L6-v2-Q2_K.gguf"
embeddings = LlamaCppEmbeddings(
    model=model,
    model_path=str(Path.home() / model_path),
)

embeddings.load()

results = embeddings.encode(inputs=["distilabel is awesome!", "and Argilla!"])
print(results)
embeddings.unload()
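# [
#   [-0.05447685346007347, -0.01623094454407692, ...],
#   [4.4889533455716446e-05, 0.044016145169734955, ...],
# ]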

Generate sentence embeddings using a HuggingFace Hub model

from distilabel.models.embeddings import LlamaCppEmbeddings
# You need to set an environment variable to download a private model to the local machine

repo_id = "second-state/All-MiniLM-L6-v2-Embedding-GGUF"
model = "all-MiniLM-L6-v2-Q2_K.gguf"
embeddings = LlamaCppEmbeddings(model=model, repo_id=repo_id)

embeddings.load()

results = embeddings.encode(inputs=["distilabel is awesome!", "and Argilla!"])
print(results)
embeddings.unload()
# [
#   [-0.05447685346007347, -0.01623094454407692, ...],
#   [4.4889533455716446e-05, 0.044016145169734955, ...],
# ]

Generate sentence embeddings using the CPU

from pathlib import Path
from distilabel.models.embeddings import LlamaCppEmbeddings

# You can follow along with this example by downloading the model with the following
# terminal command, which saves it to the `Downloads` folder:
# curl -L -o ~/Downloads/all-MiniLM-L6-v2-Q2_K.gguf https://hugging-face.cn/second-state/All-MiniLM-L6-v2-Embedding-GGUF/resolve/main/all-MiniLM-L6-v2-Q2_K.gguf

model_path = "Downloads/"
model = "all-MiniLM-L6-v2-Q2_K.gguf"
embeddings = LlamaCppEmbeddings(
    model=model,
    model_path=str(Path.home() / model_path),
    n_gpu_layers=0,
    disable_cuda_device_placement=True,
)

embeddings.load()

results = embeddings.encode(inputs=["distilabel is awesome!", "and Argilla!"])
print(results)
embeddings.unload()
# [
#   [-0.05447685346007347, -0.01623094454407692, ...],
#   [4.4889533455716446e-05, 0.044016145169734955, ...],
# ]
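
The normalize_embeddings attribute normalizes each returned vector (assumed to be L2 normalization, the usual convention). A minimal sketch, reusing the local model from the examples above, that checks the resulting norm; the ~1.0 value is the expected outcome, not captured output:

import math
from pathlib import Path

from distilabel.models.embeddings import LlamaCppEmbeddings

embeddings = LlamaCppEmbeddings(
    model="all-MiniLM-L6-v2-Q2_K.gguf",
    model_path=str(Path.home() / "Downloads/"),
    normalize_embeddings=True,  # normalize each embedding vector
)

embeddings.load()

results = embeddings.encode(inputs=["distilabel is awesome!"])
# A normalized vector should have (approximately) unit Euclidean length.
print(math.sqrt(sum(v * v for v in results[0])))  # expected: ~1.0
embeddings.unload()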
