
LlamaCppEmbeddings

LlamaCpp library implementation for embedding generation.

Attributes

  • model_name: contains the name of the GGUF quantized model, compatible with the installed version of the llama.cpp Python bindings.

  • model_path: contains the path to the GGUF quantized model, compatible with the installed version of the llama.cpp Python bindings.

  • repo_id: the Hugging Face Hub repository ID.

  • verbose: whether to print verbose output. Defaults to False.

  • n_gpu_layers: number of layers to run on the GPU. Defaults to -1 (use the GPU if available).

  • disable_cuda_device_placement: whether to disable CUDA device placement. Defaults to True.

  • normalize_embeddings: whether to normalize the embeddings. Defaults to False.

  • seed: RNG seed, -1 for random.

  • n_ctx: text context, 0 = from the model.

  • n_batch: maximum batch size for prompt processing.

  • extra_kwargs: additional dictionary of keyword arguments that will be passed to the Llama class of the llama_cpp library. Defaults to {}. See the instantiation sketch after this list.
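
A minimal instantiation sketch showing how these attributes map onto the constructor. The model file name, the local directory, and the n_threads value passed through extra_kwargs are illustrative assumptions:

from distilabel.models.embeddings import LlamaCppEmbeddings

embeddings = LlamaCppEmbeddings(
    model="all-MiniLM-L6-v2-Q2_K.gguf",  # assumed GGUF model file name
    model_path="/path/to/models",  # assumed local directory holding the file
    normalize_embeddings=True,  # normalize each returned embedding vector
    seed=42,  # fixed RNG seed instead of the default -1 (random)
    n_ctx=512,  # text context size; 0 = take it from the model
    extra_kwargs={"n_threads": 4},  # forwarded verbatim to llama_cpp's Llama class
)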

Runtime Parameters

  • n_gpu_layers: the number of layers to use for the GPU. Defaults to -1.

  • verbose: whether to print verbose output. Defaults to False.

  • normalize_embeddings: whether to normalize the embeddings. Defaults to False.

  • extra_kwargs: additional dictionary of keyword arguments that will be passed to the Llama class of the llama_cpp library. Defaults to {}. See the run-time override sketch after this list.
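
Because these are runtime parameters, they can also be overridden per run rather than at instantiation. A hedged sketch, assuming the model is attached to an EmbeddingGeneration step; the default step name "embedding_generation_0" and the nested parameter layout are assumptions based on distilabel's usual runtime-parameter convention:

from distilabel.models.embeddings import LlamaCppEmbeddings
from distilabel.pipeline import Pipeline
from distilabel.steps import EmbeddingGeneration, LoadDataFromDicts

with Pipeline(name="embeddings-pipeline") as pipeline:
    loader = LoadDataFromDicts(data=[{"text": "distilabel is awesome!"}])
    generation = EmbeddingGeneration(
        embeddings=LlamaCppEmbeddings(
            model="all-MiniLM-L6-v2-Q2_K.gguf",
            repo_id="second-state/All-MiniLM-L6-v2-Embedding-GGUF",
        )
    )
    loader >> generation

distiset = pipeline.run(
    parameters={
        # assumed default step name; adjust to your step's actual name
        "embedding_generation_0": {
            "embeddings": {"n_gpu_layers": 0, "normalize_embeddings": True},
        }
    },
)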

Examples

Generate sentence embeddings using a local model

from pathlib import Path
from distilabel.models.embeddings import LlamaCppEmbeddings

# You can follow along with this example by downloading the model with the following
# terminal command, which saves it to the `Downloads` folder:
# curl -L -o ~/Downloads/all-MiniLM-L6-v2-Q2_K.gguf https://hugging-face.cn/second-state/All-MiniLM-L6-v2-Embedding-GGUF/resolve/main/all-MiniLM-L6-v2-Q2_K.gguf

model_path = "Downloads/"
model = "all-MiniLM-L6-v2-Q2_K.gguf"
embeddings = LlamaCppEmbeddings(
    model=model,
    model_path=str(Path.home() / model_path),
)

embeddings.load()

results = embeddings.encode(inputs=["distilabel is awesome!", "and Argilla!"])
print(results)
embeddings.unload()
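# [
#   [-0.05447685346007347, -0.01623094454407692, ...],
#   [4.4889533455716446e-05, 0.044016145169734955, ...],
# ]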

Generate sentence embeddings using a HuggingFace Hub model

from distilabel.models.embeddings import LlamaCppEmbeddings
# You need to set an environment variable to download a private model to the local machine

repo_id = "second-state/All-MiniLM-L6-v2-Embedding-GGUF"
model = "all-MiniLM-L6-v2-Q2_K.gguf"
embeddings = LlamaCppEmbeddings(model=model, repo_id=repo_id)

embeddings.load()

results = embeddings.encode(inputs=["distilabel is awesome!", "and Argilla!"])
print(results)
embeddings.unload()
# [
#   [-0.05447685346007347, -0.01623094454407692, ...],
#   [4.4889533455716446e-05, 0.044016145169734955, ...],
# ]

Generate sentence embeddings using the CPU

from pathlib import Path
from distilabel.models.embeddings import LlamaCppEmbeddings

# You can follow along with this example by downloading the model with the following
# terminal command, which saves it to the `Downloads` folder:
# curl -L -o ~/Downloads/all-MiniLM-L6-v2-Q2_K.gguf https://hugging-face.cn/second-state/All-MiniLM-L6-v2-Embedding-GGUF/resolve/main/all-MiniLM-L6-v2-Q2_K.gguf

model_path = "Downloads/"
model = "all-MiniLM-L6-v2-Q2_K.gguf"
embeddings = LlamaCppEmbeddings(
    model=model,
    model_path=str(Path.home() / model_path),
    n_gpu_layers=0,
    disable_cuda_device_placement=True,
)

embeddings.load()

results = embeddings.encode(inputs=["distilabel is awesome!", "and Argilla!"])
print(results)
embeddings.unload()
# [
#   [-0.05447685346007347, -0.01623094454407692, ...],
#   [4.4889533455716446e-05, 0.044016145169734955, ...],
# ]
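
The normalize_embeddings attribute normalizes each returned vector (assumed to be L2 normalization, the usual convention). A minimal sketch, reusing the local model from the examples above, that checks the resulting norm; the ~1.0 value is the expected outcome, not captured output:

import math
from pathlib import Path

from distilabel.models.embeddings import LlamaCppEmbeddings

embeddings = LlamaCppEmbeddings(
    model="all-MiniLM-L6-v2-Q2_K.gguf",
    model_path=str(Path.home() / "Downloads/"),
    normalize_embeddings=True,  # normalize each embedding vector
)

embeddings.load()

results = embeddings.encode(inputs=["distilabel is awesome!"])
# A normalized vector should have (approximately) unit Euclidean length.
print(math.sqrt(sum(v * v for v in results[0])))  # expected: ~1.0
embeddings.unload()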
