Sunbelt Computer Software

bert.cpp

This is a ggml implementation of the BERT embedding architecture. It supports inference on both CPU and CUDA, in floating point and in a wide variety of quantization schemes. It also includes Python bindings for batched inference.

This repo is a fork of the original bert.cpp as well as embeddings.cpp. Thanks to both of you!

Install

Fetch this repository, then download the submodules and install the Python packages with

git submodule update --init --recursive
pip install -r requirements.txt

To fetch models from Hugging Face and convert them to GGUF format, run the following

cd models
python download-repo.py BAAI/bge-base-en-v1.5 # or any other model
python convert-to-ggml.py BAAI/bge-base-en-v1.5 f16
python convert-to-ggml.py BAAI/bge-base-en-v1.5 f32

Build

To build the dynamic library for usage from Python

cmake -B build .
make -C build -j

If you're compiling for GPU, you should run

cmake -DGGML_CUBLAS=ON -B build .
make -C build -j

On some distros, you also need to specify the host C++ compiler. To do this, I suggest setting the CUDAHOSTCXX environment variable to your C++ bindir.

And for Apple Metal, you should run

cmake -DGGML_METAL=ON -B build .
make -C build -j

Execute

All executables are placed in build/bin. To run inference on a given text, run

build/bin/main -m models/bge-base-en-v1.5/ggml-model-f16.gguf -p "Hello world"

To force CPU usage, add the flag -c.
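
If you want to drive the executable from a script, the command line above can be assembled programmatically. The following is a minimal sketch, not part of this repo: the helper names main_command and run_main are hypothetical, and it uses only the -m, -p, and -c flags described above. The output format of main is not documented here, so run_main simply returns the raw standard output.

```python
import subprocess


def main_command(model_path, prompt, force_cpu=False, binary="build/bin/main"):
    """Build the argument list for the main executable (hypothetical helper)."""
    cmd = [binary, "-m", model_path, "-p", prompt]
    if force_cpu:
        # -c forces CPU usage, as described above
        cmd.append("-c")
    return cmd


def run_main(model_path, prompt, force_cpu=False):
    """Run main and return whatever it prints to standard output."""
    result = subprocess.run(
        main_command(model_path, prompt, force_cpu),
        check=True,           # raise CalledProcessError on a non-zero exit code
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the output as text
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the prompt contains spaces or quotes.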

Python

You can also run everything through Python, which is particularly useful for batch inference. For instance,

import bert
mod = bert.BertModel('models/bge-base-en-v1.5/ggml-model-f16.gguf')
emb = mod.embed(batch)

where batch is a list of strings and emb is a numpy array of embedding vectors.
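
A common next step is to compare the returned embeddings. The sketch below is an illustration rather than part of bert.py: cosine_similarity and rank_by_similarity are hypothetical helpers written in plain Python, so they accept any sequences of floats, including the rows of the emb array described above.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Similarity is undefined for a zero vector; report 0.0 by convention
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(query_vec, doc_vecs):
    """Return indices of doc_vecs ordered from most to least similar to query_vec."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
```

For example, with emb = mod.embed(batch), rank_by_similarity(emb[0], emb[1:]) orders the remaining strings by their similarity to the first one. For large batches, a vectorized numpy version would be faster.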

Quantize

You can quantize models with the command

build/bin/quantize models/bge-base-en-v1.5/ggml-model-f32.gguf models/bge-base-en-v1.5/ggml-model-q8_0.gguf q8_0

or whatever your desired quantization level is. Currently supported values are: q8_0, q5_0, q5_1, q4_0, and q4_1. You can then pass these model files directly to main as above.
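
To produce several quantized variants in one go, the command above can be generated for each supported level. This is a rough sketch under stated assumptions: quantize_command is a hypothetical helper, and it assumes the ggml-model-f32.gguf and ggml-model-<level>.gguf file naming used in the examples above.

```python
# Quantization levels listed as currently supported above
SUPPORTED_LEVELS = ("q8_0", "q5_0", "q5_1", "q4_0", "q4_1")


def quantize_command(model_dir, level, binary="build/bin/quantize"):
    """Build the quantize argument list for one level (hypothetical helper)."""
    if level not in SUPPORTED_LEVELS:
        raise ValueError(f"unsupported quantization level: {level}")
    model_dir = model_dir.rstrip("/")
    src = f"{model_dir}/ggml-model-f32.gguf"      # f32 model produced by convert-to-ggml.py
    dst = f"{model_dir}/ggml-model-{level}.gguf"  # output name follows the example above
    return [binary, src, dst, level]


def all_quantize_commands(model_dir):
    """One command per supported level, in the order listed above."""
    return [quantize_command(model_dir, level) for level in SUPPORTED_LEVELS]
```

Each returned list can be passed to subprocess.run(cmd, check=True) to perform the actual quantization.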
