Sunbelt Computer Software

ViTSTR: Vision Transformer for Fast and Efficient Scene Text Recognition

Build

mkdir -p build && cd build
cmake ..
make -j4

Usage

./bin/vitstr -t 4 -m ../ggml-model-f16.gguf -i ../images/demo_1.png 
main: seed = 1706997535
main: n_threads = 4 / 8
vit_model_load: loading model from '../ggml-model-f16.gguf' - please wait
vit_model_load: hidden_size            = 768
vit_model_load: num_hidden_layers      = 12
vit_model_load: num_attention_heads    = 12
vit_model_load: patch_size             = 16
vit_model_load: img_size               = 224
vit_model_load: num_classes            = 96
vit_model_load: ftype                  = 1
vit_model_load: qntvr                  = 0
operator(): ggml ctx size = 164.48 MB
vit_model_load: ................... done
vit_model_load: model size =   163.56 MB / num tensors = 152
main: loaded image '../images/demo_1.png' (184 x 72)
processed, out dims : (224 x 224)
------------------ 
Available
score : 1.00 
------------------ 


main:    model load time =   144.64 ms
main:    processing time =  1176.77 ms
main:    total time      =  1321.41 ms
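
The hyperparameters in the log above fix a few derived quantities that help when reading it. The sketch below is illustrative arithmetic only, assuming the standard Vision Transformer layout of non-overlapping square patches and equally sized attention heads. The helper names (patches_per_side, num_patches, head_dim, check_logged_values) are hypothetical and do not come from vitstr.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not taken from vitstr.cpp. They restate the usual
// ViT arithmetic for the values printed by vit_model_load.

// Patches along one side of a square image split into non-overlapping
// square patches (assumes img_size is a multiple of patch_size).
constexpr int32_t patches_per_side(int32_t img_size, int32_t patch_size) {
    return img_size / patch_size;
}

// Total number of patches in the image.
constexpr int32_t num_patches(int32_t img_size, int32_t patch_size) {
    return patches_per_side(img_size, patch_size) *
           patches_per_side(img_size, patch_size);
}

// Width of a single attention head when the hidden size is split evenly
// across heads (assumes hidden_size is a multiple of num_heads).
constexpr int32_t head_dim(int32_t hidden_size, int32_t num_heads) {
    return hidden_size / num_heads;
}

// Checks the derived values for the hyperparameters shown in the log:
// img_size = 224, patch_size = 16, hidden_size = 768, 12 heads.
inline void check_logged_values() {
    assert(patches_per_side(224, 16) == 14);
    assert(num_patches(224, 16) == 196);
    assert(head_dim(768, 12) == 64);
}
```

With the logged values, the 224 x 224 preprocessed image corresponds to a 14 x 14 grid of 196 patches, and each of the 12 attention heads works on 64 dimensions. Whether the port adds extra tokens to that sequence is not stated in the log, so the sketch does not count any.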

