Releases: copyleftdev/pdfvec

v0.1.1

26 Jan 03:42

Documentation Release

Added

  • README.md with full documentation, usage examples, and benchmarks
  • LICENSE-MIT - MIT license text
  • LICENSE-APACHE - Apache 2.0 license text

Fixed

  • Repository URL in Cargo.toml now points to correct location

Install

Add the library to Cargo.toml:

    [dependencies]
    pdfvec = "0.1.1"

Or install the CLI:

    cargo install pdfvec

Full Changelog: v0.1.0...v0.1.1

v0.1.0

26 Jan 03:42

Initial Release

High-performance PDF text extraction for vectorization pipelines.

Features

  • PDF Text Extraction - Parallel and streaming modes
  • Structured API - Document and Page abstractions
  • Text Chunking - Fixed, paragraph, and sentence strategies
  • Metadata Extraction - Title, author, dates, page count
  • CLI Tool - pdfvec extract and pdfvec metadata commands
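The three chunking strategies named above can be illustrated with a small standalone sketch. It does not use pdfvec and is not its implementation: the function names chunk_fixed, chunk_paragraphs, and chunk_sentences are hypothetical, and the sentence splitter is deliberately naive (it breaks after '.', '!' or '?', so abbreviations would be split too).

```rust
/// Fixed-size chunking: pieces of at most `max_chars` characters.
/// (Hypothetical helper, not part of pdfvec.)
fn chunk_fixed(text: &str, max_chars: usize) -> Vec<String> {
    assert!(max_chars > 0, "max_chars must be positive");
    // Collect chars first so multi-byte characters are never cut in half.
    let chars: Vec<char> = text.chars().collect();
    chars
        .chunks(max_chars)
        .map(|piece| piece.iter().collect())
        .collect()
}

/// Paragraph chunking: split on blank lines and drop empty pieces.
fn chunk_paragraphs(text: &str) -> Vec<String> {
    text.split("\n\n")
        .map(str::trim)
        .filter(|p| !p.is_empty())
        .map(String::from)
        .collect()
}

/// Sentence chunking: naive split that keeps the terminating punctuation.
fn chunk_sentences(text: &str) -> Vec<String> {
    text.split_inclusive(|c: char| matches!(c, '.' | '!' | '?'))
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(String::from)
        .collect()
}

fn main() {
    let text = "First page. It has two sentences.\n\nSecond paragraph here!";
    println!("fixed:      {:?}", chunk_fixed(text, 16));
    println!("paragraphs: {:?}", chunk_paragraphs(text));
    println!("sentences:  {:?}", chunk_sentences(text));
}
```

Fixed-size chunks give predictable input lengths for an embedding model, while paragraph and sentence chunks keep semantic units intact at the cost of uneven sizes.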

Performance

File Size   pdfvec    pdf-extract   Speedup
33 KB       818 µs    12.7 ms       15x
94 KB       1.5 ms    83 ms         55x
422 KB      3.1 ms    439 ms        143x

Throughput: 40-134 MiB/s
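For readers who want to relate the speedup and throughput figures to the timings, here is a minimal sketch of the arithmetic. It assumes 1 KB = 1024 bytes, which the release notes do not state, and the helper names are hypothetical. Because the published timings are rounded, the recomputed values land close to the listed figures rather than exactly on them.

```rust
/// Speedup of the fast run over the slow run (both in seconds).
fn speedup(slow_secs: f64, fast_secs: f64) -> f64 {
    slow_secs / fast_secs
}

/// Throughput in MiB/s for `bytes` processed in `secs` seconds.
fn throughput_mib_s(bytes: f64, secs: f64) -> f64 {
    bytes / secs / (1024.0 * 1024.0)
}

fn main() {
    // (file size in KB, pdfvec seconds, pdf-extract seconds), from the table.
    let rows = [
        (33.0, 818e-6, 12.7e-3),
        (94.0, 1.5e-3, 83e-3),
        (422.0, 3.1e-3, 439e-3),
    ];
    for (kb, fast, slow) in rows {
        // Assumption: 1 KB = 1024 bytes.
        let bytes = kb * 1024.0;
        println!(
            "{:>5} KB  speedup ~{:.0}x  throughput ~{:.0} MiB/s",
            kb,
            speedup(slow, fast),
            throughput_mib_s(bytes, fast)
        );
    }
}
```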

Install

Add the library to Cargo.toml:

    [dependencies]
    pdfvec = "0.1.0"

Or install the CLI:

    cargo install pdfvec