Releases: copyleftdev/pdfvec
v0.1.1
Documentation Release
Added
- `README.md` with full documentation, usage examples, and benchmarks
- `LICENSE-MIT` - MIT license text
- `LICENSE-APACHE` - Apache 2.0 license text
Fixed
- Repository URL in `Cargo.toml` now points to the correct location
Install
[dependencies]
pdfvec = "0.1.1"

cargo install pdfvec

Full Changelog: v0.1.0...v0.1.1
v0.1.0
Initial Release
High-performance PDF text extraction for vectorization pipelines.
Features
- PDF Text Extraction - Parallel and streaming modes
- Structured API - Document and Page abstractions
- Text Chunking - Fixed, paragraph, and sentence strategies
- Metadata Extraction - Title, author, dates, page count
- CLI Tool - `pdfvec extract` and `pdfvec metadata` commands
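
To make the three chunking strategies named above concrete, here is a minimal, std-only sketch of what fixed, paragraph, and sentence splitting could look like. It is not pdfvec's actual API: the `Strategy` enum and the `chunk` function are hypothetical names chosen for illustration, and the splitting rules (blank lines for paragraphs, `.`/`!`/`?` for sentences) are simplifying assumptions.

```rust
/// Hypothetical chunking strategies (illustrative only, not pdfvec's API).
#[derive(Debug, Clone, Copy)]
pub enum Strategy {
    /// Fixed-size windows of `size` characters (the last one may be shorter).
    Fixed { size: usize },
    /// Split on blank lines.
    Paragraph,
    /// Split after '.', '!' or '?' (a naive rule that ignores abbreviations).
    Sentence,
}

/// Split `text` into chunks according to `strategy`.
pub fn chunk(text: &str, strategy: Strategy) -> Vec<String> {
    match strategy {
        Strategy::Fixed { size } => {
            // Work on chars rather than bytes so multi-byte UTF-8 is never cut.
            let chars: Vec<char> = text.chars().collect();
            chars
                .chunks(size.max(1)) // guard against a zero window size
                .map(|window| window.iter().collect::<String>())
                .collect()
        }
        Strategy::Paragraph => text
            .split("\n\n")
            .map(str::trim)
            .filter(|p| !p.is_empty())
            .map(String::from)
            .collect(),
        Strategy::Sentence => text
            // Keep the terminator attached to the sentence it ends.
            .split_inclusive(|c: char| matches!(c, '.' | '!' | '?'))
            .map(str::trim)
            .filter(|s| !s.is_empty())
            .map(String::from)
            .collect(),
    }
}
```

Calling `chunk("One. Two! Three?", Strategy::Sentence)` yields three chunks with their punctuation retained. A real pipeline would likely add overlap between fixed windows and a more careful sentence segmenter; both are left out here to keep the sketch short.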
Performance
| File Size | pdfvec | pdf-extract | Speedup |
|---|---|---|---|
| 33 KB | 818 µs | 12.7 ms | 15x |
| 94 KB | 1.5 ms | 83 ms | 55x |
| 422 KB | 3.1 ms | 439 ms | 143x |
Throughput: 40-134 MiB/s
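
As a sanity check on how the speedup and throughput figures relate to the table, the arithmetic can be sketched as below. The helper names are hypothetical, and the sketch assumes the table's "KB" means KiB (1024 bytes). Because the displayed timings are rounded, the results only approximate the published numbers.

```rust
/// Throughput in MiB/s for `size_kib` KiB processed in `micros` microseconds.
/// Assumes the table's "KB" column is KiB (1024 bytes).
pub fn throughput_mib_s(size_kib: f64, micros: f64) -> f64 {
    let mib = size_kib / 1024.0;
    let secs = micros / 1_000_000.0;
    mib / secs
}

/// How many times faster the candidate is than the baseline
/// (both durations in microseconds).
pub fn speedup(baseline_us: f64, candidate_us: f64) -> f64 {
    baseline_us / candidate_us
}
```

With the rounded table values, `throughput_mib_s(33.0, 818.0)` comes out a little under 40 MiB/s and `throughput_mib_s(422.0, 3100.0)` at about 133 MiB/s, which is consistent with the stated range. `speedup(12_700.0, 818.0)` is about 15.5.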
Install
[dependencies]
pdfvec = "0.1.0"

cargo install pdfvec