[EPIC] Command-Line Interface #3

@copyleftdev

Description

Depends On: #1, #2

PDFVEC-004

Goals

  • Sub-second startup time
  • Unix pipeline friendly (stdin/stdout)
  • Parallel processing of multiple files
  • Progress indication for large jobs

Build a fast, user-friendly CLI tool for extracting text from PDFs. It should accept single files, directories, and stdin as input, and write extracted text to stdout so it can be used in pipelines.
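
As a rough illustration of the command surface described here, the sketch below parses `extract [--recursive] <input>` by hand using only the standard library. The epic lists clap for this job, so a real implementation would presumably use clap instead; the names `Input`, `ExtractArgs`, and `parse_args` are hypothetical and do not come from the repository.

```rust
use std::path::PathBuf;

/// Where the PDF bytes come from (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub enum Input {
    /// `-` on the command line: read the PDF from stdin.
    Stdin,
    /// A file or directory path.
    Path(PathBuf),
}

/// Parsed form of `pdfvec extract [--recursive] <input>`.
#[derive(Debug, PartialEq)]
pub struct ExtractArgs {
    pub input: Input,
    pub recursive: bool,
}

/// Parses the arguments that follow the program name.
/// If several inputs are given, the last one wins (a simplification).
pub fn parse_args<I, S>(args: I) -> Result<ExtractArgs, String>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut args = args.into_iter();
    match args.next() {
        Some(cmd) if cmd.as_ref() == "extract" => {}
        Some(cmd) => return Err(format!("unknown subcommand: {}", cmd.as_ref())),
        None => return Err("missing subcommand".to_string()),
    }

    let mut recursive = false;
    let mut input = None;
    for arg in args {
        match arg.as_ref() {
            "--recursive" => recursive = true,
            "-" => input = Some(Input::Stdin),
            flag if flag.starts_with('-') => {
                return Err(format!("unknown flag: {}", flag));
            }
            path => input = Some(Input::Path(PathBuf::from(path))),
        }
    }

    let input = input.ok_or_else(|| "missing input".to_string())?;
    Ok(ExtractArgs { input, recursive })
}
```

In a binary, this could be called as `parse_args(std::env::args().skip(1))`, with errors printed to stderr so stdout stays reserved for extracted text.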

Acceptance Criteria

AC-1

  • Given A PDF file path
  • When pdfvec extract file.pdf is run
  • Then Text is printed to stdout
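
One detail that matters for AC-1 in a pipeline is what happens when the reader goes away early, as in `pdfvec extract file.pdf | head`. The sketch below is one possible way to handle that: it writes the text and treats a broken pipe as a normal exit rather than an error. The function name `write_text` is hypothetical, and whether the project wants this behavior is an assumption.

```rust
use std::io::{self, Write};

/// Writes extracted text to `out` and flushes it.
/// A closed pipe (the downstream reader exited) is treated as success,
/// which is the usual convention for Unix filters. Other errors propagate.
pub fn write_text<W: Write>(mut out: W, text: &str) -> io::Result<()> {
    let result = out.write_all(text.as_bytes()).and_then(|_| out.flush());
    match result {
        Err(e) if e.kind() == io::ErrorKind::BrokenPipe => Ok(()),
        other => other,
    }
}
```

In the binary this would typically be called with a locked handle, for example `write_text(io::stdout().lock(), &text)`.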

AC-2

  • Given PDF data on stdin
  • When cat file.pdf | pdfvec extract - is run
  • Then Text is printed to stdout
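
AC-1 and AC-2 differ only in where the bytes come from, so both could share one loading step. The following is a minimal sketch under that assumption: `-` selects the supplied reader (stdin in the real binary) and anything else is treated as a file path. The names `read_pdf_bytes` and `looks_like_pdf` are hypothetical, and the header check is only a cheap sanity test, not PDF validation.

```rust
use std::fs;
use std::io::{self, Read};

/// Reads the raw PDF bytes for one `extract` input.
/// `-` means "read everything from `stdin`"; any other value is a file path.
/// The reader is a parameter so the function can be tested without a terminal.
pub fn read_pdf_bytes<R: Read>(arg: &str, mut stdin: R) -> io::Result<Vec<u8>> {
    if arg == "-" {
        let mut buf = Vec::new();
        stdin.read_to_end(&mut buf)?;
        Ok(buf)
    } else {
        fs::read(arg)
    }
}

/// Cheap sanity check: PDF files normally begin with the `%PDF-` header.
pub fn looks_like_pdf(bytes: &[u8]) -> bool {
    bytes.starts_with(b"%PDF-")
}
```

For `cat file.pdf | pdfvec extract -`, the binary would pass `io::stdin().lock()` as the reader. Stdin has to be buffered in full here because it cannot be seeked, and PDF parsing generally needs random access.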

AC-3

  • Given A directory of PDFs
  • When pdfvec extract --recursive dir/ is run
  • Then All PDFs are processed in parallel
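
AC-3 has two parts: finding the PDFs under a directory and processing them concurrently. The sketch below shows one way to do both with the standard library only, using scoped threads as a stand-in for rayon, which the epic lists and which would likely replace `process_parallel` with a parallel iterator. The names `collect_pdfs` and `process_parallel` are hypothetical, and the extraction step is passed in as a closure because this issue does not define it.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::thread;

/// Recursively collects every file under `dir` whose extension is `pdf`
/// (case-insensitive). Symlink loops are not guarded against in this sketch.
pub fn collect_pdfs(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            collect_pdfs(&path, out)?;
        } else if path
            .extension()
            .and_then(|e| e.to_str())
            .map_or(false, |e| e.eq_ignore_ascii_case("pdf"))
        {
            out.push(path);
        }
    }
    Ok(())
}

/// Applies `work` to every path using at most `workers` scoped threads.
/// Results are returned in the same order as `paths`.
pub fn process_parallel<T, F>(paths: &[PathBuf], workers: usize, work: F) -> Vec<T>
where
    T: Send,
    F: Fn(&Path) -> T + Sync,
{
    if paths.is_empty() {
        return Vec::new();
    }
    let workers = workers.max(1);
    // Ceiling division so every path lands in exactly one chunk.
    let chunk_len = (paths.len() + workers - 1) / workers;
    let work = &work;
    thread::scope(|s| {
        // Spawn all workers first, then join them in order.
        let handles: Vec<_> = paths
            .chunks(chunk_len)
            .map(|chunk| {
                s.spawn(move || {
                    chunk
                        .iter()
                        .map(|p| work(p.as_path()))
                        .collect::<Vec<T>>()
                })
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Static chunking keeps the sketch short but balances poorly when file sizes vary a lot; a work-stealing pool such as rayon's handles that case better. Because results come back in input order, output on stdout stays deterministic even though the work runs concurrently.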

Technical Context

Crates: clap, indicatif, rayon

Files:

  • src/bin/pdfvec.rs
  • src/cli/mod.rs
  • src/cli/extract.rs

Source: epics/03-cli.json
Content Hash: 87fae0ee33336c46

Child Issues: PDFVEC-040, PDFVEC-041

Metadata

Assignees

No one assigned

Labels

  • component:cli (Command-line interface)
  • epic (Large feature containing multiple stories)
  • priority:medium (Normal priority)
