VoucherVisionGO API

API Settings

Display Water, CO₂, Energy Usage

This displays the environmental impact (water, CO₂, and energy usage) associated with your API key. All API responses contain this information so you can track your usage at a more granular level. Please consult the "Impact" page for more information about how these metrics are calculated.
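
For illustration only, here is a minimal sketch of totaling those per-response impact values across a batch. The key names used below (impact, water_ml, co2_g, energy_wh) are hypothetical placeholders, not the API's documented field names; check a real response for the actual keys.

```python
# Minimal sketch of totaling per-request impact across a batch of responses.
# NOTE: the key names ("impact", "water_ml", "co2_g", "energy_wh") are
# hypothetical placeholders -- inspect a real API response for the actual keys.
def accumulate_impact(responses):
    totals = {"water_ml": 0.0, "co2_g": 0.0, "energy_wh": 0.0}
    for resp in responses:              # each resp: parsed JSON response (dict)
        impact = resp.get("impact", {})
        for key in totals:
            totals[key] += float(impact.get(key, 0.0))
    return totals
```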

VoucherVision Cost Calculator

On average, herbarium specimens require less than 4,000 input tokens and produce less than 1,000 output tokens. All values are in $USD per 1,000,000 tokens. Estimated costs do not factor in the cost of hosting the VoucherVisionAPI server.
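
As a rough worked example (a sketch only; the per-token prices below are placeholders, not rates from the cost matrix), the per-specimen cost is each token count divided by 1,000,000 and multiplied by the per-million-token price:

```python
# Rough per-specimen cost estimate from the token counts above.
# Prices are in $USD per 1,000,000 tokens; the rates below are placeholders,
# not actual provider pricing -- substitute values from the cost matrix.
input_tokens = 4_000       # typical upper bound per herbarium specimen
output_tokens = 1_000      # typical upper bound per herbarium specimen
input_price_per_m = 0.10   # placeholder $/1M input tokens
output_price_per_m = 0.40  # placeholder $/1M output tokens

cost = (input_tokens / 1_000_000) * input_price_per_m + \
       (output_tokens / 1_000_000) * output_price_per_m
print(f"~${cost:.6f} per specimen (server hosting not included)")
```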

LLM Provider Cost Matrix

Models are grouped by provider (OpenAI, Azure, Google, Mistral, Hyperbolic, Local, Other).

VoucherVision Settings

About OCR Engine Options

Select one or two OCR engines below. Each engine has different capabilities and performance characteristics. NOTE: The OCR models are instructed to detect handwritten text and stricken text to inform the parsing model of these features. This is accomplished by having the OCR model place section signs on either side of stricken text (§stricken text§) and guillemet quotes on either side of handwritten text («handwritten text»). These markers are removed/cleaned from the API response before it is returned to you; only the parsing LLM sees them. Any prompt that takes advantage of this feature should include the following instructions: """Redacted or stricken text will have section signs on either side (§stricken text§). Handwritten text will have guillemet quotes on either side («handwritten text»)"""
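
For illustration only, a minimal sketch (not the server's actual cleaning code) of how these markers could be stripped from OCR text before it is returned:

```python
import re

# Minimal sketch (not the actual server implementation) of removing the
# OCR annotation markers before text is returned to the caller.
def strip_ocr_markers(text: str) -> str:
    text = re.sub(r"§([^§]*)§", r"\1", text)   # stricken text: §...§
    text = re.sub(r"«([^»]*)»", r"\1", text)   # handwritten text: «...»
    return text

print(strip_ocr_markers("Det. «J. Smith», §1942§ 1943"))
# -> "Det. J. Smith, 1942 1943"
```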

About the models:

  • Gemini 3 Pro: An extremely capable model. By far the best model for reading handwritten labels, beating out all other OCR solutions. Try the SLTPvM_geolocate prompt!
  • Gemini 2.5 Pro: High accuracy, especially for handwritten text; can follow complex instructions, such as parsing multiple species determinations. Can perform geolocation, search the internet to validate content, etc.
  • Gemini 2.5 Flash: Good, but handwriting performance seems worse than Gemini 2.0 Flash.
  • Gemini 2.0 Flash: Fast processing with good accuracy for most specimens. The workhorse model. Start here.
  • OCR Only: Extract text only (check the box below).

OCR Engines (select one or more):

Bypasses TextCollage and uses the uploaded image directly. Produces markdown files.

LLM Model for creating JSON:

Please use gemini-2.0-flash and reserve gemini-2.5-pro for the prompts that are optimized to take full advantage of its capabilities (e.g. SLTPvM_geolocate_flag_multispecimen.yaml).

SLTPvM_geolocate_flag_multispecimen.yaml will flag sheets that have more than one specimen (multiple barcodes) and will automatically geolocate specimens that lack GPS coordinates in the label text.

World Flora Online (WFO) Validation:

Enable taxonomic validation against the World Flora Online database for plant specimens. This adds botanical name verification and taxonomic information to the results.

Prompt Template:

Prompts optimized for Gemini 3 are shown in purple.

File Upload
Image URL
Batch URLs
Batch Folder

Test with File Upload

Image Preview

Test with Image URL

Batch Process URLs

Upload a text file (.txt) with one URL per line or a CSV file (.csv) with URLs in a column.

Example CSV file format:

url,description,category
https://example.com/image1.jpg,Specimen 1,Category A
https://example.com/image2.jpg,Specimen 2,Category B
https://example.com/image3.jpg,Specimen 3,Category C
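
For convenience, a minimal sketch (plain Python, nothing VoucherVision-specific) of producing a batch file in either accepted format:

```python
import csv

urls = [
    "https://example.com/image1.jpg",
    "https://example.com/image2.jpg",
    "https://example.com/image3.jpg",
]

# Option 1: plain text file, one URL per line
with open("batch_urls.txt", "w") as f:
    f.write("\n".join(urls) + "\n")

# Option 2: CSV file with the URLs in a column (extra columns are optional)
with open("batch_urls.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["url", "description", "category"])
    for i, url in enumerate(urls, start=1):
        writer.writerow([url, f"Specimen {i}", "Category A"])
```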

Batch Process Local Images

Drop image files here


🗺️ Specimen Location Map (Last Processed)

⚠️ Waiting for results with coordinates...

Debug Information