API Settings
Display Water, CO₂, Energy Usage
This displays the water, CO₂, and energy impact associated with your API key. Every API response includes this information, so you can track your usage at a finer granularity. See the "Impact" page for details on how these metrics are calculated.
VoucherVision Cost Calculator
On average, herbarium specimens require fewer than 4,000 input tokens and produce fewer than 1,000 output tokens. All prices are in USD per 1,000,000 tokens. Estimated costs do not include the cost of hosting the VoucherVisionAPI server.
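As a rough illustration of the arithmetic behind the calculator, the sketch below multiplies the token counts by per-million-token prices. The function name and the example prices are hypothetical placeholders, not values from the cost matrix; the default token counts are the upper bounds quoted above, so the result is an upper-bound estimate.

```python
def estimate_cost_usd(n_specimens, input_price_per_m, output_price_per_m,
                      input_tokens=4000, output_tokens=1000):
    """Estimate LLM cost in USD for a batch of specimens.

    Prices are in USD per 1,000,000 tokens. The default token counts are
    the per-specimen upper bounds stated above, not measured values.
    """
    per_specimen = (input_tokens * input_price_per_m
                    + output_tokens * output_price_per_m) / 1_000_000
    return n_specimens * per_specimen


# Hypothetical prices: $0.10 per 1M input tokens, $0.40 per 1M output tokens.
cost = estimate_cost_usd(10_000, input_price_per_m=0.10, output_price_per_m=0.40)
print(f"Estimated cost for 10,000 specimens: ${cost:.2f}")
```

Substitute the input and output prices of the model you select from the cost matrix. Server hosting costs are not part of this estimate.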
LLM Provider Cost Matrix
Models are grouped by provider (OpenAI, Azure, Google, Mistral, Hyperbolic, Local, Other).
VoucherVision Settings
About OCR Engine Options
Select one or two OCR engines below. Each engine has different capabilities and performance characteristics.
NOTE: The OCR models are instructed to detect handwritten text and stricken text so that the parsing model is aware of these features. The OCR model places section signs on either side of stricken text (§stricken text§) and guillemet quotes on either side of handwritten text («handwritten text»). These markers are removed from the API response before it is returned to you; only the parsing LLM sees them.
Any prompt that is meant to take advantage of this feature should include the following instructions: """Redacted or stricken text will have section signs on either side (§stricken text§). Handwritten text will have guillemet quotes on either side («handwritten text»)"""
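To make the marker convention concrete, here is a minimal sketch of how marked-up OCR text could be inspected and cleaned. The function names are hypothetical, and this is not the server's actual cleaning code, which is not shown here; it only assumes the paired § and « » markers described above.

```python
import re


def find_marked(text):
    """Return (stricken, handwritten) lists of the text inside each marker pair."""
    stricken = re.findall(r"§(.*?)§", text, flags=re.DOTALL)
    handwritten = re.findall(r"«(.*?)»", text, flags=re.DOTALL)
    return stricken, handwritten


def strip_markers(text):
    """Remove the marker characters but keep the text they enclose."""
    return re.sub(r"[§«»]", "", text)


sample = "Collected by «J. Smith» §1902§ 1903"
print(find_marked(sample))
print(strip_markers(sample))
```

Note that this sketch keeps stricken text after removing the markers. Whether stricken text should be kept or dropped is a decision for the prompt and the parsing model.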
About the models:
- Gemini 3 Pro: An extremely capable model. By far the best model for reading handwritten labels, beating out all other OCR solutions. Try the SLTPvM_geolocate prompt!
- Gemini 2.5 Pro: High accuracy, especially for handwritten text. Can follow complex instructions, such as parsing multiple species determinations. Can perform geolocation, search the internet to validate content, etc.
- Gemini 2.5 Flash: Good, but not as good as Gemini 2.0 Flash. Handwriting performance in particular seems worse.
- Gemini 2.0 Flash: Fast processing with good accuracy for most specimens. The workhorse model. Start here.
- OCR Only: Extract text only (check box below)
OCR Engines (can select one or multiple):
Bypasses TextCollage and uses the uploaded image directly. Produces markdown files.
LLM Model for creating JSON:
Please use gemini-2.0-flash and reserve gemini-2.5-pro for the prompts that are optimized to take full advantage of its capabilities (e.g. SLTPvM_geolocate_flag_multispecimen.yaml).
SLTPvM_geolocate_flag_multispecimen.yaml flags sheets that have more than one specimen (barcodes) and automatically geolocates specimens that lack GPS coordinates in the label text.
World Flora Online (WFO) Validation:
Enable taxonomic validation against the World Flora Online database for plant specimens. This adds botanical name verification and taxonomic information to the results.
Prompt Template:
Gemini-3 optimized prompts are purple
Test with File Upload
Test with Image URL
Batch Process URLs
Upload a text file (.txt) with one URL per line or a CSV file (.csv) with URLs in a column.
Example CSV file format:
| url | description | category |
|---|---|---|
| https://example.com/image1.jpg | Specimen 1 | Category A |
| https://example.com/image2.jpg | Specimen 2 | Category B |
| https://example.com/image3.jpg | Specimen 3 | Category C |
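If you want to check a batch file before uploading it, the two accepted formats can be read roughly as follows. This is a sketch under assumptions: the function name is hypothetical, and the column name `url` is taken from the example above; how the server itself locates the URL column is not specified here.

```python
import csv
import io


def read_urls(content, filename, url_column="url"):
    """Extract image URLs from the text of a .txt or .csv batch file.

    .csv: reads the column named `url_column` (assumed to be "url").
    .txt: treats every non-empty line as one URL.
    """
    if filename.lower().endswith(".csv"):
        reader = csv.DictReader(io.StringIO(content))
        urls = [(row.get(url_column) or "").strip() for row in reader]
        return [u for u in urls if u]
    return [line.strip() for line in content.splitlines() if line.strip()]


csv_text = (
    "url,description,category\n"
    "https://example.com/image1.jpg,Specimen 1,Category A\n"
    "https://example.com/image2.jpg,Specimen 2,Category B\n"
)
print(read_urls(csv_text, "batch.csv"))
```

Extra columns such as `description` and `category` are ignored by this sketch.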
Batch Process Local Images
Drop image files here or