Batch
Batch is AIVAX's feature for running the same AI workflow over many independent items. It turns a list of inputs into a queue processed in the background, with fixed instructions, a defined model, structured output, optional validation, progress metrics, per-item cost, retries, and result export.
Use Batch when you have dozens, hundreds, or thousands of records that need to undergo the same reasoning: classifying leads, extracting fields from text, enriching records, summarizing short documents, evaluating responses, moderating content, generating structured data, or invoking built-in tools for each line of a list.
What Batch solves
Processing many items with AI usually requires a queue, concurrency control, pausing for balance or limits, error handling, retries, JSON validation, cost tracking, and result export. Batch consolidates these parts in AIVAX.
In practice, it mainly solves:
- Repeatable processing: the same instruction, model, and schema are applied to all items.
- Asynchronous execution: the work continues in the background, without keeping a request open.
- Structured output: each item can be required to return an object compatible with a JSON Schema.
- Correction and validation: AIVAX tries to reprocess invalid responses and can run a second validation step.
- Operation at scale: jobs can be started, paused, resumed, monitored, filtered, cleaned, resent for retry, and exported.
- Operational control: the UI shows progress, failures, confidence, cost, and job events.
When to use
Use Batch when items can be processed independently and do not need to share memory. Good examples are one line per client, URL, product, ticket, message, short document, contract snippet, or raw record.
Batch is a good choice when:
- the same prompt applies to all items;
- you need tabular or JSON results for later consumption;
- response time can be asynchronous;
- you want to track errors and retry only the problematic items;
- you want to use built-in tools, such as web search, for each item;
- you need to measure cost, confidence, and success rate per run.
Do not use Batch for real‑time conversations, flows where one item depends on the previous item's response, document indexing for RAG, or purely deterministic tasks that do not require an AI model. To index searchable knowledge, use RAG collections. For a single immediate response to a user, use inference.
Concepts
Workflow
The workflow is the processing recipe. It defines how future items will be handled:
- title;
- processing instructions;
- model;
- expected result schema;
- enabled built-in tools;
- reasoning effort, when the model supports it;
- validation instructions;
- consecutive error limit before pausing, from 1 to 100;
- maximum retries per item, from 0 to 10.
Changing a workflow affects subsequent jobs and items processed with that configuration. Use separate workflows when the instruction, schema, model, or validation rules change in a significant way.
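The two numeric ranges listed above can be checked on your side before a workflow is saved. The following is a minimal sketch, not AIVAX code: the function name is hypothetical, and it assumes that errorStopThreshold and maxRetries are the fields holding the consecutive error limit and the per-item retry count.

```python
def check_handling_settings(settings: dict) -> list[str]:
    """Return the problems found in the handling settings (empty list if none)."""
    problems = []
    threshold = settings.get("errorStopThreshold")  # assumed field name
    retries = settings.get("maxRetries")            # assumed field name
    # Consecutive error limit before pausing: 1 to 100.
    if not isinstance(threshold, int) or not 1 <= threshold <= 100:
        problems.append("errorStopThreshold must be an integer from 1 to 100")
    # Maximum retries per item: 0 to 10.
    if not isinstance(retries, int) or not 0 <= retries <= 10:
        problems.append("maxRetries must be an integer from 0 to 10")
    return problems


print(check_handling_settings({"errorStopThreshold": 5, "maxRetries": 2}))
print(check_handling_settings({"errorStopThreshold": 0, "maxRetries": 11}))
```

The first call prints an empty list; the second reports both settings as out of range.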
Job
A job is a concrete execution created from a workflow. It groups the items of a workload and tracks the job's state, events, and metrics.
A job can be:
- Active: processing pending items;
- Paused: stopped manually or paused due to limit, balance, temporary unavailability, or many consecutive errors;
- Finished: completed because all items were processed or because it was terminated.
Item
An item is a row from the imported list. Each row becomes an independent input sent to the model with the workflow's instructions.
An item can end as:
- Finished: processed successfully;
- Refused: the model rejected the input;
- ExecutionError: there was an execution or inference error;
- ValidationError: the response did not pass the schema or validation;
- Cancelled: the item was cancelled/removed;
- Pending: still awaiting processing.
Each item can also record priority, output, confidence, cost, and validation details.
How to use in the console
In the AIVAX console, go to Batch.
Create a workflow
In Workflows, create a workflow and configure:
- Basic: set a title, the processing instruction, and the JSON Schema of the result.
- Model: choose an integrated model available in the account, the reasoning effort, and the built-in tools the model may use.
- Validation: enable a second validation pass when the response needs to be checked against business rules.
- Handling: adjust the consecutive error limit and the maximum retries per item.
Write the instruction as a general rule, not as a single question. The imported item will be the variable input.
Instruction example:
Classify the company provided in the input. Return the likely sector, a short justification, and signals found in the text. If the input does not contain enough information, use sector "Undefined".
Schema example:
{
  "type": "object",
  "properties": {
    "sector": { "type": "string" },
    "reason": { "type": "string" },
    "signals": {
      "type": "array",
      "items": { "type": "string" }
    }
  },
  "required": ["sector", "reason", "signals"],
  "additionalProperties": false
}
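To see what this schema accepts, the check below restates it by hand in Python. It is only an illustrative sketch for consumers who want to re-verify exported outputs; the function name is hypothetical, and it is not how AIVAX validates responses internally.

```python
import json


def matches_company_schema(raw: str) -> bool:
    """Check a JSON response against the example schema, rule by rule."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    # All three properties are required and additionalProperties is false,
    # so the object must have exactly these keys.
    if set(data) != {"sector", "reason", "signals"}:
        return False
    if not isinstance(data["sector"], str) or not isinstance(data["reason"], str):
        return False
    signals = data["signals"]
    return isinstance(signals, list) and all(isinstance(s, str) for s in signals)


ok = '{"sector": "Undefined", "reason": "Not enough information", "signals": []}'
extra = '{"sector": "Retail", "reason": "x", "signals": [], "score": 1}'
print(matches_company_schema(ok))     # True
print(matches_company_schema(extra))  # False: "score" is an additional property
```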
Create and run a job
After creating the workflow, create a job for the load you want to process. New jobs are created in Paused state so you can import and inspect the workload before starting processing.
You can import items in four modes:
- lines: reads one uploaded text file and imports each non-empty line as one item.
- files: imports each uploaded plain-text file as one item.
- zip: imports each plain-text entry in an uploaded ZIP file as one item.
- text: imports the submitted text field as a single item.
Lines can be plain text, delimited CSV, URLs, IDs, compact JSON, or any format the instruction knows how to interpret. For structured line-based inputs, prefer JSONL: one JSON object per line.
Example:
{"name":"Company A","description":"B2B auto-parts marketplace"}
{"name":"Company B","description":"Office specializing in employment contracts"}
{"name":"Company C","description":"Regional pharmacy chain"}
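A JSONL file like the one above can be produced from in-memory records with the standard library. This is a small sketch with hypothetical helper names; count_items assumes that whitespace-only lines are treated as empty, which the lines mode description does not state explicitly.

```python
import json


def to_jsonl(records: list[dict]) -> str:
    """Serialize records as JSONL: one compact JSON object per line."""
    return "\n".join(
        json.dumps(record, ensure_ascii=False, separators=(",", ":"))
        for record in records
    )


def count_items(text: str) -> int:
    """Count non-empty lines, i.e. the items a lines-mode import would create."""
    return sum(1 for line in text.splitlines() if line.strip())


records = [
    {"name": "Company A", "description": "B2B auto-parts marketplace"},
    {"name": "Company B", "description": "Office specializing in employment contracts"},
]
payload = to_jsonl(records)
print(payload)
print(count_items(payload + "\n\n"))  # trailing blank lines are not counted
```

Because json.dumps escapes line breaks inside strings, each record stays on a single line even when a description contains a newline.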
With the items imported, start the job. The job screen lets you monitor:
- overall progress;
- pending, completed, and failed items;
- cost already charged and projected cost;
- average confidence;
- job events;
- most recent processed items;
- full item list with filters by state and confidence.
Operate on failed items
Use the list filters to find items with execution error, validation error, refusal, or low confidence. Then you can:
- retry all errors;
- retry only execution errors;
- retry only validation errors;
- retry completed items with low confidence;
- remove pending, completed, error, or all non‑running items;
- open an individual item to review input, output, state, confidence, and cost.
Export results
When the job finishes, export the results in JSONL. Each exported line contains metadata, the original input, and the output. Use this export to import into a spreadsheet, database, data pipeline, or manual review step.
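For the spreadsheet case, an exported JSONL file can be flattened into CSV without knowing the export's field names in advance. The sketch below is an assumption-laden helper, not part of AIVAX: the function name is hypothetical, it assumes every line is a JSON object, and the keys in the sample (input, output) are placeholders rather than the documented export format.

```python
import csv
import io
import json


def jsonl_to_csv(jsonl_text: str) -> str:
    """Convert JSONL text to CSV, using the top-level keys as columns."""
    rows = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    # Collect top-level keys in first-seen order to use as CSV columns.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Nested objects and arrays are kept as JSON text inside their cell.
        writer.writerow({
            key: json.dumps(value, ensure_ascii=False)
            if isinstance(value, (dict, list)) else value
            for key, value in row.items()
        })
    return buffer.getvalue()


# Placeholder lines; the real export defines its own keys.
sample = '{"input": "Company A", "output": {"sector": "Retail"}}\n{"input": "Company B"}\n'
print(jsonl_to_csv(sample))
```

Rows that lack a column are written with an empty cell, so partially populated records still line up.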
How to use via the API
Use the API when you want to integrate Batch into your internal system, data pipeline, or automation. Authentication follows the same pattern as the AIVAX API.
The API flow is the same as the console flow, only expressed as separate operations. First create the workflow, which is the reusable recipe. Then create a job, import the items, and start the job when the workload is ready. After processing begins, use the listing, retry, cleanup, and export endpoints to operate the job without losing track of individual records.
If you are still deciding whether Batch is the right feature, compare it with RAG collections and direct inference. Batch is for repeated reasoning over independent items. RAG collections are for searchable knowledge that should be retrieved later. Direct inference is for one immediate answer.
Create workflow
Create a workflow when you want to save the processing rule that future jobs will reuse. This is where you define the instruction, model, output schema, validation behavior, retries, and enabled tools. A good workflow reads like a policy for every item, not like a one-time prompt for a single record.
Create job
Create a job when you have a concrete workload to run through an existing workflow. Jobs are created paused on purpose: this gives your system a chance to import and inspect items before spending credits on processing.
Jobs are created paused. Import the items before starting.
Import items
Import items after the job exists. Each imported item becomes one independent unit of work, so choose the mode that best matches your source data: one line per record, one file per record, one ZIP entry per record, or one submitted text as a single record.
Set mode to lines, files, zip, or text. If omitted, the API uses lines.
In lines mode, empty lines are skipped and each non-empty line becomes a pending item. The uploaded file field is items; documents is also accepted as an alias. In files and zip modes, only plain-text files are accepted. The current limits are 1,000 files or ZIP entries per request, 10 MB per file or ZIP entry, and 100 MB total imported content.
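The limits above can be checked locally before an upload in files or zip mode. This is a sketch under stated assumptions: the function name is hypothetical, and 1 MB is taken as 1024 * 1024 bytes because the limits do not say which definition applies.

```python
MAX_ENTRIES = 1_000
MB = 1024 * 1024  # assumption: the limits do not say whether 1 MB is 10**6 or 2**20 bytes
MAX_ENTRY_BYTES = 10 * MB
MAX_TOTAL_BYTES = 100 * MB


def check_import(sizes: dict[str, int]) -> list[str]:
    """Given entry name -> size in bytes, return the limit violations found."""
    problems = []
    if len(sizes) > MAX_ENTRIES:
        problems.append(f"{len(sizes)} entries exceeds the limit of {MAX_ENTRIES}")
    for name, size in sizes.items():
        if size > MAX_ENTRY_BYTES:
            problems.append(f"{name} is larger than 10 MB")
    if sum(sizes.values()) > MAX_TOTAL_BYTES:
        problems.append("total content is larger than 100 MB")
    return problems


print(check_import({"a.txt": 2_048, "b.txt": 11 * MB}))
```

For files on disk, the sizes can come from os.path.getsize; for a ZIP archive, from the file_size attribute of each zipfile.ZipInfo entry.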
Start, pause, or finish
Start the job only after the item list looks correct. Pause it when you need to stop spending temporarily, investigate errors, or adjust operations around balance and limits. Finish it when the job should be terminated rather than resumed.
Use Paused to pause and Finished to terminate.
Monitor
Monitoring is how you decide whether the workflow is healthy. The job view gives the overall state; the item list tells you where the work is getting stuck, which items failed validation, and which low-confidence results deserve human review.
To list items:
Useful filters:
- state=Pending, Finished, Refused, ExecutionError, ValidationError, or Cancelled;
- confidence=high for confidence ≥ 80%;
- confidence=low for confidence < 30%;
- filter=text to search in the input;
- limit=100 to adjust the list size within the allowed limit.
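These filters combine as ordinary query parameters. The sketch below only builds and validates the query string; the endpoint path is deliberately left out because it is not given here. The function name is hypothetical, and it assumes state takes a single value per request.

```python
from urllib.parse import urlencode

ITEM_STATES = {
    "Pending", "Finished", "Refused",
    "ExecutionError", "ValidationError", "Cancelled",
}


def item_list_query(state=None, confidence=None, text=None, limit=None) -> str:
    """Build the query string for an item listing from the documented filters."""
    params = {}
    if state is not None:
        if state not in ITEM_STATES:
            raise ValueError(f"unknown item state: {state}")
        params["state"] = state
    if confidence is not None:
        if confidence not in ("high", "low"):
            raise ValueError("confidence must be 'high' or 'low'")
        params["confidence"] = confidence
    if text:
        params["filter"] = text  # searches in the item input
    if limit is not None:
        params["limit"] = limit
    return urlencode(params)


print(item_list_query(state="ValidationError", confidence="low", limit=100))
# state=ValidationError&confidence=low&limit=100
```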
Retry and clean
Retries are best used after you understand the failure shape. Retry execution errors when the provider or request failed, validation errors when the response can likely be regenerated into the expected shape, and low-confidence results when the item succeeded but deserves another model attempt. Cleanup endpoints are for removing non-running items from a job when they are no longer useful.
Retry modes:
- errors;
- execution-error;
- validation-error;
- low-confidence.
Retry changes matching, non-running items back to Pending. If at least one item is retried and the job is not already active, the job is started automatically.
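The retry rule can be modeled locally to reason about what a call will change. This is a simulation, not AIVAX's implementation, and it rests on several assumptions: errors is taken to cover ExecutionError and ValidationError (whether Refused is included is not stated), low-confidence is taken to mean Finished items below the same 30% threshold used by the confidence=low filter, and the running flag and function name are hypothetical.

```python
RETRY_STATES = {
    "execution-error": {"ExecutionError"},
    "validation-error": {"ValidationError"},
    # Assumption: "errors" covers both error states; Refused is left out here.
    "errors": {"ExecutionError", "ValidationError"},
}
LOW_CONFIDENCE = 0.30  # assumption: same threshold as the confidence=low filter


def retry(job_state: str, items: list[dict], mode: str) -> str:
    """Reset matching items to Pending in place; return the resulting job state."""
    retried = 0
    for item in items:
        if item.get("running"):
            continue  # only non-running items are affected
        if mode == "low-confidence":
            match = (item["state"] == "Finished"
                     and item.get("confidence", 1.0) < LOW_CONFIDENCE)
        else:
            match = item["state"] in RETRY_STATES[mode]
        if match:
            item["state"] = "Pending"
            retried += 1
    # The job becomes active only if at least one item was retried.
    return "Active" if retried else job_state


items = [
    {"state": "ExecutionError"},
    {"state": "ValidationError"},
    {"state": "Finished", "confidence": 0.9},
]
print(retry("Paused", items, "execution-error"))
print([item["state"] for item in items])
```

In this model, a retry that matches nothing leaves a paused job paused, which mirrors the "at least one item" condition.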
To remove non‑running items:
Removal modes:
- pending;
- finished;
- errors;
- all.
Removal only deletes non-running items. The errors mode includes ExecutionError, ValidationError, and Refused.
Export
Export is the handoff point from AIVAX back into your own workflow. Use it after the job finishes, or export only a subset when a review process needs finished items first and errors later. The JSONL format is convenient for spreadsheets, databases, queues, and manual audit tools because each line remains tied to the original input and its generated output.
Use state=all, finished, errors or a specific item state. Pending items are not exported. You can also combine with confidence=high or confidence=low.
Costs, limits, and automatic pauses
Each processed item generates inference cost according to the model used. If validation is enabled, validation runs a second model call and can also incur cost. Enabled built-in tools in the workflow may generate costs or consume their own limits.
A job can pause when:
- the account has no available balance;
- the plan’s processing limit has been reached;
- the inference provider is temporarily unavailable;
- the job reaches the workflow's consecutive error limit;
- the user manually pauses the job.
When the pause occurs due to temporary unavailability or a recoverable limit, AIVAX may automatically resume the job later. When the pause is due to lack of balance, add balance before resuming manually or wait for automatic resumption.
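The consecutive error limit is easiest to reason about as a streak counter. The sketch below is an illustration only: it assumes that any successful item resets the streak and that the job pauses as soon as the streak reaches the threshold, neither of which is spelled out here. The function name is hypothetical.

```python
def should_pause(outcomes: list[bool], error_stop_threshold: int) -> bool:
    """Return True if a run of consecutive failures reaches the threshold.

    Each entry in outcomes is True for a successful item and False for a failure.
    """
    streak = 0
    for succeeded in outcomes:
        # Assumption: a success resets the count of consecutive errors.
        streak = 0 if succeeded else streak + 1
        if streak >= error_stop_threshold:
            return True
    return False


print(should_pause([False, True, False, False], error_stop_threshold=3))   # False
print(should_pause([True, False, False, False], error_stop_threshold=3))   # True
```

Under these assumptions, scattered failures never pause a job, while a misconfigured workflow that fails every item pauses after exactly the threshold number of items.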
Best practices
- Test the workflow with a few items before importing a large list.
- Use restrictive schemas with required and additionalProperties: false when the output will be consumed by a system.
- Include input and output examples in the instruction when the format is ambiguous.
- Prefer one line per item; if you need to send complex objects, use JSONL.
- Keep validation enabled for sensitive tasks such as legal, financial extraction, or data that feeds automations.
- Use maxRetries to fix occasional failures, but investigate repeated errors in the prompt or schema.
- Set a low errorStopThreshold in new workflows to avoid spending on a batch with a wrong configuration.
- Retry low‑confidence items separately; low confidence does not mean error, but indicates the response deserves review.
- Export results by state when manual review is needed, e.g., first finished, then errors.