Pre-requisites
Run the following command to enable watsonx Orchestrate Developer Edition to process documents:
BASH
Note:
Allocate at least 20 GB of RAM to your Docker engine when you install watsonx Orchestrate Developer Edition. Document processing features require this allocation.
Note:
To run the document classifier, you must define the WO_INSTANCE, WO_API_KEY, and AUTHORIZATION_URL credentials in your .env file. For more information on configuring the .env file, see Installing the watsonx Orchestrate Developer Edition.
Configuring document extractor node in agentic workflows
- Define document classes.
Create a class that defines the document classes to classify. Each document class must follow this structure:
Python
- Configure the document classifier node.
Use the docclassifier() method in your agentic workflow to classify the document. This method accepts the following input arguments:
Unique identifier for the node.
The LLM used for document classification. The default value is groq/openai/gpt-oss-120b.
Display name for the node.
The document classification classes.
Description of the node.
Minimum confidence threshold for classification.
Define input mappings using a structured collection of Assignment objects.
Enables or disables the human-in-the-loop feature. Set to True to activate it and False to deactivate it. The default value is False.
Note:
The min_confidence setting controls the human-in-the-loop feature. This feature only works when you run the Flow from a chat session.
If the document is classified with confidence lower than min_confidence, or as Other, the agent opens a review window in the chat. You can then review and confirm the extracted values.
docext node in an agentic workflow:
Python


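The review rule described in the note above can be sketched in plain Python. This is only an illustration of the stated condition (confidence lower than min_confidence, or a classification of Other), not the product's implementation. The Classification type, the needs_review function, and the review_enabled flag are hypothetical names chosen for this sketch and are not part of the watsonx Orchestrate API.

```python
from dataclasses import dataclass


@dataclass
class Classification:
    """Hypothetical result shape: a predicted class label and a confidence score."""
    class_name: str
    confidence: float


def needs_review(result: Classification,
                 min_confidence: float,
                 review_enabled: bool = False) -> bool:
    """Return True when the documented rule would open a review window.

    Assumption: review_enabled stands in for the human-in-the-loop switch,
    which the documentation says defaults to False.
    """
    if not review_enabled:
        return False
    # Review when the confidence is below the threshold or the class is "Other".
    return result.confidence < min_confidence or result.class_name == "Other"


if __name__ == "__main__":
    low = Classification(class_name="Invoice", confidence=0.42)
    other = Classification(class_name="Other", confidence=0.99)
    print(needs_review(low, min_confidence=0.8, review_enabled=True))
    print(needs_review(other, min_confidence=0.8, review_enabled=True))
```

In this sketch a document whose confidence equals min_confidence exactly is not sent for review, because the note only mentions confidence lower than the threshold. How the actual node treats that boundary is not stated in this section.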