Building PaperVizAgent Using Rustic AI: A Step-by-Step Guild Definition Guide

Generated output of PaperVizAgent using Rustic AI

This post walks through building PaperVizAgent (formerly PaperBanana) - an agentic system for automated generation of publication-ready academic illustrations. We'll break down each component of a Guild definition, making this both a feature showcase and a quickstart guide for developers.

Full Implementation: View the complete paper_banana_guild.json in the repository.

Quick demo of the final guild:


What We're Building

PaperVizAgent is a multi-agent workflow that:

  1. Indexes uploaded research papers/references
  2. Retrieves relevant context for illustration requests
  3. Plans the illustration content and structure
  4. Styles the plan into a visual design guide
  5. Generates the image using Vertex AI Imagen
  6. Critiques the output and iterates if needed

Thinking About Agent Design

Before diving into code, the most important step is deciding what agents you need. This requires thinking about your workflow as a series of distinct responsibilities.

How to Decide on Agents

Ask yourself these questions:

  1. What are the distinct capabilities needed?
    • Document indexing/retrieval (knowledge management)
    • Content planning (reasoning about what to illustrate)
    • Style translation (converting plans to visual prompts)
    • Image generation (calling an image model)
    • Quality assurance (critiquing output)
  2. Where are the natural boundaries?
    • Each agent should have a single responsibility
    • If an agent is doing two unrelated things, split it
    • If two agents always work together with no independent use, consider merging
  3. What models/tools does each capability need?
    • Retriever needs: embeddings, vector store, chunking
    • Planner/Stylist/Critic need: LLM (can share the same model)
    • Visualizer needs: image generation model
  4. What's the data flow between them?
    • User query -> Retriever -> Planner -> Stylist -> Visualizer -> Critic -> (loop or output)

For PaperVizAgent, this analysis yields 5 agents:

| Agent | Responsibility | Why Separate? |
| --- | --- | --- |
| Retriever | Index docs, search context | Specialized knowledge management, different dependencies |
| Planner | Decide what to illustrate | Distinct reasoning task, could be swapped for different planning strategies |
| Stylist | Convert plan to visual prompt | Separates "what" from "how it looks", enables style customization |
| Visualizer | Generate image | Different model type entirely (image vs text) |
| Critic | Quality check | Enables iteration loop, could be disabled for faster output |

Learn more: See the Agents documentation for core concepts and the Creating Your First Agent guide.
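Before wiring these agents together with routes, it helps to see the pipeline as plain data flow. The sketch below is a simplified mock in ordinary Python functions - none of it is the Rustic AI API, and the message shapes are invented for illustration - but the stage names mirror the five agents above:

```python
# Simplified mock of the PaperVizAgent pipeline. Plain functions stand in
# for the five agents; message shapes are illustrative, not Rustic AI types.

def retrieve(query):
    # Retriever: look up context for the query (stubbed here).
    return {"query": query, "context": ["chunk about attention layers"]}

def plan(retrieved):
    # Planner: decide what the illustration should contain.
    return {"plan": f"diagram of {retrieved['query']}",
            "context": retrieved["context"]}

def style(planned):
    # Stylist: turn the plan into a concrete visual prompt.
    return {"prompt": f"clean vector art, {planned['plan']}"}

def visualize(styled):
    # Visualizer: call an image model (stubbed as a file URL).
    return {"image_url": "file:///tmp/out.png", "prompt": styled["prompt"]}

def critique(image):
    # Critic: approve, or send feedback for another pass (always approves here).
    return {"approved": True, "image": image}

result = critique(visualize(style(plan(retrieve("attention mechanism")))))
```

In the real guild these functions never call each other directly - each boundary becomes a route, which is what makes stages independently swappable.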

1. Defining Your Agents

A Guild is a collection of collaborating agents. Each agent is defined declaratively using an AgentSpec. Let's look at the agents in PaperVizAgent:

The Knowledge Agent (Retriever)

{
    "id": "retriever_agent",
    "name": "Retriever",
    "description": "Indexes references and retrieves relevant context.",
    "class_name": "rustic_ai.core.agents.indexing.knowledge_agent.KnowledgeAgent",
    "additional_topics": ["RETRIEVER_SEARCH", "RETRIEVER_INDEX"],
    "properties": {
        "search_defaults": {
            "limit": 5,
            "hybrid": { "dense_weight": 0.5, "sparse_weight": 0.5 }
        },
        "chunking": { "chunk_size": 1000, "chunk_overlap": 100 },
        "embedder": { "dimension": 768 }
    }
}
View in repo | KnowledgeAgent docs
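The hybrid block in search_defaults blends dense (embedding) and sparse (keyword) relevance. Conceptually it is a weighted sum - a sketch of the idea, not the KnowledgeAgent's internals:

```python
def hybrid_score(dense, sparse, dense_weight=0.5, sparse_weight=0.5):
    """Blend dense (embedding) and sparse (keyword) relevance scores.
    Weights mirror the search_defaults above; illustrative only."""
    return dense_weight * dense + sparse_weight * sparse

# A doc that matches keywords strongly but embeds weakly still ranks well:
score = hybrid_score(dense=0.2, sparse=0.9)  # ~0.55
```

With equal weights, neither retrieval signal can dominate; shifting dense_weight toward 1.0 favors semantic similarity over exact keyword matches.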

The LLM Agents (Planner, Stylist, Critic)

The LLMAgent is a powerful, model-agnostic agent that works with any LLM provider via LiteLLM:

{
    "id": "planner_agent",
    "name": "Planner",
    "description": "Plans the content and structure of the illustration.",
    "class_name": "rustic_ai.llm_agent.llm_agent.LLMAgent",
    "additional_topics": ["PLANNER"],
    "listen_to_default_topic": false,
    "properties": {
        "model": "vertex_ai/gemini-3-pro-preview",
        "default_system_prompt": "You are the Planner Agent for PaperBanana..."
    }
}
View Planner in repo | View Stylist | View Critic

Switching Models is Trivial - just change the model property:

"model": "gpt-4o"                      // OpenAI
"model": "vertex_ai/claude-3-5-sonnet" // Anthropic via Vertex
"model": "azure/gpt-4"                 // Azure OpenAI
"model": "anthropic/claude-3-opus"     // Direct Anthropic

The Image Generation Agent (Visualizer)

{
    "id": "visualizer_agent",
    "name": "Visualizer",
    "description": "Generates the illustration using Vertex AI Imagen.",
    "class_name": "rustic_ai.vertexai.agents.image_generation.VertexAiImagenAgent",
    "additional_topics": ["VISUALIZER"],
    "listen_to_default_topic": false,
    "properties": {
        "model_id": "imagen-4.0-generate-001"
    }
}
View in repo

2. Understanding Routes in Event-Based Architecture

Rustic AI uses an event-driven, message-based architecture. Agents don't call each other directly - they publish messages to topics, and routes determine where those messages go next.

Deep dive: See the documentation on Messaging and Routing Patterns.

Why Event-Based?

  1. Decoupling: Agents don't need to know about each other - they just produce and consume messages
  2. Flexibility: Change routing without changing agent code
  3. Observability: Every message is traceable through the system
  4. Scalability: Agents can run in separate processes/machines
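The decoupling and observability points above can be demonstrated with a minimal topic-based message bus - a toy illustration of the pattern, not Rustic AI's actual messaging layer:

```python
from collections import defaultdict

class Bus:
    """Toy topic-based message bus (illustrates the pattern only)."""

    def __init__(self):
        self.subscribers = defaultdict(list)
        self.log = []  # every published message is traceable

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        self.log.append((topic, message))
        for handler in self.subscribers[topic]:
            handler(message)

bus = Bus()
received = []
bus.subscribe("PLANNER", received.append)
bus.publish("PLANNER", {"role": "user", "content": "plan an illustration"})
```

The publisher never names its consumers - it only names a topic - so rerouting a message means editing configuration, not agent code.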

How to Think About Routes

Routes answer the question: "When agent X produces message type Y, what should happen next?"

Think of routes as reactive rules:

WHEN [agent/method] produces [message type]
THEN transform into [message type] AND send to [destination]

For PaperVizAgent, the routing flow is:

User Message
    |
    v
[Route 1: Content Router] -- file upload? --> RETRIEVER_INDEX
                          -- text query?  --> RETRIEVER_SEARCH
    |
    v
[Route 2: After Index] --> Notify user "Indexed N files"
    |
    v
[Route 3: After Search] --> PLANNER (with context)
    |
    v
[Route 4: After Plan] --> STYLIST
    |
    v
[Route 5: After Style] --> VISUALIZER
    |
    v
[Route 6: After Image] --> CRITIC (for review)
                       --> User (show image)
    |
    v
[Route 7: After Critique] -- approved? --> User
                          -- rejected? --> STYLIST (iterate)

Anatomy of a Route Step

{
    "agent_type": "rustic_ai.core.agents.utils.user_proxy_agent.UserProxyAgent",
    "method_name": "unwrap_and_forward_message",
    "message_format": "rustic_ai.core.guild.agent_ext.depends.llm.models.ChatCompletionRequest",
    "transformer": { ... },
    "route_times": -1,
    "process_status": "completed",
    "destination": { "topics": "user_message_broadcast" }
}
| Field | Purpose |
| --- | --- |
| agent or agent_type | Which agent's output triggers this rule (by name/id or class) |
| method_name | (Optional) Specific method that produced the message |
| message_format | The Pydantic model type to match |
| transformer | How to transform the message before forwarding |
| destination | Where to send the transformed message |
| route_times | How many times this rule can fire (-1 = unlimited) |
| process_status | Mark the processing status (e.g., "completed") |
View all routes in repo

3. Transforms: The Power of JSONata

Transforms are the glue between agents. They reshape messages from one agent's output format into another agent's expected input format.

Deep dive: See the Transformation Patterns documentation.

Why Transforms Matter

Without transforms, you'd need:

  • Agents to know about each other's formats
  • Glue code between every pair of agents
  • Code changes for every routing modification

With transforms, routing is purely declarative - defined in JSON, not code.

Transform Styles

Rustic AI supports two transform styles:

| Style | Use Case | Modifies |
| --- | --- | --- |
| Functional Transformer | Dynamic routing + payload changes | topics, payload, format, context |
| Payload Transformer | Simple format conversion | payload only |

PaperVizAgent uses content_based_router (functional) for all transforms because routes depend on message content.

Example 1: Content-Based Routing (Entry Point)

This transformer inspects the user's message and routes it appropriately:

{
    "style": "content_based_router",
    "handler": "($content := $.payload.messages[0].content; $is_array := $type($content) = 'array'; $files := $is_array ? $content[type='file_url'] : []; $text := $is_array ? $content[type='text'][0].text : $content; $is_file_upload := $count($files) > 0; $ToRetrieverIndex := function(){ {'topics': 'RETRIEVER_INDEX', 'format': '...IndexMediaLinks', 'payload': {'media': [...]}} }; $ToRetrieverSearch := function(){ {'topics': 'RETRIEVER_SEARCH', 'format': '...KBSearchRequest', 'payload': {'text': $text, 'limit': 3}} }; $is_file_upload ? $ToRetrieverIndex() : $ToRetrieverSearch())"
}
View in repo

What this does:

  1. Extracts content from the incoming message
  2. Checks if it contains file uploads
  3. Routes to RETRIEVER_INDEX for files, RETRIEVER_SEARCH for text queries

Breaking down the JSONata:

// 1. Extract content (could be string or array of content parts)
$content := $.payload.messages[0].content;
$is_array := $type($content) = 'array';

// 2. Extract files and text separately
$files := $is_array ? $content[type='file_url'] : [];
$text := $is_array ? $content[type='text'][0].text : $content;

// 3. Decide routing based on content type
$is_file_upload := $count($files) > 0;

// 4. Define output functions for each case
$ToRetrieverIndex := function(){ ... };
$ToRetrieverSearch := function(){ ... };

// 5. Route conditionally
$is_file_upload ? $ToRetrieverIndex() : $ToRetrieverSearch()
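For readers more at home in Python than JSONata, the same routing decision looks like this (a side-by-side analogue for comparison; the guild itself runs the JSONata above, and the payload shapes here follow the elided transform only loosely):

```python
def route_user_message(payload):
    # Python analogue of the JSONata content router above (illustrative only).
    content = payload["messages"][0]["content"]
    is_array = isinstance(content, list)
    # Extract file parts and text, whether content is a string or a parts list.
    files = [p for p in content if p.get("type") == "file_url"] if is_array else []
    text = (next((p["text"] for p in content if p.get("type") == "text"), None)
            if is_array else content)
    if files:  # file upload -> index the documents
        return {"topics": "RETRIEVER_INDEX", "payload": {"media": files}}
    return {"topics": "RETRIEVER_SEARCH", "payload": {"text": text, "limit": 3}}
```

The JSONata version wins in one respect: it lives in the guild JSON, so the routing logic can change without redeploying any agent code.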

Example 2: Simple Pipeline (Planner to Stylist)

{
    "style": "content_based_router",
    "handler": "({'topics': 'STYLIST', 'format': 'rustic_ai.core.guild.agent_ext.depends.llm.models.ChatCompletionRequest', 'payload': {'messages': [{'role': 'user', 'content': $.payload.choices[0].message.content}]}})"
}
View in repo

What this does:

  • Extracts the Planner's LLM response ($.payload.choices[0].message.content)
  • Wraps it in a new ChatCompletionRequest format
  • Routes to the STYLIST topic

Example 3: Context Preservation (Retriever to Planner)

{
    "handler": "({'topics': 'PLANNER', 'format': '...ChatCompletionRequest', 'context': {'user_query': $.payload.query_text}, 'payload': {'messages': [{'role': 'user', 'content': 'User Query: ' & $.payload.query_text & '\\n\\nContext:\\n' & $string($.payload.results)}]}})"
}
View in repo

Key feature: The context field preserves the original user query for downstream agents, accessible via $.context.user_query in later transforms.

Example 4: Conditional Routing (Critic Feedback Loop)

{
    "handler": "($content := $.payload.choices[0].message.content; $is_approved := $contains($content, 'APPROVE'); $ToUser := function(){ {'topics': 'user_message_broadcast', ...} }; $ToStylist := function(){ {'topics': 'STYLIST', 'payload': {'messages': [{'role': 'user', 'content': 'The previous image was rejected. Critique: ' & $content & '. Please refine the prompt.'}]}} }; $is_approved ? $ToUser() : $ToStylist())"
}
View in repo

This implements a feedback loop:

  • If Critic says "APPROVE" -> send to user
  • Otherwise -> send feedback back to Stylist for refinement
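In plain Python terms, the Critic's branch reduces to a single conditional (an analogue of the transform above, with the feedback wording taken from it):

```python
def critic_route(critic_text):
    # Mirrors the conditional transform: APPROVE -> user, else -> Stylist.
    if "APPROVE" in critic_text:
        return {"topics": "user_message_broadcast"}
    feedback = ("The previous image was rejected. Critique: "
                + critic_text + ". Please refine the prompt.")
    return {"topics": "STYLIST",
            "payload": {"messages": [{"role": "user", "content": feedback}]}}
```

Because route_times on the critique route is unbounded (-1), this loop can fire repeatedly; in practice you would either cap iterations via route_times or instruct the Critic to approve after a fixed number of passes.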

Transform Building Blocks (JSONata)

Operation Syntax Example
Variable assignment $var := expr $total := $sum(items.price)
Ternary cond ? a : b $priority > 5 ? 'high' : 'normal'
Object literal {'key': value} {'topics': 'OUTPUT', 'payload': {...}}
String concat a & b 'Hello ' & $.name
Array filter arr[predicate] $content[type='file_url']
Functions function(){ expr } $Route := function(){ {...} }

4. Dependencies

Dependencies provide shared resources to agents via dependency injection. Define them at the guild level:

"dependency_map": {
    "filesystem": {
        "class_name": "rustic_ai.core.guild.agent_ext.depends.filesystem.FileSystemResolver",
        "properties": {
            "path_base": "/tmp/PaperVizAgent_data",
            "protocol": "file",
            "storage_options": { "auto_mkdir": true },
            "asynchronous": true
        },
        "scope": "guild"
    },
    "kb_backend": {
        "class_name": "rustic_ai.core.knowledgebase.kbindex_backend_memory.InMemoryKBIndexBackendResolver",
        "properties": {},
        "scope": "guild"
    },
    "llm": {
        "class_name": "rustic_ai.litellm.agent_ext.llm.LiteLLMResolver",
        "properties": {
            "model": "vertex_ai/gemini-3-pro-preview"
        }
    }
}
View in repo | Dependencies documentation

Scope Options:

  • agent (default): One instance per agent
  • guild: Shared instance across all agents in the guild
  • org: Shared across all guilds in an organization

Complete Guild Structure

{
    "name": "PaperBanana",
    "description": "Automated generation of publication-ready academic illustrations",
    "spec": {
        "name": "PaperBanana",
        "agents": [ /* 5 agents */ ],
        "dependency_map": { /* 3 dependencies */ },
        "routes": {
            "steps": [ /* 8 routing rules */ ]
        }
    }
}
View complete implementation

Summary

| Component | What It Does | Documentation |
| --- | --- | --- |
| Agents | Define individual AI workers with specific roles | Agents |
| LLMAgent | Model-agnostic LLM integration (swap models via config) | LLMAgent |
| Routes | Declarative workflow definition between agents | Routing patterns |
| Transformers | JSONata expressions for message transformation & routing | Transformation patterns |
| Dependencies | Shared resources (LLM, storage, databases) via DI | Dependency injection |

PaperVizAgent demonstrates the power of declarative agent orchestration - complex multi-step AI workflows defined entirely in configuration, with no glue code required.

