Image to Text Generation

Images shouldn’t slow decision-making. GTWY’s Image Support lets you upload, view, and analyze images instantly using AI — whether they’re screenshots, photos, scanned documents, charts, diagrams, invoices, UI designs, or product images.

Instead of manually inspecting visuals, GTWY understands image content, extracts insights, answers questions, and provides contextual explanations automatically.

This turns static images into intelligent, actionable data.

How Image Support Works

GTWY gives you two flexible ways to upload and use images with your AI agents.

Option 1: Upload via External Image Link

If your image is already hosted online, simply include its URL in your API request.

curl --location 'https://api.gtwy.ai/api/v2/model/chat/completion' \
--header 'pauthkey: YOUR_GENERATED_PAUTHKEY' \
--header 'Content-Type: application/json' \
--data '{
  "user": "Explain the insights from this image.",
  "agent_id": "YOUR_AGENT_ID",
  "thread_id": "YOUR_THREAD_ID",
  "response_type": "text",
  "variables": {},
  "user_urls": [{
    "type": "image",
    "url": "https://example.com/image.png"
  }]
}'

If both thread_id and agent_id are provided, the image is automatically linked to the session — so you don’t need to resend the image URL in follow-up requests.
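The same request body can be assembled in code. The following is a minimal Python sketch, not an official GTWY client: the helper name `build_chat_payload` is hypothetical, and only the field names are taken from the curl example above. It leaves out `user_urls` when no image is passed, which is roughly what a follow-up request in the same thread might look like, assuming the session linking described above.

```python
import json


def build_chat_payload(user, agent_id, thread_id, image_urls=None):
    """Build the JSON body for the chat completion request shown above.

    Hypothetical helper: only the field names come from the curl example.
    """
    payload = {
        "user": user,
        "agent_id": agent_id,
        "thread_id": thread_id,
        "response_type": "text",
        "variables": {},
    }
    # Attach images only when some are given; a follow-up request in the
    # same thread can omit them.
    if image_urls:
        payload["user_urls"] = [{"type": "image", "url": u} for u in image_urls]
    return payload


first = build_chat_payload(
    "Explain the insights from this image.",
    "YOUR_AGENT_ID",
    "YOUR_THREAD_ID",
    image_urls=["https://example.com/image.png"],
)
follow_up = build_chat_payload(
    "Which value in it is the largest?",
    "YOUR_AGENT_ID",
    "YOUR_THREAD_ID",
)

# The serialized string is what would be sent as the request body.
print(json.dumps(first, indent=2))
```

Serializing with `json.dumps` rather than writing the body by hand avoids small syntax slips such as a missing comma between fields.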

Response format:

{
  "success": true,
  "response": {
    "data": {
      "id": "RESPONSE_ID",
      "content": "IMAGE_ANALYSIS_RESPONSE_TEXT",
      "model": "MODEL_NAME",
      "role": "assistant",
      "finish_reason": "completed",
      "tools_data": {},
      "images": [
        "IMAGE_URL"
      ],
      "annotations": [],
      "fallback": false,
      "firstAttemptError": "",
      "message_id": "MESSAGE_ID"
    },
    "usage": {
      "total_tokens": TOTAL_TOKENS,
      "input_tokens": INPUT_TOKENS,
      "output_tokens": OUTPUT_TOKENS,
      "cached_tokens": 0,
      "cache_read_input_tokens": 0,
      "cache_creation_input_tokens": 0,
      "reasoning_tokens": 0,
      "cost": REQUEST_COST
    }
  }
}
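As an illustration of reading this structure, here is a small Python sketch that pulls out the fields a caller is most likely to need. The function name `summarize_response` is hypothetical, and the sample values are made up for the example; only the key names follow the response format above.

```python
def summarize_response(body):
    """Extract the answer text and usage figures from a parsed response.

    Hypothetical helper: key names follow the response format above.
    """
    if not body.get("success"):
        raise ValueError("request did not succeed")
    response = body["response"]
    data = response["data"]
    usage = response.get("usage", {})
    return {
        "text": data["content"],
        "images": data.get("images", []),
        "total_tokens": usage.get("total_tokens"),
        "cost": usage.get("cost"),
    }


# Made-up sample values, shaped like the documented response.
sample = {
    "success": True,
    "response": {
        "data": {
            "id": "RESPONSE_ID",
            "content": "The chart shows revenue by quarter.",
            "role": "assistant",
            "images": ["https://example.com/image.png"],
        },
        "usage": {"total_tokens": 120, "input_tokens": 100, "output_tokens": 20, "cost": 0.001},
    },
}

summary = summarize_response(sample)
print(summary["text"])
```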

Option 2: Upload Image Directly via GTWY API

If the image isn’t hosted publicly, you can upload it directly to GTWY.

curl --location 'https://api.gtwy.ai/files/upload' \
--header 'pauthkey: YOUR_GENERATED_PAUTHKEY' \
--form 'file=@"path/to/your/image.png"' \
--form 'agent_id="YOUR_AGENT_ID"' \
--form 'thread_id="YOUR_THREAD_ID"'

Response format:

{
  "success": true,
  "file_url": "https://resources.gtwy.ai/uploads/example.png"
}

Use the returned file_url in your main request:

"user_urls": [{
    "type": "image",
    "url": "https://resources.gtwy.ai/uploads/example.png"
}]
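The hand-off from the upload step to the main request can be sketched in Python as follows. This is an illustrative sketch only: `image_entry_from_upload` is a hypothetical helper, and it assumes the upload response has already been parsed into a dictionary with the `success` and `file_url` fields shown above.

```python
def image_entry_from_upload(upload_response):
    """Turn a parsed upload response into a user_urls entry.

    Hypothetical helper: assumes the response shape documented above.
    """
    file_url = upload_response.get("file_url")
    if not upload_response.get("success") or not file_url:
        raise ValueError("upload did not return a file_url")
    return {"type": "image", "url": file_url}


# Parsed upload response, shaped like the documented example.
upload_response = {
    "success": True,
    "file_url": "https://resources.gtwy.ai/uploads/example.png",
}

# Body for the main chat completion request, reusing the returned URL.
payload = {
    "user": "Explain the insights from this image.",
    "agent_id": "YOUR_AGENT_ID",
    "thread_id": "YOUR_THREAD_ID",
    "response_type": "text",
    "variables": {},
    "user_urls": [image_entry_from_upload(upload_response)],
}
```

Checking `success` before building the entry keeps a failed upload from silently producing a request with no usable image URL.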

For high-resolution images or multi-image analysis, processing may take longer than one minute.

GTWY supports webhook-based asynchronous responses, which deliver results to your custom endpoint so the calling request does not have to wait for processing to finish.

Common Use Cases

GTWY’s Image Support enables automation for any workflow involving visual understanding:

Extracting text from scanned images (OCR)

Analyzing invoices, receipts, and bills

Understanding charts, graphs, and dashboards

Reviewing UI screenshots and product designs

Interpreting diagrams, flowcharts, and schematics

Analyzing photos for objects, patterns, or defects

Answering questions from posters, slides, or infographics

Verifying documents using photographed IDs or forms

Key Benefits

Instant visual understanding: No manual inspection required

Accurate OCR & interpretation: Extract text, numbers, and structure

Context-aware reasoning: Ask questions about what’s inside the image

Multimodal intelligence: Combine images with prompts, variables, or documents

Usage Considerations

Image Size & Limits

Maximum combined content size: 32 MB per request

Multiple images are supported in a single request, provided the request stays within the size limit above
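A client can check the combined size before sending. The Python sketch below is a rough pre-flight check under stated assumptions: it reads "32MB" as 32 × 1024 × 1024 bytes and counts raw file bytes only, while the way GTWY actually measures the limit is not specified here. The names `MAX_COMBINED_BYTES`, `combined_size`, and `fits_in_one_request` are hypothetical.

```python
import os

# Assumption: "32MB" is read as 32 * 1024 * 1024 bytes.
MAX_COMBINED_BYTES = 32 * 1024 * 1024


def combined_size(paths):
    """Return the total size in bytes of the given local files."""
    return sum(os.path.getsize(p) for p in paths)


def fits_in_one_request(sizes_in_bytes, limit=MAX_COMBINED_BYTES):
    """Return True if the sizes together stay within the limit."""
    return sum(sizes_in_bytes) <= limit


MIB = 1024 * 1024
print(fits_in_one_request([10 * MIB, 20 * MIB]))  # 30 MiB in total
print(fits_in_one_request([20 * MIB, 13 * MIB]))  # 33 MiB in total
```

For local files, `fits_in_one_request([combined_size(paths)])` gives the same check from a list of paths. Since prompts and other request content may also count toward the limit, it is safer to treat a passing check as necessary rather than sufficient.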

Summary

GTWY’s Image Support allows AI agents to see, understand, extract, and reason over images — turning visual data into intelligent outputs across business, finance, legal, product, design, operations, and education.

Images stop being static.

They become sources of insight.
