Qwen3-VL-235B-A22B-T

Pricing

API Pricing (USD)

Type Cost
Request $0.0048/request

Points-based Pricing

Type Cost
Total Cost 160 points/message
Initial Points Cost 160 points
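
The two pricing tables above can be turned into a quick budget estimate. The following is a minimal sketch, assuming pricing is flat per request exactly as listed and that both tables describe the same request; the names `estimate_cost` and `USD_PER_POINT` are hypothetical helpers for illustration, not part of any published API.

```python
from decimal import Decimal

# Figures copied from the pricing tables above.
USD_PER_REQUEST = Decimal("0.0048")  # API pricing, USD per request
POINTS_PER_MESSAGE = 160             # points-based pricing, per message


def estimate_cost(num_messages: int) -> dict:
    """Estimate the USD and points cost of sending num_messages requests.

    Assumes a flat per-request price with no volume discounts or
    per-token surcharges (an assumption; the listing states only a flat rate).
    """
    if num_messages < 0:
        raise ValueError("num_messages must be non-negative")
    return {
        "usd": USD_PER_REQUEST * num_messages,
        "points": POINTS_PER_MESSAGE * num_messages,
    }


# Implied USD value of one point, if both tables price the same request.
USD_PER_POINT = USD_PER_REQUEST / POINTS_PER_MESSAGE

if __name__ == "__main__":
    print(estimate_cost(1000))
    print(USD_PER_POINT)
```

Decimal is used instead of float so that small per-request prices do not accumulate rounding error over many requests.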

Last Checked: 2025-10-15 16:47:21.461168

Bot Information

Creator: @novitaai

Description: Qwen3-VL is the most advanced vision-language model in the Qwen series, offering enhanced text understanding, visual reasoning, spatial perception, and agent capabilities. It supports Dense/MoE architectures and Instruct/Thinking editions for versatile deployment.

Key Features:
- Visual Agent: Operates GUIs, recognizes elements, invokes tools, and completes tasks.
- Coding Boost: Generates Draw.io, HTML, CSS, and JS from images/videos.
- Spatial Perception: Enables 2D/3D reasoning with strong object positioning and occlusion analysis.
- Long Context: Processes up to 1M tokens for books or long videos.
- Multimodal Reasoning: Excels in STEM, math, causal analysis, and evidence-based answers.
- Visual Recognition: Recognizes a wide range of objects, landmarks, and more.
- OCR: Supports 32 languages with improved performance in challenging conditions.
- Text-Vision Fusion: Achieves seamless, unified comprehension.

Ideal for multimodal reasoning, spatial analysis, and integrated text-vision tasks.

Technical Specifications

File Support: Image, Video, PDF, and Markdown files

Context window: 128k tokens

Extra: Powered by a server managed by @novitaai.

Architecture

Input Modalities: text

Output Modalities: text

Modality: text->text

Technical Details

Model ID: Qwen3-VL-235B-A22B-T

Object Type: model

Created: 1758695878297

Owned By: poe

Root: Qwen3-VL-235B-A22B-T

API Last Updated: 2025-10-15 16:36:09.625241
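
The Technical Details fields above can be held in a small typed record, which also makes the Created value readable. This is a minimal sketch, assuming Created is a Unix timestamp in milliseconds (inferred from its 13-digit magnitude, not stated in the listing); the class name `ModelRecord` and its snake_case field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ModelRecord:
    """Hypothetical container for the Technical Details fields."""
    model_id: str
    object_type: str
    created_ms: int  # assumed to be Unix epoch milliseconds
    owned_by: str
    root: str

    @property
    def created_at(self) -> datetime:
        """Convert created_ms to a timezone-aware UTC datetime."""
        # Split with integer arithmetic to avoid float rounding of the ms part.
        seconds, millis = divmod(self.created_ms, 1000)
        base = datetime.fromtimestamp(seconds, tz=timezone.utc)
        return base + timedelta(milliseconds=millis)


# Values copied from the Technical Details section above.
record = ModelRecord(
    model_id="Qwen3-VL-235B-A22B-T",
    object_type="model",
    created_ms=1758695878297,
    owned_by="poe",
    root="Qwen3-VL-235B-A22B-T",
)

if __name__ == "__main__":
    print(record.created_at.isoformat())
```

Under the milliseconds assumption, the Created value corresponds to 2025-09-24 in UTC, about three weeks before the API Last Updated timestamp.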