
Your Guide to Qwen-Agent: Build Powerful AI Agents with Tools & RAG

05 May 2026 | 35 min read

In this guide, we will walk through Qwen-Agent and how to use the qwen-agent framework to build AI agents with tools, RAG, Code Interpreter workflows, and external web data.

Here's the thing: calling a model is easy, but building an agent around it gets messy fast. Once you add function calling, custom tools, document retrieval, code execution, browser-style workflows, and state management, a small demo can turn into a homemade framework you now have to maintain.

Qwen-Agent gives developers a cleaner Python structure built around LLMs, Tools, and Agents. It is designed for Qwen models served through DashScope, but it also works with OpenAI-compatible backends such as vLLM and Ollama.

In this article you'll learn how Qwen-Agent works, when it makes sense to use it, how to get started with a basic Assistant, and how to extend it with custom tools. We will also look at where ScrapingBee fits in: as a managed web extraction layer for agents that need reliable page content without dealing with JavaScript rendering, proxy rotation, anti-bot issues, and brittle scraping code.


TL;DR

Qwen-Agent is an open-source Python framework for building AI agents with Qwen models. It gives you a structured way to combine LLMs, tools, memory, RAG, Code Interpreter workflows, and multi-step reasoning instead of wiring everything together by hand.

Use it when you need more than a raw model call: document QA, coding agents, research assistants, browser-style agents, data workflows, or internal automation tools.

For web-based agents, ScrapingBee can act as the web data tool behind Qwen-Agent, handling JavaScript rendering, proxies, anti-bot problems, and structured extraction.

If you are also exploring AI agents and web data extraction, check out our Crawl4AI tutorial. It covers how to install Crawl4AI, build your first crawler, extract structured data, and use smarter crawling patterns for LLM and agent workflows.

The AI agent problem: Why "just calling the model" isn't enough

Calling an LLM API is the easy part. You send a prompt, get a response, and the first prototype feels almost too simple. Then you try to build an actual agent.

Now the model needs to call tools, keep track of previous steps, read files, fetch data, run code, recover from bad outputs, and keep working through a task without falling apart halfway through. That is where a small "LLM wrapper" starts turning into a framework you never planned to maintain.

Why DIY agent stacks fall short

Most developers start with raw model APIs. That is usually the right move for a prototype: minimal code, full control, quick feedback. The problems show up when the prototype grows into something useful.

You add one function call. Then another. Then document search. Then memory. Then retries because the model passed strange arguments. Then logging because nobody knows why the agent picked the wrong tool. Then a web data source, which brings its own little box of snakes.

Before long, the codebase has a few familiar problems:

  • Orchestration gets messy. You are manually chaining prompts, tool calls, retries, and follow-up model calls.
  • State becomes fragile. Conversation history, user context, retrieved documents, and tool outputs all need to be tracked and passed around.
  • Tool handling gets repetitive. Every function needs schemas, validation, error handling, and response formatting.
  • RAG becomes a subsystem. Parsing, chunking, retrieval, ranking, and context trimming are not free.
  • Multi-step workflows create branching logic. Once the agent needs to inspect results and decide what to do next, simple scripts start growing weird edges.
  • Feature creep becomes permanent. The "quick demo" slowly becomes an in-house agent framework with no docs and one brave maintainer.

Raw model APIs are still great for narrow workflows. But once you need tool use, memory, retrieval, code execution, and multi-step reasoning, most of the work moves outside the model call.

That outside layer is the real agent problem.

The current LLM framework landscape

Developers usually deal with that problem in a few ways.

  • General-purpose orchestration frameworks give you a large set of components for building LLM apps. They can cover agents, tools, retrieval, workflows, integrations, and data pipelines. That breadth is useful, but it can also bring heavy abstractions when you only want to build one focused agent.
  • Direct model APIs are the opposite. They are clean, explicit, and easy to reason about. You call the model, get a response, and decide what happens next. The downside is obvious once the app grows: all orchestration belongs to you.
  • Custom in-house frameworks are the natural endpoint for teams that hit the same problems over and over. They can fit your stack nicely, but they are expensive. You now own the abstractions, the edge cases, the migration path, and every "temporary" helper that became load-bearing.

So there is a gap between raw API calls and big orchestration platforms.

A useful agent framework should give developers enough structure for real workloads without hiding everything behind magic. It should have clear places for models, tools, memory, retrieval, and code execution. It should make common agent patterns easier, but still feel like normal Python.

That is where focused frameworks start to make sense.

Enter Qwen-Agent: A framework built for Qwen-based agents

Qwen-Agent is an official framework from the Qwen team for building LLM applications around the strengths of Qwen models: instruction following, tool usage, planning, and memory. It is also used as the backend of Qwen Chat, which is a useful signal that the framework is built around real application patterns, not only isolated demos.

The main idea is to remove the usual glue-code pain. Instead of building your own agent loop from scratch, Qwen-Agent gives you a cleaner architecture for:

  • connecting to an LLM
  • defining tools
  • handling function calls
  • managing conversation and task context
  • adding files and RAG workflows
  • running code through a Code Interpreter
  • wiring the whole thing into an agent workflow

It also comes with practical pieces already included, such as RAG support, MCP integration, a Docker-based Code Interpreter, and browser-assistant examples such as BrowserQwen.

You still decide what your agent should do, define the tools, and own the product logic. But Qwen-Agent gives those pieces a proper home, so you are not rebuilding the same agent scaffolding from raw API calls every time.

Check out our tutorial on BrowserUse, which covers how to use AI agents for scraping.

What is Qwen-Agent? Architecture & philosophy

Qwen-Agent is an open-source, Apache-2.0 Python framework for building LLM agents with Qwen models.

The key thing: it is not just a pile of helper functions around a model API. The qwen-agent framework gives you a structure for building applications where a model can use tools, read files, work with retrieval, run code, and keep track of a multi-step task.

So instead of stopping at "I called the model," you get closer to "I built an agent that can actually do useful work."

Overview

At a high level, Qwen-Agent is built for developers who want a proper way to assemble agentic applications without inventing the whole orchestration layer from scratch. You can use it to define:

  • which model your agent should use
  • what tools the agent can call
  • what files or documents it can work with
  • what system instructions it should follow
  • how the agent should handle a conversation or task

That gives you a more practical starting point than writing your own agent loop by hand. You still control the product logic, but Qwen-Agent handles a lot of the common wiring around model calls, tool execution, message history, and final responses.

It is also built around real application patterns, not just tiny prompt demos. The project includes support for RAG, Code Interpreter workflows, MCP integration, and browser-assistant examples such as BrowserQwen.

Core architecture: LLMs, Tools, and Agents

The easiest way to understand Qwen-Agent is to think in three layers:

  • the LLM is the brain
  • tools are the hands
  • the agent is the part that coordinates the work

Perhaps not a perfect metaphor, but good enough. We are building software, not performing brain surgery.

LLMs

The LLM layer wraps the model your agent talks to. In Qwen-Agent, model configuration is usually passed through an llm_cfg dictionary. You can use DashScope, or point the framework at an OpenAI-compatible endpoint if you are serving a model through something like vLLM, SGLang, or Ollama.

The LLM layer handles the chat interface and supports things like streaming responses, function calling, and generation settings.

A typical DashScope config looks like this:

import os

llm_cfg = {
    "model": "qwen-max-latest",
    "model_type": "qwen_dashscope",
    "api_key": os.getenv("DASHSCOPE_API_KEY"),
    "generate_cfg": {
        "top_p": 0.8,
    },
}

That config tells Qwen-Agent which model to use, which backend to call, and which generation options to apply.
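If you serve a model yourself through vLLM, SGLang, or Ollama, the same shape works with an OpenAI-compatible endpoint. A sketch, assuming a local server on port 8000 serving a model named Qwen2.5-7B-Instruct (adjust both to your setup):

```python
# llm_cfg for an OpenAI-compatible endpoint (e.g. vLLM or Ollama).
# The base URL and model name below are assumptions; match them
# to whatever your local server actually exposes.
llm_cfg = {
    "model": "Qwen2.5-7B-Instruct",
    "model_server": "http://localhost:8000/v1",  # OpenAI-compatible base URL
    "api_key": "EMPTY",  # many local servers accept any placeholder key
    "generate_cfg": {
        "top_p": 0.8,
    },
}
```

Swapping backends is then just swapping this dictionary; the Assistant code stays the same.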

Tools

Tools are external capabilities the agent can use. A tool can be almost anything:

  • an internal API
  • a database query
  • a web scraper
  • a file operation
  • a calculator
  • a Python utility
  • a ScrapingBee extraction call

In Qwen-Agent, tools are described with a name, description, and parameters. That description matters because the model uses it to decide when the tool is useful and what arguments to pass.

A good tool definition should make two things obvious:

  • what the tool does
  • what input it expects

This is where agents start to become more interesting than plain chatbots. The model is no longer limited to generating text from static context. It can call a service, inspect the result, and continue the task with fresh information.

Agents

The agent layer coordinates the flow. An agent receives messages, talks to the LLM, decides when tools are needed, executes those tools, feeds the results back into the conversation, and eventually returns an answer.

Qwen-Agent provides ready-made agent implementations such as Assistant for common workflows. An Assistant can work with instructions, tools, chat history, and files for RAG-style tasks.

So instead of writing this loop yourself:

  1. send prompt to model
  2. check if it wants a tool
  3. parse tool arguments
  4. call the tool
  5. handle errors
  6. add tool output to messages
  7. call the model again
  8. repeat until done

You define the agent and let Qwen-Agent manage the boring orchestration parts. That does not mean the framework magically designs your app for you. You still need to define useful tools, write clear instructions, and decide what the workflow should do. But Qwen-Agent gives the messy pieces a proper place to live.
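For contrast, here is roughly what that loop looks like when you write it by hand. Everything below is an illustrative stub, not a real model or tool API; it just makes the shape of the orchestration visible:

```python
import json

# Stand-in for a model call: returns either a tool request or a final answer.
def call_model(messages):
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "add", "arguments": json.dumps({"a": 2, "b": 3})}
    return {"content": "The answer is 5."}

# Stand-in tool registry: one toy "add" tool.
def run_tool(name, arguments):
    args = json.loads(arguments)
    if name == "add":
        return str(args["a"] + args["b"])
    raise ValueError(f"unknown tool: {name}")

def manual_agent_loop(user_query):
    messages = [{"role": "user", "content": user_query}]
    while True:
        reply = call_model(messages)                          # ask the model
        if "tool" not in reply:                               # done: no tool requested
            return reply["content"]
        result = run_tool(reply["tool"], reply["arguments"])  # parse args, run tool
        messages.append({"role": "tool", "content": result})  # feed result back

print(manual_agent_loop("What is 2 + 3?"))
```

Every branch of this loop (argument parsing, error handling, history updates) is code you would otherwise maintain yourself; Qwen-Agent owns it for you.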

Why it's different: A focused, application-oriented framework

Qwen-Agent is not just a thin client around a model API. It packages patterns that show up in real agent applications:

  • function calling
  • multi-step tool use
  • document-based question answering
  • code execution
  • browser-style workflows
  • MCP-based tool integration

It's also Qwen-first. The framework is designed around the way Qwen models handle instruction following, tool calls, planning, and reasoning. That makes it a natural fit if you are already working with Qwen models or evaluating them for agent workloads.

Another useful signal: Qwen-Agent is used as the backend of Qwen Chat. That doesn't mean every app you build with it is automatically production-ready. (Please do not ship chaos and blame the framework.) But it does show that the core ideas are used beyond toy examples.

The philosophy is pretty straightforward:

  • keep the developer close to Python
  • provide reusable components for common agent patterns
  • make tools and retrieval first-class
  • avoid forcing every project to reinvent the same orchestration layer

That is why Qwen-Agent feels less like "yet another model wrapper" and more like a proper app framework for Qwen-based agents.

Code example: Minimal Assistant with a Qwen model

Let's start with the smallest useful Qwen-Agent example: one Assistant, one Qwen model, and a simple terminal chat loop.

First, install Qwen-Agent:

pip install -U qwen-agent

Then set your DashScope API key:

export DASHSCOPE_API_KEY="your_api_key_here"

Now create a minimal assistant:

import os

from qwen_agent.agents import Assistant
from qwen_agent.utils.output_beautify import typewriter_print


llm_cfg = {
    "model": "qwen-max-latest",
    "model_type": "qwen_dashscope",
    "api_key": os.getenv("DASHSCOPE_API_KEY"),
    "generate_cfg": {
        "top_p": 0.8,
    },
}

bot = Assistant(
    llm=llm_cfg,
    system_message="You are a helpful developer assistant.",
)

messages = []

while True:
    query = input("\nUser: ")

    if query.strip().lower() in {"exit", "quit"}:
        break

    messages.append({
        "role": "user",
        "content": query,
    })

    response = []  # will hold the most recent streamed message list
    response_text = ""

    print("Assistant:")

    # bot.run streams the full response so far on each step;
    # typewriter_print prints only the newly added text.
    for response in bot.run(messages=messages):
        response_text = typewriter_print(response, response_text)

    # Append the final assistant messages to the chat history.
    messages.extend(response)

This example doesn't add tools yet, and that's intentional. It gives you the basic shape first: configure the model, create an Assistant, keep a message history, and stream responses from the agent.

Once this works, you can start adding custom tools, files for RAG, Code Interpreter support, or a ScrapingBee-powered web extraction tool.

You might also be interested in checking our Scrapegraph AI Tutorial: Scrape websites easily with LLaMA AI.

Key features: What Qwen-Agent gives you out of the box

The qwen-agent framework is useful because it does not stop at "send prompt, get text." It gives you the pieces you usually need when building real AI agents. Some of these are built-in components; others are official examples or patterns you can extend. Either way, the point is the same: less homemade orchestration, more time spent on the actual agent logic.

Function calling and custom tools

Tool calling is one of the main reasons to use Qwen-Agent instead of a bare model API.

With raw API calls, you usually have to build the loop yourself:

  1. ask the model what to do
  2. detect the tool call
  3. parse the arguments
  4. run the function
  5. pass the result back
  6. ask the model to continue
  7. repeat until your soul leaves your body

Qwen-Agent gives you a better tool system around this. You define tools with a name, description, and parameters, then add them to the agent through function_list. A tool can be almost anything your agent needs:

  • a database lookup
  • a weather API
  • a local Python utility
  • a document parser
  • a web scraper
  • a ScrapingBee extraction call
  • an internal business API
  • a small helper function that saves you from writing yet another prompt hack

The important part is the tool description. The model uses it to decide when the tool is useful and what arguments to send. Good descriptions matter; bad descriptions produce cursed function calls. This is known science.

Qwen-Agent supports custom tools through BaseTool and the @register_tool decorator. Here is a shortened version of the official custom image-generation example.

One note before the code: this example also adds code_interpreter to function_list, so make sure you install the Code Interpreter extra and have Docker running if you want that part to work.

pip install -U "qwen-agent[code_interpreter]"

Now the example itself:
import json
import os
import urllib.parse

import json5
from qwen_agent.agents import Assistant
from qwen_agent.tools.base import BaseTool, register_tool


@register_tool("my_image_gen")
class MyImageGen(BaseTool):
    description = (
        "Generate an image from a text prompt and return the image URL."
    )

    parameters = [
        {
            "name": "prompt",
            "type": "string",
            "description": "Detailed image description in English.",
            "required": True,
        }
    ]

    def call(self, params: str, **kwargs) -> str:
        prompt = json5.loads(params)["prompt"]
        encoded_prompt = urllib.parse.quote(prompt)

        return json.dumps(
            {
                "image_url": f"https://image.pollinations.ai/prompt/{encoded_prompt}"
            },
            ensure_ascii=False,
        )


llm_cfg = {
    "model": "qwen-max-latest",
    "model_type": "qwen_dashscope",
    "api_key": os.getenv("DASHSCOPE_API_KEY"),
}

bot = Assistant(
    llm=llm_cfg,
    system_message=(
        "You can generate images and use Python when needed. "
        "First call the image generation tool, then use code if the user asks "
        "for analysis, downloading, or processing."
    ),
    function_list=[
        "my_image_gen",
        "code_interpreter",
    ],
)

messages = [
    {
        "role": "user",
        "content": "Generate an image of a corgi astronaut, then write Python code to download it.",
    }
]

for response in bot.run(messages=messages):
    print(response)

The nice bit here is that my_image_gen is your custom tool, while code_interpreter is a built-in Qwen-Agent tool. The agent can combine them in one workflow: generate the image URL, then use Python to process or download the result.

The same pattern works for boring but useful production tasks too. Replace my_image_gen with scrape_product_page, query_customer_db, or extract_page_with_scrapingbee, and you have an agent that can act on real data instead of just vibes.
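As a sketch of what such a web-extraction tool could call under the hood, here is a plain function pair that talks to the ScrapingBee HTML API with urllib. The endpoint and the `api_key`, `url`, and `render_js` parameters follow ScrapingBee's public API; a tool's `call` method would simply parse its JSON params and return `fetch_page`'s result as a string:

```python
import urllib.parse
import urllib.request

SCRAPINGBEE_ENDPOINT = "https://app.scrapingbee.com/api/v1/"

def build_scrapingbee_url(api_key, target_url, render_js=True):
    """Build the ScrapingBee request URL for a target page."""
    params = {
        "api_key": api_key,
        "url": target_url,
        # render_js asks ScrapingBee to load the page in a headless browser.
        "render_js": "true" if render_js else "false",
    }
    return SCRAPINGBEE_ENDPOINT + "?" + urllib.parse.urlencode(params)

def fetch_page(api_key, target_url):
    """Fetch rendered HTML through ScrapingBee (makes a network call)."""
    with urllib.request.urlopen(build_scrapingbee_url(api_key, target_url)) as resp:
        return resp.read().decode("utf-8")
```

Wrapping this in a `BaseTool` subclass works exactly like `my_image_gen` above: describe the `url` parameter in the schema, and let the agent decide when a page fetch is worth it.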

Built-in Code Interpreter

Qwen-Agent includes a Docker-based Code Interpreter tool that lets agents write and run Python code. That matters because many tasks are awkward to solve with text generation alone. Sometimes the model should not guess: it should calculate.

Good use cases include:

  • data analysis
  • CSV inspection
  • chart generation
  • math-heavy tasks
  • file processing
  • quick transformations
  • checking assumptions with actual code

For example, if the user uploads a spreadsheet and asks for a trend, the agent can use Python instead of trying to eyeball the answer from raw rows. Much better — fewer hallucinated charts from the shadow realm.
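The code the agent writes for a "what's the trend?" question often boils down to something like a least-squares slope. A stdlib sketch over made-up monthly sales numbers:

```python
# Toy data: monthly sales figures (invented for illustration).
sales = [120, 135, 128, 150, 161, 158, 170]

n = len(sales)
xs = range(n)
mean_x = sum(xs) / n
mean_y = sum(sales) / n

# Least-squares slope: positive means an upward trend.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, sales)) / sum(
    (x - mean_x) ** 2 for x in xs
)

print(f"trend: {'up' if slope > 0 else 'down'} ({slope:.2f} units/month)")
```

The point is not this particular snippet; it is that the agent can run it inside the container and report a number instead of guessing one.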

To use the Code Interpreter support, install Qwen-Agent with the optional dependency:

pip install -U "qwen-agent[code_interpreter]"

You also need Docker running, because the tool executes code inside a container.

That container gives you a safer execution environment than running arbitrary model-written Python directly on your machine. Still, do not treat it as a magic production security boundary. The Qwen-Agent README says the Docker-based Code Interpreter mounts only the specified working directory and provides basic sandbox isolation, but should still be used with caution in production.

So the rule is:

  • great for local development, controlled internal tools, demos, and trusted workflows
  • needs real security review before you expose it to untrusted users at scale

As usual, the agent is smart, but it's not your security team.

Long-document RAG and 1M-token QA examples

RAG is another area where Qwen-Agent tries to save you from rebuilding the same machinery again. A basic document QA system sounds easy:

  1. load documents
  2. retrieve relevant chunks
  3. pass them to the model
  4. answer the question

Then real files show up and ruin the party. Large PDFs. Mixed formats. Long reports. Multiple documents. Context window limits. Weird tables. Duplicate sections. Retrieval misses. Token budgets. That one 400-page PDF someone exported from PowerPoint because apparently we live like this.

Qwen-Agent includes RAG support and examples for document question answering, including workflows aimed at very long documents.

The important nuance: this does not mean every basic Qwen-Agent RAG setup magically handles 1M-token inputs by default. The Qwen team has released specific long-document QA approaches, including a fast RAG solution and a more expensive but competitive agent, with reported results on 1M-token needle-in-the-haystack style tests.

So the takeaway is not "throw infinite text at the model and hope." Please do not do that. The idea is that Qwen-Agent gives you starting points for:

  • asking questions over large documents
  • splitting work across long files
  • retrieving relevant context
  • reducing manual context-window juggling
  • building assistants over PDFs, docs, reports, and mixed corpora

This is especially handy for internal knowledge assistants, research tools, legal or finance document review, technical documentation QA, and any workflow where "just paste the document into the prompt" stops working after page 12.
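The retrieve-then-answer pattern itself is small; a toy keyword-overlap retriever makes the shape clear. Real RAG uses embeddings, chunking, and ranking, which is exactly the machinery Qwen-Agent saves you from rebuilding:

```python
def score(chunk, query):
    """Crude relevance: count query words that appear in the chunk."""
    chunk_words = set(chunk.lower().split())
    return sum(w in chunk_words for w in query.lower().split())

def retrieve(chunks, query, k=2):
    """Return the k highest-scoring chunks for the query."""
    return sorted(chunks, key=lambda c: score(c, query), reverse=True)[:k]

chunks = [
    "The invoice total is due within 30 days of receipt.",
    "Our refund policy covers unopened items for 90 days.",
    "Shipping is free for orders above 50 euros.",
]

# The selected chunks would then be passed to the model as context.
top = retrieve(chunks, "when is the invoice due", k=1)
print(top[0])
```

Everything hard about RAG lives in making `score` and `retrieve` good; the framework's job is to give those steps a standard place in the agent workflow.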

BrowserQwen: A browser assistant built on Qwen-Agent

BrowserQwen is an example browser assistant built on top of Qwen-Agent. It shows what an agent can look like when it is connected to real browsing context instead of sitting in a blank chat box. The project lives in the Qwen-Agent GitHub repository and runs as a Chrome browser extension backed by a local service.

Check out our tutorial on Lightpanda: The Headless Browser Built for AI Agents and Scalable Automation.

BrowserQwen can work with the current webpage or PDF, keep track of browsed pages and documents, summarize browsing content, help with writing tasks, and use plugins such as Code Interpreter for math and data visualization.

That makes it a useful reference if you want to build agents for:

  • web research
  • page summarization
  • multi-page QA
  • browser-based workflows
  • document reading
  • writing assistants
  • lightweight automation

It's also a good reminder that "web agent" does not just mean "call search and summarize the top result." Real browser agents need to read pages, keep context, handle multiple sources, and work with messy web content.

This is also where ScrapingBee can become useful in your own Qwen-Agent apps. BrowserQwen demonstrates the browser-assistant pattern. ScrapingBee gives you a managed way to fetch and extract web content when you want that pattern to survive real websites, JavaScript rendering, proxy issues, and anti-bot systems.

MCP integration for standardized tools

Qwen-Agent also supports MCP, short for Model Context Protocol. MCP is a standard way to connect models and agents to external tools and services. Instead of wiring every integration by hand, you can expose capabilities through MCP servers and let the agent use them as tools.

That can mean connecting a Qwen-Agent app to things like:

  • filesystem access
  • memory services
  • SQLite
  • GitHub
  • fetch tools
  • internal utilities
  • other structured tool servers

The benefit is standardization. Your agent does not need a one-off integration style for every external capability. MCP gives tools a cleaner interface, and Qwen-Agent can load MCP servers through its function_list configuration.

A simplified MCP-style config might look like this:

import os

from qwen_agent.agents import Assistant


# The MCP example assumes a DashScope-backed model config;
# any valid llm_cfg works here.
llm_cfg = {
    "model": "qwen-max-latest",
    "model_type": "qwen_dashscope",
    "api_key": os.getenv("DASHSCOPE_API_KEY"),
}

mcp_config = {
    "mcpServers": {
        "filesystem": {
            "command": "npx",
            "args": [
                "-y",
                "@modelcontextprotocol/server-filesystem",
                "./workspace",
            ],
        },
        "sqlite": {
            "command": "uvx",
            "args": [
                "mcp-server-sqlite",
                "--db-path",
                "test.db",
            ],
        },
    }
}

bot = Assistant(
    llm=llm_cfg,
    function_list=[mcp_config],
)

This assumes the required MCP server runtimes are installed. For example, the filesystem server uses npx, while the SQLite example uses uvx.

That makes MCP a good fit when your agent needs a growing set of structured external capabilities but you do not want each one to become a custom integration adventure. So, between function calling, custom tools, Code Interpreter, RAG, BrowserQwen, and MCP, Qwen-Agent gives developers a strong base for building agents that do actual work.

Performance & robustness considerations

There are no "Qwen-Agent is 7.3x faster than X" numbers worth throwing around here. And honestly, that isn't the best way to think about the qwen-agent framework. Qwen-Agent is not a browser engine, a vector database, or an inference server. Its value is in orchestration: helping your AI agent call tools, manage context, run code, use retrieval, and move through a workflow without you building every piece by hand.

So the question is not "how fast is Qwen-Agent in a vacuum?" The better question is: where does it reduce wasted work, broken flows, and painful architecture decisions?

Practical performance: where Qwen-Agent helps

In agent applications, performance is not just raw model latency. A slow or unreliable agent often wastes time because the workflow around the model is messy:

  • it calls the wrong tool
  • it repeats the same step
  • it dumps too much text into context
  • it loses track of previous tool results
  • it needs extra model rounds to recover from bad outputs
  • it relies on a giant prompt that nobody wants to maintain

Qwen-Agent helps by giving these moving parts a clearer structure.

First, tool-call orchestration becomes more organized. The framework supports advanced tool-calling patterns, including parallel, multi-step, and multi-turn function calls where the model and backend setup allow it. That matters because real agents often need more than one tool call to finish a task. A research agent might search, fetch, extract, summarize, and then call another tool before answering.

Second, the RAG patterns are more practical than "just paste the whole PDF into the prompt." Qwen-Agent includes document QA and long-context RAG workflows, which helps reduce the usual context-window hacks. Instead of treating the prompt like a landfill, you can build a flow that retrieves and passes relevant information more deliberately.

Third, the separation between LLMs, Tools, and Agents reduces architecture churn. You can adjust one layer without rewriting the entire app every time. That does not make bad design disappear, but it gives you fewer places to hide it.

It means:

  • fewer custom agent loops
  • less duplicated tool-handling code
  • cleaner message history handling
  • more reusable RAG and Code Interpreter workflows
  • easier experiments with DashScope or OpenAI-compatible backends
  • less "why is this helper function now the core of our product?" energy

Reliability & testing

Robustness matters more than clever demos. An AI agent that works once in a notebook but falls apart with real users is not an agent. It is a haunted autocomplete box.

Qwen-Agent has a few useful reliability signals.

It is maintained by the Qwen team and used as the backend of Qwen Chat, so the framework is tied to real application work rather than only isolated examples. The official repository also includes demos and examples around Qwen3.5 Agent, QwQ-32B tool calls, Qwen2.5-Math, Code Interpreter, RAG, MCP, and browser workflows.

The project has also been moving steadily. On PyPI, qwen-agent started with version 0.0.1 in April 2024 and has continued receiving releases, with version 0.0.34 released in February 2026. That is not a guarantee that every edge case is covered, but it is a decent sign that the framework is actively maintained.

For your own apps, you still need to test the boring parts:

  • tool schemas
  • argument validation
  • failed tool calls
  • retries and timeouts
  • long conversations
  • bad or missing documents
  • malformed API responses
  • prompt injection attempts
  • Code Interpreter permissions
  • web extraction failures
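Testing the boring parts can stay simple. A sketch with plain assertions against a hypothetical tool's argument handling (`parse_scrape_args` is invented for illustration; the point is that models sometimes send strange arguments, and those should fail loudly):

```python
import json

def parse_scrape_args(params):
    """Validate the JSON arguments a model might pass to a scrape tool."""
    args = json.loads(params)
    if "url" not in args or not args["url"].startswith("http"):
        raise ValueError("tool needs a valid 'url' argument")
    return args

# Well-formed arguments pass through untouched.
assert parse_scrape_args('{"url": "https://example.com"}')["url"] == "https://example.com"

# Malformed arguments must raise, not silently succeed.
for bad in ['{"url": "not-a-url"}', '{}']:
    try:
        parse_scrape_args(bad)
    except ValueError:
        pass
    else:
        raise AssertionError("bad arguments were accepted")
```

The same pattern extends to retries, timeouts, and malformed API responses: wrap the failure mode in a small function, then assert on both the happy path and the garbage path.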

This is especially important when your Qwen-Agent app depends on external data. A Qwen model can reason well and still fail if the tool gives it garbage. If your agent needs web pages, product data, search results, or public documents, the extraction layer becomes part of the reliability story.

That is why ScrapingBee fits naturally into this stack. Qwen-Agent handles the agent workflow, while ScrapingBee handles the messy web access layer: rendering JavaScript, rotating proxies, dealing with anti-bot systems, and returning cleaner extracted content.

A good production setup usually has all three sides covered:

  • Qwen-Agent for tool orchestration, context management, RAG, and code execution
  • ScrapingBee for reliable web data acquisition
  • your own tests and guardrails around the workflow

That combo is much closer to a useful agent system than "call the model and pray."

When to choose Qwen-Agent

Qwen-Agent is a good fit when you want to build AI agents around Qwen models without starting from a raw API call and slowly reinventing your own framework.

It is not the right tool for every LLM app (nothing is). But if your project needs tools, retrieval, code execution, or multi-step workflows, the qwen-agent framework gives you a practical starting point.

Ideal use cases

Qwen-Agent makes the most sense when your agent needs to do more than return one response to one prompt. Good use cases include:

  • agents that call multiple tools, such as APIs, databases, scrapers, or internal services
  • assistants that need Code Interpreter-style Python execution
  • document QA systems that use RAG over long files or mixed corpora
  • browser-style agents that read, summarize, or work with web content
  • research agents that combine search, extraction, reasoning, and final synthesis
  • data analysis workflows where the model needs to inspect files and run calculations
  • internal automation tools that need tool calls, task context, and repeatable workflows

It's also a strong choice for teams standardizing on the Qwen model family. If you are already using Qwen models, Qwen-Agent gives you the most natural framework for building on top of them. You get a project designed around Qwen's strengths in instruction following, tool usage, planning, and context-heavy workflows.

Another good fit: developers who want more structure than direct API calls, but do not want to spend a week learning a giant orchestration stack before building the first useful thing.

That is probably the sweet spot for Qwen-Agent:

  • more organized than raw model APIs
  • more focused than broad orchestration platforms
  • still close enough to Python that you can understand what is happening

Consider alternatives when...

Qwen-Agent is useful, but it is not mandatory for every project.

You may not need it if your app is simple. For example, if you only need single-turn completions, rewriting, classification, or a small chat feature with no tools and no retrieval, a raw model API is probably enough. No need to bring an agent framework to a knife fight.

You may also want to stay with another framework if your team is already deep into that ecosystem. If your app is built around existing LangChain or LlamaIndex pipelines, and you are not planning to use Qwen models heavily, switching just for the sake of switching may not be worth it.

Qwen-Agent may also be less ideal if your main requirement is broad multi-provider abstraction. Some teams want one framework to normalize many model providers, vector stores, workflow engines, evaluation layers, deployment targets, and observability tools. In that case, a more general-purpose platform might fit better.

Consider alternatives when:

  • you only need a basic model call
  • your stack is already standardized around another LLM framework
  • Qwen models are not part of your roadmap
  • you need deep abstractions across many providers and infrastructure layers
  • your agent architecture is so custom that a framework would mostly get in the way

Qwen-Agent vs. other tooling

The easiest comparison is Qwen-Agent vs. "just use the Qwen API."

The Qwen API gives you access to the model. Qwen-Agent gives you a way to build an application around the model.

With direct API calls, you manage the workflow yourself:

  • message history
  • tool selection
  • function-call handling
  • document retrieval
  • code execution
  • retries and follow-up calls
  • final response assembly

With Qwen-Agent, those pieces have a framework-level home. You still design the agent, but you are not writing the entire orchestration layer from zero.
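To make that list concrete, here is a minimal sketch of the orchestration loop you end up owning with direct API calls. The model and tool here are stubs so the example is self-contained; a real version would replace fake_model with an actual chat-completion request and TOOLS with your own dispatch table.

```python
import json

# Toy tool registry: the kind of dispatch table you maintain by hand.
TOOLS = {
    "get_time": lambda args: json.dumps({"time": "2026-05-05T12:00:00Z"}),
}

def fake_model(messages):
    """Stand-in for a chat-completion call. A real backend returns either
    plain text or a tool call; here we script both branches."""
    if messages[-1]["role"] == "user":
        return {"tool_call": {"name": "get_time", "arguments": "{}"}}
    return {"content": "It is noon UTC."}

def run_turn(messages):
    """One manual agent turn: call the model, dispatch any tool call,
    feed the result back, and repeat until a final answer arrives."""
    while True:
        reply = fake_model(messages)
        call = reply.get("tool_call")
        if call is None:
            return reply["content"]
        result = TOOLS[call["name"]](call["arguments"])
        messages.append({"role": "tool", "name": call["name"], "content": result})

answer = run_turn([{"role": "user", "content": "What time is it?"}])
print(answer)  # It is noon UTC.
```

Every line of that loop is code you have to write, test, and maintain yourself; a framework moves it into shared, reusable machinery.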

Compared with general-purpose LLM frameworks, Qwen-Agent is more focused. It is Qwen-first, Pythonic, and built around agent workloads that the Qwen team clearly cares about: tool calling, Code Interpreter workflows, RAG, BrowserQwen-style browser assistance, MCP, and multi-step planning.

That focus is the trade-off. A broad framework may give you a larger integration universe. Qwen-Agent gives you a tighter path for building Qwen-based agents with practical examples already in the repo.

So the rule is:

  • use raw APIs when the task is simple
  • use general-purpose frameworks when you need a huge cross-provider ecosystem
  • use Qwen-Agent when you want a focused framework for Qwen-based agents with tools, RAG, code execution, and real application patterns

And when that agent needs reliable web data, pair the framework with a proper extraction layer like ScrapingBee instead of duct-taping a fragile scraper into the workflow and hoping the website is feeling generous today.

Getting started with Qwen-Agent

Now that we have the big picture, let's get a basic Qwen-Agent setup running.

The goal here is not to build the final production agent. We just want the qwen-agent framework installed, connected to a model, and answering through a simple Assistant. Once that works, adding tools, RAG, Code Interpreter, MCP, or a ScrapingBee web extraction tool becomes much less painful.

Installation

For the full setup, install Qwen-Agent with the most useful extras:

pip install -U "qwen-agent[gui,rag,code_interpreter,mcp]"

Those extras enable:

  • gui for the Gradio-based web UI
  • rag for retrieval and document QA features
  • code_interpreter for Code Interpreter support
  • mcp for Model Context Protocol tool integration

If you only want the minimal package, use:

pip install -U qwen-agent

That is enough to start working with basic agents and model calls.

A couple of quick notes before you continue:

  • the Gradio GUI requires Python 3.10 or higher
  • the Code Interpreter extra installs the Python-side dependencies, but you still need Docker running for Docker-based code execution
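You can check both prerequisites from Python before installing anything. This is just a convenience sketch; the check_extras_prereqs name is illustrative, and having docker on PATH does not guarantee the Docker daemon is actually running.

```python
import shutil
import sys

def check_extras_prereqs():
    """Check the two notes above: Python >= 3.10 for the Gradio GUI,
    and a Docker CLI on PATH for Docker-based code execution."""
    python_ok = sys.version_info >= (3, 10)
    docker_found = shutil.which("docker") is not None
    return python_ok, docker_found

python_ok, docker_found = check_extras_prereqs()
print(f"Python >= 3.10 (gui extra): {python_ok}")
print(f"docker on PATH (code_interpreter): {docker_found}")
```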

Model backends: DashScope and OpenAI-compatible APIs

Qwen-Agent does not lock you into one model serving setup.

You have two common paths:

  • use DashScope, Alibaba Cloud's hosted model service
  • use an OpenAI-compatible API endpoint, such as vLLM, SGLang, or Ollama

For DashScope, set your API key as an environment variable:

export DASHSCOPE_API_KEY="your_dashscope_api_key"

Then configure the model. A common DashScope config looks like this:

llm_cfg = {
    "model": "qwen-max-latest",
    "model_type": "qwen_dashscope",
    "generate_cfg": {
        "top_p": 0.8,
    },
}

If api_key is not set directly in llm_cfg, Qwen-Agent falls back to the DASHSCOPE_API_KEY environment variable.

For a self-hosted or local model exposed through an OpenAI-compatible /v1 endpoint, the config looks slightly different:

llm_cfg = {
    "model": "Qwen3-8B",
    "model_server": "http://localhost:8000/v1",
    "api_key": "EMPTY",
    "generate_cfg": {
        "top_p": 0.8,
    },
}

The exact model name depends on what your server exposes. For example, vLLM, SGLang, and Ollama setups may use different model names or ports. The useful part is that Qwen-Agent can sit on top of either hosted Qwen models or your own model service.

So you can start with DashScope for convenience, then move to self-hosted inference later if your workload, budget, or data requirements push you there.
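For reference, spinning up an OpenAI-compatible endpoint for the config above might look like this. The exact commands, flags, and model tags depend on your vLLM or Ollama version, so treat these as illustrative rather than copy-paste-ready:

```shell
# vLLM: serve a Qwen model on an OpenAI-compatible /v1 endpoint (port 8000)
vllm serve Qwen/Qwen3-8B --port 8000

# Ollama: pull a Qwen model and run the server; its OpenAI-compatible
# endpoint is http://localhost:11434/v1 and the model name is the Ollama tag
ollama pull qwen3:8b
ollama serve
```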

Your first agent: simple Assistant

The fastest way to create a basic Qwen agent is with Assistant. This example creates a small interactive chat loop. No tools yet, no RAG, no Code Interpreter.

import os

from qwen_agent.agents import Assistant
from qwen_agent.utils.output_beautify import typewriter_print


llm_cfg = {
    "model": "qwen-max-latest",
    "model_type": "qwen_dashscope",
    "api_key": os.getenv("DASHSCOPE_API_KEY"),
    "generate_cfg": {
        "top_p": 0.8,
    },
}

bot = Assistant(
    llm=llm_cfg,
    system_message="You are a helpful developer assistant.",
)

messages = []

while True:
    query = input("\nUser: ")

    if query.strip().lower() in {"exit", "quit"}:
        break

    messages.append({
        "role": "user",
        "content": query,
    })

    response = []       # will hold the final list of response messages
    response_text = ""  # text printed so far, for incremental display

    print("Assistant:")

    # bot.run() streams: each iteration yields the full response so far,
    # so after the loop `response` is the complete message list.
    for response in bot.run(messages=messages):
        response_text = typewriter_print(response, response_text)

    # Add the agent's messages to the history for the next turn.
    messages.extend(response)

The important pieces are:

  • llm_cfg tells Qwen-Agent which model backend to use
  • Assistant gives you the basic agent wrapper
  • system_message defines the agent's behavior
  • messages stores the conversation history
  • bot.run() streams the agent response

This is still a simple chat assistant, but the structure is ready for more interesting work. You can add tools through function_list, pass files for document QA, or enable built-in tools like code_interpreter.
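For example, extending the Assistant above usually comes down to a couple of extra arguments. This fragment reuses llm_cfg from the previous example; the file path is purely illustrative:

```python
bot = Assistant(
    llm=llm_cfg,
    system_message="You are a helpful developer assistant.",
    function_list=["code_interpreter"],  # built-in tool, referenced by name
    files=["./docs/design_notes.pdf"],   # document made available for RAG-style QA
)
```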

For example, later you might create a Qwen-Agent tool called scrape_with_scrapingbee and add it to the assistant so the agent can fetch clean web data when it needs external context.

Enabling GUI

Qwen-Agent also gives you a quick way to wrap an agent in a browser UI. If you installed the gui extra, you can launch a Gradio-based interface with two lines:

from qwen_agent.gui import WebUI

WebUI(bot).run()

Here, bot is the same Assistant instance from the previous example.

That is useful for testing: you can move from a terminal loop to a browser interface without rebuilding the app. It also makes demos easier: wire the agent, add a few tools, launch the UI, and let people try the workflow without touching your Python script.

A full toy version looks like this:

import os

from qwen_agent.agents import Assistant
from qwen_agent.gui import WebUI


llm_cfg = {
    "model": "qwen-max-latest",
    "model_type": "qwen_dashscope",
    "api_key": os.getenv("DASHSCOPE_API_KEY"),
    "generate_cfg": {
        "top_p": 0.8,
    },
}

bot = Assistant(
    llm=llm_cfg,
    system_message="You are a helpful assistant.",
)

WebUI(bot).run()

That is enough to get a basic Qwen-Agent web UI running.

From there, the interesting part is adding capabilities: custom tools, RAG files, Code Interpreter, MCP servers, or a ScrapingBee-powered extraction tool for agents that need reliable web data.

Qwen-Agent + web data: where ScrapingBee fits

Qwen-Agent gives you the agent framework: tools, task context, RAG, Code Interpreter workflows, and orchestration around the model. But agents still need data.

If your Qwen agent is supposed to monitor competitors, summarize pages, extract product details, research companies, compare job listings, or answer questions from the live web, it needs a reliable way to fetch and clean web content.

That is where ScrapingBee fits nicely. Qwen-Agent handles the agent workflow. ScrapingBee handles the agent's eyes on the web.

Agents still need reliable data sources

Qwen-Agent makes it easy to define a tool and let the model decide when to call it. That is great. But if that tool is a homegrown scraper, you still own the scraping problem. And the scraping problem is where good weekends go to die.

A basic scraper might work for simple static pages. Then the real web shows up:

  • JavaScript-rendered pages
  • anti-bot systems
  • IP blocks
  • CAPTCHAs
  • rate limits
  • flaky HTML
  • changing CSS selectors
  • pages that load content after scrolling
  • random 403s that only happen when your demo starts

So yes, you can give your Qwen-Agent app a fetch_url() tool built on requests. For simple internal pages, that may be enough. But for agents that depend on public web data, the question becomes: How do you give your agent structured, reliable page content without turning your team into a scraping infrastructure team?

Because Qwen-Agent can decide to call the tool. It can pass the result back into the workflow. The model can summarize, compare, classify, or use that content in a RAG-style flow. But Qwen-Agent doesn't solve proxy rotation, JavaScript rendering, anti-bot handling, retries, or extraction maintenance for every website your agent touches.

That layer needs its own solution.

Want to test this without committing to a paid setup? Sign up today! ScrapingBee gives new users 1,000 free API credits with no credit card required. That is enough room to try a few Qwen-Agent web-data workflows, test JavaScript rendering, and see whether the API fits your use case before wiring it deeper into your app.

Using ScrapingBee as the agent's web data tool

ScrapingBee works well as a managed web data tool behind Qwen-Agent. Instead of asking your agent to call a fragile scraper, you expose a cleaner tool such as:

  • web_fetch
  • extract_page
  • scrape_product_page
  • get_article_text
  • extract_company_profile

Under the hood, that tool calls ScrapingBee.

Learn how to scrape all text from a website for LLM training.

The agent sees a simple capability: "give me the content of this URL." ScrapingBee handles the annoying web-access layer:

  • rendering JavaScript with a headless browser
  • rotating proxies
  • dealing with anti-bot issues
  • returning HTML, text, or structured data
  • extracting fields with rules instead of brittle hand-parsing
  • making the scraping layer easier to maintain

This is the right split of responsibilities. Qwen-Agent should not care whether a page needed a headless browser, a proxy, or extra wait time. The agent just needs usable content.

ScrapingBee should not decide what your agent is trying to accomplish. It just needs to fetch and extract the web data reliably. Together, the pattern looks like this:

  1. The user asks a question that needs web data.
  2. Qwen-Agent decides to call a web extraction tool.
  3. The tool calls ScrapingBee with the target URL and options.
  4. ScrapingBee returns page content or structured data.
  5. Qwen-Agent uses that data for summarization, comparison, RAG, or decision-making.

Instead of stuffing scraping logic into the agent itself, you give it a proper web data layer. Less duct tape, fewer mystery 403s, and more time spent building the actual product.

Example: Defining a ScrapingBee-based web_fetch tool

Here is a short sketch of what a ScrapingBee-powered tool can look like inside Qwen-Agent. This example keeps the tool intentionally simple: it accepts a URL, calls ScrapingBee, and returns readable page text to the agent.

Before running it, set both API keys:

export DASHSCOPE_API_KEY="your_dashscope_api_key"
export SCRAPINGBEE_API_KEY="your_scrapingbee_api_key"

You can find the ScrapingBee API key in your dashboard after logging in.

Then define the tool:

import os

import json5
import requests

from qwen_agent.agents import Assistant
from qwen_agent.tools.base import BaseTool, register_tool


@register_tool("web_fetch")
class WebFetch(BaseTool):
    description = (
        "Fetch readable text content from a given web page URL."
    )

    parameters = [
        {
            "name": "url",
            "type": "string",
            "description": "The full URL of the web page to fetch.",
            "required": True,
        }
    ]

    def call(self, params: str, **kwargs) -> str:
        api_key = os.getenv("SCRAPINGBEE_API_KEY")
        if not api_key:
            raise RuntimeError("SCRAPINGBEE_API_KEY is not set")

        # Qwen-Agent passes tool arguments as a JSON string; json5 tolerates
        # the slightly lax JSON that models sometimes emit.
        url = json5.loads(params)["url"]

        response = requests.get(
            "https://app.scrapingbee.com/api/v1/",
            params={
                "api_key": api_key,
                "url": url,
                "render_js": "true",
                "return_page_text": "true",
            },
            timeout=60,
        )

        response.raise_for_status()
        # Truncate so a huge page does not blow up the model's context window.
        return response.text[:12000]

Then attach it to a Qwen-Agent Assistant:

llm_cfg = {
    "model": "qwen-max-latest",
    "model_type": "qwen_dashscope",
    "api_key": os.getenv("DASHSCOPE_API_KEY"),
}

bot = Assistant(
    llm=llm_cfg,
    system_message=(
        "You are a research assistant. "
        "When the user asks about a specific web page, use web_fetch first, "
        "then summarize the useful information clearly."
    ),
    function_list=["web_fetch"],
)

Now the agent can use web_fetch when it needs page content.

A user could ask:

Summarize this page and list the main technical claims:
https://example.com/some-article

The flow would be:

  1. Qwen-Agent sees that the task needs a URL fetch.
  2. It calls web_fetch.
  3. web_fetch calls ScrapingBee.
  4. ScrapingBee returns readable page text.
  5. The agent summarizes or analyzes that content.

For real projects, you would probably add more controls:

  • render_js as an optional parameter
  • country or proxy options
  • extraction rules for structured fields
  • output length limits
  • retry handling
  • allowlists for safe domains
  • logging and observability
  • RAG ingestion for longer pages or multi-page crawls
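As one example of the retry handling mentioned above, here is a minimal stdlib-only sketch. The with_retries name and behavior are illustrative, not part of Qwen-Agent or ScrapingBee:

```python
import time

def with_retries(fetch, url, attempts=3, backoff=1.0):
    """Call fetch(url), retrying on exceptions with exponential backoff."""
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(url)
        except Exception as error:  # real code would catch requests.RequestException
            last_error = error
            if attempt < attempts - 1:
                time.sleep(backoff * (2 ** attempt))
    raise last_error
```

Inside the tool, you would wrap the ScrapingBee request in something like with_retries(fetch_page, url) instead of calling requests.get directly.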

But the core pattern stays the same: make ScrapingBee a tool, let Qwen-Agent decide when to use it, and keep scraping infrastructure out of your agent loop.

Conclusion: Choosing your path to building Qwen-based agents

If you are building with Qwen and need more than a single model call, Qwen-Agent is the natural place to start.

It gives you a structured way to build agents that can use tools, manage conversation flow, work with documents, connect to external systems, and run code through Code Interpreter workflows. You still write the product logic, but you do not have to rebuild the same agent scaffolding from scratch every time.

And when your agent needs web data, ScrapingBee is the piece that keeps things practical. Qwen-Agent can decide when a web extraction tool should be used. ScrapingBee can handle the messy web layer behind that tool call.

For teams that need more than "just call the LLM"

A raw Qwen API call is fine for simple prompts. But once your app needs tools, RAG, task context, code execution, or multi-step workflows, you need more structure. That is what Qwen-Agent provides.

It gives developers a production-oriented framework for Qwen-based agents, with support for:

  • function calling
  • custom tools
  • RAG workflows
  • Code Interpreter tasks
  • MCP tool integration
  • browser-style assistant patterns
  • multi-step agent workflows

It also has a useful credibility signal: Qwen-Agent is maintained by the Qwen team and used as the backend of Qwen Chat. The official repo includes practical examples such as BrowserQwen, Code Interpreter, and custom assistant workflows.

So if you are asking "what is Qwen-Agent?" or "how do I build agents with Qwen?", the short answer is:

Start with the qwen-agent framework, build a minimal Assistant, add tools, then expand into RAG, Code Interpreter, MCP, or browser workflows as your app needs them.

Do not start by duct-taping ten scripts together unless you enjoy debugging your own accidental framework at 1 a.m.

Next steps & resources

For Qwen-Agent, a good learning path looks like this:

  1. Build a minimal Assistant.
  2. Add one custom tool.
  3. Try a built-in tool like code_interpreter.
  4. Add files and experiment with RAG.
  5. Explore MCP if you need standardized external tools.
  6. Look at BrowserQwen if you are building browser or research agents.

For managed web data, start here:

Use ScrapingBee as the web data tool behind your Qwen-Agent when you need reliable page fetching, JavaScript rendering, proxy rotation, anti-bot handling, or structured extraction without running scraping infrastructure yourself.

The final takeaway: Qwen-Agent gives you the framework for building serious Qwen-based AI agents. ScrapingBee gives those agents a reliable way to see and extract data from the web.

FAQ

What is Qwen-Agent?

Qwen-Agent is an open-source Python framework for building AI agents with Qwen models. It helps developers define agents, connect tools, run function calls, use RAG, and build multi-step LLM workflows without writing all the orchestration code from scratch.

Is Qwen-Agent only for Qwen models?

Qwen-Agent is designed around the Qwen model family, so that is where it fits best. But it can also work with OpenAI-compatible endpoints, which means you can connect it to self-hosted setups like vLLM or Ollama if they expose a compatible API.

What can you build with the qwen-agent framework?

You can build research assistants, document QA tools, coding agents, data analysis workflows, browser-style agents, internal automation tools, and RAG applications. Basically, anything where the model needs to use tools or work through a task in multiple steps.

Does Qwen-Agent include web scraping?

Qwen-Agent can call tools, including custom web scraping tools, but it does not magically solve scraping infrastructure by itself. If your agent needs reliable web data, you can define a ScrapingBee-powered tool to handle JavaScript rendering, proxies, anti-bot issues, and structured extraction.

How is Qwen-Agent different from just calling the Qwen API?

The Qwen API gives you access to the model. Qwen-Agent gives you an application framework around the model: tools, agents, memory, RAG patterns, Code Interpreter support, and multi-step workflows. Use the raw API for simple calls; use Qwen-Agent when you need agent behavior.

Ilya Krukowski

Ilya is an IT tutor and author, web developer, and ex-Microsoft/Cisco specialist. His primary programming languages are Ruby, JavaScript, Python, and Elixir. He enjoys coding, teaching people and learning new things. In his free time he writes educational posts, participates in OpenSource projects, tweets, goes in for sports and plays music.