
Getting Started with CrewAI

A few months back, I found myself wrestling with a problem that felt deceptively simple at first: I had multiple AI agents that needed to work together on a complex task. Agent A would gather research. Agent B would analyze it. Agent C would synthesize findings into a report.

Sounds straightforward, right? It wasn’t.

Without a proper framework, I was manually wiring agent callbacks, managing state across async operations, and debugging task dependencies that would have made a DevOps engineer weep. Then I discovered CrewAI, and suddenly the whole thing clicked.

CrewAI is a framework for orchestrating autonomous agents. It handles the messy middle—task sequencing, context sharing, error recovery, tool management—so you focus on what each agent should do, not how to wire them together.

This post is the first in a series exploring CrewAI deeply. Let’s start with the fundamentals.

Why Multi-Agent Systems Matter

Modern problems are rarely single-discipline anymore. Building a market analysis requires research skills, financial modeling, writing, and strategic thinking. No single AI model excels at all of these.

The promise of multi-agent systems is elegant: specialized agents, each with domain expertise and specific tools, collaborate to solve complex problems. The reality—without the right framework—is coordination hell.

CrewAI solves this by providing abstractions that feel natural:

  • Agents are autonomous workers with personality and tools
  • Tasks are work units with clear outcomes
  • Crews are teams of agents executing coordinated work

It’s built on the premise that AI orchestration shouldn’t require distributed systems knowledge.

Core Concepts

Let’s define the building blocks before diving into code.

Agents

An agent is an autonomous actor. It has:

  • A role (Senior Analyst, Research Specialist)
  • A goal (specific objective the agent is optimizing for)
  • A backstory (context and expertise framing)
  • Tools (functions or APIs it can call)
  • An LLM (the language model powering reasoning)

Agents don’t follow scripts. They use their tools, interpret results, and decide next steps autonomously. The backstory and goal guide behavior without micromanagement.
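
To make those fields concrete, here is a minimal plain-Python sketch of how a role, goal, and backstory might be folded into a single system prompt. AgentSpec and render_system_prompt are hypothetical names invented for illustration; CrewAI uses its own internal prompt templates, which this does not reproduce.

```python
from dataclasses import dataclass, field


@dataclass
class AgentSpec:
    """Hypothetical stand-in for an agent definition (not CrewAI's Agent class)."""
    role: str
    goal: str
    backstory: str
    tool_names: list[str] = field(default_factory=list)


def render_system_prompt(spec: AgentSpec) -> str:
    """Fold the persona fields into one prompt string for the LLM."""
    lines = [
        f"You are {spec.role}.",
        spec.backstory,
        f"Your goal: {spec.goal}",
    ]
    if spec.tool_names:
        lines.append("Tools you may call: " + ", ".join(spec.tool_names))
    return "\n".join(lines)


analyst = AgentSpec(
    role="Senior Analyst",
    goal="Assess the financial health of a company",
    backstory="You have spent a decade reading balance sheets.",
    tool_names=["financial_lookup"],
)
prompt = render_system_prompt(analyst)
print(prompt)
```

The point is only that the backstory and goal end up as text the model reads on every step, which is why small wording changes can shift behavior.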

Tasks

A task is a unit of work. It specifies:

  • A description (what needs to be done)
  • Expected output (what success looks like)
  • An assigned agent (who does it)
  • Dependencies (tasks that must complete first, via context=)
  • Tools available for this specific task (optional override)

Tasks aren’t imperative—you’re not telling the agent how to do something. You’re describing the goal and outcome, then the agent figures out execution.
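
Because dependencies are declared explicitly, it can help to sanity-check them before running anything. The sketch below models tasks as a plain dictionary (a hypothetical structure, not CrewAI's Task class) and uses the standard library's graphlib to derive a valid execution order and reject circular dependencies.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical task graph: each key is a task name, each value is the set of
# tasks whose output it needs (the role that context= plays on a Task).
dependencies = {
    "research": set(),
    "financials": {"research"},
    "summary": {"research", "financials"},
}


def execution_order(deps: dict[str, set[str]]) -> list[str]:
    """Return an order in which every task runs after its dependencies."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # args[1] holds the list of nodes that form the detected cycle.
        raise ValueError(f"Circular task dependency: {exc.args[1]}") from exc


order = execution_order(dependencies)
print(order)
```

A check like this is cheap to run in a unit test and catches a miswired dependency before any tokens are spent.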

Crews

A crew orchestrates agents and tasks. It defines which agents are on the team, task execution order, error handling, and output format. By default, crews use Process.sequential—tasks run in order, each receiving the output of previous tasks as context. A Process.hierarchical mode also exists for manager-delegated workflows.

When you kick off a crew, it manages the entire workflow: passing context between tasks, handling failures, and collecting results. By default, agents have no memory between separate runs; we’ll cover persistent memory in an upcoming post.
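
As a mental model for sequential execution, here is a deliberately simplified sketch in plain Python. The run_sequential function and the lambda "agents" are hypothetical stand-ins; a real crew calls LLM-backed agents and does considerably more bookkeeping.

```python
from typing import Callable

# Each step is a (name, function) pair. The function receives the outputs of
# earlier steps and returns its own output as a string.
Step = tuple[str, Callable[[dict[str, str]], str]]


def run_sequential(steps: list[Step]) -> dict[str, str]:
    """Run steps in list order, handing each one the outputs produced so far."""
    outputs: dict[str, str] = {}
    for name, fn in steps:
        # Pass a copy so a step cannot mutate earlier results.
        outputs[name] = fn(dict(outputs))
    return outputs


steps: list[Step] = [
    ("research", lambda ctx: "TechCorp sells developer tools."),
    ("analysis", lambda ctx: f"Based on: {ctx['research']} Margins look healthy."),
    ("report", lambda ctx: f"Summary of {len(ctx)} prior outputs."),
]

results = run_sequential(steps)
print(results["report"])
```

Notice that the order comes from the list itself, while the context dictionary controls what each step can see. Those are two separate concerns, and keeping them separate in your head makes crews easier to debug.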

Tools

Tools are functions agents can invoke. CrewAI integrates with:

  • Custom Python functions (via @tool decorator from crewai.tools)
  • LangChain tools
  • Pre-built tools in the crewai-tools package (web search, file access, databases)

An agent’s autonomy is limited by its tools. Give it the right tools, and it becomes effective. Give it the wrong ones, or none that fit the task, and it becomes useless.

Your First Crew

Let’s build something concrete. Imagine we want to analyze a company using three specialized agents: one researches the company, one analyzes financial metrics, and one writes a summary.

First, install CrewAI:

pip install crewai crewai-tools

Set your API key—CrewAI uses OpenAI by default:

export OPENAI_API_KEY="sk-..."

Now the code:

import json
from crewai import Agent, Task, Crew
from crewai.tools import tool  # decorator lives in crewai.tools

# Define tools — must return strings; the output feeds back into the LLM
@tool("web_search")
def search_company(query: str) -> str:
    """Search for company information online.
    
    Replace this mock with a real search tool (e.g., SerperDevTool from crewai-tools).
    """
    return f"Search results for: {query}"

@tool("financial_lookup")
def get_financial_data(ticker: str) -> str:
    """Fetch financial metrics for a company. Returns JSON with pe_ratio, revenue_growth, net_margin."""
    data = {
        "pe_ratio": 15.2,
        "revenue_growth": 0.23,
        "net_margin": 0.18
    }
    return json.dumps(data)

# Create agents — use llm= not llm_model=
researcher = Agent(
    role="Senior Research Analyst",
    goal="Uncover detailed information about companies",
    backstory="You have 10 years of experience researching markets. You excel at finding patterns and connections.",
    tools=[search_company],
    verbose=True,
    llm="openai/gpt-4o"
)

analyst = Agent(
    role="Financial Analyst",
    goal="Analyze financial health and metrics",
    backstory="You're a seasoned financial analyst with deep expertise in valuations and growth metrics.",
    tools=[get_financial_data],
    verbose=True,
    llm="openai/gpt-4o"
)

writer = Agent(
    role="Report Writer",
    goal="Synthesize analysis into clear, actionable reports",
    backstory="You write technical reports that executive teams actually read.",
    tools=[],
    verbose=True,
    llm="openai/gpt-4o"
)

# Define tasks. The tasks list sets the order in a sequential crew; context= picks which earlier outputs a task receives
research_task = Task(
    description="Research the company TechCorp. Find information about their business model, market position, and recent developments.",
    expected_output="A comprehensive overview of TechCorp's business and market standing",
    agent=researcher
)

financial_task = Task(
    description="Analyze TechCorp's financial metrics. Evaluate growth, profitability, and valuation.",
    expected_output="Financial health assessment with key metrics and investment perspective",
    agent=analyst,
    context=[research_task]
)

summary_task = Task(
    description="Write a concise investment summary for TechCorp based on research and financial analysis. Include recommendation.",
    expected_output="A 3-paragraph executive summary with clear investment stance",
    agent=writer,
    context=[research_task, financial_task]
)

# Assemble the crew
crew = Crew(
    agents=[researcher, analyst, writer],
    tasks=[research_task, financial_task, summary_task],
    verbose=True
)

if __name__ == "__main__":
    result = crew.kickoff()
    print(result)

What’s happening here:

  1. Agents are created with personality. The research agent’s backstory shapes how it approaches tasks. It’s not just a function; it’s a role with context.

  2. Tasks have dependencies. financial_task depends on research_task. CrewAI doesn’t start the analyst until the researcher finishes. Context flows automatically.

  3. Tools return strings. The LLM reads tool output as text—return json.dumps(data) not raw Python dicts.

  4. Crew orchestrates everything. When you call crew.kickoff(), CrewAI manages execution order, error handling, and output formatting.
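
Point 3 is easy to forget once tools start calling real APIs. One way to enforce it is a small wrapper that converts any return value, or any exception, into text. This is a plain-Python sketch: as_tool_text and lookup_metrics are hypothetical names, and how the wrapper composes with CrewAI's @tool decorator is not tested here.

```python
import json
from functools import wraps
from typing import Any, Callable


def as_tool_text(fn: Callable[..., Any]) -> Callable[..., str]:
    """Wrap a function so it always returns text an LLM can read."""
    @wraps(fn)
    def wrapper(*args: Any, **kwargs: Any) -> str:
        try:
            result = fn(*args, **kwargs)
        except Exception as exc:  # report failures as text instead of crashing
            return json.dumps({"error": f"{type(exc).__name__}: {exc}"})
        if isinstance(result, str):
            return result
        # default=str keeps dates, Decimals, and similar values serializable.
        return json.dumps(result, default=str)
    return wrapper


@as_tool_text
def lookup_metrics(ticker: str) -> dict:
    """Hypothetical lookup backed by an in-memory table."""
    table = {"TECH": {"pe_ratio": 15.2, "net_margin": 0.18}}
    return table[ticker]


print(lookup_metrics("TECH"))
print(lookup_metrics("NOPE"))
```

Returning the error as text gives the agent a chance to read it and try something else, rather than aborting the whole run on one bad lookup.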

How CrewAI Differs from Alternatives

You might be wondering: can’t I just string together LangChain agents? Or use LlamaIndex workflows?

LangChain Agents: LangChain is excellent but lower-level. For structured multi-agent coordination you’d reach for LangGraph, which gives you graph-based orchestration. CrewAI trades flexibility for a cleaner role/task model.

LlamaIndex Workflows: Workflows are great for data pipeline orchestration. CrewAI is specifically designed for agentic reasoning workflows where agents make decisions.

AutoGen: Microsoft’s AutoGen supports both conversation and task-oriented patterns. CrewAI is more opinionated; the role/task/crew model is narrower but faster to get right.

CrewAI’s sweet spot: You want autonomous agents with defined roles that cooperate on structured, multi-step problems without writing orchestration logic from scratch.

Common Pitfalls (Avoid These)

1. Vague task descriptions. Tasks with unclear expected output lead to rambling agent behavior. “Analyze the market” is too broad. “Provide a 3-point summary of market growth trends” is actionable.

2. Tool overload. Giving an agent access to twenty tools confuses its decision-making. Curate. Each agent should have 3-5 focused tools.

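3. Missing context. If Task B needs Task A’s output, say so with context=[task_a] rather than relying on whatever gets passed along by default. Explicit dependencies keep the workflow correct when you reorder tasks or switch processes. CrewAI can’t infer intent.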

4. Ignoring token costs. Multi-agent workflows are token-hungry. Context accumulates. Monitor costs carefully and consider task batching.

5. Not iterating on backstories. An agent’s personality seems superficial until you realize it drives behavior. A “skeptical analyst” behaves differently than an “enthusiastic researcher.” Experiment.

6. Verbose in production. verbose=True is useful during development but produces massive logs. Disable it before shipping.
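
Pitfall 4 is easier to reason about with rough numbers. The sketch below estimates prompt size when each task receives the outputs of every earlier task. The four-characters-per-token ratio is a common rule of thumb for English text rather than a real tokenizer, and the output sizes are made-up inputs.

```python
def estimate_prompt_tokens(
    output_chars: list[int],
    base_prompt_chars: int = 2000,
    chars_per_token: int = 4,
) -> list[int]:
    """Estimate prompt size per task when every earlier output is passed along."""
    estimates = []
    carried = 0  # characters of context inherited from earlier tasks
    for chars in output_chars:
        estimates.append((base_prompt_chars + carried) // chars_per_token)
        carried += chars
    return estimates


# Hypothetical output sizes (in characters) for three tasks run in order.
per_task = estimate_prompt_tokens([6000, 4000, 1500])
print(per_task, sum(per_task))
```

Under these assumptions the third task’s prompt is six times the size of the first, even though its own instructions are no longer. Tight expected_output descriptions are the cheapest lever for keeping that growth in check.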

What’s Next

This post covers the foundation. In upcoming posts, we’ll explore:

  • Advanced tool integration: Building custom tools and integrating with APIs so agents have real capabilities beyond mocks
  • Memory and state management: How to make an agent remember results across runs without blowing your context window
  • Debugging multi-agent workflows: Tracing the actual decisions agents made and why
  • Performance optimization: Reducing token usage and execution time for production workloads
  • Production deployment: Moving from a local script to a robust, observable service

Ready to build something? Scaffold a project with crewai create crew <project_name> and start experimenting with agent personalities. When the framework is working well, it fades into the background: you’ll stop thinking about orchestration and start thinking about what your agents can accomplish.

Stay tuned for the next post in this series.