
Anatomy of AI agents & Agent Persona

Exploring Agent characteristics, Persona and Workflows of Intelligent Agents

December 19, 2024

AI agents are no longer experimental. From open-source projects like BabyAGI or AutoGen to commercial integrations like Microsoft Copilot and Khanmigo, their presence is growing. According to market forecasts, the AI agent market is projected to reach $47.1 billion by 2030. This article simplifies key agent concepts and terminology, drawing on academic and industry AI work. We will briefly touch upon four main points:
  1. What is an Agentic workflow
  2. Anatomy of an AI agent & Agent Persona
  3. Example of an AI Agent
  4. Multi-Agent architecture


1. What is an Agentic workflow
The term AI agent evolved from John McCarthy’s work in the 1950s on autonomous agents and self-directed machines, alongside Herbert Simon’s decision-making models. While McCarthy did not use the term ‘AI agent’ explicitly, his ideas laid the foundation for what would become a key concept in modern AI research. The term became widely popular in the 1990s with the rise of multi-agent and autonomous systems, and is now commonly used to refer to intelligent systems that perform specific tasks, such as virtual assistants (Alexa, Google Assistant), recommendation systems, chatbots, and autonomous robots. It gained massive popularity after the release of GPT-3 in 2020, which demonstrated how LLMs could enable a broad spectrum of automation.




Agentic adj. — “The ability of a system to express agency or control on one’s own behalf or on the behalf of another” — Agxncy.ai

ChatGPT is a basic form of an agent. You provide a prompt, it processes the request autonomously, returns a result, and interacts with you to refine it. In contrast, a more advanced AI agent uses an “agentic workflow”: it takes the prompt, plans a series of tasks, calls functions, makes decisions, and iterates autonomously to achieve a higher-quality result.
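To make the plan, call, and iterate cycle concrete, here is a minimal Python sketch. It is not taken from any particular framework: the tool registry, `plan`, `good_enough`, and `run_agent` are all hypothetical names, and the planner and quality check are simple stand-ins for what would normally be LLM calls.

```python
from typing import Callable, Dict, List

# Hypothetical tool registry: names and behaviour are illustrative only.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"notes on {query}",
    "summarize": lambda text: text[:40],
}


def plan(prompt: str) -> List[str]:
    """Stand-in planner: a real agent would ask an LLM for this task list."""
    return ["search", "summarize"]


def good_enough(result: str) -> bool:
    """Stand-in quality check: a real agent might use an LLM or tests here."""
    return len(result) > 0


def run_agent(prompt: str, max_iterations: int = 3) -> str:
    """Plan tasks, call the matching tools, and retry until the result passes."""
    result = prompt
    for _ in range(max_iterations):
        result = prompt
        for step in plan(prompt):          # plan a series of tasks
            result = TOOLS[step](result)   # call a function for each task
        if good_enough(result):            # decide whether to stop or iterate
            break
    return result


print(run_agent("AI agents"))
```

The cap on `max_iterations` is a design choice in this sketch: it keeps an autonomous loop from running forever when the quality check never passes.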



2. Anatomy of an AI agent & Agent Persona

Based on emerging trends in AI agent adoption, an agent possesses several key characteristics that define its capabilities and functions. These characteristics are:
  1. Planning

  2. React

  3. Knowledge

  4. Memory Capabilities

  5. Tool Calling

  6. Automation Level




Every agent needs a clear purpose: its job description. This defines the problem it solves, its target users (e.g., customer service, data research), and how success is measured. An agent persona is similar to a user persona, which is mapped from customer pain points and needs.






3. Example of an AI agent

Using the anatomy above, here is an example for an AI photo-enhancing app.

1. Customer Job: Capture and share visually compelling photos using their iPhone, effortlessly.
2. Customer Pain Point: Struggles to remove unwanted elements from photos after capture, requiring complex editing software or skills.
3. Customer Need: A simple, mobile-friendly AI-powered solution that seamlessly removes unwanted details from iPhone photos, preserving quality and saving time.
4. AI Agent Persona
  • Agent Name: "Pixie"
  • Agent Goal: To empower iPhone photographers to effortlessly create flawless images by seamlessly removing unwanted elements.
  • Agent's Job Description & Instructions: Pixie acts as an intelligent photo editing assistant within the iPhone Photos app. It analyzes images, identifies potential distractions, and offers intuitive tools for removal. Instructions are to prioritize speed, ease of use, and preservation of image quality.
  • Personality & Tone: Helpful, efficient, friendly, and subtly playful. Pixie offers suggestions without being intrusive, explains processes clearly, and uses encouraging language.
5. AI Agent Details
  • Tools (APIs): A photo editing API like the Adobe Creative Cloud API (or similar) with content-aware fill and object removal capabilities. Integration with the iPhone Photos framework is essential.
  • Memory Allocation:
    • Short-term: Remembers recent edits within a session (e.g., objects removed, adjustments made).
    • Long-term: Learns user preferences (e.g., types of objects frequently removed, preferred editing style) to personalize suggestions over time.
  • Automation Level: Primarily autonomous, proactively suggesting removals and performing them with minimal user input.
    • Human Intervention: Pixie will prompt the user for confirmation on suggested removals, especially in cases where the context is ambiguous (e.g., is that a stray hair or part of the subject?). It also provides manual selection tools for users to specify other objects for removal. If the automatic removal isn't satisfactory, it offers alternative suggestions or allows the user to refine the edit.
  • Knowledge: Pixie should be trained on a large and diverse dataset of images, including those specifically taken with iPhones, focusing on common unwanted elements (people, objects, blemishes, text). Training should emphasize realistic inpainting, seamless blending, and preserving image context. It should also be trained to prioritize the main subject and understand its relationship to the background for accurate object removal. Models that understand image composition and aesthetics would further enhance Pixie's suggestions.
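
The persona and agent details above can be captured as a small data structure. The following Python sketch is one possible way to do that, under stated assumptions: `AgentPersona`, its field names, the tool identifier strings, and the `automation_level` value are hypothetical and do not come from any real framework or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AgentPersona:
    """Hypothetical container for the persona fields listed above."""
    name: str
    goal: str
    instructions: str
    tone: str
    tools: List[str] = field(default_factory=list)
    automation_level: str = "autonomous-with-confirmation"  # assumed label
    short_term_memory: List[str] = field(default_factory=list)
    long_term_memory: Dict[str, int] = field(default_factory=dict)

    def remember_edit(self, removed_object: str) -> None:
        """Record an edit for this session and update long-term preferences."""
        self.short_term_memory.append(removed_object)
        count = self.long_term_memory.get(removed_object, 0)
        self.long_term_memory[removed_object] = count + 1

    def end_session(self) -> None:
        """Short-term memory is per session; long-term memory persists."""
        self.short_term_memory.clear()

    def system_prompt(self) -> str:
        """Render the persona as an instruction string for an LLM."""
        return (
            f"You are {self.name}. Goal: {self.goal} "
            f"Instructions: {self.instructions} Tone: {self.tone}"
        )


pixie = AgentPersona(
    name="Pixie",
    goal="Help iPhone photographers remove unwanted elements from photos.",
    instructions="Prioritize speed, ease of use, and image quality.",
    tone="Helpful, efficient, friendly, and subtly playful.",
    tools=["object_removal", "content_aware_fill"],  # placeholder tool names
)
pixie.remember_edit("photobomber")
print(pixie.system_prompt())
```

Keeping short-term and long-term memory as separate fields mirrors the split described in the list: session edits are cleared by `end_session`, while the preference counts survive across sessions.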



4. Multi-Agent Architecture


Multi-agent systems (MAS) involve multiple autonomous agents interacting within an environment, each with its own goals and capabilities. These agents cooperate, compete, or coordinate to achieve complex tasks beyond the reach of individual agents, enabling sophisticated problem-solving and distributed intelligence. Source: LangChain: Multi-Agent Systems



Building on the previous example, a MAS in an AI photo-enhancing app could look like the following:
  • "Focus" Agent: Analyzes image composition and suggests cropping/framing adjustments.
  • "Pixie" Agent: Identifies and removes unwanted elements (as described before).
  • "Style" Agent: Offers stylistic enhancements based on image content and user preferences.

Interaction: A user uploads a photo. Focus analyzes it and suggests a crop. Pixie then removes a distracting object. Finally, Style suggests a subtle filter. The user can accept or modify each agent's suggestion.
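
The interaction described here can be sketched as a simple sequential pipeline in Python. This is an illustration only: the three agent functions return canned suggestions instead of doing real image analysis, and `Suggestion`, `run_pipeline`, and the `approve` callback are hypothetical names standing in for the user's accept-or-reject step.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Suggestion:
    agent: str
    action: str


# Each agent is modelled as a function from a photo name to one suggestion.
def focus_agent(photo: str) -> Suggestion:
    return Suggestion("Focus", f"crop {photo} around the main subject")


def pixie_agent(photo: str) -> Suggestion:
    return Suggestion("Pixie", f"remove a distracting object from {photo}")


def style_agent(photo: str) -> Suggestion:
    return Suggestion("Style", f"apply a subtle filter to {photo}")


def run_pipeline(
    photo: str,
    agents: List[Callable[[str], Suggestion]],
    approve: Callable[[Suggestion], bool],
) -> List[Suggestion]:
    """Run agents in order and keep only the suggestions the user accepts."""
    accepted = []
    for agent in agents:
        suggestion = agent(photo)
        if approve(suggestion):  # the user stays in control of every step
            accepted.append(suggestion)
    return accepted


# Example: the user accepts the crop and the removal but declines the filter.
accepted = run_pipeline(
    "beach.jpg",
    [focus_agent, pixie_agent, style_agent],
    approve=lambda s: s.agent != "Style",
)
for s in accepted:
    print(f"{s.agent}: {s.action}")
```

A fixed sequence is the simplest form of coordination. Other multi-agent designs route work through a supervisor agent or let agents exchange messages, which this sketch does not attempt.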


