Building Safe, Trustworthy AI Agents – A Converging Vision

Brian Ritchie, kama.ai, Felicia Anthonio, #KeepItOn coalition, and Dr. Moses Isooba, Executive Director of UNNGOF for Forus Workshop on AI Activism

The AI industry is evolving quickly. The early wave of generative AI focused on answering questions: you asked a query, and the system pulled in data, often from the open web, to give you a reply. Now the next phase is here: AI that doesn’t just respond but takes action.

At kama.ai, we have long argued that the future of AI lies in letting systems manage complex, multi-step, compliance-driven tasks while upholding safety, trust, and accountability. Our Responsible AI framework, published last year, mapped out exactly how hybrid AI can combine deterministic accuracy with probabilistic flexibility. This now further integrates with Robotic Process Automation (RPA) for real-world execution.

Around the time of this writing, Anthropic released its own “framework for developing safe and trustworthy agents.” It’s a strong piece of work, and its principles echo many of the same ideas we have championed. The overlap between these visions is encouraging: it shows the AI industry is aligning around what matters most.

  1. From Simple Answers to Complex Actions

Both Anthropic and kama.ai recognize that AI agents are moving far beyond simple Q&A. These next-generation systems need to book meetings, onboard employees, review contracts, and resolve support issues, all while following organizational rules.

In our model, a hybrid AI agent starts with a deterministic knowledge graph for guaranteed accuracy in mission-critical responses. If the knowledge graph doesn’t have an answer, the agent selectively engages generative AI, but only with trusted, internal content. Once a course of action is confirmed, RPA systems carry out the execution, whether that means updating records in HR software, processing a benefits change, or notifying a customer.
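The flow above can be sketched in a few lines of Python. This is a minimal, hypothetical illustration of the pattern, not kama.ai’s actual API: the class and function names (`HybridAgent`, `generate_answer`, the `execute` hand-off) are all assumptions made for the example.

```python
def generate_answer(query, context):
    """Stand-in for a generative model call restricted to trusted content."""
    return f"Draft answer to '{query}' from {len(context)} trusted documents"

class HybridAgent:
    def __init__(self, knowledge_graph, trusted_docs):
        self.kg = knowledge_graph          # deterministic, curated answers
        self.trusted_docs = trusted_docs   # vetted internal content only

    def answer(self, query):
        # 1. Deterministic layer: verified, mission-critical responses.
        if query in self.kg:
            return {"text": self.kg[query], "source": "knowledge_graph",
                    "needs_review": False}
        # 2. Generative fallback, contained to trusted internal content.
        return {"text": generate_answer(query, self.trusted_docs),
                "source": "generative", "needs_review": True}

    def execute(self, action, confirmed_by_human):
        # 3. RPA hand-off happens only after human confirmation.
        if not confirmed_by_human:
            return "blocked: awaiting human sign-off"
        return f"rpa: executed '{action}'"

agent = HybridAgent(
    knowledge_graph={"What is the PTO policy?": "20 days per year."},
    trusted_docs=["hr_handbook.pdf", "benefits_guide.pdf"],
)
print(agent.answer("What is the PTO policy?")["source"])      # knowledge_graph
print(agent.answer("How do I change my benefits?")["source"])  # generative
```

The key design point is the ordering: the deterministic layer always gets first refusal, and execution is gated behind explicit human confirmation.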

Anthropic’s description of agents like Claude Code mirrors this: autonomous task execution, still guided by human-defined boundaries. The convergence here is clear. AI is no longer a static tool; it’s an active, context-aware teammate.

  2. Keeping Humans in Control

One of Anthropic’s core principles is balancing autonomy with human oversight. At kama.ai, this is foundational. Our hybrid agents operate within governance guardrails. The deterministic layer ensures that for high-risk or regulated scenarios, only verified responses are delivered. Generative AI is “switched on” only when the organization’s executives sanction it as safe enough to do so. Beyond that, the system is always transparent about which answers are GenAI-crafted.

This is similar to Anthropic’s approach of granting permissions in Claude Code. Our framework emphasizes human sign-off before sensitive actions. The principle is simple: AI should augment, not override, human judgment.

  3. Transparency That Builds Trust

Anthropic stresses the need for agents to explain their reasoning in ways humans can follow. We agree – without visibility, trust collapses.

In the kama.ai framework, every AI-assisted action is logged. If the answer is deterministic, it’s backed by a knowledge graph reference. If it’s generative, the system flags it clearly and points to the source content. This mirrors Anthropic’s real-time checklists in Claude, where users can see the agent’s plan and intervene when needed.
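A per-answer audit record like the one described above might look like the following sketch. The field names and the `kg://` reference scheme are illustrative assumptions, not kama.ai’s actual log schema.

```python
import json
from datetime import datetime, timezone

def log_response(query, answer, source, reference):
    """Record an AI-assisted answer together with its provenance."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "query": query,
        "answer": answer,
        # "deterministic" answers cite a knowledge-graph reference;
        # "generative" answers are flagged and point to source content.
        "source": source,
        "reference": reference,
        "flagged_generative": source == "generative",
    }
    return json.dumps(entry)

record = log_response(
    query="What is the travel policy?",
    answer="Economy class for flights under 6 hours.",
    source="deterministic",
    reference="kg://policies/travel#economy",
)
```

Because every answer carries a `source` and a `reference`, a reviewer can always trace how a given response was produced.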

Transparency is not a “nice to have.” It’s the basis for accountability.

  4. Alignment with Human Values

Both companies call out a challenge: AI can act in ways that seem reasonable to the system but misaligned with human expectations.

In our eBook, we explain how hybrid AI mitigates this risk. By default, deterministic logic handles any scenario where compliance, safety, or legal alignment is non-negotiable. Generative AI operates in a contained environment; it never pulls from uncontrolled open-web sources. Running the system this way reduces the chance of unpredictable or misaligned actions.

Anthropic’s point about avoiding overreach, like reorganizing someone’s files without explicit direction, reinforces why clear constraints and structured handoffs matter.

  5. Privacy and Security by Design

Privacy and security are not afterthoughts; they’re engineered into the architecture. Anthropic’s Model Context Protocol lets enterprises set strict connector permissions. At kama.ai, containment is our first principle: no outside data, no leakage. Internal documents are stored in vetted repositories called ‘trusted collections’. These are vectorized for semantic search, but never exposed beyond the governed environment.
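The containment idea can be illustrated with a toy example: semantic search runs only over the vetted set, so nothing outside it can ever be retrieved. The bag-of-words “embedding” below is a deliberately simple stand-in for a real vector model, and the `TrustedCollection` name is an assumption for the sketch.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real system would use a vector model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class TrustedCollection:
    """Semantic search restricted to vetted internal documents only."""
    def __init__(self, documents):
        # The governed set: retrieval can never reach beyond these docs.
        self.docs = [(d, embed(d)) for d in documents]

    def search(self, query, top_k=1):
        qv = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(qv, d[1]),
                        reverse=True)
        return [d[0] for d in ranked[:top_k]]

collection = TrustedCollection([
    "Employees accrue 20 vacation days per year.",
    "Expense reports are due by the 5th of each month.",
])
print(collection.search("How many vacation days do I get?"))
```

However good the retrieval model is, the security property comes from the boundary: the collection is populated only with vetted content, so there is no path to open-web data.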

Both frameworks stress protection against prompt injection, data crossover between contexts, and misuse of connected systems. These are not merely theoretical risks; they are real threats that trustworthy agents must be built to withstand.

Why This Convergence Matters

The similarities between Anthropic’s framework and kama.ai’s hybrid AI model aren’t coincidence. We believe there is an emerging consensus. The industry is learning that safe, effective AI isn’t about choosing between autonomy and oversight, or between speed and compliance. It’s about integrating them.

Deterministic AI provides the anchor for truth. Generative AI adds adaptability and natural language flexibility. RPA turns intent into action. Governance binds them together into something enterprises can trust.

When two independent teams arrive at the same core principles, it signals a clear direction for the market.

To review the full eBook, click here for the full version. No questions asked, no strings attached.

Final Thoughts

Responsible AI isn’t a marketing slogan. It’s the foundation for AI systems that can take meaningful action without eroding trust, compliance, or security. At kama.ai, we welcome Anthropic’s contribution. The more alignment we see across the industry, the faster organizations can adopt AI agents with confidence.

The future of AI is hybrid. It’s human-guided, transparent, values-aligned, and secure by design. If your enterprise is ready to explore this path, moving from answers to actions, then let’s start that conversation today.