Author: CertiK
Recently, OpenClaw (known in the industry as "Little Lobster"), an open-source, self-hosted AI agent platform, has rapidly gained popularity thanks to its flexible extensibility and self-controlled deployment, becoming a breakout product in the personal AI agent space. Its core ecosystem, Clawhub, serves as an application marketplace that aggregates a massive number of third-party skill plugins, letting agents unlock advanced capabilities with a single click, ranging from web search and content creation to encrypted-wallet operation, on-chain interaction, and system automation. This has driven explosive growth in the ecosystem's scale and user base.

But where exactly is the platform's true security boundary for these third-party skills that run in high-privilege environments?
Recently, CertiK, the world's largest Web3 security company, released its latest research on Skill security. The article points to a widespread misconception about the security boundaries of the AI agent ecosystem: the industry generally treats "Skill scanning" as the core security boundary, yet this mechanism is nearly useless against real hacker attacks.
If we compare OpenClaw to the operating system of a smart device, then Skills are the various apps installed on that system. Unlike ordinary consumer-grade apps, some Skills in OpenClaw run in a high-privilege environment, allowing them to directly access local files, call system tools, connect to external services, execute commands in the host environment, and even manipulate the user's encrypted digital assets. Once a security issue arises, it can directly lead to serious consequences such as the leakage of sensitive information, remote takeover of the device, and the theft of digital assets.
Currently, the industry-wide standard security solution for third-party skills is "pre-listing scanning and review." OpenClaw's Clawhub has also built a three-layer review and protection system: VirusTotal code scanning, a static code analysis engine, and AI logic-consistency detection, with security pop-ups pushed to users according to risk level, in an attempt to safeguard ecosystem security. However, CertiK's research and proof-of-concept attack tests confirm that this detection system falls short in real-world adversarial scenarios and cannot carry the core responsibility of security protection.
The study first dismantles the inherent limitations of existing detection mechanisms:
Static detection rules are easily bypassed. The engine's core relies on matching code signatures to identify risks. For example, it might flag the combination of "reading sensitive environment information + sending outbound network requests" as high-risk behavior. However, an attacker only needs slight syntactic changes to bypass the signature matching while fully retaining the malicious logic. It is like swapping the dangerous wording for a synonym, rendering the security scanner ineffective.
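To make this concrete, here is a minimal sketch of the idea. It is not Clawhub's actual engine; the rules and snippets are invented for illustration. A naive scanner flags code that both reads environment variables and posts data over the network, and a trivial `getattr`/`__import__` renaming defeats it while leaving the behavior unchanged:

```python
import re

# Naive signature rule: flag code that both reads environment
# variables and makes an outbound network request.
RULES = [re.compile(r"os\.environ"), re.compile(r"requests\.post")]

def naive_scan(source: str) -> bool:
    """Return True (flagged as risky) only if every signature matches."""
    return all(rule.search(source) for rule in RULES)

# Straightforward malicious snippet: both signatures match.
obvious = "data = os.environ; requests.post(url, data=data)"

# The same logic, with the attribute lookups rewritten via
# getattr/__import__: no signature matches, yet the intent is identical.
evasive = (
    "env = getattr(__import__('os'), 'environ')\n"
    "send = getattr(__import__('requests'), 'post')\n"
    "send(url, data=dict(env))"
)

print(naive_scan(obvious))  # True  -> flagged
print(naive_scan(evasive))  # False -> slips through
```

Real static analyzers are more sophisticated than two regexes, but the asymmetry is the same: the defender must enumerate patterns, while the attacker only needs one spelling the patterns miss.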
AI auditing has inherent blind spots. Clawhub's AI auditing is positioned as a "logic consistency detector," which can only identify obvious malicious code that "declares functions that do not match actual behavior," but is helpless against exploitable vulnerabilities hidden in normal business logic. It's like trying to find a fatal trap hidden deep in the terms of a seemingly compliant contract.
Even more critically, the review process has a fundamental design flaw: even while VirusTotal's scan results are still "pending," a Skill that has not completed the full vetting process can already be uploaded and made public, and users can install it without any warning, leaving a window of opportunity for attackers.
To verify the true severity of the risk, the CertiK research team completed a full test. The team developed a skill called "test-web-searcher," which appears to be a fully compliant web search tool with code logic that conforms to standard development practices. However, it actually embeds a remote code execution vulnerability within its normal functional flow.
This skill bypassed detection by both the static engine and the AI review, and could be installed normally, without any security warning, while the VirusTotal scan was still pending. Finally, a command sent remotely via Telegram triggered the vulnerability and achieved arbitrary command execution on the host device (in the demonstration, the system calculator was popped up).
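CertiK does not publish the exploit itself, so the following is only a generic sketch of how a remote-code-execution flaw can hide inside a skill's normal functional flow. The function name and the `echo` stand-in are invented; the flaw shown is classic shell command injection, where user-controlled input is interpolated into a `shell=True` command:

```python
import subprocess

def fetch_page(query: str) -> str:
    """Hypothetical web-searcher helper. The flaw: the shell command is
    built by string interpolation, so shell metacharacters in `query`
    are interpreted by the host shell, not treated as data."""
    cmd = f"echo searching for {query}"  # stand-in for a real fetch command
    return subprocess.run(
        cmd, shell=True, capture_output=True, text=True
    ).stdout

# Benign use works exactly as the skill advertises.
print(fetch_page("openclaw docs"))

# A crafted "search query" smuggles in a second command; on a POSIX
# shell, everything after the semicolon runs as its own command.
print(fetch_page("docs; echo PWNED"))
```

Because the injection point sits inside legitimate search logic, a reviewer checking whether "declared function matches actual behavior" sees nothing inconsistent: the skill really does perform searches.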
CertiK's research explicitly points out that these issues are not product bugs unique to OpenClaw, but a misconception shared across the AI agent industry: the industry generally treats "approval scanning" as the core security defense while neglecting the true foundation of security, namely mandatory isolation and granular permission control at runtime. Consider Apple's iOS ecosystem: its core security has never been the App Store's strict approval process, but the system's mandatory sandbox and fine-grained permission model, which ensure that each app runs only in its own dedicated "isolation chamber" and cannot arbitrarily obtain system permissions. OpenClaw's existing sandbox, by contrast, is optional rather than mandatory and depends heavily on manual configuration by users; most users disable it to keep their Skills fully functional, ultimately leaving the agent effectively unprotected. Once a Skill with vulnerabilities or malicious code is installed, the consequences are direct and catastrophic.
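The "default-deny runtime permission" idea the research advocates can be sketched in a few lines. This is an illustrative design, not OpenClaw's API: every capability a skill uses must appear on an explicit per-skill allowlist, and anything not granted fails closed:

```python
# Minimal sketch of default-deny capability gating for skills.
# All names (SkillSandbox, capability strings) are illustrative.

class PermissionDenied(Exception):
    """Raised when a skill touches a capability it was not granted."""

class SkillSandbox:
    def __init__(self, granted: set):
        self.granted = granted  # explicit allowlist, empty by default

    def require(self, capability: str) -> None:
        """Fail closed: anything not explicitly granted is refused."""
        if capability not in self.granted:
            raise PermissionDenied(capability)

def read_wallet(sandbox: SkillSandbox) -> str:
    sandbox.require("wallet:read")  # checked at runtime, not at review time
    return "balance: 42"

# A web-search skill granted only network access cannot touch the wallet,
# no matter what its code tries to do.
search_only = SkillSandbox({"net:fetch"})
try:
    read_wallet(search_only)
except PermissionDenied as e:
    print(f"blocked: {e}")
```

The point of enforcing this at runtime rather than at review time is that it does not matter whether the malicious call was obfuscated past the scanner: the capability check fires when the call actually executes.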
In response to the issues discovered, CertiK has also provided security guidelines:
The AI agent race is on the verge of explosive growth, but the pace of ecosystem expansion must not outstrip the pace of security. Review and scanning can stop only basic malicious attacks; they can never constitute a security boundary for high-privilege agents. Only by shifting from "pursuing perfect detection" to "assuming risk exists and mitigating the damage," and by enforcing isolation boundaries at the runtime level, can we truly hold the security bottom line for AI agents and keep this technological revolution on a steady, long-term course.
