Meerkat Governance: Skill (openclaw Chinese news site)

Skill details (on-site mirror, no comments)

AI governance API with two endpoints. Shield scans untrusted content for prompt injection and threats. Verify checks AI output for hallucinations, numerical...

Media and content

License: MIT-0. Free to use, modify, and redistribute. No attribution required.

Version: v1.0.4

Stats: ⭐ 0 · 406 · 0 current installs · 0 all-time installs


Package: 7789996399/meerkat-governance

Security scan (ClawHub)

VirusTotal: benign
OpenClaw: benign

OpenClaw assessment

The skill's requirements and runtime instructions match its stated purpose (an external governance API accessed via an API key); there are no code files or surprising installs, and the only required secret is an API key for the described service.

Purpose

Name/description (governance: shield and verify endpoints) align with what the skill asks for: a single MEERKAT_API_KEY and curl-style HTTP calls to api.meerkatplatform.com. Nothing in the metadata or SKILL.md requires unrelated cloud credentials, system binaries, or local configuration.

Instruction scope

SKILL.md contains explicit curl examples and descriptions of the two API endpoints and does not instruct the agent to read unrelated files, search system state, or exfiltrate other credentials. It states the developer controls which content is sent and that the skill does not auto-activate, which is consistent with an instruction-only API integration.

Install mechanism

There is no install spec and no code files — the skill is instruction-only and relies on outbound HTTPS requests. That is low-risk compared with download-and-exec install mechanisms.

Credentials

The skill requires a single API key (MEERKAT_API_KEY), which is proportionate for a hosted API service. Minor inconsistency: registry metadata shows no 'primary credential' but the SKILL.md and requirements declare MEERKAT_API_KEY; this is a small metadata omission rather than a security problem.

Persistence

always is false and the skill is instruction-only with no install, so it does not request persistent system presence. The normal platform default allowing autonomous invocation remains, but that is expected and not combined with other red flags here.

Overall conclusion

This skill appears coherent: it calls an external governance API and needs one API key. Before installing, verify the endpoint hostname (api.meerkatplatform.com) and TLS certificate, review Meerkat's privacy and data-retention policy, restrict and rotate the API key if possible, and monitor X-Meerkat-Usage/X-Meerkat-Remaining headers for unexpected activity. Because the skill makes outbound requests, avoid sending highly sensitive raw data unl…

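Install (copy to Lobster AI)

Copy the entire passage below into a Lobster Chinese Library conversation, and Lobster will complete the installation according to SKILL.md.

Please hand this passage to Lobster Chinese Library (Lobster AI) to execute: install the OpenClaw skill "Meerkat Governance" on this machine. Summary: AI governance API with two endpoints. Shield scans untrusted content for prompt…
Please fetch the following address to read SKILL.md and complete the installation as documented: https://raw.githubusercontent.com/openclaw/skills/refs/heads/main/skills/7789996399/meerkat-governance/SKILL.md
(Source: yingzhi8.cn skill library)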

SKILL.md


---
name: meerkat-governance
description: AI governance API with two endpoints. Shield scans untrusted content for prompt injection and threats. Verify checks AI output for hallucinations, numerical errors, and manipulation against source data. Returns structured results with trust scores and remediation guidance. Full audit trail.
homepage: https://meerkatplatform.com
metadata:
  clawdbot:
    emoji: "🔒"
    requires:
      env:
        - MEERKAT_API_KEY
    tags:
      - security
      - governance
      - safety
---

# Meerkat Governance

Scope: This skill provides two API endpoints your agent can call. It does not auto-activate, does not run in the background, and does not access content unless explicitly called by the agent. The developer controls what content is sent to Meerkat.

Privacy and data handling: https://meerkatplatform.com/privacy
Meerkat processes content in memory and discards it after the response. Only trust scores and metadata are stored. No raw content is retained. No data is shared with third parties. All processing stays in Canada.

Security: Your API key authenticates requests to Meerkat's API. Rotate keys via the dashboard if compromised. All communication is TLS 1.2+ encrypted. Meerkat endpoints are hosted on Azure Container Apps with managed SSL certificates. Verify the endpoint hostname (api.meerkatplatform.com) matches the TLS certificate before sending data.

## Ingress Shield

The `/v1/shield` endpoint scans content for prompt injection, jailbreaks, data exfiltration, and social engineering. The agent can call this before processing content the developer designates as untrusted. Common examples include external emails, web-scraped content, and user-uploaded documents. Developers can optionally configure their agent to shield skill descriptions before installation.

```bash
# Replace <THE_CONTENT> with the JSON-escaped text to scan.
curl -s -X POST https://api.meerkatplatform.com/v1/shield \
  -H "Authorization: Bearer $MEERKAT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": "<THE_CONTENT>"}'
```

**Response fields:**
- `safe` (boolean): Whether the content passed scanning
- `threat_level`: `NONE`, `LOW`, `MEDIUM`, `HIGH`, or `CRITICAL`
- `attack_type`: Category of detected threat (if any)
- `detail`: Human-readable description
- `sanitized_input`: Content with threats removed (when available)
- `audit_id`: Unique identifier for the audit record

The agent can use the response to decide how to proceed. For example, content flagged `HIGH` or `CRITICAL` could be blocked, while `MEDIUM` could prompt user confirmation. If `sanitized_input` is returned, the agent can use the cleaned version.
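The policy described above could be sketched as a small mapping from `threat_level` to an action. The function name `shield_action` and the action labels are hypothetical, and treating `NONE` and `LOW` as allowed, or failing closed on an unrecognized level, are assumptions rather than anything Meerkat prescribes.

```bash
# Hypothetical policy helper: map a threat_level value to an agent action.
shield_action() {
  case "$1" in
    HIGH|CRITICAL) echo block ;;    # suggested in the text above
    MEDIUM)        echo confirm ;;  # ask the user before proceeding
    NONE|LOW)      echo allow ;;    # assumption: low-risk content proceeds
    *)             echo block ;;    # assumption: fail closed on unknown values
  esac
}
```

Calling `shield_action "$level"` with the `threat_level` extracted from the response gives the agent a single value to branch on.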

## Egress Verify

The `/v1/verify` endpoint checks AI-generated output against source data using up to five ML checks: entailment (DeBERTa NLI), numerical verification, semantic entropy, implicit preference detection, and claim extraction.

```bash
# Replace each <PLACEHOLDER> with a JSON-escaped value.
curl -s -X POST https://api.meerkatplatform.com/v1/verify \
  -H "Authorization: Bearer $MEERKAT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": "<USER_REQUEST>", "output": "<AI_OUTPUT>", "context": "<SOURCE_DATA>", "domain": "<DOMAIN>"}'
```

The `domain` field applies domain-specific rules. Supported values: `healthcare`, `financial`, `legal`, `general`.
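Because only four `domain` values are listed, a caller may want to reject anything else before making a request. This is a minimal sketch; `valid_domain` is a hypothetical helper, and how the API itself responds to an unsupported value is not stated here.

```bash
# Hypothetical guard: succeed only for the domain values listed above.
valid_domain() {
  case "$1" in
    healthcare|financial|legal|general) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `valid_domain "$domain" || domain=general` would fall back to `general`, which is one possible choice and not a documented default.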

**Response fields:**
- `trust_score` (0-100): Weighted composite score across all checks
- `status`: `PASS` or `FLAG` (severity communicated via `trust_score` and `remediation.severity`)
- `checks`: Per-check scores, flags, and details
- `remediation`: Corrections and agent instructions (when status is not `PASS`)
- `audit_id`: Unique identifier for the audit record
- `session_id`: Session identifier for linking retry attempts

The agent can use the status and trust score to decide whether to proceed. When `remediation` is present, the `agent_instruction` field contains guidance for self-correction, and `corrections` lists specific errors (e.g., found value vs expected value). The agent can regenerate output with corrections applied and resubmit using the same `session_id` to link attempts.
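One way to turn this into a retry loop is to decide, after each verify call, whether to proceed, regenerate and resubmit, or hand off to a human. The sketch below is an illustration only: `verify_next_step`, the `MIN_TRUST_SCORE` and `MAX_ATTEMPTS` variables, their defaults of 80 and 3, and the action labels are all hypothetical, and it assumes `trust_score` arrives as an integer.

```bash
# Hypothetical decision helper for the verify/retry loop.
# Arguments: status (PASS or FLAG), trust_score (integer 0-100), attempt number.
verify_next_step() {
  status=$1 score=$2 attempt=$3
  min_score=${MIN_TRUST_SCORE:-80}  # assumed threshold, not a Meerkat default
  max_attempts=${MAX_ATTEMPTS:-3}   # assumed retry budget
  if [ "$status" = "PASS" ] && [ "$score" -ge "$min_score" ]; then
    echo proceed
  elif [ "$attempt" -lt "$max_attempts" ]; then
    echo retry      # regenerate with corrections, resubmit with the same session_id
  else
    echo escalate   # retry budget exhausted; hand off for review
  fi
}
```

Requiring both `PASS` and a minimum score is stricter than relying on `status` alone; dropping the score check gives the simpler behavior.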

## Observation Mode

When no `context` field is provided, Meerkat runs in observation mode: it checks semantic entropy and implicit preference but skips source-grounded checks. The `context_mode` field in the response will be `observation`. This is useful for checking open-ended generation where no source document exists.

## Audit Trail

Every shield and verify call is logged with an audit ID. The `/v1/audit/<audit_id>` endpoint retrieves the full record. Add `?include_session=true` to see all linked attempts in a retry session.

```bash
# Replace <audit_id> with the audit_id value returned by shield or verify.
curl -s "https://api.meerkatplatform.com/v1/audit/<audit_id>" \
  -H "Authorization: Bearer $MEERKAT_API_KEY"
```
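The audit URL, with or without the session query parameter, can be assembled by a small helper. This is a sketch: `audit_url` and its `session` argument are hypothetical names, and `aud_123` in the usage note is a made-up ID, since the real ID format is not described here.

```bash
# Hypothetical helper: build the audit URL for a given audit ID.
# Pass "session" as the second argument to include linked retry attempts.
audit_url() {
  url="https://api.meerkatplatform.com/v1/audit/$1"
  if [ "${2:-}" = "session" ]; then
    url="$url?include_session=true"
  fi
  printf '%s\n' "$url"
}
```

A call such as `curl -s "$(audit_url aud_123 session)" -H "Authorization: Bearer $MEERKAT_API_KEY"` would then fetch the record together with its linked attempts.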

## Setup

1. Get a free API key at https://meerkatplatform.com (10,000 verifications/month, no credit card)
2. Set the environment variable: `MEERKAT_API_KEY=mk_live_your_key_here`
3. The developer controls which content is sent to Meerkat through their agent configuration. The agent calls the shield endpoint before processing untrusted external content, and the verify endpoint before executing high-impact actions.

## Detection Capabilities

See https://meerkatplatform.com/docs for example payloads and response formats.

**Ingress** detects: prompt injection, indirect injection, data exfiltration attempts, jailbreak and role-hijacking patterns, credential harvesting, and social engineering.

**Egress** detects: hallucinated facts, numerical distortions (medication doses, financial figures, legal terms), fabricated entities and citations, confident confabulation, bias and implicit preference, and ungrounded numbers.

## Usage Headers

Every API response includes usage headers:
- `X-Meerkat-Usage`: Current verification count
- `X-Meerkat-Limit`: Monthly limit (or "unlimited")
- `X-Meerkat-Remaining`: Verifications remaining
- `X-Meerkat-Warning`: Warning when approaching limit (80%+)
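To act on these headers from a script, the raw response headers can be captured with curl's `-D` option and filtered. The sketch below assumes plain `Name: value` header lines on standard input; `header_value` is a hypothetical helper name, and matching is case-insensitive because HTTP header names are.

```bash
# Hypothetical helper: print the value of one response header.
# Reads raw header lines on stdin, e.g. from `curl -s -D - -o /dev/null ...`.
header_value() {
  awk -v name="$1" '
    BEGIN { want = tolower(name) ":" }
    tolower(substr($0, 1, length(want))) == want {
      value = substr($0, length(want) + 1)
      sub(/^[ \t]+/, "", value)  # drop leading whitespace after the colon
      sub(/\r$/, "", value)      # drop the trailing CR of a CRLF line ending
      print value
      exit
    }'
}
```

Piping saved headers through `header_value X-Meerkat-Remaining` yields the remaining count, which a script could compare against its own alert threshold.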

## Privacy

Meerkat processes content for security scanning only. Content is not stored beyond the audit trail retention period. Your API key is scoped to your organization. See https://meerkatplatform.com/privacy for details.