OpenClaw

Screen Vision

macOS screen OCR & click automation via Apple Vision + ScreenCaptureKit. Capture any window or screen region, extract text with coordinates, find text, and c...

Data and Tables

Author: Jack Yun @jackyun1024

License: MIT-0

MIT-0 · Free to use, modify, and redistribute. No attribution required.

Version: v1.2.0

Stats: ⭐ 0 · 104 · 1 current installs · 1 all-time installs

🛡 VirusTotal: Suspicious · OpenClaw: Benign

Package: screen-vision

Security scan (ClawHub)

  • VirusTotal: Suspicious
  • OpenClaw: Benign

OpenClaw assessment

The skill's requested actions and files are consistent with a macOS screen OCR/click tool; it installs a CLI (via Homebrew/GitHub/build) and requires Screen Recording permission — nothing in the package requests unrelated credentials or surprising system access.

Purpose

Name/description (macOS screen OCR + click) match the included instructions and setup script: the script installs a 'screen-vision' binary (Homebrew, GitHub release, or source build) and 'cliclick' for automation. No unrelated services, credentials, or config paths are requested.

Instruction scope

SKILL.md limits actions to running the CLI and parsing its output (list, ocr, find, tap, wait). It explicitly requires macOS 14+ and Screen Recording permission. There are no instructions to read unrelated files, exfiltrate data, or contact unexpected endpoints.

Install mechanism

Install is handled by the included setup.sh (no separate install spec). The script uses Homebrew where available, otherwise downloads a tarball from the project's GitHub releases or clones/builds the repo via git/swift. Those are typical approaches, but the curl|tar extraction into /usr/local/bin and building from remote source are operations that write binaries to disk and should be reviewed before running.

Credentials

The skill declares no environment variables, no credentials, and no config paths. The setup script does not attempt to read or require unrelated secrets or environment variables.

Persistence

The skill is not forced-always and does not modify other skills. The setup script installs binaries into /usr/local/bin (write to system path) and instructs the user to grant Screen Recording permission to the terminal app — both are expected for a screen-capture tool but are elevated actions that require user consent and attention.

Overall conclusion

This skill appears to do what it says: it installs a CLI that captures screen contents and can simulate clicks. Before installing, review the upstream GitHub repository and release you will download (setup.sh references the project's GitHub releases). Prefer the Homebrew path when possible, or build from source yourself if you want maximum assurance. Be aware you will need to grant Screen Recording permission to your terminal; that permission …

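Install (copy to 龙虾 AI)

Copy the entire block below into a 龙虾中文库 conversation; 龙虾 will complete the installation by following SKILL.md.

Please hand this block to 龙虾中文库 (龙虾 AI) to execute: install the OpenClaw skill "Screen Vision" on this machine. Summary: macOS screen OCR & click automation via Apple Vision + ScreenCaptureKit. Captur….
Please fetch the following URL to read SKILL.md and complete the installation as documented: https://raw.githubusercontent.com/openclaw/skills/refs/heads/main/skills/ls18166407597-design/screen-vision/SKILL.md
(Source: yingzhi8.cn skill library)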

SKILL.md

---
name: screen-vision
description: macOS Local OCR & Automation Tool using Vision Framework. Zero token cost for screen understanding.
metadata:
  {
    "openclaw": {
      "requires": { "bins": ["swift"] }
    }
  }
---

# screen-vision Skill

A fast OCR tool built on the Mac's local Vision framework that gives the AI a pair of "local eyes".

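## Features
- **Zero-token screen recognition**: Screen text is extracted locally, and only the key text and coordinates are passed to the AI.
- **Precise coordinate positioning**: Reports the [X, Y] coordinates of any text recognized on screen.
- **Multi-language support**: Recognizes mixed Chinese and English text.
- **General-purpose automation base**: Combined with the bundled scripts, it can automate clicks and input in any application.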
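
The coordinate feature can be illustrated with a small sketch. Apple's Vision framework reports each text observation's bounding box in normalized coordinates (0 to 1) with the origin at the bottom-left, while click positions are normally measured from the top-left of the screen. Assuming the OCR step hands back boxes in that normalized form, the conversion to a click point looks roughly like this. The function name `bbox_to_click_point` and the `(x, y, width, height)` tuple layout are hypothetical and are not taken from `scripts/vision_ocr.swift`.

```python
def bbox_to_click_point(bbox, screen_width, screen_height):
    """Convert a normalized, bottom-left-origin box to a top-left-origin center point.

    bbox is (x, y, width, height), each in the range 0..1 (hypothetical layout).
    """
    x, y, w, h = bbox
    center_x = (x + w / 2) * screen_width
    # Flip the vertical axis: Vision measures y upward from the bottom edge.
    center_y = (1 - (y + h / 2)) * screen_height
    return round(center_x), round(center_y)


# A box covering the top-left quarter of a 1000x800 screen.
print(bbox_to_click_point((0.0, 0.5, 0.5, 0.5), 1000, 800))  # (250, 200)
```

Returning the center of the box rather than a corner keeps the click inside the recognized text even when the box is slightly loose.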

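## Permission requirements (important)
Because of macOS security restrictions, the user must manually enable the following permissions before using this skill:
1. **System Settings -> Privacy & Security -> Screen Recording**: enable the terminal or app that runs OpenClaw (e.g. Terminal, iTerm2).
2. **System Settings -> Privacy & Security -> Accessibility**: same as above (needed for click actions).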

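## Use cases
- When the user says "help me operate [some app]", run this skill first to scan the interface.
- Automatically monitor on-screen state changes (e.g. balances, notifications, progress bars).
- Recognize non-standard UIs (e.g. Telegram Desktop, specialized professional tools).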
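
The monitoring use case amounts to polling OCR results until a given piece of text appears. Below is a minimal sketch, assuming a caller-supplied `scan` function that returns a list of `(text, x, y)` tuples. That result shape and the names `find_text` and `wait_for_text` are hypothetical and do not describe the skill's actual scripts.

```python
import time


def find_text(results, needle):
    """Return the first (text, x, y) entry whose text contains needle, ignoring case."""
    needle = needle.lower()
    for text, x, y in results:
        if needle in text.lower():
            return text, x, y
    return None


def wait_for_text(scan, needle, timeout=10.0, interval=0.5):
    """Call scan() repeatedly until needle shows up or timeout seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        match = find_text(scan(), needle)
        if match is not None:
            return match
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Stand-in for a real OCR pass: the text appears on the third scan.
frames = iter([[], [("Loading", 100, 40)], [("Balance: 42", 320, 180)]])
print(wait_for_text(lambda: next(frames), "balance", timeout=5.0, interval=0.01))
# ('Balance: 42', 320, 180)
```

Passing `scan` in as a parameter keeps the loop independent of how the OCR is actually invoked, so the same logic can be exercised with canned frames as shown.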

## Internal code
- `scripts/vision_ocr.swift`: runs the local Swift recognition logic.
- `scripts/click.swift`: performs the actual mouse click.