Skill details: Minimax Image Understanding (on-site mirror, no comments)

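Uses multimodal large models to understand image content and generate a description of its business meaning. Supports several models: (1) MiniMax VLM, (2) OpenAI GPT-4V, (3) Claude Vision. Intended for screenshots, charts, photographed documents, and similar images, producing precise text descriptions.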

Media and Content

License: MIT-0 · Free to use, modify, and redistribute. No attribution required.

Version: v1.0.0

Stats: ⭐ 0 · 272 · 3 current installs · 3 all-time installs

Package: aidescend/minimax-image-understanding

Security scan (ClawHub)

  • VirusTotal: benign
  • OpenClaw: suspicious

OpenClaw assessment

The skill generally matches its stated purpose (sending a local image to a chosen multimodal model and returning a description) but has implementation inconsistencies (undeclared dependencies and an unexpected use of curl) and sends full image data to external endpoints — review dependencies and trust of remote APIs before installing.

Purpose

Name/description (image understanding via MiniMax/OpenAI/Anthropic) align with the included script and SKILL.md: the code reads a local image, base64-encodes it, and sends it to the selected model provider for analysis. Required environment variables listed in SKILL.md correspond to the providers used.

Instruction scope

Runtime instructions and the script are scoped to reading a local image file and sending it to a model provider; they do not access unrelated system files or secrets. However, the skill transmits the entire image (base64-encoded) to remote APIs, so image confidentiality and provider trust are security considerations the user should evaluate.
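
To make the confidentiality point concrete, here is a minimal sketch of what base64-encoding a local image for upload can look like. The helper name `encode_image_as_data_url`, the `MIME_TYPES` mapping, and the data-URL format are assumptions for illustration; the actual script's payload format may differ.

```python
import base64
from pathlib import Path

# Extensions listed in SKILL.md; the MIME mapping is an assumption for this sketch.
MIME_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def encode_image_as_data_url(image_path):
    """Return the whole image file as a base64 data URL (hypothetical helper)."""
    path = Path(image_path)
    mime_type = MIME_TYPES.get(path.suffix.lower())
    if mime_type is None:
        raise ValueError(f"unsupported image extension: {path.suffix!r}")
    # Every byte of the file is encoded, so the full image leaves the machine.
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

Base64 expands the payload by roughly one third, and nothing is cropped or redacted along the way, so anything visible in the image is visible to the provider.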

Install mechanism

No install spec is provided, but the script relies on external tools and libraries: it calls the 'curl' binary for the MiniMax path and imports the Python 'requests' module for OpenAI/Anthropic. The registry metadata claims 'required binaries: none', which contradicts the actual script requirements; this omission can cause runtime failures and indicates incomplete packaging/documentation.
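
Because these dependencies are undeclared, a preflight check before the first run may save a confusing failure. The sketch below is not part of the skill; `missing_dependencies` is a hypothetical helper, and the per-model requirements are taken from the assessment above rather than verified against the script.

```python
import importlib.util
import shutil
from typing import List


def missing_dependencies(model: str) -> List[str]:
    """List dependencies that appear to be absent for the chosen model (hypothetical helper)."""
    missing = []
    if model == "minimax":
        # The MiniMax path reportedly shells out to the curl binary.
        if shutil.which("curl") is None:
            missing.append("curl (binary)")
    elif model in ("openai", "anthropic"):
        # The OpenAI/Anthropic paths reportedly import the requests module.
        if importlib.util.find_spec("requests") is None:
            missing.append("requests (Python module)")
    else:
        raise ValueError(f"unknown model: {model!r}")
    return missing
```

An empty list means the check found nothing missing; it does not prove the script will run, only that the two dependencies named in the assessment are present.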

Credentials

The env vars mentioned (MINIMAX_API_KEY, MINIMAX_API_HOST, OPENAI_API_KEY, ANTHROPIC_API_KEY) match the services the skill integrates with and are proportionate to its purpose. No unrelated credentials or additional config paths are requested.

Persistence

The skill does not request permanent presence (always:false) and does not modify other skills or system-wide settings. It runs on demand and does not persist credentials or change agent configuration.

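Install (copy to 龙虾 AI)

Copy the entire block below into a 龙虾中文库 conversation; 龙虾 will complete the installation by following SKILL.md.

Please hand this block to 龙虾中文库 (龙虾 AI) to execute: install the OpenClaw skill "Minimax Image Understanding" on this machine. Summary: Uses multimodal large models to understand image content and generate a description of its business meaning. Supports several models: (1) MiniMax VLM (2) OpenAI GPT-4V (3) Claude Vis….
Please fetch the following URL to read SKILL.md and complete the installation as documented: https://raw.githubusercontent.com/openclaw/skills/refs/heads/main/skills/aidescend/minimax-image-understanding/SKILL.md
(Source: yingzhi8.cn skill library)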

SKILL.md

---
name: minimax-image-understanding
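description: "Uses multimodal large models to understand image content and generate a description of its business meaning. Supports several models: (1) MiniMax VLM (2) OpenAI GPT-4V (3) Claude Vision. Intended for screenshots, charts, photographed documents, and similar images, producing precise text descriptions."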
---

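# Image Understanding

Calls a multimodal large model to understand an image and generate a precise description of its business meaning.

## Supported models

| Model | Environment variables | Notes |
|-------|-----------------------|-------|
| MiniMax VLM | `MINIMAX_API_KEY`, `MINIMAX_API_HOST` | Default; recommended for Chinese-language understanding |
| OpenAI | `OPENAI_API_KEY` | GPT-4V |
| Anthropic | `ANTHROPIC_API_KEY` | Claude Vision |

## Usage

### Prerequisites

Set the environment variables for the model you want to use (at least one):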

```bash
# MiniMax (default)
export MINIMAX_API_KEY="your-minimax-key"
export MINIMAX_API_HOST="https://api.minimaxi.com"

# Or OpenAI
export OPENAI_API_KEY="your-openai-key"

# Or Anthropic
export ANTHROPIC_API_KEY="your-anthropic-key"
```
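
A quick way to confirm the right variables are set before calling the script is sketched below. `REQUIRED_ENV` and `missing_env_vars` are hypothetical names, and treating `MINIMAX_API_HOST` as mandatory is an assumption: the script may fall back to a default host.

```python
import os

# Variable names come from the table above; whether each is strictly
# required by the script is an assumption of this sketch.
REQUIRED_ENV = {
    "minimax": ("MINIMAX_API_KEY", "MINIMAX_API_HOST"),
    "openai": ("OPENAI_API_KEY",),
    "anthropic": ("ANTHROPIC_API_KEY",),
}


def missing_env_vars(model, environ=None):
    """Return the names of unset or empty variables for the given model."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV[model] if not env.get(name)]
```

Passing `environ` explicitly is only there to make the check easy to test; in normal use, `missing_env_vars("minimax")` reads the current process environment.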

### Running the script

```bash
python3 <skill>/scripts/understand_image.py <image-path> [model] [prompt]
```

**Parameters:**
- image path: a local image file (PNG, JPG, JPEG, GIF, WebP)
- model (optional): `minimax` (default), `openai`, `anthropic`
- prompt (optional): a custom prompt
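
The positional arguments above could be interpreted along the following lines. This is a sketch of the documented interface, not the script's actual code; `parse_args` and both constants are hypothetical, and the real script may validate differently or not at all.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}
SUPPORTED_MODELS = ("minimax", "openai", "anthropic")


def parse_args(argv):
    """Parse [image_path, model?, prompt?] into a tuple, applying the documented defaults."""
    if not argv:
        raise ValueError("an image path is required")
    image_path = Path(argv[0])
    model = argv[1] if len(argv) > 1 else "minimax"  # documented default
    prompt = argv[2] if len(argv) > 2 else None      # None: use the script's built-in prompt
    if image_path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported image type: {image_path.suffix!r}")
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unknown model: {model!r}")
    return image_path, model, prompt
```

Because the arguments are positional, a custom prompt can only be given when the model is also spelled out, even if it is the default `minimax`.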

### Examples

```bash
# Use the default (MiniMax)
python3 ~/.openclaw/workspace/skills/minimax-image-understanding/scripts/understand_image.py /path/to/image.png

# Specify a model
python3 ~/.openclaw/workspace/skills/minimax-image-understanding/scripts/understand_image.py /path/to/image.png openai

# Custom prompt
python3 ~/.openclaw/workspace/skills/minimax-image-understanding/scripts/understand_image.py /path/to/image.png minimax "Describe the data trends in the chart"
```
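
If the script needs to be driven from another Python program, the same invocations can be wrapped with `subprocess`. The sketch below assumes the install path used in the examples above and assumes the description is printed to standard output; `build_command` and `describe_image` are hypothetical helpers.

```python
import subprocess
from pathlib import Path

# Install location taken from the examples above; adjust if the skill lives elsewhere.
SCRIPT = Path(
    "~/.openclaw/workspace/skills/minimax-image-understanding/scripts/understand_image.py"
)


def build_command(image_path, model="minimax", prompt=None):
    """Build the argv list for one invocation of the script."""
    command = ["python3", str(SCRIPT.expanduser()), str(image_path), model]
    if prompt is not None:
        command.append(prompt)
    return command


def describe_image(image_path, model="minimax", prompt=None):
    """Run the script and return whatever it prints (assumed to be the description)."""
    result = subprocess.run(
        build_command(image_path, model, prompt),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

Passing the arguments as a list avoids shell quoting problems when the prompt contains spaces or quotes.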

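## Output

The script prints a description of the image's business meaning directly. Rather than listing element positions, it focuses on data content and business logic.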