Gemini Video Analyzer | Skill | OpenClaw Chinese News Site

Skill details (on-site mirror, no comments)

Native video analysis using Google Gemini API. Upload and analyze video files — describe scenes, extract text/UI, answer questions about content, transcribe...

Media & Content

License: MIT-0

MIT-0 · Free to use, modify, and redistribute. No attribution required.

Version: v1.0.0

Stats: ⭐ 0 · 525 · 1 current installs · 1 all-time installs

🛡 VirusTotal: Suspicious · OpenClaw: Benign

Package: aiwithabidi/a6-gemini-video-analyzer

Security scan (ClawHub)

VirusTotal: Suspicious
OpenClaw: Benign

OpenClaw assessment

The skill is internally coherent: it asks only for a Google AI API key and runs Python scripts that upload videos to Google's Generative Language / Files API and request Gemini analysis as described.

Purpose

Name, description, and included scripts consistently implement video upload + Gemini model analysis against generativelanguage.googleapis.com. The single required credential (GOOGLE_AI_API_KEY) is the expected credential for this purpose. Minor mismatch: the declared required binaries include curl although the provided scripts use only python3/urllib; this is a small inconsistency but not evidence of malicious intent.

Instruction scope

Runtime instructions and scripts explicitly upload user video files to Google Files API and then call the Gemini model — this is consistent with the stated purpose. Important privacy note: videos (and any text/UI/audio they contain) are transmitted to Google and may be processed server-side and retained per the API (SKILL.md claims ~48h retention). The instructions do not read unrelated files or other environment variables.

Install mechanism

This is instruction-only plus two Python scripts with no install spec. Nothing is downloaded from third-party URLs during install; risk from installation is low. The scripts perform network calls at runtime (to Google endpoints) which is expected for this skill.

Credentials

Only GOOGLE_AI_API_KEY is requested and used, which is proportionate to contacting Google's Files/Generative Language APIs. Users should ensure the API key is scoped/restricted (project, API quotas, billing) because it could be used to bill requests or access other Google APIs depending on key permissions. The skill does not request unrelated secrets or config paths.

Persistence

The skill is not force-included (always: false) and does not request persistent system-wide privileges or modify other skills. It runs as-invoked and uses only its own scripts and the provided API key.

Overall conclusion

This skill appears to do what it says: it uploads videos to Google's Generative Language/Files API and asks Gemini to analyze them. Before installing or running: (1) Be aware that videos will be uploaded off your machine to Google — avoid uploading sensitive footage unless you accept that. (2) Use a restricted API key (limit to the specific project/APIs, set quotas, and rotate or revoke when done) to reduce blast radius if the key is leaked. (…

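Install (copy to Lobster AI)

Copy the entire passage below into a Lobster Chinese Library (龙虾中文库) conversation; Lobster will complete the installation by following SKILL.md.

Please hand this passage to Lobster Chinese Library (Lobster AI) to execute: install the OpenClaw skill "Gemini Video Analyzer" on this machine. Summary: Native video analysis using Google Gemini API. Upload and analyze video files —….
Please fetch the following address to read SKILL.md and complete the installation as documented: https://raw.githubusercontent.com/openclaw/skills/refs/heads/main/skills/aiwithabidi/a6-gemini-video-analyzer/SKILL.md
(Source: yingzhi8.cn skill library)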

SKILL.md

Open the original SKILL.md (GitHub raw)

---
name: gemini-video-analyzer
description: |
  Native video analysis using Google Gemini API. Upload and analyze video files — describe scenes, extract text/UI, answer questions about content, transcribe speech, identify objects and actions. Use when: (1) User sends a video file and wants it analyzed, (2) Video summarization or description needed, (3) Extracting text, UI elements, or information from screen recordings, (4) Answering questions about video content, (5) Comparing multiple videos, (6) Analyzing tutorials, demos, or walkthroughs.
homepage: https://www.agxntsix.ai
metadata:
  {
    "openclaw":
      {
        "emoji": "🎬",
        "requires": { "bins": ["python3", "curl"], "env": ["GOOGLE_AI_API_KEY"] },
        "primaryEnv": "GOOGLE_AI_API_KEY",
      },
  }
---

# Gemini Video Analyzer

Analyze videos natively using Google Gemini's multimodal API. No frame extraction needed — Gemini processes video at 1 FPS with full motion, audio, and visual understanding.

## Quick Start

```bash
# Analyze a video with default prompt (full description)
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/analyze.py /path/to/video.mp4

# Ask a specific question
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/analyze.py /path/to/video.mp4 "What text is visible on screen?"

# Manage uploaded files
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/manage_files.py list
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/manage_files.py cleanup
```
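The source of `manage_files.py` is not shown here, so the following is only a sketch of what `list` and `cleanup` could boil down to. It assumes the `v1beta` REST surface of the Files API (`GET /v1beta/files`, `DELETE /v1beta/files/<id>`) and the `x-goog-api-key` header; the function names are hypothetical. The requests are built but not sent.

```python
import urllib.request

# Assumed base URL for the Gemini Files API (v1beta REST surface).
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def build_list_request(api_key: str, page_size: int = 100) -> urllib.request.Request:
    """Build (but do not send) a request that lists uploaded files."""
    return urllib.request.Request(
        f"{BASE_URL}/files?pageSize={page_size}",
        headers={"x-goog-api-key": api_key},
        method="GET",
    )


def build_delete_request(api_key: str, file_name: str) -> urllib.request.Request:
    """Build (but do not send) a request that deletes one uploaded file.

    file_name is the resource name reported by the API, e.g. "files/abc123".
    """
    if not file_name.startswith("files/"):
        raise ValueError(f"expected a name like 'files/<id>', got {file_name!r}")
    return urllib.request.Request(
        f"{BASE_URL}/{file_name}",
        headers={"x-goog-api-key": api_key},
        method="DELETE",
    )
```

A cleanup pass would then send the list request with `urllib.request.urlopen`, read the file names from the JSON response, and issue one delete request per name.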

## Supported Formats

MP4, AVI, MOV, MKV, WebM, FLV, MPEG, MPG, WMV, 3GP — up to 2GB per file.
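Checking a file locally before upload avoids a wasted transfer. The sketch below is not taken from the skill's scripts: `check_video` is a hypothetical helper, the extension set mirrors the list above, and "2GB" is read as 2 GiB, which is an assumption about the exact byte limit.

```python
import os
from typing import List, Optional

# Extensions taken from the supported-formats list, lowercased.
SUPPORTED_EXTENSIONS = {
    ".mp4", ".avi", ".mov", ".mkv", ".webm",
    ".flv", ".mpeg", ".mpg", ".wmv", ".3gp",
}
# Assumption: "2GB" is interpreted as 2 GiB.
MAX_BYTES = 2 * 1024**3


def check_video(path: str, size_bytes: Optional[int] = None) -> List[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes is None:
        # Only touch the filesystem when the caller did not supply a size.
        size_bytes = os.path.getsize(path)
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the {MAX_BYTES}-byte limit")
    return problems
```

The check is by file extension only; it does not inspect the container, so a mislabeled file would still pass.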

## How It Works

1. Video uploads to Google's Files API (temporary, auto-deletes after 48h)
2. Gemini processes at 1 frame/sec — understands motion, transitions, audio context
3. Model generates response based on your prompt
4. Because the model sees motion, transitions, and audio in sequence, this captures temporal content that analyzing extracted frames one by one would miss
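The steps between upload and response can be sketched as follows. This is an illustration, not the code in `analyze.py`: it assumes the uploaded file reports a `state` of `PROCESSING`, `ACTIVE`, or `FAILED`, that the model is called through the `v1beta` `generateContent` endpoint, and that the key is passed in the `x-goog-api-key` header. `wait_until_active`, `get_state`, and `build_generate_request` are hypothetical names.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def wait_until_active(get_state: Callable[[], str],
                      timeout_s: float = 300.0,
                      interval_s: float = 2.0,
                      sleep: Callable[[float], None] = time.sleep) -> None:
    """Poll get_state() until the uploaded file reports ACTIVE.

    get_state is a caller-supplied function that fetches the file's
    metadata and returns its "state" field.
    """
    waited = 0.0
    while True:
        state = get_state()
        if state == "ACTIVE":
            return
        if state == "FAILED":
            raise RuntimeError("video processing failed on the server")
        if waited >= timeout_s:
            raise TimeoutError(f"file still {state} after {timeout_s}s")
        sleep(interval_s)
        waited += interval_s


def build_generate_request(api_key: str, file_uri: str, mime_type: str,
                           prompt: str,
                           model: str = "gemini-2.5-flash") -> urllib.request.Request:
    """Build (but do not send) a generateContent request for one video."""
    body = {
        "contents": [{
            "parts": [
                # Reference to the already-uploaded video, then the question.
                {"file_data": {"mime_type": mime_type, "file_uri": file_uri}},
                {"text": prompt},
            ]
        }]
    }
    return urllib.request.Request(
        f"{BASE_URL}/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )
```

Passing `sleep` and `get_state` in as parameters keeps the polling loop testable without network access or real delays.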

## Use Cases

| Task | Example Prompt |
|------|---------------|
| General description | *(default — no prompt needed)* |
| UI/text extraction | `"What text and UI elements are visible?"` |
| Tutorial summary | `"Summarize the steps shown in this tutorial"` |
| Bug report from video | `"Describe what went wrong in this screen recording"` |
| Meeting notes | `"Summarize the key points discussed"` |
| Content comparison | Upload 2 videos, ask for differences |

## Configuration

Set `GOOGLE_AI_API_KEY` in your environment or `.env` file. Get a free key at [aistudio.google.com](https://aistudio.google.com/apikey).
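How the scripts actually read the key is not shown here. One plausible lookup order, sketched below with hypothetical helper names, is to prefer the process environment and fall back to a simple `KEY=VALUE` parse of `.env`; both the precedence and the parsing rules are assumptions.

```python
from typing import Dict, Mapping


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are ignored."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key.startswith("export "):
            # Tolerate shell-style "export KEY=VALUE" lines.
            key = key[len("export "):].strip()
        # Drop surrounding whitespace and one layer of quote characters.
        values[key] = value.strip().strip("'\"")
    return values


def resolve_api_key(environ: Mapping[str, str], dotenv_text: str = "") -> str:
    """Prefer the process environment, then fall back to .env contents."""
    key = (environ.get("GOOGLE_AI_API_KEY")
           or parse_dotenv(dotenv_text).get("GOOGLE_AI_API_KEY"))
    if not key:
        raise RuntimeError("GOOGLE_AI_API_KEY is not set in the environment or .env")
    return key
```

In a script this would be called as `resolve_api_key(os.environ, dotenv_text)`, where `dotenv_text` is the content of `.env` if that file exists and an empty string otherwise.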

Default model: `gemini-2.5-flash` (fast, cheap, excellent vision). Override with `--model gemini-2.5-pro` for complex analysis.
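A command line with this shape (a video path, an optional prompt, and a `--model` override) could be parsed as in the sketch below. It illustrates the documented interface rather than reproducing `analyze.py`; `build_parser` and the `DEFAULT_PROMPT` text are hypothetical.

```python
import argparse

DEFAULT_MODEL = "gemini-2.5-flash"
# Hypothetical wording; the skill's real default prompt is not shown here.
DEFAULT_PROMPT = "Describe this video in detail."


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for: analyze.py VIDEO [PROMPT] [--model MODEL]."""
    parser = argparse.ArgumentParser(description="Analyze a video with Gemini")
    parser.add_argument("video", help="path to the video file")
    parser.add_argument("prompt", nargs="?", default=DEFAULT_PROMPT,
                        help="question to ask about the video")
    parser.add_argument("--model", default=DEFAULT_MODEL,
                        help="model name, e.g. gemini-2.5-pro")
    return parser
```

With this parser, `build_parser().parse_args(["video.mp4", "--model", "gemini-2.5-pro"])` yields the default prompt and the overridden model.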

## API Reference

See [references/gemini-files-api.md](references/gemini-files-api.md) for file upload limits, processing details, and advanced options.