Gemini Voice Assistant - Skill - openclaw Chinese News Site

Skill details (on-site mirror, no comments)

Voice-to-voice AI assistant using Gemini Live API. Speak to the AI and get spoken responses. Use when you want to have natural voice conversations with an AI...

Communication and Messaging

Author: Ali Mostafa Radwan @AliMostafaRadwan

License: MIT-0 · Free to use, modify, and redistribute. No attribution required.

Version: v1.0.0

Stats: ⭐ 1 · 433 · 0 current installs · 0 all-time installs


🛡 VirusTotal: Suspicious · OpenClaw: Suspicious

Package: alimostafaradwan/gemini-voice-assistant

Security scan (ClawHub)


OpenClaw assessment

The skill's code largely matches its voice-assistant description, but there are metadata inconsistencies (registry says no env required while the code and skill.json require GEMINI_API_KEY) and a few operational behaviors (reading a local .env, writing audio to /tmp, spawning ffmpeg) you should review before installing.

Purpose

handler.py implements a Gemini Live audio/text client, depends on google-genai and audio libraries, and uses ffmpeg for conversion, all of which is coherent with a 'Gemini Voice Assistant'. However, the registry metadata provided to the evaluator claimed 'Required env vars: none', while skill.json and the code require GEMINI_API_KEY. Resolve that mismatch before trusting the package source.

Instruction scope

SKILL.md instructions map directly to the CLI entrypoint in handler.py. The runtime reads a .env file in the skill directory (documented) and uses GEMINI_API_KEY from the environment; it writes temporary audio to /tmp and invokes ffmpeg. The instructions do not attempt to read unrelated system files or send data to endpoints other than the Gemini API.

Install mechanism

There is no automated install spec (instruction-only behavior plus a Python script). Dependencies are standard Python packages and FFmpeg is expected to be present on the host. No external archive downloads or custom installers are present in the skill bundle.

Credentials

Requiring a single GEMINI_API_KEY is proportionate to contacting Gemini. The code will also load any key-value pairs from a local .env file into the process environment (only if present), so any secrets stored there may be read by the skill — ensure that .env contains only the intended API key. The earlier registry claim of 'no env vars' contradicts the code and skill.json, which is concerning.
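
One way to act on the advice about `.env` is to list any variable names in the file other than the documented key before running the skill. The following is a minimal sketch, not part of the skill: `unexpected_env_keys` and `ALLOWED_KEYS` are hypothetical names, and the parser assumes simple `KEY=value` lines (it does not replicate whatever parsing handler.py actually performs).

```python
from pathlib import Path

# Assumption: GEMINI_API_KEY is the only variable the skill needs (per its docs).
ALLOWED_KEYS = {"GEMINI_API_KEY"}


def unexpected_env_keys(env_text: str, allowed=ALLOWED_KEYS) -> list[str]:
    """Return variable names defined in .env-style text that are not in `allowed`."""
    extras = []
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        # Tolerate an optional shell-style "export " prefix.
        if key.startswith("export "):
            key = key[len("export "):].strip()
        if key not in allowed:
            extras.append(key)
    return extras


if __name__ == "__main__":
    env_file = Path(".env")  # run from the skill directory
    if env_file.exists():
        extras = unexpected_env_keys(env_file.read_text(encoding="utf-8"))
        print("Unexpected keys:", extras or "none")
    else:
        print("No .env file found")
```

Only key names are reported, never values, so the output is safe to paste into a review note.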

Persistence

The skill does not request always:true and does not modify other skills or global config. It does create audio files under /tmp and leaves OGG output there; this is local persistence but not an elevated platform privilege.
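
Because the OGG output is left behind, periodic cleanup is up to the operator. The sketch below shows one possible approach; it assumes output files match `gemini_voice_*.ogg` (inferred from the example path in the skill's documented response, not confirmed from the code), and both function names are hypothetical.

```python
import time
from pathlib import Path


def stale_voice_files(directory="/tmp", max_age_seconds=3600, now=None):
    """Return gemini_voice_*.ogg files in `directory` older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    return sorted(
        p for p in Path(directory).glob("gemini_voice_*.ogg")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def remove_stale_voice_files(directory="/tmp", max_age_seconds=3600):
    """Delete stale files and return how many were removed."""
    removed = 0
    for path in stale_voice_files(directory, max_age_seconds):
        path.unlink(missing_ok=True)  # ignore files that vanished in the meantime
        removed += 1
    return removed
```

Calling `stale_voice_files()` first gives a dry run of what `remove_stale_voice_files()` would delete.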

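Install (copy to Lobster AI)

Copy the entire block below into a Lobster Chinese Library (龙虾中文库) chat, and Lobster AI will complete the installation by following SKILL.md.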

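Please hand this block to Lobster Chinese Library (龙虾中文库, Lobster AI) to execute: install the OpenClaw skill "Gemini Voice Assistant" on this machine. Summary: Voice-to-voice AI assistant using Gemini Live API. Speak to the AI and get spok…
Please fetch the following URL to read SKILL.md and complete the installation as documented: https://raw.githubusercontent.com/openclaw/skills/refs/heads/main/skills/alimostafaradwan/gemini-voice-assistant/SKILL.md
(Source: yingzhi8.cn skill library)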

SKILL.md


---
name: gemini-voice-assistant
description: Voice-to-voice AI assistant using Gemini Live API. Speak to the AI and get spoken responses. Use when you want to have natural voice conversations with an AI assistant powered by Google's Gemini models.
metadata:
  openclaw:
    emoji: "🎙️"
---

# Gemini Voice Assistant

A voice-to-voice AI assistant powered by Google's Gemini Live API. Speak to the AI and it responds with natural-sounding voice.

## Usage

### Text Mode

```bash
cd ~/.openclaw/agents/kashif/skills/gemini-assistant && python3 handler.py "Your question or message"
```

### Voice Mode

```bash
cd ~/.openclaw/agents/kashif/skills/gemini-assistant && python3 handler.py --audio /path/to/audio.ogg "optional context"
```
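
For callers that drive the skill from Python rather than a shell, the two invocations above can be assembled programmatically. This is a minimal sketch that assumes handler.py prints its result to stdout; `build_handler_command` and `run_handler` are hypothetical helper names, not part of the skill.

```python
import subprocess
from pathlib import Path


def build_handler_command(prompt="", audio_path=None):
    """Build the argv list for text mode, or voice mode when audio_path is given."""
    cmd = ["python3", "handler.py"]
    if audio_path is not None:
        cmd += ["--audio", str(audio_path)]
    if prompt:
        cmd.append(prompt)  # in voice mode this is the optional context
    return cmd


def run_handler(skill_dir, prompt="", audio_path=None, timeout=120):
    """Run handler.py inside skill_dir and return its raw stdout."""
    result = subprocess.run(
        build_handler_command(prompt, audio_path),
        cwd=Path(skill_dir).expanduser(),  # mirrors the `cd` in the shell commands
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the prompt contains spaces or quotes.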

## Response Format

The handler returns a JSON response:

```json
{
  "message": "[[audio_as_voice]]\nMEDIA:/tmp/gemini_voice_xxx.ogg",
  "text": "Text response from Gemini"
}
```
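
A consumer of this response typically needs the audio path and the text separately. The sketch below is one way to split them, under the assumption that `message` holds a `[[...]]` marker line followed by a `MEDIA:` line separated by a newline; `parse_handler_response` is a hypothetical helper, not part of the skill.

```python
import json


def parse_handler_response(raw: str) -> dict:
    """Split the handler's JSON reply into text, media path, and marker flags."""
    data = json.loads(raw)
    media_path = None
    flags = []
    for line in data.get("message", "").splitlines():
        line = line.strip()
        if line.startswith("MEDIA:"):
            media_path = line[len("MEDIA:"):]
        elif line.startswith("[[") and line.endswith("]]"):
            flags.append(line[2:-2])  # e.g. "audio_as_voice"
    return {"text": data.get("text", ""), "media_path": media_path, "flags": flags}
```

If the reply carries no `MEDIA:` line, `media_path` stays `None`, so callers can fall back to the `text` field.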

## Configuration

Set your Gemini API key:

```bash
export GEMINI_API_KEY="your-api-key-here"
```

Or create a `.env` file in the skill directory:

```
GEMINI_API_KEY=your-api-key-here
```

## Model Options

The default model is `gemini-2.5-flash-native-audio-preview-12-2025` for audio support.

To use a different model, edit `handler.py`:

```python
MODEL = "gemini-2.0-flash-exp"  # For text-only
```

## Requirements

- `google-genai>=1.0.0`
- `numpy>=1.24.0`
- `soundfile>=0.12.0`
- `librosa>=0.10.0` (for audio input)
- FFmpeg (for audio conversion)
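
Since the skill ships no automated installer, a quick pre-flight check can confirm these requirements are present. This is a minimal sketch with a hypothetical helper name, `missing_requirements`; it checks only that the modules and the `ffmpeg` binary can be found, not the minimum versions listed above.

```python
import importlib.util
import shutil

# Import names for the packages above; google-genai is imported as google.genai.
REQUIRED_MODULES = ["google.genai", "numpy", "soundfile", "librosa"]


def missing_requirements(modules=REQUIRED_MODULES, binaries=("ffmpeg",)):
    """Return the names of modules and executables that cannot be located."""
    missing = []
    for name in modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec raises if a parent package (e.g. "google") is absent.
            found = False
        if not found:
            missing.append(name)
    for binary in binaries:
        if shutil.which(binary) is None:  # not on PATH
            missing.append(binary)
    return missing


if __name__ == "__main__":
    print("Missing:", missing_requirements() or "nothing")
```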

## Features

- 🎙️ Voice input/output support
- 💬 Text conversations
- 🔧 Configurable system instructions
- ⚡ Fast responses with Gemini Flash