Local STT (Nvidia Parakeet + Whisper Support)

Local STT with selectable backends - Parakeet (best accuracy) or Whisper (fastest, multilingual).

Development & DevOps

License: MIT-0

MIT-0 · Free to use, modify, and redistribute. No attribution required.

Version: v1.0.0

Stats: ⭐ 1 · 2.3k · 16 current installs · 16 all-time installs

Package:araa47/local-stt

Security scan (ClawHub)

  • VirusTotal: benign
  • OpenClaw: suspicious

OpenClaw assessment

The skill largely does what it says (local STT), but it reads ~/.env files and uses Matrix credentials without declaring them, and it downloads models at runtime. The registry metadata discloses none of this behavior, which could expose secrets or send transcripts to an external server if the Matrix option is used.

Purpose

The code and SKILL.md align with a local STT tool (ffmpeg conversion, ONNX-based Parakeet/Whisper backends). The ability to post transcriptions to a Matrix room matches the documented --room-id option. However, the registry metadata listed no required environment variables while the script clearly expects MATRIX_HOMESERVER and MATRIX_ACCESS_TOKEN when the Matrix feature is used; that mismatch is noteworthy.

Instruction scope

SKILL.md documents the --room-id option but does not mention that the runtime will: (1) attempt to load environment files from ~/.openclaw/.env and ~/.env, (2) read MATRIX_HOMESERVER and MATRIX_ACCESS_TOKEN from the environment, (3) write logs to /tmp/stt_matrix.log, and (4) load models via onnx_asr which typically pulls model files from network sources (e.g., huggingface). Reading a user's ~/.env is scope-creep because it can surface unrelate…

Install mechanism

There is no install spec (instruction-only), which minimizes installer risk. The script includes a commented dependency list and a nonstandard shebang ('uv run --script') indicating runtime packages will be required; this implies runtime package installation/network activity but no explicit installer URL or archive is used.

Credentials

The skill requests no environment variables in registry metadata, yet the script loads ~/.openclaw/.env and ~/.env and reads MATRIX_HOMESERVER and MATRIX_ACCESS_TOKEN if present. Automatically loading a user's .env and using tokens is disproportionate unless clearly documented; it increases the chance of accidental use of unrelated secrets. The Matrix access token, if present, will be used to transmit transcriptions to the specified homeserver.
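Before running the skill, it may help to check what those two files would expose. The sketch below lists which variable names appear in `~/.openclaw/.env` and `~/.env` without printing any values. The parsing is a simplified assumption about `.env` syntax (one `KEY=value` per line, optional `export` prefix), and `keys_in_env_text` and `audit` are hypothetical helper names, not part of the skill.

```python
from pathlib import Path

# The two files and two variables named in the assessment above.
ENV_FILES = [Path.home() / ".openclaw" / ".env", Path.home() / ".env"]
MATRIX_KEYS = {"MATRIX_HOMESERVER", "MATRIX_ACCESS_TOKEN"}


def keys_in_env_text(text: str) -> list[str]:
    """Return variable names from simple KEY=value lines, in file order."""
    keys = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        keys.append(line.split("=", 1)[0].strip())
    return keys


def audit(paths=ENV_FILES) -> None:
    """Print, per file, which Matrix keys exist and how many other keys do."""
    for path in paths:
        if not path.is_file():
            print(f"{path}: not present")
            continue
        keys = keys_in_env_text(path.read_text(encoding="utf-8", errors="replace"))
        matrix = sorted(MATRIX_KEYS.intersection(keys))
        others = sum(1 for key in keys if key not in MATRIX_KEYS)
        print(f"{path}: Matrix keys {matrix or 'none'}, {others} other variable(s)")


if __name__ == "__main__":
    audit()
```

A non-zero count of other variables is the case the assessment warns about: unrelated secrets become visible to the script's process.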

Persistence

The skill is not always-enabled and does not request elevated platform privileges. It writes a local log file (/tmp/stt_matrix.log) and temporarily writes a converted WAV file before deleting it, which is reasonable for this CLI. It does not modify other skills or agent-wide configuration.

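Install (copy to 龙虾 AI)

Copy the entire block below into a 龙虾中文库 conversation; 龙虾 will complete the installation by following SKILL.md.

Please hand this block to 龙虾中文库 (龙虾 AI) to execute: install the OpenClaw skill "Local STT (Nvidia Parakeet + Whisper Support)" on this machine. Summary: Local STT with selectable backends - Parakeet (best accuracy) or Whisper (faste….
Fetch the following URL to read SKILL.md and complete the installation as documented: https://raw.githubusercontent.com/openclaw/skills/refs/heads/main/skills/araa47/local-stt/SKILL.md
(Source: yingzhi8.cn skill library)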

SKILL.md

---
name: local-stt
description: Local STT with selectable backends - Parakeet (best accuracy) or Whisper (fastest, multilingual).
metadata: {"openclaw":{"emoji":"🎙️","requires":{"bins":["ffmpeg"]}}}
---

# Local STT (Parakeet / Whisper)

Unified local speech-to-text using ONNX Runtime with int8 quantization. Choose your backend:

- **Parakeet** (default): Best accuracy for English, correctly captures names and filler words
- **Whisper**: Fastest inference, supports 99 languages

## Usage

```bash
# Default: Parakeet v2 (best English accuracy)
~/.openclaw/skills/local-stt/scripts/local-stt.py audio.ogg

# Explicit backend selection
~/.openclaw/skills/local-stt/scripts/local-stt.py audio.ogg -b whisper
~/.openclaw/skills/local-stt/scripts/local-stt.py audio.ogg -b parakeet -m v3

# Quiet mode (suppress progress)
~/.openclaw/skills/local-stt/scripts/local-stt.py audio.ogg --quiet
```

## Options

- `-b/--backend`: `parakeet` (default), `whisper`
- `-m/--model`: Model variant (see below)
- `--no-int8`: Disable int8 quantization
- `-q/--quiet`: Suppress progress
- `--room-id`: Matrix room ID to send the transcription to; requires `MATRIX_HOMESERVER` and `MATRIX_ACCESS_TOKEN` in the environment
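
The options above can also be assembled from Python. This is a minimal sketch, not part of the skill: `build_command` and `transcribe` are hypothetical helpers, and `transcribe` assumes the script prints the transcript to stdout, which this document does not state. The `--room-id` option is left out on purpose so that nothing leaves the machine.

```python
import os
import subprocess

# subprocess does not expand "~", so resolve the documented path up front.
SCRIPT = os.path.expanduser("~/.openclaw/skills/local-stt/scripts/local-stt.py")


def build_command(audio_path, backend="parakeet", model=None, int8=True, quiet=True):
    """Build the argv list for local-stt.py from the documented flags."""
    if backend not in ("parakeet", "whisper"):
        raise ValueError(f"unknown backend: {backend!r}")
    cmd = [SCRIPT, audio_path, "-b", backend]
    if model:
        cmd += ["-m", model]
    if not int8:
        cmd.append("--no-int8")
    if quiet:
        cmd.append("--quiet")
    return cmd


def transcribe(audio_path, **options):
    """Run the script and return its stdout (assumed to hold the transcript)."""
    result = subprocess.run(
        build_command(audio_path, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

For example, `transcribe("audio.ogg", backend="whisper", model="tiny")` would run the same command as the `-b whisper` usage line with `-m tiny --quiet` appended.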

## Models

### Parakeet (default backend)
| Model | Description |
|-------|-------------|
| **v2** (default) | English only, best accuracy |
| v3 | Multilingual |

### Whisper
| Model | Description |
|-------|-------------|
| tiny | Fastest, lower accuracy |
| **base** (default) | Good balance |
| small | Better accuracy |
| large-v3-turbo | Best quality, slower |

## Benchmark (24s audio)

| Backend/Model | Time | RTF | Notes |
|---------------|------|-----|-------|
| Whisper Base int8 | 0.43s | 0.018x | Fastest |
| **Parakeet v2 int8** | 0.60s | 0.025x | Best accuracy |
| Parakeet v3 int8 | 0.63s | 0.026x | Multilingual |
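
RTF here reads as the real-time factor: processing time divided by audio duration, so lower is faster. The short sketch below recomputes the column from the times in the table and the 24 s clip length; `real_time_factor` is a hypothetical helper name.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


AUDIO_SECONDS = 24.0  # clip length from the benchmark heading

# Processing times copied from the table above.
TIMINGS = {
    "Whisper Base int8": 0.43,
    "Parakeet v2 int8": 0.60,
    "Parakeet v3 int8": 0.63,
}

for name, seconds in TIMINGS.items():
    print(f"{name}: {real_time_factor(seconds, AUDIO_SECONDS):.3f}x")
```

The results round to 0.018x, 0.025x, and 0.026x, matching the table.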

## openclaw.json

```json
{
  "tools": {
    "media": {
      "audio": {
        "enabled": true,
        "models": [
          {
            "type": "cli",
            "command": "~/.openclaw/skills/local-stt/scripts/local-stt.py",
            "args": ["--quiet", "{{MediaPath}}"],
            "timeoutSeconds": 30
          }
        ]
      }
    }
  }
}
```
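
To make the `{{MediaPath}}` placeholder concrete, the model entry above can be read as a command template. The sketch below shows one way a caller might expand it into an argv list. It illustrates the idea only and is not OpenClaw's actual implementation; `expand_cli_entry` is a hypothetical helper.

```python
import json
import os

# The "cli" model entry from the configuration above.
ENTRY = json.loads("""
{
  "type": "cli",
  "command": "~/.openclaw/skills/local-stt/scripts/local-stt.py",
  "args": ["--quiet", "{{MediaPath}}"],
  "timeoutSeconds": 30
}
""")


def expand_cli_entry(entry: dict, media_path: str) -> list[str]:
    """Substitute the media path into the args and resolve "~" in the command."""
    command = os.path.expanduser(entry["command"])
    args = [arg.replace("{{MediaPath}}", media_path) for arg in entry.get("args", [])]
    return [command, *args]


print(expand_cli_entry(ENTRY, "/tmp/voice.ogg"))
```

Because the assessment notes that models are fetched at runtime, a first invocation may need longer than the 30-second `timeoutSeconds` shown here.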