openclaw 网盘下载
OpenClaw

技能详情(站内镜像,无评论)

首页 > 技能库 > tts

Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch contro...

媒体与内容

许可证:MIT-0

MIT-0 ·免费使用、修改和重新分发。无需归因。

版本:v1.0.0

统计:⭐ 0 · 34 · 0 current installs · 0 all-time installs

0

安装量(当前) 0

🛡 VirusTotal :良性 · OpenClaw :良性

Package:17854566382/tts-1

安全扫描(ClawHub)

  • VirusTotal :良性
  • OpenClaw :良性

OpenClaw 评估

The skill's files, instructions, and install steps are coherent with a TTS wrapper around the node-edge-tts library; required artifacts and behaviors match the description and appear proportionate.

目的

The name/description (Edge TTS wrapper) matches the included scripts, package.json, and documentation. Dependencies (node-edge-tts, commander) are appropriate for a Node.js TTS CLI/skill. There are no unrelated credentials, binaries, or config paths requested that would be out of scope for a TTS skill.

说明范围

SKILL.md instructs either using the platform's built-in tts tool or the provided CLI scripts. The scripts only read/write a per-user config (~/.tts-config.json), create temp audio files under the system temp directory, and send text to the Microsoft Edge online TTS service via node-edge-tts. There are no instructions to read other system files, harvest environment variables, or post data to unexpected endpoints in the scripts. Note: the skill …

安装机制

There is no complex installer in the skill metadata, but an install.sh is included that runs npm install in the scripts directory. Dependencies are pulled from the public npm registry (package-lock.json references known packages). This is a standard, moderate-risk install mechanism for Node.js code and is proportionate to the task.

证书

The skill requests no environment variables or credentials. It does persist a small config file in the user's home directory and writes temporary audio files; both are reasonable for a TTS tool. The proxy option (configurable) can reroute requests — expected for networking tools but means network traffic can be directed via a proxy if configured. Users should be aware that supplied text is sent over the network to the TTS backend.

持久

The skill does not require 'always' presence and does not modify other skills or system-wide agent settings. It persists only a per-user config file (~/.tts-config.json) and writes output to temp or specified output paths. Those are normal privileges for a CLI-based skill.

综合结论

This package appears to be a straightforward wrapper around node-edge-tts. Before installing, consider: (1) npm install will fetch packages from the public registry—review package-lock.json if you audit dependencies; (2) generated audio and a small config file (~/.tts-config.json) are written to disk; (3) user text is sent to an external TTS service (Microsoft Edge online service) — do not send sensitive secrets or private data through it; (4)…

安装(复制给龙虾 AI)

将下方整段复制到龙虾中文库对话中,由龙虾按 SKILL.md 完成安装。

请把本段交给龙虾中文库(龙虾 AI)执行:为本机安装 OpenClaw 技能「tts」。简介:Text-to-speech conversion using node-edge-tts npm package for generating audio …。
请 fetch 以下地址读取 SKILL.md 并按文档完成安装:https://raw.githubusercontent.com/openclaw/skills/refs/heads/main/skills/17854566382/tts-1/SKILL.md
(来源:yingzhi8.cn 技能库)

SKILL.md

打开原始 SKILL.md(GitHub raw)

---
name: edge-tts
description: |
  Text-to-speech conversion using node-edge-tts npm package for generating audio from text.
  Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation.
  Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
---

# Edge-TTS Skill

## Overview

Generate high-quality text-to-speech audio using Microsoft Edge's neural TTS service via the node-edge-tts npm package. Supports multiple languages, voices, adjustable speed/pitch, and subtitle generation.

## Quick Start

When you detect TTS intent from triggers or user request:

1. **Call the tts tool** (Clawdbot built-in) to convert text to speech
2. The tool returns a MEDIA: path
3. Clawdbot routes the audio to the current channel

```javascript
// Example: Built-in tts tool usage
tts("Your text to convert to speech")
// Returns: MEDIA: /path/to/audio.mp3
```

## Trigger Detection

Recognize "tts" keyword as TTS requests. The skill automatically filters out TTS-related keywords from text before conversion to avoid converting the trigger words themselves to audio.

## Advanced Customization

### Using the Node.js Scripts

For more control, use the bundled scripts directly:

#### TTS Converter
```bash
cd scripts
npm install
node tts-converter.js "Your text" --voice en-US-AriaNeural --rate +10% --output output.mp3
```

**Options:**
- `--voice, -v`: Voice name (default: en-US-AriaNeural)
- `--lang, -l`: Language code (e.g., en-US, es-ES)
- `--format, -o`: Output format (default: audio-24khz-48kbitrate-mono-mp3)
- `--pitch`: Pitch adjustment (e.g., +10%, -20%, default)
- `--rate, -r`: Rate adjustment (e.g., +10%, -20%, default)
- `--volume`: Volume adjustment (e.g., +0%, -10%, default)
- `--save-subtitles, -s`: Save subtitles as JSON file
- `--output, -f`: Output file path (default: tts_output.mp3)
- `--proxy, -p`: Proxy URL (e.g., http://localhost:7890)
- `--timeout`: Request timeout in milliseconds (default: 10000)
- `--list-voices, -L`: List available voices

#### Configuration Manager
```bash
cd scripts
npm install
node config-manager.js --set-voice en-US-AriaNeural

node config-manager.js --set-rate +10%

node config-manager.js --get

node config-manager.js --reset
```

### Voice Selection

Common voices (use `--list-voices` for full list):

**English:**
- `en-US-MichelleNeural` (female, natural, **default**)
- `en-US-AriaNeural` (female, natural)
- `en-US-GuyNeural` (male, natural)
- `en-GB-SoniaNeural` (female, British)
- `en-GB-RyanNeural` (male, British)

**Other Languages:**
- `es-ES-ElviraNeural` (Spanish, Spain)
- `fr-FR-DeniseNeural` (French)
- `de-DE-KatjaNeural` (German)
- `ja-JP-NanamiNeural` (Japanese)
- `zh-CN-XiaoxiaoNeural` (Chinese)
- `ar-SA-ZariyahNeural` (Arabic)

### Rate Guidelines

Rate values use percentage format:
- `"default"`: Normal speed
- `"-20%"` to `"-10%"`: Slow, clear (tutorials, stories, accessibility)
- `"+10%"` to `"+20%"`: Slightly fast (summaries)
- `"+30%"` to `"+50%"`: Fast (news, efficiency)

### Output Formats

Choose audio quality based on use case:
- `audio-24khz-48kbitrate-mono-mp3`: Standard quality (voice notes, messages)
- `audio-24khz-96kbitrate-mono-mp3`: High quality (presentations, content)
- `audio-48khz-96kbitrate-stereo-mp3`: Highest quality (professional audio, music)

## Resources

### scripts/tts-converter.js
Main TTS conversion script using node-edge-tts. Generates audio files with customizable voice, rate, volume, pitch, and format. Supports subtitle generation and voice listing.

### scripts/config-manager.js
Manages persistent user preferences for TTS settings (voice, language, format, pitch, rate, volume). Stores config in `~/.tts-config.json`.

### scripts/package.json
NPM package configuration with node-edge-tts dependency.

### references/node_edge_tts_guide.md
Complete documentation for node-edge-tts npm package including:
- Full voice list by language
- Prosody options (rate, pitch, volume)
- Usage examples (CLI and Module)
- Subtitle generation
- Output formats
- Best practices and limitations

### Voice Testing
Test different voices and preview audio quality at: https://tts.travisvn.com/

Refer to this when you need specific voice details or advanced features.

## Installation

To use the bundled scripts:

```bash
cd /home/user/clawd/skills/public/tts-skill/scripts
npm install
```

This installs:
- `node-edge-tts` - TTS library
- `commander` - CLI argument parsing

## Workflow

1. **Detect intent**: Check for "tts" trigger or keyword in user message
2. **Choose method**: Use built-in `tts` tool for simple requests, or `scripts/tts-converter.js` for customization
3. **Generate audio**: Convert the target text (message, search results, summary)
4. **Return to user**: The tts tool returns a MEDIA: path; Clawdbot handles delivery

## Testing

### Basic Test
Run the test script to verify TTS functionality:
```bash
cd /home/user/clawd/skills/public/edge-tts/scripts
npm test
```
This generates a test audio file and verifies the TTS service is working.

### Voice Testing
Test different voices and preview audio quality at: https://tts.travisvn.com/

### Integration Test
Use the built-in `tts` tool for quick testing:
```javascript
// Example: Test TTS with default settings
tts("This is a test of the TTS functionality.")
```

### Configuration Test
Verify configuration persistence:
```bash
cd /home/user/clawd/skills/public/edge-tts/scripts
node config-manager.js --get
node config-manager.js --set-voice en-US-GuyNeural
node config-manager.js --get
```

## Troubleshooting

- **Test connectivity**: Run `npm test` to check if TTS service is accessible
- **Check voice availability**: Use `node tts-converter.js --list-voices` to see available voices
- **Verify proxy settings**: If using proxy, test with `node tts-converter.js "test" --proxy http://localhost:7890`
- **Check audio output**: The test should generate `test-output.mp3` in the scripts directory

## Notes

- node-edge-tts uses Microsoft Edge's online TTS service (updated, working authentication)
- No API key needed (free service)
- Output is MP3 format by default
- Requires internet connection
- Supports subtitle generation (JSON format with word-level timing)
- **Temporary File Handling**: By default, audio files are saved to the system's temporary directory (`/tmp/edge-tts-temp/` on Unix, `C:Users<user>AppDataLocalTempedge-tts-temp` on Windows) with unique filenames (e.g., `tts_1234567890_abc123.mp3`). Files are not automatically deleted - the calling application (Clawdbot) should handle cleanup after use. You can specify a custom output path with the `--output` option if permanent storage is needed.
- **TTS keyword filtering**: The skill automatically filters out TTS-related keywords (tts, TTS, text-to-speech) from text before conversion to avoid converting the trigger words themselves to audio
- For repeated preferences, use `config-manager.js` to set defaults
- **Default voice**: `en-US-MichelleNeural` (female, natural)
- Neural voices (ending in `Neural`) provide higher quality than Standard voices