Nano Banana — 技能 — openclaw中文资讯站

技能详情（站内镜像，无评论）

Reasoning-driven image generation using structured creative briefs (Gemini 3 style) — generates high-fidelity images via muapi.ai with logic-based prompting

媒体与内容

作者：Anil Chandra Naidu Matcha @anil-matcha

许可证：MIT-0

MIT-0 ·免费使用、修改和重新分发。无需归因。

版本：v0.1.0

统计：⭐ 0 · 150 · 3 current installs · 3 all-time installs

⭐ 0

安装量（当前） 3

🛡 VirusTotal ：可疑 · OpenClaw ：可疑

Package：anil-matcha/muapi-nano-banana-skill

安全扫描（ClawHub）

VirusTotal ：可疑
OpenClaw ：可疑

OpenClaw 评估

The skill's files and instructions mostly match an image-prompting wrapper, but there are inconsistencies (mentions muapi.ai and a proprietary model yet requires no credentials and calls a core script outside the skill), so review the platform primitive it invokes before installing.

目的

The skill claims to generate images via muapi.ai and a 'nano-banana-pro' model, but the shipped script does not call any network endpoint or muapi-specific client; instead it delegates to a relative path core/media/generate-image.sh. This could be legitimate if the platform provides the muapi adapter and credentials, but the README and manifest do not explain that dependency.

说明范围

SKILL.md confines the agent to rewrite prompts into a reasoning brief and run the provided script. The script itself only constructs a prompt and invokes a single core primitive script; it does not read arbitrary files or environment variables. The only notable action is executing a script located outside the skill directory (../../../../core/media/generate-image.sh).

安装机制

No install spec (instruction-only + one helper script). No downloads or package installs are performed by the skill itself, so there is low install-time risk from the skill bundle.

证书

The skill advertises use of muapi.ai but declares no required environment variables or API keys. That mismatch could be benign if the host platform supplies API keys/adapter via the core primitive, but it is unexplained in the SKILL.md and manifest—worth verifying where muapi credentials live and which component performs network calls.

持久

The skill does not request always: true and is user-invocable only. It does not attempt to modify other skills or system settings in the provided files.

安装（复制给龙虾 AI）

将下方整段复制到龙虾中文库对话中，由龙虾按 SKILL.md 完成安装。

请把本段交给龙虾中文库（龙虾 AI）执行：为本机安装 OpenClaw 技能「Nano Banana」。简介：Reasoning-driven image generation using structured creative briefs (Gemini 3 st…。
请 fetch 以下地址读取 SKILL.md 并按文档完成安装：https://raw.githubusercontent.com/openclaw/skills/refs/heads/main/skills/anil-matcha/muapi-nano-banana-skill/SKILL.md
（来源：yingzhi8.cn 技能库）

SKILL.md

打开原始 SKILL.md（GitHub raw）

---
slug: muapi-nano-banana-skill
name: muapi-nano-banana
version: "0.1.0"
description: Reasoning-driven image generation using structured creative briefs (Gemini 3 style) — generates high-fidelity images via muapi.ai with logic-based prompting
acceptLicenseTerms: true
---

# 🍌 Nano-Banana Expert Skill (Gemini 3 Style)

**A specialized skill for AI Agents to leverage "Reasoning-Driven" image generation.**
Based on the advanced prompting architecture of Google's Gemini 3 (Nano Banana Pro), this skill moves beyond keyword stuffing to structured, logic-based creative briefs.

## Core Competencies

1. **Reasoning-Driven Prompting**: Using natural language logic to define physics, lighting, and spatial relationships.
2. **Structured Creative Briefs**: Implementing the "Perfect Prompt" formula: `Subject + Action + Context + Composition + Lighting`.
3. **Text Rendering Precision**: Explicitly defining typography and signifiers for legible text integration.
4. **Contextual Grounding**: Using "Search Grounding" logic (simulated) to anchor generations in real-world accuracy.

---

## 🏗️ Technical Specification

### 1. The "Perfect Prompt" Formula

| Component | Description | Example |
| :--- | :--- | :--- |
| **Subject** | Detailed entity description | "A stoic robot barista with exposed copper wiring" |
| **Action** | Dynamic interaction | "Pouring a latte art leaf with mechanical precision" |
| **Context** | Environment & Atmosphere | "Inside a neon-lit cyberpunk cafe at midnight" |
| **Composition** | Camera & Lens choice | "Close-up, 85mm lens, f/1.8 aperture" |
| **Lighting** | Mood & Direction | "Volumetric blue rim light, warm cafe glow" |
| **Style** | Aesthetic anchor | "Cinematic, photorealistic, 4K production value" |

### 2. Advanced Features
- **Negative Constraint Logic**: Instead of "no blurry," use "Ensure sharp focus on the subject's eyes."
- **Identity Consistency**: (Simulated) "Maintain consistent facial structure across variations."
- **Text Integration**: Use double quotes for specific text: `The sign reads "OPEN 24/7"`.

---

## 🧠 Prompt Optimization Protocol (Agent Instruction)

**Before calling the script, the Agent MUST rewrite the user's prompt into a logic-driven Reasoning Brief:**

1. **NO KEYWORD SOUP**: Remove "8k, masterpiece, ultra-detailed." Use full, descriptive sentences.
2. **PHYSICAL CONSISTENCY**: Describe how elements interact (e.g., "The light from the crystal shards casts caustic patterns across the obsidian floor").
3. **TEXT PRECISION**: If the user wants text, define it precisely: `featuring a sign that says "STORE NAME" in a weathered serif font`.
4. **OPTICAL DIRECTIVES**: Specify lens behavior: *Shallow Depth of Field (f/1.8)*, *Macro Lens*, *Anamorphic Flare*.

---

## 🚀 Protocol: Using Nano-Banana

### Step 1: Define the Creative Logic
Provide the agent with a subject and a specific scenario.

### Step 2: Invoke the Script
The `generate-nano-art.sh` script translates the logic into a structured Gemini 3-style prompt.

```bash
# Generating a reasoning-driven image
bash scripts/generate-nano-art.sh 
  --subject "a glass chess piece" 
  --action "shattering into liquid shards" 
  --context "on a obsidian table" 
  --style "macro photography"
```

---

## ⚠️ Constraints & Guardrails

- **No Keyword Soup**: **MANDATORY** - Do not use "trending on artstation, masterpiece, 8k". Use natural language descriptions.
- **Physics Logic**: Ensure the prompt describes *physically possible* lighting and reflection interactions.
- **Full Sentences**: The model parses relationships; use "light reflecting off the water" instead of "water, reflection".

---

## ⚙️ Implementation Details
This skill applies a "Logic Wrapper" around the `core/media/generate-image.sh` primitive, converting fragmented inputs into a coherent, reasoning-ready narrative prompt.