Image To Data | Skill | openclaw Chinese news site

Skill details (on-site mirror, no comments)

Extract data from construction images using AI Vision. Analyze site photos, scanned documents, drawings.

Media & Content

License: MIT-0 (free to use, modify, and redistribute; no attribution required)

Version: v2.0.0

Stats: ⭐ 0 · 1.4k · 4 current installs · 4 all-time installs

Package：datadrivenconstruction/image-to-data

Security scan (ClawHub)

VirusTotal: suspicious
OpenClaw: suspicious

OpenClaw assessment

The skill's stated purpose (image-to-data) is plausible, but its runtime instructions expect API keys from environment variables while the package metadata declares no required env vars — this mismatch and the declared filesystem/network permissions are disproportionate and worth caution.

Purpose

The skill claims to extract structured data from construction images (OCR, object detection, measurements) which coheres with requiring filesystem and network access (for local images + vision APIs). However, the manifest (requires.env: none) does not declare any API keys even though the instructions explicitly say 'All API keys loaded from environment variables' and mention calling Claude/OpenAI Vision. That omission is an inconsistency: eith…

Instruction scope

SKILL.md and instructions.md direct the agent to read arbitrary image file paths, perform OCR/detection, and call external AI Vision APIs. The docs are vague about which env vars/endpoints to use and do not constrain filesystem paths. The agent is therefore instructed to access local files and make network calls, and could access environment variables broadly because no specific keys are declared.
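
One way to address the unconstrained file paths noted above is to resolve every requested path against a single allowed directory before reading it. The following is a minimal sketch, not part of the skill: `ALLOWED_ROOT`, `ALLOWED_SUFFIXES`, and `resolve_image_path` are hypothetical names, and the skill's documentation defines no such directory.

```python
from pathlib import Path

# Hypothetical allow-list; the skill's docs do not define an image root.
ALLOWED_ROOT = Path("./project_images")
ALLOWED_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def resolve_image_path(user_path: str, root: Path = ALLOWED_ROOT) -> Path:
    """Return the resolved path if it lies inside `root` and looks like an image."""
    root_resolved = root.resolve()
    # Joining with an absolute path discards the root, so the check below
    # also rejects absolute paths that point elsewhere.
    candidate = (root_resolved / user_path).resolve()
    try:
        candidate.relative_to(root_resolved)
    except ValueError:
        raise PermissionError(f"{user_path!r} is outside {root_resolved}") from None
    if candidate.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"Unsupported image type: {candidate.suffix!r}")
    return candidate
```

With this in place, a request such as `../secrets/key.png` raises `PermissionError` instead of being read.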

Install mechanism

This is an instruction-only skill with no install spec and no code files to be downloaded or executed at install time, which minimizes install-time risk.

Credentials

The skill requires network and filesystem permissions (declared in claw.json) but declares no required environment variables. The instructions nevertheless expect API keys from env vars (e.g., Claude/OpenAI Vision). That mismatch is disproportionate: it is unclear which specific secrets are needed and the skill could try to use any env var present. Network + filesystem access combined with unspecified secret usage increases the risk of acciden…
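
The mismatch described above could be narrowed by naming the expected variables up front and reading only those. This is a hedged sketch: the variable names in `DECLARED_ENV_VARS` are assumptions chosen for illustration, since the skill's manifest declares none, and `load_declared_keys` is a hypothetical helper.

```python
import os
from typing import Dict, Mapping, Optional, Sequence

# Assumed names for illustration; the manifest (requires.env: none) lists no keys.
DECLARED_ENV_VARS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY")


def load_declared_keys(
    declared: Sequence[str] = DECLARED_ENV_VARS,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Read only the declared variables and fail loudly if none is set."""
    source = os.environ if env is None else env
    # Any other variable in the environment is never touched.
    found = {name: source[name] for name in declared if source.get(name)}
    if not found:
        raise RuntimeError(
            "None of the declared API keys is set: " + ", ".join(declared)
        )
    return found
```

Passing `env` explicitly keeps the function testable and makes it obvious which secrets the code can see.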

Persistence

always is false and there is no install step that modifies other skills or system-wide configuration. The skill does not request permanent/autonomous elevation beyond normal agent invocation.

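Install (copy to Lobster AI)

Copy the entire block below into a Lobster Chinese Library conversation; Lobster will then complete the installation according to SKILL.md.

Please hand this paragraph to the Lobster Chinese Library (Lobster AI) to execute: install the OpenClaw skill "Image To Data" on this machine. Summary: Extract data from construction images using AI Vision. Analyze site photos, sca….
Please fetch the following address to read SKILL.md and complete the installation as documented: https://raw.githubusercontent.com/openclaw/skills/refs/heads/main/skills/datadrivenconstruction/image-to-data/SKILL.md
(Source: yingzhi8.cn skill library)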

SKILL.md

---
name: "image-to-data"
description: "Extract data from construction images using AI Vision. Analyze site photos, scanned documents, drawings."
---

# Image To Data

## Overview

Based on DDC methodology (Chapter 2.4), this skill extracts structured data from construction images using computer vision, OCR, and AI models to analyze site photos, scanned documents, and drawings.

**Book Reference:** "Преобразование данных в структурированную форму" / "Data Transformation to Structured Form"

## Quick Start

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Dict, Optional, Any, Tuple
from datetime import datetime
import json
import base64

class ImageType(Enum):
    """Types of construction images"""
    SITE_PHOTO = "site_photo"
    SCANNED_DOCUMENT = "scanned_document"
    FLOOR_PLAN = "floor_plan"
    ELEVATION = "elevation"
    DETAIL_DRAWING = "detail_drawing"
    PROGRESS_PHOTO = "progress_photo"
    SAFETY_PHOTO = "safety_photo"
    DEFECT_PHOTO = "defect_photo"
    MATERIAL_PHOTO = "material_photo"
    EQUIPMENT_PHOTO = "equipment_photo"

class ExtractionType(Enum):
    """Types of data extraction"""
    OCR_TEXT = "ocr_text"
    TABLE = "table"
    OBJECT_DETECTION = "object_detection"
    MEASUREMENT = "measurement"
    CLASSIFICATION = "classification"
    PROGRESS = "progress"

@dataclass
class BoundingBox:
    """Bounding box for detected region"""
    x: int
    y: int
    width: int
    height: int
    confidence: float = 1.0

@dataclass
class TextRegion:
    """Extracted text region from image"""
    text: str
    bbox: BoundingBox
    confidence: float
    language: str = "en"

@dataclass
class DetectedObject:
    """Detected object in image"""
    label: str
    bbox: BoundingBox
    confidence: float
    attributes: Dict[str, Any] = field(default_factory=dict)

@dataclass
class ExtractedTable:
    """Extracted table from image"""
    headers: List[str]
    rows: List[List[str]]
    bbox: BoundingBox
    confidence: float

@dataclass
class ProgressMeasurement:
    """Progress measurement from image"""
    element_type: str
    total_count: int
    completed_count: int
    percent_complete: float
    area_sqft: Optional[float] = None
    volume_cuft: Optional[float] = None

@dataclass
class ImageAnalysisResult:
    """Complete image analysis result"""
    image_id: str
    image_type: ImageType
    text_regions: List[TextRegion]
    detected_objects: List[DetectedObject]
    tables: List[ExtractedTable]
    progress: Optional[ProgressMeasurement] = None
    metadata: Dict[str, Any] = field(default_factory=dict)
    processing_time: float = 0.0


class OCREngine:
    """OCR engine for text extraction"""

    def __init__(self, engine: str = "tesseract"):
        self.engine = engine
        self.supported_languages = ["en", "ru", "de", "fr", "es"]

    def extract_text(
        self,
        image_data: bytes,
        language: str = "en"
    ) -> List[TextRegion]:
        """Extract text from image"""
        # Simulated OCR extraction (use actual OCR library in production)
        # In production: pytesseract, EasyOCR, or cloud OCR services

        regions = []

        # Simulate detecting title block in drawing
        regions.append(TextRegion(
            text="PROJECT: OFFICE BUILDING",
            bbox=BoundingBox(x=100, y=50, width=300, height=30, confidence=0.95),
            confidence=0.95,
            language=language
        ))

        regions.append(TextRegion(
            text="DRAWING: A-101",
            bbox=BoundingBox(x=100, y=90, width=200, height=25, confidence=0.92),
            confidence=0.92,
            language=language
        ))

        regions.append(TextRegion(
            text="SCALE: 1:100",
            bbox=BoundingBox(x=100, y=120, width=150, height=20, confidence=0.88),
            confidence=0.88,
            language=language
        ))

        return regions

    def extract_structured_text(
        self,
        image_data: bytes,
        template: Optional[Dict] = None
    ) -> Dict[str, str]:
        """Extract structured text using template matching"""
        # Extract text regions
        regions = self.extract_text(image_data)

        # Match to template fields
        structured = {}

        if template:
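            for field_name, field_config in template.items():
                # Skip fields without a keyword; compare case-insensitively
                keyword = str(field_config.get("keyword") or "").lower()
                if not keyword:
                    continue
                # Find matching region
                for region in regions:
                    if keyword in region.text.lower():
                        structured[field_name] = region.text
                        break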
        else:
            # Default extraction
            for region in regions:
                # Split on the first colon only, so "SCALE: 1:100" yields "1:100"
                value = region.text.split(":", 1)[-1].strip()
                if "PROJECT:" in region.text:
                    structured["project_name"] = value
                elif "DRAWING:" in region.text:
                    structured["drawing_number"] = value
                elif "SCALE:" in region.text:
                    structured["scale"] = value

        return structured


class ObjectDetector:
    """Object detection for construction images"""

    def __init__(self, model: str = "yolov8"):
        self.model = model
        self.construction_classes = self._load_construction_classes()

    def _load_construction_classes(self) -> Dict[str, Dict]:
        """Load construction-specific object classes"""
        return {
            # Equipment
            "excavator": {"category": "equipment", "safety_zone": 20},
            "crane": {"category": "equipment", "safety_zone": 30},
            "forklift": {"category": "equipment", "safety_zone": 10},
            "concrete_mixer": {"category": "equipment", "safety_zone": 5},
            "scaffolding": {"category": "equipment", "safety_zone": 5},

            # Safety
            "hard_hat": {"category": "ppe", "required": True},
            "safety_vest": {"category": "ppe", "required": True},
            "safety_glasses": {"category": "ppe", "required": False},
            "harness": {"category": "ppe", "required": False},

            # Materials
            "rebar_bundle": {"category": "material", "unit": "bundle"},
            "concrete_block": {"category": "material", "unit": "pallet"},
            "lumber_stack": {"category": "material", "unit": "bundle"},
            "pipe_stack": {"category": "material", "unit": "bundle"},

            # Workers
            "worker": {"category": "person", "track": True},

            # Building elements
            "column": {"category": "structure"},
            "beam": {"category": "structure"},
            "slab": {"category": "structure"},
            "wall": {"category": "structure"},
        }

    def detect(
        self,
        image_data: bytes,
        confidence_threshold: float = 0.5
    ) -> List[DetectedObject]:
        """Detect objects in image"""
        # Simulated detection (use actual model in production)
        # In production: YOLO, Faster R-CNN, etc.

        detected = []

        # Simulate detected objects
        sample_detections = [
            ("worker", 0.92, BoundingBox(200, 300, 80, 180, 0.92)),
            ("hard_hat", 0.88, BoundingBox(210, 300, 30, 25, 0.88)),
            ("safety_vest", 0.85, BoundingBox(210, 340, 60, 80, 0.85)),
            ("scaffolding", 0.78, BoundingBox(400, 100, 200, 400, 0.78)),
            ("concrete_block", 0.72, BoundingBox(50, 450, 100, 50, 0.72)),
        ]

        for label, conf, bbox in sample_detections:
            if conf >= confidence_threshold:
                class_info = self.construction_classes.get(label, {})
                detected.append(DetectedObject(
                    label=label,
                    bbox=bbox,
                    confidence=conf,
                    attributes=class_info
                ))

        return detected

    def detect_safety_compliance(
        self,
        image_data: bytes
    ) -> Dict:
        """Detect safety compliance in image"""
        objects = self.detect(image_data)

        workers = [o for o in objects if o.label == "worker"]
        hard_hats = [o for o in objects if o.label == "hard_hat"]
        vests = [o for o in objects if o.label == "safety_vest"]

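        # Ratios are capped at 1.0 so spare PPE cannot report more than 100%
        hat_ratio = min(1.0, len(hard_hats) / len(workers)) if workers else 1.0
        vest_ratio = min(1.0, len(vests) / len(workers)) if workers else 1.0

        compliance = {
            "workers_detected": len(workers),
            "hard_hats_detected": len(hard_hats),
            "vests_detected": len(vests),
            "hard_hat_compliance": hat_ratio,
            "vest_compliance": vest_ratio,
            # Hard hats and vests are both marked required in the class table
            "overall_compliance": "compliant" if hat_ratio >= 1.0 and vest_ratio >= 1.0 else "non-compliant",
            "violations": []
        }

        if len(hard_hats) < len(workers):
            compliance["violations"].append({
                "type": "missing_hard_hat",
                "count": len(workers) - len(hard_hats)
            })

        if len(vests) < len(workers):
            compliance["violations"].append({
                "type": "missing_safety_vest",
                "count": len(workers) - len(vests)
            })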

        return compliance


class TableExtractor:
    """Extract tables from images"""

    def extract_tables(
        self,
        image_data: bytes,
        detect_headers: bool = True
    ) -> List[ExtractedTable]:
        """Extract tables from image"""
        # Simulated table extraction
        # In production: Camelot, Tabula, or custom CNN

        tables = []

        # Simulate a schedule table
        tables.append(ExtractedTable(
            headers=["Activity", "Start", "End", "Duration"],
            rows=[
                ["Foundation", "2024-01-01", "2024-01-15", "15 days"],
                ["Framing", "2024-01-16", "2024-02-28", "44 days"],
                ["MEP Rough-in", "2024-03-01", "2024-03-31", "31 days"]
            ],
            bbox=BoundingBox(50, 200, 500, 200, 0.85),
            confidence=0.85
        ))

        return tables

    def table_to_dataframe(self, table: ExtractedTable) -> Dict:
        """Convert table to dictionary (DataFrame-like)"""
        return {
            "columns": table.headers,
            "data": table.rows,
            "records": [
                dict(zip(table.headers, row))
                for row in table.rows
            ]
        }


class ProgressAnalyzer:
    """Analyze construction progress from images"""

    def __init__(self):
        self.reference_models = {}

    def analyze_progress(
        self,
        current_image: bytes,
        reference_image: Optional[bytes] = None,
        element_type: str = "general"
    ) -> ProgressMeasurement:
        """Analyze progress by comparing images"""
        # Simulated progress analysis
        # In production: Use semantic segmentation + comparison

        # Simulate progress detection
        return ProgressMeasurement(
            element_type=element_type,
            total_count=100,
            completed_count=65,
            percent_complete=65.0,
            area_sqft=15000.0,
            volume_cuft=None
        )

    def compare_with_plan(
        self,
        site_photo: bytes,
        plan_image: bytes
    ) -> Dict:
        """Compare site photo with plan"""
        return {
            "match_score": 0.78,
            "deviations": [],
            "completion_estimate": 65.0,
            "areas_of_concern": []
        }


class ConstructionImageAnalyzer:
    """
    Main class for construction image analysis.
    Based on DDC methodology Chapter 2.4.
    """

    def __init__(self):
        self.ocr = OCREngine()
        self.detector = ObjectDetector()
        self.table_extractor = TableExtractor()
        self.progress_analyzer = ProgressAnalyzer()

    def analyze_image(
        self,
        image_data: bytes,
        image_type: ImageType,
        image_id: str = "img_001",
        extract_types: Optional[List[ExtractionType]] = None
    ) -> ImageAnalysisResult:
        """
        Analyze a construction image.

        Args:
            image_data: Image data as bytes
            image_type: Type of image
            image_id: Unique image identifier
            extract_types: Types of extraction to perform

        Returns:
            Complete analysis result
        """
        start_time = datetime.now()

        if extract_types is None:
            extract_types = [ExtractionType.OCR_TEXT, ExtractionType.OBJECT_DETECTION]

        text_regions = []
        detected_objects = []
        tables = []
        progress = None

        # OCR extraction
        if ExtractionType.OCR_TEXT in extract_types:
            text_regions = self.ocr.extract_text(image_data)

        # Object detection
        if ExtractionType.OBJECT_DETECTION in extract_types:
            detected_objects = self.detector.detect(image_data)

        # Table extraction
        if ExtractionType.TABLE in extract_types:
            tables = self.table_extractor.extract_tables(image_data)

        # Progress analysis
        if ExtractionType.PROGRESS in extract_types:
            progress = self.progress_analyzer.analyze_progress(image_data)

        processing_time = (datetime.now() - start_time).total_seconds()

        return ImageAnalysisResult(
            image_id=image_id,
            image_type=image_type,
            text_regions=text_regions,
            detected_objects=detected_objects,
            tables=tables,
            progress=progress,
            metadata={"extraction_types": [e.value for e in extract_types]},
            processing_time=processing_time
        )

    def analyze_site_photo(
        self,
        image_data: bytes,
        image_id: str = "site_001"
    ) -> Dict:
        """Analyze site photo for progress and safety"""
        result = self.analyze_image(
            image_data,
            ImageType.SITE_PHOTO,
            image_id,
            [ExtractionType.OBJECT_DETECTION, ExtractionType.PROGRESS]
        )

        safety = self.detector.detect_safety_compliance(image_data)

        return {
            "image_id": result.image_id,
            "objects_detected": len(result.detected_objects),
            "progress": result.progress,
            "safety_compliance": safety,
            "equipment": [o.label for o in result.detected_objects if o.attributes.get("category") == "equipment"],
            "materials": [o.label for o in result.detected_objects if o.attributes.get("category") == "material"]
        }

    def extract_drawing_data(
        self,
        image_data: bytes,
        image_id: str = "dwg_001"
    ) -> Dict:
        """Extract data from scanned drawing"""
        result = self.analyze_image(
            image_data,
            ImageType.FLOOR_PLAN,
            image_id,
            [ExtractionType.OCR_TEXT, ExtractionType.TABLE]
        )

        # Extract title block info
        title_block = self.ocr.extract_structured_text(image_data)

        return {
            "image_id": result.image_id,
            "title_block": title_block,
            "text_regions": len(result.text_regions),
            "tables": [
                self.table_extractor.table_to_dataframe(t)
                for t in result.tables
            ],
            "all_text": [r.text for r in result.text_regions]
        }

    def batch_analyze(
        self,
        images: List[Tuple[bytes, ImageType, str]]
    ) -> List[ImageAnalysisResult]:
        """Analyze multiple images"""
        results = []
        for image_data, image_type, image_id in images:
            result = self.analyze_image(image_data, image_type, image_id)
            results.append(result)
        return results

    def export_results(
        self,
        result: ImageAnalysisResult,
        format: str = "json"
    ) -> str:
        """Export analysis results"""
        data = {
            "image_id": result.image_id,
            "image_type": result.image_type.value,
            "text_count": len(result.text_regions),
            "object_count": len(result.detected_objects),
            "table_count": len(result.tables),
            "texts": [
                {"text": r.text, "confidence": r.confidence}
                for r in result.text_regions
            ],
            "objects": [
                {"label": o.label, "confidence": o.confidence}
                for o in result.detected_objects
            ],
            "processing_time": result.processing_time
        }

        if format == "json":
            return json.dumps(data, indent=2)
        else:
            raise ValueError(f"Unsupported format: {format}")
```
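
The Quick Start imports `base64` but never uses it, and the simulated engines ignore the image bytes they receive. If the bytes were instead sent to a hosted vision model, they would typically need a text-safe encoding first. The sketch below shows one generic option, a base64 data URL as defined in RFC 2397. `image_to_data_url` is a hypothetical helper, and the request format any particular provider expects is not specified in this document.

```python
import base64
import mimetypes


def image_to_data_url(image_data: bytes, filename: str) -> str:
    """Encode raw image bytes as a base64 data URL (RFC 2397)."""
    # The MIME type is guessed from the file extension only.
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot infer an image MIME type from {filename!r}")
    encoded = base64.b64encode(image_data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

Decoding the part after the first comma with `base64.b64decode` returns the original bytes, which makes the encoding easy to verify before any network call.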

## Common Use Cases

### Analyze Site Photo

```python
analyzer = ConstructionImageAnalyzer()

# Load image (in production, read from file)
with open("site_photo.jpg", "rb") as f:
    image_data = f.read()

result = analyzer.analyze_site_photo(image_data)

print(f"Objects detected: {result['objects_detected']}")
print(f"Safety compliance: {result['safety_compliance']['overall_compliance']}")
print(f"Progress: {result['progress'].percent_complete}%")
```
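
Note that `analyze_site_photo` returns a plain dict whose `progress` value is a dataclass instance, so passing the result straight to `json.dumps` would raise `TypeError`. A minimal sketch of one workaround follows. `to_jsonable` is a hypothetical helper, and `ProgressMeasurement` is repeated here in trimmed form only to keep the example self-contained.

```python
import json
from dataclasses import asdict, dataclass, is_dataclass
from enum import Enum
from typing import Any, Optional


@dataclass
class ProgressMeasurement:  # trimmed copy of the Quick Start dataclass
    element_type: str
    total_count: int
    completed_count: int
    percent_complete: float
    area_sqft: Optional[float] = None


def to_jsonable(value: Any) -> Any:
    """Fallback for json.dumps: convert dataclasses and enums to plain values."""
    if is_dataclass(value) and not isinstance(value, type):
        return asdict(value)
    if isinstance(value, Enum):
        return value.value
    raise TypeError(f"Not JSON serializable: {type(value).__name__}")


# Shape mirrors part of the dict returned by analyze_site_photo.
site_result = {
    "image_id": "site_001",
    "progress": ProgressMeasurement("general", 100, 65, 65.0, 15000.0),
}
print(json.dumps(site_result, default=to_jsonable, indent=2))
```

`json.dumps` calls the `default` hook only for objects it cannot serialize itself, so ordinary strings, numbers, and lists pass through unchanged.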

### Extract Drawing Data

```python
analyzer = ConstructionImageAnalyzer()

with open("floor_plan.png", "rb") as f:
    drawing_data = f.read()

data = analyzer.extract_drawing_data(drawing_data)

print(f"Drawing: {data['title_block'].get('drawing_number')}")
print(f"Project: {data['title_block'].get('project_name')}")
for table in data['tables']:
    print(f"Table with {len(table['records'])} rows")
```
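
The simulated OCR output includes a `SCALE: 1:100` text region, and `ExtractionType` lists `MEASUREMENT`, but the Quick Start does not show how a scale would be applied. As an illustration only, the sketch below parses a ratio string and converts a length measured on the sheet into a real-world length. `parse_scale` and `drawing_to_real_mm` are hypothetical helpers, and millimetres are an assumed unit.

```python
from fractions import Fraction


def parse_scale(scale_text: str) -> Fraction:
    """Parse a ratio such as '1:100' into sheet units per real-world unit."""
    left, sep, right = scale_text.partition(":")
    if not sep:
        raise ValueError(f"Expected a ratio like '1:100', got {scale_text!r}")
    drawing, real = int(left.strip()), int(right.strip())
    if drawing <= 0 or real <= 0:
        raise ValueError(f"Scale parts must be positive: {scale_text!r}")
    return Fraction(drawing, real)


def drawing_to_real_mm(length_on_sheet_mm: float, scale: Fraction) -> float:
    """Convert a length measured on the sheet to the real-world length."""
    # Multiply before dividing to avoid rounding error from 1/100 as a float.
    return length_on_sheet_mm * scale.denominator / scale.numerator


scale = parse_scale("1:100")
print(drawing_to_real_mm(45, scale))  # 45 mm on the sheet at 1:100
```

A `Fraction` keeps the ratio exact, so a 45 mm line at 1:100 converts to exactly 4500 mm.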

### Detect Safety Violations

```python
detector = ObjectDetector()

with open("site_photo.jpg", "rb") as f:
    image_data = f.read()

safety = detector.detect_safety_compliance(image_data)

if safety['overall_compliance'] == 'non-compliant':
    for violation in safety['violations']:
        print(f"Violation: {violation['type']} - Count: {violation['count']}")
```
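
`detect_safety_compliance` compares counts only, so a hard hat lying on the ground would count toward compliance. A possible refinement, sketched here under simplifying assumptions, is to check whether each worker's bounding box contains the centre of at least one PPE box. `Box`, `center_inside`, and `workers_missing_item` are hypothetical names, and the rule ignores cases such as overlapping workers sharing one detection.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:  # simplified stand-in for the Quick Start BoundingBox
    x: int
    y: int
    width: int
    height: int


def center_inside(inner: Box, outer: Box) -> bool:
    """True if the centre point of `inner` falls within `outer`."""
    cx = inner.x + inner.width / 2
    cy = inner.y + inner.height / 2
    return (outer.x <= cx <= outer.x + outer.width
            and outer.y <= cy <= outer.y + outer.height)


def workers_missing_item(workers: List[Box], items: List[Box]) -> List[int]:
    """Return indexes of workers with no item box centred inside their box."""
    return [
        index
        for index, worker in enumerate(workers)
        if not any(center_inside(item, worker) for item in items)
    ]


# First worker and hard hat reuse the simulated coordinates from detect().
workers = [Box(200, 300, 80, 180), Box(500, 300, 80, 180)]
hard_hats = [Box(210, 300, 30, 25)]
print(workers_missing_item(workers, hard_hats))
```

Here the second worker has no hard hat inside their box, so their index is reported even though a count-based check would also need a second hat to notice.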

## Quick Reference

| Component | Purpose |
|-----------|---------|
| `ConstructionImageAnalyzer` | Main analysis engine |
| `OCREngine` | Text extraction |
| `ObjectDetector` | Object detection |
| `TableExtractor` | Table extraction |
| `ProgressAnalyzer` | Progress analysis |
| `ImageAnalysisResult` | Complete analysis result |
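
`TableExtractor.table_to_dataframe` returns plain dictionaries rather than a real DataFrame. If extracted tables need to leave Python, one dependency-free option is CSV. The sketch below is an assumption about how that could look, using only the standard library; `table_to_csv` is a hypothetical helper that takes the same `headers` and `rows` fields as `ExtractedTable`.

```python
import csv
import io
from typing import List


def table_to_csv(headers: List[str], rows: List[List[str]]) -> str:
    """Serialize a header row and data rows to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(headers)
    for row in rows:
        # OCR can drop or merge cells, so reject ragged rows explicitly.
        if len(row) != len(headers):
            raise ValueError(f"Row has {len(row)} cells, expected {len(headers)}")
        writer.writerow(row)
    return buffer.getvalue()


print(table_to_csv(
    ["Activity", "Start", "End"],
    [["Foundation", "2024-01-01", "2024-01-15"]],
))
```

The `csv` module quotes cells that contain commas, so values such as `MEP, Rough-in` survive a round trip.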

## Resources

- **Book**: "Data-Driven Construction" by Artem Boiko, Chapter 2.4
- **Website**: https://datadrivenconstruction.io

## Next Steps

- Use [cad-to-data](../cad-to-data/SKILL.md) for CAD/BIM extraction
- Use [defect-detection-ai](../../../DDC_Innovative/defect-detection-ai/SKILL.md) for defects
- Use [safety-compliance-checker](../../../DDC_Innovative/safety-compliance-checker/SKILL.md) for safety