Skill details (site mirror, no comments)
License: MIT-0
MIT-0 · Free to use, modify, and redistribute. No attribution required.
Version: v2.1.0
Stats: ⭐ 3 · 1.6k · 7 current installs · 7 all-time installs
🛡 VirusTotal: benign · OpenClaw: benign
Package:datadrivenconstruction/csv-handler
Security scan (ClawHub)
- VirusTotal: benign
- OpenClaw: benign
OpenClaw assessment
The skill is an instruction-only CSV processing helper that is internally consistent with its stated purpose and does not request unexpected credentials or network access; the only notable issue is a missing declaration of Python library dependencies (pandas, chardet).
Purpose
Name/description match the instructions and included Python code (CSV encoding/delimiter detection, cleaning, merging, splitting). Required binary (python3) is appropriate for a Python-based implementation. The declared filesystem permission in claw.json aligns with reading/writing CSV files.
Instruction scope
SKILL.md contains concrete Python code for reading, profiling, cleaning, merging, and splitting CSVs and limits actions to files the user provides. It will read files from disk (user-supplied paths) and write split/merged outputs; there are no instructions to read unrelated system configuration or to transmit data to external endpoints. Note: because it reads arbitrary files the user supplies, only provide non-sensitive data if you have concerns.
Install mechanism
This is an instruction-only skill (no install spec), which is low risk. However, the bundled Python code depends on third-party packages (pandas, chardet) that are not declared in the skill metadata or install spec — the environment must already have these libraries installed or the code will fail.
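Because the skill metadata does not declare these dependencies, a quick pre-flight check can confirm they are importable before the skill code runs. The following is a minimal sketch using only the standard library; the helper name `missing_packages` and the suggested `pip` command are illustrative assumptions, not part of the skill.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # pandas and chardet are the two third-party imports used by SKILL.md.
    missing = missing_packages(["pandas", "chardet"])
    if missing:
        print("Missing packages:", ", ".join(missing))
        # Assumption: pip is the package manager in use on this machine.
        print("One way to install them: python3 -m pip install " + " ".join(missing))
    else:
        print("pandas and chardet are available.")
```

`find_spec` only locates the package without importing it, so the check stays cheap even for a large library such as pandas.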
Credentials
The skill requests no environment variables, no credentials, and no external config paths. That is proportionate for a CSV-processing utility.
Persistence
The skill is not set to always:true and uses normal agent invocation. It does not attempt to modify other skills or system-wide settings in the provided materials.
Overall verdict
This skill appears to do what it says: profile and clean construction CSV exports. Before installing or running it: 1) Ensure your runtime has python3 with required libraries (pandas, chardet) since the skill assumes those but doesn't install them. 2) Only supply files you are comfortable having the agent read — the skill will read user-provided file paths and may write output files. 3) There are no declared network endpoints or credential req…
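Install (copy to 龙虾 AI)
Copy the entire paragraph below into a 龙虾中文库 chat; 龙虾 will complete the installation by following SKILL.md.
Please hand this paragraph to 龙虾中文库 (龙虾 AI) to execute: install the OpenClaw skill "Csv Handler" on this machine. Summary: Handle CSV files from construction software exports. Auto-detect delimiters, en….
Fetch the following address to read SKILL.md and complete the installation as documented: https://raw.githubusercontent.com/openclaw/skills/refs/heads/main/skills/datadrivenconstruction/csv-handler/SKILL.md
(Source: yingzhi8.cn skill library)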
SKILL.md
---
name: "csv-handler"
description: "Handle CSV files from construction software exports. Auto-detect delimiters, encodings, and clean messy data."
homepage: "https://datadrivenconstruction.io"
metadata: {"openclaw": {"emoji": "🏷️", "os": ["darwin", "linux", "win32"], "homepage": "https://datadrivenconstruction.io", "requires": {"bins": ["python3"]}}}
---
# CSV Handler for Construction Data
## Overview
CSV is the universal exchange format in construction - from scheduling exports to cost databases. This skill handles encoding issues, delimiter detection, and data cleaning.
## Python Implementation
```python
import pandas as pd
import csv
from typing import Dict, Any, List, Optional, Tuple
from pathlib import Path
from dataclasses import dataclass
import chardet


@dataclass
class CSVProfile:
    """Profile of CSV file."""
    encoding: str
    delimiter: str
    has_header: bool
    row_count: int
    column_count: int
    columns: List[str]


class ConstructionCSVHandler:
    """Handle CSV files from construction software."""

    # '\t' is the tab character; a bare 't' would count the letter t instead
    COMMON_DELIMITERS = [',', ';', '\t', '|']
    COMMON_ENCODINGS = ['utf-8', 'utf-8-sig', 'latin-1', 'cp1252', 'iso-8859-1']

    def __init__(self):
        self.last_profile: Optional[CSVProfile] = None

    def detect_encoding(self, file_path: str) -> str:
        """Detect file encoding."""
        # Sample the first 10,000 bytes; chardet may return None for the encoding
        with open(file_path, 'rb') as f:
            raw = f.read(10000)
        result = chardet.detect(raw)
        return result.get('encoding', 'utf-8') or 'utf-8'

    def detect_delimiter(self, file_path: str, encoding: str) -> str:
        """Detect CSV delimiter."""
        with open(file_path, 'r', encoding=encoding, errors='replace') as f:
            sample = f.read(5000)

        # Count occurrences of each candidate in the sample
        counts = {d: sample.count(d) for d in self.COMMON_DELIMITERS}

        # Return the most frequent candidate; fall back to ',' if none appears
        best = max(counts, key=counts.get)
        return best if counts[best] > 0 else ','

    def profile_csv(self, file_path: str) -> CSVProfile:
        """Profile CSV file."""
        encoding = self.detect_encoding(file_path)
        delimiter = self.detect_delimiter(file_path, encoding)

        # Read a small sample; pandas treats the first row as the header
        df = pd.read_csv(file_path, encoding=encoding, delimiter=delimiter,
                         nrows=10, on_bad_lines='skip')

        # Heuristic: a purely numeric first column name suggests a missing header.
        # str() guards against non-string column labels.
        first_name = str(df.columns[0])
        has_header = not first_name.replace('.', '').replace('-', '').isdigit()

        # Full row count, based on physical lines in the file
        # (a quoted field that spans several lines is counted more than once)
        with open(file_path, 'r', encoding=encoding, errors='replace') as f:
            row_count = sum(1 for _ in f) - (1 if has_header else 0)

        profile = CSVProfile(
            encoding=encoding,
            delimiter=delimiter,
            has_header=has_header,
            row_count=row_count,
            column_count=len(df.columns),
            columns=list(df.columns)
        )
        self.last_profile = profile
        return profile

    def read_csv(self, file_path: str,
                 encoding: Optional[str] = None,
                 delimiter: Optional[str] = None,
                 clean: bool = True) -> pd.DataFrame:
        """Read CSV with auto-detection."""
        # Auto-detect if not provided
        if encoding is None:
            encoding = self.detect_encoding(file_path)
        if delimiter is None:
            delimiter = self.detect_delimiter(file_path, encoding)

        # Read with error handling: malformed lines are skipped, not raised
        df = pd.read_csv(
            file_path,
            encoding=encoding,
            delimiter=delimiter,
            on_bad_lines='skip',
            low_memory=False
        )
        if clean:
            df = self.clean_dataframe(df)
        return df
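
    def clean_dataframe(self, df: pd.DataFrame) -> pd.DataFrame:
        """Clean construction CSV data."""
        # Work on a copy so the caller's DataFrame is not modified
        df = df.copy()

        # Clean column names
        df.columns = [self._clean_column_name(c) for c in df.columns]

        # Remove empty rows and columns
        df = df.dropna(how='all')
        df = df.dropna(axis=1, how='all')

        # Strip whitespace from strings. Only str values are touched, so
        # non-string values in a mixed column are left as they are
        # (.str.strip() would turn them into NaN).
        for col in df.select_dtypes(include=['object']).columns:
            df[col] = df[col].map(lambda v: v.strip() if isinstance(v, str) else v)
        return df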

    def _clean_column_name(self, name: str) -> str:
        """Clean column name."""
        if not isinstance(name, str):
            return str(name)
        # Lowercase, replace spaces and hyphens with underscores,
        # then drop any remaining special characters
        clean = name.strip().lower()
        clean = clean.replace(' ', '_').replace('-', '_')
        clean = ''.join(c for c in clean if c.isalnum() or c == '_')
        return clean

    def merge_csvs(self, file_paths: List[str],
                   on_column: Optional[str] = None) -> pd.DataFrame:
        """Merge multiple CSV files."""
        dfs = []
        for path in file_paths:
            df = self.read_csv(path)
            df['_source_file'] = Path(path).name
            dfs.append(df)

        if not dfs:
            return pd.DataFrame()

        # With a key column present in the first file: outer-join on that key
        if on_column and on_column in dfs[0].columns:
            result = dfs[0]
            for df in dfs[1:]:
                result = pd.merge(result, df, on=on_column, how='outer')
            return result

        # Otherwise: stack the rows of all files
        return pd.concat(dfs, ignore_index=True)

    def split_csv(self, df: pd.DataFrame,
                  group_column: str,
                  output_dir: str) -> List[str]:
        """Split CSV by column values."""
        output_path = Path(output_dir)
        output_path.mkdir(parents=True, exist_ok=True)

        # Write one file per distinct value in group_column
        files = []
        for value in df[group_column].unique():
            subset = df[df[group_column] == value]
            filename = f"{group_column}_{value}.csv"
            filepath = output_path / filename
            subset.to_csv(filepath, index=False)
            files.append(str(filepath))
        return files

    def convert_types(self, df: pd.DataFrame,
                      type_map: Optional[Dict[str, str]] = None) -> pd.DataFrame:
        """Convert column types intelligently."""
        df = df.copy()
        if type_map:
            # Explicit mapping: leave a column unchanged if the cast fails
            for col, dtype in type_map.items():
                if col in df.columns:
                    try:
                        df[col] = df[col].astype(dtype)
                    except (ValueError, TypeError):
                        pass
        else:
            # Auto-convert
            for col in df.columns:
                # Try numeric
                try:
                    df[col] = pd.to_numeric(df[col])
                    continue
                except (ValueError, TypeError):
                    pass
                # Try datetime
                try:
                    df[col] = pd.to_datetime(df[col])
                except (ValueError, TypeError, OverflowError):
                    pass
        return df

    def export_csv(self, df: pd.DataFrame,
                   file_path: str,
                   encoding: str = 'utf-8-sig',
                   delimiter: str = ',') -> str:
        """Export DataFrame to CSV."""
        df.to_csv(file_path, encoding=encoding, sep=delimiter, index=False)
        return file_path


# Specialized handlers
class ScheduleCSVHandler(ConstructionCSVHandler):
    """Handler for project schedule CSVs."""

    SCHEDULE_COLUMNS = ['task_id', 'task_name', 'start_date', 'end_date',
                        'duration', 'predecessors', 'resources']

    def parse_schedule(self, file_path: str) -> pd.DataFrame:
        """Parse schedule CSV."""
        df = self.read_csv(file_path)

        # Convert date columns, matched by substring in the column name
        for col in df.columns:
            if 'date' in col.lower() or 'start' in col.lower() or 'end' in col.lower():
                # Skip numeric columns (e.g. a 'vendor_id' that matches 'end'):
                # to_datetime would read the numbers as epoch offsets
                if pd.api.types.is_numeric_dtype(df[col]):
                    continue
                try:
                    df[col] = pd.to_datetime(df[col])
                except (ValueError, TypeError, OverflowError):
                    pass
        return df


class CostCSVHandler(ConstructionCSVHandler):
    """Handler for cost/estimate CSVs."""

    def parse_costs(self, file_path: str) -> pd.DataFrame:
        """Parse cost CSV."""
        df = self.read_csv(file_path)

        # Find and convert numeric columns: strip '$' and ',' first,
        # then coerce anything still unparseable to NaN
        for col in df.columns:
            if any(word in col.lower() for word in ['cost', 'price', 'amount', 'total', 'qty', 'quantity']):
                df[col] = pd.to_numeric(df[col].replace(r'[$,]', '', regex=True), errors='coerce')
        return df
```
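The delimiter detection above picks whichever candidate character occurs most often in the sample. As a possible alternative (not what the skill does), the standard library's `csv.Sniffer` also checks that a delimiter occurs consistently from line to line. The sketch below assumes the same four candidate characters; the function name `sniff_delimiter` and its `default` parameter are hypothetical.

```python
import csv


def sniff_delimiter(sample: str, candidates: str = ",;\t|", default: str = ",") -> str:
    """Guess the delimiter of a CSV text sample with csv.Sniffer.

    Falls back to `default` when the sniffer cannot decide.
    """
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # Raised when no candidate delimiter fits the sample
        return default


if __name__ == "__main__":
    sample = "task_id;task_name;duration\n1;Excavation;5\n2;Formwork;3\n"
    print(repr(sniff_delimiter(sample)))
```

A sample with no candidate delimiter at all, such as a single-column file, falls through to the default instead of raising.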
## Quick Start
```python
handler = ConstructionCSVHandler()
# Profile CSV first
profile = handler.profile_csv("export.csv")
print(f"Encoding: {profile.encoding}, Delimiter: '{profile.delimiter}'")
# Read with auto-detection
df = handler.read_csv("export.csv")
print(f"Loaded {len(df)} rows, {len(df.columns)} columns")
```
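`COMMON_ENCODINGS` is defined on the handler but never used by `detect_encoding`, which relies on chardet alone. If chardet's guess turns out to be wrong, one option is to try a short list of encodings in order. The sketch below is an assumption about how such a fallback could look, not part of the skill; `decode_with_fallback` is a hypothetical helper. `latin-1` goes last because it accepts every byte sequence, so anything listed after it would never be reached.

```python
from typing import Sequence, Tuple


def decode_with_fallback(
    raw: bytes,
    encodings: Sequence[str] = ("utf-8-sig", "cp1252", "latin-1"),
) -> Tuple[str, str]:
    """Decode bytes with the first encoding that succeeds.

    Returns (decoded_text, encoding_used).
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Only reachable if the caller's list omits latin-1, which never fails
    return raw.decode("latin-1"), "latin-1"


if __name__ == "__main__":
    # A header row as a Windows tool might export it (cp1252 bytes)
    raw = "Gr\u00f6\u00dfe;Menge\n".encode("cp1252")
    text, used = decode_with_fallback(raw)
    print(used, repr(text))
```

The encoding that worked can then be passed to `handler.read_csv(path, encoding=...)`, which skips auto-detection when an encoding is supplied.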
## Common Use Cases
### 1. Merge Multiple Exports
```python
files = ["jan_export.csv", "feb_export.csv", "mar_export.csv"]
merged = handler.merge_csvs(files)
```
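`merge_csvs` behaves differently depending on whether `on_column` is given: without it the rows are stacked, with it the files are outer-joined on the key. The sketch below reproduces both modes with plain pandas on two small made-up frames (the data and the `_jan`/`_feb` suffixes are illustrative assumptions; `merge_csvs` itself uses the pandas default suffixes).

```python
import pandas as pd

# Two hypothetical monthly exports sharing a task_id key
jan = pd.DataFrame({"task_id": [1, 2], "cost": [100.0, 200.0]})
feb = pd.DataFrame({"task_id": [2, 3], "cost": [250.0, 300.0]})

# Default mode: rows are appended, so task 2 appears twice
stacked = pd.concat([jan, feb], ignore_index=True)

# on_column mode: one row per task_id, overlapping columns get suffixes
joined = pd.merge(jan, feb, on="task_id", how="outer", suffixes=("_jan", "_feb"))

print(len(stacked), "stacked rows")
print(list(joined.columns), len(joined), "joined rows")
```

Stacking suits exports with the same layout from different periods; joining suits exports that describe the same items with different attributes.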
### 2. Split by Category
```python
handler.split_csv(df, group_column='category', output_dir='./split_files')
```
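`split_csv` builds each file name directly from the group value, so a value such as `Concrete/Formwork` would be read as a path with a subdirectory. A minimal sketch of sanitizing the name first, assuming a hypothetical helper `safe_filename` that is not part of the skill:

```python
import re


def safe_filename(group_column: str, value: object) -> str:
    """Build a CSV file name from a group value, replacing unsafe characters."""
    raw = f"{group_column}_{value}"
    # Keep word characters, dots and hyphens; collapse everything else to "_"
    cleaned = re.sub(r"[^\w.-]+", "_", raw).strip("._")
    return f"{cleaned or 'unnamed'}.csv"


if __name__ == "__main__":
    print(safe_filename("category", "Concrete/Formwork"))
```

Two different values can collapse to the same name after cleaning, so a caller that needs one file per value may want to check for collisions before writing.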
### 3. Schedule Import
```python
schedule_handler = ScheduleCSVHandler()
schedule = schedule_handler.parse_schedule("p6_export.csv")
```
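`parse_schedule` leaves the date format for `pd.to_datetime` to infer, which can be ambiguous for values like `03/04/2024`. When the export's format is known, parsing against an explicit list of formats removes the guesswork. The sketch below uses only the standard library; the format list is an assumption for illustration and should be replaced with whatever the scheduling tool actually writes.

```python
from datetime import datetime
from typing import Optional, Sequence


def parse_date(
    text: str,
    formats: Sequence[str] = ("%Y-%m-%d", "%d.%m.%Y", "%d/%m/%Y"),
) -> Optional[datetime]:
    """Parse a date string against explicit formats, tried in order.

    Returns None when no format matches.
    """
    for fmt in formats:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None


if __name__ == "__main__":
    # With day-first formats listed, 03/04/2024 is read as 3 April 2024
    print(parse_date("03/04/2024"))
```

The order of `formats` decides how ambiguous values are read, so month-first exports would need `%m/%d/%Y` in place of `%d/%m/%Y`.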
## Resources
- **DDC Book**: Chapter 2.1 - Structured Data