Skill · 230 repo stars · updated 13d ago

csv-excel-merger

The csv-excel-merger skill combines multiple CSV or Excel files into a single consolidated dataset by automatically matching columns across different schemas, deduplicating records based on identified keys, and resolving conflicts between overlapping data. Use it when users need to merge spreadsheets from different sources, consolidate data exports with varying column names or formats, or combine multiple files into one unified output with verified row counts and data integrity.

Repository: claude-skills

Install in Claude Code

git clone --depth 1 https://github.com/OneWave-AI/claude-skills /tmp/csv-excel-merger && cp -r /tmp/csv-excel-merger/csv-excel-merger ~/.claude/skills/csv-excel-merger

Then open a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# CSV/Excel Merger

Merge multiple CSV or Excel files with automatic column matching, deduplication, and conflict resolution.

## Contents

- [Workflow](#workflow) — the step-by-step merge process
- [Verification](#verification) — confirm the merge before handing it back
- [Special cases](#special-cases) — encoding, compound keys, large files
- [Guidelines](#guidelines) — quality and transparency standards
- [Example triggers](#example-triggers)
- `references/merge_strategies.md` — column matching, conflict resolution, and dedup options
- `references/output_template.md` — the merge-report format

## Workflow

1. **Inspect the inputs.** Determine file count, format (CSV / Excel / TSV), and whether the files are attached or read from disk. Read each header plus a sample of rows to identify column names, data types, and encoding (UTF-8, Latin-1). Note the candidate primary key.

2. **Plan the merge.** Match columns across files to one unified schema, choose a conflict-resolution rule, and pick a deduplication strategy. See `references/merge_strategies.md` for the matching heuristics and the full set of options.

3. **Execute the merge** with pandas:

   ```python
   import pandas as pd

   df1 = pd.read_csv("file1.csv")
   df2 = pd.read_csv("file2.csv")

   # Normalize, then map column names onto the unified schema
   for df in (df1, df2):
       df.columns = df.columns.str.lower().str.strip()
   df2 = df2.rename(columns={"firstname": "first_name", "e_mail": "email"})

   # Stack the rows, then normalize the key so " A@x.com" and "a@x.com" dedupe together
   merged = pd.concat([df1, df2], ignore_index=True)
   merged["email"] = merged["email"].str.strip().str.lower()
   # keep="last" lets rows from later files win; rows with a null key collapse to one
   merged = merged.drop_duplicates(subset=["email"], keep="last")
   merged.to_csv("merged_output.csv", index=False)
   ```

4. **Verify the result** before reporting — see [Verification](#verification).

5. **Report** using the layout in `references/output_template.md`, then offer export options: CSV (UTF-8), Excel (.xlsx), JSON, SQL INSERT statements, or Parquet for large datasets.
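
The inspection in step 1 can be sketched with the standard library alone. This is a minimal illustration, not part of the skill: `sniff_header` is a hypothetical helper, and it assumes each file is either UTF-8 (with or without a BOM) or Latin-1. Latin-1 can decode any byte sequence, so it has to be tried last.

```python
import csv


def sniff_header(raw: bytes, encodings=("utf-8-sig", "latin-1")):
    """Return (encoding, header) for the first encoding that decodes `raw`.

    Hypothetical helper: the encoding list is an assumption, not a detector.
    """
    for enc in encodings:
        try:
            text = raw.decode(enc)
        except UnicodeDecodeError:
            continue  # wrong guess, try the next encoding
        # The first CSV record is the header; an empty file yields [].
        header = next(csv.reader(text.splitlines()), [])
        return enc, [name.strip() for name in header]
    raise ValueError(f"could not decode input with any of {encodings}")


# Bytes that are valid Latin-1 but not valid UTF-8 fall through to the second guess.
print(sniff_header("e_mail,café\n".encode("latin-1")))
```

For a file on disk, pass `Path("file1.csv").read_bytes()` (reading only a leading slice is cheaper for large files, but can miss non-UTF-8 bytes further in) and hand the detected encoding to `pd.read_csv(..., encoding=...)`.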

## Verification

Never hand back a merge without checking it. After merging, assert the row math holds and the key is actually unique:

```python
total_in = len(df1) + len(df2)
assert len(merged) > 0, "merge produced an empty frame"
assert len(merged) <= total_in, "more rows than inputs — check the concat/join"
assert merged["email"].is_unique, "duplicate keys remain after dedup"

print(f"in: {total_in} rows | out: {len(merged)} rows | removed: {total_in - len(merged)}")
print(f"null keys: {merged['email'].isna().sum()} | columns: {list(merged.columns)}")
```

Report rows in vs. out, duplicates removed, and per-column completeness so the user can sanity-check the numbers against their own expectations.
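
Per-column completeness is not computed by the assertions above. One possible way to get it, as a sketch: `completeness` is a hypothetical helper and `sample` is made-up data standing in for the merged frame.

```python
import pandas as pd


def completeness(frame: pd.DataFrame) -> pd.Series:
    """Percentage of non-null cells per column, rounded to one decimal."""
    return (frame.notna().mean() * 100).round(1)


# Made-up stand-in for the merged frame
sample = pd.DataFrame({
    "email": ["a@x.com", "b@x.com", "c@x.com", "d@x.com"],
    "first_name": ["Ann", None, "Cy", "Dee"],
    "phone": [None, None, "555-0100", None],
})

report = completeness(sample)
print(report.to_string())
```

Only true nulls count as missing here. Empty strings are treated as present, so convert them to nulls first if the inputs use `""` for missing values.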

## Special cases

- **Compound keys** — when no single column is unique, key on a tuple: `subset=["email", "company"]`.
- **Mixed data types** — standardize dates, phone numbers, and country codes; strip whitespace and normalize casing *before* deduping, or near-duplicates slip through.
- **Missing columns** — fill absent columns with empty values and flag them in the report; never silently drop data.
- **Large files (>100MB)** — read in chunks (`pd.read_csv(path, chunksize=...)`), report progress, and estimate memory before loading everything at once.
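
For the large-file case, a chunked merge might look like the sketch below. It rests on several assumptions: `merge_in_chunks` is a hypothetical helper, the inputs already share one schema and column order, keys are non-null, and the set of seen keys fits in memory. It also keeps the first occurrence of each key, unlike the `keep="last"` rule in the workflow, because earlier rows have already been written out by the time a later duplicate appears.

```python
import io

import pandas as pd


def merge_in_chunks(sources, out, key, chunksize=50_000):
    """Stream rows from `sources` into `out`, keeping the first row per key.

    `sources` are paths or file-like objects; `out` is a writable text handle.
    Returns the number of rows written.
    """
    seen = set()
    written = 0
    header_done = False
    for src in sources:
        for chunk in pd.read_csv(src, chunksize=chunksize):
            # Drop repeats inside the chunk, then repeats of earlier chunks.
            chunk = chunk.drop_duplicates(subset=[key], keep="first")
            fresh = chunk[~chunk[key].isin(seen)]
            seen.update(fresh[key])
            fresh.to_csv(out, header=not header_done, index=False)
            header_done = True
            written += len(fresh)
    return written


# In-memory stand-ins for two files on disk
a = io.StringIO("email,name\na@x.com,Ann\nb@x.com,Bob\na@x.com,Ann2\n")
b = io.StringIO("email,name\nb@x.com,Bobby\nc@x.com,Cy\n")
out = io.StringIO()
rows = merge_in_chunks([a, b], out, key="email", chunksize=2)
print(rows)
```

With real files, open the output once (`open("merged_output.csv", "w", newline="")`) and pass the handle as `out`. Peak memory then depends on the chunk size and the number of distinct keys rather than on total file size.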

## Guidelines

- **Column matching** — prefer exact, then case-insensitive, then fuzzy. Always emit the original → unified mapping so every match is auditable, and allow manual override.
- **Data quality** — trim whitespace, standardize formats, flag invalid values, preserve types.
- **Transparency** — track the source file for every surviving row, log each merge decision, and report all conflicts with their resolutions.
- **Performance** — chunk large files, process in batches, and show progress on long-running merges.
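
The exact, then case-insensitive, then fuzzy order from the column-matching guideline can be illustrated with the standard library's `difflib`. This is a sketch under assumptions, not the skill's own matcher: `match_columns` and the `0.8` cutoff are illustrative choices, and the heuristics in `references/merge_strategies.md` may differ.

```python
from difflib import get_close_matches


def match_columns(columns, schema, cutoff=0.8):
    """Map each source column to (unified_name, how), trying the cheapest rule first."""
    lowered = {name.lower(): name for name in schema}
    mapping = {}
    for col in columns:
        clean = col.strip()
        if clean in schema:
            mapping[col] = (clean, "exact")
        elif clean.lower() in lowered:
            mapping[col] = (lowered[clean.lower()], "case-insensitive")
        else:
            # Fuzzy match on the lowercased name; cutoff is an assumed threshold.
            close = get_close_matches(clean.lower(), list(lowered), n=1, cutoff=cutoff)
            mapping[col] = (lowered[close[0]], "fuzzy") if close else (None, "unmatched")
    return mapping


schema = ["email", "first_name", "company"]
mapping = match_columns(["email", "Company", "firstname", "E_Mail", "zip"], schema)
for source, (target, how) in mapping.items():
    print(f"{source!r} -> {target!r} ({how})")
```

Returning the rule that produced each match keeps the mapping auditable, and unmatched columns come back as `None` so they can be flagged or overridden manually rather than guessed.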

## Example triggers

- "Merge these three CSV files"
- "Combine multiple Excel sheets into one file"
- "Deduplicate and merge customer data"
- "Join spreadsheets with different column names"
- "Consolidate contact lists from different sources"
