ClaudeWave
Skill · 408 repo stars · updated 7mo ago

csv-data-visualizer

CSV Data Visualizer creates interactive Plotly charts and statistical analyses from CSV files, including histograms, scatter plots, box plots, correlation heatmaps, and time series visualizations. Use this skill when users need exploratory data analysis, distribution analysis, relationship comparisons, trend identification, or automated data profiling with presentation-ready interactive visualizations.

Install in Claude Code
git clone --depth 1 https://github.com/ailabs-393/ai-labs-claude-skills /tmp/csv-data-visualizer && cp -r /tmp/csv-data-visualizer/packages/skills/csv-data-visualizer ~/.claude/skills/csv-data-visualizer
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# CSV Data Visualizer

## Overview

This skill enables comprehensive data visualization and analysis for CSV files. It provides three main capabilities: (1) creating individual interactive visualizations using Plotly, (2) automatic data profiling with statistical summaries, and (3) generating multi-plot dashboards. The skill is optimized for exploratory data analysis, statistical reporting, and creating presentation-ready visualizations.

## When to Use This Skill

Invoke this skill when users request:
- "Visualize this CSV data"
- "Create a histogram/scatter plot/box plot from this data"
- "Show me the distribution of [column]"
- "Generate a dashboard for this dataset"
- "Profile this CSV file" or "Analyze this data"
- "Create a correlation heatmap"
- "Show trends over time"
- "Compare [variable] across [categories]"

## Core Capabilities

### 1. Individual Visualizations

Create specific chart types for detailed analysis using the `visualize_csv.py` script.

**Available Chart Types:**

**Statistical Plots:**
```bash
# Histogram - distribution of numeric data
python3 scripts/visualize_csv.py data.csv --histogram column_name --bins 30

# Box plot - show quartiles and outliers
python3 scripts/visualize_csv.py data.csv --boxplot column_name

# Box plot grouped by category
python3 scripts/visualize_csv.py data.csv --boxplot salary --group-by department

# Violin plot - distribution with probability density
python3 scripts/visualize_csv.py data.csv --violin column_name --group-by category
```
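
To make "quartiles and outliers" concrete, here is a minimal standard-library sketch of the numbers a box plot summarizes. It assumes the common 1.5 × IQR fence rule and the `inclusive` quartile method; the script and Plotly may compute quartiles differently, and `box_plot_summary` and the sample values are hypothetical.

```python
import statistics


def box_plot_summary(values):
    """Return quartiles and 1.5*IQR outliers for a list of numbers."""
    # Assumption: inclusive quartiles; other tools may use another method.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}


# Made-up sample values, for illustration only
salaries = [40, 42, 45, 47, 50, 52, 55, 120]
summary = box_plot_summary(salaries)
print(summary)
```

A point outside the fences is what a box plot typically draws as an individual marker beyond the whiskers.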

**Relationship Analysis:**
```bash
# Scatter plot with automatic trend line
python3 scripts/visualize_csv.py data.csv --scatter height weight

# Scatter plot with color and size encoding
python3 scripts/visualize_csv.py data.csv --scatter x y --color category --size value

# Correlation heatmap for all numeric columns
python3 scripts/visualize_csv.py data.csv --correlation
```
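
The heatmap is easier to read with the underlying numbers in mind. The sketch below computes a correlation matrix in plain Python, assuming Pearson correlation (the source does not say which coefficient the script uses). The `pearson` and `correlation_matrix` helpers and the sample columns are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Map every pair of column names to its correlation coefficient."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up numeric columns, for illustration only
columns = {
    "height": [150, 160, 170, 180],
    "weight": [50, 60, 70, 80],
    "age": [40, 30, 20, 10],
}
matrix = correlation_matrix(columns)
print(matrix["height"]["weight"], matrix["height"]["age"])
```

Values near 1 or -1 show up as the strongest colors in a heatmap; values near 0 indicate little linear relationship.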

**Time Series:**
```bash
# Line chart for single variable
python3 scripts/visualize_csv.py data.csv --line date sales

# Multiple variables on same chart
python3 scripts/visualize_csv.py data.csv --line date "sales,revenue,profit"
```

**Categorical Data:**
```bash
# Bar chart (counts categories automatically)
python3 scripts/visualize_csv.py data.csv --bar category

# Pie chart for composition
python3 scripts/visualize_csv.py data.csv --pie region
```
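
"Counts categories automatically" amounts to tallying how often each value appears in the chosen column. A minimal sketch of that tally, using only the standard library, might look like this; `category_counts` and the inline sample data are hypothetical, and the script's own implementation may differ.

```python
import csv
import io
from collections import Counter


def category_counts(csv_text, column):
    """Count how many rows carry each distinct value of `column`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


# Made-up CSV content, for illustration only
sample = "region,sales\nNorth,10\nSouth,7\nNorth,3\nEast,5\n"
counts = category_counts(sample, "region")
print(counts.most_common())
```

The same counts would drive both the bar heights and the pie slice sizes.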

**Output Formats:**
Specify the output file with the desired format extension:
```bash
# Interactive HTML (default)
python3 scripts/visualize_csv.py data.csv --histogram age -o output.html

# Static image formats
python3 scripts/visualize_csv.py data.csv --scatter x y -o plot.png
python3 scripts/visualize_csv.py data.csv --correlation -o heatmap.pdf
python3 scripts/visualize_csv.py data.csv --bar category -o chart.svg
```
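
Since the output format follows from the file extension, a dispatch along these lines is one plausible way to tell interactive from static output. This is a sketch, not the script's actual logic; `output_kind` is a hypothetical helper, and the extension sets only mirror the examples above.

```python
from pathlib import Path

# Extensions taken from the examples above; the script may accept others.
INTERACTIVE = {".html"}
STATIC = {".png", ".pdf", ".svg"}


def output_kind(output_path):
    """Classify an output path as 'interactive' or 'static' by its extension."""
    suffix = Path(output_path).suffix.lower()
    if suffix in INTERACTIVE:
        return "interactive"
    if suffix in STATIC:
        return "static"
    raise ValueError(f"Unsupported output extension: {suffix!r}")


print(output_kind("output.html"))
print(output_kind("heatmap.PDF"))
```

If static export fails, it is worth checking whether Plotly's image export dependency (normally the separate `kaleido` package) is installed.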

### 2. Automatic Data Profiling

Generate comprehensive data quality and statistical reports using the `data_profile.py` script.

**Text Report (default):**
```bash
python3 scripts/data_profile.py data.csv
```

**HTML Report:**
```bash
python3 scripts/data_profile.py data.csv -f html -o report.html
```

**JSON Report:**
```bash
python3 scripts/data_profile.py data.csv -f json -o profile.json
```

**What the Profiler Provides:**
- File information (size, dimensions)
- Dataset overview (shape, memory usage, duplicates)
- Column-by-column analysis (types, missing data, unique values)
- Missing data patterns and completeness
- Statistical summary for numeric columns (mean, std, quartiles, skewness, kurtosis)
- Categorical column analysis (frequency counts, most/least common values)
- Data quality checks (high missing data, duplicate rows, constant columns, high cardinality)
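
For a feel of what a few of these checks involve, the sketch below profiles a small CSV with the standard library only. It covers only missing values, unique counts, constant columns, basic numeric statistics, and duplicate rows. `profile_columns` is a hypothetical helper, not the logic of `data_profile.py`, and it assumes empty cells mean missing data.

```python
import csv
import io
import statistics


def profile_columns(csv_text):
    """Return (per-column report, number of duplicate rows) for CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    report = {}
    for name in reader.fieldnames or []:
        raw = [row[name] for row in rows]
        # Assumption: an empty cell (or a short row) counts as missing.
        present = [v for v in raw if v not in ("", None)]
        info = {
            "missing": len(raw) - len(present),
            "unique": len(set(present)),
            "constant": len(set(present)) <= 1,
        }
        try:
            numbers = [float(v) for v in present]
        except ValueError:
            numbers = None  # treat the column as categorical
        if numbers and len(numbers) >= 2:
            info["mean"] = statistics.mean(numbers)
            info["std"] = statistics.stdev(numbers)
        report[name] = info
    duplicates = len(rows) - len({tuple(row.items()) for row in rows})
    return report, duplicates


# Made-up CSV content, for illustration only
sample = "id,dept,salary\n1,eng,100\n2,eng,\n3,ops,120\n3,ops,120\n"
report, duplicates = profile_columns(sample)
print(report["salary"], duplicates)
```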

**When to Use Profiling:**
Always recommend running data profiling BEFORE creating visualizations when:
- The user is unfamiliar with the dataset
- Data quality is unknown
- The appropriate visualization types still need to be identified
- A new dataset is being explored for the first time

### 3. Multi-Plot Dashboards

Create comprehensive dashboards with multiple visualizations using the `create_dashboard.py` script.

**Automatic Dashboard:**
Analyzes data types and automatically creates appropriate visualizations:
```bash
python3 scripts/create_dashboard.py data.csv
```

Custom output location:
```bash
python3 scripts/create_dashboard.py data.csv -o my_dashboard.html
```

Control number of plots:
```bash
python3 scripts/create_dashboard.py data.csv --max-plots 9
```

**Custom Dashboard from Config:**
Create a JSON configuration file specifying exact plots:
```bash
python3 scripts/create_dashboard.py data.csv --config config.json
```

**Dashboard Config Format:**
```json
{
  "title": "Sales Analysis Dashboard",
  "plots": [
    {"type": "histogram", "column": "revenue"},
    {"type": "box", "column": "revenue", "group_by": "region"},
    {"type": "scatter", "column": "advertising", "group_by": "revenue"},
    {"type": "bar", "column": "product_category"},
    {"type": "correlation"}
  ]
}
```

**Dashboard Plot Types:**
- `histogram`: Distribution of numeric column
- `box`: Box plot, optionally grouped by category
- `scatter`: Relationship between two numeric columns
- `bar`: Count of categorical values
- `correlation`: Heatmap of numeric correlations
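
Before passing a config to `create_dashboard.py`, a quick sanity check against the plot types listed above can catch typos early. The sketch below is a hypothetical pre-check, not part of the skill; it assumes every plot except `correlation` needs a `column` key, which is inferred from the example config rather than documented.

```python
import json

# Plot types listed in this document
ALLOWED_TYPES = {"histogram", "box", "scatter", "bar", "correlation"}


def validate_config(config_text):
    """Return a list of problems found in a dashboard config (empty if none)."""
    config = json.loads(config_text)
    plots = config.get("plots")
    if not isinstance(plots, list) or not plots:
        return ["'plots' must be a non-empty list"]
    problems = []
    for index, plot in enumerate(plots):
        if not isinstance(plot, dict):
            problems.append(f"plot {index}: must be an object")
            continue
        plot_type = plot.get("type")
        if plot_type not in ALLOWED_TYPES:
            problems.append(f"plot {index}: unknown type {plot_type!r}")
        elif plot_type != "correlation" and "column" not in plot:
            # Assumption inferred from the example config above
            problems.append(f"plot {index}: '{plot_type}' needs a 'column'")
    return problems


good = '{"title": "Demo", "plots": [{"type": "histogram", "column": "revenue"}, {"type": "correlation"}]}'
bad = '{"plots": [{"type": "pie", "column": "region"}, {"type": "bar"}]}'
print(validate_config(good))
print(validate_config(bad))
```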

## Workflow Decision Tree

Use this decision tree to determine the appropriate approach:

```
User provides CSV file
│
├─ "Profile this data" / "Analyze this data" / Unfamiliar dataset
│  └─> Run data_profile.py first
│     Then offer visualization options based on findings
│
├─ "Create dashboard" / "Overview of the data" / Multiple visualizations needed
│  ├─ User knows exact plots wanted
│  │  └─> Create JSON config → run create_dashboard.py with config
│  └─ User wants automatic dashboard
│     └─> Run create_dashboard.py (auto mode)
│
└─ Specific visualization requested ("histogram", "scatter plot", etc.)
   └─> Use visualize_csv.py with appropriate flag
```
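
The decision tree can also be read as a small routing function. The sketch below uses naive keyword matching purely to illustrate the branch order. `choose_script`, its keyword lists, and the mode labels are hypothetical, and real requests need more judgment than substring checks.

```python
def choose_script(request, knows_exact_plots=False):
    """Pick a script and mode for a request, following the tree's branch order."""
    text = request.lower()
    if "profile" in text or "analyze" in text:
        return ("scripts/data_profile.py", "profile")
    if "dashboard" in text or "overview" in text:
        mode = "config" if knows_exact_plots else "auto"
        return ("scripts/create_dashboard.py", mode)
    chart_words = ("histogram", "scatter", "box plot", "violin",
                   "heatmap", "line chart", "bar chart", "pie chart")
    if any(word in text for word in chart_words):
        return ("scripts/visualize_csv.py", "single chart")
    # Nothing matched: treat the dataset as unfamiliar and profile it first
    return ("scripts/data_profile.py", "profile")


print(choose_script("Create a histogram of age"))
print(choose_script("Generate a dashboard for this dataset", knows_exact_plots=True))
```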

## Best Practices

### Starting Analysis
1. **Always profile first** for unfamiliar datasets: `python3 scripts/data_p

brand-analyzer (Skill)

This skill should be used when the user requests brand analysis, brand guidelines creation, brand audits, or establishing brand identity and consistency standards. It provides comprehensive frameworks for analyzing brand elements and creating actionable brand guidelines based on requirements.

business-analytics-reporter (Skill)

This skill should be used when analyzing business sales and revenue data from CSV files to identify weak areas, generate statistical insights, and provide strategic improvement recommendations. Use when the user requests a business performance report, asks to analyze sales data, wants to identify areas of weakness, or needs recommendations on business improvement strategies.

business-document-generator (Skill)

This skill should be used when the user requests to create professional business documents (proposals, business plans, or budgets) from templates. It provides PDF templates and a Python script for generating filled documents from user data.

cicd-pipeline-generator (Skill)

This skill should be used when creating or configuring CI/CD pipeline files for automated testing, building, and deployment. Use this for generating GitHub Actions workflows, GitLab CI configs, CircleCI configs, or other CI/CD platform configurations. Ideal for setting up automated pipelines for Node.js/Next.js applications, including linting, testing, building, and deploying to platforms like Vercel, Netlify, or AWS.

codebase-documenter (Skill)

This skill should be used when writing documentation for codebases, including README files, architecture documentation, code comments, and API documentation. Use this skill when users request help documenting their code, creating getting-started guides, explaining project structure, or making codebases more accessible to new developers. The skill provides templates, best practices, and structured approaches for creating clear, beginner-friendly documentation.

data-analyst (Skill)

This skill should be used when analyzing CSV datasets, handling missing values through intelligent imputation, and creating interactive dashboards to visualize data trends. Use this skill for tasks involving data quality assessment, automated missing value detection and filling, statistical analysis, and generating Plotly Dash dashboards for exploratory data analysis.

docker-containerization (Skill)

This skill should be used when containerizing applications with Docker, creating Dockerfiles, docker-compose configurations, or deploying containers to various platforms. Ideal for Next.js, React, Node.js applications requiring containerization for development, production, or CI/CD pipelines. Use this skill when users need Docker configurations, multi-stage builds, container orchestration, or deployment to Kubernetes, ECS, Cloud Run, etc.

docx (Skill)

Comprehensive document creation, editing, and analysis with support for tracked changes, comments, formatting preservation, and text extraction. Use when Claude needs to work with professional documents (.docx files) for: (1) creating new documents, (2) modifying or editing content, (3) working with tracked changes, (4) adding comments, or any other document tasks.