Slash Command · 430 repo stars · updated 7mo ago

generate

The /generate slash command creates production-ready data analysis code in a specified programming language (Python, R, SQL, or JavaScript) for a designated analysis type (data-cleaning, statistical, visualization, machine-learning, or custom). Use this command when you need scaffolded, documented code with proper error handling and best practices implemented for your specific analytical task.

Repository: claude-data-analysis

Install in Claude Code


mkdir -p ~/.claude/commands && curl -fsSL https://raw.githubusercontent.com/liangdabiao/claude-data-analysis/HEAD/.claude/commands/generate.md -o ~/.claude/commands/generate.md

Then start a new Claude Code session; the slash command loads automatically.

Definition

generate.md

# Code Generation Command

Generate data analysis code in `$1` language for `$2` analysis type using the code-generator subagent.

## Context
- Programming language: $1 (python, r, sql, javascript)
- Analysis type: $2 (data-cleaning, statistical, visualization, machine-learning, custom)
- Current working directory: !`pwd`
- Output directory: ./generated_code/
- Available libraries and frameworks based on language

## Your Task

Use the code-generator subagent to create high-quality, production-ready analysis code:

### 1. Requirements Analysis
- Understand the specific analysis requirements
- Identify appropriate libraries and frameworks
- Consider data types and volumes
- Plan for scalability and performance

### 2. Code Architecture
- Design modular, reusable code structure
- Implement proper error handling
- Include comprehensive documentation
- Add unit tests where appropriate

### 3. Implementation
- Write clean, efficient, and maintainable code
- Include proper data validation
- Implement best practices for the language
- Add logging and debugging capabilities

### 4. Documentation
- Create comprehensive code documentation
- Include usage examples and tutorials
- Provide troubleshooting guidance
- Document dependencies and requirements

## Language Support

### Python
- **Libraries**: pandas, numpy, matplotlib, seaborn, scikit-learn, plotly
- **Use Cases**: Data cleaning, statistical analysis, machine learning, visualization
- **Output**: Jupyter notebooks, Python scripts, modules

### R
- **Libraries**: tidyverse, ggplot2, dplyr, caret, shiny
- **Use Cases**: Statistical analysis, data visualization, bioinformatics
- **Output**: R scripts, R Markdown documents, Shiny apps

### SQL
- **Dialects**: PostgreSQL, MySQL, SQLite, BigQuery, Redshift
- **Use Cases**: Data extraction, aggregation, reporting, ETL
- **Output**: SQL queries, stored procedures, views

### JavaScript
- **Libraries**: D3.js, Plotly.js, Chart.js, TensorFlow.js
- **Use Cases**: Web visualizations, interactive dashboards, client-side ML
- **Output**: HTML/JS files, Node.js scripts, web applications

## Analysis Types

### Data Cleaning
- Missing value handling
- Outlier detection and treatment
- Data type conversion
- Normalization and standardization
- Feature engineering
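
A minimal sketch of several of the steps above (type conversion, missing-value handling, outlier treatment, standardization) could look like the following. The function name `clean_frame`, the median imputation, the 1.5 × IQR clipping rule, and the sample `amount` column are illustrative assumptions, not something this command prescribes.

```python
from typing import List

import pandas as pd


def clean_frame(df: pd.DataFrame, numeric_cols: List[str]) -> pd.DataFrame:
    """Return a cleaned copy of df; the input frame is left unchanged."""
    out = df.copy()
    for col in numeric_cols:
        # Data type conversion: entries that cannot be parsed become NaN
        out[col] = pd.to_numeric(out[col], errors="coerce")
        # Missing value handling: impute with the column median (assumed policy)
        out[col] = out[col].fillna(out[col].median())
        # Outlier treatment: clip to the 1.5 * IQR fences (assumed policy)
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        out[col] = out[col].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
        # Standardization: zero mean, unit variance; constant columns are skipped
        std = out[col].std(ddof=0)
        if std > 0:
            out[col] = (out[col] - out[col].mean()) / std
    return out


# Made-up sample data: one unparseable entry, one missing value, one outlier
raw = pd.DataFrame({"amount": ["10", "12", "bad", "11", "500", None]})
cleaned = clean_frame(raw, ["amount"])
```

Working on a copy keeps the raw data available for comparison, which helps when the imputation or clipping policy needs to be revisited.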

### Statistical Analysis
- Descriptive statistics
- Hypothesis testing
- Correlation and regression
- Time series analysis
- ANOVA and t-tests
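
As one illustration of the descriptive statistics and t-test items above, here is a small standard-library sketch that computes summary statistics and Welch's t statistic for two samples. The helper names and the sample values are made up for the example; a p-value would additionally need the t-distribution CDF, which a library such as SciPy can supply.

```python
import math
from statistics import mean, stdev


def describe(sample):
    """Descriptive statistics for one numeric sample."""
    return {"n": len(sample), "mean": mean(sample), "stdev": stdev(sample)}


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard errors of each sample mean
    va = stdev(a) ** 2 / len(a)
    vb = stdev(b) ** 2 / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    dof = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, dof


# Hypothetical measurements for two groups
control = [12.1, 11.8, 12.4, 12.0, 11.9]
treatment = [12.9, 13.1, 12.7, 13.3, 12.8]

summary = describe(control)
t_stat, dof = welch_t(control, treatment)
```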

### Visualization
- Chart creation code
- Dashboard implementation
- Interactive visualizations
- Custom plot types
- Animation and transitions
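
For the chart creation item, a minimal matplotlib sketch might look like this. The function name, axis labels, figure size, and segment values are placeholders chosen for illustration; the `Agg` backend is selected so the script also runs on machines without a display.

```python
import tempfile
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; no display required
import matplotlib.pyplot as plt


def save_bar_chart(labels, values, out_path: Path) -> Path:
    """Draw a labeled bar chart and write it to out_path."""
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.bar(labels, values)
    ax.set_xlabel("Segment")
    ax.set_ylabel("Revenue")
    ax.set_title("Revenue by segment")
    fig.tight_layout()
    fig.savefig(out_path, dpi=150)
    plt.close(fig)  # release the figure so repeated calls do not accumulate memory
    return out_path


# Made-up values, written to a temporary directory
chart_path = save_bar_chart(
    ["A", "B", "C"],
    [120, 95, 143],
    Path(tempfile.mkdtemp()) / "revenue_by_segment.png",
)
```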

### Machine Learning
- Data preprocessing
- Model training and evaluation
- Feature selection
- Hyperparameter tuning
- Model deployment
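
The preprocessing, training, evaluation, and hyperparameter tuning items can be combined in a single scikit-learn pipeline. The sketch below is one possible arrangement, assuming a synthetic dataset from `make_classification`, a logistic regression model, and a three-value grid for `C`; none of these choices come from the command itself, and deployment is not covered.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real dataset
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Keeping the scaler inside the pipeline means it is fit on training folds only
pipeline = Pipeline(
    [
        ("scale", StandardScaler()),
        ("model", LogisticRegression(max_iter=1000)),
    ]
)

# Hyperparameter tuning: 3-fold cross-validated search over regularization strength
param_grid = {"model__C": [0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(X_train, y_train)

# Evaluation on the held-out split with the best estimator found
test_accuracy = search.score(X_test, y_test)
```

Putting preprocessing inside the pipeline avoids leaking test-fold statistics into the scaler during cross-validation.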

### Custom
- User-specific requirements
- Domain-specific analysis
- Integration with existing systems
- Performance optimization
- Custom algorithms

## Expected Output

### Code Files
- `generated_code/$1_$2_analysis.py` - Main analysis script
- `generated_code/$1_$2_utils.py` - Utility functions
- `generated_code/$1_$2_config.py` - Configuration settings
- `generated_code/$1_$2_test.py` - Unit tests
- `generated_code/requirements_$1.txt` - Dependencies
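
The file names above are built from the two command arguments. The following sketch shows how they might be validated and expanded into paths. Note that the list above uses `.py` for every language; the per-language extension map here is an assumption of this example, as is the helper name `expected_files`.

```python
from pathlib import Path

# Allowed values are taken from the Context section; extensions are assumed
LANGUAGE_EXTENSIONS = {"python": "py", "r": "R", "sql": "sql", "javascript": "js"}
ANALYSIS_TYPES = {
    "data-cleaning",
    "statistical",
    "visualization",
    "machine-learning",
    "custom",
}


def expected_files(language: str, analysis_type: str, out_dir: str = "generated_code"):
    """Return the output paths for a given $1 (language) and $2 (analysis type)."""
    if language not in LANGUAGE_EXTENSIONS:
        raise ValueError(f"Unsupported language: {language!r}")
    if analysis_type not in ANALYSIS_TYPES:
        raise ValueError(f"Unsupported analysis type: {analysis_type!r}")
    ext = LANGUAGE_EXTENSIONS[language]
    base = Path(out_dir)
    stem = f"{language}_{analysis_type}"
    roles = ("analysis", "utils", "config", "test")
    paths = [base / f"{stem}_{role}.{ext}" for role in roles]
    paths.append(base / f"requirements_{language}.txt")
    return paths


files = expected_files("python", "data-cleaning")
```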

### Documentation
- **README.md**: Usage instructions and examples
- **API Documentation**: Function and class documentation
- **Tutorials**: Step-by-step guides
- **Troubleshooting**: Common issues and solutions

## Code Quality Standards

### Python Code Standards
```python
"""
High-quality Python code template for data analysis
"""

import pandas as pd
import numpy as np
from typing import Dict, List, Optional
import logging
from pathlib import Path

class DataAnalyzer:
    """
    Data analysis class with comprehensive functionality

    Args:
        data_path (str): Path to input data file
        config (Dict): Configuration parameters

    Attributes:
        data (pd.DataFrame): Loaded dataset
        config (Dict): Configuration settings
        logger (logging.Logger): Logger instance
    """

    def __init__(self, data_path: str, config: Optional[Dict] = None):
        self.data_path = Path(data_path)
        self.config = config or {}
        self.data = None
        self.logger = self._setup_logger()

    def _setup_logger(self) -> logging.Logger:
        """Set up logging configuration"""
        logger = logging.getLogger(__name__)
        logger.setLevel(logging.INFO)
        return logger

    def load_data(self) -> pd.DataFrame:
        """
        Load data from file with error handling

        Returns:
            pd.DataFrame: Loaded dataset

        Raises:
            FileNotFoundError: If data file doesn't exist
            ValueError: If data format is invalid
        """
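        if not self.data_path.exists():
            self.logger.error(f"Data file not found: {self.data_path}")
            raise FileNotFoundError(f"Data file not found: {self.data_path}")
        try:
            # Assumes CSV input; swap in another pandas reader for other formats
            self.data = pd.read_csv(self.data_path)
        except (pd.errors.ParserError, pd.errors.EmptyDataError) as e:
            self.logger.error(f"Error loading data: {e}")
            raise ValueError(f"Invalid data format in {self.data_path}") from e
        return self.data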
```
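
The template's `_setup_logger` only sets a level. With no handler attached anywhere, Python's logging module drops INFO messages and shows only warnings and above. A minimal sketch of a logger that writes formatted messages to stderr follows; the function name `setup_logger`, the logger name, and the format string are illustrative choices.

```python
import logging
import sys


def setup_logger(name: str, level: int = logging.INFO) -> logging.Logger:
    """Return a logger that writes formatted records to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    # Guard against attaching a second handler when called more than once
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


logger = setup_logger("data_analyzer")
logger.info("Logger configured")
```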

### SQL Code Standards
```sql
-- High-quality SQL template for data analysis
-- Include proper comments and documentation

-- Analysis: Customer Segmentation
-- Purpose: Identify customer segments based on purchase behavior
-- Dependencies: customers, orders, order_items tables

WITH customer_summary AS (
    -- Calculate customer-level metrics
    SELECT
        c.customer_id,
        c.customer_name,
        c.signup_date,
        COUNT(DISTINCT o.order_id) AS total_orders,
        SUM(oi.quantity * oi.unit_price) AS total_revenue,
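        -- Revenue per order; AVG over joined rows would average per line item
        SUM(oi.quantity * oi.unit_price)
            / NULLIF(COUNT(DISTINCT o.order_id), 0) AS avg_order_value,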
        MAX(o.order_date) AS last_order_date
    FROM customers c
    LEFT JOIN orders o ON c.customer_id = o.customer_id
    LEFT JOIN order_items oi ON o.order_id = oi.order_id
    GROUP BY c.customer_id, c.customer_name, c.signup_date
),

segment_calculation AS (
    -- Calculate RFM metrics and segments
    SELECT
        customer_id,
        customer_name,
        total_orders,
        total_revenue,
        avg_order_value,
        -- Recency: days since last order
        DATEDIFF(CURRENT_DATE, last