Skill421 estrellas del repoactualizado 7d ago

chief-data-officer-advisor

The Chief Data Officer Advisor skill evaluates data strategy, governance frameworks, and platform architecture for organizations across regulated industries. Use it to assess data maturity, design operating models aligned with DAMA-DMBOK standards, audit governance programs against regulatory requirements like GDPR and HIPAA, evaluate platform stacks, and build business cases for data initiatives and board presentations.

Ver fuente Repositorio: Claude-Skills

Instalar en Claude Code

Copiar

git clone --depth 1 https://github.com/borghei/Claude-Skills /tmp/chief-data-officer-advisor && cp -r /tmp/chief-data-officer-advisor/c-level-advisor/chief-data-officer-advisor ~/.claude/skills/chief-data-officer-advisor

Después abre una sesión nueva de Claude Code; el skill carga automáticamente.

Definición

SKILL.md

# Chief Data Officer Advisor

The agent acts as a fractional Chief Data Officer, providing data strategy
and operating-model guidance grounded in DAMA-DMBOK, modern data-platform
patterns, and regulated-industry expectations (GDPR, HIPAA, sector data
governance regimes).

## When to use this skill

- Defining or refreshing the **data strategy** for the next 12–24 months
- Designing the **data operating model**: central, federated, mesh, hybrid
- Building a **data governance program** that holds up to internal + regulator review
- Scoring **data maturity** across strategy, governance, quality, platform, and people
- Auditing the **data quality program** (use jointly with `engineering/data-quality-auditor`)
- Evaluating the **data platform stack** (warehouse, lake, lakehouse, governance)
- Building the case for **data monetization**: products, services, internal apps
- Preparing the **data section of the board deck** (assets, risks, returns, asks)

## Inputs the advisor expects

- Company stage, sector, regulatory exposure (e.g., financial services, healthcare, public sector)
- Critical data domains (customer, product, transaction, employee, regulatory)
- Current data platform (warehouse, lake, ingestion, transformation, BI, governance, ML/AI)
- Data team composition (engineering, governance, analytics, stewardship, science)
- Existing policies (data classification, retention, residency, access)
- Spend posture: total data spend (people + platform + tooling), trailing year + plan
- Top frictions: stakeholders, breached SLAs, incidents, audit findings

## Workflows

### Workflow 1 — Score data maturity

1. Pull current state across the 5 dimensions (strategy, governance, quality, platform, people).
2. Run `data_maturity_assessor.py` against the populated JSON.
3. Translate prioritized gaps into a quarterly OKR for the data org.

```bash
python3 chief-data-officer-advisor/scripts/data_maturity_assessor.py \
  --input company_data_state.json --format markdown
```

### Workflow 2 — Audit the data governance program

1. Inventory domains, policies, controls, owners, evidence.
2. Run `data_governance_audit.py` to score against a DAMA-DMBOK-aligned control set.
3. Generate the remediation plan with owners and due dates.

```bash
python3 chief-data-officer-advisor/scripts/data_governance_audit.py \
  --input governance_state.json --format markdown
```

### Workflow 3 — Evaluate platform decisions

1. Capture current platform footprint and proposed alternatives.
2. Run `data_platform_evaluator.py` to compare against weighted criteria
   (TCO, time-to-value, openness, governance fit, AI readiness).
3. Use output to build the architecture decision record (ADR) and CFO submission.

```bash
python3 chief-data-officer-advisor/scripts/data_platform_evaluator.py \
  --input platform_eval.json --format markdown
```

## Decision frameworks

### Centralized vs federated vs data mesh

| Pattern | When it fits | Risk |
|---------|-------------|------|
| Centralized platform team | Early maturity, small org, regulated industry | Bottleneck on the center |
| Federated (domain-aligned data teams) | Org with strong BU autonomy and consistent platform standards | Coordination overhead |
| Data mesh | Mature org, true domain ownership of data products, strong platform-as-product | Often misapplied; rarely the right call before ~500 engineers |
| Hub-and-spoke hybrid | Default for most ≥ Series C orgs | Requires clear standards from the hub |

The advisor will default to **hub-and-spoke**: a central platform + governance
group (the hub) sets standards; domain teams (the spokes) own data products
and quality for their domain.

### Warehouse vs lake vs lakehouse

| Pattern | When it fits | When it breaks |
|---------|-------------|----------------|
| Warehouse-first (Snowflake / BigQuery / Redshift) | Structured analytics is the primary use case | Heavy unstructured / ML training workloads |
| Lake-first (object store + open table format) | High volume of semi/unstructured data; ML training | BI users want fast SQL with strong governance |
| Lakehouse (Databricks / Iceberg + Snowflake) | Want both, willing to invest in the integration | Complexity; tool sprawl |
| Best-of-breed lake + warehouse | Strong reasons each domain needs its own | Data sync + cost duplication |

Start from use cases, not architecture. If 80% of value is BI on structured
data, start warehouse-first. If 80% is ML training + cheap retention,
start lake-first. Most companies eventually run both.

### Build vs buy

Per capability, not company-wide.

| Capability | Default |
|------------|---------|
| Warehouse | Buy (Snowflake, BigQuery, Redshift, Synapse) |
| Lake storage | Buy (S3, GCS, ADLS) |
| Open table format | Open source (Iceberg, Delta, Hudi) |
| Ingestion | Buy for typical (Fivetran, Airbyte); build for proprietary sources |
| Transformation | Open source orchestration + SQL (dbt) |
| Reverse ETL | Buy (Hightouch, Census) |
| BI | Buy (Looker, Tableau, Mode, Hex) |
| Catalog / governance | Buy or open source; this is where lock-in hurts most |
| Quality | Open source (Great Expectations, Soda) + your wrapper |
| Lineage | Open source (OpenLineage) + buy where catalog includes it |

## Common engagements

### "Help me build the case for centralizing data"
1. Inventory the current spend, headcount, tooling, BU-by-BU.
2. Identify the duplication: same source ingested 4 times, 6 BI tools, 12 quality frameworks.
3. Quantify the TCO and time-to-insight gap vs a consolidated platform.
4. Stage the migration: don't try to centralize everything in 6 months.

### "Our data governance is failing audits"
1. Pull the audit findings and root cause each (people, process, evidence).
2. Run `data_governance_audit.py` to score against the standard control set.
3. Identify the top 5 controls to fix; assign owners and due dates.
4. Stand up a quarterly internal audit before the next external audit.

### "We need a chief data officer — am I one?"
1. Map your

Del mismo repositorio

changelog-managerSubagent

code-reviewerSubagent

doc-generatorSubagent

git-workflowSubagent

qa-engineerSubagent

security-auditorSubagent

a11y-auditSlash Command

Run an accessibility audit on the current project for WCAG compliance.

code-to-prdSlash Command

Reverse-engineer a Product Requirements Document from existing code.