sap-hana-ml

This Claude Code skill documents the SAP HANA ML Python client library (hana-ml), which runs machine learning workflows directly inside SAP HANA databases. Use it to build and deploy classification, regression, clustering, and time series models with PAL and APL algorithms, using lazily evaluated DataFrames and in-database processing for datasets too large for local memory.

Install in Claude Code
git clone --depth 1 https://github.com/secondsky/sap-skills /tmp/sap-hana-ml && cp -r /tmp/sap-hana-ml/plugins/sap-hana-ml/skills/sap-hana-ml ~/.claude/skills/sap-hana-ml
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# SAP HANA ML Python Client (hana-ml)

## Related Skills

- **dependency-upgrade**: Use for secure dependency pinning and upgrade workflows in Python/auxiliary tooling used alongside HANA ML stacks

**Package Version**: 2.22.241011  
**Last Verified**: 2025-11-27

## Table of Contents

- [Installation & Setup](#installation--setup)
- [Quick Start](#quick-start)
- [Core Libraries](#core-libraries)
- [Common Patterns](#common-patterns)
- [Best Practices](#best-practices)
- [Bundled Resources](#bundled-resources)
- [Error Handling](#error-handling)
- [Documentation](#documentation)

---

## Installation & Setup

```bash
pip install hana-ml
```

**Requirements**: Python 3.8+, SAP HANA 2.0 SPS03+ or SAP HANA Cloud

---

## Quick Start

### Connection & DataFrame

```python
from hana_ml import ConnectionContext

# Connect
conn = ConnectionContext(
    address='<hostname>',
    port=443,
    user='<username>',
    password='<password>',
    encrypt=True
)

# Create DataFrame
df = conn.table('MY_TABLE', schema='MY_SCHEMA')
print(f"Shape: {df.shape}")
df.head(10).collect()
```
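The connection example above puts the password directly in source. A minimal sketch of reading the same settings from the environment instead; the variable names `HANA_ADDRESS`, `HANA_PORT`, `HANA_USER`, and `HANA_PASSWORD` and both helper functions are hypothetical, not something hana-ml defines:

```python
import os


def connection_kwargs(env=os.environ):
    """Build ConnectionContext keyword arguments from environment variables.

    The variable names are hypothetical; use whatever your deployment provides.
    """
    required = ("HANA_ADDRESS", "HANA_USER", "HANA_PASSWORD")
    missing = [name for name in required if name not in env]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {
        "address": env["HANA_ADDRESS"],
        "port": int(env.get("HANA_PORT", "443")),  # 443 matches the example above
        "user": env["HANA_USER"],
        "password": env["HANA_PASSWORD"],
        "encrypt": True,
    }


def connect(env=os.environ):
    """Open a connection; hana_ml is imported lazily so the helper above has no dependency on it."""
    from hana_ml import ConnectionContext

    return ConnectionContext(**connection_kwargs(env))
```

With the variables set, `conn = connect()` would replace the literal arguments shown above.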

### PAL Classification

```python
from hana_ml.algorithms.pal.unified_classification import UnifiedClassification

# Train model
clf = UnifiedClassification(func='RandomDecisionTree')
clf.fit(train_df, features=['F1', 'F2', 'F3'], label='TARGET')

# Predict & evaluate (pass key='<id_column>' if the DataFrames have no index set)
predictions = clf.predict(test_df, features=['F1', 'F2', 'F3'])
score = clf.score(test_df, features=['F1', 'F2', 'F3'], label='TARGET')
```

### APL AutoML

```python
from hana_ml.algorithms.apl.classification import AutoClassifier

# Automated classification
auto_clf = AutoClassifier()
auto_clf.fit(train_df, label='TARGET')
predictions = auto_clf.predict(test_df)
```

### Model Persistence

```python
from hana_ml.model_storage import ModelStorage

ms = ModelStorage(conn)
clf.name = 'MY_CLASSIFIER'
ms.save_model(model=clf, if_exists='replace')
```
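A stored model can be loaded again in a later session with `ModelStorage.load_model`. The sketch below wraps that call; `predict_with_stored_model` is a hypothetical helper, and it assumes the model was saved under the name passed in:

```python
def predict_with_stored_model(storage, name, data, version=None, **predict_kwargs):
    """Load a model from a ModelStorage instance and run predict on `data`.

    `storage` is expected to be ModelStorage(conn). Passing version=None
    leaves the choice of version to load_model's default.
    """
    model = storage.load_model(name=name, version=version)
    return model.predict(data, **predict_kwargs)
```

For example, `predict_with_stored_model(ms, 'MY_CLASSIFIER', test_df, features=['F1', 'F2', 'F3'])` would reuse the classifier saved above.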

---

## Core Libraries

### PAL (Predictive Analysis Library)
- **100+ algorithms** executed in-database
- Categories: Classification, Regression, Clustering, Time Series, Preprocessing
- **Key classes**: `UnifiedClassification`, `UnifiedRegression`, `KMeans`, `ARIMA`
- See: `references/PAL_ALGORITHMS.md` for complete list

### APL (Automated Predictive Library)
- **AutoML capabilities** with automatic feature engineering
- **Key classes**: `AutoClassifier`, `AutoRegressor`, `GradientBoostingClassifier`
- See: `references/APL_ALGORITHMS.md` for details

### DataFrames
- **Lazy evaluation** - operations only build SQL; nothing runs until `collect()` is called
- **In-database processing** for optimal performance
- See: `references/DATAFRAME_REFERENCE.md` for complete API
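
To make the lazy-evaluation idea concrete, here is a toy model in plain Python. It is not hana-ml code: the `LazyQuery` class and `fake_runner` are hypothetical stand-ins that only show how each operation can wrap the previous SQL text while execution is deferred to `collect()`.

```python
class LazyQuery:
    """Toy stand-in for a lazily evaluated DataFrame (not hana-ml's implementation)."""

    def __init__(self, sql, runner):
        self.sql = sql          # SQL text accumulated so far
        self._runner = runner   # callable that executes SQL and returns rows

    def filter(self, condition):
        # Wrap the current statement in a subquery; nothing is executed here.
        return LazyQuery(f"SELECT * FROM ({self.sql}) AS T WHERE {condition}", self._runner)

    def head(self, n):
        # Also lazy: only the SQL text changes.
        return LazyQuery(f"SELECT TOP {n} * FROM ({self.sql}) AS T", self._runner)

    def collect(self):
        # The only method that talks to the database.
        return self._runner(self.sql)


executed = []  # records every statement that actually ran


def fake_runner(sql):
    executed.append(sql)
    return []


query = LazyQuery('SELECT * FROM "MY_SCHEMA"."MY_TABLE"', fake_runner)
query = query.filter('"F1" > 0').head(10)   # still no statement executed
rows = query.collect()                       # exactly one statement runs here
```

In hana-ml itself, the SQL a DataFrame has built up can be inspected through its `select_statement` attribute before calling `collect()`.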

### Visualizers
- **EDA plots**, model explanations, metrics
- **SHAP integration** for model interpretability
- See: `references/VISUALIZERS.md` for 14 visualization modules

---

## Common Patterns

### Train-Test Split
```python
from hana_ml.algorithms.pal.partition import train_test_val_split

train, test, val = train_test_val_split(
    data=df,
    training_percentage=0.7,
    testing_percentage=0.2,
    validation_percentage=0.1
)
```
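
A small guard can catch a mistyped split before a query is sent to the database. This sketch assumes the three fractions are meant to sum to 1; `split_kwargs` is a hypothetical helper, and only the keyword names come from the call above.

```python
import math


def split_kwargs(training=0.7, testing=0.2, validation=0.1):
    """Validate split fractions and return them as keyword arguments."""
    fractions = {
        "training_percentage": training,
        "testing_percentage": testing,
        "validation_percentage": validation,
    }
    for name, value in fractions.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    total = sum(fractions.values())
    # isclose tolerates floating-point noise such as 0.7 + 0.2 + 0.1 != 1.0 exactly
    if not math.isclose(total, 1.0):
        raise ValueError(f"split fractions must sum to 1, got {total}")
    return fractions
```

The result can be unpacked into the call shown above, for example `train_test_val_split(data=df, **split_kwargs(0.6, 0.2, 0.2))`.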

### Feature Importance
```python
# APL models
importance = auto_clf.get_feature_importances()

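# PAL models: FeatureSelection requires a selection method (fs_method);
# verify the signature against your installed hana-ml version
from hana_ml.algorithms.pal.preprocessing import FeatureSelection
fs = FeatureSelection(fs_method='fisher-score', top_k_best=3)
selected_df = fs.fit_transform(train_df, label='TARGET')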
```

### Pipeline
```python
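from hana_ml.algorithms.pal.pipeline import Pipeline
from hana_ml.algorithms.pal.preprocessing import Imputer, FeatureNormalizer
from hana_ml.algorithms.pal.unified_classification import UnifiedClassification

pipeline = Pipeline([
    # 'most_frequent-mean': mode for categorical columns, mean for numeric ones
    ('imputer', Imputer(strategy='most_frequent-mean')),
    # min-max scaling needs explicit target bounds
    ('normalizer', FeatureNormalizer(method='min-max', new_max=1.0, new_min=0.0)),
    ('classifier', UnifiedClassification(func='RandomDecisionTree'))
])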
```

---

## Best Practices

1. **Use lazy evaluation** - Operations build SQL without execution until `collect()`
2. **Leverage in-database processing** - Keep data in HANA for performance
3. **Use Unified interfaces** - Consistent APIs across algorithms
4. **Save models** - Use `ModelStorage` for persistence
5. **Explain predictions** - Use SHAP explainers for interpretability
6. **Monitor AutoML** - Use `PipelineProgressStatusMonitor` for long-running jobs

---

## Bundled Resources

### Reference Files
- **`references/DATAFRAME_REFERENCE.md`** (479 lines)
  - ConnectionContext API, DataFrame operations, SQL generation
  
- **`references/PAL_ALGORITHMS.md`** (869 lines)
  - Complete PAL algorithm reference (100+ algorithms)
  - Classification, Regression, Clustering, Time Series, Preprocessing
  
- **`references/APL_ALGORITHMS.md`** (534 lines)
  - AutoML capabilities, automated feature engineering
  - AutoClassifier, AutoRegressor, GradientBoosting classes
  
- **`references/VISUALIZERS.md`** (704 lines)
  - 14 visualization modules (EDA, SHAP, metrics, time series)
  - Plot types, configuration, export options
  
- **`references/SUPPORTING_MODULES.md`** (626 lines)
  - Model storage, spatial analytics, graph algorithms
  - Text mining, statistics, error handling

---

## Error Handling

```python
from hana_ml.ml_exceptions import Error

try:
    clf.fit(train_df, features=features, label='TARGET')
except Error as e:
    print(f"HANA ML Error: {e}")
```
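
Failures can also surface from the database driver rather than from hana-ml itself. The sketch below collects whichever exception classes are importable and wraps `fit` with them; `hana_error_types` and `try_fit` are hypothetical helpers, and treating `hdbcli.dbapi.Error` as the driver's base exception is an assumption to check against your installation.

```python
def hana_error_types():
    """Return the HANA-related exception classes importable in this environment."""
    types = []
    try:
        from hana_ml.ml_exceptions import Error as HanaMLError
        types.append(HanaMLError)
    except ImportError:
        pass
    try:
        from hdbcli import dbapi  # SAP HANA database driver
        types.append(dbapi.Error)
    except ImportError:
        pass
    return tuple(types)


def try_fit(model, *args, error_types=None, **kwargs):
    """Call model.fit; return (True, None) or (False, message) for a known error."""
    if error_types is None:
        error_types = hana_error_types()
    try:
        model.fit(*args, **kwargs)
    except error_types as exc:  # an empty tuple catches nothing
        return False, f"{type(exc).__name__}: {exc}"
    return True, None
```

A call such as `ok, message = try_fit(clf, train_df, features=features, label='TARGET')` keeps the calling code free of nested `try` blocks while still letting unrelated exceptions propagate.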

---

## Documentation

- **Official Docs**: [https://help.sap.com/doc/1d0ebfe5e8dd44d09606814d83308d4b/2.0.07/en-US/hana_ml.html](https://help.sap.com/doc/1d0ebfe5e8dd44d09606814d83308d4b/2.0.07/en-US/hana_ml.html)
- **PyPI Package**: [https://pypi.org/project/hana-ml/](https://pypi.org/project/hana-ml/)