observing-agentforce
Analyze production Agentforce agent behavior using session traces and Data Cloud. TRIGGER when: user queries STDM session data or Data Cloud trace records; investigates production agent failures, regressions, or performance issues; asks about session traces, conversation logs, or agent metrics; wants to reproduce a reported production issue in preview; runs findSessions or trace analysis queries. DO NOT TRIGGER when: user creates, modifies, or debugs .agent files during development (use developing-agentforce); writes or runs test specs (use testing-agentforce); uses sf agent preview for local development iteration; deploys or publishes agents.
git clone --depth 1 https://github.com/SalesforceAIResearch/agentforce-adlc /tmp/observing-agentforce && cp -r /tmp/observing-agentforce/skills/observing-agentforce ~/.claude/skills/observing-agentforce
# Agentforce Observability
Improve Agentforce agents using session trace data and live preview testing.
**Three-phase workflow:**
- **Observe** -- Query STDM sessions from Data Cloud (if available), OR run test suites + preview with local traces as fallback
- **Reproduce** -- Use `sf agent preview` to simulate problematic conversations live
- **Improve** -- Edit the `.agent` file directly, validate, publish, verify
---
## Platform Notes
- Shell examples below use bash syntax. On Windows, use PowerShell equivalents or Git Bash.
- Replace `python3` with `python` on Windows.
- Replace `/tmp/` with `$env:TEMP\` (PowerShell) or `%TEMP%\` (cmd).
- Replace `jq` with `python -c "import json,sys; ..."` if jq is not installed.
---
## Routing
Gather these inputs before starting:
- **Org alias** (required)
- **Agent API name** (required for preview and deploy; ask if not provided)
- **Agent file path** (optional) -- path to the `.agent` file, typically `force-app/main/default/aiAuthoringBundles/<AgentName>/<AgentName>.agent`. Auto-detect if not provided.
- **Session IDs** (optional) -- analyze specific sessions; if absent, query the look-back window below (last 7 days by default)
- **Days to look back** (optional, default 7)
Determine intent from user input:
- **No specific action** -> run all three phases: Observe -> surface issues -> ask if user wants to Reproduce and/or Improve
- **"analyze" / "sessions" / "what's wrong"** -> Phase 1 only, then suggest next steps
- **"reproduce" / "test" / "preview"** -> Phase 2 (run Phase 1 first if no issues in hand)
- **"fix" / "improve" / "update"** -> Phase 3 (run Phase 1 first if no issues in hand)
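The routing rules above can be sketched as a small shell helper. This is only an illustration: `route_intent` is a hypothetical name, matching is plain substring matching on the lowercased request, and letting the first matching rule win is an assumption the list does not state.

```bash
# Hypothetical helper: map a free-text request to a phase label.
# Prints one of: observe, reproduce, improve, all.
route_intent() {
  local text
  # Lowercase with tr so this also works on older bash versions.
  text=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$text" in
    *analyze*|*sessions*|*"what's wrong"*) echo "observe" ;;    # Phase 1 only
    *reproduce*|*test*|*preview*)          echo "reproduce" ;;  # Phase 2
    *fix*|*improve*|*update*)              echo "improve" ;;    # Phase 3
    *)                                     echo "all" ;;        # no specific action
  esac
}
```

For example, `route_intent "Fix the greeting"` prints `improve`, and an empty request prints `all`. Substring matching is crude (a word such as "prefix" would also match `fix`), so treat the result as a hint to confirm with the user.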
### Resolve agent name
Before any STDM query, resolve the user-provided agent name against the org to get the exact `MasterLabel` and `DeveloperName`:
```bash
sf data query --json \
  --query "SELECT Id, MasterLabel, DeveloperName FROM GenAiPlannerDefinition WHERE MasterLabel LIKE '%<user-provided-name>%' OR DeveloperName LIKE '%<user-provided-name>%'" \
  -o <org>
```
- `MasterLabel` = display name used by STDM `findSessions` and Agent Builder UI (e.g. "Order Service")
- `DeveloperName` = API name with version suffix used in metadata (e.g. "OrderService_v9")
- The `--api-name` flag for `sf agent preview/activate/publish` uses `DeveloperName` **without** the `_vN` suffix (e.g. "OrderService")
Store these values:
- `AGENT_MASTER_LABEL` -- for `findSessions()` agent filter
- `AGENT_API_NAME` -- `DeveloperName` without `_vN` suffix, for `sf agent` CLI commands
- `PLANNER_ID` -- the Salesforce record ID for this agent
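Deriving `AGENT_API_NAME` from `DeveloperName` amounts to dropping a trailing `_vN`. A minimal sketch follows; `strip_version_suffix` is a hypothetical helper name, and it assumes the version suffix always has the form `_v` followed by digits at the very end of the name.

```bash
# Hypothetical helper: remove a trailing _v<digits> version suffix, if present.
strip_version_suffix() {
  printf '%s\n' "$1" | sed -E 's/_v[0-9]+$//'
}

# Usage (DEVELOPER_NAME is assumed to hold the DeveloperName from the query above):
#   AGENT_API_NAME=$(strip_version_suffix "$DEVELOPER_NAME")
```

Only the final suffix is removed, so a name that merely contains `_v2` in the middle is left intact.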
### Locate the .agent file
**Step 1 -- Search locally:**
```bash
find <project-root>/force-app/main/default/aiAuthoringBundles -name "*.agent" 2>/dev/null
```
If the user provided an agent file path, use that directly. Otherwise, search for files matching `AGENT_API_NAME`.
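Narrowing the search to a single agent could look like the sketch below. `find_agent_file` is a hypothetical helper, and it assumes the file is named `<AGENT_API_NAME>.agent`, following the typical path shown under Routing.

```bash
# Hypothetical helper: print the first .agent file matching the agent API name.
# $1 = project root, $2 = AGENT_API_NAME. Prints nothing if no file is found.
find_agent_file() {
  local root="$1" api_name="$2"
  find "$root/force-app/main/default/aiAuthoringBundles" \
    -type f -name "${api_name}.agent" 2>/dev/null | head -n 1
}
```

An empty result is the signal to move on to retrieving the bundle from the org.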
**Step 2 -- If not found locally, retrieve from the org:**
```bash
sf project retrieve start --json --metadata "AiAuthoringBundle:<AGENT_API_NAME>" -o <org>
```
> **Known bug:** `sf project retrieve start` creates a double-nested path: `force-app/main/default/main/default/aiAuthoringBundles/...`. Fix it immediately after retrieve:
```bash
if [ -d "force-app/main/default/main/default/aiAuthoringBundles" ]; then
  mkdir -p force-app/main/default/aiAuthoringBundles
  cp -r force-app/main/default/main/default/aiAuthoringBundles/* \
    force-app/main/default/aiAuthoringBundles/
  rm -rf force-app/main/default/main
fi
```
**Step 3 -- Validate the retrieved file:**
Read the `.agent` file and verify it has proper Agent Script structure:
- `system:` block with `instructions:`
- `config:` block with `developer_name:`
- `start_agent` or `subagent` blocks with `reasoning: instructions:`
- Each subagent should have distinct `instructions:` content (not identical across subagents)
Store the resolved path as `AGENT_FILE` for Phase 3.
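A quick textual pre-check of the structure listed in Step 3 can be sketched with `grep`. This is a shallow check under stated assumptions: `check_agent_structure` is a hypothetical helper, it assumes each key starts its own line (optionally indented), and it does not verify nesting or that subagents have distinct instructions. Reading the file is still needed.

```bash
# Hypothetical helper: report which expected Agent Script keys are absent.
# Returns 0 if every key was found, 1 otherwise. Messages go to stderr.
check_agent_structure() {
  local file="$1" status=0
  [ -f "$file" ] || { echo "missing file: $file" >&2; return 1; }
  grep -Eq '^[[:space:]]*system:' "$file" \
    || { echo "no system: block" >&2; status=1; }
  grep -Eq '^[[:space:]]*instructions:' "$file" \
    || { echo "no instructions: key" >&2; status=1; }
  grep -Eq '^[[:space:]]*config:' "$file" \
    || { echo "no config: block" >&2; status=1; }
  grep -Eq '^[[:space:]]*developer_name:' "$file" \
    || { echo "no developer_name: key" >&2; status=1; }
  grep -Eq '^[[:space:]]*(start_agent|subagent)' "$file" \
    || { echo "no start_agent or subagent block" >&2; status=1; }
  return "$status"
}
```

A non-zero return is a prompt to inspect the retrieved file more closely before storing it as `AGENT_FILE`.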
---
## Phase 0: Discover Data Space
Before running any STDM query, determine the correct Data Cloud Data Space API name.
```bash
sf api request rest "/services/data/v63.0/ssot/data-spaces" -o <org>
```
Note: `sf api request rest` is a beta command -- do not add `--json` (that flag is unsupported and causes an error).
The response shape is:
```json
{
  "dataSpaces": [
    {
      "id": "0vhKh000000g3DjIAI",
      "label": "default",
      "name": "default",
      "status": "Active",
      "description": "Your org's default data space."
    }
  ],
  "totalSize": 1
}
```
The `name` field is the API name to pass to `AgentforceOptimizeService`.
**Decision logic:**
- If the command fails (e.g. 404 or permission error), fall back to `'default'` and note it as an assumption.
- Filter to only `status: "Active"` entries.
- If exactly one active Data Space exists, use it automatically and confirm to the user: "Using Data Space: `<name>`".
- If multiple active Data Spaces exist, show the list (label + name) and ask the user which to use.
Store the selected `name` value as `DATA_SPACE` for all subsequent steps.
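The decision logic above can be sketched as a function that reads the response JSON on stdin. Several things here are assumptions for illustration: `select_data_space` is a hypothetical name, JSON is parsed with an inline `python3` call (per the Platform Notes) rather than `jq`, and the case of zero active Data Spaces, which the rules above do not cover, is treated like a failed request and falls back to `default`.

```bash
# Hypothetical helper: choose a Data Space from the data-spaces response on stdin.
# Exit status 0: prints the single name to use (or the 'default' fallback).
# Exit status 2: prints all active names, one per line; the caller must ask the user.
select_data_space() {
  local names count
  names=$(python3 -c '
import json, sys
try:
    spaces = json.load(sys.stdin).get("dataSpaces", [])
except (ValueError, AttributeError):
    spaces = []  # failed request or unexpected body
for space in spaces:
    if space.get("status") == "Active" and space.get("name"):
        print(space["name"])
')
  # Count non-empty lines; "|| true" keeps a zero count from aborting under set -e.
  count=$(printf '%s\n' "$names" | grep -c . || true)
  if [ "$count" -eq 0 ]; then
    echo "note: no active Data Space found, assuming 'default'" >&2
    echo "default"
  elif [ "$count" -eq 1 ]; then
    echo "$names"
  else
    echo "$names"
    return 2
  fi
}
```

A possible call is `DATA_SPACE=$(sf api request rest "/services/data/v63.0/ssot/data-spaces" -o <org> | select_data_space)`, checking for exit status 2 before using the value.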
### Prerequisite check: STDM DMOs
After deploying the helper class (step 1.0), run a quick probe to verify the STDM Data Model Objects exist in Data Cloud:
```bash
sf apex run -o <org> -f /dev/stdin << 'APEX'
ConnectApi.CdpQueryInput qi = new ConnectApi.CdpQueryInput();
qi.sql = 'SELECT ssot__Id__c FROM "ssot__AiAgentSession__dlm" LIMIT 1';
try {
    ConnectApi.CdpQueryOutputV2 out = ConnectApi.CdpQuery.queryAnsiSqlV2(qi, '<DATA_SPACE>');
    System.debug('STDM_CHECK:OK rows=' + (out.data != null ? out.data.size() : 0));
} catch (Exception e) {
    System.debug('STDM_CHECK:FAIL ' + e.getMessage());
}
APEX
```
**If `STDM_CHECK:FAIL`:** STDM is not activated. Inform the user and switch to **Phase 1-ALT**:
> STDM (Session Trace Data Model) is not available in this org. To enable it, go to Setup -> Data Cloud -> Data Streams and verify that "Agentforce Activity" is active. **Proceeding with fallback: test suites + local traces.**
**If `STDM_CHECK:OK`**, proceed to Phase 1 (STDM path).
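Branching on the probe result can be sketched as a filter over the `sf apex run` output. `classify_stdm_check` is a hypothetical helper, and it assumes debug statements appear in the log as `...|USER_DEBUG|[line]|DEBUG|<message>`. Matching only those lines matters because the log may also echo the anonymous Apex source, which contains both marker strings.

```bash
# Hypothetical helper: read the apex run output on stdin and print which path to take.
# Prints "stdm" when the probe logged STDM_CHECK:OK, otherwise "fallback".
classify_stdm_check() {
  local log
  log=$(cat)
  # Match the marker only inside USER_DEBUG log lines, not in echoed source code.
  if printf '%s\n' "$log" | grep -q 'USER_DEBUG.*|STDM_CHECK:OK'; then
    echo "stdm"
  else
    echo "fallback"
  fi
}
```

Anything other than an explicit `STDM_CHECK:OK` debug line, including a run that failed to execute, is treated as the fallback path in this sketch.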