Skip to main content
ClaudeWave
Skill146 repo starsupdated yesterday

verify-refs

Audit-only verification of manuscript references against PubMed and CrossRef. Detects fabricated or mismatched citations and writes qc/reference_audit.json. Does not modify references/ or refs.bib.

Install in Claude Code
Copy
git clone --depth 1 https://github.com/Aperivue/medsci-skills /tmp/verify-refs && cp -r /tmp/verify-refs/skills/verify-refs ~/.claude/skills/verify-refs
Then start a new Claude Code session; the skill loads automatically.

SKILL.md

# Verify References (Audit-Only)

You help a medical researcher prevent reference hallucinations before submission.
This skill audits an existing manuscript or bibliography. It **does not write**
to `references/` or `manuscript/_src/refs.bib`. It does not discover new
literature; use `/search-lit` for discovery and `/lit-sync` for bib management.

## When to Use

- Before journal submission, especially for `.docx` manuscripts inherited from
  coauthors or external editors.
- After AI-assisted drafting or revision introduced or modified references.
- When a reviewer or collaborator flags a possibly fabricated citation.
- Before `/sync-submission` freezes a journal package.

## Inputs

1. Manuscript or bibliography path: `.md`, `.docx`, `.bib`, `.txt`, or `.tsv`.
2. Optional project root. Default: current working directory.
3. Optional flags passed to the script:
   - `--offline`: extract and classify references without API verification.
   - `--timeout N`: HTTP timeout seconds.

## Companion: pandoc citation key check

For markdown manuscripts using pandoc `[@bibkey]` citations, validate citation
keys first to catch undefined/unused keys before this audit. If you also use the
companion `manage-refs` skill, run its `check_citation_keys.py` for this;
otherwise use your reference manager's citation-key check.

Then run `verify_refs.py` against the .bib to validate each entry against
PubMed/CrossRef. The two checks are complementary: a citation-key check catches
mis-keyed cites; `verify_refs.py` catches fabricated metadata.

## Deterministic Script

Run the bundled script rather than verifying citations by memory:

```bash
python "${CLAUDE_SKILL_DIR}/scripts/verify_refs.py" manuscript/manuscript.md --project-root .
```

For hooks or quick manual runs, use the wrapper:

```bash
"${CLAUDE_SKILL_DIR}/scripts/verify_cli.sh" manuscript/manuscript.md --offline
```

**Manual pre-submission strict run** (Phase 1A.5):

```bash
"${CLAUDE_SKILL_DIR}/scripts/verify_cli.sh" manuscript/index.qmd --strict
```

`--strict` forbids `--offline` and exits non-zero on any UNVERIFIED row.
Full checkpoint protocol: `references/manual_checkpoint_guide.md`.

The script uses DOI, PMID, CrossRef, and PubMed E-utilities where available. If
network verification fails, it records `UNVERIFIED` rather than silently passing.

## Output Contract (v1.3.0)

| Artifact | Path | Purpose |
|---|---|---|
| Audit JSON | `qc/reference_audit.json` | Sole output — row-level status (OK/MISMATCH/UNVERIFIED/FABRICATED), counts, `cited_authors[]`/`actual_authors[]`, `duplicate_findings[]`, submission-safe flag, full records |

**v1.2.0 (2026-05)** adds `duplicate_findings[]` to the audit JSON. Verbatim PMID or DOI duplicates within the reference list are flagged as MAJOR findings (resolves `/peer-review` Phase 2A P7). DOI normalization strips `https://doi.org/`, `http://dx.doi.org/`, `doi:` prefixes plus trailing slashes before comparison so `https://doi.org/10.x/abc/` and `10.x/abc` collapse to one key. Both `submission_safe` and `fully_verified` now require `duplicate_findings` to be empty.

**v1.3.0 (2026-05)** extends the author cross-check from first-author-only to the **full author list** and bumps `schema_version` to 4. For BibTeX inputs, every cited author family name is compared index-by-index against the authoritative source, and the cited-vs-source author counts are compared. PubMed `efetch.fcgi` (XML full record) is the truth source when a PMID is present — it is authoritative for given/family names where CrossRef is not (a documented case where CrossRef returned a wrong given name that PubMed efetch corrected). Records now carry `cited_authors[]`, `actual_authors[]`, `cited_author_count`, and `actual_author_count`. Motivation: a real AI-assisted manuscript registered a reference with a correct first author but seven of ten fabricated co-author names, and the first-author-only check passed it. Plain-text / TSV inputs, which cannot be parsed into a confident full list, degrade gracefully to the first-author check.

**Removed in Phase 1A.2** (per `docs/artifact_contract.md`):
- `references/verified_references.tsv` — record-level details now live inside `reference_audit.json` under `records[]`.
- `references/library.bib` — never this skill's concern. `/search-lit` produces candidates; `/lit-sync` (via Better BibTeX) writes `manuscript/_src/refs.bib`.

Sole-writer enforcement: `scripts/validate_project_contract.py` will flag any `references/*` file written by this skill as drift.

## Workflow

1. Identify the input file and project root.
2. Run `scripts/verify_refs.py`.
3. Read `qc/reference_audit.json`.
4. Report all `FABRICATED` and `MISMATCH` rows first (from `records[]`).
5. Report all `duplicate_findings[]` entries (verbatim PMID/DOI duplicates — cite renumbering required).
6. If `UNVERIFIED` rows remain, list them as manual checks and do not call the
   manuscript fully submission-safe. Rows with `note = "pagination_placeholder"`
   (`e000–e000` / `in press` / `TBD` / `forthcoming`) need the citation resolved
   before submission; `/self-review` Phase 2.5c decides whether any is a P0 blocker.
7. If the user needs a human-readable table, summarize from `records[]` in chat — do not write a TSV.

## Quality Gates

- Gate 1: stop submission if any row is `FABRICATED`.
- Gate 2: require user confirmation before accepting `UNVERIFIED` references.
- Gate 3: rerun after any reference edits.
- Gate 4 (added 2026-04-26; extended to full-author in v1.3.0): the cited
  author list is cross-checked against the authoritative source (PubMed efetch
  preferred, then CrossRef, then PubMed esummary). A row whose DOI/PMID resolves
  but whose cited authors do not match — at any index, or in total count — is
  downgraded to `MISMATCH`. First-author mismatches get
  `note = "first-author hallucination suspected"`; #2..#N family or count
  mismatches get `note = "non-first-author hallucination or count mismatch"`.
  This catches the LLM failure m
skillsSkill
academic-aioSkill

Medical AI paper optimization for AI search engines (Perplexity, ChatGPT web, Elicit, Consensus, SciSpace) and RAG-based literature tools. Applies when drafting or reviewing titles, abstracts, structured summary boxes (Key Points / Research in Context / Plain-Language Summary), manuscripts for high-impact medical AI journals (Lancet Digital Health, Radiology, Radiology-AI, npj Digital Medicine, Nature Medicine), preprints (medRxiv/arXiv), GitHub README + CITATION.cff + Zenodo archives, and Hugging Face model/dataset cards. Integrates TRIPOD+AI, CLAIM 2024, STARD-AI, TRIPOD-LLM, DECIDE-AI reporting requirements with generative engine optimization (GEO) principles. Produces a visible pass/fail checklist.

add-journalSkill

>

analyze-statsSkill

Statistical analysis for medical research papers. Generates reproducible Python/R code with publication-ready tables and figures. Supports diagnostic accuracy, inter-rater agreement, meta-analysis, survival analysis, survey data, group comparisons, regression, propensity score, and repeated measures.

author-strategySkill

PubMed author profile analysis. Author name → PubMed fetch → study type classification → visualization → strategy report.

batch-cohortSkill

Generate N analysis scripts from a single methodology template × multiple exposure/outcome combinations. The "80-person team" pattern — same validated method, swap variables only. Produces batch R/Python code + summary matrix.

calc-sample-sizeSkill

>

check-reportingSkill

Check manuscript compliance with medical research reporting guidelines. Supports 32 guidelines including STROBE, CONSORT, STARD, STARD-AI, TRIPOD, TRIPOD+AI, ARRIVE, PRISMA, PRISMA-DTA, PRISMA-P, CARE, SPIRIT, CLAIM, MI-CLEAR-LLM, SQUIRE 2.0, CLEAR, MOOSE, GRRAS, SWiM, AMSTAR 2, and risk of bias tools (QUADAS-2, QUADAS-C, RoB 2, ROBINS-I, ROBINS-E, ROBIS, ROB-ME, PROBAST, PROBAST+AI, NOS, COSMIN, RoB NMA). Generates item-by-item assessment with PRESENT/MISSING/PARTIAL status.