heavy-file-ingestion-claude-desktop
This Claude Code skill instructs Claude Desktop to avoid directly ingesting large files like PDFs, Word documents, and spreadsheets. Instead, it directs users to first convert files to lightweight formats such as markdown or CSV, either through provided Python commands or manual export, before analysis begins. Use this skill when users request file reading or analysis in Claude Desktop to optimize token usage and improve processing efficiency.
git clone --depth 1 https://github.com/NateBJones-Projects/OB1 /tmp/heavy-file-ingestion-claude-desktop && cp -r /tmp/heavy-file-ingestion-claude-desktop/skills/heavy-file-ingestion/variants/claude-desktop ~/.claude/skills/heavy-file-ingestion-claude-desktop
SKILL.md
# Heavy File Ingestion For Claude Desktop

## Problem

Claude Desktop does not have the same local shell workflow as coding agents, so it should avoid pretending it can efficiently process bulky files raw.

## Trigger Conditions

- The user asks Claude Desktop to read a PDF, PPTX, DOCX, XLSX, or another bulky attachment
- The file would cost too much context for too little value
- The user would be better served by a converted markdown or CSV artifact

## Process

1. Do not ingest the raw heavyweight file by default.
2. First ask for the cheapest workable artifact:
   - PDF or DOCX: markdown
   - PPTX: markdown slide outline
   - XLSX: CSV per sheet or a small sample plus sheet names
3. If the user has not converted it yet, offer exact commands they can run outside Claude Desktop.

### Suggested Conversion Commands

```bash
python convert_heavy_file.py /absolute/path/to/file.pdf
python convert_heavy_file.py /absolute/path/to/file.docx
python convert_heavy_file.py /absolute/path/to/file.pptx
python convert_heavy_file.py /absolute/path/to/file.xlsx
```

If the script is not available, say so and ask the user for:

- a markdown export
- a CSV export
- or a small representative excerpt

4. Once the user provides the converted artifact, create a quick index:
   - file type
   - sections, slides, or sheet names
   - row counts or page counts if available
   - any obvious extraction-quality problems
5. Only then analyze the content.

## Client Rules

- Be explicit about the tradeoff: converting first is cheaper and usually better.
- If the user insists on staying inside Claude Desktop, ask for a smaller excerpt rather than taking the whole file raw.
- Use raw ingestion only for genuinely small files where conversion would cost more effort than it saves.
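The "quick index" step in the process above can be made concrete with a small helper. What follows is a minimal sketch, not part of the skill itself: the function name `quick_index`, its return keys, and the two quality checks are all hypothetical choices made for illustration. It assumes the converted artifact is already available as text and is either a CSV file or markdown with ATX-style (`#`) headings.

```python
import csv
import io
import re


def quick_index(name: str, text: str) -> dict:
    """Build a small index for a converted artifact (hypothetical helper).

    `name` is only used to tell CSV from markdown by its extension.
    """
    if name.lower().endswith(".csv"):
        rows = list(csv.reader(io.StringIO(text)))
        header = rows[0] if rows else []
        return {
            "file_type": "csv",
            "columns": header,
            "row_count": max(len(rows) - 1, 0),  # data rows, header excluded
            "problems": [] if header else ["empty file"],
        }

    # Treat everything else as markdown and collect ATX headings as sections.
    # Note: a '#' line inside a fenced code block would also be counted.
    headings = re.findall(r"^#{1,6}[ \t]+(.+?)[ \t]*$", text, flags=re.MULTILINE)
    problems = []
    if not headings:
        problems.append("no headings found")
    if "\ufffd" in text:
        problems.append("replacement characters (possible encoding damage)")
    return {
        "file_type": "markdown",
        "sections": headings,
        "line_count": len(text.splitlines()),
        "problems": problems,
    }
```

Calling `quick_index("report.md", markdown_text)` returns the section titles and any flagged problems, which is enough to confirm the extraction looks sane before any analysis starts. A per-sheet CSV export would get one call per file.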
Use Nate Jones OB1 Agent Memory from OpenClaw with provenance, scope, review, and use-policy discipline.
Continuous learning system that extracts reusable knowledge from work sessions. Triggers: (1) /aiception command, (2) 'save this as a skill' or 'extract a skill from this', (3) 'what did we learn?', (4) after non-obvious debugging or trial-and-error discovery. Creates new skills when valuable reusable knowledge is identified. Integrates with Open Brain to prevent duplicates.
Morning digest of yesterday's Open Brain thoughts, drafted to Gmail
Generate infographic images from any research doc, Open Brain thoughts, or analysis. Auto-chunks content, writes prompts, generates images via Gemini API (free tier), and saves to media/. Use --premium for better text rendering.
Use when processing voice transcripts, brain dumps, stream-of-consciousness notes, or any raw multi-topic capture. Extracts every idea thread, then evaluates each one with deep brainstorming, then captures results to Open Brain. Trigger on transcripts, exports, "process this", "pan for gold", "brain dump", "what did I say", or multi-topic markdown files.