# company-product-context

This Claude Code skill systematically compiles comprehensive business intelligence by extracting data from company PDF documents, conducting targeted web research, and synthesizing industry knowledge into a structured context report. Use it when building detailed company profiles, conducting competitive analysis, preparing for business development conversations, or establishing baseline knowledge about organizations before client engagements or partnership evaluations.
git clone --depth 1 https://github.com/lofcz/LLMTornado /tmp/company-product-context && cp -r /tmp/company-product-context/src/LlmTornado.Tests/Static/Files/Skills/company-product-context ~/.claude/skills/company-product-context
## Company Product Context Compiler
This skill extracts information from company PDF documents, conducts web research, and synthesizes industry knowledge to create a comprehensive company product context report.
Copy this checklist and track your progress:
```
Company Product Context Progress:
- [ ] Step 1: Gather company materials and identify sources
- [ ] Step 2: Extract information from PDF documents
- [ ] Step 3: Structure extracted data
- [ ] Step 4: Conduct web research and validation
- [ ] Step 5: Synthesize industry knowledge
- [ ] Step 6: Compile comprehensive product context
- [ ] Step 7: Generate final report
- [ ] Step 8: Export deliverables
```
## **Step 1: Gather company materials and identify sources**
Collect all available company information:
**Required Inputs:**
- Company PDF documents (annual reports, product sheets, presentations, etc.)
- Company name and website URL
- Industry/sector information
- Specific products or services to focus on (if applicable)
**Actions:**
1. Request all relevant PDF files from the user
2. Confirm company name, website, and primary industry
3. Ask about specific focus areas or products of interest
4. Identify any competitive context needed
**Expected in INPUT_DIR:**
- `*.pdf` - Company documents
- `company_info.txt` - Basic company details (optional)
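
As a rough sketch of how these inputs might be checked before extraction, the snippet below lists the PDFs in the input directory and reads the optional `company_info.txt`. The helper name `collect_inputs` and the shape of the returned dictionary are illustrative assumptions, not part of the skill; the `/tmp` fallback mirrors the default used by the extraction script in Step 2.

```python
import os
from pathlib import Path


def collect_inputs(input_dir=None):
    """Hypothetical helper: inventory the Step 1 inputs found in INPUT_DIR."""
    # Assumption: INPUT_DIR is an environment variable, falling back to /tmp.
    root = Path(input_dir or os.environ.get("INPUT_DIR", "/tmp"))

    # Company documents: every *.pdf directly inside the directory.
    pdf_files = sorted(p for p in root.glob("*.pdf") if p.is_file())

    # Optional free-form company details; None when the file is absent.
    info_path = root / "company_info.txt"
    company_info = (
        info_path.read_text(encoding="utf-8") if info_path.is_file() else None
    )

    return {"pdf_files": pdf_files, "company_info": company_info}
```

An empty `pdf_files` list is a reasonable cue to go back to the user and request documents before moving on to Step 2.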
## **Step 2: Extract information from PDF documents**
Extract structured information from all provided PDF files.
**Use the Python script for PDF extraction:**
```python
import os
import re
from pathlib import Path
import PyPDF2
import json


def extract_pdf_content(pdf_path):
    """Extract text content from PDF file."""
    text_content = []
    metadata = {}
    try:
        with open(pdf_path, 'rb') as file:
            pdf_reader = PyPDF2.PdfReader(file)
            # Extract metadata
            if pdf_reader.metadata:
                metadata = {
                    'title': pdf_reader.metadata.get('/Title', ''),
                    'author': pdf_reader.metadata.get('/Author', ''),
                    'subject': pdf_reader.metadata.get('/Subject', ''),
                    'pages': len(pdf_reader.pages)
                }
            else:
                metadata = {'pages': len(pdf_reader.pages)}
            # Extract text from all pages
            for page_num, page in enumerate(pdf_reader.pages, 1):
                try:
                    text = page.extract_text()
                    if text.strip():
                        text_content.append({
                            'page': page_num,
                            'text': text
                        })
                except Exception as e:
                    print(f"Error extracting page {page_num}: {e}")
    except Exception as e:
        print(f"Error reading PDF {pdf_path}: {e}")
        return None
    return {
        'filename': os.path.basename(pdf_path),
        'metadata': metadata,
        'content': text_content
    }


def extract_key_sections(text):
    """Extract key sections from text based on common headers."""
    sections = {
        'company_overview': [],
        'products_services': [],
        'business_model': [],
        'market_position': [],
        'financials': [],
        'technology': [],
        'customers': [],
        'strategy': [],
        'other': []
    }
    # Keywords for section identification
    keywords = {
        'company_overview': ['about us', 'company overview', 'who we are', 'introduction', 'history'],
        'products_services': ['products', 'services', 'solutions', 'offerings', 'portfolio'],
        'business_model': ['business model', 'revenue model', 'how we work', 'operations'],
        'market_position': ['market', 'industry', 'competitive', 'position', 'landscape'],
        'financials': ['financial', 'revenue', 'earnings', 'profit', 'growth'],
        'technology': ['technology', 'platform', 'infrastructure', 'technical', 'innovation'],
        'customers': ['customers', 'clients', 'partners', 'case study', 'testimonial'],
        'strategy': ['strategy', 'vision', 'mission', 'goals', 'objectives', 'roadmap']
    }
    lines = text.split('\n')
    current_section = 'other'
    for line in lines:
        line_lower = line.lower().strip()
        # Check if line is a section header
        for section, section_keywords in keywords.items():
            if any(keyword in line_lower for keyword in section_keywords):
                if len(line_lower) < 100:  # Likely a header
                    current_section = section
                    break
        if line.strip():
            sections[current_section].append(line)
    return sections


def analyze_company_info(extracted_data):
    """Analyze extracted data for key company information."""
    analysis = {
        'company_name': '',
        'industry': '',
        'products': [],
        'key_terms': [],
        'metrics': [],
        'urls': [],
        'emails': []
    }
    all_text = ''
    for doc in extracted_data:
        # extract_pdf_content returns None for unreadable PDFs; skip those
        if not doc:
            continue
        for page in doc['content']:
            all_text += page['text'] + '\n'
    # Extract URLs
    url_pattern = r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
    analysis['urls'] = list(set(re.findall(url_pattern, all_text)))
    # Extract emails (TLD class is letters only; the original also matched a literal '|')
    email_pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
    analysis['emails'] = list(set(re.findall(email_pattern, all_text)))
    # Extract potential metrics (numbers with units/context).
    # The word boundary keeps short units such as k, M, B from matching
    # the first letter of an ordinary word (e.g. "5 books").
    metrics_pattern = r'\$?\d+(?:\.\d+)?\s*(?:(?:million|billion|trillion|k|M|B|percent|users|customers|employees)\b|%)'
    analysis['metrics'] = re.findall(metrics_pattern, all_text, re.IGNORECASE)
    return analysis


def main():
    input_dir = os.environ.get('INPUT_DIR', '/tmp')
    output_dir = '/tmp/extracted_data'
    os.makedirs(output_dir, exist_ok=True)
    # Find all PDF files
```
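
The script breaks off at the `# Find all PDF files` comment. One plausible way to finish that step, offered as an assumption rather than the original code, is to glob the PDFs in the input directory, pass each one to an extractor such as `extract_pdf_content`, skip files that fail (that function returns `None` on error), and write one JSON file per document. The helper name `run_extraction`, its `extract` parameter, and the output file naming are all hypothetical.

```python
import json
from pathlib import Path


def run_extraction(input_dir, output_dir, extract):
    """Hypothetical driver: run `extract` on every PDF and save results as JSON."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)

    saved = []
    for pdf_path in sorted(Path(input_dir).glob("*.pdf")):
        data = extract(str(pdf_path))
        if data is None:  # the extractor signals failure with None
            continue
        # Assumed naming scheme: report.pdf -> report.json
        target = out / f"{pdf_path.stem}.json"
        target.write_text(
            json.dumps(data, indent=2, ensure_ascii=False), encoding="utf-8"
        )
        saved.append(target)
    return saved
```

With the extraction functions in scope, a call such as `run_extraction(input_dir, output_dir, extract_pdf_content)` inside `main()` would cover this step. Passing the extractor as a parameter keeps the driver testable without real PDF files.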