
parallel-web

The parallel-web skill provides a unified web toolkit for searching, extracting, and enriching data, with an emphasis on academic and scientific sources. Use it to run web searches that prioritize peer-reviewed papers and scholarly databases, fetch content from specific URLs or PDFs, add web-sourced fields to multiple entities via bulk enrichment, or generate exhaustive multi-source research reports. Route requests to web search for single lookups, web extract for specific URLs, data enrichment for batch operations across multiple items, and deep research for comprehensive investigations.

Repository: scientific-agent-skills

Install in Claude Code


git clone --depth 1 https://github.com/K-Dense-AI/scientific-agent-skills /tmp/parallel-web && mkdir -p ~/.claude/skills && cp -r /tmp/parallel-web/skills/parallel-web ~/.claude/skills/parallel-web

Then start a new Claude Code session; the skill loads automatically.

Definition

SKILL.md

# Parallel Web Toolkit

A unified skill for all web-powered tasks: searching, extracting, enriching, and researching — with academic and scientific sources as the default priority.

## Routing — pick the right capability

Read the user's request and match it to one of the capabilities below. For web search, extract, enrichment, and deep research, read the corresponding reference file for detailed instructions.

| User wants to... | Capability | Where |
|---|---|---|
| Look something up, research a topic, find current info | **Web Search** | `references/web-search.md` |
| Fetch content from a specific URL (webpage, article, PDF) | **Web Extract** | `references/web-extract.md` |
| Add web-sourced fields to a list of companies/people/products | **Data Enrichment** | `references/data-enrichment.md` |
| Get an exhaustive, multi-source report (user says "deep research", "exhaustive", "comprehensive") | **Deep Research** | `references/deep-research.md` |
| Install or authenticate parallel-cli | **Setup** | Below |
| Check status of a running research/enrichment task | **Status** | Below |
| Retrieve completed research results by run ID | **Result** | Below |

### Decision guide

- **Default to Web Search** for a single lookup, research question, or "what is X?" query. It's fast and cost-effective. When the query touches a scientific or technical topic, include academic domains (see `references/web-search.md`) to surface peer-reviewed and preprint sources alongside general results.
- **Use Web Extract** when the user provides a URL or asks you to read/fetch a specific page. Prefer this over the built-in WebFetch tool. Particularly useful for extracting full text from academic PDFs, preprint servers, and journal articles.
- **Use Data Enrichment** when the user has **multiple entities** (a CSV, a list of companies/people/products, or even a short inline list) and wants to find or add the same kind of information for each one. The key signal is a repeated lookup across a set of items — e.g., "find the CEO for each of these companies" or "get the founding year for Apple, Stripe, and Anthropic." Even if the user doesn't say "enrich," use `parallel-cli enrich` whenever the task is the same query applied to multiple entities. Do NOT use Web Search in a loop for this — the enrichment pipeline handles batching, parallelism, and structured output automatically.
- **Use Deep Research only** when the user explicitly asks for deep, exhaustive, or comprehensive research. It is 10-100x slower and more expensive than Web Search — never default to it. Deep research is especially valuable for literature reviews and multi-paper synthesis.
- If `parallel-cli` is not found when running any command, follow the Setup section below.
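
The decision guide can be condensed into a small routing function. This is only a sketch and not part of the skill: `route_request` and its three arguments are hypothetical, and the precedence order (explicit deep-research request, then multiple entities, then a URL, then plain search) is an assumption about how to break ties when several signals are present.

```bash
# Hypothetical helper: map three signals from a user request to the
# reference file listed in the routing table.
#   $1  "yes" if the user explicitly asked for deep/exhaustive/comprehensive research
#   $2  "yes" if the user supplied a specific URL to read
#   $3  number of entities the same lookup applies to
route_request() {
  local wants_deep="$1" has_url="$2" entity_count="$3"

  if [ "$wants_deep" = "yes" ]; then
    echo "references/deep-research.md"     # only on explicit request
  elif [ "$entity_count" -gt 1 ]; then
    echo "references/data-enrichment.md"   # same lookup repeated across a set
  elif [ "$has_url" = "yes" ]; then
    echo "references/web-extract.md"       # a specific page or PDF
  else
    echo "references/web-search.md"        # the default
  fi
}
```

For example, `route_request no no 3` prints `references/data-enrichment.md`, matching the "founding year for Apple, Stripe, and Anthropic" case.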

### Academic source priority

Across all capabilities, prefer academic and scientific sources when the query is technical or scientific in nature. This means:
- Peer-reviewed journal articles and conference proceedings over blog posts or news articles
- Preprints (arXiv, bioRxiv, medRxiv) when peer-reviewed versions aren't available
- Institutional and government sources (NIH, WHO, NASA, NIST) over commercial sites
- Primary research over secondary summaries

When citing academic sources, include author names and publication year where available (e.g., [Smith et al., 2025](url)) in addition to the standard citation format. If a DOI is present, prefer the DOI link.
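
As an illustration of the citation rule, the sketch below formats a Markdown citation and swaps in a DOI link when one is available. `format_citation` is a hypothetical helper, and the example values are placeholders rather than real references; the only outside fact it relies on is that `https://doi.org/<DOI>` resolves a DOI.

```bash
# Hypothetical helper: format an academic citation as a Markdown link.
#   $1 author label (e.g. "Smith et al."), $2 year, $3 fallback URL, $4 DOI (optional)
format_citation() {
  local authors="$1" year="$2" url="$3" doi="${4:-}"

  # Prefer the DOI link when a DOI is present.
  if [ -n "$doi" ]; then
    url="https://doi.org/$doi"
  fi
  printf '[%s, %s](%s)\n' "$authors" "$year" "$url"
}
```

Calling `format_citation "Smith et al." 2025 "https://example.org/paper"` keeps the original URL, while passing a fourth argument replaces it with the DOI link.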

## Context chaining

Several capabilities support multi-turn context via `interaction_id`. When a research or enrichment task completes, it returns an `interaction_id`. If the user asks a follow-up question related to that task, pass `--previous-interaction-id` to carry context forward automatically. This avoids restating what was already found.
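
A minimal sketch of carrying an ID forward, under stated assumptions: this section does not show the output layout, so the code assumes the task's JSON output contains a string field named `interaction_id`. Both helper names are hypothetical, and which subcommands accept `--previous-interaction-id` is documented in the reference files, not here.

```bash
# Hypothetical helper: read JSON on stdin and print the first "interaction_id" value.
# Assumes the ID is a plain JSON string field; the real output layout may differ.
extract_interaction_id() {
  sed -n 's/.*"interaction_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Hypothetical helper: print the follow-up flag only when an ID is known,
# so the first call in a conversation is left unchanged.
chain_flag() {
  local previous_id="${1:-}"
  if [ -n "$previous_id" ]; then
    printf '%s %s\n' "--previous-interaction-id" "$previous_id"
  fi
}
```

Saving the ID after each completed task and appending the output of `chain_flag` to the next related command keeps the follow-up from restating earlier findings.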

---

## Setup

If `parallel-cli` is not installed, install it:

```bash
curl -fsSL https://parallel.ai/install.sh | bash
```

If unable to install that way, use uv instead:

```bash
uv tool install "parallel-web-tools[cli]"
```

Then authenticate. First, check if a `.env` file exists in the project root and contains `PARALLEL_API_KEY`. If so, load it with `dotenv`:

```bash
dotenv -f .env run parallel-cli auth
```

If `dotenv` isn't available, install it with `pip install "python-dotenv[cli]"` or `uv pip install "python-dotenv[cli]"` (the quotes keep shells such as zsh from interpreting the brackets).

If there's no `.env` file or it doesn't contain the key, fall back to interactive login:

```bash
parallel-cli login
```

Or set the key manually: `export PARALLEL_API_KEY="your-key"`

Verify with:

```bash
parallel-cli auth
```
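
The choice between the three authentication paths above can be sketched as a small check. The helper names are hypothetical, and the check only looks for a `PARALLEL_API_KEY=` line with some value; it does not validate the key itself.

```bash
# Hypothetical helper: succeed if the given env file sets PARALLEL_API_KEY to some value.
env_file_has_key() {
  local env_file="${1:-.env}"
  [ -f "$env_file" ] &&
    grep -Eq '^[[:space:]]*(export[[:space:]]+)?PARALLEL_API_KEY=.+' "$env_file"
}

# Hypothetical helper: print which documented authentication path applies.
choose_auth_path() {
  local env_file="${1:-.env}"
  if env_file_has_key "$env_file"; then
    echo "dotenv"        # run: dotenv -f .env run parallel-cli auth
  elif [ -n "${PARALLEL_API_KEY:-}" ]; then
    echo "environment"   # key already exported; run: parallel-cli auth
  else
    echo "login"         # run: parallel-cli login
  fi
}
```

Running `choose_auth_path` from the project root prints `dotenv`, `environment`, or `login`, which maps directly onto the commands in this section.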

If `parallel-cli` is not found after install, add `~/.local/bin` to PATH.
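
One way to do this for the current shell session is sketched below; `ensure_local_bin_on_path` is a hypothetical helper that avoids adding the directory twice. To make the change permanent, the equivalent `export PATH=...` line would go in the shell's startup file.

```bash
# Hypothetical helper: prepend ~/.local/bin to PATH unless it is already listed.
ensure_local_bin_on_path() {
  case ":$PATH:" in
    *":$HOME/.local/bin:"*) ;;   # already on PATH, nothing to do
    *) PATH="$HOME/.local/bin:$PATH"; export PATH ;;
  esac
}
```

After calling it, `command -v parallel-cli` shows whether the shell can now locate the binary.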

## Check task status

```bash
parallel-cli research status "$RUN_ID" --json
```

Report the current status to the user (running, completed, failed, etc.).
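
The reporting step might look like the sketch below. `next_step_for_status` is a hypothetical helper that takes a status string already read from the command's JSON output; the field that holds the status is not shown in this section, so extracting it is left out. Only the three statuses named above are handled explicitly.

```bash
# Hypothetical helper: turn a task status string into a message for the user.
next_step_for_status() {
  case "$1" in
    running)   echo "Task is still running; check again later." ;;
    completed) echo "Task completed; retrieve the result." ;;
    failed)    echo "Task failed; report the failure to the user." ;;
    *)         echo "Unrecognized status '$1'; report it verbatim." ;;
  esac
}
```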

## Get completed result

```bash
parallel-cli research poll "$RUN_ID" --json
```

Present results in a clear, organized format.
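
If the result needs to be re-read or summarized in several passes, it may help to save it to a file first. The wrapper below is a sketch: `save_research_result` and the default file name are hypothetical, and it calls only the `research poll` command shown above.

```bash
# Hypothetical helper: fetch a completed result and save it to a file.
#   $1 run ID, $2 output path (defaults to result-<run id>.json)
save_research_result() {
  local run_id="${1:-}" out_file="${2:-}"

  # Refuse to call the CLI with an empty run ID.
  if [ -z "$run_id" ]; then
    echo "usage: save_research_result RUN_ID [OUT_FILE]" >&2
    return 2
  fi
  out_file="${out_file:-result-$run_id.json}"

  # Same command as above; stdout is redirected so the JSON can be re-read later.
  parallel-cli research poll "$run_id" --json > "$out_file" || return 1
  echo "$out_file"
}
```

On success the function prints the path it wrote, so `save_research_result "$RUN_ID"` can feed directly into whatever formats the final answer.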

More from this repository

adaptyv (Skill)

How to use the Adaptyv Bio Foundry API and Python SDK for protein experiment design, submission, and results retrieval. Use this skill whenever the user mentions Adaptyv, Foundry API, protein binding assays, protein screening experiments, BLI/SPR assays, thermostability assays, or wants to submit protein sequences for experimental characterization. Also trigger when code imports `adaptyv`, `adaptyv_sdk`, or `FoundryClient`, or references `foundry-api-public.adaptyvbio.com`.

aeon (Skill)

This skill should be used for time series machine learning tasks including classification, regression, clustering, forecasting, anomaly detection, segmentation, and similarity search. Use when working with temporal data, sequential patterns, or time-indexed observations requiring specialized algorithms beyond standard ML approaches. Particularly suited for univariate and multivariate time series analysis with scikit-learn compatible APIs.

anndata (Skill)

Data structure for annotated matrices in single-cell analysis. Use when working with .h5ad files or integrating with the scverse ecosystem. This is the data format skill—for analysis workflows use scanpy; for probabilistic models use scvi-tools; for population-scale queries use cellxgene-census.

arboreto (Skill)

Infer gene regulatory networks (GRNs) from gene expression data using scalable algorithms (GRNBoost2, GENIE3). Use when analyzing transcriptomics data (bulk RNA-seq, single-cell RNA-seq) to identify transcription factor-target gene relationships and regulatory interactions. Supports distributed computation for large-scale datasets.

astropy (Skill)

Core Python library for astronomy and astrophysics workflows that need Astropy APIs, including units/quantities, coordinates, FITS I/O, tables, time systems, WCS, and cosmology. Use when implementing or debugging astronomical data analysis code with Astropy.

autoskill (Skill)

Observe the user's screen via screenpipe, detect repeated research workflows, match them against existing scientific-agent-skills, and draft new skills (or composition recipes that chain existing ones) for the patterns not yet covered. Use when the user asks to analyze their recent work and propose skills based on what they actually do. Requires the screenpipe daemon (https://github.com/screenpipe/screenpipe) running locally on port 3030 — the skill has no other data source and will refuse to run if screenpipe is unreachable. All detection runs locally; only redacted cluster summaries reach the LLM.

benchling-integration (Skill)

Benchling Python SDK and REST API integration for registry entities, inventory, ELN entries, workflows, Benchling Apps, and Data Warehouse queries. Use when automating lab data with benchling-sdk or the v2 API.

bgpt-paper-search (Skill)

Search scientific papers and retrieve structured experimental data extracted from full-text studies via the BGPT MCP server. Returns 25+ fields per paper including methods, results, sample sizes, quality scores, and conclusions. Use for literature reviews, evidence synthesis, and finding experimental details not available in abstracts alone.