prospecting

Public

Repository: coreyhaines31/marketingskills

coreyhaines31

Imported May 26, 2026

Low Risk

No security issues found

INFO

Skill manifest does not include a 'license' field. Specifying a license helps users understand usage terms.

Remediation Add 'license' field to SKILL.md frontmatter (e.g., MIT, Apache-2.0)

Scanned in 0.016s

Description

When the user wants to find, qualify, and build a list of prospects to reach out to — across B2B SaaS, general B2B, or local small businesses. Also use when the user mentions "prospecting," "build a prospect list," "find prospects," "find leads," "lead gen list," "find SaaS companies that," "find B2B companies," "find local businesses," "ICP-fit accounts," "who should we go after," "outbound list," "target account list," "find clients near me," "businesses without websites," "prospect research," "qualified leads," "find my first customers," "early adopters," "design partners," "beta users," or "who has this problem." Use this for the list-building and qualification phase. For writing the outbound copy after the list is built, see cold-email. For deep competitive research on specific accounts, see competitor-profiling.

Details

Metadata

version: 1.1.0

Skill Files

Download .zip

Explorer

SKILL.md

# Prospecting

You are an expert at building qualified prospect lists across four motions: B2B SaaS, general B2B, local small businesses, and early-stage demand-signal discovery (finding your first customers from public pain signals). Your goal is to turn an ICP definition into a verified, scored, ready-to-outreach lead sheet — using the right data sources, qualification signals, and compliance posture for each motion.

## Before Starting

**Check for product marketing context first:**
If `.agents/product-marketing.md` exists (or `.claude/product-marketing.md`, or the legacy `product-marketing-context.md` filename, in older setups), read it before asking questions. Use that context and only ask for information not already covered or specific to this task.

## Pick the Branch

Prospecting motions differ enough that the workflow forks at intake. Pick **one** branch based on who the user is selling to:

| Branch | Sell to | What "qualified" looks like | Primary sources |
|--------|---------|----------------------------|----------------|
| **SaaS** | Other SaaS companies / digital businesses | ICP fit + tech stack match + growth signals (funding, hiring, product velocity) | LinkedIn, BuiltWith, Crunchbase, Apollo, Clay, Clearbit, ProductHunt |
| **B2B** | Non-SaaS B2B (services, manufacturers, enterprises, mid-market) | Industry + size + geographic fit + buying signals (trigger events, vendor changes) | Apollo, ZoomInfo, Clay, Clearbit, LinkedIn Sales Nav, industry directories |
| **Local SMB** | Local small businesses (shops, gyms, restaurants, clinics, salons, services) | Active business + website status + proximity + decision-maker access | Google Maps, Yelp, local directories, Facebook, business websites |
| **Demand-signal** | Early-stage: your first customers, design partners, or beta users | Evidence of the exact pain/demand/timing signal — a cited public source, not just firmographic fit | Forums, communities, reviews, GitHub issues, job posts, launch announcements (via last30days, social-fetch, scraping) |

If the user describes a hybrid motion (e.g., "SMBs that are also SaaS"), pick the dominant branch and pull in qualification signals from the other. If the user is early-stage and needs their *first* customers or design partners — evidence of demand over list coverage — use the **Demand-signal** branch.

For the branch-specific deep dives:
- **SaaS** → see [references/saas-prospecting.md](references/saas-prospecting.md)
- **B2B** → see [references/b2b-prospecting.md](references/b2b-prospecting.md)
- **Local SMB** → see [references/local-prospecting.md](references/local-prospecting.md)
- **Demand-signal** (find your first customers) → see [references/demand-signals.md](references/demand-signals.md)

---

## Shared Framework (all branches)

Every prospecting engagement follows the same five phases. Tools and qualification signals change per branch; the phases don't.

### Phase 1 — Define the ICP

Pull from `product-marketing.md` if available. Otherwise, gather:

1. **Firmographic fit** — industry, company size, revenue band, geography, business model
2. **Technographic fit** (SaaS branch) — what tools they already use, what they're missing
3. **Buying signal** — why now? (trigger event, funding, hiring, new initiative, dissatisfaction with current vendor, recent move/expansion)
4. **Decision-maker profile** — role, seniority, what they care about
5. **Disqualifiers** — what makes a prospect a clear "skip"

Output the ICP as a one-paragraph statement plus a checklist of pass/fail criteria. Don't move to discovery without this.

### Phase 2 — Build the candidate list (discovery)

Source 2–3× more candidates than the user wants in the final list — qualification will cull aggressively.

- **SaaS / B2B**: combine 2–3 sources for cross-verification. Apollo or ZoomInfo for firmographics; Clearbit or Clay for enrichment; LinkedIn Sales Nav for decision-maker mapping.
- **Local SMB**: browser-assisted research starting with Google Maps for the target category in the target area; cross-check with Yelp, the business website, social pages, and public directories.

If the user's list quality bar is high, smaller is better. 25 verified leads beats 250 mostly-junk ones.

### Phase 3 — Qualify each candidate

Score every candidate against the ICP checklist. Add **evidence** (a source URL or two) for each qualification — never assert without backing.

**Confidence levels** (used across all branches):
- **High**: confirmed by at least two independent sources or official business page
- **Medium**: one credible source plus consistent search evidence
- **Low**: incomplete or ambiguous evidence — flag what remains uncertain

For email contacts (B2B / SaaS branches), **always verify deliverability before adding to the final list** — see Truelist integration in [references/data-sources.md](references/data-sources.md). Don't ship leads with invalid or risky emails.

### Phase 4 — Score and prioritize

Apply this rubric for the **SaaS, B2B, and Local SMB** branches. The **Demand-signal** branch scores differently — 0–100 demand-fit, not Hot/Warm/Cold — see [references/demand-signals.md](references/demand-signals.md).

| Score | Definition |
|-------|------------|
| **Hot** | Strong ICP fit + clear buying signal + decision-maker accessible + verified contact |
| **Warm** | ICP fit + softer or older signal + contact verifiable |
| **Cold** | Loose ICP fit OR no clear signal OR contact unverified |
| **Skip** | Disqualifier hit (out of ICP, closed business, duplicate, irrelevant, low confidence) |

Branch-specific signals refine the scoring — see each reference file. Default ratio target: ~20% Hot, ~30% Warm, rest Cold/Skip.

### Phase 5 — Output the lead sheet

(SaaS / B2B / Local SMB. The **Demand-signal** branch ships an evidence report instead — see [references/demand-signals.md](references/demand-signals.md).)

Default to a markdown table in chat. Switch to CSV when the list is >25 rows or the user explicitly asks for a file.

After the table, always add **"Top outreach targets"** — the top 3–5 hot leads with one sentence each on why this lead should be reached out to first.

Columns vary by branch (see reference files), but every lead sheet includes:
- score, business/company name, contact (where applicable), why-it's-a-prospect, source(s), confidence, last verified date

---

## Compliance Guardrails

These apply to every branch. **Read first, every engagement.**

1. **No bulk scraping** of LinkedIn, Google Maps, paywalled sites, or rate-limited APIs. Browser is an assisted research tool, not a scraper.
2. **No CAPTCHA, login wall, or bot protection bypass.** If a site requires it, work with what's publicly visible.
3. **Public business contact channels only.** Use info@, hello@, contact@, and named-role emails (founder, owner) where they're published on the business's own site. Personal/private emails require a lawful basis (existing relationship, opt-in, etc.).
4. **GDPR / CAN-SPAM / CASL aware.** Capture and retain the source URL and date for every contact you add to a list — required for downstream outreach compliance.
5. **No reselling extracted data** from Google Maps, LinkedIn, or any platform whose terms prohibit it. List building for the user's own outreach is fine; productizing the list to sell is not.
6. **Rate limit yourself.** Even on public sources, space requests. Don't fingerprint as a bot.
7. **No breached, leaked, or unprovenanced data.** Don't source prospects from breached datasets, scraped-contact marketplaces, or list brokers with no source lineage. Licensed B2B data providers (Apollo, ZoomInfo, Clearbit, Clay) are fine when used within their ToS and with a lawful basis — the ban is on illicit/unprovenanced data, not on legitimate enrichment vendors.
8. **Never target or infer sensitive traits.** Don't qualify, segment, or personalize on health, financial hardship, political belief, sexuality, religion, or other protected/sensitive attributes — even when a public post reveals them.

For the full compliance reference (GDPR, CAN-SPAM, CASL, LinkedIn ToS, Google Maps ToS, Clay/Apollo/ZoomInfo use restrictions): see [references/compliance.md](references/compliance.md).

---

## Inputs to Collect

If missing, ask once, then infer reasonable defaults and continue:

- **Branch** (SaaS / B2B / Local SMB / Demand-signal) — usually inferable from context; pick Demand-signal for early-stage first-customer discovery
- **ICP description** — pull from `product-marketing.md` if present
- **Target count** — default 25 for SaaS / B2B, 15 for Local SMB
- **Geography** (essential for Local SMB; useful for B2B; less critical for SaaS)
- **Tools the user has access to** — Apollo? Clay? ZoomInfo? Hunter? Truelist? Defaults to what's free + browser
- **Output format** — chat table (default) or CSV
- **Buying signal preference** — what triggers should they prioritize? (funding rounds, hiring, recent move, etc.)

---

## Tool Selection Quick Picks

Full breakdown in [references/data-sources.md](references/data-sources.md). Quick picks:

| If the user has access to... | Use it for |
|------------------------------|------------|
| **Apollo** | B2B / SaaS firmographic + contact discovery |
| **Clay** | Multi-source enrichment, waterfall lookups, custom scoring |
| **Clearbit** | Email-to-company and company enrichment |
| **ZoomInfo** | Enterprise B2B contact + intent data |
| **Hunter or Snov** | Email pattern guessing and verification |
| **Truelist** | Email deliverability validation (before adding to outreach list) |
| **LinkedIn Sales Navigator** | Decision-maker mapping (manual, no scraping) |
| **BuiltWith / Wappalyzer** | Tech stack qualification (SaaS branch) |
| **Crunchbase** | Funding signals (SaaS branch) |
| **GitHub** | Stargazers / forks of competitor or adjacent repos (dev-tool SaaS branch) |
| **Google Maps + browser** | Local SMB discovery |
| **Firecrawl / Browserbase** | Programmatic extraction from individual prospect websites — never from platforms |

**If the user has no enrichment tools**: lean on browser-assisted research with public sources — company website, About page, LinkedIn company page, news mentions. Slower but works.

---

## Output Formats

### Default — chat table

For SaaS / B2B (≤25 rows):

```
| Score | Company | Industry | Size | Signal | Contact | Email status | Source | Confidence |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
```

For Local SMB (≤15 rows) — port from the local-prospector reference:

```
| Score | Business | Category | Area | Website status | Website/Social | Phone | Why it's a prospect | Confidence |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
```

### CSV — when >25 rows or user requests a file

SaaS / B2B columns:

```csv
score,company,domain,industry,size_band,country,signal,contact_name,contact_title,contact_email,email_status,linkedin,source_urls,why_prospect,confidence,verified_date,notes
```

Local SMB columns:

```csv
score,business,category,area,distance_km,website_status,website_url,social_urls,phone,email,source_urls,why_prospect,confidence,verified_date,notes
```

### Always include after the table

- **Top outreach targets**: top 3–5 hot leads with one-sentence outreach rationale each
- **Search parameters**: branch, ICP, location/radius, target count, date generated
- **Open questions**: anything you couldn't verify and the user should look at

---

## Quality Checks (before finalizing)

- [ ] Remove duplicates (by domain for SaaS/B2B, by business + address for Local SMB)
- [ ] Every "Hot" lead has a verified contact + at least one source URL
- [ ] No lead has an email that failed Truelist (or your validator) verification — move to a separate "invalid" bucket and flag for the user
- [ ] No lead labeled "Hot" lacks a clear buying signal
- [ ] Confidence levels honest — "High" requires 2 independent sources, not just two of your own searches
- [ ] No leads sourced from prohibited scraping (LinkedIn at scale, Google Maps bulk extract, etc.)
- [ ] Source URL + date captured for every contact (GDPR / CAN-SPAM lineage)
- [ ] Final count matches user's request, or you've explained why it's smaller (quality bar)

---

## Common Mistakes

1. **Starting discovery without an ICP**. Build candidates against vague criteria and you'll qualify the wrong things.
2. **Treating data sources as authoritative without cross-checks**. Apollo and ZoomInfo are out of date often; verify before scoring as "Hot."
3. **Adding contacts without email verification**. Cold email reputation tanks fast with bounces — always validate.
4. **Bulk scraping LinkedIn or Google Maps**. Real risk: account suspension + ToS violation. Browser as an assisted tool only.
5. **Mixing branches**. Don't apply Local SMB scoring (website status) to a B2B SaaS prospect, or vice versa.
6. **"Hot" labels without buying signals**. ICP fit alone is not enough — the signal is what makes the timing right.
7. **No source URLs**. Every claim should be traceable to a public source. Future outreach depends on this lineage.
8. **Ignoring quiet hours / time zone** when scheduling the downstream outreach (handoff to cold-email).
9. **Forgetting to retain consent / lineage records**. Required for GDPR DSARs and CAN-SPAM audits.

---

## Task-Specific Questions

1. Which branch — SaaS, B2B, Local SMB, or Demand-signal (early-stage, finding your first customers)?
2. What's your ICP? (Or: should I pull from your product-marketing context?)
3. How many qualified leads do you want?
4. What tools do you have access to (Apollo / Clay / ZoomInfo / Hunter / Truelist / browser only)?
5. What's the triggering buying signal you care most about?
6. Geography or radius (Local SMB / B2B)?
7. Chat table or CSV?

---

## Tool Integrations

For implementation, see the [tools registry](../../tools/REGISTRY.md). Key prospecting tools:

| Tool | Best For | MCP | Guide |
|------|----------|:---:|-------|
| **Apollo** | B2B / SaaS firmographic + contact discovery | - | [apollo.md](../../tools/integrations/apollo.md) |
| **Clay** | Multi-source enrichment + waterfall | ✓ | [clay.md](../../tools/integrations/clay.md) |
| **Clearbit** | Email-to-company enrichment | - | [clearbit.md](../../tools/integrations/clearbit.md) |
| **ZoomInfo** | Enterprise B2B contact + intent | ✓ | [zoominfo.md](../../tools/integrations/zoominfo.md) |
| **Hunter** | Email pattern + verification | - | [hunter.md](../../tools/integrations/hunter.md) |
| **Snov** | Email finder + verifier | - | [snov.md](../../tools/integrations/snov.md) |
| **Truelist** | Email deliverability validation | - | [truelist.md](../../tools/integrations/truelist.md) |
| **Outreach** | Sales engagement (post-list) | ✓ | [outreach.md](../../tools/integrations/outreach.md) |
| **RB2B** | Visitor identification (warm intent) | - | [rb2b.md](../../tools/integrations/rb2b.md) |
| **GitHub** | Stargazers/forks/watchers as developer-intent signal | - | [github.md](../../tools/integrations/github.md) |
| **Firecrawl** | Single-target site extraction (prospect's own website) | ✓ | [firecrawl.md](../../tools/integrations/firecrawl.md) |
| **Browserbase** | Real-browser site research when rendering or interaction needed | ✓ | [browserbase.md](../../tools/integrations/browserbase.md) |

---

## Related Skills

- **cold-email**: For writing outbound sequences against the qualified list (the natural next step after prospecting)
- **customer-research**: For understanding why current customers buy — informs the ICP definition
- **competitor-profiling**: For deeper research on individual accounts (different from list-building qualification)
- **revops**: For lead routing, lifecycle, and CRM handoff after prospecting
- **sales-enablement**: For battle cards and one-pagers used in the outreach
- **directory-submissions**: For inbound discovery surfaces (the prospects might find you back)
- **product-marketing**: For the ICP definition that anchors every prospecting engagement

evals/evals.json Reference

{
  "skill_name": "prospecting",
  "evals": [
    {
      "id": 1,
      "prompt": "We're a B2B SaaS selling RevOps tooling at $30K ACV. Build me a list of 25 prospects.",
      "expected_output": "Should check for product-marketing.md first. Should identify this as the SaaS branch. Should run Phase 1 ICP definition pulling from product-marketing context or asking targeted questions (target industry, headcount range, tech stack signals, funding stage). Should propose discovery sources appropriate for SaaS at $30K ACV: Apollo for breadth, Clay for waterfall enrichment, Crunchbase for funding signals, BuiltWith/Wappalyzer for tech stack, LinkedIn Sales Nav for decision-mapping (manual). Should ask about user's tool access before assuming. Should source 50-75 candidates (2-3x target) before qualifying. Should flag that email validation via Truelist or similar is non-negotiable before final list. Should output SaaS-branch chat table columns (Score | Company | Industry | Size | Signal | Contact | Email status | Confidence) followed by top 3-5 hot leads with one-sentence rationale each. Should reference references/saas-prospecting.md.",
      "assertions": [
        "Checks for product-marketing.md",
        "Identifies SaaS branch",
        "Runs Phase 1 ICP definition",
        "Recommends multi-source discovery (Apollo, Clay, Crunchbase, BuiltWith)",
        "Asks about user's tool access",
        "Sources 2-3x candidates before qualifying",
        "Requires email validation before final list",
        "Outputs SaaS-branch chat table columns",
        "Includes top 3-5 outreach targets with rationale",
        "References saas-prospecting.md"
      ],
      "files": []
    },
    {
      "id": 2,
      "prompt": "Find me 25 SaaS companies that just raised a Series B in the last 60 days and use HubSpot.",
      "expected_output": "Should recognize this as a SaaS branch prospecting task with very specific signals. Should identify the trigger event (Series B in last 60 days) and the technographic filter (uses HubSpot). Should recommend a workflow: (1) Crunchbase or Pitchbook for funding signal filter (Series B + date), (2) BuiltWith or Clay's waterfall for tech stack verification (uses HubSpot), (3) cross-check via business websites and LinkedIn. Should note this is a tight ICP that should yield high-confidence matches if data sources are current. Should flag freshness concerns: Crunchbase data depends on self-reporting, BuiltWith refresh cycles aren't real-time. Should recommend cross-source verification for the funding date specifically. Should output a SaaS-branch chat table with the funding round + date in the Signal column. Should include verified email validation before delivering.",
      "assertions": [
        "Identifies as SaaS branch",
        "Identifies funding signal + tech stack filter",
        "Recommends Crunchbase or Pitchbook for funding",
        "Recommends BuiltWith or Clay for HubSpot verification",
        "Notes data freshness concerns",
        "Recommends cross-source verification",
        "Outputs signal column showing round + date",
        "Requires email validation"
      ],
      "files": []
    },
    {
      "id": 3,
      "prompt": "I run a marketing agency. Find me 25 mid-market manufacturers in the Midwest US who recently hired a new CMO.",
      "expected_output": "Should identify this as the B2B branch (manufacturers, not SaaS). Should run Phase 1 ICP definition: industry (manufacturing, with NAICS code if precision matters), size (mid-market = typically 200-2000 employees), geography (Midwest US states), trigger event (CMO hire in last 90-180 days). Should propose discovery: Apollo or ZoomInfo for firmographic filter, LinkedIn Sales Nav for CMO hire detection (job changes), Google Alerts on press releases for trigger events. Should warn that CMO hires aren't always in public databases — LinkedIn Sales Nav alerts on job changes is the most reliable source. Should output B2B-branch chat table with the CMO trigger as the signal. Should reference references/b2b-prospecting.md. Should mention compliance: GDPR less likely (US-only), CAN-SPAM applies, capture source URL + date for every contact.",
      "assertions": [
        "Identifies B2B branch (not SaaS)",
        "Runs Phase 1 ICP definition with NAICS or industry classification",
        "Specifies mid-market size band",
        "Specifies Midwest US geography",
        "Identifies trigger event (CMO hire)",
        "Recommends Apollo/ZoomInfo + LinkedIn Sales Nav",
        "Notes CMO hires often only on LinkedIn",
        "Outputs B2B-branch chat table",
        "Mentions CAN-SPAM and source URL capture",
        "References b2b-prospecting.md"
      ],
      "files": []
    },
    {
      "id": 4,
      "prompt": "We sell to industrial distributors. Build a list of 25 prospects.",
      "expected_output": "Should identify this as the B2B branch. Should run Phase 1 ICP definition asking targeted questions: distributor size, geography, vertical specialty, buying patterns. Should propose discovery: Apollo or ZoomInfo for firmographic depth, industry-specific directories (e.g., NAW for wholesale distributors, ISA for industrial sales agencies), trade show exhibitor lists. Should note state business registries and Chamber of Commerce as verification sources. Should propose trigger events: new location, recent acquisition, leadership change, posting RFPs. Should warn that industrial distributor data is often spotty in major databases — cross-check with company website + LinkedIn for size and ownership signals. Should output B2B-branch chat table. Should note ICP fit precision matters more than initial volume for this kind of niche prospecting.",
      "assertions": [
        "Identifies B2B branch",
        "Runs Phase 1 ICP definition asking targeted questions",
        "Recommends industry-specific directories beyond Apollo/ZoomInfo",
        "Mentions trade show exhibitor lists",
        "Identifies relevant trigger events",
        "Warns about data spottiness for industrial",
        "Recommends cross-verification with business websites + LinkedIn",
        "Notes ICP fit precision over volume"
      ],
      "files": []
    },
    {
      "id": 5,
      "prompt": "I build websites for local businesses. Find me 15 prospects near Austin, TX who don't have a website.",
      "expected_output": "Should identify as Local SMB branch. Should run Phase 1 ICP definition: business category (ask user — gyms, restaurants, salons, etc. matter), radius (default 20 km from Austin), target count (15). Should run the browser research workflow: search Google Maps for category + Austin, build candidate list from visible results, cross-check via business name + city web search to verify website status. Should apply the 4-tier website status classification (No site found / Social only / Weak site / Has site) — prioritize No site + Social only as Hot. Should score: Hot (no site + active + phone + within radius), Warm (weak site), Cold (has site), Skip (closed/duplicate/out of scope). Should output Local SMB chat table (Score | Business | Category | Area | Distance | Website status | Website/Social | Phone | Why prospect | Confidence). Should add 'Best first outreach targets' top 3 with reasoning. Should reference references/local-prospecting.md. Should warn against bulk-scraping Google Maps (ToS violation) — browser-assisted research only.",
      "assertions": [
        "Identifies Local SMB branch",
        "Asks about business category if not specified",
        "Defaults radius to 20km",
        "Runs browser research workflow",
        "Applies 4-tier website status classification",
        "Uses Hot/Warm/Cold/Skip scoring",
        "Outputs Local SMB chat table columns",
        "Adds top 3 outreach targets",
        "References local-prospecting.md",
        "Warns against bulk-scraping Google Maps"
      ],
      "files": []
    },
    {
      "id": 6,
      "prompt": "I have a list of 200 prospect emails from Apollo. How do I know which ones are deliverable before I start outreach?",
      "expected_output": "Should explain the deliverability validation step in Phase 3. Should recommend Truelist (the integration in this pack) for bulk validation. Should explain the email_state classification output: ok (deliverable), email_invalid (bounces, exclude), risky (deliverable with risk like role or disposable, include cautiously), unknown (couldn't determine, skip or re-verify), accept_all (catch-all domain, include cautiously). Should warn that Apollo data accuracy is typically 60-80% — sending without validation will tank sender reputation (bounce rate >2% triggers ISP throttling and reputation damage). Should recommend the workflow: bulk POST to /api/v1/verify or CSV upload → keep ok, include risky/accept_all cautiously, exclude email_invalid, re-verify unknown → hand off to outreach. Should note Truelist also has an official MCP server for agent-driven validation. Should note cold email reputation is hard to recover once damaged — validation is non-negotiable, not optional. Should mention Hunter and Snov as alternatives with built-in verification. Should reference truelist.md integration guide.",
      "assertions": [
        "Recommends Truelist for bulk validation",
        "Explains email_state values (ok, email_invalid, risky, unknown, accept_all)",
        "Warns Apollo accuracy is 60-80%",
        "Cites 2% bounce rate threshold for reputation damage",
        "Recommends workflow: validate, keep ok, exclude email_invalid",
        "Mentions Truelist MCP server for agent workflows",
        "Mentions cold email reputation is hard to recover",
        "References truelist.md or data-sources.md"
      ],
      "files": []
    },
    {
      "id": 7,
      "prompt": "I just built a tool that automates failed-payment follow-up for gym owners. I have no customers yet. Help me find my first ten — the people who are actually dealing with this problem right now.",
      "expected_output": "Should select the Demand-signal branch (early-stage, first customers, evidence-of-demand) and load references/demand-signals.md — NOT the SMB/B2B list-building branches. Should start with a product brief, then mine the five signal buckets (explicit demand / pain / workaround / switching / timing) across public discourse (forums, communities, reviews, GitHub issues, job posts) — using last30days for recency, social-fetch/scraping to read original pages, not qualifying from snippets. Should score prospects on demand-fit (pain 25 / product fit 25 / timing 20 / reachability 15 / evidence quality 15, 0-100 with bands) rather than ICP-fit Hot/Warm/Cold, and require a cited public signal for every primary-shortlist prospect. Should draft source-based openers but never auto-send. Should produce an evidence report (verdict → ICP → top prospect → shortlist with sources+scores → repeated patterns → 7-day manual outreach plan → limits) and label prospects as 'potential customers based on public signals,' not confirmed buyers. Should honor the compliance guardrails including no data brokers/leaked data and no sensitive-trait targeting.",
      "assertions": [
        "Selects the Demand-signal branch, not the SMB/B2B/SaaS list-building branches",
        "Mines the five signal buckets from public discourse rather than contact databases",
        "Uses recency/original-source tooling (last30days, social-fetch, scraping) and does not qualify from snippets",
        "Scores on the demand-fit rubric (0-100 weighted), not ICP-fit Hot/Warm/Cold",
        "Requires a cited public signal for every primary-shortlist prospect",
        "Drafts openers but never auto-sends; labels prospects as potential-based-on-public-signals",
        "Produces the evidence report structure with a 7-day manual outreach plan and limits"
      ],
      "files": []
    }
  ]
}

references/b2b-prospecting.md Reference

# B2B Prospecting Reference

For when the user sells to non-SaaS B2B — services, agencies, manufacturers, mid-market and enterprise companies, professional services firms.

---

## ICP Signals That Matter (B2B branch)

### Firmographic signals

- **Industry / vertical** — NAICS or SIC codes if precision matters
- **Company size** — headcount band, revenue band, location count
- **Geography** — relevant for time zones, regulations, on-site requirements
- **Business model** — service vs product vs distribution; B2B vs B2B2C
- **Ownership** — independent, PE-backed, public, family-owned — affects buying motion

### Buying signals

- **Trigger events**: new C-level hire, recent acquisition or divestiture, IPO/funding, opening a new location, recent rebrand, expansion announcement
- **Vendor signals**: posting RFPs publicly, switching costs in last quarterly report, contract renewal windows
- **Operational signals**: recent layoffs (cost pressure) or rapid hiring (capacity pressure)
- **News mentions**: launching new initiative, entering new market, regulatory change forcing action
- **PR / press**: anything that signals "this company is changing right now"

### Decay signals

- Multiple bankruptcies or PE-stripped operations
- Negative growth + cost-cutting headlines
- Ownership stagnation (small family-owned, no growth incentive)
- Buyer turnover (3+ Marketing Directors in 2 years)

---

## Discovery Sources (B2B branch)

### Tier 1 — primary discovery

- **Apollo**: best general B2B firmographic + contact discovery
- **ZoomInfo**: enterprise B2B + intent signals (mid-market+)
- **LinkedIn Sales Navigator**: industry + role + signal search; the gold standard for decision-maker mapping (manual)
- **Clay**: when you need custom waterfall lookups (e.g., enrich Apollo records with Hunter + Clearbit)

### Tier 2 — industry-specific directories

- **Crunchbase / Pitchbook**: funded businesses
- **D&B Hoovers**: large traditional B2B firmographics
- **State / national business registries**: for verified incorporation data
- **Industry association membership rosters**: trade groups often publish member lists
- **Trade show exhibitor lists**: signals active participation in a vertical
- **Procurement databases** (Procore for construction, e.g.): vertical-specific signals

### Tier 3 — trigger event monitoring

- **Google Alerts / Feedly**: trigger keywords ("acquired," "hires," "expansion," "raises," "announces")
- **PR Newswire / Business Wire**: company-controlled announcements
- **SEC filings** (public companies): material change disclosures
- **State filings**: new entity formation, dissolution

---

## Qualification Checklist (B2B branch)

- [ ] Industry / vertical matches ICP (use a recognized classification if possible)
- [ ] Company size within range (employees or revenue)
- [ ] Geography fits
- [ ] At least one trigger event in last 90–180 days
- [ ] Decision-maker role exists (CEO, COO, VP Operations, Director of X — match buyer profile)
- [ ] Email contact verifiable (named role > info@ catchall)
- [ ] Source URLs captured for firmographic claims
- [ ] No disqualifiers (closed, acquired-paused, multi-bankrupt, off-ICP)

---

## Output Columns (B2B branch)

Recommended CSV columns:

```csv
score,company,domain,industry,naics_code,size_band,revenue_band,country,city,trigger_event,trigger_date,contact_name,contact_title,contact_email,email_status,linkedin_url,source_urls,why_prospect,confidence,verified_date,notes
```

For chat table, condense to: Score | Company | Industry | Size | Trigger | Contact | Email status | Confidence.

---

## Top Outreach Targets Selection (B2B)

Prioritize for the top 3–5 hot leads:

1. **Trigger event recency** — 30 days beats 6 months
2. **Trigger event specificity** — new CMO hire in your buyer's role beats "company in the news"
3. **Decision-maker access** — named contact with verified email + LinkedIn beats role-only
4. **Vertical fit precision** — exact NAICS match beats "adjacent industry"

Each top target rationale names the trigger and decision-maker: "Hired new VP of Marketing 14 days ago; verified email; mid-market manufacturer matching ICP."

---

## Common Mistakes (B2B)

1. **Treating B2B like SaaS** — funding rounds matter less; PE ownership and acquisition activity matter more.
2. **Trying to verify private company revenue precisely** — most public databases approximate. Use size bands, not point estimates.
3. **Ignoring procurement complexity** at enterprise scale — your prospect contact list may not include the actual approver.
4. **Cold-emailing executive assistants** — they're not the buyer and they will flag your outreach as spam.
5. **Source URL hygiene** — without source lineage, you can't defend a contact under GDPR DSAR or CAN-SPAM challenge.
6. **Stopping at one source** — Apollo can be 60% accurate on small businesses. Cross-verify with LinkedIn or the business website.

references/compliance.md Reference

# Prospecting Compliance Reference

The legal and platform-ToS constraints that apply to prospect list building. Read first, every engagement.

> Operational guidance, not legal advice. For high-volume programs or programs touching EU/UK residents, run your setup past a privacy attorney.

---

## United States — CAN-SPAM (downstream)

CAN-SPAM regulates the cold email **send**, not the list build. But the list build matters because:

- You must be able to identify the source of every email address you contact (required if challenged)
- The "from" line and email content rules apply at send time — but you can't lie about how you got the contact
- Opt-out requests must be honored within 10 business days and tracked

**For prospecting specifically**: capture and retain the source URL + date for every contact you add to a list. CAN-SPAM doesn't require it explicitly, but defending your sender practices does.

---

## EU / UK — GDPR

The strictest applicable framework. Triggers when:

- Your prospect resides in EU/UK
- You're processing personal data (any identifiable info, including business emails tied to a named person)

### Lawful bases for cold B2B outreach

You have three credible options:

1. **Legitimate interest** (most common for B2B). Requires:
- The contact is in a business role likely to be interested in your offer
- The data was collected from a public, business-context source
- You provide a clear opt-out
- You can articulate the legitimate interest test in writing

2. **Consent** — typically not feasible for cold outreach (you don't have consent before first contact)

3. **Existing customer relationship** — only applies to current customers, not prospects

### What you must do

- Capture **source + date + lawful basis** for every contact
- Honor data subject access requests (DSARs) — you must be able to disclose, correct, or delete on request
- Include a privacy notice / opt-out in the first outreach
- Don't store personal data longer than necessary for the legitimate interest

### What disqualifies a list

- Bulk-scraped LinkedIn data — explicit ToS violation + GDPR risk
- Email addresses purchased from a list broker without source provenance
- "Anyone @ this domain" guessed emails sent without verification (multiplies risk + bounces)

---

## Canada — CASL

Stricter than CAN-SPAM. Cold B2B outreach requires:

- **Express consent** (explicit opt-in) — typically not present for cold prospecting
- **OR implied consent** — existing business relationship within 24 months, OR business address publicly published on the company's own site for the purpose of receiving such communications

**Practical implication for Canadian prospects**: relying on the publicly-published-address exception is the most defensible cold prospecting basis in Canada. You must include sender identification, mailing address, and an unsubscribe mechanism in every message.

---

## Platform Terms of Service

### LinkedIn

- **Sales Navigator** as a research tool: fine
- **Scraping LinkedIn at any scale**: explicit ToS violation. Banned accounts are permanent. Don't.
- **Apollo, Clay, and ZoomInfo** claim LinkedIn-overlap data through various legitimate channels — verify their data sources before assuming compliance
- **InMail and Connection Requests**: governed by LinkedIn's own messaging rules, not by CAN-SPAM/GDPR (because LinkedIn-internal)

### Google Maps

- ToS prohibits bulk extraction or productizing Maps data
- Browser-assisted research as a discovery aid: acceptable
- Storing Place IDs or large structured Maps data in your CRM: explicit ToS prohibition
- Use Maps to **find** local businesses, then cross-source from the business's own site for the data you retain

### Apollo / ZoomInfo / Clearbit

- All have their own ToS limiting reselling, downstream sharing, and use cases
- Read your contract — typically you can use the data for your own outreach but not productize it
- Don't share extracts publicly (e.g., on a leaderboard, in a public report)

### Crunchbase

- Free tier is read-only for personal use
- Paid tier permits broader use within contractual scope
- API access requires paid Pro+ tier

---

## Anti-Patterns (Don't Do These)

1. **Bulk-scraping LinkedIn / Google Maps / Yelp**. Browser-assisted research is OK; automated scrapers pointed at these platforms are not. **Firecrawl and Browserbase are fine for an individual prospect's own website** (the URL you found through manual discovery) — not for the platforms hosting prospects.
2. **Buying lists from random vendors** without source provenance. You inherit their legal exposure.
3. **Guessing emails and sending unverified**. Bounce rates over 2% destroy sender reputation; legally, you can't claim a "legitimate interest" basis for an email you fabricated.
4. **Harvesting personal email addresses** (Gmail, personal Outlook, etc.) from public profiles. Personal addresses raise GDPR risk significantly.
5. **Storing data you don't need**. Minimize retention. Don't keep prospect lists forever — GDPR right to deletion applies.
6. **Skipping the lawful basis documentation**. If challenged, you need to show your work. Capture source URL + collection date for every contact.
7. **Reselling prospect lists**. You may not have the right to share them downstream. Read your data provider contracts.
8. **CAPTCHA bypass / login wall bypass**. Even if technically possible, this signals bot behavior and violates virtually every ToS.

---

## Quick Audit Checklist

Before shipping a list to the user (or downstream to cold-email):

- [ ] Every contact has a source URL + collection date
- [ ] No contacts sourced from scraped LinkedIn data
- [ ] No Google Maps Place IDs or large Maps-structured data retained
- [ ] Lawful basis documented (legitimate interest test for B2B, or relevant alternative)
- [ ] Email addresses validated (deliverability check before outreach)
- [ ] Personal addresses (Gmail, etc.) flagged or excluded
- [ ] Source provider contracts permit the intended use case
- [ ] Retention plan documented (when to delete)
- [ ] First outreach will include unsubscribe + privacy notice (downstream concern for cold-email skill, but mention it now)

references/data-sources.md Reference

# Prospecting Data Sources

Tool selection guide for prospecting across all three branches.

---

## Tool selection by goal

| Goal | Primary tools | Notes |
|------|--------------|-------|
| **Build initial firmographic list (B2B / SaaS)** | Apollo, ZoomInfo, Clay | Apollo for breadth, ZoomInfo for enterprise + intent, Clay for custom workflows |
| **Decision-maker mapping** | LinkedIn Sales Navigator (manual), Apollo, ZoomInfo | Sales Nav is the gold standard. Never bulk scrape it. |
| **Tech stack qualification (SaaS)** | BuiltWith, Wappalyzer | BuiltWith has wider coverage + paid plans for bulk; Wappalyzer is lighter + free for small use |
| **Funding signals (SaaS)** | Crunchbase, Pitchbook | Crunchbase free tier sufficient for early signals; Pitchbook for deeper investor data |
| **Email pattern discovery** | Hunter, Snov, Apollo | Pattern guessing — followed by verification |
| **Email deliverability verification** | Truelist, Hunter, NeverBounce, ZeroBounce | Always verify before adding to outreach lists |
| **Visitor identification (warm intent)** | RB2B, Clearbit Reveal | Anonymous traffic → company identification |
| **Intent data** | ZoomInfo Intent, 6sense, Bombora | Pre-warmed signals; mid-market+ pricing |
| **Trigger event monitoring** | Google Alerts, Feedly, LinkedIn Sales Nav alerts | Free options are sufficient for most |
| **Local business discovery** | Google Maps (manual), Yelp, Facebook Pages | Browser-assisted, not bulk-extracted |

---

## Apollo

**Use for**: General B2B / SaaS firmographic + contact data. Best starting point if you don't already have a list.

**Strengths**:
- Large database (>200M contacts, >60M companies)
- Strong filtering UI (industry, size, technologies, signals)
- Integrated email + LinkedIn finder
- Pay-as-you-go and tiered plans

**Watch out for**:
- Data freshness varies — re-verify before scoring as "Hot"
- Email accuracy ~60–80% — always validate
- Bulk export limits apply

**Integration**: see [apollo.md](../../../tools/integrations/apollo.md)

---

## Clay

**Use for**: Multi-source enrichment, waterfall lookups, custom scoring logic. When list quality matters more than list size.

**Strengths**:
- Waterfall logic: try Apollo first → fallback to ZoomInfo → fallback to Clearbit
- 100+ data provider integrations
- AI-powered enrichment (LLM-driven extraction from URLs)
- Custom columns + scoring formulas
- Native MCP server

**Watch out for**:
- Per-credit pricing can spike on large lists
- Complexity overhead — easy to over-engineer workflows

**Integration**: see [clay.md](../../../tools/integrations/clay.md)

---

## ZoomInfo

**Use for**: Enterprise B2B + intent data. Mid-market+ buyer profiles.

**Strengths**:
- Enterprise-grade firmographic depth
- Intent signals (companies searching topics relevant to your offer)
- Best-in-class for >$50K ACV B2B sales
- Native MCP server

**Watch out for**:
- Expensive ($15K+/yr starter)
- Overkill for SMB prospecting
- Locked into multi-year contracts typically

**Integration**: see [zoominfo.md](../../../tools/integrations/zoominfo.md)

---

## Clearbit

**Use for**: Email → company enrichment, anonymous visitor identification (Clearbit Reveal).

**Strengths**:
- Strong company enrichment (industry, size, funding, tech stack)
- Email lookup by domain
- Reveal: identify anonymous site visitors at company level
- API-first

**Watch out for**:
- HubSpot acquisition (2023) — bundled into HubSpot Breeze Intelligence now
- Standalone API still available but pricing/access depends on tier

**Integration**: see [clearbit.md](../../../tools/integrations/clearbit.md)

---

## Hunter / Snov

**Use for**: Email pattern discovery + lightweight verification on small lists.

**Hunter strengths**:
- Domain-based email discovery
- Built-in deliverability verification
- Free tier reasonable for occasional use

**Snov strengths**:
- Email finder + drip campaigns (overlap with outreach tooling)
- Bulk verification
- Cheaper than Hunter at scale

**Watch out for**:
- Both are pattern-guessing tools — accuracy depends on the target company's email pattern being inferable
- Always run results through a dedicated validator (Truelist or similar) before outreach

**Integrations**: see [hunter.md](../../../tools/integrations/hunter.md), [snov.md](../../../tools/integrations/snov.md)

---

## Truelist

**Use for**: Email deliverability validation before adding contacts to outreach lists. Critical safety step.

**Strengths**:
- Single-email sync verification (`/api/v1/verify_inline`) + bulk async (`/api/v1/verify`)
- Returns `email_state` (ok / email_invalid / risky / unknown / accept_all) + `email_sub_state` (email_ok / is_disposable / is_role / unknown_error / failed_smtp_check) + did-you-mean typo suggestions
- Catches catch-all domains, role accounts, spam traps, disposable providers
- Official MCP server for agent-driven workflows (Claude, Cursor, VS Code)
- Official SDKs in 7 languages + framework integrations (Django, Laravel, Next.js, Rails, React, Svelte, Vue, WordPress)
- Native integrations with Mailchimp, Klaviyo, HubSpot, Zapier, Make, n8n, Clay, Salesforce, more
- Pay-per-email pricing

**Why this matters**: Cold email reputation craters when bounce rates exceed 2%. Validating before sending is non-negotiable. Apollo/ZoomInfo/Hunter data is often 60–80% accurate — Truelist catches the rest.

**Integration**: see [truelist.md](../../../tools/integrations/truelist.md)

---

## LinkedIn Sales Navigator

**Use for**: Manual decision-maker discovery. The gold standard for B2B / SaaS prospecting but only when used as a research tool.

**Strengths**:
- Most accurate decision-maker data in the industry
- Real-time job changes, posts, signals
- Lead lists, alerts, saved searches
- Inmail credits (separate channel from cold email)

**Hard rules**:
- **Never bulk scrape**. LinkedIn aggressively bans scrapers. Account ban risk is real and permanent.
- Use Sales Nav as a research interface — open profiles, read, take notes, capture key data manually.
- Apollo and other tools claim LinkedIn data via partnerships / public mirroring — verify the source legitimacy before assuming compliance.

**Integration**: no MCP or API access at consumer level. Manual research only.

---

## BuiltWith / Wappalyzer

**Use for**: Tech stack qualification (SaaS branch).

**BuiltWith**:
- ~50K+ technologies tracked
- API + bulk lookups (paid)
- Historical data (when stack changed)

**Wappalyzer**:
- Free browser extension; paid API
- Lighter coverage than BuiltWith
- Faster for one-off lookups

Cross-reference both for high-confidence tech stack signals.

---

## Crunchbase

**Use for**: Funding signals (SaaS branch).

**Strengths**:
- Free tier shows recent funding events
- Paid (Pro / Enterprise) unlocks alerts and deep history
- API access for paid users

**Watch out for**:
- Coverage is best for VC-backed companies; bootstrapped + small businesses underrepresented
- Self-reported data — verify funding amounts independently

---

## GitHub (stargazers / forks / watchers)

**Use for**: Developer-intent prospecting. Especially powerful for dev-tool SaaS — stargazers of competitor or category-defining repos are in-market signal.

**Strengths**:
- Public API, no scraping concerns
- High signal quality (a starred repo = explicit interest)
- Forks are an even stronger signal (intent to modify, not just bookmark)
- Bundled `github-prospects.js` CLI handles pagination + enrichment + CSV output
- Free with 5,000 req/hr authenticated rate limit

**Watch out for**:
- Only ~5–20% of users publish email — pair with Apollo/Clay/Hunter for enrichment
- Very-popular repos (100K+ stars) are mostly noise; smaller targeted repos (5K–25K) give better signal density
- Most prospects are individuals, not company contacts directly — need to figure out their company from `company` field or LinkedIn

**Integration**: see [github.md](../../../tools/integrations/github.md)

---

## Firecrawl / Browserbase (single-target site research)

**Use for**: Programmatically extracting content from a **prospect's own website** that you already found via discovery on platforms like Google Maps, Yelp, or LinkedIn. Not for scraping those platforms themselves.

### Firecrawl

- **Best for**: "Just give me the page as markdown" — Local SMB website status checks, B2B company about/team page extraction, structured field extraction
- **Strengths**: Low overhead, returns clean LLM-ready markdown, handles most JS-rendered sites, has an MCP server
- **API + MCP + SDKs**: Node, Python, Go, Rust

### Browserbase

- **Best for**: When you need real Chromium — JS-heavy pages, cookie consent dialogs, form submission to reach a contact page, session state
- **Strengths**: Full browser control via Playwright/Puppeteer; Stagehand provides AI-friendly natural-language extraction; session recordings for debugging
- **API + MCP (Stagehand) + SDKs**: Node, Python

### Critical compliance line

Both tools can technically point at any URL. The hard rule:

- ✓ **OK**: extracting content from a single business's own website (`joescoffeeshop.com`) that you found through manual discovery
- ✗ **NOT OK**: pointing them at `google.com/maps`, LinkedIn search results, Yelp listings, or any platform whose ToS prohibits bulk extraction

Discovery happens on platforms (manual browser-assisted research). Extraction happens on individual public business sites.

**Integrations**: see [firecrawl.md](../../../tools/integrations/firecrawl.md), [browserbase.md](../../../tools/integrations/browserbase.md)

---

## RB2B / Clearbit Reveal

**Use for**: Identifying anonymous site visitors as warm intent signals.

**Strengths**:
- Pixel-based visitor → company identification
- High-intent: they came to your site, they're already in research mode
- Slack / email alerts on key visits

**Watch out for**:
- Privacy/GDPR considerations — verify your privacy policy disclosures
- Person-level identification raises higher concerns than company-level

**Integration**: see [rb2b.md](../../../tools/integrations/rb2b.md)

---

## Free / browser-only fallbacks

When the user has no paid tools, lean on:

- **Google Search** — exact business name + city + role searches
- **LinkedIn** (manual, no scraping) — company pages, employee lookups
- **Crunchbase free tier** — funding events
- **Wappalyzer browser extension** — tech stack at a glance
- **Hunter.io free tier** — 25 lookups/month
- **Google Maps** — for Local SMB discovery
- **Business websites + About pages** — primary source for any claim
- **News sites + press releases** — trigger event monitoring via Google Alerts

Slower than tooled-up workflows, but produces high-quality smaller lists if the user is willing to do the work.

---

## Sequencing recommendations

A typical full-stack prospecting workflow:

1. **Define ICP** from product-marketing context (no tools needed)
2. **Initial list** from Apollo or ZoomInfo (firmographic filter)
3. **Enrich** with Clay (waterfall: tech stack, funding, trigger events)
4. **Decision-maker mapping** in LinkedIn Sales Nav (manual)
5. **Email pattern discovery** with Hunter or Apollo's built-in
6. **Email validation** with Truelist before final list
7. **Hand off** to cold-email skill for outreach copy

Adapt this sequence based on which tools the user actually has.

references/demand-signals.md Reference

# Demand-Signal Discovery (Find Your First Customers)

The other three branches build a list from who *fits* (firmographics, technographics, proximity). This branch builds a list from who is *already showing the pain* — the early-stage motion where you have a product and a hunch but no customer base yet, and you need your first ten real conversations. You are not filtering a database; you are mining recent public discourse for people describing the exact problem you solve, then linking every prospect to the evidence.

Use this branch when the user is pre-product-market-fit, launching something new, or looking for **design partners, beta users, or first customers** rather than a scaled outbound list. It reuses the shared five phases and every compliance guardrail in SKILL.md; what changes is where you look, how you score, and what you ship.

Pattern credit: the framework here is re-expressed from the open-source `first-customer-finder` Codex skill (Kappaemme, MIT), extended with our live-recency tooling.

## What makes this branch different

| | List-building branches (SaaS / B2B / SMB) | Demand-signal discovery |
|---|---|---|
| Starts from | A firmographic ICP | A described problem |
| Sources | Contact databases (Apollo, ZoomInfo, Clay) | Public discourse (forums, reviews, issues, posts) |
| Contact step | Enrich + verify email deliverability | None — reach them where they already posted |
| Wins on | Coverage at scale | 10 strong evidence-backed matches over a long list |
| Output | A scored lead sheet | An evidence report + manual outreach plan |

A prospect here without a cited pain, need, or timing signal is a speculative fit — it does **not** belong in the primary shortlist. Evidence is the entry ticket.

## Step 1 — Product brief (before any searching)

Define, specifically enough to *reject* weak matches:

- product and the promised outcome
- primary user and the economic buyer (often different)
- the urgent job to be done
- the current alternative or workaround being replaced
- the likely adoption trigger (what makes now the moment)
- geography / language constraint
- clear disqualifiers

Don't start broad collection until the brief is sharp. Pull from `.agents/product-marketing.md` if it exists.

## Step 2 — Mine the five signal buckets

Search several angles, not one query repeated. Adapt wording to how the audience actually talks (mine their vocabulary from organic content first — see the ad-creative hook-system's organic-language note for the same idea).

1. **Explicit demand** — "looking for," "recommend a tool for," "alternative to [X]," "does anything exist that," "how do you all handle."
2. **Pain** — "takes hours," "so manual," "hate that," "keeps breaking," "biggest frustration with," "why is there no."
3. **Workaround** — spreadsheets, copy-paste, a VA, a Zapier chain, a script, a template, any repeated manual step that your product would replace.
4. **Switching** — cancellation, migration, "moving off [competitor]," a missing feature, a pricing complaint, competitor frustration.
5. **Timing** — a public launch, a new hire for the relevant function, expansion, a new workflow or regulation, an integration announcement — a *current* event that makes the product relevant now.

**Use our live-recency edge.** A generic skill relies on whatever a web search surfaces; you have better:
- **last30days** — Reddit, Hacker News, X, YouTube, and web signals from the last 30 days. This is the single highest-value tool for this branch: recency *is* the timing signal.
- **social-fetch** — pull the full content of a specific post/thread you find, normalized.
- **scraping** / **Firecrawl** / **Browserbase** — read the original public page (a forum thread, a GitHub issue, a review), never qualify from a search snippet alone.
- **deep-research** — for a multi-source sweep with adversarial verification when the wedge is broad.
- **competitor-profiling** / **customer-research** — competitor switching signals and review-mining for the pain language.

## Step 3 — Source mix (public only)

Forums and public community threads · public social posts and replies · product and app-marketplace reviews · GitHub issues and feature requests · public company pages, job posts, changelogs, launch announcements · "looking for a tool" posts and directories.

Avoid private groups, gated communities, data brokers, leaked datasets, and any source whose terms prohibit access — the same compliance guardrails as every other branch (see SKILL.md), including the no-sensitive-traits rule.

**Business/professional context only.** Qualify and reach out only where someone is posting in a professional or business capacity about a work problem (a founder in an indie-hackers thread, a developer in a GitHub issue, an ops lead in a subreddit for their role). Exclude personal-distress contexts entirely — health, financial hardship, addiction, grief, or any consumer support forum where people are venting personal problems, even if your product is tangentially relevant. When the motion is genuinely consumer (B2C), a public pain post is not on its own a lawful basis for cold outreach — reach people through the channel's own norms (reply publicly where replying is expected) and never DM a stranger off a personal post.

Quote minimally, paraphrase by default, and link every material pain or timing signal.

## Step 4 — Score on demand-fit (not ICP-fit)

The list-building branches score Hot/Warm/Cold on ICP fit. This branch scores 0–100 on **demand fit** — how strongly the evidence says this specific prospect wants this specific thing now. Score each dimension 0–5:

| Dimension | Weight | What it measures |
|---|---|---|
| **Pain strength** | 25% | Directness, severity, repetition, and cost of the stated problem |
| **Product fit** | 25% | How directly your product solves the evidenced job |
| **Timing** | 20% | Freshness + a current trigger present |
| **Public reachability** | 15% | A natural, relevant public/professional contact path exists |
| **Evidence quality** | 15% | Specificity, source reliability, confidence the signal is really theirs |

```
score = pain/5*25 + fit/5*25 + timing/5*20 + reachability/5*15 + evidence/5*15
```

| Band | Meaning |
|---|---|
| **80–100** | Strong first-customer candidate |
| **65–79** | Promising — validate fast |
| **50–64** | Plausible but missing a material signal |
| **Below 50** | Do not include in the primary shortlist |

An old explicit request can still count — but lower the timing score and label the date. A company that merely matches the industry with no evidenced trigger is *not* a qualified prospect here.

### Prospect stages

- **High intent** — publicly requesting a solution or actively switching
- **Problem aware** — clearly describing the pain or an expensive workaround
- **Trigger present** — a current business event makes the product relevant
- **Potential fit** — ICP match, incomplete evidence → keep *outside* the primary shortlist

### Evidence ledger (per qualified prospect)

Displayed name (company/project/public professional) · source title + URL · visible publication date or "date unavailable" · source type · the concise pain/timing signal · observed evidence vs. inference (label which) · score breakdown · freshness warning when the signal is stale.

## Step 5 — Draft outreach, never send it

Recommend the most natural channel *already associated with the source*, and only where a reply is a normal part of that channel (reply in the public thread, respond via a public professional profile). Don't turn a public post into a private DM the poster didn't invite, and never contact someone off a personal-distress post. Draft one opener, under ~90 words, in this shape:

1. mention the public context naturally
2. connect it to the exact problem
3. explain the product in one sentence
4. ask one low-friction question

Never claim familiarity you don't have, never fabricate personal details, and never auto-send: no messages, connects, follows, comments, form submissions, or CRM records unless the user separately authorizes that action. This is the manual/gated posture from the marketing-loops guardrails.

## Step 6 — Ship the evidence report

Lead with the most actionable evidence, in this order:

1. **Verdict** — does the product have reachable early-customer signal, or not yet? (An honest "not yet, here's why" is a valid answer.)
2. **ICP** — buyer, job, trigger, disqualifiers.
3. **Top prospect** — the single strongest evidence-backed candidate and why now.
4. **Prospect shortlist** — per prospect: source, pain signal, demand-fit score, stage, why-now, channel, opener.
5. **Repeated patterns** — pains and triggers recurring across prospects (these are your positioning and messaging gold).
6. **Seven-day manual outreach plan** — a low-volume validation sequence (e.g., contact the top 3 with one source-based question; share a mockup only after they confirm the pain; target three conversations and one design-partner commitment).
7. **Limits** — what evidence is missing and what must be confirmed through real conversations.

For a shareable standalone HTML version of this report, the JSON→HTML generator pattern in ad-creative's [creative-review-page.md](../../ad-creative/references/creative-review-page.md) is the model (escape every value; keep it self-contained).

## The honesty rules (non-negotiable)

- Every primary prospect links to at least one real public signal. No signal, no shortlist.
- Label the output **"potential customer based on public signals"** — never "interested," "will buy," or "has consented."
- Prefer ten strong matches over a long generic list. Make uncertainty and stale evidence visible.
- Personalize from the cited source, not from invented assumptions.
- Treat the shortlist as a research hypothesis to validate through conversations, not a customer database.

references/local-prospecting.md Reference

# Local SMB Prospecting Reference

For when the user sells to local small businesses — shops, gyms, restaurants, salons, clinics, professional services, contractors, real estate, fitness studios, dental practices.

Adapted from and generalized beyond the local-client-prospector pattern (browser-assisted discovery + website status classification + proximity scoring).

---

## ICP Signals That Matter (Local SMB branch)

### Operational signals

- **Active business** — Google Business Profile updated, recent reviews, recent hours updates
- **Recent activity** — open right now, regular hours posted, recent photos uploaded by owner
- **Customer engagement** — owner responding to reviews, posts on social, active calendar (for service businesses)

### Online presence signals (the core SMB qualification axis)

The reference local-client-prospector skill uses **website status** as the primary qualification — port this directly. Four classifications:

| Status | Definition | Typical outcome |
|--------|-----------|-----------------|
| **No site found** | No credible standalone website after cross-checked search | **Hot prospect** for web/marketing service |
| **Social only** | Facebook, Instagram, WhatsApp, Linktree, booking portal, marketplace page only — no standalone site | **Hot prospect** for web/marketing service |
| **Weak site** | Standalone site exists but outdated, broken, very thin, non-mobile-friendly, or missing clear contact/conversion flow | **Warm prospect** for refresh / rebuild service |
| **Has site** | Credible, modern standalone site exists | **Low prospect** unless other signals apply (e.g., poor SEO, weak conversion design) |

### Proximity signals

- **Distance** from the user's location or service area
- **Density** — clusters of similar businesses in one area = neighborhood targeting opportunity
- **Travel time** — useful when in-person discovery, install, or service delivery is required

### Decay signals

- Closed permanently (Google Maps banner)
- Reviews paused or business listing reported as closed
- Last activity (review, post) >12 months ago

---

## Discovery Sources (Local SMB branch)

### Primary

- **Google Maps** (browser, manual) — search "category near [location]" and walk the visible results. Cross-check details. Don't bulk-extract.
- **Yelp** — secondary verification; complementary categories
- **Bing Local / Apple Maps** — different coverage on smaller businesses
- **Facebook Pages search** — many SMBs are Facebook-only

### Cross-verification

- **Business's own website** (if any)
- **Industry directories** (e.g., Healthgrades for medical, OpenTable for restaurants, Avvo for legal)
- **Local Chamber of Commerce listings**
- **State business registries** for incorporation status
- **Search results for "[business name] [city]"** to discover non-Maps presence

---

## Browser Research Workflow

1. Open a browser and search Google Maps for the category near `base_location`
2. Build a candidate list from visible local results, search results, and public directories
3. For each candidate, inspect public sources to fill required fields
4. Search the exact business name plus city/town to check whether a standalone website exists
5. Classify website status per the table above
6. Mark confidence: High (2+ sources), Medium (1 source + consistent evidence), Low (incomplete/ambiguous)

When the user explicitly asks for subagents AND subagents are available, split candidates into non-overlapping batches and ask each subagent to verify only website/social/contact status. Don't use subagents for the primary search if it slows progress.

### Optional: programmatic verification with Firecrawl or Browserbase

Once you have a candidate's website URL (found via manual Maps/Yelp discovery), you can speed up website-status classification by hitting the URL programmatically:

- **Firecrawl** for simple "is this site live, modern, mobile-friendly, conversion-flow-equipped" reads — returns clean markdown you can inspect
- **Browserbase** when the candidate site requires JS rendering, has a cookie consent dialog, or you need session state

**Strict line**: use these on the individual business's URL. **Don't** point them at Google Maps, Yelp, or any platform whose ToS prohibits bulk extraction — discovery stays manual.

See [data-sources.md](data-sources.md) for setup details.

---

## Qualification Checklist (Local SMB branch)

- [ ] Business is active (recent reviews or activity in last 6 months)
- [ ] Category matches user's service offering
- [ ] Distance / proximity within target radius
- [ ] Website status classified
- [ ] Phone or contact channel verified
- [ ] At least one cross-source confirms business operates at the listed address
- [ ] Not a duplicate / chain location / out-of-scope category
- [ ] Not closed permanently

---

## Lead Scoring (Local SMB)

Use this simple rubric (matches local-client-prospector pattern):

| Score | Criteria |
|-------|----------|
| **Hot** | No site found OR social-only + phone present + active business + within target radius |
| **Warm** | Weak site, poor online presentation, or marketplace/booking-page only |
| **Cold** | Good website already present OR low confidence |
| **Skip** | Closed, duplicate, outside radius, irrelevant category, or not a business prospect |

---

## Output Columns (Local SMB branch)

Chat table (≤15 rows):

```
| Score | Business | Category | Area | Distance | Website status | Website/Social | Phone | Why it's a prospect | Confidence |
```

CSV:

```csv
score,business,category,area,distance_km,website_status,website_url,social_urls,phone,email,source_urls,why_prospect,confidence,verified_date,notes
```

Rules:
- Keep "Why it's a prospect" short and actionable
- Use `Not found` instead of leaving blank fields
- Include source links sparingly, not all of them
- After the table, add **Best first outreach targets** with the top 3 leads and one practical reason each
- If confidence is low, state exactly what remains uncertain

---

## Top Outreach Targets Selection (Local SMB)

Prioritize for the top 3 hot leads:

1. **No site / social only + phone present** = clearest service opportunity
2. **High review count** = active, established business with real customers
3. **Owner-responded reviews** = engaged owner = more likely to evaluate a vendor
4. **Industry alignment with your service specialty** beats generic category match

Each top target rationale should be one sentence naming the gap and the signal: "No standalone website (cross-checked); 80+ Google reviews with owner replies; 2 km from target area."

---

## Compliance Notes (Local SMB-specific)

The local branch is the most scraping-sensitive of the three motions. Specifically:

- **Google Maps Terms of Service** prohibit bulk extraction. Treat browser visits as research, not as data acquisition.
- **Don't store full Google Maps Place IDs in your CRM** — the ToS limits storage of Maps data.
- **Public business contact channels only**: published phone, contact form, info@ email. Don't reach individual employees through their personal channels.
- **Owner/operator name when published on the business's own site** is OK to use. If you only got it from LinkedIn, mark the source.

---

## Common Mistakes (Local SMB)

1. **Bulk-scraping Google Maps** — fastest way to violate ToS and lose the research channel.
2. **Treating Google Maps data as truth** — listings go stale. Cross-check hours, status, and reviews.
3. **Skipping the website status cross-check** — finding "no site" on Maps doesn't mean no site exists; do an exact-name web search before classifying.
4. **Targeting only the largest businesses** — they're already covered by other providers. The 2–5 employee SMBs are the under-served opportunity.
5. **Generic outreach to all hot leads** — local SMBs respond better to outreach that names their specific gap ("I noticed your menu isn't visible on mobile") than generic pitches.
6. **Ignoring chains and franchises** as Skip — sometimes the franchisee is the buyer and they have local marketing authority. Verify before skipping.

references/saas-prospecting.md Reference

# SaaS Prospecting Reference

For when the user sells SaaS or digital services to other SaaS companies / digital businesses.

---

## ICP Signals That Matter (SaaS branch)

Beyond standard firmographics (industry, size, geography), SaaS prospects are qualified by:

### Technographic signals

- **Tech stack** — do they use complementary tools (your integration target) or competing tools (a switch opportunity)?
- **Recent stack changes** — adding/removing tools signals active vendor evaluation
- **Custom-built vs off-the-shelf** — DIY tooling often means a buyer who'd benefit from your product
- **Free/freemium plan signals** — using a free competitor means they may be ready to upgrade

### Growth signals

- **Funding round** — Series A / B / C in last 6 months = budget + new hires + tool needs
- **Headcount growth** — 10%+ growth in last quarter signals scaling pressure
- **Hiring signals** — specific role openings (e.g., "Head of RevOps" → ICP for revops tooling)
- **Product velocity** — frequent shipping, new features, blog posts = healthy growth motion
- **Open positions for your buyer's role** — if you sell to Marketing Ops and they're hiring one, that's a signal

### Decay signals (downgrade scoring)

- Layoffs in target department
- Funding round >2 years ago with no follow-up
- Product hasn't shipped in 6+ months
- Team page shows founders only (very early — may not have budget)

---

## Discovery Sources (SaaS branch)

Combine 2+ sources for cross-verification.

### Tier 1 — primary discovery

- **Apollo**: firmographic + technographic + contact data. Good for building large initial lists.
- **Clay**: waterfall enrichment, custom scoring, multi-source merges. Best for high-quality smaller lists.
- **ZoomInfo**: enterprise-grade firmographic + intent signals. Expensive; mid-market+.
- **LinkedIn Sales Navigator**: decision-maker mapping. Use manually, never bulk scrape.

### Tier 2 — technographic / growth signals

- **BuiltWith**: tech stack lookups, find sites using specific tools
- **Wappalyzer**: free browser extension + API; lighter tech stack signal
- **Crunchbase**: funding rounds, headcount, founders
- **Pitchbook**: deeper investor data (enterprise/paid)
- **ProductHunt**: recent launches, builder audience
- **Hacker News / Show HN**: technical builders launching products

### Tier 3 — buying signals

- **Job boards** (LinkedIn Jobs, Indeed, AngelList): role openings as signals
- **RB2B / Clearbit Reveal**: visitor identification (warm anonymous traffic)
- **GitHub stars/forks of competitor or adjacent repos**: developer-level intent signal (see `tools/integrations/github.md` and the `github-prospects.js` CLI). Especially strong for dev-tool SaaS — a developer who starred `vercel/next.js` last week is in-market for adjacent Next.js infrastructure.
- **Recent blog posts / changelog**: product direction signals
- **G2 reviews mentioning competitor switches**: explicit dissatisfaction signal

#### GitHub prospecting pattern (when audience is developers)

For dev-tool SaaS, GitHub is one of the highest-quality discovery channels:

1. Identify 3–5 "anchor" repos: your direct competitors, your category leader, complementary tools your buyer uses
2. Pull stargazers (or forks for stronger intent) via `node tools/clis/github-prospects.js stargazers <owner/repo> --enrich --with-company --format csv`
3. Filter to users with `company` set — these are the easiest to enrich downstream
4. Pair with Apollo/Clay/Hunter to lookup email by name + company
5. Validate with Truelist before adding to outreach list

Tradeoffs: GitHub yields email for only ~5–20% of users directly. The strength is the signal quality — a stargazer of a niche dev tool is genuinely in-market in a way Apollo firmographics alone can't tell you.

---

## Qualification Checklist (SaaS branch)

For each candidate, verify:

- [ ] Industry vertical matches ICP
- [ ] Company size (headcount) within range
- [ ] Tech stack includes (or notably excludes) a target technology
- [ ] Funding stage matches buyer maturity
- [ ] At least one growth signal in last 90 days (funding, hiring, product velocity)
- [ ] Decision-maker role exists at the company (named or inferable from job listings)
- [ ] Email contact verifiable
- [ ] No disqualifiers (closed, acquired-and-paused, layoffs, ICP miss)

---

## Output Columns (SaaS branch)

Recommended CSV columns:

```csv
score,company,domain,industry,size_band,country,funding_stage,last_round_date,tech_stack_match,signal,signal_date,contact_name,contact_title,contact_email,email_status,linkedin_url,source_urls,why_prospect,confidence,verified_date,notes
```

---

## Top Outreach Targets Selection (SaaS)

Prioritize for the top 3–5 hot leads:

1. **Strongest signal recency** — funding 30 days ago beats funding 9 months ago
2. **Tech stack match strength** — known integration partner beats inferred fit
3. **Decision-maker named with verified email** — beats role-pattern-guessed email
4. **Multi-source confidence** — both Apollo + Crunchbase agree beats one source

Each top target gets a one-sentence outreach rationale that names the specific signal: "Raised Series B 30 days ago; hiring Head of RevOps; verified VP of Ops email."

---

## Common Mistakes (SaaS)

1. **Buying lists from Apollo wholesale** without re-verifying email and re-checking firmographics. Stale data is the norm.
2. **Treating tech stack data as 100% accurate**. BuiltWith and Wappalyzer miss things; Clay's waterfalls miss things. Cross-check.
3. **Targeting Series C+ for early-stage SaaS sellers**. The buyer profile is wrong — too many procurement hoops, too much red tape.
4. **Targeting Series Pre-Seed seed** for products requiring meaningful budget. They have neither budget nor evaluator bandwidth.
5. **Ignoring intent data when it exists** (ZoomInfo Intent, 6sense, etc.) — pre-warm signals beat cold every time.

Version History

v1.1.0 Synced from GitHub

1 week ago

v1.0.0 Imported from GitHub

1 month ago