Pipeline Factory · October 16, 2024 · 8 min read

How to Verify if a Company is a SaaS Company at Scale Using Clay.com

A battle-tested multi-agent workflow with a built-in scoring system that achieved 95% accuracy across 100,000+ companies

The Problem: AI Agents Don't Work at Scale

If you've used Clay for TAM research, you've probably hit this wall: the AI agents work great in testing, but fall apart at scale.

You test with 10 companies, get 7-8 correct identifications, and think "great, this works!" Then you scale to 1,000 companies and suddenly you're getting tech consulting firms, staffing agencies, and marketplaces mixed in with legitimate SaaS companies.

After processing over 100,000 companies and achieving 95% accuracy, I figured out why this happens—and how to fix it.

Key Insight: The primary industry descriptions and company data you get from LinkedIn, Apollo, or Clay's native enrichment simply aren't detailed enough for AI to make accurate decisions at scale.

Why Single Agents Fail

The common misclassifications I kept seeing:

  • IT Services & IT Consulting companies that actually sell software products
  • Software companies that are really tech-enabled consulting operations
  • Overly broad "Technology, Information & Internet" categorizations
  • Hiring platforms that are actually international staffing services

The root cause? AI needs six critical pieces of information that basic enrichment doesn't provide (sketched as a data structure after this list):

  1. Detailed explanations of what they offer
  2. Core service or product identification
  3. Primary revenue model
  4. Revenue source details
  5. Target customer identification
  6. Service delivery methods
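To make the target output concrete, here's a minimal sketch of the enrichment record those six fields imply. The class and field names are my own illustration, not Clay's actual schema:

```python
from dataclasses import dataclass

# Hypothetical shape of the enriched record each company needs before
# classification. Field names are illustrative, not Clay's schema.
@dataclass
class CompanyOffering:
    offering_description: str  # detailed explanation of what they offer
    core_product: str          # the core service or product
    revenue_model: str         # e.g. "subscription" vs. "hourly consulting"
    revenue_sources: str       # where the money actually comes from
    target_customers: str      # who they sell to
    delivery_method: str       # how the offering is delivered
```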

The Multi-Agent Solution

Instead of asking one AI agent to do everything, I built a two-agent system that orchestrates specialized tasks:

Agent 1: The Offering Agent (Claygent Argon)

This agent scrapes the company website and extracts:

  • Subscription-based model confirmation
  • Detailed offering descriptions
  • Core product/service identification
  • Revenue model analysis
  • Customer targeting details

I use Claygent Argon for this because it's noticeably more accurate than GPT-4o mini alternatives. The quality difference is worth the slight cost increase.
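For reference, the Agent 1 extraction prompt looks roughly like the sketch below. The exact wording is mine, and "{{Website}}" stands in for whatever website column your Clay table uses:

```python
# A rough sketch of the Agent 1 (Claygent Argon) extraction prompt.
# "{{Website}}" is a placeholder for your Clay table's website column.
OFFERING_AGENT_PROMPT = """
Visit {{Website}} and answer the following about the company:

1. Is their offering subscription-based? Quote the evidence you find.
2. Describe in detail what they offer.
3. What is their core product or service?
4. What is their primary revenue model, and where does revenue come from?
5. Who are their target customers?
6. How is the offering delivered (self-serve software, managed service,
   consulting engagements, staffing)?

Return your answers as structured JSON.
"""
```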

Agent 2: The Scoring System (GPT-4o Mini)

A point-based evaluation framework with weighted indicators (a code sketch of the full rubric follows the rules below):

Positive SaaS Indicators:

  • "Web-based platform" (+3 points)
  • API architecture mentions (+2 points)
  • Subscription/recurring revenue models (+3 points)
  • "SaaS" or "Software as a Service" explicitly mentioned (+2 points)

Critical Red Flags:

  • "Hourly consulting rates" (-10 points) — This is the biggest red flag
  • "Ongoing service relationships" (-8 points)
  • "Project-based" work (-8 points)
  • Staffing/recruiting language (-10 points)

Edge Case Rules:

  • Staffing agencies disqualified regardless of technology enablement
  • Financial services and real estate technology typically excluded
  • Marketplace platforms require specialized assessment
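In production the rubric lives inside the GPT-4o mini prompt, but expressed as deterministic code the logic looks roughly like this. The keyword matching is deliberately simplified, and the function names are mine:

```python
def score_company(description: str) -> int:
    """Apply the weighted rubric to Agent 1's output.

    Simplified substring matching for illustration; in the real
    workflow GPT-4o mini judges each indicator from context.
    """
    text = description.lower()
    score = 0

    # Positive SaaS indicators
    if "web-based platform" in text:
        score += 3
    if "api" in text:
        score += 2
    if "subscription" in text or "recurring revenue" in text:
        score += 3
    if "saas" in text or "software as a service" in text:
        score += 2

    # Critical red flags
    if "hourly" in text:
        score -= 10  # the biggest red flag
    if "ongoing service" in text:
        score -= 8
    if "project-based" in text:
        score -= 8
    if "staffing" in text or "recruiting" in text:
        score -= 10

    return score


def is_saas(description: str) -> bool:
    # Edge-case rule: staffing agencies are disqualified outright,
    # regardless of how tech-enabled they are.
    if "staffing" in description.lower():
        return False
    return score_company(description) > 5.5  # classification threshold
```

The graduated score is the point: borderline companies land in the middle of the range instead of being forced into a premature yes/no.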

The Secret Weapon: Examples in Prompts

Here's what made the biggest difference: you need to provide examples.

This is more important than the scoring rubric itself. Each example should include:

  • The company's enriched dataset (Agent 1's output)
  • Your analysis narrative
  • The final score
  • Your reasoning

Give the AI 5-7 diverse examples covering:

  • Clear SaaS companies (score: 8-10)
  • Clear non-SaaS companies (score: 0-2)
  • Edge cases (score: 4-6)

This trains the AI on how you think about classification, not just what rules to follow.
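Embedded in the scoring prompt, the few-shot examples can follow a structure like this. The companies and wording here are hypothetical, abbreviated stand-ins:

```python
# Hypothetical few-shot examples for the scoring prompt. Companies and
# wording are abbreviated stand-ins, not real training data.
FEW_SHOT_EXAMPLES = [
    {  # clear SaaS (score 8-10)
        "dataset": "Web-based project management platform, monthly "
                   "subscription tiers, self-serve onboarding.",
        "analysis": "Recurring revenue, delivered entirely as software, "
                    "no services component.",
        "score": 9,
        "reasoning": "Platform +3, subscription +3, SaaS mention +2.",
    },
    {  # clear non-SaaS (score 0-2)
        "dataset": "Custom software development shop, hourly billing, "
                   "dedicated engineering teams.",
        "analysis": "Revenue scales with headcount, not with software.",
        "score": 1,
        "reasoning": "Hourly consulting -10 dominates everything else.",
    },
    {  # edge case (score 4-6)
        "dataset": "Two-sided marketplace charging transaction fees, "
                   "plus a subscription tool for sellers.",
        "analysis": "Mixed model: transaction revenue alongside software "
                    "subscriptions.",
        "score": 5,
        "reasoning": "Marketplace edge case; apply the specialized "
                     "assessment rule.",
    },
]
```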

Performance Metrics

Threshold: Companies scoring above 5.5 are classified as SaaS

Accuracy: Consistent 95% accuracy across 100,000+ companies

Why 95% and not 100%? Because at scale, even manual verification has errors. 95% accounts for human error in the validation process.

Implementation Guidelines

  1. Reject single-agent solutions when working at scale
  2. Supplement basic data with website enrichment
  3. Use graduated scoring rather than binary yes/no classification
  4. Define edge cases explicitly in your prompts
  5. Prioritize examples over complex rules
  6. Test with 50-100 companies before large-scale deployment
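For step 6, a quick accuracy check against a hand-labeled sample might look like this sketch (it assumes you've exported the pipeline's predictions alongside your manual labels):

```python
# Hypothetical validation pass for step 6: compare pipeline predictions
# against manually verified labels on a 50-100 company sample.
def accuracy(pairs: list[tuple[bool, bool]]) -> float:
    """pairs: (predicted_is_saas, manually_verified_is_saas)."""
    correct = sum(pred == truth for pred, truth in pairs)
    return correct / len(pairs)

sample = [(True, True), (False, False), (True, False)]  # use 50-100 rows
print(f"Sample accuracy: {accuracy(sample):.0%}")  # scale up only at >= 95%
```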

Free Template Access

Want the complete Clay workflow and prompt templates? I'm offering them for free.

Connect with me on LinkedIn and message me "SaaS Verification Template" — I'll send you:

  • The complete Clay workflow template
  • Both agent prompts (with examples)
  • The scoring rubric breakdown
  • Edge case definitions

Key Takeaways

If you're doing TAM research at scale in Clay:

  • Single AI agents fail because they lack detailed company data
  • Multi-agent systems with specialized roles perform dramatically better
  • A two-agent approach (website scraper + scoring system) achieves 95% accuracy
  • Examples in prompts are more valuable than complex scoring rules
  • Graduated scoring (0-10) beats binary classification

This workflow has processed 100,000+ companies for our GTM engineering clients. It's battle-tested, production-ready, and yours for free.

Questions? Reach out on LinkedIn — I'm always happy to talk Clay workflows and GTM engineering.


About Pipeline Factory: Pipeline Factory is a GTM engineering consultancy that builds systematic outbound infrastructure for B2B SaaS companies. We've processed 100,000+ companies through Clay workflows and help clients achieve 95%+ accuracy in TAM research.

Need Help Building Clay Workflows?

We build custom TAM research workflows, signal-based campaigns, and systematic outbound infrastructure for B2B SaaS companies.

Let's Talk