Shamim Shams

Automate LinkedIn Outreach Messages Using AI and Python

Cold messages fail for a simple reason. They're not actually cold — they're warm in theory and cold in execution. You've researched the person, found something interesting, written what you think is a relevant opener. Then you send the same structure to 150 people and wonder why three reply.

The bottleneck is drafting. Personalization takes time: reading a profile, finding a hook, writing an opener that references it, keeping the message short enough that someone actually reads it. At ten minutes per person, a 200-person list is more than 33 hours of writing.

Here's what changes the equation: a Python script that reads structured contact data and passes each profile to Claude, which generates a message specific enough to land. You still review and send manually — LinkedIn's Terms of Service prohibit automated messaging, and you should read each message before it goes out anyway — but the drafting step drops from ten minutes to ten seconds.

What We're Actually Building

A batch processor: CSV in, CSV out. Each row in the input is a prospect with profile context. Each row in the output gets a generated outreach message attached.

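Each output row carries the input columns plus the generated message and a status. Reduced to two fields and shown as JSON for readability, one row looks like this: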

{
  "name": "Sarah Chen",
  "generated_message": "Hi Sarah — saw your GitHub Actions write-up last month. The bit about parallelizing test suites across matrix builds solved a problem I've been poking at for weeks. Curious how you handled artifact retention across concurrent jobs. Worth a quick async exchange?"
}

That message references something specific, makes a clear connection, and asks for something low-friction. It's the kind of thing you'd write yourself if you had time to write it for 200 people.

Setup

pip install anthropic python-dotenv

Store your API key:

export ANTHROPIC_API_KEY=sk-ant-your-key-here

Or put it in a .env file for repeated runs:

ANTHROPIC_API_KEY=sk-ant-your-key-here
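
A missing key otherwise only surfaces when the first API call fails. As a minimal sketch, a check like the one below fails early with a readable message. The helper name `require_api_key` is my own, not part of the Anthropic SDK, and it assumes you call it after `load_dotenv()` has run.

```python
import os


def require_api_key(env=os.environ, var="ANTHROPIC_API_KEY"):
    """Return the API key, or raise a readable error before any request is made.

    Hypothetical helper: `env` is injectable so the check can be exercised
    without touching the real environment.
    """
    key = (env.get(var) or "").strip()
    if not key:
        raise RuntimeError(f"{var} is not set; export it or add it to your .env file")
    return key


# Typical use, after load_dotenv():
#     require_api_key()
```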

The Input Data

Your script is only as good as your data. The minimum viable CSV:

name,title,company,headline,recent_activity,connection_type
Sarah Chen,Engineering Manager,Acme Corp,"Building distributed systems at scale","Shared write-up on GitHub Actions matrix parallelization","Second-degree (via James Park)"
Marcus Williams,Head of Platform,NovaTech,"Infrastructure reliability at NovaTech","Commented on post about eBPF-based network monitoring","Cold outreach"
Priya Nair,Principal Engineer,DataCo,"ML infrastructure at scale","Published article on managing feature stores in production","Alumni — same bootcamp cohort"

recent_activity is doing most of the work. "Posted about Kubernetes" is useless. "Commented on a post about eBPF-based network monitoring in production" gives Claude something concrete to reference. Be specific when filling this column — it's the difference between a message that reads like research and one that reads like a template.

connection_type changes the tone. Cold outreach to someone you share no context with is different from messaging a second-degree contact through someone you both know well. If you have a mutual connection's name, include it.
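
Since input quality decides output quality, it can help to lint the CSV before spending API calls on it. The sketch below checks that the six columns exist, that no cell is empty, and that recent_activity is more than a few words long. The function name and the six-word threshold are assumptions of mine; word count is only a crude proxy for specificity, so tune it against your own data.

```python
import csv
import io

REQUIRED_COLUMNS = ["name", "title", "company", "headline",
                    "recent_activity", "connection_type"]
MIN_ACTIVITY_WORDS = 6  # assumed threshold, not a rule from the article


def find_input_problems(csv_text: str) -> list[str]:
    """Return human-readable problems found in the prospect CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]

    problems = []
    # The header is row 1, so data rows start at 2 (matches spreadsheet numbering).
    for row_no, row in enumerate(reader, start=2):
        for col in REQUIRED_COLUMNS:
            if not (row[col] or "").strip():
                problems.append(f"row {row_no}: empty {col}")
        words = (row["recent_activity"] or "").split()
        if 0 < len(words) < MIN_ACTIVITY_WORDS:
            problems.append(f"row {row_no}: recent_activity looks too vague")
    return problems
```

To lint a file, read it with `open("prospects.csv", encoding="utf-8").read()` and pass the text in. With the default threshold, "Posted about Kubernetes" is flagged and "Commented on post about eBPF-based network monitoring" passes.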

Generating the Messages

import anthropic
import csv
import time
from dotenv import load_dotenv

load_dotenv()

client = anthropic.Anthropic()


def generate_outreach_message(prospect: dict) -> str:
    """Build the prompt for one prospect row and return Claude's draft message."""
    prompt = f"""Write a personalized LinkedIn outreach message for this prospect.

Prospect:
- Name: {prospect['name']}
- Title: {prospect['title']}
- Company: {prospect['company']}
- Profile headline: {prospect['headline']}
- Recent activity: {prospect['recent_activity']}
- Connection context: {prospect['connection_type']}

Rules:
- Maximum 3–4 sentences
- Open with something specific from their recent activity or role — not a generic compliment
- One sentence about your context or reason for reaching out
- One low-pressure ask — a specific question, not "hop on a call"
- Do not use: "I came across your profile", "I was impressed", "synergy", "touch base"
- Write as a peer, not a vendor
- First name only, no formal opener

Return only the message. No subject line, no explanation."""

    response = client.messages.create(
        model="claude-sonnet-4-6",  # or claude-opus-4-8 for more nuanced profiles
        max_tokens=300,
        messages=[{"role": "user", "content": prompt}]
    )

    return response.content[0].text.strip()

The banned phrases list in the prompt isn't defensive programming — it's experience. Without those constraints, Claude will occasionally produce "I came across your profile and was very impressed" even when you've given it specific context. The model reaches for polite formulas when input data is thin. Banning the most common ones forces it to use what you actually provided.
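
The prompt asks the model to avoid those phrases, but nothing verifies that it did. A small post-check can flag drafts that slipped through so you know which ones to rewrite. This is a sketch of my own, not part of the script above; the phrase list simply mirrors the one in the prompt, and the match is a plain case-insensitive substring test.

```python
BANNED_PHRASES = [
    "I came across your profile",
    "I was impressed",
    "synergy",
    "touch base",
]


def find_banned_phrases(message: str) -> list[str]:
    """Return the banned phrases that appear in a draft, ignoring case."""
    lowered = message.lower()
    return [phrase for phrase in BANNED_PHRASES if phrase.lower() in lowered]


draft = "I came across your profile and was very impressed."
print(find_banned_phrases(draft))  # ['I came across your profile']
```

Note that "was very impressed" is not caught by the literal "I was impressed", which is the limitation of substring matching. If variants like that keep showing up, add them to the list rather than reaching for fuzzy matching.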

Does Sentence Count Actually Matter?

Three to four sentences is the sweet spot for LinkedIn messages: long enough to be specific, short enough that someone reads it on a phone during a commute. I've tested longer formats, and they don't get better reply rates. A longer message signals that replying in kind will take more effort, and people avoid that calculus.

That said, your audience changes this. Hiring managers and investors read LinkedIn differently than engineers do. A two-sentence message reads as confident and specific to one group and dismissive to another. Tune the max_tokens and the length instruction for your use case.
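
The length rule is easy to verify after generation. The sketch below counts sentences with a simple regex so you can flag drafts outside your target range. The function names and the default bounds of 2 and 4 are my assumptions; adjust them for your audience.

```python
import re


def count_sentences(message: str) -> int:
    """Rough sentence count: split after ., ! or ? when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", message.strip())
    return len([p for p in parts if p])


def within_length(message: str, min_sentences: int = 2, max_sentences: int = 4) -> bool:
    """True when the draft falls inside the target sentence range."""
    return min_sentences <= count_sentences(message) <= max_sentences


draft = ("Saw your write-up on matrix builds. It fixed a problem I had. "
         "How did you handle artifact retention? Worth a quick exchange?")
print(count_sentences(draft))  # 4
```

This is a heuristic: abbreviations such as "e.g." followed by a space count as sentence ends, so treat a flagged draft as a prompt to look, not as a verdict.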

Batch Processing

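def process_prospects(input_file: str, output_file: str) -> None:
    """Generate a message for every row of input_file and write the results to output_file."""
    with open(input_file, newline='', encoding='utf-8') as f:
        prospects = list(csv.DictReader(f))

    # Guard against an empty input file; prospects[0] further down would raise IndexError.
    if not prospects:
        print(f"No prospects found in {input_file}")
        return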

    print(f"Processing {len(prospects)} prospects...")
    results = []

    for i, prospect in enumerate(prospects):
        print(f"  [{i + 1}/{len(prospects)}] {prospect['name']} — {prospect['company']}")

        try:
            message = generate_outreach_message(prospect)
            results.append({**prospect, 'generated_message': message, 'status': 'ok'})
        except anthropic.RateLimitError:
            # Wait once and retry this row a single time before recording a failure.
            print("    Rate limit hit — waiting 60s")
            time.sleep(60)
            try:
                message = generate_outreach_message(prospect)
                results.append({**prospect, 'generated_message': message, 'status': 'ok'})
            except Exception as e:
                results.append({**prospect, 'generated_message': '', 'status': f'error: {e}'})
        except anthropic.APIError as e:
            print(f"    API error: {e}")
            results.append({**prospect, 'generated_message': '', 'status': f'error: {e}'})

        # Pause between rows (skipped after the last one) to stay under the rate limit.
        if i < len(prospects) - 1:
            time.sleep(1.2)

    fieldnames = list(prospects[0].keys()) + ['generated_message', 'status']

    with open(output_file, 'w', newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(results)

    ok = sum(1 for r in results if r['status'] == 'ok')
    print(f"\nDone. {ok}/{len(prospects)} succeeded → {output_file}")


if __name__ == "__main__":
    process_prospects("prospects.csv", "outreach_messages.csv")

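The 1.2-second sleep caps the loop at 50 requests per minute, which stays under the per-minute request limit on a standard API tier; check the limit for your own tier and lengthen the delay if it is lower. The retry on RateLimitError covers the occasional burst: if you're coming off a previous heavy run or running multiple scripts simultaneously, you might hit the limit even with the sleep. The script waits 60 seconds and retries that row once, and a second failure is recorded as an error. The error path writes a blank message with the reason rather than crashing, so one failed API call doesn't cost you a 200-row run.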
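
If a single 60-second retry turns out to be too blunt, exponential backoff is a common alternative. The helper below is a generic sketch of that technique, not what the script above does: `call_with_backoff` and its parameters are names I made up, and the `sleep` argument is injectable only so the behavior can be checked without actually waiting.

```python
import time


def call_with_backoff(fn, retryable=(Exception,), attempts=4,
                      base_delay=2.0, sleep=time.sleep):
    """Call fn(); on a retryable error wait base_delay * 2**attempt and try again.

    The final failure is re-raised so the caller can record it.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except retryable:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# In the batch loop this could wrap the generation call, for example:
#     message = call_with_backoff(
#         lambda: generate_outreach_message(prospect),
#         retryable=(anthropic.RateLimitError,),
#     )
```

With the defaults, the waits between the four attempts are 2, 4, and 8 seconds. Passing only `anthropic.RateLimitError` as retryable keeps other API errors failing fast, which matches how the original loop treats them.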

Reviewing What Comes Out

Open the output CSV in a spreadsheet. generated_message is your draft; status tells you what needs attention.

The messages worth sending almost always reference something from recent_activity. If a generated message only mentions the person's title or company — nothing from that field — it's not specific enough to send. Either update your input data and re-run that row, or rewrite it by hand.
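
You can pre-sort the review pile with a rough keyword-overlap check between each draft and its recent_activity value. This is a heuristic sketch under my own assumptions: the stopword list, the minimum word length, and the two-hit threshold are all arbitrary starting points, and it does not replace reading the message.

```python
import re

# Assumed stopwords: filler plus verbs that appear in almost every activity note.
STOPWORDS = {"about", "with", "from", "that", "this", "your", "post", "posted",
             "shared", "commented", "published", "article", "write-up"}


def _keywords(text: str) -> set[str]:
    """Lowercased words longer than three characters, minus stopwords."""
    words = re.findall(r"[a-z0-9][a-z0-9\-]*", text.lower())
    return {w for w in words if len(w) > 3 and w not in STOPWORDS}


def references_activity(message: str, recent_activity: str, min_hits: int = 2) -> bool:
    """True when the draft shares at least min_hits keywords with recent_activity."""
    hits = _keywords(message) & _keywords(recent_activity)
    return len(hits) >= min_hits
```

Rows where this returns False are the ones most likely to mention only a title or company. Expect some false negatives when the model paraphrases (for instance "parallelizing" versus "parallelization"), so use it to order your review, not to auto-reject.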

A rough filter before anything goes out: would you be annoyed to receive this if you were them? That question catches about 80% of the duds.

Getting Better Output from the Model

The gap between a mediocre and a good generated message comes down to input quality. Three fields drive specificity:

recent_activity should be a specific observation, not a topic. "Published a post on managing feature drift in production ML pipelines" is usable. "Posted about machine learning" is not.

connection_type should include names where relevant. "Second-degree via James Park" gives Claude something to anchor the opener. "Second-degree connection" doesn't.

One optional field worth adding: your_context — a sentence from your side explaining why you specifically are reaching out. "Building a similar payments infrastructure and ran into the same settlement timing issue they described last month" makes the ask feel earned rather than arbitrary.
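
The prompt shown earlier does not read a your_context column, so adding it to the CSV alone changes nothing. One way to wire it in, sketched under the assumption that the column may be absent or blank for some rows, is to build the prospect section of the prompt with a helper. `build_prospect_block` and the label text for the new line are hypothetical names of mine.

```python
PROSPECT_FIELDS = [
    ("Name", "name"),
    ("Title", "title"),
    ("Company", "company"),
    ("Profile headline", "headline"),
    ("Recent activity", "recent_activity"),
    ("Connection context", "connection_type"),
]


def build_prospect_block(prospect: dict) -> str:
    """Render the 'Prospect:' section, adding your_context only when it is filled in."""
    lines = ["Prospect:"]
    lines += [f"- {label}: {prospect[key]}" for label, key in PROSPECT_FIELDS]
    sender_context = (prospect.get("your_context") or "").strip()
    if sender_context:
        lines.append(f"- Sender's reason for reaching out: {sender_context}")
    return "\n".join(lines)
```

Inside generate_outreach_message, the hand-written "Prospect:" lines in the f-string would then be replaced by `{build_prospect_block(prospect)}`. Rows without the optional column produce the same prompt as before.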

The Part That's Actually Hard

The script runs. The messages generate. Here's what this tutorial glossed over.

Getting recent_activity data for 200 people is manual work. There's no public LinkedIn API for profile activity. You're reading profiles by hand and filling in the column, which takes 3–5 minutes per prospect if you're being serious about it. For a 200-person list, that's 10–17 hours of research before the script runs once. The automation saves you the writing time — it doesn't save you the research time.
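
The arithmetic behind those figures is worth keeping next to your own numbers. The snippet below reproduces the estimates from this article; the function name is mine, and the per-prospect minutes are the article's rough figures, not measurements.

```python
def hours_for(prospects: int, minutes_each: float) -> float:
    """Total hours for a list, given minutes of work per prospect."""
    return round(prospects * minutes_each / 60, 1)


print(hours_for(200, 3))   # 10.0  research, low estimate
print(hours_for(200, 5))   # 16.7  research, high estimate
print(hours_for(200, 10))  # 33.3  drafting by hand at ten minutes each
```

Plug in your real list size and research pace. If research per prospect creeps toward the ten minutes that hand drafting took, most of the saving is gone.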

I've watched people build this kind of tool, run it once on a well-researched list, get decent results, and then realize the ongoing cost of keeping that list fresh is roughly what they expected to save. The ROI depends entirely on how often you're doing outreach and whether you were already doing that research anyway.

The other part: Claude occasionally produces something that's technically accurate but tonally off in a way that's hard to explain. Not wrong — just slightly performative, or missing the register of the person's actual writing style. You catch it when you read the message and think "I wouldn't say it like that." These are the ones to rewrite by hand. It's not a rubber-stamp operation.

Whether this setup pays off depends on your list size and frequency. For one-off campaigns under 30 people, the overhead probably doesn't justify it. For ongoing outreach or anything over 100 contacts, the drafting savings are real. The research question is the one I don't have a clean answer for.