LocoAgent: AI-Powered Social Media Agent with Real Browser Automation

Introduction

Today we're open-sourcing LocoAgent, an AI-powered social media agent that autonomously operates social media accounts through real browser automation. Unlike API-based bots or headless browser scrapers, LocoAgent drives a copy of your actual Chrome profile, with the same cookies, login sessions, and browser fingerprint, so its activity closely resembles manual browsing.

LocoAgent combines an LLM-driven agentic loop with the agent-browser CLI to perceive, decide, and act on live web pages. It can like posts, write replies, follow users, publish content, and execute complex multi-step social media workflows, all autonomously.


Why Real Browser Automation?

Social media platforms actively detect and block headless browsers, API-based automation, and unofficial clients. Traditional bot approaches face constant arms races with anti-automation systems. LocoAgent takes a fundamentally different approach:

Real Chrome profile — Operates through a copy of your actual browser with your existing cookies and login sessions via Chrome DevTools Protocol (CDP)
Same fingerprint — No detectable differences from your normal browsing since it's the same browser engine, extensions, and configuration
No API hacks — No reverse-engineered private APIs that break with every platform update
Visual perception — The agent sees the page the same way you do, using interactive element snapshots with reference IDs

Architecture Overview

LocoAgent's architecture has four core layers (the agentic loop, platform skills, the workflow engine, and the operation log) that work together to enable autonomous social media operation:

User Task / Scheduled Task
    │
    ▼
Agentic Loop (LLM-driven)
    ├── System Prompt + Platform Skills + Operation Log Summary
    ├── Perceive: agent-browser snapshot → interactive elements
    ├── Decide: LLM chooses next action
    └── Act: agent-browser click / fill / navigate
            │
            ├── Workflow Engine (deterministic, no LLM)
            │   └── Scripted browser pipelines
            │
            └── Operation Log (persistent dedup)
                └── Prevents repeated likes, follows, replies

The agentic loop is the brain — it receives a task, perceives the current page state via agent-browser snapshot, decides what to do next using the LLM, and executes actions like clicking, typing, and navigating. The full agent-browser CLI reference is embedded in the system prompt, so the agent knows every command natively.
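
The perceive, decide, act cycle described above can be sketched as a small loop. This is only an illustration under assumptions: the `Action` type, the `LoopDeps` interface, and `runAgenticLoop` are hypothetical names, not LocoAgent's actual code, and the three callbacks stand in for the agent-browser snapshot, the LLM call, and the agent-browser click / fill / navigate commands.

```typescript
// Hypothetical types: the real LocoAgent loop and agent-browser output may differ.
type Action =
  | { kind: "click"; ref: string }
  | { kind: "fill"; ref: string; text: string }
  | { kind: "navigate"; url: string }
  | { kind: "done"; summary: string };

interface LoopDeps {
  // Perceive: would wrap an agent-browser snapshot of interactive elements.
  perceive: () => Promise<string>;
  // Decide: would wrap the LLM call that picks the next action.
  decide: (task: string, snapshot: string, history: Action[]) => Promise<Action>;
  // Act: would wrap agent-browser click / fill / navigate.
  act: (action: Action) => Promise<void>;
}

async function runAgenticLoop(
  task: string,
  deps: LoopDeps,
  maxSteps = 20, // safety cap so a confused model cannot loop forever
): Promise<Action[]> {
  const history: Action[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const snapshot = await deps.perceive();
    const action = await deps.decide(task, snapshot, history);
    history.push(action);
    if (action.kind === "done") break; // the model signals task completion
    await deps.act(action);
  }
  return history;
}
```

Passing the three steps in as callbacks keeps the loop itself free of browser and provider details, which also makes it easy to exercise with stubs.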

Platform Skill System

Skills are the key innovation that makes LocoAgent reliable. Instead of relying on the LLM to figure out how to operate each platform from scratch, skills inject complete operation playbooks into the agent's context. Each skill is a comprehensive manual covering every operation the agent might need.

| Platform | Command | Operations | Capabilities |
|----------|---------|------------|--------------|
| X.com | `/x-com` | 32+ | Browse, engage, post, social graph, profile, navigation, lists |

With the X.com skill loaded, you can give the agent complex composite tasks and it will execute them in a single pass:

# Interactive: load skill then give task
> /x-com open home timeline, like first 3 posts about AI, reply to the best one

# Headless
bun start -p "/x-com like 5 posts about 'large language models', then follow the authors"

Adding New Platforms

The skill system is designed for extensibility. Adding a new platform is as simple as creating a SKILL.md file with the operation playbook:

mkdir -p skills/linkedin

# Create skills/linkedin/SKILL.md with the operations manual
# The skill is auto-discovered at startup and becomes available as /linkedin
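
The naming convention implied here (a `skills/<name>/SKILL.md` file becomes the `/<name>` command) can be sketched as a pure mapping over file paths. The function name `skillCommandsFromPaths` is hypothetical, and the one-directory-per-platform layout is an assumption drawn from the example above; how LocoAgent actually scans the directory is not shown in this post.

```typescript
// Assumption: one directory per platform under skills/, each holding a SKILL.md.
// Returns a map from slash command (e.g. "/linkedin") to the playbook path.
function skillCommandsFromPaths(paths: string[]): Map<string, string> {
  const commands = new Map<string, string>();
  for (const path of paths) {
    const match = /^skills\/([^/]+)\/SKILL\.md$/.exec(path);
    if (match) commands.set(`/${match[1]}`, path);
  }
  return commands;
}

// Files that do not follow the layout (such as skills/README.md) are ignored.
const discovered = skillCommandsFromPaths([
  "skills/x-com/SKILL.md",
  "skills/linkedin/SKILL.md",
  "skills/README.md",
]);
```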

Workflow Engine

Not everything needs LLM intelligence. The workflow engine runs deterministic browser-automation pipelines without any LLM involvement. The agent acts as a supervisor — it can inspect status, start/stop workflows — but the execution itself is pure scripted automation. This means predictable behavior, lower costs, and faster execution.

| Workflow | Schedule | Description |
|----------|----------|-------------|
| HuggingFace Papers Fetcher | Daily | Fetch paper list, abstracts, and thumbnails from HuggingFace |
| HuggingFace → X.com | Daily | Full pipeline: fetch HF papers → download thumbnails → post as tweets |
| X.com Search & Reply | Daemon | Search X.com → read posts → generate AI reply → post reply |
| LinkedIn Search & Comment | Daemon | Search LinkedIn → read posts → generate AI comment → post comment |

# Run a workflow once
bun run workflow run --id hf-papers-to-x

# Run as a daemon (every 3 minutes)
bun run workflow daemon --id x-search-reply --interval 3

# Check status
bun run workflow status

Custom workflows follow a simple contract: accept --config <json>, log to stderr, and output a JSON summary as the last line on stdout. See the workflow development guide for details.
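
As a rough illustration of that contract, the sketch below shows both sides: a workflow that reads `--config`, logs to stderr, and returns a JSON summary to print last, plus a supervisor-side parser that reads the last non-empty stdout line. All function names and the summary fields are hypothetical, and it assumes `--config` carries an inline JSON string rather than a file path; the workflow development guide is the authority on the real details.

```typescript
// Hypothetical helpers illustrating the contract; not LocoAgent's actual code.

// Assumption: --config is followed by an inline JSON string.
function parseConfigArg(argv: string[]): Record<string, unknown> {
  const i = argv.indexOf("--config");
  if (i === -1 || i + 1 >= argv.length) return {};
  return JSON.parse(argv[i + 1]) as Record<string, unknown>;
}

// Workflow side: progress goes to stderr, the summary is returned for stdout.
function runWorkflow(argv: string[]): string {
  const config = parseConfigArg(argv);
  console.error(`starting with config keys: ${Object.keys(config).join(", ")}`);
  const summary = { status: "success", processed: 0 }; // illustrative fields
  return JSON.stringify(summary);
}

// Supervisor side: the summary is the last non-empty line on stdout.
function parseSummary(stdout: string): unknown {
  const lines = stdout
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  if (lines.length === 0) throw new Error("workflow produced no stdout");
  return JSON.parse(lines[lines.length - 1]);
}

// In a real script the final statement would be:
//   console.log(runWorkflow(process.argv.slice(2)));
```

Keeping logs on stderr means the supervisor never has to guess which stdout line is the machine-readable one.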

Operation Log & Deduplication

A recurring problem with autonomous agents is repeated actions: liking the same post twice, following someone you already follow, replying to the same thread again. LocoAgent solves this with an operation log that persists across sessions.

Before every action, the agent checks the log. After every action, it records the result. This creates a reliable memory that prevents duplicate operations without needing a separate database.

# Check before acting (exit 0 = already done, exit 1 = not done)
bun run scripts/log-operation.ts check \
  --platform x --action like --url "https://x.com/.../status/123"

# Record after acting
bun run scripts/log-operation.ts add \
  --platform x --action like --url "https://x.com/.../status/123" \
  --status success --note "AI agents research post"

# 30-day summary (auto-injected into system prompt at startup)
bun run scripts/log-operation.ts summary --days 30
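
The check-then-record pattern behind these commands can be sketched with a small log keyed on platform, action, and URL. This is an assumption-heavy illustration: the `OperationRecord` shape, the `OperationLog` class, and the JSON Lines storage are hypothetical, as is the choice that only successful actions count as done. The actual format used by `scripts/log-operation.ts` may differ.

```typescript
// Hypothetical record shape and JSONL storage; LocoAgent's real log format may differ.
interface OperationRecord {
  platform: string;
  action: string;
  url: string;
  status: "success" | "failed";
  note?: string;
  timestamp: string; // ISO 8601
}

class OperationLog {
  private records: OperationRecord[] = [];

  // Load one JSON object per line, skipping blank lines.
  static fromJsonl(text: string): OperationLog {
    const log = new OperationLog();
    for (const line of text.split("\n")) {
      if (line.trim().length > 0) log.records.push(JSON.parse(line) as OperationRecord);
    }
    return log;
  }

  // Serialize back to one JSON object per line.
  toJsonl(): string {
    return this.records.map((record) => JSON.stringify(record)).join("\n");
  }

  // "check": true when this exact action on this URL already succeeded.
  has(platform: string, action: string, url: string): boolean {
    return this.records.some(
      (r) =>
        r.platform === platform &&
        r.action === action &&
        r.url === url &&
        r.status === "success",
    );
  }

  // "add": record the outcome after acting, stamping the current time.
  add(record: Omit<OperationRecord, "timestamp">): void {
    this.records.push({ ...record, timestamp: new Date().toISOString() });
  }
}
```

An append-only text file like this is enough for deduplication at the scale of a single account, which is consistent with the claim that no separate database is needed.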

Multi-Provider LLM Support

LocoAgent works with any OpenAI-compatible API through a built-in translation shim. The rest of the system is provider-agnostic — switch models without changing any agent code.

| Provider | Base URL | Notes |
|----------|----------|-------|
| OpenRouter | `openrouter.ai/api/v1` | Access 200+ models |
| DeepSeek | `api.deepseek.com` | Thinking mode fully supported |
| OpenAI | `api.openai.com/v1` | GPT-4o, o1, etc. |
| Ollama | `localhost:11434/v1` | Local models |
| Anthropic | (native SDK) | Set ANTHROPIC_API_KEY only |
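
To show why switching providers only requires changing the base URL, key, and model, here is a minimal sketch of building a request for the standard OpenAI-compatible `/chat/completions` endpoint. It is not LocoAgent's translation shim; `buildChatRequest` and `ChatMessage` are hypothetical names used only for illustration.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Builds the URL and fetch options for an OpenAI-compatible chat completion.
// Only baseUrl, apiKey, and model vary between providers.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
) {
  const url = `${baseUrl.replace(/\/+$/, "")}/chat/completions`; // tolerate trailing slashes
  return {
    url,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Usage (not executed here):
//   const { url, init } = buildChatRequest(baseUrl, apiKey, model, messages);
//   const response = await fetch(url, init);
```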

Task Scheduling

For consistent social media presence, LocoAgent supports structured task scheduling. Define daily and weekly tasks in a simple markdown file, set action limits per session, and let the agent execute them automatically.

# persona/tasks.md
## Daily Tasks
1. Engage with relevant content (like posts matching topic queries)
2. Monitor own project mentions
3. Leave 1 technical comment on the most relevant post

## Weekly Tasks (Monday)
4. Follow 3-5 relevant researchers
5. Post 1 original tweet about recent research findings

## Session Constraints
| Action   | Max per session |
|----------|----------------|
| Likes    | 10             |
| Comments | 2              |
| Follows  | 5              |
| Posts    | 1              |

# Execute today's tasks
bun run run-tasks

# Preview the prompt without running
bun run run-tasks:dry
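
One way the session constraints table could be turned into hard limits is sketched below: parse the table rows into a map, then count actions against it. This is an assumption, since LocoAgent may simply pass the limits to the LLM as part of the prompt; `parseSessionLimits` and `SessionBudget` are hypothetical names.

```typescript
// Extracts "| Action | N |" rows; header and separator rows do not match.
function parseSessionLimits(markdown: string): Map<string, number> {
  const limits = new Map<string, number>();
  for (const line of markdown.split("\n")) {
    const match = /^\|\s*([A-Za-z]+)\s*\|\s*(\d+)\s*\|$/.exec(line.trim());
    if (match) limits.set(match[1].toLowerCase(), Number(match[2]));
  }
  return limits;
}

class SessionBudget {
  private limits: Map<string, number>;
  private used = new Map<string, number>();

  constructor(limits: Map<string, number>) {
    this.limits = limits;
  }

  // Returns true and counts the action if it is still within its limit.
  tryConsume(action: string): boolean {
    const limit = this.limits.get(action);
    if (limit === undefined) return true; // assumption: unlisted actions are unlimited
    const count = this.used.get(action) ?? 0;
    if (count >= limit) return false;
    this.used.set(action, count + 1);
    return true;
  }
}
```

A counter like this gives a deterministic backstop even if the model ignores the limits stated in its prompt.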

Realtime Trajectory Monitor

When the agent runs in headless mode, it's a black box. The trajectory monitor watches the session log and prints live execution status so you can see exactly what the agent is doing at every step.

═══ New Task ═══
/x-com open timeline, like first post

[6:30:47 PM] ⚡ Bash: agent-browser connect 9222
[6:30:47 PM] ✓ Result: Done
[6:31:10 PM] ⚡ Bash: agent-browser open https://x.com/home
[6:31:27 PM] ⚡ Bash: agent-browser snapshot -i -c -s 'article'
[6:31:44 PM] ● Agent: Found first post, like button ref=e136
[6:31:44 PM] ⚡ Bash: agent-browser click e136
[6:31:45 PM] ✓ Result: Done

Quick Start

Get LocoAgent running in three steps:

# 1. Clone and install
git clone https://github.com/LocoreMind/locoagent.git
cd locoagent
bun install

# 2. Configure .env
CLAUDE_CODE_USE_OPENAI=1
OPENAI_API_KEY=sk-or-v1-...
OPENAI_BASE_URL=https://openrouter.ai/api/v1
OPENAI_MODEL=anthropic/claude-sonnet-4.5
SKIP_PERMISSIONS=1

# 3. Set up Chrome and run
bun run setup-chrome
bun start

# Or run a single task headless
bun start -p "open X.com and like the first post about AI agents"

Tech Stack

| Component | Technology |
|-----------|------------|
| Runtime | Bun |
| Language | TypeScript (TSX) |
| UI | React + Ink (terminal rendering) |
| CLI | Commander.js |
| Browser Automation | agent-browser + Chrome CDP |
| LLM Integration | Multi-provider (Anthropic SDK + OpenAI-compatible shim) |
| Extension Protocol | MCP (Model Context Protocol) |

Contributing

LocoAgent is open-source under the MIT license and built for the community. Key areas where contributions are welcome:

New platform skills — Add operation playbooks for LinkedIn, Reddit, and other platforms
New workflows — Automated pipelines for content creation and distribution
New tools — Extend the agent's capabilities
Bug fixes — Especially in browser automation edge cases
