Introduction
Today we're open-sourcing LocoAgent, an AI-powered agent that autonomously operates social media accounts through real browser automation. Unlike API-based bots or headless browser scrapers, LocoAgent operates through a copy of your actual Chrome profile (same cookies, same login sessions, same browser fingerprint), so its activity closely resembles manual browsing.
LocoAgent combines an LLM-driven agentic loop with the agent-browser CLI to perceive, decide, and act on live web pages. It can like posts, write replies, follow users, publish content, and execute complex multi-step social media workflows, all autonomously.
Why Real Browser Automation?
Social media platforms actively detect and block headless browsers, API-based automation, and unofficial clients. Traditional bot approaches face constant arms races with anti-automation systems. LocoAgent takes a fundamentally different approach:
- Real Chrome profile — Operates through a copy of your actual browser with your existing cookies and login sessions via Chrome DevTools Protocol (CDP)
- Same fingerprint — No detectable differences from your normal browsing since it's the same browser engine, extensions, and configuration
- No API hacks — No reverse-engineered private APIs that break with every platform update
- Visual perception — The agent sees the page the same way you do, using interactive element snapshots with reference IDs
Architecture Overview
LocoAgent's architecture consists of four core layers that work together to enable autonomous social media operation:
User Task / Scheduled Task
        │
        ▼
Agentic Loop (LLM-driven)
  ├── System Prompt + Platform Skills + Operation Log Summary
  ├── Perceive: agent-browser snapshot → interactive elements
  ├── Decide: LLM chooses next action
  └── Act: agent-browser click / fill / navigate
        │
        ├── Workflow Engine (deterministic, no LLM)
        │     └── Scripted browser pipelines
        │
        └── Operation Log (persistent dedup)
              └── Prevents repeated likes, follows, replies
The agentic loop is the brain — it receives a task, perceives the current page state via agent-browser snapshot, decides what to do next using the LLM, and executes actions like clicking, typing, and navigating. The full agent-browser CLI reference is embedded in the system prompt, so the agent knows every command natively.
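The perceive, decide, act cycle described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not LocoAgent's actual implementation: the `Browser` and `Decider` interfaces, the `Action` type, and the `maxSteps` cap are hypothetical names introduced here. In the real agent, perception and action go through the agent-browser CLI and the decision comes from an LLM call.

```typescript
// Hypothetical types for illustration only; not LocoAgent's real API.
type Action =
  | { kind: "click"; ref: string }
  | { kind: "fill"; ref: string; text: string }
  | { kind: "navigate"; url: string }
  | { kind: "done"; summary: string };

interface Browser {
  // Perceive: return the interactive elements of the page with their ref IDs.
  snapshot(): Promise<string>;
  // Act: perform a click, fill, or navigate on the live page.
  execute(action: Action): Promise<void>;
}

// Stands in for the LLM call: given the task, the current snapshot, and the
// actions taken so far, choose the next action.
type Decider = (task: string, snapshot: string, history: Action[]) => Promise<Action>;

async function runAgentLoop(
  task: string,
  browser: Browser,
  decide: Decider,
  maxSteps = 20, // assumed safety cap so a confused model cannot loop forever
): Promise<{ steps: Action[]; summary: string }> {
  const history: Action[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const snapshot = await browser.snapshot(); // Perceive
    const action = await decide(task, snapshot, history); // Decide
    if (action.kind === "done") {
      return { steps: history, summary: action.summary };
    }
    await browser.execute(action); // Act
    history.push(action);
  }
  return { steps: history, summary: "step limit reached" };
}
```

Keeping the browser and the decider behind small interfaces means the loop can be exercised with a scripted decider and a fake browser, without a model or a running Chrome instance.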
Platform Skill System
Skills are the key innovation that makes LocoAgent reliable. Instead of relying on the LLM to figure out how to operate each platform from scratch, skills inject complete operation playbooks into the agent's context. Each skill is a comprehensive manual covering every operation the agent might need.
| Platform | Command | Operations | Capabilities |
|---|---|---|---|
| X.com | /x-com | 32+ | Browse, engage, post, social graph, profile, navigation, lists |
With the X.com skill loaded, you can give the agent complex composite tasks and it will execute them in a single pass:
# Interactive: load skill then give task
> /x-com open home timeline, like first 3 posts about AI, reply to the best one
# Headless
bun start -p "/x-com like 5 posts about 'large language models', then follow the authors"
Adding New Platforms
The skill system is designed for extensibility. Adding a new platform is as simple as creating a SKILL.md file with the operation playbook:
mkdir -p skills/linkedin
# Create skills/linkedin/SKILL.md with the operations manual
# The skill auto-discovers at startup and becomes available as /linkedin
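To make the naming convention concrete, here is a small sketch of how skill files could map to slash commands. It assumes the directory name becomes the command (as `skills/linkedin` becoming `/linkedin` suggests); the `Skill` type and `discoverSkills` function are hypothetical, and the real discovery presumably scans the filesystem at startup rather than taking a list of paths.

```typescript
// Hypothetical sketch: derive slash commands from skill file paths.
interface Skill {
  command: string; // e.g. "/linkedin"
  path: string; // e.g. "skills/linkedin/SKILL.md"
}

function discoverSkills(paths: string[]): Skill[] {
  // Only skills/<name>/SKILL.md counts; <name> becomes the command.
  const pattern = /^skills\/([a-z0-9-]+)\/SKILL\.md$/;
  const skills: Skill[] = [];
  for (const path of paths) {
    const match = pattern.exec(path);
    if (match) {
      skills.push({ command: `/${match[1]}`, path });
    }
  }
  // Sort for a stable listing order.
  return skills.sort((a, b) => a.command.localeCompare(b.command));
}
```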
Workflow Engine
Not everything needs LLM intelligence. The workflow engine runs deterministic browser-automation pipelines without any LLM involvement. The agent acts as a supervisor: it can inspect status and start or stop workflows, but the execution itself is pure scripted automation. This means predictable behavior, lower costs, and faster execution.
| Workflow | Schedule | Description |
|---|---|---|
| HuggingFace Papers Fetcher | Daily | Fetch paper list, abstracts, and thumbnails from HuggingFace |
| HuggingFace → X.com | Daily | Full pipeline: fetch HF papers → download thumbnails → post as tweets |
| X.com Search & Reply | Daemon | Search X.com → read posts → generate AI reply → post reply |
| LinkedIn Search & Comment | Daemon | Search LinkedIn → read posts → generate AI comment → post comment |
# Run a workflow once
bun run workflow run --id hf-papers-to-x
# Run as a daemon (every 3 minutes)
bun run workflow daemon --id x-search-reply --interval 3
# Check status
bun run workflow status
Custom workflows follow a simple contract: accept --config <json>, log to stderr, and output a JSON summary as the last line on stdout. See the workflow development guide for details.
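A workflow that honors this contract might be shaped like the sketch below. Treat it as an assumption-laden illustration: the `WorkflowIo` and `WorkflowSummary` types, the `items` config field, and the summary fields are hypothetical, since the summary schema is defined in the workflow development guide rather than here. Output streams are injected so the logic can be tested without touching the real process streams.

```typescript
// Hypothetical skeleton for a custom workflow; names and fields are illustrative.
interface WorkflowIo {
  stdout: (line: string) => void; // machine-readable output
  stderr: (line: string) => void; // human-readable logs
}

interface WorkflowSummary {
  ok: boolean;
  processed: number;
  error?: string;
}

// Extract and parse the JSON object passed after --config.
function parseConfig(argv: string[]): Record<string, unknown> {
  const index = argv.indexOf("--config");
  const raw = index === -1 ? undefined : argv[index + 1];
  if (raw === undefined) {
    throw new Error("missing --config <json>");
  }
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("--config must be a JSON object");
  }
  return parsed as Record<string, unknown>;
}

function runWorkflow(argv: string[], io: WorkflowIo): WorkflowSummary {
  let summary: WorkflowSummary;
  try {
    const config = parseConfig(argv);
    // "items" is an assumed config field used only for this example.
    const items = Array.isArray(config["items"]) ? config["items"] : [];
    for (const item of items) {
      io.stderr(`processing ${String(item)}`); // progress logs go to stderr
    }
    summary = { ok: true, processed: items.length };
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    summary = { ok: false, processed: 0, error: message };
  }
  // The JSON summary is the only, and therefore last, line written to stdout.
  io.stdout(JSON.stringify(summary));
  return summary;
}
```

In a real script, `io.stdout` and `io.stderr` would be wired to the process streams and `argv` would come from the command line. Keeping logs off stdout is what lets the caller parse the final line as JSON without filtering.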
Operation Log & Deduplication
A recurring problem with autonomous agents is repeated actions: liking the same post twice, following someone you already follow, replying to the same thread again. LocoAgent solves this with a persistent operation log that carries over between sessions.
Before every action, the agent checks the log. After every action, it records the result. This creates a reliable memory that prevents duplicate operations without needing a separate database.
# Check before acting (exit 0 = already done, exit 1 = not done)
bun run scripts/log-operation.ts check \
--platform x --action like --url "https://x.com/.../status/123"
# Record after acting
bun run scripts/log-operation.ts add \
--platform x --action like --url "https://x.com/.../status/123" \
--status success --note "AI agents research post"
# 30-day summary (auto-injected into system prompt at startup)
bun run scripts/log-operation.ts summary --days 30
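The check-then-record pattern behind these commands can be modeled in a few lines. The sketch below is an in-memory stand-in, not the actual `log-operation.ts` script: the `Operation` shape mirrors the CLI flags shown above, while the `OperationLog` class, the `actOnce` helper, and the choice to let failed attempts be retried are assumptions made for illustration.

```typescript
// Hypothetical in-memory model of the operation log; the real log is persistent.
interface Operation {
  platform: string;
  action: string;
  url: string;
  status: "success" | "failed";
  note?: string;
  timestamp: number; // milliseconds since the Unix epoch
}

class OperationLog {
  private entries: Operation[] = [];

  // True if this action already succeeded on this URL.
  // Assumption: failed attempts do not count as done, so they can be retried.
  check(platform: string, action: string, url: string): boolean {
    return this.entries.some(
      (e) =>
        e.status === "success" &&
        e.platform === platform &&
        e.action === action &&
        e.url === url,
    );
  }

  add(op: Operation): void {
    this.entries.push(op);
  }

  // Count successful operations per action within the last `days` days.
  summary(days: number, now: number): Record<string, number> {
    const cutoff = now - days * 24 * 60 * 60 * 1000;
    const counts: Record<string, number> = {};
    for (const e of this.entries) {
      if (e.status === "success" && e.timestamp >= cutoff) {
        counts[e.action] = (counts[e.action] ?? 0) + 1;
      }
    }
    return counts;
  }
}

// Check before acting, record after acting.
async function actOnce(
  log: OperationLog,
  target: { platform: string; action: string; url: string },
  perform: () => Promise<void>,
  now: () => number = Date.now,
): Promise<"skipped" | "success" | "failed"> {
  if (log.check(target.platform, target.action, target.url)) {
    return "skipped";
  }
  let status: "success" | "failed" = "success";
  try {
    await perform();
  } catch {
    status = "failed";
  }
  log.add({ ...target, status, timestamp: now() });
  return status;
}
```

Keying on platform, action, and URL together means a like and a reply on the same post are tracked independently.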
Multi-Provider LLM Support
LocoAgent works with any OpenAI-compatible API through a built-in translation shim. The rest of the system is provider-agnostic — switch models without changing any agent code.
| Provider | Base URL | Notes |
|---|---|---|
| OpenRouter | openrouter.ai/api/v1 | Access 200+ models |
| DeepSeek | api.deepseek.com | Thinking mode fully supported |
| OpenAI | api.openai.com/v1 | GPT-4o, o1, etc. |
| Ollama | localhost:11434/v1 | Local models |
| Anthropic | (native SDK) | Set ANTHROPIC_API_KEY only |
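As a rough sketch of how provider selection could be driven by environment variables, the function below reads the same variable names used in the Quick Start `.env`. The `ProviderConfig` type and `resolveProvider` function are hypothetical, and the rule that all three `OPENAI_*` values are required is an assumption; LocoAgent's actual shim may apply defaults or different checks.

```typescript
// Hypothetical provider resolution; not LocoAgent's actual shim.
type ProviderConfig =
  | { kind: "anthropic"; apiKey: string }
  | { kind: "openai-compatible"; apiKey: string; baseUrl: string; model: string };

function resolveProvider(env: Record<string, string | undefined>): ProviderConfig {
  if (env["CLAUDE_CODE_USE_OPENAI"] === "1") {
    const apiKey = env["OPENAI_API_KEY"];
    const baseUrl = env["OPENAI_BASE_URL"];
    const model = env["OPENAI_MODEL"];
    // Assumption: all three values are required in OpenAI-compatible mode.
    if (!apiKey || !baseUrl || !model) {
      throw new Error("OPENAI_API_KEY, OPENAI_BASE_URL and OPENAI_MODEL must be set");
    }
    return { kind: "openai-compatible", apiKey, baseUrl, model };
  }
  // Otherwise fall back to the native Anthropic SDK path.
  const apiKey = env["ANTHROPIC_API_KEY"];
  if (!apiKey) {
    throw new Error("ANTHROPIC_API_KEY must be set");
  }
  return { kind: "anthropic", apiKey };
}
```

Returning a tagged union keeps the rest of the code provider-agnostic: callers branch on `kind` once, at the point where the client is constructed.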
Task Scheduling
For consistent social media presence, LocoAgent supports structured task scheduling. Define daily and weekly tasks in a simple markdown file, set action limits per session, and let the agent execute them automatically.
# persona/tasks.md
## Daily Tasks
1. Engage with relevant content (like posts matching topic queries)
2. Monitor own project mentions
3. Leave 1 technical comment on the most relevant post
## Weekly Tasks (Monday)
4. Follow 3-5 relevant researchers
5. Post 1 original tweet about recent research findings
## Session Constraints
| Action | Max per session |
|----------|----------------|
| Likes | 10 |
| Comments | 2 |
| Follows | 5 |
| Posts | 1 |
# Execute today's tasks
bun run run-tasks
# Preview the prompt without running
bun run run-tasks:dry
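One way to think about the session constraints is as a per-session budget that each action must draw from. The sketch below is illustrative only: the `SessionBudget` class and `ActionKind` names are hypothetical, and the source does not say whether LocoAgent enforces the limits in code or through the prompt.

```typescript
// Hypothetical per-session action budget; names are illustrative.
type ActionKind = "like" | "comment" | "follow" | "post";
type Limits = Record<ActionKind, number>;

// Values copied from the Session Constraints table above.
const SESSION_LIMITS: Limits = { like: 10, comment: 2, follow: 5, post: 1 };

class SessionBudget {
  private readonly limits: Limits;
  private readonly used: Limits = { like: 0, comment: 0, follow: 0, post: 0 };

  constructor(limits: Limits = SESSION_LIMITS) {
    this.limits = limits;
  }

  // How many more actions of this kind are allowed in the session.
  remaining(kind: ActionKind): number {
    return Math.max(0, this.limits[kind] - this.used[kind]);
  }

  // Reserve one action of this kind; returns false once the cap is reached.
  tryConsume(kind: ActionKind): boolean {
    if (this.remaining(kind) === 0) return false;
    this.used[kind] += 1;
    return true;
  }
}
```

Calling `tryConsume` before each like, comment, follow, or post gives a hard stop that does not depend on the model counting its own actions.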
Realtime Trajectory Monitor
In headless mode the agent would otherwise be a black box. The trajectory monitor watches the session log and prints live execution status so you can see what the agent is doing at every step.
═══ New Task ═══
/x-com open timeline, like first post
[6:30:47 PM] ⚡ Bash: agent-browser connect 9222
[6:30:47 PM] ✓ Result: Done
[6:31:10 PM] ⚡ Bash: agent-browser open https://x.com/home
[6:31:27 PM] ⚡ Bash: agent-browser snapshot -i -c -s 'article'
[6:31:44 PM] ● Agent: Found first post, like button ref=e136
[6:31:44 PM] ⚡ Bash: agent-browser click e136
[6:31:45 PM] ✓ Result: Done
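The display format above can be reproduced with a small formatter. This sketch assumes a hypothetical `TrajectoryEvent` shape, since the session log's on-disk format is not shown in this post; only the output layout (timestamp, marker, label, text) is taken from the sample.

```typescript
// Hypothetical event shape; the real session log format is not shown here.
type TrajectoryEvent =
  | { type: "tool_call"; tool: string; command: string; at: Date }
  | { type: "tool_result"; result: string; at: Date }
  | { type: "agent_message"; text: string; at: Date };

// Format a Date as h:mm:ss AM/PM in local time, matching the sample output.
function formatClock(at: Date): string {
  const hours24 = at.getHours();
  const hours12 = hours24 % 12 === 0 ? 12 : hours24 % 12;
  const pad = (n: number): string => String(n).padStart(2, "0");
  const suffix = hours24 < 12 ? "AM" : "PM";
  return `${hours12}:${pad(at.getMinutes())}:${pad(at.getSeconds())} ${suffix}`;
}

// Render one event as a single monitor line.
function formatEvent(event: TrajectoryEvent): string {
  const clock = `[${formatClock(event.at)}]`;
  switch (event.type) {
    case "tool_call":
      return `${clock} ⚡ ${event.tool}: ${event.command}`;
    case "tool_result":
      return `${clock} ✓ Result: ${event.result}`;
    case "agent_message":
      return `${clock} ● Agent: ${event.text}`;
  }
}
```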
Quick Start
Get LocoAgent running in three steps:
# 1. Clone and install
git clone https://github.com/LocoreMind/locoagent.git
cd locoagent
bun install
# 2. Configure .env
CLAUDE_CODE_USE_OPENAI=1
OPENAI_API_KEY=sk-or-v1-...
OPENAI_BASE_URL=https://openrouter.ai/api/v1
OPENAI_MODEL=anthropic/claude-sonnet-4.5
SKIP_PERMISSIONS=1
# 3. Setup Chrome and run
bun run setup-chrome
bun start
# Or run a single task headless
bun start -p "open X.com and like the first post about AI agents"
Tech Stack
| Component | Technology |
|---|---|
| Runtime | Bun |
| Language | TypeScript (TSX) |
| UI | React + Ink (terminal rendering) |
| CLI | Commander.js |
| Browser Automation | agent-browser + Chrome CDP |
| LLM Integration | Multi-provider (Anthropic SDK + OpenAI-compatible shim) |
| Extension Protocol | MCP (Model Context Protocol) |
Contributing
LocoAgent is open-source under the MIT license and built for the community. Key areas where contributions are welcome:
- New platform skills — Add operation playbooks for LinkedIn, Reddit, and other platforms
- New workflows — Automated pipelines for content creation and distribution
- New tools — Extend the agent's capabilities
- Bug fixes — Especially in browser automation edge cases