My AI Agent Hit a Login Wall: BrowserAct Let It Ask for Help and Resume
👋 Hey there, Tech Enthusiasts!
I'm Sarvar, a Cloud Architect who loves turning complex tech problems into simple solutions. I've worked with AWS, Azure, DevOps, Data, Analytics, Generative AI, and Agentic AI, building real systems for real companies. In this article series, I'll share what I've learned in a way that's easy to follow, whether you're experienced or just getting started.
Let's get into it! 🚀
I'm a cloud architect. I manage infrastructure across multiple AWS accounts, run CI/CD pipelines, and keep monitoring dashboards healthy for clients. A lot of my day involves checking web-based tools (Grafana, GitHub, vendor portals, internal dashboards), most of which sit behind login walls and anti-bot protection.
My AI agent could already run commands for me, but there was always a gap: it couldn't browse the web. It couldn't check a dashboard, read a protected page, or handle a login flow.
That changed when I integrated BrowserAct into my workflow. It's a browser layer that gives AI agents the ability to browse real websites with anti-detection, session management, and human handoff built in.
If you missed the first article where I covered the full setup, start there: I Gave My AI Agent a Real Browser - Here's What Actually Happened. This article focuses on the headless + human handoff pattern I've been running in production.
A Note on Tooling
I'm using Kiro as my AI agent. It's free during preview and can execute CLI commands directly. But BrowserAct works with anything that can run shell commands: Claude Code, Cursor, Codex, CrewAI, LangChain, or even a simple bash script. The pattern is the same regardless of agent.
The Setup
I run BrowserAct on a Linux server: no desktop, no display, just a terminal. This is how it runs in production for my client: headless on a server, triggered by cron or the agent.
Prerequisites
Before getting started, make sure Python 3.12+, Node.js 18+, and Google Chrome are installed on your system.
Verify Installed Versions
Run the following commands:
python3 --version
# Python 3.12+
node --version
# v18+
google-chrome --version
# Google Chrome 149.x.x.x
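If the checks need to run unattended (from cron, for example), they can be wrapped in a small preflight function. This is only a minimal sketch: `missing_tools` and `preflight` are names I made up for illustration, and the check only confirms that each command is on the PATH, not that its version is new enough.

```shell
#!/bin/sh
# Print the name of every command passed in that is not on the PATH.
# missing_tools is a hypothetical helper, not part of BrowserAct.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Fail early if any prerequisite from the list above is missing.
preflight() {
    missing=$(missing_tools python3 node google-chrome)
    if [ -n "$missing" ]; then
        printf 'Missing prerequisites:\n%s\n' "$missing" >&2
        return 1
    fi
}
```

Calling `preflight || exit 1` at the top of a job stops it before any browser work starts.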
Install uv (If Not Already Installed)
BrowserAct uses Python tooling, and uv is the recommended package manager.
curl -LsSf https://astral.sh/uv/install.sh | sh
Install Google Chrome (If Not Already Installed)
Ubuntu / Debian
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
sudo apt install -y ./google-chrome-stable_current_amd64.deb
Amazon Linux
wget https://dl.google.com/linux/direct/google-chrome-stable_current_x86_64.rpm
sudo yum localinstall -y google-chrome-stable_current_x86_64.rpm
Creating a BrowserAct API Key
To allow your AI agent to control a real browser, you'll need a BrowserAct API key.
Step 1: Sign In
Log in to your BrowserAct account.
Step 2: Open API Key Management
- Click your profile email address in the top-right corner.
- Select API Keys from the dropdown menu.
- Click Manage Keys.
Step 3: Create a New API Key
- Click Create Key.
- Enter a descriptive name such as: Amazon-QMCP-Server-Development
- Click Create.
Step 4: Save the API Key
Copy the generated API key and store it securely. For security reasons, you may not be able to view the complete key again after leaving the page.
Treat your API key like a password. Never share it publicly or commit it to source code repositories.
Configure BrowserAct Authentication
Once you have your API key, authenticate BrowserAct using the following command:
browser-act auth set <your-api-key>
Successful authentication will return:
API key saved.
At this point, BrowserAct is connected and ready to provide browser access to your AI agent. The integration takes less than a minute and requires no additional configuration.
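On a shared server I'd rather not paste the key into a script. One option, sketched below, is to read it from an environment variable and pass it to the same `auth set` command. `BROWSERACT_API_KEY` and `BROWSER_ACT_BIN` are names chosen for this sketch, not settings that BrowserAct itself defines.

```shell
#!/bin/sh
# Authenticate using a key held in an environment variable.
# BROWSERACT_API_KEY is a hypothetical variable name used only in this sketch.
set_auth_from_env() {
    if [ -z "${BROWSERACT_API_KEY:-}" ]; then
        echo "BROWSERACT_API_KEY is not set" >&2
        return 1
    fi
    # BROWSER_ACT_BIN is also hypothetical: it lets the function be
    # dry-run with `echo` in place of the real CLI.
    "${BROWSER_ACT_BIN:-browser-act}" auth set "$BROWSERACT_API_KEY"
}
```

Running it with `BROWSER_ACT_BIN=echo` prints the command that would be executed instead of running it, which is handy for checking a script before it touches a real account.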
One more step before the first session: create a stealth browser instance.
browser-act browser create --type stealth --name "research"
id=101758963005571124 name="research" type=stealth
That id is your browser ID. You'll use it every time you open a session. Think of it like a browser profile: it keeps its own fingerprint, cookies, and anti-detection settings. You create it once and reuse it across sessions.
Note: The browser ID shown in this article (101758963005571124) is from my account. When you run browser create, you'll get your own unique ID. Use that in place of mine throughout the examples.
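To avoid copying the ID by hand, a script can pull it out of the `browser create` output. The sketch below assumes the output is a single line of space-separated `key=value` pairs, as in the example above, and that values contain no spaces. `field_value` is a hypothetical helper.

```shell
#!/bin/sh
# Extract one field from a line such as:
#   id=101758963005571124 name="research" type=stealth
# Usage: field_value <key> <line>
# Assumes space-separated key=value pairs whose values contain no spaces.
field_value() {
    printf '%s\n' "$2" | tr ' ' '\n' | sed -n "s/^$1=//p" | tr -d '"' | head -n 1
}

# Example (not run here):
# BROWSER_ID=$(field_value id "$(browser-act browser create --type stealth --name "research")")
```

With the ID in a variable, later commands can reference `"$BROWSER_ID"` instead of a hard-coded number.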
Managing Sessions
Before starting new sessions, check if any are already running:
browser-act session list
session_name: research-gh
browser_type: stealth
browser_id: 101758963005571124
title: Trending repositories on GitHub today · GitHub
url: https://github.com/trending
session_name: research-hn
browser_type: stealth
browser_id: 101758963005571124
title: news.ycombinator.com
url: https://news.ycombinator.com/
session_name: research-ph
browser_type: stealth
browser_id: 101758963005571124
title: Product Hunt – The best new products in tech.
url: https://www.producthunt.com/
To close a specific session:
browser-act --session research-hn session close
session_name=research-hn closed=true
Tip: Always close sessions when you're done. Open sessions keep the browser running and consume resources. If you hit a "session already in use" error, that session name is still active: either close it or use a different name.
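For cleanup at the end of a job, the two commands above can be combined. The sketch below assumes `session list` reports each session on a line of the form `session_name: <name>`, as in the listing shown earlier. The helper names and the `BROWSER_ACT_BIN` override are my own.

```shell
#!/bin/sh
# Read `browser-act session list` output on stdin and print one session
# name per line. Assumes lines of the form "session_name: <name>".
session_names() {
    sed -n 's/^[[:space:]]*session_name:[[:space:]]*//p'
}

# Close every session that is currently listed.
# BROWSER_ACT_BIN is a hypothetical override used for dry runs.
close_all_sessions() {
    bin=${BROWSER_ACT_BIN:-browser-act}
    "$bin" session list | session_names | while IFS= read -r name; do
        # </dev/null keeps the CLI from consuming the loop's input.
        "$bin" --session "$name" session close </dev/null
    done
}
```

This closes everything, so it only fits jobs that own all the sessions on the machine. If a session has to stay open for later use, close sessions by name instead.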
Real Scenario: Morning Tech Research
One of the things I do for a client is compile a daily tech digest: what's trending, what's launching, what competitors are shipping. It used to take me 30 minutes of tab-switching every morning.
Now my agent does it. Here's what that looks like.
Quick Extract - One Session, One Page
# Open a stealth browser session on the target page
browser-act --session research-hn browser open 101758963005571124 https://news.ycombinator.com
# Get the page state
browser-act --session research-hn state
The agent got back clean, structured content: page title, URL, and all interactive elements. From there it can extract exactly what it needs using JS eval:
browser-act --session research-hn eval 'JSON.stringify(Array.from(document.querySelectorAll(".athing")).slice(0,4).map(el => ({title: el.querySelector(".titleline a")?.textContent, points: el.nextElementSibling?.querySelector(".score")?.textContent})))'
[
{"title": "AI agent bankrupted their operator while trying to scan DN42", "points": "171 points"},
{"title": "Nobody ever gets credit for fixing problems that never happened", "points": "348 points"},
{"title": "If you are asking for human attention, demonstrate human effort", "points": "537 points"},
{"title": "Show HN: Homebrew 6.0.0", "points": "1145 points"}
]
Two commands to open, one to extract. The agent can summarize this, filter by topic, or flag anything relevant to the client.
Where this fits: Any team that needs a daily briefing on tech trends, industry news, or competitor launches. The agent grabs it, and the team reads a summary instead of spending 30 minutes browsing.
Parallel Research - Three Sites at Once
For the full morning digest, the agent opens three parallel sessions on the same browser.
You can use the browser you already created, or create a separate one to keep research isolated from other workflows:
browser-act browser create
# Returns: id=101764340218654773
Then open sessions on it:
# Session 1: GitHub Trending
browser-act --session research-gh browser open 101764340218654773 https://github.com/trending
# Session 2: Hacker News
browser-act --session research-hn browser open 101764340218654773 https://news.ycombinator.com
# Session 3: Product Hunt
browser-act --session research-ph browser open 101764340218654773 https://www.producthunt.com
All three run independently. No conflicts. The agent works through each one:
browser-act session list
session_name: research-gh
browser_type: stealth
browser_id: 101764340218654773
title: Trending repositories on GitHub today · GitHub
url: https://github.com/trending
session_name: research-hn
browser_type: stealth
browser_id: 101764340218654773
title: news.ycombinator.com
url: https://news.ycombinator.com/
session_name: research-ph
browser_type: stealth
browser_id: 101764340218654773
title: Product Hunt – The best new products in tech.
url: https://www.producthunt.com/
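The three `browser open` calls above can also be driven from a list, which makes it easy to add a fourth source later. This is a rough sketch: `open_sessions` and `BROWSER_ACT_BIN` are hypothetical names, and the loop issues the commands one after another rather than concurrently.

```shell
#!/bin/sh
# Open one session per "name url" pair read from stdin.
# Usage: open_sessions <browser-id>
# BROWSER_ACT_BIN is a hypothetical override that allows a dry run with `echo`.
open_sessions() {
    browser_id=$1
    while read -r name url; do
        [ -n "$name" ] || continue   # skip blank lines
        "${BROWSER_ACT_BIN:-browser-act}" --session "$name" browser open "$browser_id" "$url" </dev/null
    done
}

# Example (not run here):
# open_sessions 101764340218654773 <<'EOF'
# research-gh https://github.com/trending
# research-hn https://news.ycombinator.com
# research-ph https://www.producthunt.com
# EOF
```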
Where this fits: Product teams that need multi-source intelligence before standup. Marketing teams tracking launches. DevOps engineers checking status pages across providers. Anything where you'd normally open 5+ tabs.
Structured Data Extraction
Instead of parsing full page HTML, the agent runs targeted JavaScript and gets clean JSON:
browser-act --session research-gh eval "JSON.stringify(Array.from(document.querySelectorAll('article.Box-row')).slice(0,3).map(r => ({repo: r.querySelector('h2 a')?.textContent.trim(), stars: r.querySelector('span.d-inline-block.float-sm-right')?.textContent.trim()})))"
[
{"repo":"iptv-org /\n\n iptv","stars":"2,650 stars today"},
{"repo":"teslamate-org /\n\n teslamate","stars":"35 stars today"},
{"repo":"Panniantong /\n\n Agent-Reach","stars":"1,045 stars today"}
]
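The repo values in that output still carry the page's line breaks and padding. One way to tidy them, sketched below, is to strip whitespace inside the expression itself. The selectors are the same as in the command above; the `replace()` call, the helper names, and `BROWSER_ACT_BIN` are my additions, and I have not verified the result against the live page.

```shell
#!/bin/sh
# Print the extraction expression. It matches the command above except that
# the repo name has all whitespace removed, so a value like
# "iptv-org /\n\n iptv" should come back as "iptv-org/iptv".
trending_expr() {
    cat <<'JS'
JSON.stringify(Array.from(document.querySelectorAll('article.Box-row')).slice(0,3).map(r => ({repo: r.querySelector('h2 a')?.textContent.replace(/\s+/g, ''), stars: r.querySelector('span.d-inline-block.float-sm-right')?.textContent.trim()})))
JS
}

# Run the expression in an existing session.
# BROWSER_ACT_BIN is a hypothetical override used for dry runs.
run_trending() {
    "${BROWSER_ACT_BIN:-browser-act}" --session "$1" eval "$(trending_expr)"
}
```

Keeping the JavaScript in a quoted heredoc also sidesteps the nested-quote juggling that a long one-liner needs.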
The agent then navigated within the same session (Python trending, then TypeScript) without opening a new browser, and took a screenshot for the report. I covered extraction patterns in depth in the previous article.
Where this fits: Competitor monitoring (prices, features, reviews). The agent extracts exactly the data points you need as structured JSON. No scraping framework. No maintenance when the page layout changes. BrowserAct isn't a standalone scraping tool; it's a browser layer. Your AI agent is the brain that decides what to do. BrowserAct is the eyes and hands that execute on the web.
Then the Agent Hits a Wall
Everything was going smoothly. The agent had data from three sources, screenshots saved, research compiling nicely. Then it tried to check my GitHub profile settings:
browser-act --session research-gh navigate https://github.com/settings/profile
Response:
url=https://github.com/login?return_to=https%3A%2F%2Fgithub.com%2Fsettings%2Fprofile
title=Sign in to GitHub · GitHub
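A script can spot this situation by looking at the URL it actually landed on. The sketch below assumes the navigate response contains a `url=<address>` line, as in the output above. The `/login` and `/signin` patterns are illustrative guesses that would need adjusting per site, and `is_login_wall` is a hypothetical helper.

```shell
#!/bin/sh
# Read a navigate response on stdin and return 0 if the final URL looks
# like a login page, 1 otherwise.
# Assumes the response includes a line of the form "url=<address>".
is_login_wall() {
    url=$(sed -n 's/^url=//p' | head -n 1)
    case $url in
        */login|*/login\?*|*/signin|*/signin\?*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (not run here):
# if browser-act --session research-gh navigate https://github.com/settings/profile | is_login_wall; then
#     echo "Login required: pausing for a human" >&2
# fi
```

When the check succeeds, the script can stop and ask for help instead of retrying a page it will never get past on its own.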