How to Avoid IP Bans When Web Scraping in 2026: A Practical Guide

You set up a scraper. The first hundred requests fly through. Then page two hundred returns a 403, page two hundred and one returns a CAPTCHA, and by request three hundred the entire IP range is blacklisted. The job dies, the data is partial, and the dashboard you promised your team is empty.
IP bans are the single most common failure mode in production web scraping, and they have gotten worse. Anti-bot vendors now look at IP reputation, ASN, header consistency, TLS fingerprints, and behavioural timing in combination — a single weak signal is often enough to trigger a block. Getting around them is no longer a matter of "use a proxy and hope." It is a discipline.
This guide walks through what causes IP bans, what does not work anymore, and the eight techniques that reliably keep a scraping pipeline alive in 2026.
Why Websites Ban IPs in the First Place
Bans are a defensive response to traffic that does not match a normal human visitor. The signals that trigger one fall into a few buckets:
Volume and velocity. A single IP hitting two hundred product pages a minute is not a shopper. Rate-based bans are the cheapest detection for a target site to run, which is why they are the most common.
IP reputation. Datacentre ranges from AWS, DigitalOcean, Hetzner, and similar providers are publicly known. So are the IP ranges of cheap proxy resellers. If your traffic comes from one of those, you are flagged before you send a request.
Behavioural patterns. Requests at perfectly even intervals, missing referrers, the same session ID across hundreds of accounts, no mouse movement on a JS-rendered page — modern detection stacks score all of these.
Subnet contamination. Bans rarely target a single IP. If one IP in a /24 block misbehaves, the block goes to a watchlist, and any other IP in the same subnet inherits the suspicion. This is why proxy pools that look large on paper but cluster into a few subnets degrade quickly.
Geographic mismatch. A US retailer receiving "residential" traffic from an IP that geolocates to Vietnam is going to look twice.
The job of an evasion strategy is to keep every one of these signals inside the range a normal user would produce. Below are the techniques that do that.
1. Stop Using Datacentre Proxies for Anything Sensitive
Datacentre proxies are fast and cheap, and that is the entire problem. The IP ranges belong to cloud hosting providers, and major websites know them by heart. Block rates on e-commerce, search engines, social platforms, and travel sites run three to four times higher on datacentre IPs than on residential ones.
Use datacentre proxies for low-sensitivity targets — public APIs, documentation sites, static pages where nobody is paying for anti-bot. For anything that touches a product page, a search result, an ad platform, or a user account, you need residential or mobile IPs.
2. Use Residential Proxies for the Bulk of Your Work
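Residential proxies route through real home internet connections, the kind a normal user gets from a consumer ISP. From the target's perspective, the source IP looks the same as that of a person sitting on their sofa. The rest of the request still has to hold up, which is what the header and timing techniques later in this guide address.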
The technical difference is in the ASN. An ASN (Autonomous System Number) identifies the network operator behind an IP. Residential IPs sit under consumer ISPs like Comcast, BT, Deutsche Telekom, or Verizon. Datacentre IPs sit under hosting ASNs. Anti-bot systems score ASN, and consumer ASNs score well.
SimplyNode operates a network of more than fifty million ethically sourced residential IPs across 180+ countries. The pool is large enough that you are not recycling a small set of subnets across thousands of requests, which is the failure mode of smaller providers — even ones that advertise large numbers.
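One way to sanity-check the subnet clustering described above is to count how many distinct /24 networks a sample of proxy exit IPs actually covers. The following is a minimal sketch using only the standard library; the `subnet_spread` helper and the sample addresses (taken from the reserved documentation ranges, not real proxies) are assumptions made for illustration.

```python
import ipaddress
from collections import Counter


def subnet_spread(ips, prefix=24):
    """Count how many of the given IPv4 addresses fall into each /prefix network."""
    counts = Counter()
    for ip in ips:
        # strict=False accepts a host address and returns its enclosing network
        network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        counts[str(network)] += 1
    return counts


# Hypothetical sample of exit IPs observed from a proxy pool
pool = ["203.0.113.5", "203.0.113.77", "203.0.113.200", "198.51.100.9", "192.0.2.14"]

spread = subnet_spread(pool)
distinct = len(spread)
largest_share = max(spread.values()) / len(pool)
print(f"{len(pool)} IPs across {distinct} /24 subnets; largest subnet holds {largest_share:.0%}")
```

A pool where one or two subnets hold most of the sample is likely to degrade faster than its headline IP count suggests.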
3. Use Mobile Proxies When the Target Is Mobile-Heavy
Social platforms, dating apps, fintech apps, and a growing share of e-commerce traffic now come predominantly from mobile devices. Some sites actively prefer mobile traffic and treat desktop residential IPs as suspicious by comparison.
Mobile proxies route through 3G, 4G, and 5G carrier networks. They carry two structural advantages:
- Carrier-grade NAT means hundreds or thousands of real users share each mobile IP. Banning that IP would ban a lot of real customers, so target sites are reluctant to do it.
- The IP changes naturally as devices reconnect and carriers reassign addresses, so a visitor whose IP shifts during the day mirrors real mobile user behaviour.
Mobile bandwidth costs more per gigabyte than residential because real cellular data is not cheap. Use it where the target explicitly profiles mobile traffic; do not pay the premium where residential would do the job.
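The choice between the three proxy tiers described so far can be written down as a small decision rule. This is a sketch of one possible reading of the guidance, not a definitive policy; the `Target` fields and the `choose_proxy_tier` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    name: str
    has_antibot: bool   # assumption: you know whether the site runs anti-bot protection
    mobile_first: bool  # assumption: you know whether the site profiles mobile traffic


def choose_proxy_tier(target: Target) -> str:
    """Pick the cheapest tier that the guidance above considers safe for the target."""
    if target.mobile_first:
        return "mobile"        # pay the premium only where mobile traffic is profiled
    if target.has_antibot:
        return "residential"   # product pages, search results, accounts
    return "datacentre"        # public APIs, docs, unprotected static pages


print(choose_proxy_tier(Target("docs site", has_antibot=False, mobile_first=False)))
print(choose_proxy_tier(Target("retailer", has_antibot=True, mobile_first=False)))
print(choose_proxy_tier(Target("social app", has_antibot=True, mobile_first=True)))
```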
4. Get Your Rotation Strategy Right
There are two rotation modes, and choosing the wrong one is a common reason scrapers fail even with a good proxy provider behind them.
Rotating sessions give you a fresh IP on every request, or every few requests. This is right for high-volume, stateless work — scraping a million product pages, none of which need login or cart state. Each request looks like a different visitor.
Sticky sessions hold the same IP for a configurable window — usually one to thirty minutes. This is right for anything stateful: logging in, adding to cart, walking through a checkout, scraping a multi-step search funnel. If your IP changes mid-session, the target's session token breaks and you look like a hijacker.
Match the rotation mode to the workflow. A common mistake is using rotating sessions for everything and then wondering why every login fails.
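Matching rotation mode to workflow can be made explicit in code rather than left to a default. The sketch below assumes a provider that selects sticky sessions through a session tag in the proxy username; that `-session-<id>` syntax, the workflow names, and the gateway address are all hypothetical, so check your provider's documentation for the real format.

```python
import secrets

# Assumption: these are the workflow labels your own job scheduler uses
STATEFUL_WORKFLOWS = {"login", "cart", "checkout", "search_funnel"}


def rotation_mode(workflow: str) -> str:
    """Stateful workflows need one IP for the whole session; everything else rotates."""
    return "sticky" if workflow in STATEFUL_WORKFLOWS else "rotating"


def proxy_url(workflow, user, password, gateway, session_id=None):
    """Build a proxy URL. The '-session-<id>' username suffix is a HYPOTHETICAL syntax."""
    if rotation_mode(workflow) == "sticky":
        session_id = session_id or secrets.token_hex(4)
        user = f"{user}-session-{session_id}"
    return f"http://{user}:{password}@{gateway}"


print(proxy_url("product_pages", "user", "pass", "gateway.example:8000"))
print(proxy_url("checkout", "user", "pass", "gateway.example:8000", session_id="a1b2"))
```

Keeping the decision in one function means a login job cannot silently inherit per-request rotation from a bulk job's configuration.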
5. Target the Right Geography (Including City and ASN)
A US-only retailer expects US traffic. A French news site expects French traffic. Sending requests from the wrong country can change the content you see, break pricing tests, or trip a geo-mismatch flag.
Country-level targeting is the baseline. Two more dimensions matter for sensitive targets:
- City-level targeting for local SEO work, ad verification, and any test where the result varies by metro area. If you are checking SERP rankings for "best dentist near me," the IP needs to be in the city you are testing.
- ASN selection when you need to look like traffic from a specific carrier. Large, established consumer ISPs (Verizon, AT&T, BT, Deutsche Telekom) tend to carry more trust than small networks that anti-bot systems have learned to associate with bot farms.
Combining city and ASN gives you the camouflage to say, in effect, "I am a Verizon Fios customer in Queens." That is a much harder profile to flag than "I am an IP somewhere in the United States."
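If your provider exposes metadata for each exit node, combining the three targeting dimensions is a simple filter. The sketch below assumes such metadata is available as plain records; the `Exit` type, the sample pool, and the ASN values (taken from the private-use range as placeholders, not real carrier ASNs) are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exit:
    ip: str
    country: str
    city: str
    asn: int


def matching_exits(pool, country, city=None, asn=None):
    """Country is mandatory; city and ASN narrow the match only when given."""
    return [
        e for e in pool
        if e.country == country
        and (city is None or e.city == city)
        and (asn is None or e.asn == asn)
    ]


# Hypothetical pool: documentation-range IPs and placeholder private-use ASNs
pool = [
    Exit("203.0.113.10", "US", "New York", 64512),
    Exit("203.0.113.11", "US", "Chicago", 64512),
    Exit("198.51.100.20", "US", "New York", 64513),
]

print(len(matching_exits(pool, "US")))
print(matching_exits(pool, "US", city="New York", asn=64512))
```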
6. Fix Your Headers Before You Worry About Your Proxies
A perfect proxy network will not save you if your request headers scream "Python script." The most overlooked source of bans is the request itself.
- User-Agent. Default python-requests/2.x or Go-http-client strings identify you instantly. Use real, current browser user agents and rotate them in line with your IP rotation. An IP that switches but a user agent that does not is itself a tell.
- Accept, Accept-Language, Accept-Encoding. Real browsers send specific values for these. Most scraping libraries send generic defaults, or omit some of them entirely, and that does not match any browser. Set them explicitly.
- Referer. Coming from "nowhere" on page after page is unusual. Set a sensible referer chain.
- TLS fingerprint. Tools like Cloudflare Bot Management look at JA3 and JA4 fingerprints. Standard library HTTP clients have a fingerprint that does not match any real browser. Use a client like curl_cffi with browser impersonation enabled; a stock requests or httpx client will not match a browser fingerprint on its own.
- Header order. Browsers send headers in a specific order. Most HTTP libraries do not preserve order. Some bot management vendors check.
A consistent, browser-realistic header set on a residential IP will outperform a careless header set on a premium mobile IP, every time.
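The header advice above can be sketched as a set of complete browser profiles, chosen as a unit so that the User-Agent and the Accept headers always agree with each other. This is a minimal illustration: the profile values are examples that will go stale and should be refreshed from a current browser, `build_headers` is a hypothetical helper, and nothing here addresses the TLS fingerprint or header order, which depend on the HTTP client.

```python
import random

# Illustrative Chrome-style profiles; version numbers are examples, not current values
BROWSER_PROFILES = [
    {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                      "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.9",
        # Only advertise encodings your HTTP client can actually decode
        "Accept-Encoding": "gzip, deflate, br",
    },
    {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
                      "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en-GB,en;q=0.9",
        "Accept-Encoding": "gzip, deflate, br",
    },
]


def build_headers(referer=None, rng=random):
    """Pick one whole profile (never mix fields across profiles) and add a Referer."""
    headers = dict(rng.choice(BROWSER_PROFILES))  # copy so the profile is not mutated
    if referer:
        headers["Referer"] = referer
    return headers


headers = build_headers(referer="https://www.example.com/category/shoes")
print(headers["User-Agent"])
```

Call `build_headers` once per identity, at the same moment the IP changes, and pass the resulting dict as the headers argument of whichever HTTP client you use.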
7. Throttle, Back Off, and Look Human
Anti-bot detection scores timing. Three rules:
- Add jitter. Do not request every 1.0 seconds exactly. Request every 0.8 to 2.4 seconds, randomly. Even modest randomisation breaks the most basic timing detectors.
- Respect 429 responses. When a target returns "Too Many Requests," it is telling you the threshold. Back off exponentially. Hammering through a 429 is the fastest way to escalate from rate limit to permanent ban.
- Cap retries. Three retries with backoff is fine. Thirty retries on a failing IP is a giant red flag and burns the IP for everyone else in the pool.
If the target is expensive (a search engine, a major retailer), being slower is almost always cheaper than being banned.
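The three timing rules can be sketched together in a few lines. The code below is an assumption-laden illustration rather than a drop-in client: `fetch` stands for any callable of your own that takes a URL and returns an HTTP status code, and the delay ranges simply reuse the figures from the list above.

```python
import random
import time


def jittered_delay(low=0.8, high=2.4, rng=random):
    """Random pause between ordinary requests, instead of a fixed interval."""
    return rng.uniform(low, high)


def fetch_with_backoff(fetch, url, max_retries=3, base=1.0, sleep=time.sleep, rng=random):
    """Call fetch(url); on 429, wait exponentially longer, up to max_retries retries."""
    status = fetch(url)
    for attempt in range(max_retries):
        if status != 429:
            return status
        # Waits of roughly 1s, 2s, 4s, each with up to `base` seconds of jitter
        sleep(base * 2 ** attempt + rng.uniform(0, base))
        status = fetch(url)
    return status  # still 429 here means: stop and retire this IP, do not keep hammering


# Usage with a fake fetch that is rate-limited twice, then succeeds.
# Waits are recorded instead of slept so the example runs instantly.
responses = iter([429, 429, 200])
waits = []
result = fetch_with_backoff(lambda url: next(responses), "https://www.example.com/", sleep=waits.append)
print(result, [round(w, 1) for w in waits])
```

Injecting `sleep` and `fetch` keeps the retry policy testable without touching the network; in production the defaults apply and the function really pauses.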
8. Handle Sessions and Cookies Like a Real User
Within a session, the cookies, headers, and IP should all stay consistent. Across sessions, they should all change together.
A common bug: the scraper rotates IPs every request but reuses the same cookie jar. From the target's perspective, the same "user" (cookie) is teleporting between continents every two seconds. That is a perfect bot signature.
Pair sticky sessions with persistent cookies. When you rotate the IP, rotate the cookies too. Treat each (IP, cookie jar, user agent, header set) tuple as a single identity, and only retire all of it at once.
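Treating the tuple as a single object makes it hard to rotate one part and forget the others. The following is a minimal sketch using the standard library cookie jar; the `Identity` class and the `new_identity` and `retire_and_replace` helpers are hypothetical names, and the proxy URLs and user agent strings are placeholders.

```python
import http.cookiejar
import secrets
from dataclasses import dataclass, field


@dataclass
class Identity:
    """One scraping identity: everything in here is created and retired together."""
    proxy: str
    user_agent: str
    headers: dict
    cookies: http.cookiejar.CookieJar = field(default_factory=http.cookiejar.CookieJar)
    session_id: str = field(default_factory=lambda: secrets.token_hex(8))


def new_identity(proxy, user_agent):
    headers = {"User-Agent": user_agent, "Accept-Language": "en-US,en;q=0.9"}
    return Identity(proxy=proxy, user_agent=user_agent, headers=headers)


def retire_and_replace(old, proxy, user_agent):
    """Drop the old cookies and return a fresh identity; nothing carries over."""
    old.cookies.clear()
    return new_identity(proxy, user_agent)


first = new_identity("http://proxy-a.example:8000", "placeholder-UA-1")
second = retire_and_replace(first, "http://proxy-b.example:8000", "placeholder-UA-2")
print(first.session_id != second.session_id, second.cookies is first.cookies)
```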
Putting It Together: What a Production Setup Looks Like
A scraping pipeline that survives in 2026 looks roughly like this:
- A pool of residential proxies — at least several million IPs, geographically distributed, with city and ASN targeting available.
- Mobile proxies on standby for the subset of targets that demand them.
- Sticky sessions of 5–10 minutes for stateful work; rotating per-request for stateless bulk work.
- A browser-realistic header set, with user agent rotation tied to IP rotation.
- A TLS-fingerprint-correcting HTTP client (curl_cffi, tls-client, or a headless browser when JS rendering is needed).
- Jittered request timing, exponential backoff on 429, capped retries.
- Per-target observability: success rate, average response time, block rate by country and ASN. When the block rate on one geography rises, rotate the strategy before it cascades.
The reason this works is that no single signal carries the load. Each layer absorbs a different category of detection, so a target site has to defeat all of them at once to ban you — and at that point you are indistinguishable from a real, paying customer they do not want to ban.
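The observability item in the list above, block rate by country and ASN, needs very little code to get started. The sketch below is one possible shape: `BlockRateTracker` is a hypothetical class, treating only 403 and 429 as blocks is an assumption (CAPTCHA pages often arrive with a 200 and need content checks), and the thresholds are placeholders to tune per target.

```python
from collections import defaultdict

# Assumption: these status codes count as "blocked" for the target in question
BLOCK_STATUSES = {403, 429}


class BlockRateTracker:
    def __init__(self):
        self._total = defaultdict(int)
        self._blocked = defaultdict(int)

    def record(self, country, asn, status):
        key = (country, asn)
        self._total[key] += 1
        if status in BLOCK_STATUSES:
            self._blocked[key] += 1

    def block_rate(self, country, asn):
        key = (country, asn)
        total = self._total.get(key, 0)
        return self._blocked.get(key, 0) / total if total else 0.0

    def hot_spots(self, threshold=0.2, min_requests=20):
        """(country, asn) pairs with enough traffic and a block rate at or above threshold."""
        return sorted(
            key for key, total in self._total.items()
            if total >= min_requests and self._blocked[key] / total >= threshold
        )


tracker = BlockRateTracker()
for status in [200] * 18 + [403] * 2:
    tracker.record("US", 64512, status)   # placeholder private-use ASN
for status in [200] * 10 + [429] * 10:
    tracker.record("DE", 64513, status)
print(tracker.hot_spots())
```

Checking `hot_spots()` on a schedule gives an early signal to shift geography or ASN before a rising block rate spreads to the rest of the job.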
Where SimplyNode Fits
The techniques above are tool-agnostic, but they assume the underlying proxy network actually delivers what it advertises. A lot of providers do not.
SimplyNode is built around the specific properties this kind of pipeline needs:
- 50M+ ethically sourced residential IPs and a separate mobile pool on real 4G/5G carrier networks.
- City-level and ASN targeting across 180+ countries, so you can match the geographic profile your target expects.
- Rotating and sticky sessions, configurable per request, so you pick the rotation mode each job needs rather than fighting the provider's defaults.
- HTTP, HTTPS, and SOCKS5 support, which means it drops into existing scraping stacks without rewrites.
- Pay-as-you-go pricing with no bandwidth expiry. Buy the gigabytes you need, use them when the work is in front of you, and pay less per gigabyte as volume grows. There are no monthly subscriptions to forfeit and no premium tiers gating features that should be standard.
A scraping operation lives or dies on the proxy layer. The rest of the stack — the parser, the queue, the storage — is solvable engineering. The proxy layer is the part that has to keep working under adversarial conditions, every day, against targets that get smarter every quarter.
Frequently Asked Questions
Are residential proxies legal for web scraping?
Using residential proxies is legal. Scraping itself is legal for publicly available data in most jurisdictions, with well-known limits around personal data, copyrighted content, and terms-of-service violations. Always check the target site's terms and the data protection regime that applies to the data you are collecting.
How many IPs do I actually need?
It depends on volume and target sensitivity. As a rule of thumb: for moderate-volume scraping of a sensitive target, plan for one IP per 50–100 requests before rotation. High-volume jobs against aggressive anti-bot stacks will burn through IPs faster.
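As a worked example of that rule of thumb, the arithmetic for a job can be done in two lines. The job size of 100,000 requests is a hypothetical figure chosen for illustration, and `ips_needed` is a helper name introduced here.

```python
import math


def ips_needed(total_requests, requests_per_ip):
    """Number of IP assignments needed if each IP serves a fixed number of requests."""
    return math.ceil(total_requests / requests_per_ip)


job_size = 100_000                  # hypothetical job
low = ips_needed(job_size, 100)     # generous case: 100 requests per IP
high = ips_needed(job_size, 50)     # cautious case: 50 requests per IP
print(f"Plan for roughly {low} to {high} IP assignments")
```

These are IP assignments rather than unique IPs; a rotating pool may hand the same address back later in the run.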
Residential or mobile: which should I default to?
Residential, unless the target is mobile-first (social platforms, mobile-only apps, some fintech). Mobile costs more and you should only pay the premium where the target actively profiles mobile traffic.
Will a proxy alone get me past Cloudflare or DataDome?
No. A proxy fixes the IP signal. Cloudflare and DataDome also score TLS fingerprint, header order, JS challenges, and behavioural patterns. You need the full stack from this guide (proxy, headers, fingerprint, and timing) to consistently get through.
What is the difference between rotating and sticky sessions?
Rotating sessions give you a new IP on every request (or every few requests) and suit stateless bulk scraping. Sticky sessions hold the same IP for a set window and suit anything that needs login or multi-step state. Most providers, SimplyNode included, let you choose per request.