What Environmental Parameters Does Instagram Detect?

In this data-driven era, there's a common need to extract data from social media giants like Instagram. However, when we attempt to automate this data retrieval process, we often hit an invisible wall – the anti-bot mechanism. How do these systems distinguish between a real human user and a disguised automated bot?

This article will systematically guide you through the process, from initial approach and investigation to analysis and final verification, gradually demystifying Instagram's environmental detection methods.

I. Thinking: If You Were an Instagram Engineer, Where Would You Start?

Consider these three aspects:
  1. Identity Markers: These are the most common identifiers. A real user's browser carries a wealth of information: IP address, HTTP headers, device fingerprints (TLS/SSL fingerprint), with the most basic being the User-Agent string.
  2. Behavioral Patterns: Real humans browse at variable speeds – they scroll, click, and linger on pages. Automated scripts, however, typically execute their predefined tasks and terminate immediately upon completion. For instance, fetching page elements and then instantly closing the tab is a rigid pattern inconsistent with human behavior.
  3. Environmental Fingerprints: The browser environment is extremely complex. Screen resolution, installed fonts, plugin lists, timezone, subtle differences in Canvas rendering... The combination of these factors generates a unique fingerprint for each user.
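
To make the "combination of factors" idea concrete, here is a minimal sketch of how several environment attributes could be folded into one fingerprint value. The attribute names, the sample profiles, and the hashing scheme (an FNV-1a style hash) are assumptions chosen for illustration; nothing here is taken from Instagram's code.

```javascript
// Illustrative sketch: attribute names and hashing scheme are assumptions,
// not Instagram's actual fingerprinting code.

// FNV-1a style 32-bit hash over the string's UTF-16 code units.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

// Serialize attributes in sorted key order so that two environments with the
// same values always produce the same fingerprint.
function environmentFingerprint(env) {
  const canonical = Object.keys(env)
    .sort()
    .map((key) => `${key}=${JSON.stringify(env[key])}`)
    .join('|');
  return fnv1a(canonical);
}

// Hypothetical profiles for demonstration.
const desktopProfile = {
  screen: '1920x1080x24',
  timezoneOffset: -60,
  fonts: ['Arial', 'Helvetica', 'Times New Roman'],
  pluginCount: 5,
};
// Same machine, but reporting an empty plugin list.
const sparseProfile = { ...desktopProfile, pluginCount: 0 };

console.log(environmentFingerprint(desktopProfile));
console.log(environmentFingerprint(sparseProfile));
```

Changing a single attribute changes the resulting value, which is why one inconsistent field can make an otherwise ordinary-looking environment stand out.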

Most people focus solely on the Network tab when building scrapers, but to understand detection logic, front-end scripts are equally crucial. Our goal is to locate the "signal collection and validation" code within the loaded JS.

1. Tool Selection:

Use the Chrome Developer Tools. Shift your focus away from the "Network" tab and towards two more powerful investigative tools:

(1) The "Sources" tab: This contains all script files loaded by the website.
(2) Global Search: Click the three vertical dots in the top right corner of the DevTools window and select "Search" (shortcut: Ctrl+Shift+F, or Cmd+Option+F on macOS). This opens a panel that lets you search across all loaded resources, including every JS file.

2. Keyword Search:

We can locate suspicious "sentinel" scripts by searching for specific keywords. These keywords are common jargon in the anti-bot field; their presence in the code is a strong hint that environmental detection logic is nearby.
Focus the search on these two categories:

A. Automation Indicators (For identifying Selenium / Puppeteer etc.)

  • navigator.webdriver
  • HeadlessChrome
  • __driver_evaluate, __webdriver_evaluate
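
To see what a check built on these keywords might look like, here is a small sketch. The function name `automationIndicators` is hypothetical, and whether the `__driver_evaluate` style properties sit on `document` or on `window` is an assumption, so the sketch looks in both places. It takes a window-like object as a parameter so it can be tried in a browser console (pass `window`) or with a plain object.

```javascript
// Hypothetical helper: returns the names of automation indicators found on a
// window-like object. Not taken from Instagram's code.
function automationIndicators(win) {
  const nav = win.navigator || {};
  const doc = win.document || {};
  const found = [];

  // Browsers controlled through WebDriver report navigator.webdriver === true.
  if (nav.webdriver === true) found.push('navigator.webdriver');

  // Headless Chrome has historically advertised itself in the User-Agent string.
  if (/HeadlessChrome/.test(nav.userAgent || '')) found.push('HeadlessChrome');

  // Properties left behind by some driver builds; the location is an assumption,
  // so both the document and the window object are checked.
  for (const key of ['__driver_evaluate', '__webdriver_evaluate']) {
    if (key in doc || key in win) found.push(key);
  }
  return found;
}

// Example with a hand-built object standing in for an automated browser.
const simulatedBot = {
  navigator: { webdriver: true, userAgent: 'Mozilla/5.0 HeadlessChrome/120.0.0.0' },
  document: { __webdriver_evaluate: function () {} },
};
console.log(automationIndicators(simulatedBot));
```

An empty result only means these particular markers are absent; it says nothing about the other signals a site may collect.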

B. Fingerprinting & Environmental Parameters (For determining "human-likeness")

  • canvas
  • webgl
  • screen.width / screen.height / screen.colorDepth
  • plugins, navigator.plugins
  • platform
  • timezone / Date.getTimezoneOffset()
  • keydown / keyup
  • hardwareConcurrency
  • maxTouchPoints
  • Notification.permission
  • history.length, window.outerWidth / window.outerHeight

A hit on one of these keywords doesn't guarantee that the surrounding code is used to flag bots, but it strongly suggests that "environmental collection/validation logic is present here."
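
If you have saved a script from the Sources tab to disk, the same keyword search can be done in code. The sketch below is one possible approach, not a prescribed tool: `findKeywordHits`, the default keyword list, and the `radius` parameter are all names made up for this example. It reports each hit with a little surrounding text, which is usually enough to judge the context of a match in minified code.

```javascript
// Hypothetical default keyword list, taken from the two categories above.
const DEFAULT_KEYWORDS = [
  'navigator.webdriver',
  'HeadlessChrome',
  '__webdriver_evaluate',
  'hardwareConcurrency',
  'maxTouchPoints',
  'getTimezoneOffset',
  'Notification.permission',
];

// Returns every occurrence of every keyword in `source`, ordered by position,
// with `radius` characters of context on each side.
function findKeywordHits(source, keywords = DEFAULT_KEYWORDS, radius = 40) {
  const hits = [];
  for (const keyword of keywords) {
    let index = source.indexOf(keyword);
    while (index !== -1) {
      hits.push({
        keyword,
        index,
        context: source.slice(Math.max(0, index - radius), index + keyword.length + radius),
      });
      index = source.indexOf(keyword, index + keyword.length);
    }
  }
  return hits.sort((a, b) => a.index - b.index);
}

// Example on a made-up minified fragment.
const fragment = 'var a=navigator.webdriver;var b=navigator.hardwareConcurrency;';
for (const hit of findKeywordHits(fragment)) {
  console.log(hit.index, hit.keyword, hit.context);
}
```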

Don't Read Line by Line, Look for "Contextual Roles"

When we search for these keywords, we won't find clean, readable code but rather heavily minified and obfuscated JavaScript. Our goal isn't to understand every line but to grasp the logical context.

The code snippet below, located with this kind of search, is a typical example of a "signal collection mapping table" (excerpt):

__d(
    'SignalCollectorMap',
    [
        'BDConnectionRTTSignalCollector',
        'BDHeartbeatSignalCollector',
        'BDHeartbeatV2SignalCollector',
        'BDKeyDownUpSignalCollector',
        'BDLanguagesSignalCollector',
        'BDMimeTypeCountSignalCollector',
        'BDMousePresenceSignalCollector',
        'BDNavigatorAppVersionSignalCollector',
        'BDNavigatorHardwareConcurrencySignalCollector',
        'BDNavigatorMaxTouchPointSignalCollector',
        'BDNavigatorNotificationPermissionSignalCollector',
        'BDNavigatorPlatformSignalCollector',
        'BDNavigatorPluginsFileExtensionsSignalCollector',
        'BDNavigatorUserAgentSignalCollector',
        'BDNavigatorVendorSignalCollector',
        'BDNotificationPermissionSignalCollector',
        'BDPluginCountSignalCollector',
        'BDTimezoneOffsetSignalCollector',
        'BDTouchPresenceSignalCollector',
        'BDWebdriverSignalCollector',
        'BDWebglSupportSignalCollector',
        'BDWindowHistoryLengthSignalCollector',
        'BDWindowOuterDimensionSignalCollector',
    ],
    function (a, b, c, d, e, f, g) {
        'use strict'
        a = {
            get: function (a) {
                switch (a) {
                    case 3e4:
                        return c('BDWebdriverSignalCollector').get()
                    case 30001:
                        return c('BDPluginCountSignalCollector').get()
                    case 30002:
                        return c('BDMimeTypeCountSignalCollector').get()
                    case 30003:
                        return c('BDLanguagesSignalCollector').get()
                    case 30004:
                        return c('BDConnectionRTTSignalCollector').get()
                    case 30005:
                        return c('BDWindowOuterDimensionSignalCollector').get()
                    case 30007:
                        return c(
                            'BDNotificationPermissionSignalCollector',
                        ).get()
                    case 30008:
                        return c(
                            'BDNavigatorNotificationPermissionSignalCollector',
                        ).get()
                    case 30012:
                        return c('BDNavigatorVendorSignalCollector').get()
                    case 30013:
                        return c('BDNavigatorAppVersionSignalCollector').get()
                    case 30015:
                        return c('BDNavigatorPlatformSignalCollector').get()
                    case 30018:
                        return c(
                            'BDNavigatorHardwareConcurrencySignalCollector',
                        ).get()
                    case 30019:
                        return c(
                            'BDNavigatorPluginsFileExtensionsSignalCollector',
                        ).get()
                    case 30022:
                        return c('BDWebglSupportSignalCollector').get()
                    case 30040:
                        return c('BDTimezoneOffsetSignalCollector').get()
                    case 30093:
                        return c(
                            'BDNavigatorMaxTouchPointSignalCollector',
                        ).get()
                    case 30094:
                        return c('BDNavigatorUserAgentSignalCollector').get()
                    case 30095:
                        return c('BDWindowHistoryLengthSignalCollector').get()
                    case 30100:
                        return c('BDKeyDownUpSignalCollector').get()
                    case 30106:
                        return c('BDMousePresenceSignalCollector').get()
                    case 30107:
                        return c('BDTouchPresenceSignalCollector').get()
                    case 38e3:
                        return c('BDHeartbeatSignalCollector').get()
                    case 38001:
                        return c('BDHeartbeatV2SignalCollector').get()
                }
            },
        }
        b = a
        g['default'] = b
    },
    98,
)

How to Read This Type of Mapping:

Each BD*SignalCollector represents a type of signal (e.g., webdriver status, plugin count, hardware concurrency, timezone offset, WebGL support, keyboard/mouse/touch presence, window dimensions, heartbeat, etc.).
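
The excerpt shows only the names of the collectors, not their bodies, so what each get() returns is unknown. As a rough mental model, a collector could be as small as the following sketch. Every body here is a guess based on the collector's name and the standard browser property it refers to; `makeCollectors` and the shape of the returned values are hypothetical.

```javascript
// Hypothetical collectors keyed by a short name. Each exposes get(), mirroring
// the interface seen in the excerpt; the bodies are assumptions.
function makeCollectors(win) {
  const nav = win.navigator || {};
  return {
    webdriver: { get: () => nav.webdriver === true },
    pluginCount: { get: () => (nav.plugins ? nav.plugins.length : 0) },
    hardwareConcurrency: { get: () => nav.hardwareConcurrency ?? null },
    maxTouchPoints: { get: () => nav.maxTouchPoints ?? null },
    timezoneOffset: { get: () => new Date().getTimezoneOffset() },
    outerDimension: { get: () => ({ width: win.outerWidth, height: win.outerHeight }) },
  };
}

// Example with a hand-built window-like object; in a browser console you
// could pass `window` instead.
const sampleWindow = {
  navigator: { webdriver: false, plugins: { length: 5 }, hardwareConcurrency: 8, maxTouchPoints: 0 },
  outerWidth: 1280,
  outerHeight: 800,
};
const sampleCollectors = makeCollectors(sampleWindow);
for (const [name, collector] of Object.entries(sampleCollectors)) {
  console.log(name, collector.get());
}
```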

The numbers in the switch (a) statement (e.g., 30001, 30022) can be seen as metric IDs, which are typically dispatched/reported uniformly elsewhere in the code. Note that 3e4 and 38e3 are simply the minifier's shorter spellings of 30000 and 38000.
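
Rather than reading the switch by hand, the ID-to-collector table can be pulled out with a regular expression. The sketch below assumes the code has the same shape as the excerpt (`case <number>: return c('<name>').get()`); `extractMetricMap` is a name invented for this example, and a differently minified build may need a different pattern.

```javascript
// Builds a Map of metric ID -> collector name from source shaped like the
// excerpt. Number() turns exponent spellings such as 3e4 into 30000.
function extractMetricMap(source) {
  const pattern = /case\s+(\d+(?:e\d+)?)\s*:\s*return\s+c\(\s*["']([^"']+)["']\s*,?\s*\)/g;
  const metricMap = new Map();
  for (const match of source.matchAll(pattern)) {
    metricMap.set(Number(match[1]), match[2]);
  }
  return metricMap;
}

// Example on two cases copied from the excerpt, including a multi-line one.
const switchFragment = `
  case 3e4:
      return c('BDWebdriverSignalCollector').get()
  case 30007:
      return c(
          'BDNotificationPermissionSignalCollector',
      ).get()
`;
for (const [id, name] of extractMetricMap(switchFragment)) {
  console.log(id, name);
}
```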

Focusing on the get() implementation and call stack of these Collectors can quickly clarify the flow: "what is collected → when is it collected → where is it sent".
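
One way to answer "what is collected" and "when" is to wrap a collector's get() so that every call is recorded together with its return value and call stack. This is a generic wrapping technique, sketched under the assumption that you already hold a reference to a collector object (for example from the scope of a breakpoint in the Sources tab); `traceCollector` is a hypothetical helper, and nothing here is specific to Instagram's module loader.

```javascript
// Replaces collector.get with a wrapper that records each call.
// Returns the array that the records are appended to.
function traceCollector(name, collector, records = []) {
  const originalGet = collector.get;
  collector.get = function (...args) {
    const value = originalGet.apply(this, args);
    records.push({
      name,
      value,
      at: Date.now(), // when the signal was read
      stack: new Error().stack, // who asked for it
    });
    return value;
  };
  return records;
}

// Example with a stand-in collector object.
const standInCollector = { get: () => 8 };
const callRecords = traceCollector('hardwareConcurrency', standInCollector);
standInCollector.get();
console.log(callRecords.length, callRecords[0].name, callRecords[0].value);
```

The "where is it sent" part is then a matter of following the recorded stack upward to the code that consumes the value.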

Update Time:Sep 05, 2025
