In data collection work, the first step is to determine how the target website renders its pages, because the rendering method dictates the techniques and strategy we use to scrape data. Instagram, one of the world's largest photo and short-video social platforms, makes a useful case study. This article analyzes the page structure and data loading characteristics of Instagram from the perspective of its rendering mechanism.
1. Why Focus on the Rendering Mechanism
Before writing scraping scripts, we typically encounter two common questions:
Does the page source code contain the complete data?
If the target data is present within the HTML source code, then we only need to parse the HTML to extract it.
Does it rely on JavaScript for dynamic rendering?
If the data is loaded asynchronously by the frontend via AJAX or GraphQL requests, then we must simulate these requests or use a browser driver to obtain the data.
Therefore, understanding Instagram's rendering mechanism is crucial for deciding whether to parse HTML directly or analyze interface requests.
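The first question above can be checked quickly before choosing a strategy. The following is a minimal sketch, assuming you have saved the raw page source (the response body, not the DOM shown in the browser inspector) and know a few values that are visible on the rendered page. The function name `find_in_source` and the sample strings are hypothetical.

```python
import html


def find_in_source(page_source: str, visible_values: list[str]) -> dict[str, bool]:
    """Report which values seen on the rendered page also occur in the raw HTML."""
    # Decode entities such as &amp; so that "Sunset &amp; sea" matches "Sunset & sea".
    decoded = html.unescape(page_source)
    return {value: value in decoded for value in visible_values}


# Hypothetical page source: the caption is embedded, the comment is not.
sample_source = "<html><body><h1>Sunset &amp; sea</h1><div id='root'></div></body></html>"
report = find_in_source(sample_source, ["Sunset & sea", "So beautiful"])
print(report)
```

A value that is missing from the raw source but visible in the browser is a strong hint that it arrives later through an asynchronous request.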
2. Web Page Rendering Methods
To better understand Instagram's characteristics, let's first review common rendering patterns:
Server-Side Rendering (SSR)
(1) How it works: The browser sends a request, and the server generates and returns a complete HTML page.
(2) Characteristics:
- The page source contains complete data.
- SEO friendly.
- Very convenient for scrapers to obtain data.
Client-Side Rendering (CSR)
(1) How it works: The browser initially loads only a basic HTML shell. JavaScript then executes API requests to fetch data and dynamically renders it onto the page.
(2) Characteristics:
- The initial HTML is almost empty.
- Data resides in JSON responses from asynchronous requests.
- Scrapers must study interface requests; otherwise, they cannot access the core content.
Hybrid Rendering (SSR + CSR)
(1) How it works: The server renders the basic page framework and part of the data into the HTML, but the core data is still fetched asynchronously by the frontend.
(2) Characteristics:
- Some data might be visible in the source code, but it is incomplete.
- Primary data needs to be obtained from interfaces.
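The three patterns above can be told apart with a rough heuristic: count how many values visible on the rendered page also appear in the raw HTML. The sketch below is only an illustration under that assumption; `classify_rendering` and the `Rendering` enum are hypothetical names, and a real site may need a more careful check (for example, data embedded in inline scripts).

```python
from enum import Enum


class Rendering(Enum):
    SSR = "server-side"
    CSR = "client-side"
    HYBRID = "hybrid"


def classify_rendering(page_source: str, visible_values: list[str]) -> Rendering:
    """Guess the rendering pattern from how many visible values are in the raw HTML."""
    if not visible_values:
        raise ValueError("need at least one value to look for")
    found = sum(1 for value in visible_values if value in page_source)
    if found == len(visible_values):
        return Rendering.SSR      # everything is already in the source
    if found == 0:
        return Rendering.CSR      # the source is only a shell
    return Rendering.HYBRID       # some data is present, the rest loads later


# Hypothetical source that contains the title but not the comment text.
source = "<html><head><title>Post by some_user</title></head><body></body></html>"
print(classify_rendering(source, ["Post by some_user", "So beautiful"]))
```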
3. Instagram's Page Rendering Method
The image below shows a portion of the source code from an Instagram post webpage:
[Image: portion of the page source of an Instagram post]
Comparing the page source with what the rendered page displays, we can see that Instagram pages do contain some server-rendered content, such as certain structural elements and some data. However, this does not mean Instagram relies entirely on backend rendering.
When we open the browser developer tools and watch the network requests, we find that during page load Instagram sends requests to https://www.instagram.com/graphql/query/ to fetch data dynamically, in particular comments, likes, and user profile information.
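One way to review these requests offline is to export the Network panel as a HAR file and filter it. The sketch below assumes the standard HAR layout (`log.entries[].request.url` and `.method`); the function name `graphql_requests` and the sample entries are hypothetical, and the path prefix is taken from the URL mentioned above.

```python
import json
from urllib.parse import urlsplit


def graphql_requests(har_text: str, path_prefix: str = "/graphql/query") -> list[dict]:
    """Return method and URL of every request whose path starts with path_prefix."""
    har = json.loads(har_text)
    matches = []
    for entry in har["log"]["entries"]:
        request = entry["request"]
        if urlsplit(request["url"]).path.startswith(path_prefix):
            matches.append({"method": request["method"], "url": request["url"]})
    return matches


# Hypothetical, heavily trimmed HAR export with two recorded requests.
sample_har = json.dumps({
    "log": {
        "entries": [
            {"request": {"method": "GET", "url": "https://www.instagram.com/static/app.js"}},
            {"request": {"method": "POST", "url": "https://www.instagram.com/graphql/query/"}},
        ]
    }
})
hits = graphql_requests(sample_har)
print(hits)
```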
Example partial comment data:
JSON
{
  "node": {
    "id": "18048006491558991",
    "text": "So beautiful ??",
    "created_at": 1756012879,
    "did_report_as_spam": false,
    "owner": {
      "id": "76494578412",
      "is_verified": false,
      "profile_pic_url": "https://scontent-hkg1-1.cdninstagram.com/v/t51.2885-19/538976363_17847810789546413_6285302922984971743_n.jpg?stp=dst-jpg_s150x150_tt6&efg=eyJ2ZW5jb2RlX3RhZyI6InByb2ZpbGVfcGljLmRqYW5nby4xMDgwLmMyIn0&_nc_ht=scontent-hkg1-1.cdninstagram.com&_nc_cat=105&_nc_oc=Q6cZ2QEWgN3_vvSk2Ja6NzhRBclKubC8Pwy0nlPX_e4uSomV_wpgN7m5As2GeXdhh7UM8j4&_nc_ohc=gM8_RBAZwjUQ7kNvwGsj4PG&_nc_gid=yQbQ1Bx14eaUbMyWrs--uw&edm=ANTKIIoBAAAA&ccb=7-5&oh=00_AfWsG2SUP99sDND6sMn_nQp_188x81cK5GNP18z-x7xHpQ&oe=68B1B133&_nc_sid=d885a2",
      "username": "angela_travelz"
    },
    "viewer_has_liked": false,
    "edge_liked_by": {
      "count": 0
    },
    "is_restricted_pending": false,
    "edge_threaded_comments": {
      "count": 0,
      "page_info": {
        "has_next_page": false,
        "end_cursor": null
      },
      "edges": []
    }
  }
}
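A comment object like the one above can be flattened into a simple record. This is a minimal sketch: the `Comment` dataclass and `parse_comment` are hypothetical names, the sample is trimmed to the fields that are read, and treating `created_at` as a Unix timestamp in seconds is an assumption based on its magnitude.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Comment:
    comment_id: str
    username: str
    text: str
    created_at: datetime
    like_count: int
    reply_count: int


def parse_comment(edge: dict) -> Comment:
    """Flatten one {"node": {...}} comment object into a Comment record."""
    node = edge["node"]
    return Comment(
        comment_id=node["id"],
        username=node["owner"]["username"],
        text=node["text"],
        # Assumption: created_at is a Unix timestamp in seconds (UTC).
        created_at=datetime.fromtimestamp(node["created_at"], tz=timezone.utc),
        like_count=node["edge_liked_by"]["count"],
        reply_count=node["edge_threaded_comments"]["count"],
    )


# Trimmed copy of the sample above, keeping only the fields used here.
sample = json.loads("""
{"node": {"id": "18048006491558991", "text": "So beautiful ??",
          "created_at": 1756012879,
          "owner": {"id": "76494578412", "username": "angela_travelz"},
          "edge_liked_by": {"count": 0},
          "edge_threaded_comments": {"count": 0, "edges": []}}}
""")
comment = parse_comment(sample)
print(comment.username, comment.created_at.isoformat())
```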
4. Analysis and Conclusion
From this, it is evident that Instagram employs a hybrid approach combining server-side rendering with frontend asynchronous loading:
(1) Server-Side Rendered Part: The basic page framework and a small amount of data are output directly within the HTML.
(2) Frontend Asynchronously Loaded Part: Core data (such as comments, user interaction information, etc.) is fetched dynamically after the initial page load via GraphQL interfaces.
For data collection purposes, relying solely on parsing the HTML source code will not yield complete data. We must also analyze the interface requests, their parameters, and the returned JSON data structure to effectively collect dynamic content like comments.
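The `page_info` object in the sample comment data (`has_next_page`, `end_cursor`) suggests cursor-based pagination. The sketch below shows that loop in isolation. It assumes each response can be reduced to an object with `edges` and `page_info`, as in `edge_threaded_comments` above; `fetch_page` is a hypothetical callable that you would implement after analyzing the real request parameters, which this article does not cover. Here it is replaced by an in-memory fake so the example runs without network access.

```python
from typing import Callable, Iterator, Optional


def iter_comment_nodes(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every node, following end_cursor until has_next_page is false."""
    cursor: Optional[str] = None
    while True:
        connection = fetch_page(cursor)
        for edge in connection["edges"]:
            yield edge["node"]
        page_info = connection["page_info"]
        if not page_info["has_next_page"]:
            break
        cursor = page_info["end_cursor"]


# Hypothetical stand-in for the network call: two pages keyed by cursor.
fake_pages = {
    None: {
        "edges": [{"node": {"id": "1"}}, {"node": {"id": "2"}}],
        "page_info": {"has_next_page": True, "end_cursor": "abc"},
    },
    "abc": {
        "edges": [{"node": {"id": "3"}}],
        "page_info": {"has_next_page": False, "end_cursor": None},
    },
}

ids = [node["id"] for node in iter_comment_nodes(lambda cursor: fake_pages[cursor])]
print(ids)
```

Passing the fetch function in as a parameter keeps the pagination logic independent of how the request is actually sent, so the same loop works whether the data comes from a replayed interface request or from a browser driver.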