How to Scrape TikTok Search Data

Ava

TikTok is a global short-video social platform that has attracted sustained attention from millions of users, thanks to its unique algorithm and precise user targeting. Scraping TikTok data enables the quick identification of potential new customers and facilitates the discovery of new marketing opportunities. This article will explain how to scrape TikTok search data and parse the returned information.
First, we need to install the necessary Python libraries.

Shell Copy
pip install requests pandas PyExecJS loguru fake-useragent

Next, we create a TiktokUserSearch class. In its __init__ method, we initialize default request headers, in particular the User-Agent, and set the output file path.

Python Copy
from datetime import datetime
from fake_useragent import UserAgent

class TiktokUserSearch:
  
    def __init__(self, output_file=None, headers=None):
        # 1. Dynamically generate a User-Agent and set a more complete default request header
        try:
            ua = UserAgent()
            default_user_agent = ua.chrome
        except Exception:
            default_user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36"

        default_headers = {
            "User-Agent": default_user_agent,
            "Referer": "https://www.tiktok.com/",
            "Accept-Language": "en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7",
        }
        
        # 2. If the user passes in custom headers, use them to update the defaults
        if headers and isinstance(headers, dict):
            default_headers.update(headers)
        
        self.headers = default_headers
        self.cookies = None  # Left empty here; set later from a cookie string

        # 3. Fall back to a timestamped output filename if none is provided
        self.output_file = output_file if output_file else f'tiktok_videos_{datetime.now().strftime("%Y%m%d_%H%M%S")}.csv'
        
        print(f"The crawler is initialized.User-Agent: {self.headers['User-Agent'][:30]}...")
        print(f"The data will be saved to: {self.output_file}")

A method is needed to convert a cookie string into the dictionary format required by the requests library.

Python Copy
def cookie_str_to_dict(self, cookie_str) -> dict:
    cookie_dict = {}
    cookies = [i.strip() for i in cookie_str.split('; ') if i.strip() != ""]
    for cookie in cookies:
        key, value = cookie.split('=', 1)
        cookie_dict[key] = value
    return cookie_dict
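
For example, a cookie string copied from the browser's developer tools converts like this (the values below are placeholders, not real TikTok cookies):

Python Copy
scraper = TiktokUserSearch()
# Placeholder cookie string; paste the real Cookie header value from your browser
scraper.cookies = scraper.cookie_str_to_dict("msToken=abc123; tt_webid=456")
print(scraper.cookies)  # {'msToken': 'abc123', 'tt_webid': '456'}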

The next steps are to extract key device fingerprints such as msToken from the Cookie and carry them on every request, and, most critically, to call a pre-reverse-engineered JavaScript file with execjs to generate the X-Bogus dynamic signature in real time. Finally, an automatic retry mechanism is added to all network requests to absorb transient network failures, keeping the scraping task robust and its success rate high.
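
Below is a minimal sketch of those three pieces. The file name X-Bogus.js and its exported sign function are assumptions about the reverse-engineered script (the real script and entry point come from your own reverse-engineering work), and the retry helper is simply a loop around requests.get.

Python Copy
import time
import execjs
import requests
from loguru import logger

def get_xbogus(self, params_str):
    # Assumption: the reverse-engineered X-Bogus.js exposes sign(query_string, user_agent)
    with open("X-Bogus.js", "r", encoding="utf-8") as f:
        ctx = execjs.compile(f.read())
    return ctx.call("sign", params_str, self.headers["User-Agent"])

def request_with_retry(self, url, params, max_retries=3):
    # Retry wrapper around requests.get to absorb transient network errors
    for attempt in range(1, max_retries + 1):
        try:
            resp = requests.get(url, params=params, headers=self.headers,
                                cookies=self.cookies, timeout=10)
            resp.raise_for_status()
            return resp.json()
        except Exception as e:
            logger.warning(f"Request failed (attempt {attempt}/{max_retries}): {e}")
            time.sleep(2 * attempt)
    return None
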
Requests are then sent to fetch the search results, and the returned data is parsed and appended to a CSV file.

Python Copy
import os
import pandas as pd

def parse_data(self, data_list):
    # ... Extract the desired fields from each item in data_list into video_data ...
    df = pd.DataFrame(video_data)
    file_exists = os.path.exists(self.output_file)
    df.to_csv(self.output_file, mode='a', header=not file_exists, index=False)
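
For completeness, here is a hedged sketch of how a search request could be assembled and handed to parse_data, using the get_xbogus and request_with_retry helpers sketched earlier. The endpoint path, the parameter names, and the shape of the response are assumptions based on observed TikTok web traffic and may change at any time.

Python Copy
from urllib.parse import urlencode

def search(self, keyword, offset=0):
    # Assumed endpoint and parameter names; confirm them in the browser's network tab
    url = "https://www.tiktok.com/api/search/general/full/"
    params = {
        "keyword": keyword,
        "offset": offset,
        "msToken": self.cookies.get("msToken", ""),
    }
    # Sign the final query string and attach the X-Bogus parameter
    params["X-Bogus"] = self.get_xbogus(urlencode(params))
    data = self.request_with_retry(url, params)
    # Assumed response shape: a "data" list containing the search results
    if data and data.get("data"):
        self.parse_data(data["data"])
    return data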

Define the keywords to scrape and proceed with scraping in a loop.
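
A possible driver loop, assuming the search method sketched above; COOKIE_STR is a placeholder for a cookie string copied from a logged-in browser session.

Python Copy
if __name__ == "__main__":
    COOKIE_STR = "msToken=...; tt_webid=..."  # placeholder: paste your own browser cookie string
    keywords = ["fitness", "skincare", "gadgets"]  # example keywords to search for

    scraper = TiktokUserSearch()
    scraper.cookies = scraper.cookie_str_to_dict(COOKIE_STR)

    for keyword in keywords:
        for page in range(3):  # assumed page size of 10 results per request
            scraper.search(keyword, offset=page * 10)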

Update Time: Feb 03, 2026
