How to Obtain Google Maps Review Data: A Practical Analysis Based on Request Generation and Simulation

JANSON

In the era of big data, review data is an extremely valuable source of information. Google Maps reviews help businesses, researchers, and data analysts understand genuine user feedback on businesses, attractions, hotels, hospitals, and more, so they can make well-informed decisions.
1. The Value of Google Maps Review Data

① Business Analysis: Uncover user satisfaction points and pain points through reviews to optimize services.

② Travel Recommendations: Reviews for hotels, restaurants, and attractions serve as crucial references for user choices.

③ Public Opinion Monitoring: The volume and sentiment of reviews can reflect market trends.

④ Data Mining: Enables sentiment analysis and keyword extraction.

2. Target API Analysis

When accessing a specific location's page on Google Maps, you'll notice that reviews are not loaded all at once but are fetched in batches via asynchronous requests. The core API endpoint for this is:

https://www.google.com/maps/rpc/listugcposts?authuser=0&hl=el&pb=...

The pb parameter here is crucial. It contains necessary information such as the Place ID, pagination token, and request ID. If we can correctly construct this pb string, we can simulate frontend requests to obtain complete review data.

3. Core Parameter Parsing

Through reverse engineering of frontend requests, the URL construction rules can be summarized as follows:

placeID: Extracted from the Google Maps share link, typically found in the !1sxxxx segment.

pageToken: A token used for pagination. It is empty for the first page and returned in the API response for subsequent pages.

pageSize: The number of reviews returned per request, e.g., 20.

requestID: A session request ID, usually a randomly generated string.
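
Since requestID is described only as a randomly generated string, one way to produce it is sketched below. The function name generate_request_id, the alphanumeric alphabet, and the default length of 21 are assumptions made for illustration; the article does not specify what format the endpoint expects.

```python
import secrets
import string

# Assumption: the endpoint only needs an opaque, URL-safe identifier.
# The alphabet and the default length are illustrative choices, not
# values taken from the article.
_ALPHABET = string.ascii_letters + string.digits


def generate_request_id(length: int = 21) -> str:
    """Return a random alphanumeric string to use as requestID."""
    if length <= 0:
        raise ValueError("length must be positive")
    return "".join(secrets.choice(_ALPHABET) for _ in range(length))
```

Restricting the alphabet to letters and digits means the value can be placed in the URL without percent-encoding.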

The pb parameter is built by concatenating the following segments in order:
!1m6!1s{placeID}
!6m4!4m1!1e1!4m1!1e3
!2m2!1i{pageSize}!2s{pageToken}
!5m2!1s{requestID}!7e81
!8m9!2b1!3b1!5b1!7b1
!12m4!1b1!2b1!4m1!1e1!11m0!13m1!1e1
Joining these segments without separators and appending the result as the pb value of the API URL yields a complete request URL.
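
To make the layout of a pb string easier to inspect, the sketch below splits it into (field number, type letter, value) tokens. The helper name split_pb is hypothetical. Reading the type letters as m for a nested message followed by a token count, s for string, i for integer, b for boolean, and e for enum is a common interpretation from reverse engineering, not something Google documents, so treat it as an assumption.

```python
import re
from typing import List, Tuple

# Each token looks like "<field number><type letter><value>", e.g. "1i20".
_TOKEN_RE = re.compile(r"(\d+)([a-z])(.*)", re.DOTALL)


def split_pb(pb: str) -> List[Tuple[int, str, str]]:
    """Split a pb string on '!' into (field, type, value) tuples."""
    tokens = []
    for part in pb.split("!"):
        if not part:
            continue  # skip the empty piece before the leading '!'
        match = _TOKEN_RE.fullmatch(part)
        if match is None:
            raise ValueError(f"Unrecognized pb token: {part!r}")
        tokens.append((int(match.group(1)), match.group(2), match.group(3)))
    return tokens
```

For example, split_pb("!2m2!1i20!2s") yields a message token followed by an integer field holding 20 and an empty string field, which matches the first-page case where pageToken is empty.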

4. Python Implementation: Generating the URL

Python
import re
import urllib.parse

# Method of the scraper class (the class body is not shown in this article).
def _generate_url(self, map_url, page_token, page_size, request_id):
    # The place ID is the value of the "!1s" field in the Maps URL.
    match = re.search(r"!1s([^!]+)", map_url)
    if not match:
        raise ValueError(f"Could not extract place ID from URL: {map_url}")
    # Decode first so an already-encoded ID is not percent-encoded twice.
    raw_place_id = urllib.parse.unquote(match.group(1))
    encoded_place_id = urllib.parse.quote(raw_place_id)
    encoded_page_token = urllib.parse.quote(page_token)
    # Segments follow the order described in section 3.
    pb_components = [
        f"!1m6!1s{encoded_place_id}",
        "!6m4!4m1!1e1!4m1!1e3",
        f"!2m2!1i{page_size}!2s{encoded_page_token}",
        f"!5m2!1s{request_id}!7e81",
        "!8m9!2b1!3b1!5b1!7b1",
        "!12m4!1b1!2b1!4m1!1e1!11m0!13m1!1e1",
    ]
    pb_string = "".join(pb_components)
    return f"https://www.google.com/maps/rpc/listugcposts?authuser=0&hl=el&pb={pb_string}"

5. Pagination Handling

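Python
import json

def extract_next_page_token(data):
    text = data.decode("utf-8", errors="ignore")
    # The response starts with an anti-hijacking prefix that is not valid JSON.
    prefix = ")]}'\n"
    if text.startswith(prefix):
        text = text[len(prefix):]
    try:
        result = json.loads(text)
    except json.JSONDecodeError:
        return ""
    # get_nested_element is a helper that is not shown in this article;
    # here it reads the element at index 1 of the parsed response.
    token = get_nested_element(result, 1)
    # An empty string means there is no further page.
    return token if isinstance(token, str) else ""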
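
The get_nested_element helper called above is not defined in the article. A minimal sketch of what such a helper might look like follows, assuming it walks a chain of list indexes and returns None when any step is missing; the real helper's signature and behavior may differ.

```python
from typing import Any


def get_nested_element(data: Any, *indexes: int) -> Any:
    """Follow a chain of list indexes; return None if any step is missing."""
    current = data
    for index in indexes:
        # Stop early on non-list values or out-of-range indexes.
        if not isinstance(current, list) or not (0 <= index < len(current)):
            return None
        current = current[index]
    return current
```

Returning None instead of raising keeps extract_next_page_token simple, because its isinstance check already maps any non-string result to an empty token.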

6. Simulating Requests
With URL generation and pagination handling in place, we can send the requests themselves.

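Python
import httpx

# Method of the scraper class; self.http_client is expected to be an httpx.Client.
def _fetch_review_page(self, url):
    try:
        resp = self.http_client.get(url, timeout=10)
        # Turn 4xx and 5xx responses into HTTPStatusError.
        resp.raise_for_status()
        return resp.content
    except httpx.RequestError as e:
        # Transport-level failures: DNS errors, timeouts, connection errors.
        raise Exception(f"Fetch error for {url}: {e}") from e
    except httpx.HTTPStatusError as e:
        raise Exception(f"{url}: unexpected status code: {e.response.status_code}") from e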

Ultimately, we obtain the raw review data returned by the API.
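
The three pieces can be tied together in a pagination loop. The sketch below is one possible arrangement and is not taken from the article: iter_review_pages and its parameters are hypothetical names, the max_pages cap is an added safety limit, and the URL builder, fetcher, and token extractor are passed in as callables so the loop stays independent of any particular class.

```python
from typing import Callable, Iterator


def iter_review_pages(
    build_url: Callable[[str], str],
    fetch: Callable[[str], bytes],
    extract_token: Callable[[bytes], str],
    max_pages: int = 50,
) -> Iterator[bytes]:
    """Yield raw response bodies page by page until no token is returned."""
    page_token = ""  # an empty token requests the first page
    seen = set()
    for _ in range(max_pages):
        body = fetch(build_url(page_token))
        yield body
        page_token = extract_token(body)
        # Stop when there is no next page or the token repeats.
        if not page_token or page_token in seen:
            break
        seen.add(page_token)
```

With the functions from this article, build_url could wrap _generate_url with a fixed map URL, page size, and request ID, fetch could be _fetch_review_page, and extract_token could be extract_next_page_token.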

Update Time: Feb 03, 2026
