Formatting Google Maps Scraped Data into JSON

When you scrape Google Maps with Puppeteer, the raw response is hard to read: it is deeply nested and contains a lot of information you do not need. Reorganizing it into clean, labeled JSON makes it far easier to work with. This article covers the advantages of the JSON format and shows how to scrape the data with Puppeteer, convert it, and save it as JSON.

Advantages of the JSON Format
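Good readability: data is stored as named key-value pairs, so the structure is clear and concise and well suited to data exchange.

Abundance of tooling: it is easy for machines to parse and generate, and most mainstream languages provide a JSON parser and serializer.

Lightweight: it is a compact, text-based data interchange format, which keeps payloads small and transmission fast.

Broad compatibility: it is a plain-text format that is language-independent, and mainstream browsers support it natively through JSON.parse and JSON.stringify.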
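
As a brief illustration of these points, the sketch below compares a positional array with the same information as labeled JSON, then round-trips it through JSON.stringify and JSON.parse. The sample values are made up for the example.

```javascript
// Positional data: the meaning of each slot is not visible from the data itself.
const positionalRecord = ['Example Cafe', 128, 'Springfield'];

// The same information as an object with descriptive keys.
const labeledRecord = {
  title: positionalRecord[0],
  reviewsCount: positionalRecord[1],
  city: positionalRecord[2],
};

// Serialize with two-space indentation for readability.
const jsonText = JSON.stringify(labeledRecord, null, 2);
console.log(jsonText);

// Parsing the text gives back an equivalent object.
const restored = JSON.parse(jsonText);
console.log(restored.reviewsCount); // 128
```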

How to Convert to JSON Format?

1. Obtain the raw data (rawData) by reading the response body as text.

const responseText = await response.text();

2. Clean the response text with regex matching and replacement, then parse the result as JSON.

// cleanedText is responseText after the regex cleanup
const rawPostJson = JSON.parse(cleanedText);
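
The article does not show how cleanedText is produced. As a minimal sketch, assuming the response body starts with the )]}' guard prefix that many Google endpoints prepend to JSON (the actual Maps response may need different or additional cleanup), the hypothetical helper cleanResponseText below strips that prefix with a regex before parsing:

```javascript
// Hypothetical helper, not from the original article.
// Assumption: the body begins with the ")]}'" guard prefix followed by a newline.
function cleanResponseText(responseText) {
  // Remove the guard prefix (and any whitespace after it) only at the very start.
  return responseText.replace(/^\)\]\}'\s*/, '');
}

// Synthetic sample standing in for a real response body.
const sampleResponseText = ")]}'\n[null,[\"Example Cafe\"]]";
const cleanedSample = cleanResponseText(sampleResponseText);
const parsedSample = JSON.parse(cleanedSample);

console.log(parsedSample[1][0]); // prints "Example Cafe"
```

Text that does not start with the prefix passes through unchanged, so the helper is safe to apply to every response.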

3. Once the raw data is parsed, pass rawPostJson to the processing function and return the result.

const formattedData = transformRawData(rawPostJson);
console.log(`[${searchQuery}] Task successful, found ${formattedData.length} formatted results.`);

// Return both the raw and the formatted data for this search query
return {
  searchQuery,
  rawData: rawPostJson,
  formattedData,
};

4. Use the transformRawData() function to convert the complex raw data into clear and concise formatted JSON data. The function starts by locating the list of places; the rest of its body follows in steps 5 to 7.

function transformRawData(rawData) {
  // The list of places sits at index 64 of the parsed response
  const placesList = rawData?.[64];

  if (!Array.isArray(placesList)) {
    console.warn("Warning: The expected list of places could not be found in the raw data, so it cannot be converted.");
    return [];
  }

5. Analyze the API response, identify patterns, and locate the exact array position of each field in the raw data. Optional chaining (?.) simplifies access to nested arrays and objects: if any level is missing, the expression evaluates to undefined instead of throwing, and the nullish coalescing operator (??) supplies a default.

  const formattedResults = placesList
    .map((place) => {
      const placeData = place?.[1];
      if (!placeData) return null;

      // Skip entries without a title
      const title = placeData[11];
      if (!title) return null;

      const reviewsCount = placeData?.[4]?.[8] ?? 0;
      const addressInfo = placeData?.[183]?.[1];
      const street = addressInfo?.[0] ?? '';
      const city = addressInfo?.[3] ?? '';
      const state = addressInfo?.[5] ?? '';
      const countryCode = addressInfo?.[6] ?? '';
      const categoryName = placeData?.[13]?.[0] ?? '';
      const placeId = placeData?.[78];

      // title is always set here, so only placeId decides which URL form is used
      const url = placeId
        ? `https://www.google.com/maps/search/?api=1&query=${encodeURIComponent(title)}&query_place_id=${placeId}`
        : `https://www.google.com/maps/search/?api=1&query=${encodeURIComponent(title)}`;

6. Return the extracted information as a clean object.

      return {
        title,
        reviewsCount,
        street,
        city,
        state,
        countryCode,
        categoryName,
        url,
      };

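7. Filter out null values and return the result.

    }) // closes the .map() callback opened in step 5
    .filter(Boolean); // drops the null entries returned for skipped places

  return formattedResults;
} // closes transformRawData()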
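
To see why the function above relies on ?. and ?? rather than plain indexing and ||, consider the following small sketch. The sparse array sampleRecord is synthetic and only mimics the shape of the data; its contents are not taken from a real response.

```javascript
// Synthetic record: index 4 holds a nested array, everything else is missing.
const sampleRecord = [];
sampleRecord[4] = [];
sampleRecord[4][8] = 0; // a place with zero reviews

// Optional chaining returns undefined when a level is missing instead of throwing.
const missingField = sampleRecord?.[13]?.[0];

// ?? only falls back on null or undefined, so a legitimate 0 is preserved.
const reviewsWithNullish = sampleRecord?.[4]?.[8] ?? -1;

// || falls back on any falsy value, so the same 0 would be replaced.
const reviewsWithOr = sampleRecord?.[4]?.[8] || -1;

// Plain indexing throws as soon as an intermediate level is undefined.
let plainAccessFailed = false;
try {
  sampleRecord[13][0];
} catch (error) {
  plainAccessFailed = error instanceof TypeError;
}

console.log(missingField);       // undefined
console.log(reviewsWithNullish); // 0
console.log(reviewsWithOr);      // -1
console.log(plainAccessFailed);  // true
```

This is why a default such as `?? 0` for a review count is safer than `|| 0` when the fallback differs from the falsy value you want to keep.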

Saving the Converted JSON Format
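The scraped data is saved as JSON files with Node's fs module. Because the code awaits fs.writeFile, fs must be the promise-based API (for example, imported from fs/promises). JSON.stringify(data, null, 2) serializes the data with two-space indentation so the files stay readable. The raw data goes to maps_data_all.json and the formatted formattedData results go to formatted_maps_data.json.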

Code Example:

    // Excerpt from the end of the main function; the enclosing try block opens earlier.
    if (allRawData.length > 0) {
      await fs.writeFile('./maps_data_all.json', JSON.stringify(allRawData, null, 2));
      console.log(`Raw data capture complete! A total of ${allRawData.length} raw records have been saved to maps_data_all.json.`);
    } else {
      console.warn("No task returned valid raw data, so maps_data_all.json was not generated.");
    }

    if (allFormattedData.length > 0) {
      await fs.writeFile('./formatted_maps_data.json', JSON.stringify(allFormattedData, null, 2));
      console.log(`Data extraction completed! A total of ${allFormattedData.length} records have been saved to formatted_maps_data.json.`);
    } else {
      console.warn("No task returned valid formatted data, so formatted_maps_data.json was not generated.");
    }
  } catch (error) {
    console.error("A serious error occurred in the main process: ", error);
  } finally {
    // Always close the browser, even if an error occurred
    if (browser) {
      await browser.close();
      console.log("The browser has been closed.");
    }
  }
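
As a self-contained sketch of the saving step, the hypothetical helper saveJson below wraps fs.writeFile from fs/promises and skips empty arrays, mirroring the checks above. The helper name, the sample record, and the temporary output path are assumptions made for illustration and are not part of the original script.

```javascript
const fs = require('fs/promises');
const os = require('os');
const path = require('path');

// Hypothetical helper: writes `records` as pretty-printed JSON.
// Returns true if a file was written, false if there was nothing to save.
async function saveJson(filePath, records) {
  if (!Array.isArray(records) || records.length === 0) {
    console.warn(`No valid data, ${filePath} was not generated.`);
    return false;
  }
  await fs.writeFile(filePath, JSON.stringify(records, null, 2));
  console.log(`Saved ${records.length} records to ${filePath}.`);
  return true;
}

// Usage with synthetic data; the OS temp directory keeps the example
// output away from the project folder.
async function demo(fileName) {
  const outputPath = path.join(os.tmpdir(), fileName);
  const sampleRecords = [{ title: 'Example Cafe', reviewsCount: 0, city: 'Springfield' }];

  const written = await saveJson(outputPath, sampleRecords);
  // Read the file back to confirm it contains valid JSON.
  const roundTrip = JSON.parse(await fs.readFile(outputPath, 'utf8'));
  return { written, roundTrip };
}

demo('formatted_maps_data_example.json').then(({ written, roundTrip }) => {
  console.log(written, roundTrip[0].title);
});
```

Returning a boolean instead of only logging lets the caller decide how to report a run that produced no data.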

Update Time: Sep 05, 2025
