Why does parsing fail when you intercept Google Maps responses with Puppeteer? Because the response data from Google Maps contains a special string that is not valid JSON.
Error: Scraping failed: Unexpected non-whitespace character after JSON at position 32398 (line 1 column 32399)
This error means that JSON.parse() found a non-whitespace character at a position where valid JSON should already have ended, so the intercepted data cannot be parsed as it stands.
Inspecting the response data from the Google Maps network requests shows that the data begins with a special string: )]}'\n (the four characters )]}' followed by a newline).
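A quick way to confirm this is to check the start of the text before parsing it. The sketch below assumes the prefix is exactly the string observed above; `PREFIX`, `hasPrefix`, and the tiny `sample` payload are illustrative names and made-up data, not part of Puppeteer or of any Google API.

```javascript
// The prefix observed at the start of the payload: )]}' followed by a newline.
const PREFIX = ")]}'\n";

// Returns true when the text starts with the prefix (illustrative helper).
function hasPrefix(text) {
  return typeof text === "string" && text.startsWith(PREFIX);
}

// A tiny made-up payload, used only to demonstrate the behavior.
const sample = PREFIX + '[["New York, NY, Starbucks"]]';

let parseFailed = false;
try {
  JSON.parse(sample); // throws, because ")" cannot start a JSON value
} catch (err) {
  parseFailed = err instanceof SyntaxError;
}

console.log(hasPrefix(sample)); // true
console.log(parseFailed);       // true
```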
Why is this special string present?
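This is a security measure used by Google to prevent "JSON Hijacking" attacks. Because the prefix makes the response invalid as both JSON and JavaScript, a malicious page that loads the URL through a script tag cannot execute the response and read its contents. This helps protect users' personal information on Google Maps, such as searched locations or restaurants, from being stolen.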
How to handle this issue?
First, obtain the complete response data as text. It will look something like:
"d": ")]}'\n[[\"New York, NY, Starbucks\",[[null,null,null,null,null,null,null,null,\"KEegaMXvFYG5vr0P19elmAs\",\"0ahUKEwjFo56u-o6PAxWB...
Next, before parsing, remove this prefix with dataString.replace(")]}'\n", "").
Finally, parse the remaining, valid JSON data using JSON.parse().
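For the first step, one possible way to get the response body as text is a Puppeteer `response` listener. This is a minimal sketch: `collectResponseText`, `urlFragment`, and `onText` are hypothetical names, and the URL filter is an assumption that you would adjust to the request you actually see in the network panel.

```javascript
// Attaches a response listener to a Puppeteer Page supplied by the caller.
// `urlFragment` is a hypothetical filter for the request of interest.
function collectResponseText(page, urlFragment, onText) {
  page.on("response", async (response) => {
    // Ignore every response whose URL does not contain the fragment.
    if (!response.url().includes(urlFragment)) return;
    try {
      const text = await response.text(); // full body as a string
      onText(text);
    } catch (err) {
      // Some responses (for example redirects) have no readable body; skip them.
    }
  });
}
```

With a real page, this might be registered before navigation, for example `collectResponseText(page, "search", handleText)` followed by `page.goto(...)`, where `handleText` is your own function that strips the prefix and parses the result.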
Code Steps:
1. First, parse the outer block (a JSON string) into a JavaScript object.
javascript
const outerJson = JSON.parse(jsonBlock);
2. Extract the value of the "d" key and remove the special string )]}'\n.
javascript
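// After the outer parse, the "\n" in the payload is a real newline character.
const dataString = outerJson.d;
// With a string pattern, replace() removes only the first match: the leading prefix.
const cleanedText = dataString.replace(")]}'\n", "");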
3. Parse the cleaned JSON data.
javascript
const rawPostJson = JSON.parse(cleanedText);
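The three steps above can be combined into one function. This is a sketch under the assumption that the block has the shape shown in the sample (an outer object whose "d" key holds the prefixed string); `parseBlock`, `RESPONSE_PREFIX`, and the example input are illustrative. It strips the prefix only when it sits at the very start of the string, which is slightly stricter than replace().

```javascript
const RESPONSE_PREFIX = ")]}'\n";

// Parses a block shaped like {"d": ")]}'\n[...]"} and returns the inner data.
function parseBlock(jsonBlock) {
  // Step 1: parse the outer JSON string.
  const outerJson = JSON.parse(jsonBlock);
  const dataString = outerJson.d;
  if (typeof dataString !== "string") {
    throw new TypeError('Expected a string under the "d" key');
  }
  // Step 2: strip the prefix only if it is at the start of the string.
  const cleanedText = dataString.startsWith(RESPONSE_PREFIX)
    ? dataString.slice(RESPONSE_PREFIX.length)
    : dataString;
  // Step 3: parse the cleaned text.
  return JSON.parse(cleanedText);
}

// Made-up input for illustration; real responses are far larger.
const example = JSON.stringify({
  d: RESPONSE_PREFIX + '[["New York, NY, Starbucks"]]',
});
console.log(parseBlock(example)[0][0]); // New York, NY, Starbucks
```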