Describe the bug
Running snscrape --jsonl --max-results 10 instagram-hashtag oreo fails with the following error:
2023-05-12 10:36:36.448 CRITICAL snscrape._cli Dumped stack and locals to /tmp/snscrape_locals_wgtjomc1
Traceback (most recent call last):
  File "/usr/local/bin/snscrape", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/site-packages/snscrape/_cli.py", line 320, in main
    for i, item in enumerate(scraper.get_items(), start = 1):
  File "/usr/local/lib/python3.8/site-packages/snscrape/modules/instagram.py", line 109, in get_items
    r = self._initial_page()
  File "/usr/local/lib/python3.8/site-packages/snscrape/modules/instagram.py", line 77, in _initial_page
    r = self._get(self._initialUrl, headers = self._headers, responseOkCallback = self._check_initial_page_callback)
  File "/usr/local/lib/python3.8/site-packages/snscrape/base.py", line 266, in _get
    return self._request('GET', *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/snscrape/base.py", line 237, in _request
    success, msg = responseOkCallback(r)
  File "/usr/local/lib/python3.8/site-packages/snscrape/modules/instagram.py", line 88, in _check_initial_page_callback
    jsonData = r.text.split('<script type="text/javascript">window._sharedData = ')[1].split(';</script>')[0] # May throw an IndexError if Instagram changes something again; we just let that bubble.
IndexError: list index out of range
The same behaviour also occurs with instagram-user and instagram-location.
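To illustrate the failing line, here is a minimal sketch of how the window._sharedData extraction could be guarded so that a missing marker is reported instead of raising IndexError. The function name extract_shared_data and the two sample pages are hypothetical and not part of snscrape. It also assumes that the page returned for the failing request no longer contains the window._sharedData script tag, which the IndexError suggests but the log does not show directly.

```python
import json
from typing import Optional

# Marker strings copied from the line quoted in the traceback.
START_MARKER = '<script type="text/javascript">window._sharedData = '
END_MARKER = ';</script>'


def extract_shared_data(html: str) -> Optional[dict]:
    """Hypothetical helper: return the parsed window._sharedData object,
    or None if either marker is absent from the page."""
    # str.partition never raises; the middle element is '' when the
    # separator is not found, unlike split(...)[1], which raises IndexError.
    _, start_found, rest = html.partition(START_MARKER)
    if not start_found:
        return None
    payload, end_found, _ = rest.partition(END_MARKER)
    if not end_found:
        return None
    return json.loads(payload)


# Made-up sample pages: one in the layout the scraper expects, one without the marker.
old_page = '<html>' + START_MARKER + '{"entry_data": {}}' + END_MARKER + '</html>'
new_page = '<html><body>no shared data here</body></html>'

print(extract_shared_data(old_page))  # {'entry_data': {}}
print(extract_shared_data(new_page))  # None
```

With the second sample page, new_page.split(START_MARKER)[1] raises the same IndexError as in the traceback, because split returns a one-element list when the separator is absent.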
How to reproduce
Run the command:
snscrape --jsonl --max-results 10 instagram-hashtag oreo
Expected behaviour
One JSON object per item found on the page.
Screenshots and recordings
No response
Operating system
CentOS 8
Python version: output of
python3 --version
Python 3.8.12
snscrape version: output of
snscrape --version
snscrape 0.6.2.20230321.dev13+g786815d
Scraper
snscrape --jsonl --max-results 10 instagram-hashtag oreo
How are you using snscrape?
CLI (snscrape ... as a command, e.g. in a terminal)
Backtrace
No response
Log output
2023-05-12 10:44:46.688 INFO snscrape.modules.instagram Retrieving initial data
2023-05-12 10:44:46.690 INFO snscrape.base Retrieving https://www.instagram.com/explore/tags/oreo/
2023-05-12 10:44:46.690 DEBUG snscrape.base ... with headers: {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
2023-05-12 10:44:46.690 DEBUG snscrape.base ... with environmentSettings: {'proxies': OrderedDict(), 'stream': False, 'verify': True, 'cert': None}
2023-05-12 10:44:46.690 DEBUG urllib3.connectionpool Starting new HTTPS connection (1): www.instagram.com:443
2023-05-12 10:44:46.755 DEBUG snscrape.base Connected to: ('31.13.86.174', 443)
2023-05-12 10:44:46.755 DEBUG snscrape.base Connection cipher: ('ECDHE-RSA-AES128-GCM-SHA256', 'TLSv1/SSLv3', 128)
2023-05-12 10:44:47.697 DEBUG urllib3.connectionpool https://www.instagram.com:443 "GET /explore/tags/oreo/ HTTP/1.1" 200 None
2023-05-12 10:44:47.832 INFO snscrape.base Retrieved https://www.instagram.com/explore/tags/oreo/: 200
2023-05-12 10:44:47.832 DEBUG snscrape.base ... with response headers: {'Vary': 'Accept-Encoding', 'Content-Encoding': 'gzip', 'critical-ch': 'Sec-CH-UA-Model', 'accept-ch-lifetime': '4838400', 'accept-ch': 'viewport-width,Sec-CH-Prefers-Color-Scheme,Sec-CH-UA-Full-Version-List,Sec-CH-UA-Platform-Version', 'reporting-endpoints': 'coep_report="https://www.facebook.com/browser_reporting/?minimize=0", default="https://www.instagram.com/error/ig_web_error_reports/?device_level=unknown"', 'report-to': '{"max_age":86400,"endpoints":[{"url":"https:\/\/www.facebook.com\/browser_reporting\/?minimize=0"}],"group":"coep_report"}, {"max_age":259200,"endpoints":[{"url":"https:\/\/www.instagram.com\/error\/ig_web_error_reports\/?device_level=unknown"}]}', 'content-security-policy-report-only': "default-src *.facebook.com *.fbcdn.net *.instagram.com data: blob:;script-src *.facebook.com *.fbcdn.net *.facebook.net 'unsafe-inline' 'unsafe-eval' blob: data: 'self' *.instagram.com static.cdninstagram.com;style-src data: blob: 'unsafe-inline' *.fbcdn.net *.facebook.com *.instagram.com static.cdninstagram.com;connect-src *.facebook.com facebook.com .fbcdn.net .facebook.net wss://.facebook.com: blob: .instagram.com .cdninstagram.com wss://.instagram.com: 'self' wss://edge-chat.instagram.com connect.facebook.net;font-src *.facebook.com data: *.fbcdn.net *.instagram.com static.cdninstagram.com *.intern.facebook.com;img-src *.instagram.com *.facebook.com *.fbcdn.net data: blob: *.cdninstagram.com *.fbsbx.com android-webview-video-poster:;media-src *.facebook.com *.fbcdn.net *.instagram.com *.cdninstagram.com cdn.fbsbx.com data: blob:;frame-src *.instagram.com *.facebook.com *.fbsbx.com fbsbx.com data:;block-all-mixed-content;report-uri https://www.facebook.com/csp/reporting/?m=c&minimize=0;", 'content-security-policy': "default-src *.facebook.com *.fbcdn.net *.instagram.com data: blob:;script-src *.facebook.com *.fbcdn.net *.facebook.net 'unsafe-inline' 'unsafe-eval' blob: data: 'self' 
*.instagram.com static.cdninstagram.com;style-src data: blob: 'unsafe-inline' *.fbcdn.net *.facebook.com *.instagram.com static.cdninstagram.com;connect-src *.facebook.com facebook.com .fbcdn.net .facebook.net wss://.facebook.com: blob: .instagram.com .cdninstagram.com wss://.instagram.com: 'self' wss://edge-chat.instagram.com connect.facebook.net;font-src *.facebook.com data: *.fbcdn.net *.instagram.com static.cdninstagram.com *.intern.facebook.com;img-src *.instagram.com *.facebook.com *.fbcdn.net data: blob: *.cdninstagram.com *.fbsbx.com android-webview-video-poster: *.whatsapp.net;media-src *.facebook.com *.fbcdn.net *.instagram.com *.cdninstagram.com cdn.fbsbx.com data: blob:;frame-src *.instagram.com *.facebook.com *.fbsbx.com fbsbx.com data:;block-all-mixed-content;upgrade-insecure-requests;report-uri https://www.facebook.com/csp/reporting/?m=c&minimize=0;", 'document-policy': 'force-load-at-top', 'permissions-policy': 'accelerometer=()', 'cross-origin-resource-policy': 'rollout', 'cross-origin-embedder-policy-report-only': 'require-corp;report-to="coep_report"', 'cross-origin-opener-policy': 'same-origin-allow-popups', 'Pragma': 'no-cache', 'Cache-Control': 'private, no-cache, no-store, must-revalidate', 'Expires': 'Sat, 01 Jan 2000 00:00:00 GMT', 'X-Content-Type-Options': 'nosniff', 'X-XSS-Protection': '0', 'X-Frame-Options': 'DENY', 'Strict-Transport-Security': 'max-age=15552000', 'Content-Type': 'text/html; charset="utf-8"', 'X-FB-Debug': '+eZyYqxs96slB6NBTlraJ0h9YS5OXMvuy+/L90+PskOEK8vZ+3YgAbE2msYaoJGl2cgSJYgRhPb0v9P9kWMIfA==', 'Date': 'Fri, 12 May 2023 08:44:47 GMT', 'X-FB-TRIP-ID': '1679558926', 'Alt-Svc': 'h3=":443"; ma=86400', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive'}
Dump of locals
No response
Additional context
No response