Weibo: "User does not exist" when using --name on certain accounts #444

Open
TheTechRobo opened this issue Apr 4, 2022 · 10 comments
Labels: bug (Something isn't working), module:weibo

Comments

@TheTechRobo
Contributor

TheTechRobo commented Apr 4, 2022

Haven't tested with user IDs.

~/u/steam ❯❯❯ python3 parse_weibo.deduped >> weibo.jsonl
['snscrape', '--jsonl', '--progress', 'weibo-user', '--name', 'fangshengmeng']
2022-04-03 22:27:12.723  WARNING  snscrape.modules.weibo  User does not exist
Finished, 0 results
['snscrape', '--jsonl', '--progress', 'weibo-user', '--name', 'qukean']
2022-04-03 22:27:14.052  WARNING  snscrape.modules.weibo  User does not exist
Finished, 0 results
['snscrape', '--jsonl', '--progress', 'weibo-user', '--name', 'yetuavg']
2022-04-03 22:27:15.280  WARNING  snscrape.modules.weibo  User does not exist
Finished, 0 results
['snscrape', '--jsonl', '--progress', 'weibo-user', '--name', 'zhangfrank110']
2022-04-03 22:27:16.472  WARNING  snscrape.modules.weibo  User does not exist
Finished, 0 results

With verbose output (can't get locals because it didn't crash; you should add an option to dump them anyway):

~/u/steam ❯❯❯ snscrape -v --progress --jsonl weibo-user --name fangshengmeng  2
2022-04-03 22:29:29.953  INFO  snscrape.base  Retrieving https://m.weibo.cn/n/fangshengmeng
2022-04-03 22:29:31.017  INFO  snscrape.base  Retrieved https://m.weibo.cn/n/fangshengmeng: 200
2022-04-03 22:29:31.017  WARNING  snscrape.modules.weibo  User does not exist
2022-04-03 22:29:31.017  INFO  snscrape._cli  Done, found 0 results
Finished, 0 results

Also it seems really unintuitive to have to add --name as an option if it's not a user ID; could this be fixed like it was with the Twitter scraper, i.e. seeing if it's an int?

@JustAnotherArchivist
Owner

Can reproduce that with those names, but as far as I can tell, none of them exist (or their profiles require logging in, perhaps?). Others work correctly. Random example: Angelinazhaoooo (though it crashes with a KeyError on the video extraction very quickly).

The Twitter scraper also has an explicit flag, --user-id. Automatic detection for that obviously breaks when someone has a username composed solely of digits.
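To illustrate the ambiguity, here is a minimal sketch of the kind of "is it an int?" check being discussed. The function name `looks_like_user_id` is hypothetical and not part of snscrape; the point is only that an all-digit string cannot be classified reliably.

```python
def looks_like_user_id(identifier: str) -> bool:
    """Heuristic only: True if the identifier consists solely of ASCII digits.

    Hypothetical helper for illustration; snscrape does not expose this.
    """
    return identifier.isascii() and identifier.isdigit()


# Clear-cut cases from this thread.
print(looks_like_user_id("qukean"))         # a name: contains letters
print(looks_like_user_id("zhangfrank110"))  # a name: digits only at the end

# The problematic case: an all-digit string could be a numeric user ID
# or a username that happens to consist solely of digits.
print(looks_like_user_id("1223717857"))
```

The heuristic answers "ID" for every all-digit input, so a user whose name is made up only of digits could never be looked up by name. An explicit flag avoids that.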

@JustAnotherArchivist
Owner

Also, to dump locals on every log message of level WARNING or higher, there is a global option: --dump-locals
(Yes, it should probably get a better name.)

@TheTechRobo
Contributor Author

> The Twitter scraper also has an explicit flag, --user-id. Automatic detection for that obviously breaks when someone has a username composed solely of digits.

Oh, I guess that's true.

@TheTechRobo
Contributor Author

I could have sworn they existed when I loaded it up into a browser, but maybe I'm wrong. Sorry for opening this invalid issue, I guess.

Wait, https://weibo.com/qukean exists I think

@JustAnotherArchivist
Owner

Maybe, but that's behind a login wall. The mobile site, which is publicly accessible and therefore used by snscrape, says it doesn't exist: https://m.weibo.cn/n/qukean
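As a rough sketch of what the name lookup involves: the verbose log above shows a request to `https://m.weibo.cn/n/<name>` that comes back with status 200 when the user is reported as missing. The code below builds that URL and interprets a response. It assumes that a resolvable name is answered with a redirect to a `/u/<numeric id>` path; that is a guess, nothing in this thread confirms it, and both function names are hypothetical rather than snscrape's own.

```python
import re
from typing import Optional
from urllib.parse import quote, urlsplit


def name_lookup_url(name: str) -> str:
    """Build the mobile-site lookup URL seen in the verbose log."""
    return "https://m.weibo.cn/n/" + quote(name, safe="")


def user_id_from_redirect(status: int, location: Optional[str]) -> Optional[str]:
    """Return the numeric user ID if the response redirects to /u/<digits>.

    ASSUMPTION: a resolvable name yields a redirect to /u/<id>. Anything
    else (such as the plain 200 in the log above) is treated as
    "could not resolve" and gives None.
    """
    if status not in (301, 302) or not location:
        return None
    match = re.fullmatch(r"/u/(\d+)", urlsplit(location).path)
    return match.group(1) if match else None


print(name_lookup_url("qukean"))
print(user_id_from_redirect(200, None))
```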

@TheTechRobo
Contributor Author

Oh, I'm using weibo.com. Is that different? I don't have to log in for weibo.com/qukean:

image

@JustAnotherArchivist
Owner

JustAnotherArchivist commented Apr 4, 2022

Yeah, it's not really a login, but it's an auth system of sorts with awful JS stuff to get cookies for accessing weibo.com (that I didn't want to reimplement). It is the same service though, so it's interesting that this profile is only accessible on weibo.com but not on m.weibo.cn.

@TheTechRobo
Contributor Author

TheTechRobo commented Apr 5, 2022

Oh yeah, I noticed that redirect. Sounds very annoying to bypass or mimic.

TheTechRobo changed the title from "Weibo scraper broken: "User does not exist" when using --name" to "Weibo: "User does not exist" when using --name on certain accounts" on Apr 5, 2022
@JustAnotherArchivist
Owner

Yeah, the only way to fix this would be to reimplement that auth flow. Not something I'll tackle anytime soon, I think.

JustAnotherArchivist added the bug (Something isn't working) and module:weibo labels on Apr 5, 2022
@JustAnotherArchivist
Owner

It's only the name resolution which is the problem here, it seems. qukean is user ID 1223717857, and that works fine on the mobile site (and consequently with snscrape). The name resolution on weibo.com is still behind the same auth flow though, so this insight doesn't really change anything, but at least you can manually work around it by observing the user ID in the network monitor when loading the profile page and then using that.
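For anyone scripting the workaround, here is a minimal sketch that builds the same kind of argument list as in the original report, but with the numeric ID instead of `--name`. The helper name `build_weibo_command` is hypothetical. It relies on `weibo-user` treating its argument as a user ID when `--name` is omitted, which is how the flag is described earlier in this thread.

```python
from typing import List


def build_weibo_command(user: str, is_name: bool = False) -> List[str]:
    """Build an snscrape argument list for the weibo-user scraper.

    Hypothetical helper: pass is_name=True for a username (adds --name),
    leave it False for a numeric user ID.
    """
    command = ["snscrape", "--jsonl", "--progress", "weibo-user"]
    if is_name:
        command.append("--name")
    command.append(user)
    return command


# qukean does not resolve by name on the mobile site, but its user ID
# (1223717857, found via the browser's network monitor) can be used directly.
print(build_weibo_command("1223717857"))
```

The resulting list can be passed to `subprocess.run` as is.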
