Scraping YouTube search requests using Golang #195

Open
muskankhedia opened this issue Mar 15, 2020 · 4 comments
Labels
gssoc20 Issue for GSSoC-2020 medium Medium type issue

Comments

@muskankhedia
Collaborator

Is your feature request related to a problem? Please describe.
Currently, YouTube scraping is done with Selenium driving a browser. We are now revamping the application and moving all scraping to Golang, because using different dependencies for different features makes the work difficult and also increases the size of the application.

Describe the solution you'd like
The scraping of YouTube search requests should be reimplemented using Golang scraping packages. One such package, goquery, is already used for the Google and Bing scraping. Recommendations for other Golang packages are most welcome.

@muskankhedia muskankhedia added gssoc20 Issue for GSSoC-2020 medium Medium type issue labels Mar 15, 2020
@irshadjsr21
Contributor

I'm up for it.

@irshadjsr21
Contributor

It turns out that YouTube renders its pages using JavaScript, so the results are not in the HTML we get from a plain request and we cannot scrape them through Golang with goquery alone. We'll have to stick with Selenium or Puppeteer.
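To illustrate the problem described above, here is a minimal sketch using only the Go standard library. Both HTML snippets are made up for the example and are not real YouTube markup, and `extractLinks` is a hypothetical helper. It shows that a static scan of the markup finds links only when the server has already put them in the HTML.

```go
package main

import (
	"fmt"
	"regexp"
)

// anchorHref matches the href value of <a> tags in raw HTML. A regular
// expression is enough for this illustration; a real scraper would use an
// HTML parser such as goquery.
var anchorHref = regexp.MustCompile(`<a\s[^>]*href="([^"]*)"`)

// extractLinks (hypothetical helper) returns every anchor href found in the
// raw markup, without executing any JavaScript.
func extractLinks(html string) []string {
	var links []string
	for _, m := range anchorHref.FindAllStringSubmatch(html, -1) {
		links = append(links, m[1])
	}
	return links
}

// serverRendered is a made-up page whose results are present in the HTML.
const serverRendered = `<html><body>
<a href="/result/1">First</a>
<a href="/result/2">Second</a>
</body></html>`

// scriptRendered is a made-up page whose results exist only as script data;
// a browser would have to run the script to create the links.
const scriptRendered = `<html><body>
<div id="results"></div>
<script>
var data = [{"url": "/result/1"}, {"url": "/result/2"}];
</script>
</body></html>`

func main() {
	fmt.Println(len(extractLinks(serverRendered))) // links found in static HTML
	fmt.Println(len(extractLinks(scriptRendered))) // none: they need JavaScript
}
```

A static parser sees the second page as an empty container, which is why a tool that actually runs the page's JavaScript is needed.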

@muskankhedia
Collaborator Author

Yes, I have noticed that. In that case, we can go with Puppeteer, as it can be developed using Golang, and we can completely remove the Node.js support. What do you say, @Harkishen-Singh?

@Harkishen-Singh
Owner

Harkishen-Singh commented Mar 20, 2020

Well, we had an implementation with Selenium, but it ran as a subprocess. However, I think we can go with https://github.com/chromedp/chromedp. It drives Chrome directly over the DevTools Protocol, so it does not need Selenium, and it should be smaller in size. @muskankhedia @irshadjsr21, you can go ahead with this.


3 participants