YouTube parser

socnet

The parser includes two templates:
  • Description and title parser
  • Recommended video parser
Youtube Video Info Parser
Sends a GET request to the video page and extracts the title and description.
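For readers without the template, here is a rough Python sketch of the same idea. It is not the template itself. It assumes the watch page exposes the title and description in og:title and og:description meta tags; the function names and the User-Agent header are illustrative assumptions.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class _MetaCollector(HTMLParser):
    """Collects <meta property="..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        key = attr.get("property") or attr.get("name")
        # Keep the first occurrence of each key.
        if key and attr.get("content") is not None:
            self.meta.setdefault(key, attr["content"])


def parse_video_info(html):
    """Return (title, description); a missing tag yields None.

    Assumption: the page carries og:title and og:description meta tags.
    """
    collector = _MetaCollector()
    collector.feed(html)
    return collector.meta.get("og:title"), collector.meta.get("og:description")


def fetch_video_info(video_id):
    """GET the watch page and parse it (needs network access)."""
    url = "https://www.youtube.com/watch?v=" + video_id
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=30) as response:
        html = response.read().decode("utf-8", errors="replace")
    return parse_video_info(html)
```

Parsing is kept separate from fetching so the extraction logic can be checked against a saved page without hitting the site.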

Youtube ID Parser
You start by feeding in the ID of a single video. The template then parses the IDs of the videos recommended alongside it.
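A minimal Python sketch of the ID extraction step, for illustration only. It assumes recommended videos show up in the page's embedded JSON as "videoId":"..." entries with 11-character IDs; the actual template may use a different pattern, and the function name is hypothetical.

```python
import re

# Assumption: IDs appear in the page source as "videoId":"<11 characters>".
VIDEO_ID_RE = re.compile(r'"videoId":"([A-Za-z0-9_-]{11})"')


def extract_video_ids(html, exclude=()):
    """Return unique video IDs in order of first appearance, skipping `exclude`."""
    seen = set(exclude)
    ids = []
    for video_id in VIDEO_ID_RE.findall(html):
        if video_id not in seen:
            seen.add(video_id)
            ids.append(video_id)
    return ids
```

Passing the current video's own ID in `exclude` keeps it out of its own list of recommendations.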

The IDs have to be parsed first; I collected around 402k of them in a week. After that, the video info template is launched. Ideally both templates run side by side, since parsing IDs will always be faster than parsing descriptions and titles.
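The way one seed ID grows into hundreds of thousands can be pictured as a breadth-first walk over recommendations. The sketch below is only an illustration of that idea, not how the template is implemented; `get_related` is a hypothetical callback that returns the recommended IDs for a given video.

```python
from collections import deque


def crawl_ids(seed_id, get_related, limit):
    """Expand from seed_id breadth-first until `limit` IDs are known.

    get_related(video_id) must return an iterable of recommended IDs.
    """
    seen = {seed_id}
    queue = deque([seed_id])
    while queue and len(seen) < limit:
        current = queue.popleft()
        for related in get_related(current):
            if related not in seen:
                seen.add(related)
                queue.append(related)
                if len(seen) >= limit:
                    break
    return seen
```

In practice the `seen` set lives in the database rather than in memory, which is what the unique index on youtube_id is for.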



Data is saved to a MySQL database, because storing it in files is slow and inefficient. You'll need to install MySQL itself and, optionally, phpMyAdmin.

Installation Method #1:
Download the installer from https://dev.mysql.com/downloads/installer/. phpMyAdmin additionally requires PHP and a web server (Apache or nginx). Alternatively, you can download a ready-made bundle such as WampServer or XAMPP.

Installation Method #2:
Install Docker Desktop. Prepare a docker-compose.yml file, navigate to the directory containing it, and run the command docker-compose up. Once the containers have started, phpMyAdmin will be available at localhost:8080.

Create a table with the following structure:
[Attachment 118521: table structure]


Create a unique index on youtube_id to avoid duplicates and, of course, a primary key on the auto-increment column. The indexes should be created before the table is populated with data.
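To show what the unique index buys you, here is a small self-contained sketch. It uses Python's built-in sqlite3 as a stand-in for MySQL so it runs without a server. The table name videos and the columns id, title and description are assumptions, not the original structure; only youtube_id comes from the post. In MySQL the equivalents would be AUTO_INCREMENT and INSERT IGNORE.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE videos (
        id INTEGER PRIMARY KEY AUTOINCREMENT,  -- hypothetical column name
        youtube_id TEXT NOT NULL,
        title TEXT,                            -- hypothetical column
        description TEXT                       -- hypothetical column
    )
    """
)
# The unique index is created before any rows are inserted.
conn.execute("CREATE UNIQUE INDEX idx_youtube_id ON videos (youtube_id)")


def count_rows(connection):
    """Return the number of rows currently in the videos table."""
    return connection.execute("SELECT COUNT(*) FROM videos").fetchone()[0]


def add_ids(connection, video_ids):
    """Insert IDs, skipping any already stored; return how many rows were added."""
    before = count_rows(connection)
    connection.executemany(
        "INSERT OR IGNORE INTO videos (youtube_id) VALUES (?)",
        [(video_id,) for video_id in video_ids],
    )
    connection.commit()
    return count_rows(connection) - before
```

With the index in place the parser can insert every ID it finds without checking first; duplicates are dropped by the database.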

The database is ready. Manually add one row containing the first ID. For example, take https://www.youtube.com/watch?v=nok4P9cYw_g and extract the ID (nok4P9cYw_g); any video will do as a starting point. Then run Youtube ID Parser. Once at least the first 1000 IDs are available, you can start Youtube Video Info Parser.
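If you'd rather not copy the ID by hand, something like this Python snippet pulls it out of a watch URL. The function name is just for illustration, and it only handles URLs that carry the ID in the v query parameter.

```python
from urllib.parse import urlparse, parse_qs


def video_id_from_url(url):
    """Return the value of the `v` query parameter, or None if it is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("v")
    return values[0] if values else None


print(video_id_from_url("https://www.youtube.com/watch?v=nok4P9cYw_g"))
```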
 
Article topic
Parsing
