I'm trying to build a monitor for Nike that scrapes upcoming releases. Every time something new is released, the monitor should pick up its URL. I'd also like to know where I should store the data and what the best option is. Is it possible to hook this up to a Discord bot to get notifications?
import scrapy


class NikeCalendarSpider(scrapy.Spider):
    name = 'nike_calendar'
    allowed_domains = ['www.nike.com.br']

    def start_requests(self):
        yield scrapy.Request(url='https://www.nike.com.br/Snkrs#calendario', callback=self.parse,
                             headers={
                                 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, '
                                               'like Gecko) Chrome/87.0.4280.141 Safari/537.36'
                             })

    def parse(self, response):
        for product in response.xpath(
                "//div[@id='DadosPaginacaoCalendario']/div/div[@class='snkr-release produto produto--aviseme']"):
            yield {
                'title': product.xpath(".//div[@class='snkr-release__info']/div[2]/a/text()").get(),
                'url': product.xpath(".//div[@class='snkr-release__info']/div[2]/a/@href").get()
            }
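For the storage and notification parts, one possible direction is sketched below. It is only a rough outline under several assumptions: SQLite is used as the store (any database would do), the helper names `open_store`, `filter_new`, `build_webhook_request` and `notify` are hypothetical, and notifications go through a Discord webhook URL created in the channel settings rather than a full bot. The items are assumed to be the `title`/`url` dicts yielded by the spider above.

```python
import json
import sqlite3
import urllib.request


def open_store(path=":memory:"):
    """Open (or create) a SQLite database that remembers every URL already seen.

    Assumption: pass a file path such as 'releases.db' to persist between runs.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS releases (url TEXT PRIMARY KEY, title TEXT)"
    )
    return conn


def filter_new(conn, items):
    """Store the scraped items and return only those that were not stored before."""
    new_items = []
    for item in items:
        if not item.get("url"):
            continue  # skip entries where the XPath matched nothing
        cur = conn.execute(
            "INSERT OR IGNORE INTO releases (url, title) VALUES (?, ?)",
            (item["url"], item.get("title")),
        )
        if cur.rowcount == 1:  # a row was actually inserted, so the URL is new
            new_items.append(item)
    conn.commit()
    return new_items


def build_webhook_request(webhook_url, item):
    """Build a POST request carrying a simple Discord webhook message."""
    payload = json.dumps(
        {"content": f"New release: {item.get('title')} {item['url']}"}
    ).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={
            "Content-Type": "application/json",
            # Assumption: a custom User-Agent, in case the default one is rejected.
            "User-Agent": "release-monitor/0.1",
        },
        method="POST",
    )


def notify(webhook_url, item):
    """Send the message; webhook_url is a placeholder you must supply yourself."""
    with urllib.request.urlopen(build_webhook_request(webhook_url, item), timeout=10) as resp:
        return resp.status
```

With something like this, the spider could be run on a schedule (for example from cron), each run's items passed through `filter_new`, and `notify` called only for the items it returns. The same logic could probably live in a Scrapy item pipeline instead of standalone functions.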
question from:
https://stackoverflow.com/questions/65837440/scrapy-how-to-keep-scraping-the-same-page-to-get-a-new-information-infinite-l