Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
562 views
in Technique[技术] by (71.8m points)

javascript - How to find all elements on the webpage through scrolling using SeleniumWebdriver and Python

I can't seem to get all elements on a webpage. No matter what I have tried using selenium. I am sure I am missing something. Here's my code. The url has at least 30 elements yet whenever I scrape only 6 elements return. What am I missing?

import requests
import webbrowser
import time
from bs4 import BeautifulSoup as bs
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException



headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36'}
url = 'https://www.adidas.com/us/men-shoes-new_arrivals'

res = requests.get(url, headers = headers)
page_soup = bs(res.text, "html.parser")


containers = page_soup.findAll("div", {"class": "gl-product-card-container show-variation-carousel"})


print(len(containers))
#for each container find shoe model
shoe_colors = []

for container in containers:
    if container.find("div", {'class': 'gl-product-card__reviews-number'}) is not None:
        shoe_model = container.div.div.img["title"]
        review = container.find('div', {'class':'gl-product-card__reviews-number'})
        review = int(review.text)



driver = webdriver.Chrome()
driver.get(url)
time.sleep(5)
shoe_prices = driver.find_elements_by_css_selector('.gl-price')

for price in shoe_prices:
    print(price.text)
print(len(shoe_prices))
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

you have to slowly scroll down the page. It only request price data with ajax when product viewed.

options = Options()
options.add_argument('--start-maximized')
driver = webdriver.Chrome(options=options)

url = 'https://www.adidas.com/us/men-shoes-new_arrivals'
driver.get(url)

scroll_times = len(driver.find_elements_by_class_name('col-s-6')) / 4 # (divide by 4 column product per row)
scrolled = 0
scroll_size = 400

while scrolled < scroll_times:
    driver.execute_script('window.scrollTo(0, arguments[0]);', scroll_size)
    scrolled +=1
    scroll_size += 400
    time.sleep(1)

shoe_prices = driver.find_elements_by_class_name('gl-price')

for price in shoe_prices:
    print(price.text)

print(len(shoe_prices))

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...