Python web scraping: python3 + selenium + chrome
2022/2/26 14:21:22
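1. Preparation
Install selenium with pip: pip install selenium
Download the browser driver. Chrome driver download page: http://chromedriver.storage.googleapis.com/index.html
Mapping between driver versions and browser versions: https://blog.csdn.net/mcfnhm/article/details/85339414
Unzip the downloaded driver and copy chromedriver.exe into the Scripts folder of the Python installation directory.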
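Before launching a browser, it can help to confirm that the driver copied in the last step is actually discoverable. The following is a minimal sketch, assuming the executable is named chromedriver and that the Scripts folder is on PATH; the helper name find_driver is made up for this illustration and is not part of selenium.

```python
import shutil


def find_driver(name="chromedriver"):
    """Return the full path of the driver executable, or None if it is not on PATH."""
    # shutil.which searches the directories listed in PATH; on Windows it
    # also tries the extensions from PATHEXT, so "chromedriver" can match
    # "chromedriver.exe".
    return shutil.which(name)


if __name__ == "__main__":
    path = find_driver()
    if path is None:
        print("chromedriver not found on PATH; check the Scripts folder")
    else:
        print("chromedriver found at", path)
```

If the lookup returns None, the usual causes are a driver copied to the wrong folder or a Python installation whose Scripts folder was never added to PATH.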
2. Running the browser in headless mode
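from selenium import webdriver
# Options class used to configure headless mode
from selenium.webdriver.chrome.options import Options
# Detection evasion through a separate ChromeOptions object (this approach is deprecated)
# from selenium.webdriver import ChromeOptions

# Headless mode: run Chrome without opening a visible window
chrom_option = Options()
chrom_option.add_argument('--headless')
chrom_option.add_argument('--disable-gpu')

# Evade automation detection (the separate-object approach is deprecated)
# option = ChromeOptions()
# Set the switch on the same Options object instead
chrom_option.add_experimental_option('excludeSwitches', ['enable-automation'])

chrom = webdriver.Chrome(options=chrom_option)
chrom.get("https://www.baidu.com")
print(chrom.page_source)
# Close the browser so the headless process does not linger
chrom.quit()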
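Printing page_source is usually only the first step; the HTML string then has to be parsed. The following is a minimal sketch of one way to do that with the standard library's html.parser, pulling out the page title. The names TitleParser and extract_title and the sample HTML are assumptions for this illustration; in a real run the string would come from chrom.page_source.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside <title> tags."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only keep text that appears while a <title> tag is open
        if self._in_title:
            self.title += data


def extract_title(page_source):
    """Return the stripped title text, or an empty string if there is none."""
    parser = TitleParser()
    parser.feed(page_source)
    return parser.title.strip()


# Stand-in for chrom.page_source so the sketch runs without a browser
sample = "<html><head><title> Example page </title></head><body></body></html>"
print(extract_title(sample))
```

The same pattern extends to other tags by checking a different tag name in the handler methods.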
3. Action chain example
from time import sleep

from selenium import webdriver
# Import the action chain class
from selenium.webdriver import ActionChains
from selenium.webdriver.common.by import By

url = 'https://www.runoob.com/try/try.php?filename=jqueryui-api-droppable'
chrom = webdriver.Chrome()
chrom.get(url)
chrom.maximize_window()

# The target element sits inside an iframe, so switch into the frame
# before locating it
chrom.switch_to.frame("iframeResult")
div_ele = chrom.find_element(By.ID, 'draggable')

action = ActionChains(chrom)
# Press and hold the left mouse button on the element
action.click_and_hold(div_ele)
for i in range(5):
    # move_by_offset(x, y) moves the pointer x pixels horizontally and y pixels vertically
    # perform() executes the queued actions immediately
    action.move_by_offset(17, 0).perform()
    sleep(1)
# Release the mouse button; release() only queues the action,
# so perform() is needed to actually run it
action.release().perform()
chrom.quit()
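The drag above moves 17 pixels five times, 85 pixels in total. When the distance does not divide evenly, the per-step offsets have to be computed. The following is a minimal sketch of that calculation, together with a variant that queues the whole drag and calls perform() once. The helper names even_steps and drag_horizontally are made up for this illustration; the action argument is assumed to be a selenium ActionChains instance, and only its click_and_hold, move_by_offset, pause, release and perform methods are used.

```python
def even_steps(total, steps):
    """Split total pixels into `steps` integer offsets that sum to total."""
    base, extra = divmod(total, steps)
    # The first `extra` steps get one additional pixel each
    return [base + 1 if i < extra else base for i in range(steps)]


def drag_horizontally(action, element, total, steps, pause=1.0):
    """Queue a press, several small moves and a release, then run them once.

    `action` is assumed to be a selenium ActionChains instance.
    """
    action.click_and_hold(element)
    for dx in even_steps(total, steps):
        action.move_by_offset(dx, 0)
        # pause() waits inside the chain, replacing sleep() between performs
        action.pause(pause)
    action.release()
    action.perform()


print(even_steps(85, 5))  # [17, 17, 17, 17, 17]
```

With the objects from the example, the call would look like drag_horizontally(ActionChains(chrom), div_ele, 85, 5).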
4. Reading Excel and writing the results to txt
import os
from time import sleep

import xlrd
from selenium import webdriver
from selenium.webdriver.common.by import By


def read_excel(url, chrome_url):
    # Open the workbook to read (xlrd 2.0 and later only reads .xls files)
    excel = xlrd.open_workbook(url)
    sheet = excel.sheets()[0]
    txt_path = './reData'
    if not os.path.exists(txt_path):
        os.mkdir(txt_path)
    # error.txt collects rejected logins, succ.txt collects the rest
    fp = open(os.path.join(txt_path, 'error.txt'), 'w', encoding='utf-8')
    fs = open(os.path.join(txt_path, 'succ.txt'), 'w', encoding='utf-8')
    # Data starts on the third row; columns 5 and 6 hold the user name and password
    for row in range(2, sheet.nrows):
        name = sheet.cell_value(row, 5)
        pwd = sheet.cell_value(row, 6)
        if len(name) > 0 and len(pwd) > 0:
            chrom = webdriver.Chrome()
            page_text = ''
            try:
                chrom.get(chrome_url)
                chrom.maximize_window()
                sleep(1)
                name_input_ele = chrom.find_element(By.ID, 'userName')
                pwd_input_ele = chrom.find_element(By.ID, 'password')
                btn = chrom.find_element(By.ID, 'login')
                name_input_ele.send_keys(name)
                pwd_input_ele.send_keys(pwd)
                btn.click()
                sleep(1)
                page_text = chrom.page_source
            except Exception:
                # The login form could not be driven; skip this row
                # instead of recording it as a success
                continue
            finally:
                # Close the browser exactly once, whether or not an error occurred
                chrom.quit()
            # The marker text means "incorrect user name or password"
            if '用户名或密码错误' in page_text:
                fp.write('%10s—%10s\n' % (name, pwd))
            else:
                fs.write('%10s—%10s\n' % (name, pwd))
    fp.close()
    fs.close()


if __name__ == '__main__':
    pass
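One thing to watch in the loop above: xlrd returns numeric cells as float, so a user name or password made only of digits arrives as 123456.0, and len() on it raises TypeError. The following is a minimal sketch of a normalizing helper, assuming whole-number cells should be treated as digit strings; the name cell_to_text is made up for this illustration and is not part of xlrd.

```python
def cell_to_text(value):
    """Convert a spreadsheet cell value to stripped text.

    Whole-number floats such as 123456.0 become "123456"; other values are
    passed through str() and stripped of surrounding whitespace.
    """
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    return str(value).strip()


print(cell_to_text(123456.0))   # 123456
print(cell_to_text(" admin "))  # admin
```

In read_excel this would wrap each read, for example name = cell_to_text(sheet.cell_value(row, 5)), after which the len() checks work for both text and numeric cells.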