【Question title】: How do you fully close a tab by using Selenium in Python?
【Posted】: 2021-05-19 17:43:39
【Question】:

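I'm trying to web scrape with Selenium. I open some tabs to get information and then want to close them again. If I don't, I end up with a lot of open tabs by the time the code finishes running. I tried switching to the tab I want to close and then closing it like this: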

browser.switch_to.window(browser.window_handles[1])
browser.close()

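When I run the program, though, the tab stays open and shows "about:blank" where the URL normally is. Is there a way to fully close this tab while keeping all the other tabs open? The full code is below for reference.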

from selenium import webdriver
import os
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import xlsxwriter
from datetime import datetime
import time
from selenium.common.exceptions import TimeoutException


trade_date_lim = "5/1/2021"


chrome_driver = os.path.abspath('C:/Users/ross/Desktop/chromedriver.exe')
browser = webdriver.Chrome(chrome_driver)


#makes workbook to write to
workbook = xlsxwriter.Workbook('reit_bonds_test.xlsx')
worksheet = workbook.add_worksheet()


stocks = ["PNW", "STWD"]

for stock in stocks:
    browser.get('https://finra-markets.morningstar.com/BondCenter/Default.jsp')
    wait = WebDriverWait(browser, 10)
    #using clicks and send_keys, gets the bond page for a desired stock
    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR,
                                           '#TabContainer > div > div.rtq-tab-wrap > div.rtq-tab-menus-wrap > ul > li:nth-child(3) > a > span'))).click()
    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '#firscreener-cusip'))).send_keys(stock)
    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR,
                                           "#ms-finra-advanced-search-form > div.ms-finra-advanced-search-btn > input:nth-child(2)"))).click()
    try:
        wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-agreement > input"))).click()
    except TimeoutException:
        pass

    #clicks to sort by earliest date, clicks again to sort by latest maturity
    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-grid-hd > div > div:nth-child(7) > div"))).click()
    time.sleep(5)
    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-grid-hd > div > div:nth-child(7) > div"))).click()
    time.sleep(5)
    #gathers all bond offerings on first page
    whole_chart = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll"))).text

    #gets number of bonds listed on page so we can iterate through them. Some pages have differing number of bonds listed. Most on page is 20
    parent = browser.find_element_by_xpath('//*[@id="ms-finra-search-results"]/div/div[3]/div[1]/div[1]/div[2]/div[2]/div')
    count_divs = len(parent.find_elements_by_xpath("./div"))

    bnd_off_cnt = 1
    row_num = 0
    while row_num < count_divs and bnd_off_cnt < 3:

        #gets values that I'm looking for
        symbol = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll > div > div:nth-child(" + str(row_num + 1) + ") > div:nth-child(3)"))).text
        maturity = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll > div > div:nth-child(" + str(row_num + 1) + ") > div:nth-child(7)"))).text
        moody_rating = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll > div > div:nth-child(" + str(row_num + 1) + ") > div:nth-child(8)"))).text
        sandp_rating = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll > div > div:nth-child(" + str(row_num + 1) + ") > div:nth-child(9)"))).text
        stated_bond_yield = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll > div > div:nth-child(" + str(row_num + 1) + ") > div:nth-child(11)"))).text

        #looks to see if all values are non-empty and if moody rating and sandp rating are not equal to 'WR' and 'NR'
        if symbol.strip() and maturity.strip() and moody_rating.strip() and sandp_rating.strip() and stated_bond_yield.strip() and moody_rating != "WR" and sandp_rating != "NR":
            #bond detail page below
            element = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll > div > div:nth-child(" + str(row_num + 1) + ") > div:nth-child(2) > div > a")))
            element_link = element.get_attribute('href') #gets the link

            #opens window, switches to it and opens the bond detail page
            browser.execute_script("window.open('');")
            time.sleep(3)
            browser.switch_to.window(browser.window_handles[1])
            browser.get(element_link)

            #switch to iframe on second page and clicks it
            wait.until(EC.frame_to_be_available_and_switch_to_it((By.ID, "ms-bond-detail-iframe")))
            wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR,"#tradeHistory_link"))).click()
            #switches to third page
            browser.switch_to.window(browser.window_handles[-1])
            #sleeps for 3 seconds so we know for sure that we are working on right page
            time.sleep(3)


            # get length of table on trades page and iterate through them trying to find the most recent "Trade" status
            bond_trades = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "#ms-glossary > div > table > tbody > tr")))
            count = len(bond_trades)


            for trade in range(count):

                bond_trade_status = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-glossary > div > table > tbody > tr:nth-child(" + str(trade + 1) + ") > td:nth-child(4) > div"))).text
                if bond_trade_status == "Trade":
                    bond_last_traded = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-glossary > div > table > tbody > tr:nth-child(" + str(trade + 1) + ") > td:nth-child(1) > div"))).text
                    if bond_last_traded > trade_date_lim:
                        #prior bond yields occasionally don't match the yield that it was last traded at
                        bond_yield = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-glossary > div > table > tbody > tr:nth-child(" + str(trade + 1) + ") > td:nth-child(7) > div"))).text
                        print(symbol, maturity, bond_yield)
                        bnd_off_cnt += 1
                        break
                    else:
                        continue
                    #test for if we are within X amount of time from today
                    #continue if we are more than that amount of time
                    #exit if we are within time frame and get 'Yield'
                else:
                    continue
            browser.switch_to.window(browser.window_handles[1])
            browser.close()
            browser.switch_to.window(browser.window_handles[0])
        row_num += 1

Thanks for any help!

Ross

【Comments on the question】:

  • Do you want to close the tab or the browser?
  • I only want to close the tab I'm on when I call browser.switch_to.window(browser.window_handles[1]).

Tags: python selenium selenium-webdriver web-scraping selenium-chromedriver


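【Answer 1】:

I don't think the problem is with driver.close(). When I ran your code, three windows were open at the point where you call driver.close(). I haven't studied your code closely enough to work out exactly where each window comes from; you are in a better position to do that. But driver.close() closes a tab completely just fine. It is your code that leaves the extra tabs behind.

I added logging like this around your driver.close to show that you gain one extra window handle on every loop.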

print(browser.window_handles)
browser.switch_to.window(browser.window_handles[1])
browser.close()
print(browser.window_handles)

You can see from the output how the windows accumulate:

PNW5042752 09/15/2050 3.198
['CDwindow-F6F960253D139F2E40A277E65170F5FD', 'CDwindow-5A1FC1679F6C04AA88D09BA7B6568B53', 'CDwindow-DE838D1CED095EF7FEFF8DF3A3242829']
['CDwindow-F6F960253D139F2E40A277E65170F5FD', 'CDwindow-DE838D1CED095EF7FEFF8DF3A3242829']
PNW4989897 05/15/2050 3.217
['CDwindow-F6F960253D139F2E40A277E65170F5FD', 'CDwindow-DE838D1CED095EF7FEFF8DF3A3242829', 'CDwindow-C7B7C24BC2C066C2D42F622ED58A982B', 'CDwindow-9AEBC8F28DFB92B291591730FA45B87B']
['CDwindow-F6F960253D139F2E40A277E65170F5FD', 'CDwindow-C7B7C24BC2C066C2D42F622ED58A982B', 'CDwindow-9AEBC8F28DFB92B291591730FA45B87B']
['CDwindow-F6F960253D139F2E40A277E65170F5FD', 'CDwindow-C7B7C24BC2C066C2D42F622ED58A982B', 'CDwindow-9AEBC8F28DFB92B291591730FA45B87B', 'CDwindow-D3732930FBE9AED7BA22171E4CAE0DCF', 'CDwindow-38527482E16607D629B8DBBA4598AC76']
['CDwindow-F6F960253D139F2E40A277E65170F5FD', 'CDwindow-9AEBC8F28DFB92B291591730FA45B87B', 'CDwindow-D3732930FBE9AED7BA22171E4CAE0DCF', 'CDwindow-38527482E16607D629B8DBBA4598AC76']
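One way to read logs like these is to diff two snapshots of `browser.window_handles` and list only the handles that appeared in between. The following is a minimal sketch; `newly_opened` is a hypothetical helper name (not part of Selenium), and the handle strings are shortened placeholders rather than the real values above.

```python
def newly_opened(before, after):
    """Return the handles present in `after` but not in `before`, keeping order."""
    seen = set(before)
    return [handle for handle in after if handle not in seen]


# Placeholder handle names standing in for the long CDwindow-... strings.
before = ["CDwindow-A"]
after = ["CDwindow-A", "CDwindow-B", "CDwindow-C"]
print(newly_opened(before, after))  # → ['CDwindow-B', 'CDwindow-C']
```

Taking one snapshot before `window.open('')` and another just before closing would show how many tabs a single iteration really creates.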

Solution: close both browser tabs at the end of the test, not just one.

browser.switch_to.window(browser.window_handles[2])
browser.close()
browser.switch_to.window(browser.window_handles[1])
browser.close()
browser.switch_to.window(browser.window_handles[0])
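A more general variant of the same idea, as a rough sketch: rather than hard-coding indexes 2 and 1, remember the handle of the window you want to keep and close everything else. `close_other_windows` and `keep_handle` are hypothetical names, not Selenium API; the function relies only on `window_handles`, `switch_to.window()` and `close()`, which the code above already uses.

```python
def close_other_windows(driver, keep_handle):
    """Close every window/tab except `keep_handle`, then switch back to it.

    Returns the list of handles that were closed.
    """
    closed = []
    # Iterate over a copy, because closing windows changes driver.window_handles.
    for handle in list(driver.window_handles):
        if handle != keep_handle:
            driver.switch_to.window(handle)
            driver.close()
            closed.append(handle)
    driver.switch_to.window(keep_handle)
    return closed
```

In the question's loop, this could be called once per row with a handle saved beforehand, for example `main_handle = browser.current_window_handle` right after the first `browser.get(...)`, assuming the search-results window is the one to keep.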

【Comments】:

    【Answer 2】:

    Try the following:

    from selenium.webdriver.common.keys import Keys  # needed for Keys.CONTROL

    browser.switch_to.window(browser.window_handles[1])
    browser.find_element_by_tag_name('body').send_keys(Keys.CONTROL, 'w')
    browser.switch_to.window(browser.window_handles[0])
    

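    First switch to the window you want to close, close it with Control+W, and finally switch back to the initial window (tab).

    【Comments】:

    • The tab is still there, and the URL is still "about:blank". I guess that could mean the tab is technically closed even though it is still physically open...
    • What was the second tab before you applied Control + W?
    【Answer 3】:

    I would have posted this as a comment, but the code is too large. I think you have an indentation problem. Check the position of: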

    browser.switch_to.window(browser.window_handles[1])
    browser.close()
    browser.switch_to.window(browser.window_handles[0])
    

    It has been moved to the outer loop. The results are the same, with far fewer tabs left open (tried on my machine).

    from selenium import webdriver
    import os
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    import xlsxwriter
    from datetime import datetime
    import time
    from selenium.common.exceptions import TimeoutException
    
    
    trade_date_lim = "5/1/2021"
    
    
    chrome_driver = os.path.abspath('C:/Users/ross/Desktop/chromedriver.exe')
    browser = webdriver.Chrome(chrome_driver)
    
    
    #makes workbook to write to
    workbook = xlsxwriter.Workbook('reit_bonds_test.xlsx')
    worksheet = workbook.add_worksheet()
    
    
    stocks = ["PNW", "STWD"]
    
    for stock in stocks:
        browser.get('https://finra-markets.morningstar.com/BondCenter/Default.jsp')
        wait = WebDriverWait(browser, 10)
        #using clicks and send_keys, gets the bond page for a desired stock
        wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR,
                                               '#TabContainer > div > div.rtq-tab-wrap > div.rtq-tab-menus-wrap > ul > li:nth-child(3) > a > span'))).click()
        wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '#firscreener-cusip'))).send_keys(stock)
        wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR,
                                               "#ms-finra-advanced-search-form > div.ms-finra-advanced-search-btn > input:nth-child(2)"))).click()
        try:
            wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-agreement > input"))).click()
        except TimeoutException:
            pass
    
        #clicks to sort by earliest date, clicks again to sort by latest maturity
        wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-grid-hd > div > div:nth-child(7) > div"))).click()
        time.sleep(5)
        wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-grid-hd > div > div:nth-child(7) > div"))).click()
        time.sleep(5)
        #gathers all bond offerings on first page
        whole_chart = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll"))).text
    
        #gets number of bonds listed on page so we can iterate through them. Some pages have differing number of bonds listed. Most on page is 20
        parent = browser.find_element_by_xpath('//*[@id="ms-finra-search-results"]/div/div[3]/div[1]/div[1]/div[2]/div[2]/div')
        count_divs = len(parent.find_elements_by_xpath("./div"))
    
        bnd_off_cnt = 1
        row_num = 0
        while row_num < count_divs and bnd_off_cnt < 3:
    
            #gets values that I'm looking for
            symbol = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll > div > div:nth-child(" + str(row_num + 1) + ") > div:nth-child(3)"))).text
            maturity = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll > div > div:nth-child(" + str(row_num + 1) + ") > div:nth-child(7)"))).text
            moody_rating = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll > div > div:nth-child(" + str(row_num + 1) + ") > div:nth-child(8)"))).text
            sandp_rating = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll > div > div:nth-child(" + str(row_num + 1) + ") > div:nth-child(9)"))).text
            stated_bond_yield = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll > div > div:nth-child(" + str(row_num + 1) + ") > div:nth-child(11)"))).text
    
            #looks to see if all values are non-empty and if moody rating and sandp rating are not equal to 'WR' and 'NR'
            if symbol.strip() and maturity.strip() and moody_rating.strip() and sandp_rating.strip() and stated_bond_yield.strip() and moody_rating != "WR" and sandp_rating != "NR":
                #bond detail page below
                element = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll > div > div:nth-child(" + str(row_num + 1) + ") > div:nth-child(2) > div > a")))
                element_link = element.get_attribute('href') #gets the link
    
                #opens window, switches to it and opens the bond detail page
                browser.execute_script("window.open('');")
                time.sleep(3)
                browser.switch_to.window(browser.window_handles[1])
                browser.get(element_link)
    
                #switch to iframe on second page and clicks it
                wait.until(EC.frame_to_be_available_and_switch_to_it((By.ID, "ms-bond-detail-iframe")))
                wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR,"#tradeHistory_link"))).click()
                #switches to third page
                browser.switch_to.window(browser.window_handles[-1])
                #sleeps for 3 seconds so we know for sure that we are working on right page
                time.sleep(3)
    
    
                # get length of table on trades page and iterate through them trying to find the most recent "Trade" status
                bond_trades = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "#ms-glossary > div > table > tbody > tr")))
                count = len(bond_trades)
    
    
                for trade in range(count):
    
                    bond_trade_status = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-glossary > div > table > tbody > tr:nth-child(" + str(trade + 1) + ") > td:nth-child(4) > div"))).text
                    if bond_trade_status == "Trade":
                        bond_last_traded = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-glossary > div > table > tbody > tr:nth-child(" + str(trade + 1) + ") > td:nth-child(1) > div"))).text
                        if bond_last_traded > trade_date_lim:
                            #prior bond yields occasionally don't match the yield that it was last traded at
                            bond_yield = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-glossary > div > table > tbody > tr:nth-child(" + str(trade + 1) + ") > td:nth-child(7) > div"))).text
                            print(symbol, maturity, bond_yield)
                            bnd_off_cnt += 1
                            break
                        else:
                            continue
                        #test for if we are within X amount of time from today
                        #continue if we are more than that amount of time
                        #exit if we are within time frame and get 'Yield'
                    else:
                        continue
            browser.switch_to.window(browser.window_handles[1])
            browser.close()
            browser.switch_to.window(browser.window_handles[0])
            row_num += 1
    

    I didn't pay much attention to some of the other problems in your code, since they aren't part of the question.
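One such side issue, sketched here only as an illustration: `bond_last_traded > trade_date_lim` compares two strings, so the ordering is lexicographic rather than chronological. The sketch below parses both values first. It assumes the trades page shows dates as month/day/year, which the thread does not confirm; `traded_after` and `DATE_FORMAT` are hypothetical names.

```python
from datetime import datetime

# Assumption: dates on the page look like "5/1/2021" or "05/01/2021".
DATE_FORMAT = "%m/%d/%Y"


def traded_after(trade_date_text, limit_text, fmt=DATE_FORMAT):
    """Return True if the trade date is chronologically later than the limit."""
    traded = datetime.strptime(trade_date_text.strip(), fmt)
    limit = datetime.strptime(limit_text.strip(), fmt)
    return traded > limit


# As strings, "9/1/2020" sorts after "5/1/2021" because '9' > '5'.
print("9/1/2020" > "5/1/2021")               # → True
print(traded_after("9/1/2020", "5/1/2021"))  # → False
```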

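    【Comments】:

    • I think browser.close is in the right place; it just isn't closing the right number of windows. With this code, after a few loops only one window is open by the time it reaches browser.switch_to.window(browser.window_handles[1]), so I get: browser.switch_to.window(browser.window_handles[1]) IndexError: list index out of range
    • You're right. I got the same error the second time I ran the code. But I don't always get this out of range error.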
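The IndexError discussed here comes from addressing tabs by position. A possible way around it, sketched under assumptions, is to track handles by identity: snapshot the handles before opening anything, then close whatever is new afterwards, even if an exception such as TimeoutException interrupts the work. `temporary_tabs` is a hypothetical helper built on `contextlib`; only `current_window_handle`, `window_handles`, `switch_to.window()` and `close()` are real Selenium members.

```python
from contextlib import contextmanager


@contextmanager
def temporary_tabs(driver):
    """Close every window opened inside the `with` block and return to the original.

    Assumes the original window itself is not closed inside the block.
    """
    original = driver.current_window_handle
    known = set(driver.window_handles)
    try:
        yield original
    finally:
        # Copy the list, because closing windows changes driver.window_handles.
        for handle in list(driver.window_handles):
            if handle not in known:
                driver.switch_to.window(handle)
                driver.close()
        driver.switch_to.window(original)
```

The per-row work from the question (opening the blank tab, loading the bond detail page, clicking the trade history link) would go inside `with temporary_tabs(browser):`, so no index such as `window_handles[1]` is needed when cleaning up.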