
How to scrape text from a tooltip using Selenium? The page does not contain the tooltip HTML

2021-03-07

I am trying to scrape items from a page using Selenium. The only part I am still stuck on is scraping the text from an item's tooltip.

[Image: item with tooltip]

The HTML of one item on the page, as shown in the browser developer tools (F12):

<div class="offer-page-sale-item offer-page-sale-item__actual" data-real-id="12345" data-id="12345-4">
    <!-- Render product -->
    <div class="item ">
        <div class="discount price bare">
            <div class="t1">
                <span class="value">1</span>
                <span class="cents">69</span>
                <span class="eur">€</span>
            </div>
        </div>

        <div class="tags tags_primary">
            <div class="tag i tooltipstered"></div>
        </div>

        <div class="img"></div>

        <div class="text">
            <div class="title">[name of the product]</div>
            <div class="pack">
                <span class="ptitle"></span>
                <span class="price"> </span>
            </div>
        </div>
        <div class="infopop">
            <div class="in"><span></span>
                <div class="arrow_top"></div>
                <div class="arrow_bottom"></div>
            </div>
        </div>
    </div>                        
</div>

If I look at the HTML of the same item in the page source (Ctrl+U), I see:

<div class="offer-page-sale-item offer-page-sale-item__actual" data-real-id="12345" data-id="12345-4">
    <!-- Render product -->
    <div class="item ">
        <div class="discount price bare">
            <div class="t1">
                <span class="value">1</span>
                <span class="cents">69</span>
                <span class="eur">€</span>
            </div>
        </div>

        <div class="tags tags_primary">
            <div title="[product price valid from]-[product price valid until]" class="tag i"></div>
        </div>

        <div class="img"></div>

        <div class="text">
            <div class="title">[name of the product]</div>
            <div class="pack">
                <span class="ptitle"></span>
                <span class="price"> </span>
            </div>
        </div>
        <div class="infopop">
            <div class="in"><span></span>
                <div class="arrow_top"></div>
                <div class="arrow_bottom"></div>
            </div>
        </div>
    </div>                        
</div>

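So the only difference is inside the <div class="tags tags_primary"> tag. Since I can actually see the text in the page source, I assume I should be able to capture it? However, the Selenium driver only gives me the class="tag i tooltipstered" element, not the class="tag i" element, and only the latter has the title attribute I need.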
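
Since the title attribute is present in the raw page source, one alternative is to parse that source directly instead of the rendered DOM. The following is only a minimal sketch using Python's standard html.parser module; it is not the approach from the answer below. The names TagTitleParser, extract_tooltip_titles and SAMPLE are hypothetical, and SAMPLE is just the fragment from the page-source listing above.

```python
from html.parser import HTMLParser


class TagTitleParser(HTMLParser):
    """Collect the title attribute of every <div> whose class list includes 'tag'."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Split the class attribute so that "tags" or "tags_primary" do not match "tag".
        classes = (attr_map.get("class") or "").split()
        if tag == "div" and "tag" in classes and attr_map.get("title"):
            self.titles.append(attr_map["title"])


def extract_tooltip_titles(raw_html):
    """Return the tooltip titles found in a raw (unrendered) HTML string."""
    parser = TagTitleParser()
    parser.feed(raw_html)
    return parser.titles


# Fragment copied from the page-source listing above (placeholders kept as-is).
SAMPLE = """
<div class="tags tags_primary">
    <div title="[product price valid from]-[product price valid until]" class="tag i"></div>
</div>
"""

print(extract_tooltip_titles(SAMPLE))
```

To use this on the real page, the raw HTML would have to be downloaded without a browser (for example with urllib.request), which assumes the site serves the same markup to a non-browser client. The rendered DOM no longer carries the title attribute, so Selenium's view of the page would yield an empty list here.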

I have tried:

  1. Using the Actions class with MoveToElement(), but I still could not find the tooltip title.
  2. Using IJavaScriptExecutor to get the innerHTML of <div class="tags tags_primary">, but with no success.

If anyone has any ideas, I would be grateful.

Update: I rewrote @PDHide's Python code in C#. First, I created an extension method for IWebDriver:

public static IWebElement WaitUntilVisible(this IWebDriver driver, By itemSpecifier, int secondsTimeout = 10)
{
    var wait = new WebDriverWait(driver, new TimeSpan(0, 0, secondsTimeout));
    // The lambda parameter is named "d" so it does not shadow the outer "driver" parameter.
    var element = wait.Until<IWebElement>(d =>
    {
        try
        {
            var elementToBeDisplayed = d.FindElement(itemSpecifier);
            if (elementToBeDisplayed.Displayed)
            {
                return elementToBeDisplayed;
            }
            return null;
        }
        catch (StaleElementReferenceException)
        {
            return null;
        }
        catch (NoSuchElementException)
        {
            return null;
        }
    });
    return element;
}

Then I use it to collect the text from the tooltip:

using(IWebDriver _driver = new ChromeDriver())
{
    _driver.Navigate().GoToUrl("https://www.maxima.lt/akcijos");
    _driver.WaitUntilVisible(By.CssSelector("#CybotCookiebotDialogBodyLevelButtonLevelOptinAllowallSelectionWrapper #CybotCookiebotDialogBodyLevelButtonLevelOptinAllowAll")).Click();
    _driver.WaitUntilVisible(By.ClassName("close")).Click();
    var tool = _driver.WaitUntilVisible(By.CssSelector("[class='tag i tooltipstered']"));
    Actions actions = new Actions(_driver);
    actions.MoveToElement(tool).Build().Perform();
    var tooltip = _driver.WaitUntilVisible(By.ClassName("tooltipster-content"));
    Console.WriteLine(tooltip.Text);
}

1 Answer

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
driver.get("https://www.maxima.lt/akcijos")

#close cookies 
WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located(
        (By.CSS_SELECTOR, '#CybotCookiebotDialogBodyLevelButtonLevelOptinAllowallSelectionWrapper #CybotCookiebotDialogBodyLevelButtonLevelOptinAllowAll'))
).click()


#close ad pop up
WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located(
        (By.CLASS_NAME, 'close'))
).click()

#find information icon
tool = WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located(
        (By.CSS_SELECTOR, '[class="tag i tooltipstered"]'))
)


driver.maximize_window()

#move to information icon
webdriver.ActionChains(driver).move_to_element(tool).perform()

#find the tool tip
tooltip = WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located(
        (By.CLASS_NAME, 'tooltipster-content'))
)

#print the content
print(tooltip.text)

This is Python code; you can follow the same sequence of steps in C#.

PDHide
2021-03-07