【Question title】: Determine whether a static file site contains a CSS inheritance rule
【Posted】: 2017-10-27 14:38:29
【Question description】:

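I am working on a site that serves static HTML files, and I would like to determine which pages in the site contain a particular CSS inheritance rule, such as .parent .child (a descendant selector: an element with class child nested inside an element with class parent).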

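I can imagine a web crawler that visits each of these pages, runs a test to see whether the page has that style, and returns a report. Is there a tool that already does this well for static file sites (that is, something other than webpack's css-tree-shake-plugin)? I would be grateful for any insight others can offer.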

【Question comments】:

    Tags: css tree-shaking


    【Solution 1】:

    Here is what I came up with:

    #!/usr/bin/env python
    
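    '''
    Setup:

    pip install selenium
    pip install beautifulsoup4
    brew install phantomjs

    Note: written against Selenium 3.x. Selenium 4.x releases have removed
    webdriver.PhantomJS and the find_elements_by_* helpers, so newer versions
    need a headless browser driver and driver.find_elements(By.TAG_NAME, 'a').

    usage: python shake_trees.py '_site' '.parent .child'
    '''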
    
    from bs4 import BeautifulSoup
    from selenium import webdriver
    import sys, copy, multiprocessing, os, fnmatch
    
    def clean_href(href):
      # strip the fragment and query string; anchors without an href give None
      return (href or '').split('#')[0].split('?')[0]
    
    def get_web_urls(all_links, visited):
      recurse = False
      for link in copy.copy(all_links):
        if link not in visited:
          visited.add(link)
          driver.get(link)
          for tag in driver.find_elements_by_tag_name('a'):
            href = clean_href(tag.get_attribute('href'))
            if domain in href and href not in visited:
              recurse = True
              all_links.add(href)
      if recurse:
        return get_web_urls(all_links, visited)
      else:
        print(all_links, visited)
        return all_links
    
    def get_static_site_urls():
      matches = []
      for root, dirnames, filenames in os.walk(root_url):
        for filename in fnmatch.filter(filenames, file_match):
          matches.append(os.path.join(root, filename))
      return matches
    
    if __name__ == '__main__':
    
      # parse command line arguments
      root_url = sys.argv[1]
      css_selector = sys.argv[2]
    
      # globals
      domain = root_url
      # treat the argument as a local directory unless it is an http(s) URL
      static_site = not root_url.startswith('http')
      file_match = '*.html'
    
      # initialize the phantom driver
      driver = webdriver.PhantomJS()
      driver.set_window_size(1000, 1000)
    
      if static_site:
        urls = get_static_site_urls()
      else:
        driver.get( root_url )
        urls = get_web_urls( set([root_url]), set() )
    
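      for url in urls:
        if static_site:
          # PhantomJS needs an absolute file:// URL for local files
          url = 'file://' + os.path.abspath(url)
        driver.get(url)
        soup = BeautifulSoup( driver.page_source, 'html.parser' )
        if soup.select_one(css_selector):
          print('match on', url)

      # shut down the PhantomJS process once every page has been checked
      driver.quit()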
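
For a purely static site whose pages do not depend on JavaScript, the same check can probably be done without a browser. The following is a minimal sketch using only the Python standard library, not part of the script above. It assumes the rule always has the two-class form .parent .child. The names DescendantClassFinder, has_descendant_match and pages_matching are hypothetical helpers introduced here for illustration.

```python
from html.parser import HTMLParser
from pathlib import Path

# Elements that never take a closing tag, so they are not pushed on the stack.
VOID_TAGS = {'area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
             'link', 'meta', 'source', 'track', 'wbr'}


class DescendantClassFinder(HTMLParser):
    """Hypothetical helper: flags an element carrying `child_class` that is
    nested anywhere inside an element carrying `parent_class`."""

    def __init__(self, parent_class, child_class):
        super().__init__()
        self.parent_class = parent_class
        self.child_class = child_class
        self.stack = []      # (tag, set of classes) for each open element
        self.found = False

    def handle_starttag(self, tag, attrs):
        classes = set((dict(attrs).get('class') or '').split())
        # Match when this element has the child class and some open
        # ancestor has the parent class.
        if self.child_class in classes and any(
                self.parent_class in ancestor for _, ancestor in self.stack):
            self.found = True
        if tag not in VOID_TAGS:
            self.stack.append((tag, classes))

    def handle_endtag(self, tag):
        # Pop back to the most recent open element with this tag name;
        # stray closing tags with no open element are ignored.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break


def has_descendant_match(html, parent_class, child_class):
    """Return True if the markup contains a `.parent_class .child_class` match."""
    finder = DescendantClassFinder(parent_class, child_class)
    finder.feed(html)
    finder.close()
    return finder.found


def pages_matching(root_dir, parent_class, child_class):
    """Return the .html files under root_dir that contain a match, sorted by path."""
    matches = []
    for path in sorted(Path(root_dir).rglob('*.html')):
        html = path.read_text(encoding='utf-8', errors='replace')
        if has_descendant_match(html, parent_class, child_class):
            matches.append(path)
    return matches
```

Calling pages_matching('_site', 'parent', 'child') would then list the matching files. The sketch reads the raw markup only, so it misses elements added by scripts, and it does not model implied end tags (for example an unclosed p or li), which can cause false positives on loosely written HTML. For arbitrary selectors, select_one from BeautifulSoup as used above is the safer choice.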
    

    【Comments】:
