【Question Title】: python: using threading, apply thread timeout
【Posted】: 2017-04-24 20:17:43
【Question Description】:

I am using the threading library in a multithreaded script. I want to apply a timeout to each thread: if a thread has not returned task_done after a specified time, it should exit the function and return task_done anyway.

Here is my code:

# Note: Date_set, NUMBER_OF_THREADS, s, Cty, CS and P are defined elsewhere in the script.
import threading
import Queue

queue = Queue.Queue()


def create_workers():
    for _ in range(NUMBER_OF_THREADS):
        t = threading.Thread(target=work)
        t.daemon = True
        t.start()


def create_jobs():
    for d in Date_set:
        queue.put(d)
    queue.join()
    scrape()


def scrape_page(thread_name, page_url):
    print(thread_name + ' now working on ' + page_url)
    get_active_urls_perDay(session=s,Date=page_url,County=Cty, courtSystem=CS, PT=P)


def work():
    while True:
        url = queue.get()
        scrape_page(threading.current_thread().name, url)
        Date_set.remove(url)
        print str(len(Date_set)) + " days more to go!"
        print "Number of threads active", threading.activeCount()
        queue.task_done()


def scrape():
    queued_links = Date_set
    if len(queued_links) > 0:
        print(str(len(queued_links)) + ' days in the queue')
        create_jobs()

In the work function I want to apply the timeout to each thread. Apart from that the code runs fine, but threads that never return task_done stall the whole script, which keeps waiting for them to come back.
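One way to approximate this, since Python cannot forcibly kill a thread, is to run the blocking call in a helper thread and join it with a timeout, marking the queue item done either way. This is a minimal sketch, not the poster's code; the wrapper and the TIMEOUT_SECONDS constant are assumptions layered on the names used above:

import threading

TIMEOUT_SECONDS = 5  # hypothetical per-task limit


def work():
    while True:
        url = queue.get()
        # Run the potentially hanging call in its own helper thread.
        helper = threading.Thread(target=scrape_page,
                                  args=(threading.current_thread().name, url))
        helper.daemon = True   # do not block interpreter exit
        helper.start()
        helper.join(TIMEOUT_SECONDS)   # wait at most TIMEOUT_SECONDS
        if helper.is_alive():
            # The task timed out; the helper keeps running in the
            # background, but this worker stops waiting for it.
            print url + ' timed out, moving on'
        else:
            Date_set.remove(url)
            print str(len(Date_set)) + " days more to go!"
        queue.task_done()   # always mark done so queue.join() can return

The caveat is that a timed-out helper thread is not actually stopped; it lingers until the process exits, so this only frees the worker to move on, it does not reclaim the hung connection.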

【Question Discussion】:

    Tags: python multithreading timeout python-multithreading


    【Solution 1】:
    import threading
    import Queue
    import time
    
    lock = threading.Lock()
    
    Date_set = ['127.0.0.1/test1', '127.0.0.1/test3', '127.0.0.1/test3', '127.0.0.1/test4']
    queue = Queue.Queue()
    NUMBER_OF_THREADS = 3
    
    
    def create_jobs():
        for d in Date_set:
            queue.put(d)
        # scrape()
    
    thread_list = []
    
    def create_workers():
        for _ in range(NUMBER_OF_THREADS):
            t = threading.Thread(target=work)
            thread_list.append(t)
            t.daemon = True
            t.start()
    
    
    def join_all():
        [t.join(5) for t in thread_list]
    
    
    def scrape_page(thread_name, page_url):
        time.sleep(1)
        lock.acquire()
        print(thread_name + ' now working on ' + page_url)
        print page_url + ' done'
        lock.release()
        # get_active_urls_perDay(session=s,Date=page_url,County=Cty, courtSystem=CS, PT=P)
    
    
    def work():
        while True:
            if queue.empty():
                break
            url = queue.get()
            try:
                scrape_page(threading.current_thread().name, url)
                # Date_set.remove(url)
                lock.acquire()
                print str(len(Date_set)) + " days more to go!"
                print "Number of threads active", threading.activeCount()
                lock.release()
            finally:
                queue.task_done()
    
    
    def scrape():
        queued_links = Date_set
        if len(queued_links) > 0:
            print(str(len(queued_links)) + ' days in the queue')
            create_jobs()
    
    
    # s=session
    # Cty= County
    # CS= courtSystem
    # P= PT
    # Date_set = create_dates_set(start_filingDate, end_filingDate)
    create_jobs()
    create_workers()
    join_all()
    print 'main thread quit and all worker thread quit even if it is not finished'
    # scrape()
    # return case_urls
    

    This example runs. I used sleep(200) to simulate get_active_urls_perDay; after 15 seconds (three sequential 5-second joins) the script stops. If you replace sleep(200) with sleep(1), all the threads finish and the main thread exits normally.
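
    To see the join timeout in isolation, here is a small standalone snippet (an assumed illustration, not part of the answer's script):

    import threading
    import time

    def slow():
        time.sleep(200)   # stands in for a hanging scrape

    t = threading.Thread(target=slow)
    t.daemon = True
    t.start()
    t.join(5)             # returns after at most 5 seconds
    print t.is_alive()    # True: the join timed out and slow() is still running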

    【Discussion】:

    • It is working, but when one thread gets stuck it makes all the threads and the main process end at once, and the whole script finishes...
    • Oh! No no... your code works well, it stopped just one thread... but the whole script exited at once too. Can we somehow keep the whole script from ending?
    • What do you mean by stopped just one thread? The whole script ends once every join reaches its timeout.
    • I think that when one thread breaks, the code drops out of the threads to print the statement you wrote there and then ends, while the other threads keep working in the background. E.g. 8 threads are working; when thread no. 1 fails, the script terminates by printing main thread quit and all worker thread quit even if it is not finished, while the remaining 7 threads are still working.
    • No, the main thread will not exit because a single thread broke; it does not exit until every thread it called join() on terminates - normally or through an unhandled exception - or until the optional timeout occurs. When the main thread exits, all worker threads are killed with it (a worker hanging for more than 5 seconds exceeds the join timeout, and the main thread's exit kills it).
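
    The daemon behaviour the last comment describes can be checked with another standalone snippet (again an assumed illustration):

    import threading
    import time

    def worker():
        time.sleep(60)           # still busy when the main thread ends
        print 'never printed'    # a daemon thread dies with the main thread

    t = threading.Thread(target=worker)
    t.daemon = True              # with daemon=False the interpreter would wait
    t.start()
    t.join(2)                    # stop waiting after 2 seconds
    print 'main thread exiting; the daemon worker is killed with it'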
    【Solution 2】:
    def create_jobs():
        for d in Date_set:
            queue.put(d)
        scrape()
    
    def create_workers():
        thread_list=[]
        for _ in range(NUMBER_OF_THREADS):
            t = threading.Thread(target=work)
            thread_list.append(t)
            t.daemon = True
            t.start()
        return thread_list
    
    def join_all(thread_list):
        [t.join(5) for t in thread_list]
    
    
    
    def scrape_page(thread_name, page_url):
        print(thread_name + ' now working on ' + page_url)
        get_active_urls_perDay(session=s,Date=page_url,County=Cty, courtSystem=CS, PT=P)
    
    
    def work():
        while True:
            url = queue.get()
            try:
                scrape_page(threading.current_thread().name, url)
                Date_set.remove(url)
                print str(len(Date_set)) + " days more to go!"
                print "Number of threads active", threading.activeCount()
            finally:
                queue.task_done()
    
    def scrape():
        queued_links = Date_set
        if len(queued_links) > 0:
            print(str(len(queued_links)) + ' days in the queue')
            create_jobs()
    
    
    s = session
    Cty = County
    CS = courtSystem
    P = PT
    Date_set = create_dates_set(start_filingDate, end_filingDate)
    t_list= create_workers()
    join_all(t_list)
    scrape()
    return case_urls  # this block is meant to live inside the poster's original function
    

    【Discussion】:
