How to download images from a txt file? Answers

【Question title】: How to download images from a txt file?
【Posted】: 2021-07-31 00:22:06
【Question description】:

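[![enter image description here][1]][1] I want to download the images from a Wikipedia page, so I wrote this program, which saves all the image links to a txt file, but I don't know how to continue the program so that it downloads the files. Can someone help me?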

from urllib.request import urlopen
from bs4 import BeautifulSoup
import wikipedia
import re

title = input("Title: ")
link = wikipedia.page(title).url
html = urlopen(link)
bs = BeautifulSoup(html, 'html.parser')
# Match every <img> whose src contains ".jpg" (the dot is escaped so it is literal)
images = bs.find_all('img', {'src': re.compile(r'\.jpg')})
# Write one URL per line, prepending the scheme to each src;
# the with block closes the file when the loop is done
with open("cache.txt", "w") as f:
    for image in images:
        url = 'https:' + image['src'] + '\n'
        f.write(url)

【Comments on the question】:

Does this answer your question? Downloading a picture via urllib and python
import urllib.request; urllib.request.urlretrieve(url, filename) is all it takes
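The one-liner in the comment above needs a `filename` for each URL. A minimal sketch of one way to derive it, assuming the last path component of the URL is an acceptable local name; `filename_from_url` and `download` are hypothetical helper names, not part of the thread:

```python
import os
from urllib.parse import urlsplit, unquote
from urllib.request import urlretrieve


def filename_from_url(url):
    """Return the last path component of a URL, percent-decoded."""
    path = urlsplit(url.strip()).path
    return unquote(os.path.basename(path))


def download(url, retrieve=urlretrieve):
    """Save url under its own file name and return that name.

    `retrieve` defaults to urllib.request.urlretrieve; it is a parameter
    only so the function can be exercised without network access.
    """
    url = url.strip()  # lines read from a text file end in '\n'
    filename = filename_from_url(url)
    retrieve(url, filename)
    return filename
```

Calling `download(line)` for each line of the link file would then save every image in the current directory.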

Tags: python beautifulsoup request wikipedia

【Solution 1】:

I found this, which might help... It downloads one image, but the rest fail with urllib.error.HTTPError: HTTP Error 404: Not Found

import wget
import csv
with open('cache.csv', newline='') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
    for row in spamreader:
        wget.download(', '.join(row))
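The thread does not establish why the remaining URLs return 404. Independently of the cause, a loop can record the failing URLs and carry on instead of stopping at the first error. A rough sketch using `urllib.request.urlretrieve` rather than wget; `download_all` and the `image_N.jpg` naming scheme are hypothetical:

```python
import urllib.error
from urllib.request import urlretrieve


def download_all(urls, retrieve=urlretrieve):
    """Try every URL and return (saved, failed) instead of raising.

    `retrieve` is swappable so the loop can be tested offline.
    """
    saved, failed = [], []
    for number, url in enumerate(urls, start=1):
        target = f"image_{number}.jpg"  # hypothetical naming scheme
        try:
            retrieve(url, target)
        except urllib.error.HTTPError as err:
            # Remember the URL and the HTTP status, then move on
            failed.append((url, err.code))
        else:
            saved.append(target)
    return saved, failed
```

Printing `failed` afterwards shows exactly which lines of the link file the server rejected, which makes it easier to inspect those URLs by hand.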

【Comments】:

【Solution 2】:

You can use the wget module to download files.

pip install wget

Download a file with wget:

wget.download(url)

You have to iterate over every line of the txt file and download each file with wget.

Python code:

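import wget
import csv


with open("cache.txt", "r", newline="") as f:
    rows = csv.reader(f)
    for i in rows:
        # Each row is a list of fields; the URL is the first field.
        # Blank lines produce an empty list, so skip them.
        if i:
            wget.download(i[0])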
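Since the file holds one URL per line, the csv module is not strictly needed. A minimal sketch that reads the lines directly, assuming the file is UTF-8 text; `read_urls` is a hypothetical helper name:

```python
def read_urls(path):
    """Return the non-empty lines of a text file, stripped of whitespace."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]
```

With the wget module from this answer, the loop would then be `for url in read_urls("cache.txt"): wget.download(url)`.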

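【Comments】:

I get: AttributeError: 'list' object has no attribute 'decode'
Check your txt file to see whether the URLs were saved correctly. It would help if you could share a screenshot of your csv file.
I put a screenshot in the post
wget.download(i[0]) will solve the problem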
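The error message suggests that a whole row, which is a list, was passed where a URL string was expected. A small illustration of what `csv.reader` yields, using an in-memory file with made-up example.org URLs in place of cache.txt:

```python
import csv
import io

# Stand-in for cache.txt: one URL per line (example URLs, not from the thread)
sample = io.StringIO("https://example.org/a.jpg\nhttps://example.org/b.jpg\n")

rows = list(csv.reader(sample))

# Every row is a list of fields, even when the line has a single field,
# so the URL itself is the first element of the row.
first_url = rows[0][0]
```

Here `rows[0]` is a one-element list and `rows[0][0]` is the string, which is why indexing with `[0]` resolves the error.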

【Solution 3】:

I solved it; here is the code:

from urllib.request import urlopen, urlretrieve
from bs4 import BeautifulSoup
import wikipedia
import re

title = input("Title: ")
link = wikipedia.page(title).url
html = urlopen(link)
bs = BeautifulSoup(html, 'html.parser')
images = bs.find_all('img', {'src': re.compile(r'\.jpg')})

# Step 1: save one image URL per line; the with block closes the file
# before it is read back below
with open("cache.txt", "w") as f:
    for image in images:
        f.write('https:' + image['src'] + '\n')

# Step 2: read the URLs back and download each one
with open("cache.txt") as f:
    for line in f:
        url = line.strip()
        if not url:
            continue
        # Name the local file "image" + the last path component of the URL
        path = 'image' + url.rsplit('/', 1)[-1]
        urlretrieve(url, path)
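The link-collection step can also be tried without network access or third-party packages. The sketch below swaps BeautifulSoup for the standard library's `html.parser` and uses `urljoin` instead of prepending `'https:'` by hand. It also keeps only `src` values that end in `.jpg`, which is stricter than the regular expression in the answer; `JpgCollector` and `jpg_urls` are hypothetical names:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class JpgCollector(HTMLParser):
    """Collect the src of every <img> tag whose URL ends in .jpg."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src and src.lower().endswith(".jpg"):
            # urljoin resolves '//host/x.jpg' and '/x.jpg' against the page URL
            self.urls.append(urljoin(self.base_url, src))


def jpg_urls(html, base_url):
    """Return absolute .jpg image URLs found in an HTML string."""
    parser = JpgCollector(base_url)
    parser.feed(html)
    return parser.urls
```

Passing the page's HTML text and its URL, for example `jpg_urls(page_html, link)`, returns a list that can be written to cache.txt one URL per line.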

【Comments】: