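【Title】: Using Python to download specific .pdb files from Protein Data Bank
【Posted】: 2016-09-17 01:46:19
【Question】:

I have been trying to download .pdb files from the Protein Data Bank. I wrote the following block of code to fetch these files, but the files I download contain web pages instead of structure data.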

#Sector C - Processing block:
RefinedPDBCodeList = [] #C1
with open('RefinedPDBCodeList') as inputfile:
    for line in inputfile:
         RefinedPDBCodeList.append(line.strip().split(','))

print(RefinedPDBCodeList[0])
['101m.pdb']

import urllib.request      
for i in range(0, 1): #S2 - range(0, len(RefinedPDBCodeList)):
    path=urllib.request.urlretrieve('http://www.rcsb.org/pdb/explore/explore.do?structureId=101m', '101m.pdb')

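【Comments】:

    Tags: python chemistry


    【Solution 1】:

    Your base URL appears to be wrong. The explore.do address is the entry's web page, not the file itself. Try this instead: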

    urllib.request.urlretrieve('http://files.rcsb.org/download/101M.pdb', '101m.pdb')
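To apply the same fix to every entry in the list from the question, a minimal sketch might look like the following. The helper names `pdb_url` and `download_all` are hypothetical, the base URL is the one from this answer, and the code assumes each entry looks like '101m.pdb' or '101m'.

```python
import urllib.request

# Base URL taken from this answer.
BASE_URL = "http://files.rcsb.org/download/"


def pdb_url(entry, base_url=BASE_URL):
    """Build the download URL for an entry such as '101m.pdb' or '101m'.

    Hypothetical helper: strips whitespace and an optional '.pdb' suffix,
    then upper-cases the code to match the URL shown in this answer.
    """
    code = entry.strip()
    if code.lower().endswith(".pdb"):
        code = code[:-4]
    return base_url + code.upper() + ".pdb"


def download_all(entries):
    """Download every entry into the current directory (e.g. '101m.pdb')."""
    for entry in entries:
        url = pdb_url(entry)
        filename = url.rsplit("/", 1)[1].lower()
        urllib.request.urlretrieve(url, filename)
```

Because the question's loop stores each line as a list (`RefinedPDBCodeList[0]` is `['101m.pdb']`), the call would be something like `download_all(row[0] for row in RefinedPDBCodeList)`.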
    

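    【Comments】:

      【Solution 2】:

      BioPython provides a retrieval method, PDBList.retrieve_pdb_file. However, it relies on the PDB FTP service. If the FTP port is not open for some reason (a firewall, etc.), you can use this function instead: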

      import os
      import sys
      import urllib.request

      def download_pdb(pdbcode, datadir, downloadurl="https://files.rcsb.org/download/"):
          """
          Downloads a PDB file from the Internet and saves it in a data directory.
          :param pdbcode: The standard PDB ID e.g. '3ICB' or '3icb'
          :param datadir: The directory where the downloaded file will be saved
          :param downloadurl: The base PDB download URL, cf.
              `https://www.rcsb.org/pages/download/http#structures` for details
          :return: the full path to the downloaded PDB file or None if something went wrong
          """
          pdbfn = pdbcode + ".pdb"
          url = downloadurl + pdbfn
          outfnm = os.path.join(datadir, pdbfn)
          try:
              urllib.request.urlretrieve(url, outfnm)
              return outfnm
          except Exception as err:
              # Report the failure on stderr and signal it with None
              print(str(err), file=sys.stderr)
              return None
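
Since the original symptom was that the saved files contained web pages, a small sanity check after each download may help catch a wrong URL early. This is only a sketch: the helper name `looks_like_html` is hypothetical, and it assumes that an HTML response starts with a doctype or `<html>` tag within the first 512 bytes.

```python
def looks_like_html(path):
    """Return True if the file at `path` appears to start with HTML markup.

    Hypothetical helper: it only inspects the first 512 bytes, so it is a
    heuristic rather than a real format validation.
    """
    with open(path, "rb") as handle:
        head = handle.read(512).lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")
```

One possible use is to call it on the path returned by `download_pdb` and treat a `True` result as a failed download.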
      

      【Comments】:

        【Solution 3】:

        The URL has been updated (although the old URL currently redirects to the new one):

        urllib.request.urlretrieve('https://files.rcsb.org/download/101M.pdb', '101m.pdb')
        

        See https://www.rcsb.org/pdb/static.do?p=download/http/index.html for a complete list of URLs for the different downloads that RCSB PDB offers.

        【Comments】:
