【Posted】:2016-09-27 20:46:27
【Problem description】:
I want to scrape some data from a website and store it in a PostgreSQL database, but I get the following error:
ImportError: No module named 'settings'
My folder tree is:
Coder
├── coder_app
│ ├── __init__.py
│ ├── items.py
│ ├── models.py
│ ├── pipelines.py
│ ├── settings.py
│ └── spiders
│ ├── __init__.py
│ └── livingsocial.py
├── output.json
└── scrapy.cfg
The spider code is as follows:
# -*- coding: utf-8 -*-
import scrapy
from coder_app.items import CoderItem
# from scrapy.loader import ItemLoader

class LivingsocialSpider(scrapy.Spider):
    name = "livingsocial"
    allowed_domains = ["livingsocial.com"]
    start_urls = (
        'http://www.livingsocial.com/cities/15-san-francisco',
    )

    def parse(self, response):
        # deals = response.xpath('//li')
        for deal in response.xpath('//li[a//h2/text()]'):
            item = CoderItem()
            item['title'] = deal.xpath('a//h2/text()').extract_first()
            item['link'] = deal.xpath('a/@href').extract_first()
            item['location'] = deal.xpath('a//p[@class="location"]/text()').extract_first()
            item['price'] = deal.xpath('a//div[@class="deal-price"]/text()').extract_first()
            item['end_date'] = deal.xpath('a//p[@class="dates"]/text()').extract_first()
            yield item
pipelines.py is:
from sqlalchemy.orm import sessionmaker
from coder_app.models import Deals, db_connect, create_deals_table

class CoderPipeline(object):
    def process_item(self, item, spider):
        return item

class LivingSocialPipeline(object):
    """Livingsocial pipeline for storing scraped items in the database"""

    def __init__(self):
        """
        Initializes database connection and sessionmaker.
        Creates deals table.
        """
        engine = db_connect()
        create_deals_table(engine)
        self.Session = sessionmaker(bind=engine)

    def process_item(self, item, spider):
        """Save deals in the database.

        This method is called for every item pipeline component.
        """
        session = self.Session()
        deal = Deals(**item)
        try:
            session.add(deal)
            session.commit()
        except:
            session.rollback()
            raise
        finally:
            session.close()
        return item
models.py is:
from sqlalchemy import create_engine, Column, Integer, String, DateTime
from sqlalchemy.engine.url import URL
from sqlalchemy.ext.declarative import declarative_base
import settings

DeclarativeBase = declarative_base()

def db_connect():
    """
    Performs database connection using database settings from settings.py.
    Returns sqlalchemy engine instance
    """
    return create_engine(URL(**settings.DATABASE))

def create_deals_table(engine):
    """
    Performs table creation
    """
    DeclarativeBase.metadata.create_all(engine)

class Deals(DeclarativeBase):
    """Sqlalchemy deals model"""
    __tablename__ = "deals"

    id = Column(Integer, primary_key=True)
    title = Column('title', String)
    link = Column('link', String, nullable=True)
    location = Column('location', String, nullable=True)
    #original_price = Column('original_price', String, nullable=True)
    price = Column('price', String, nullable=True)
    end_date = Column('end_date', DateTime, nullable=True)
settings.py is:
BOT_NAME = 'coder_app'
SPIDER_MODULES = ['coder_app.spiders']
NEWSPIDER_MODULE = 'coder_app.spiders'

DATABASE = {
    'drivername': 'postgres',
    'host': 'localhost',
    'port': '5432',
    'username': 'mohib',
    'password': '100200',
    'database': 'scrape'
}

ITEM_PIPELINES = {
    'coder_app.pipelines.LivingSocialPipeline': 300,
}
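For context, the keys in DATABASE are exactly the keyword arguments that sqlalchemy.engine.url.URL expects, and db_connect expands them into a connection URL. A minimal, stdlib-only sketch of the DSN that this dict encodes (the string formatting here is purely illustrative; the real code should keep using URL):

```python
DATABASE = {
    'drivername': 'postgres',
    'host': 'localhost',
    'port': '5432',
    'username': 'mohib',
    'password': '100200',
    'database': 'scrape',
}

# The connection URL follows the standard shape:
# drivername://username:password@host:port/database
dsn = "{drivername}://{username}:{password}@{host}:{port}/{database}".format(**DATABASE)
print(dsn)  # postgres://mohib:100200@localhost:5432/scrape
```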
Why does this error occur? I also tried
code_app.import settings
in models.py, but then I got this error:
NameError: name 'settings' is not defined
I'm really stuck here. Can anyone help me?
【Comments】:
- Have you tried from code_app import settings?
- Works like a charm! Thank you :)
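The suggestion in the comments works because only the project root (the directory containing scrapy.cfg) is on the import path when Scrapy runs, so a plain import settings cannot find settings.py, which lives inside the coder_app package. A small self-contained sketch that mimics this layout in a throwaway temp directory and shows the package-qualified import succeeding:

```python
import os
import sys
import tempfile

# Recreate a minimal version of the project layout:
# project/
# └── coder_app/
#     ├── __init__.py
#     └── settings.py
project = tempfile.mkdtemp()
pkg = os.path.join(project, "coder_app")
os.makedirs(pkg)
open(os.path.join(pkg, "__init__.py"), "w").close()
with open(os.path.join(pkg, "settings.py"), "w") as f:
    f.write("DATABASE = {'drivername': 'postgres'}\n")

# Only the project root is importable, as when Scrapy runs the spider.
sys.path.insert(0, project)

# settings.py sits inside the coder_app package, not at the project
# root, so it must be imported as a member of that package:
from coder_app import settings
print(settings.DATABASE["drivername"])  # postgres
```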
Tags: python-3.x scrapy