【Question Title】: Cannot insert Data into MongoDb: TypeError: document must be an instance of dict, bson.son.SON, bson.raw_bson.RawBSONDocument,
【Posted】: 2018-09-11 11:05:32
【Question】:

I cannot insert data into MongoDB through a variable. If I print the output and paste that text into db.collection.insert_many(output), the code runs fine and the data is stored, but when I call db.collection.insert_many(output_final) directly, it raises the error above. Please help.

import re
import json
from selenium import webdriver
from bs4 import  BeautifulSoup
import requests
from fake_useragent import  UserAgent
import pymongo
myclient = pymongo.MongoClient("mongodb://localhost:27017/")
db = myclient["db_db"]


ua          = UserAgent()
header      = {'user-agent':ua.chrome}
driver      = webdriver.Chrome('C:/Users/MUNTAZIR/Downloads/Compressed/chromedriver_win32/chromedriver.exe')
driver.get('https://www.eduvision.edu.pk/scholarships/index.php?authority=1&level=4&field=1&cat=2&type=1')
# tr = driver.find_element_by_id('ctl00_ctl42_g_7f68baae_5353_4bdd_bfe1_b88e3367234f_csr1_table')
soup = BeautifulSoup(driver.page_source, 'lxml')
scholar = soup.findAll("div", {"class": "card-content col-xs-12"})
s_output1 = ""
for s in scholar[0:1]:
        title=s.findAll("h2")[0].text
        desc = s.findAll("div", {"class": "text"})[0].text.replace("\n", "").replace('"','')
        url = "https://www.eduvision.edu.pk/scholarships/" + s.a['href']
        type= "Higher Education Commission"
        # print(type +"\n" +title +"\n" +desc +"\n" +url +"\n")
        s_output1 = ("{""\n"
             '"type"' + ":" + '" ' + type + ' ",' + "\n"
             '"title"' + ":" + '" ' + title + ' ",' + "\n"
             '"url"' + ":" + '" ' + url + ' ",' + "\n"
             '"description"' + ":" + '" ' + desc + ' "' + "\n"
             "}""\n"
             )
s_output2_d = ""
for s in scholar[1:]:
        title=s.findAll("h2")[0].text
        desc=s.findAll("div",{"class": "text"})[0].text.replace("\n", "").replace('"','')
        url = "https://www.eduvision.edu.pk/scholarships/" + s.a['href']
        type= "Higher Education Commission"
        s_output2 = (",{""\n"
             '"type"' + ":" + '" ' + type + ' ",' + "\n"
             '"title"' + ":" + '" ' + title + ' ",' + "\n"
             '"url"' + ":" + '" ' + url + ' ",' + "\n"
             '"description"' + ":" + '" ' + desc + ' "' + "\n"
             "}""\n"
             )
        s_output2_d += s_output2

output_final = ""
output_final += s_output1 + s_output2_d
print(output_final)

db.collection2.insert_many(output_final)
print("saved")
driver.close()

【Comments】:

  • Your output_final is a string. insert_many takes a list of dictionaries.

Tags: python database mongodb insert


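【Solution 1】:

Mongo's insert_many takes an actual list of actual dictionaries, not a string that would produce a list of dictionaries if it were evaluated. So this is valid: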

db.collection.insert_many([{'a':1, 'b':2}, {'a':3, 'b':4}]) # list of dictionaries

This is not valid:

db.collection.insert_many("[{'a':1, 'b':2}, {'a':3, 'b':4}]") # string
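
To see roughly why the string version fails, note that iterating over a Python string yields one-character strings, and none of them is a mapping. The sketch below mimics that kind of check in plain Python. `find_non_documents` is a hypothetical helper written for illustration, not part of pymongo, and the driver's real validation may differ in detail.

```python
from collections.abc import Mapping


def find_non_documents(documents):
    """Return the items that are not dict-like.

    Hypothetical helper for illustration only; not a pymongo API.
    """
    return [item for item in documents if not isinstance(item, Mapping)]


good = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]   # list of dictionaries
bad = "[{'a':1, 'b':2}, {'a':3, 'b':4}]"      # string that only looks like one

# Every item in the list is a dict, so nothing is reported.
print(find_non_documents(good))

# Every "item" of the string is a single character, so all are reported.
print(find_non_documents(bad)[:3])
```

A check like this can be run just before the insert to catch the mistake with a clearer message.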

Edit: How do you turn the string into a list? Don't build a string, build a list! For example (untested):

list_of_s_output_1s = []
for s in scholar[0:1]:
        title=s.findAll("h2")[0].text
        description = s.findAll("div", {"class": "text"})[0].text.replace("\n", "").replace('"','')
        url = "https://www.eduvision.edu.pk/scholarships/" + s.a['href']
        type= "Higher Education Commission"
        # print(type +"\n" +title +"\n" +desc +"\n" +url +"\n")
        s_output_1 = {
            "type": type, 
            "title": title, 
            "url": url, 
            "description": description
        }
        list_of_s_output_1s.append(s_output_1)

list_of_s_output_2s = []
for s in scholar[1:]:
        title = s.findAll("h2")[0].text
        description = s.findAll("div",{"class": "text"})[0].text.replace("\n", "").replace('"','')
        url = "https://www.eduvision.edu.pk/scholarships/" + s.a['href']
        type= "Higher Education Commission"
        s_output_2 = {
             "type": type,
             "title": title,
             "url": url,
             "description": description
            }
        list_of_s_output_2s.append(s_output_2)

output_final = list_of_s_output_1s + list_of_s_output_2s

db.collection2.insert_many(output_final)
print("saved")
driver.close()
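
The two loops above differ only in the slice they iterate over, so they could probably be merged into one. The following is a minimal sketch of that idea using only the standard library. `build_documents`, `rows`, and the sample values are hypothetical stand-ins for the scraped title, description, and href; the MongoDB call is left as a comment because it needs a running server.

```python
BASE_URL = "https://www.eduvision.edu.pk/scholarships/"


def build_documents(rows, source="Higher Education Commission"):
    """Turn (title, description, href) tuples into a list of dicts."""
    documents = []
    for title, description, href in rows:
        documents.append({
            "type": source,
            "title": title,
            "url": BASE_URL + href,
            # Newlines are removed, as in the original loops.
            "description": description.replace("\n", ""),
        })
    return documents


# Made-up sample rows standing in for values scraped from the page.
rows = [
    ("Sample Scholarship A", "First\n description", "a.php"),
    ("Sample Scholarship B", "Second description", "b.php"),
]
output_final = build_documents(rows)
print(output_final)

# insert_many raises a TypeError for an empty list, so guard the call:
# if output_final:
#     db.collection2.insert_many(output_final)
```

Because the values are stored as dictionary entries rather than spliced into a string, the `.replace('"','')` step from the question is no longer needed.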

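【Comments】:

  • So what should I do now? Do I need to convert the string output into a dictionary? Please suggest how I can save the output_final data to MongoDB.
  • See the edit. Don't convert it. Just build a list instead of a string.
  • @muntazirabbas, did this answer your question?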
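
For the case where the data really does arrive only as JSON text (for example, read from a file), one possible approach is to wrap it in brackets and parse it with the standard `json` module. This is a sketch under the assumption that the text is valid JSON once wrapped; `output_text` below is a made-up sample in the same comma-separated shape the question's code prints. Building the list directly, as the answer recommends, remains the simpler route.

```python
import json

# Made-up sample: objects separated by commas, with no enclosing brackets.
output_text = '{"type": "A", "title": "T1"}\n,{"type": "A", "title": "T2"}\n'

# Wrapping in [ ] makes the text a JSON array, which json.loads turns
# into a list of dicts. It raises json.JSONDecodeError if the text is
# not valid JSON (for example, an unescaped quote inside a value).
documents = json.loads("[" + output_text + "]")

print(type(documents).__name__, len(documents))
```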