[Question Title]: How to push CSV data to MongoDB using Python
[Posted]: 2015-02-09 13:11:57
[Question]:

I'm trying to push CSV data to MongoDB using Python. I am a beginner with both Python and MongoDB. I'm using the following code:

import csv
import json
import pandas as pd
import sys, getopt, pprint
from pymongo import MongoClient
#CSV to JSON Conversion
csvfile = open('C://test//final-current.csv', 'r')
jsonfile = open('C://test//6.json', 'a')
reader = csv.DictReader( csvfile )
header= [ "S.No", "Instrument Name", "Buy Price", "Buy Quantity", "Sell Price", "Sell Quantity", "Last Traded Price", "Total Traded Quantity", "Average Traded Price", "Open Price", "High Price", "Low Price", "Close Price", "V" ,"Time"]
#fieldnames=header
output=[]
for each in reader:
    row={}
    for field in header:
        row[field]=each[field]
    output.append(row)

json.dump(output, jsonfile, indent=None, sort_keys=False , encoding="UTF-8")
mongo_client=MongoClient() 
db=mongo_client.october_mug_talk
db.segment.drop()
data=pd.read_csv('C://test//6.json', error_bad_lines=0)
df = pd.DataFrame(data)
records = csv.DictReader(df)
db.segment.insert(records)

But the output comes out in this format:

/* 0 */
{
  "_id" : ObjectId("54891c4ffb2a0303b0d43134"),
  "[{\"AverageTradedPrice\":\"0\"" : "BuyPrice:\"349.75\""
}

/* 1 */
{
  "_id" : ObjectId("54891c4ffb2a0303b0d43135"),
  "[{\"AverageTradedPrice\":\"0\"" : "BuyQuantity:\"3000\""
}

/* 2 */
{
  "_id" : ObjectId("54891c4ffb2a0303b0d43136"),
  "[{\"AverageTradedPrice\":\"0\"" : "ClosePrice:\"350\""
}

/* 3 */
{
  "_id" : ObjectId("54891c4ffb2a0303b0d43137"),
  "[{\"AverageTradedPrice\":\"0\"" : "HighPrice:\"0\""
}

What I actually want is a single _id per CSV row, with all the other fields as subfields of that one document, for example:

 _id : ObjectId("54891c4ffb2a0303b0d43137")
    AverageTradedPrice : 0
    HighPrice : 0
    ClosePrice : 350
    BuyPrice : 350.75

Please help me. Thanks in advance.

[Discussion]:

  • output.append(row) => db.segment.insert(row)
  • But if I push directly to mongodb, it produces InvalidDocument: key 'S.No' must not contain '.'
  • Remap the header keys so that S.No becomes S_No; then it is acceptable as a JSON key.
  • Any particular reason not to use mongoimport?
  • I finally got it working. Thanks.

[Tags]: python json mongodb csv


[Solution 1]:

Thanks for the suggestions. Here is the corrected code:

import csv
from pymongo import MongoClient

# Connect and reset the target collection
mongo_client = MongoClient()
db = mongo_client.october_mug_talk
db.segment.drop()

# "S No" instead of "S.No": MongoDB field names must not contain '.'
header = [ "S No", "Instrument Name", "Buy Price", "Buy Quantity", "Sell Price", "Sell Quantity", "Last Traded Price", "Total Traded Quantity", "Average Traded Price", "Open Price", "High Price", "Low Price", "Close Price", "V", "Time"]

csvfile = open('C://test//final-current.csv', 'r')
reader = csv.DictReader(csvfile)
for each in reader:
    row = {}
    for field in header:
        row[field] = each[field]

    db.segment.insert_one(row)  # one document per CSV row
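The per-row dictionary built by that loop can be checked without a running MongoDB; the sample line below is made up:

```python
import csv
from io import StringIO

header = ["S No", "Instrument Name", "Buy Price"]
csv_text = "S No,Instrument Name,Buy Price\n1,FOO,349.75\n"
reader = csv.DictReader(StringIO(csv_text))
for each in reader:
    row = {field: each[field] for field in header}
print(row)  # {'S No': '1', 'Instrument Name': 'FOO', 'Buy Price': '349.75'}
```

Each such dict becomes one document, which is the single-_id-per-row shape the question asks for.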

[Discussion]:

[Solution 2]:

Assuming your CSV has a header row, there is a better way that needs fewer imports.

    from pymongo import MongoClient
    import csv
    
    # DB connectivity
    client = MongoClient('localhost', 27017)
    db = client.db
    collection = db.collection
    
    # Function to parse csv to dictionary
    def csv_to_dict():
        reader = csv.DictReader(open(FILEPATH))
        result = {}
        for row in reader:
            key = row.pop('First_value')
            result[key] = row
        return result
    
    # Final insert statement
    db.collection.insert_one(csv_to_dict())
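Note that this approach stores the entire CSV as one document keyed by the first column's values. A self-contained illustration of the resulting shape (the column names and data are made up):

```python
import csv
from io import StringIO

csv_text = "First_value,price\nA,1\nB,2\n"
reader = csv.DictReader(StringIO(csv_text))
result = {}
for row in reader:
    key = row.pop('First_value')
    result[key] = row
print(result)  # {'A': {'price': '1'}, 'B': {'price': '2'}}
```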
    

    Hope it helps.

[Discussion]:

[Solution 3]:

The simplest way is to use pandas. My code is:

      import json
      import pymongo
      import pandas as pd
      myclient = pymongo.MongoClient()
      
      df = pd.read_csv('yourcsv.csv',encoding = 'ISO-8859-1')   # loading csv file
      df.to_json('yourjson.json')                               # saving to json file
      jdf = open('yourjson.json').read()                        # loading the json file 
      data = json.loads(jdf)                                    # reading json file 
      

Now you can insert this JSON into your MongoDB database :-]
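One caveat worth adding: `DataFrame.to_json` defaults to `orient='columns'`, so `data` above is a single dict keyed by column name, not one dict per CSV row; `orient='records'` gives the per-row shape an insert usually wants. A small comparison on a made-up frame:

```python
import json
import pandas as pd

df = pd.DataFrame({'Buy Price': [349.75, 350.0], 'High Price': [0, 0]})
by_column = json.loads(df.to_json())               # default orient='columns'
by_row = json.loads(df.to_json(orient='records'))  # one dict per row
print(by_column)  # {'Buy Price': {'0': 349.75, '1': 350.0}, 'High Price': {'0': 0, '1': 0}}
print(by_row)     # [{'Buy Price': 349.75, 'High Price': 0}, {'Buy Price': 350.0, 'High Price': 0}]
```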

[Discussion]:

      • er, the part you left out is exactly the problem.
[Solution 4]:

Why insert the data one by one? Have a look at this:

      import pandas as pd
      from pymongo import MongoClient
      
      client = MongoClient(<your_credentials>)
      database = client['YOUR_DB_NAME']
      collection = database['your_collection']
      
      def csv_to_json(filename, header='infer'):
          data = pd.read_csv(filename, header=header)
          return data.to_dict('records')
      
      collection.insert_many(csv_to_json('your_file_path'))
      

Note that your application may run out of memory if the file is too large.
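For files too large to hold in memory, `pd.read_csv` accepts a `chunksize` so rows can be inserted in batches. A sketch, where the in-memory sample and batch size are stand-ins and the `insert_many` call is left commented because it needs a live connection:

```python
import pandas as pd
from io import StringIO

# Ten made-up data rows standing in for a large file on disk
csv_text = "a,b\n" + "\n".join(f"{i},{i * 2}" for i in range(10))

batch_sizes = []
for chunk in pd.read_csv(StringIO(csv_text), chunksize=4):
    records = chunk.to_dict('records')   # list of per-row dicts
    # collection.insert_many(records)    # uncomment with a real collection
    batch_sizes.append(len(records))
print(batch_sizes)  # [4, 4, 2]
```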

[Discussion]:

[Solution 5]:

        from pymongo import MongoClient
        import csv

        # DB connectivity
        client = MongoClient('localhost', 27017)
        db = client["database name"]
        col = db["collection"]

        # Parse the csv into one dictionary keyed by the 'id' column
        def csv_to_dict():
            reader = csv.DictReader(open('File with path', 'r'))
            result = {}
            for row in reader:
                key = row.pop('id')
                result[key] = row
            return result

        # Insert the whole CSV as a single document
        x = col.insert_one(csv_to_dict())
        print(x.inserted_id)

        # To insert one document per row instead, use insert_many:
        from pymongo import MongoClient
        import csv

        client = MongoClient('localhost', 27017)
        db = client["data base name"]
        col = db["Collection Name"]
        with open('File with path', 'r') as read_obj:
            # DictReader yields one dict per csv row
            csv_reader = csv.DictReader(read_obj)
            mylist = list(csv_reader)
            x = col.insert_many(mylist)

            # print the _id values of the inserted documents
            print(x.inserted_ids)
        

[Discussion]:
