【Posted】: 2021-03-09 20:21:08
【Question】:
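Hi all. I have the following code, which reads JSON files in S3 and prints some of their values. The code uses multithreading. My question is how to modify it to use asyncio instead.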
import json
import time
from concurrent.futures import ThreadPoolExecutor
from itertools import repeat

import boto3

# Assumed: the original post did not show how `s3` was defined.
s3 = boto3.resource('s3')


def get_keys_from_prefix(bucket, prefix):
    """
    Return a list of the .json keys under the given prefix in the S3 bucket.
    """
    keys_list = []
    paginator = s3.meta.client.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # 'Contents' is absent when a page has no objects.
        for content in page.get('Contents', []):
            key = content['Key']
            if key.endswith('.json'):
                keys_list.append(key)
    return keys_list


def read_json_file_from_s3(bucket, key):
    """
    Read a JSON file from S3 and print its key and location.
    """
    try:
        obj = boto3.client('s3').get_object(Bucket=bucket, Key=key)
        data = obj['Body'].read().decode('utf-8')
        json_content = json.loads(data)
        info = json_content['info']
        location = info.get("location")
        print(key)
        print(location)
    except Exception as exc:
        # Report the failure instead of silently swallowing it.
        print(f'Failed to read {key}: {exc}')


def multithreading():
    bucket = "bucket-name"
    prefix = "prefix"
    start = time.perf_counter()
    key_list = get_keys_from_prefix(bucket, prefix)
    # Leaving the with block waits for all submitted tasks to finish.
    with ThreadPoolExecutor() as executor:
        executor.map(read_json_file_from_s3, repeat(bucket), key_list)
    finish = time.perf_counter()
    print(f'Finished in {round(finish - start, 2)} second(s)')


multithreading()
【Comments】:
-
StackOverflow is not a code-writing service. Please show your own attempt at solving the problem, and we will help you with the issues you run into.
Tags: python-3.x multithreading asynchronous amazon-s3 python-asyncio
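
One possible direction for the asyncio question, as a minimal sketch under stated assumptions rather than a definitive answer. boto3 calls are blocking, so the sketch leaves them unchanged and hands each read to a worker thread with `asyncio.to_thread` (Python 3.9+), then collects the results with `asyncio.gather`. The names `fetch_bytes`, `read_location`, `read_all` and `max_concurrency` are hypothetical and do not appear in the post. `fetch_bytes` stands in for any blocking function that returns an object's raw bytes, and the demo uses an in-memory fake so that it runs without AWS access.

```python
import asyncio
import json
from typing import Callable, Iterable, Optional, Tuple


async def read_location(
    fetch_bytes: Callable[[str, str], bytes],
    bucket: str,
    key: str,
    semaphore: asyncio.Semaphore,
) -> Tuple[str, Optional[str]]:
    """Fetch one JSON object and return (key, location)."""
    async with semaphore:  # cap the number of reads in flight
        # fetch_bytes is blocking, so run it in a worker thread
        # instead of stalling the event loop.
        raw = await asyncio.to_thread(fetch_bytes, bucket, key)
    info = json.loads(raw.decode("utf-8"))["info"]
    return key, info.get("location")


async def read_all(
    fetch_bytes: Callable[[str, str], bytes],
    bucket: str,
    keys: Iterable[str],
    max_concurrency: int = 10,  # hypothetical default, tune as needed
):
    """Read every key concurrently; failures come back as exception objects."""
    semaphore = asyncio.Semaphore(max_concurrency)
    tasks = [read_location(fetch_bytes, bucket, k, semaphore) for k in keys]
    # return_exceptions=True returns failures as values
    # instead of raising the first one.
    return await asyncio.gather(*tasks, return_exceptions=True)


if __name__ == "__main__":
    # In-memory stand-in for S3, for illustration only.
    fake_store = {
        "a.json": b'{"info": {"location": "eu-west-1"}}',
        "b.json": b'{"info": {}}',
    }

    def fake_fetch(bucket: str, key: str) -> bytes:
        return fake_store[key]

    results = asyncio.run(read_all(fake_fetch, "bucket-name", list(fake_store)))
    for result in results:
        print(result)
```

With real data, `fetch_bytes` could wrap the post's own call, `boto3.client('s3').get_object(Bucket=bucket, Key=key)['Body'].read()`. Note that this design still relies on threads underneath, so it mainly changes how the work is orchestrated; a fully non-blocking version would need an async-capable S3 client in place of boto3.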