Python: Read CSV From S3 Bucket With `import csv`
I am currently trying to read a .csv file directly from an AWS S3 bucket. However, I always receive a `FileNotFoundError`. Oddly, the error message itself shows the content of the .csv file.
Traceback (most recent call last):
  File "<console>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: b',event_id,ds,yhat,yhat_lower,yhat_upper\n0,277,2019-09-04 07:14:08.051643,0.3054256311115928,0.29750667741533227,0.31441960581142636\n'
Here is my code:
BUCKET_NAME = 'fbprophet'
FORECAST_DATA_OBJECT = 'forecast.csv'
s3 = boto3.client(
    's3',
    aws_access_key_id=settings.ML_AWS_ACCESS_KEY_ID,
    aws_secret_access_key=settings.ML_AWS_SECRET_ACCESS_KEY,
)
obj = s3.get_object(Bucket=BUCKET_NAME, Key=FORECAST_DATA_OBJECT)
data = obj['Body'].read()
with open(data, newline='') as csvfile:
    spamreader = csv.reader(io.BytesIO(csvfile), delimiter=' ', quotechar='|')
    for row in spamreader:
        print(', '.join(row))
And here is some content of my .csv file. Ideally, I could access each row as a dictionary, e.g. with `row['event_id']`. To access yhat, I could then just write `row['event_id']['yhat']`. But currently, that's not how it works at all.
    event_id  ds                   yhat          yhat_lower    yhat_upper
0   277       2019-09-04 7:14:08   0.3054256311  0.2975066774  0.3144196058
0   178       2019-09-28           0.3454256311  0.2275066774  0.3944196058
Answer
Just get rid of `with open(data, newline='') as csvfile:` because `open` expects the name of a file on your local filesystem. You should pass `data` to `io.BytesIO` directly.
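The failure can be reproduced without S3. The following is a minimal sketch, assuming a short hard-coded bytes value (hypothetical, standing in for the result of `obj['Body'].read()`). It shows that `open` treats the bytes as a path, which is why the traceback echoes the file content, while `io.BytesIO` exposes the same bytes as a file-like object.

```python
import io

# Hypothetical stand-in for the bytes returned by obj['Body'].read()
data = b",event_id,ds,yhat\n0,277,2019-09-04 07:14:08.051643,0.3054\n"

try:
    # The bytes are interpreted as a file *path*, not as file content
    open(data)
    error = None
except OSError as exc:  # FileNotFoundError is a subclass of OSError
    error = exc

print(type(error).__name__)

# An in-memory buffer wraps the same bytes in a file-like interface
buffer = io.BytesIO(data)
first_line = buffer.readline()
print(first_line)
```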
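# Corrected version of the code from the question
import csv
import io

import boto3

BUCKET_NAME = 'fbprophet'
FORECAST_DATA_OBJECT = 'forecast.csv'

# `settings` is the project settings object used in the question
s3 = boto3.client(
    's3',
    aws_access_key_id=settings.ML_AWS_ACCESS_KEY_ID,
    aws_secret_access_key=settings.ML_AWS_SECRET_ACCESS_KEY,
)
obj = s3.get_object(Bucket=BUCKET_NAME, Key=FORECAST_DATA_OBJECT)
# The body is bytes; csv.reader needs text, so decode it first
data = obj['Body'].read().decode('utf-8')
# The traceback shows comma-separated data, so the default delimiter is used
spamreader = csv.reader(io.StringIO(data))
for row in spamreader:
    print(', '.join(row))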
Edit: Apparently `csv.reader` expects strings, not bytes, so you need to decode the response and wrap `data` in `io.StringIO` instead.
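For the dictionary-style access the question asks about, `csv.DictReader` is one option. The following is a minimal sketch rather than a definitive implementation: the hard-coded `data` value is an assumption that mirrors the bytes shown in the traceback and stands in for `obj['Body'].read()`, and `by_event` is a hypothetical helper name.

```python
import csv
import io

# Hypothetical stand-in for obj['Body'].read(); content mirrors the traceback
data = (
    b",event_id,ds,yhat,yhat_lower,yhat_upper\n"
    b"0,277,2019-09-04 07:14:08.051643,0.3054256311115928,"
    b"0.29750667741533227,0.31441960581142636\n"
)

# csv works on text, so decode the bytes and wrap them in a text buffer
text = data.decode("utf-8")
reader = csv.DictReader(io.StringIO(text))  # the header row supplies the keys

rows = list(reader)
for row in rows:
    print(row["event_id"], row["yhat"])

# Optional: index the rows by event_id for lookups like by_event["277"]["yhat"]
by_event = {row["event_id"]: row for row in rows}
print(by_event["277"]["yhat"])
```

All values come back as strings, so a numeric column needs an explicit conversion such as `float(row["yhat"])`. The unnamed first column of the file ends up under the empty-string key `""`.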
source: stackoverflow.com