Writing to AWS S3
We have set up our demo environment with the label aws, which references our S3 access_key, secret_key, and file (called friends.csv). In the cell below we write the data to our AWS S3 bucket named com.phi.demo.
#
# Writing to an AWS S3 bucket
#
import transport
from transport import providers
import pandas as pd
_data = pd.DataFrame({"name":['James Bond','Steve Rogers','Steve Nyemba'],'age':[55,150,44]})
mgw = transport.get.writer(label='aws')
mgw.write(_data)
print (transport.__version__)
Reading from AWS S3
The cell below reads the data written by the cell above and computes the average age. The code performs two steps:
- Basic read of the designated file friends.csv
- Compute average age using standard pandas functions
NOTE
By design, read objects are separated from write objects in order to avoid accidental writes to the data store. Read objects are created with transport.get.reader, whereas write objects are created with transport.get.writer.
import transport
from transport import providers
import pandas as pd
# Debug helper carried over from the notebook; it is not called below.
def cast(stream):
    print(stream)
    return pd.DataFrame(str(stream))
mgr = transport.get.reader(label='aws')
_df = mgr.read()
print (_df)
print ('--------- STATISTICS ------------')
print (_df.age.mean())
An auth-file is a file that contains the connection parameters used to access the database. For code in shared environments, we recommend
- Storing the auth-file on disk
- Setting the location of the file in an environment variable
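A minimal sketch of this recommendation is shown below. It makes several assumptions that do not come from the library: the environment variable name DATA_TRANSPORT_AUTH_FILE is hypothetical, the helper load_auth_file is hypothetical, and the auth-file is assumed to be JSON. Check the generated template for the actual layout.

```python
import json
import os

# Hypothetical variable name; the library does not prescribe this one.
AUTH_ENV_VAR = "DATA_TRANSPORT_AUTH_FILE"


def load_auth_file(env_var=AUTH_ENV_VAR):
    """Return the parsed auth-file whose path is stored in env_var.

    Assumes the auth-file is JSON; raises if the variable is unset.
    """
    path = os.environ.get(env_var)
    if not path:
        raise RuntimeError(f"environment variable {env_var} is not set")
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Keeping only the path in the environment means the credentials themselves never appear in the notebook or in shell history.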
To generate a template of the auth-file, open the file generator wizard at https://healthcareio.the-phi.com/data-transport