Writing to Google BigQuery
- Ensure you have a Google BigQuery service account key on disk
- Set the environment variable BQ_KEY to the location of the service key
- The dataset will be created automatically within the project associated with the service key
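Before running the cells, it can help to confirm that BQ_KEY points at a readable key file. The sketch below is a minimal check, assuming the key is the JSON file Google issues for service accounts (which carries a `project_id` field). The helper name `load_service_key` is hypothetical and is not part of the transport library.

```python
import json
import os


def load_service_key(path):
    """Parse the service key at `path`, raising if the file is missing."""
    # Hypothetical helper for illustration; not part of the transport library.
    if not path or not os.path.isfile(path):
        raise FileNotFoundError(f"service key not found: {path!r}")
    with open(path) as handle:
        return json.load(handle)


key_path = os.environ.get("BQ_KEY")
if key_path is None:
    print("BQ_KEY is not set")
else:
    # Assumption: the key is a Google service account JSON file with a project_id field.
    print("project:", load_service_key(key_path).get("project_id"))
```

Failing early here gives a clearer error than a KeyError from `os.environ['BQ_KEY']` or an authentication failure later on.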
The cell below creates a DataFrame and stores it in Google BigQuery
In [1]:
#
# Writing to Google Bigquery database
#
import transport
from transport import providers
import pandas as pd
import os
PRIVATE_KEY = os.environ['BQ_KEY'] #-- location of the service key
DATASET = 'demo'
_data = pd.DataFrame({'name': ['James Bond', 'Steve Rogers', 'Steve Nyemba'], 'age': [55, 150, 44]})
bqw = transport.factory.instance(
    provider=providers.BIGQUERY, dataset=DATASET, table='friends',
    context='write', private_key=PRIVATE_KEY)
bqw.write(_data, if_exists='replace')  #-- replace the table if it exists; default is append
print(['data transport version ', transport.__version__])
Reading from Google BigQuery
The cell below reads the data written by the cell above, then computes the row count and average age inside Google BigQuery with a simple query.
- Basic read of the designated table (friends) created above
- Execute an aggregate SQL query against the table
NOTE
transport.factory.instance and transport.instance are the same and can be used interchangeably. The longer form signals to maintainers that a factory design pattern is in use.
In [2]:
import transport
from transport import providers
import os
PRIVATE_KEY = os.environ['BQ_KEY']  #-- location of the service key
bqr = transport.instance(provider=providers.BIGQUERY, dataset='demo', table='friends', private_key=PRIVATE_KEY)
_df = bqr.read()  #-- basic read of the whole table
_query = 'SELECT COUNT(*) _counts, AVG(age) from demo.friends'
_sdf = bqr.read(sql=_query)  #-- aggregate computed by BigQuery
print(_df)
print('--------- STATISTICS ------------')
print(_sdf)
The cell below shows the content of an auth_file. If the dataset/table in question is not to be shared, you can keep the associated parameters in an auth_file instead.
NOTE:
The auth_file is expected to be JSON formatted.
In [3]:
{
"dataset":"demo","table":"friends"
}
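The auth_file shown above can be produced with the standard json module. The sketch below writes it to a temporary location and reads it back to confirm it is valid JSON. The file name `bq-auth.json` is hypothetical, and the commented-out call assumes that transport.instance accepts an `auth_file` keyword, which this notebook does not demonstrate.

```python
import json
import os
import tempfile

# Parameters that would otherwise be passed in code.
auth = {"dataset": "demo", "table": "friends"}

# Hypothetical location; keep the real file outside any shared repository.
auth_path = os.path.join(tempfile.gettempdir(), "bq-auth.json")
with open(auth_path, "w") as handle:
    json.dump(auth, handle, indent=2)

# Read the file back to confirm it parses as JSON.
with open(auth_path) as handle:
    loaded = json.load(handle)
print(loaded)

# Assumption: the reader accepts an auth_file keyword (not shown in this notebook).
# reader = transport.instance(provider=providers.BIGQUERY, auth_file=auth_path,
#                             private_key=os.environ['BQ_KEY'])
```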