bug fixes, enhancements mongodb

pull/1/head
Steve Nyemba 2 years ago
parent b8fb538ec7
commit 1c254eb133

```diff
@@ -42,11 +42,13 @@ Once installed **data-transport** can be used as a library in code or a command
 ## Data Transport as a Library (in code)
 ---
-The data-transport can be used within code as a library
+The data-transport can be used within code as a library, and offers the following capabilities:
 * Read/Write against [mongodb](https://github.com/lnyemba/data-transport/wiki/mongodb)
 * Read/Write against traditional [RDBMS](https://github.com/lnyemba/data-transport/wiki/rdbms)
 * Read/Write against [bigquery](https://github.com/lnyemba/data-transport/wiki/bigquery)
 * ETL CLI/Code [ETL](https://github.com/lnyemba/data-transport/wiki/etl)
+* Support for pre/post conditions, i.e. it is possible to specify queries to run before or after a read or write
 The read/write functions make data-transport a great candidate for **data-science**, **data-engineering**, or all things pertaining to data. It enables operations across multiple data-stores (relational or not).
```
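All of these paths go through the same factory call. Below is a minimal, hypothetical sketch of a mongodb read: the `transport.factory.instance(provider=...)` form is taken from this README, but the keyword names `context`, `db` and `doc` are assumptions based on the wiki pages, not a verified signature.

```python
import transport

# hypothetical mongodb read; 'context', 'db' and 'doc' are assumed
# parameter names -- check the mongodb wiki page for the exact ones
reader = transport.factory.instance(provider='mongodb', context='read',
                                    host='localhost:27017', db='mydb', doc='users')
_data = reader.read()   # pull the collection's content
```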
```diff
@@ -60,7 +62,7 @@ It is possible to perform ETL within custom code as follows :
 import transport
 import time
-_info = [{source:{'provider':'sqlite','path':'/home/me/foo.csv','table':'me'},target:{provider:'bigquery',private_key='/home/me/key.json','table':'me','dataset':'mydataset'}}, ...]
+_info = [{source:{'provider':'sqlite','path':'/home/me/foo.csv','table':'me',"pipeline":{"pre":[],"post":[]}},target:{provider:'bigquery',private_key='/home/me/key.json','table':'me','dataset':'mydataset'}}, ...]
 procs = transport.factory.instance(provider='etl',info=_info)
 #
 #
```
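Spelled out, the shorthand `_info` entry above (the unquoted keys and the trailing `...` are README shorthand) would look like the sketch below. The new `"pipeline"` attribute is what this commit introduces; the assumption here is that statements under `"pre"` run before the read and those under `"post"` after the write.

```python
import transport

# one source -> target mapping, written as valid Python
_info = [{
    'source': {
        'provider': 'sqlite',
        'path': '/home/me/foo.csv',
        'table': 'me',
        # new in this commit: queries to run around the operation
        # (pre = before the read, post = after the write -- assumed semantics)
        'pipeline': {'pre': [], 'post': []}
    },
    'target': {
        'provider': 'bigquery',
        'private_key': '/home/me/key.json',
        'table': 'me',
        'dataset': 'mydataset'
    }
}]
procs = transport.factory.instance(provider='etl', info=_info)
```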

```diff
@@ -8,7 +8,7 @@ def read(fname):
     return open(os.path.join(os.path.dirname(__file__), fname)).read()
 args = {
     "name":"data-transport",
-    "version":"1.6.1",
+    "version":"1.6.2",
     "author":"The Phi Technology LLC","author_email":"info@the-phi.com",
     "license":"MIT",
     "packages":["transport"]}
```

```diff
@@ -192,6 +192,10 @@ class SQLReader(SQLRW,Reader) :
         _sql = _sql.replace(":fields",_fields)
         if 'limit' in _args :
             _sql = _sql + " LIMIT "+str(_args['limit'])
+        #
+        # @TODO:
+        # It is here that we should inspect to see if there are any pre/post conditions
+        #
         return self.apply(_sql)
     def close(self) :
         try:
```
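The `@TODO` marks where `SQLReader` would honor those conditions. One possible sketch, assuming the reader is handed a `pipeline` attribute shaped like the README's `{"pre":[...],"post":[...]}`; this is not the committed implementation, and `_build_select` is a hypothetical helper standing in for the query-building logic shown above.

```python
# hypothetical resolution of the @TODO above
def read(self, **_args):
    _sql = self._build_select(**_args)   # hypothetical helper: builds the SELECT shown above
    _pipeline = getattr(self, 'pipeline', None) or {}
    for _query in _pipeline.get('pre', []):
        self.apply(_query)               # pre-conditions run before the main query
    _result = self.apply(_sql)           # the read itself
    for _query in _pipeline.get('post', []):
        self.apply(_query)               # post-conditions run after
    return _result
```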
