\ No newline at end of file
diff --git a/www/html/_notes/documentation.html b/www/html/_notes/documentation.html
new file mode 100644
index 0000000..ca30a3f
--- /dev/null
+++ b/www/html/_notes/documentation.html
@@ -0,0 +1,14 @@
+
+
+
Documentation
+
+
The documentation is available within the codebase in a folder called notebooks. You will learn to
+
+ Read/write to supported databases
+ Develop plugins for pre/post processing
+ Use the command line interface (CLI)
+
+ Feel free to reach out with questions or feature requests:
+ steve.l.nyemba@vumc.org or steve@the-phi.com
+
+
\ No newline at end of file
diff --git a/www/html/_notes/features.html b/www/html/_notes/features.html
new file mode 100644
index 0000000..09bedfe
--- /dev/null
+++ b/www/html/_notes/features.html
@@ -0,0 +1,47 @@
+
+
+
Features
+
+
+ Built with pandas & designed for & by AI/ML engineers
+ Support for plugins used in pre/post processing
+ Read from one source & write to many (different databases)
+ Includes lightweight ETL command line interface (CLI)
+
+
+
+
Supported Technologies
+
+
\ No newline at end of file
diff --git a/www/html/_notes/install.html b/www/html/_notes/install.html
new file mode 100644
index 0000000..3b68f48
--- /dev/null
+++ b/www/html/_notes/install.html
@@ -0,0 +1,39 @@
+
+
+
Installation
+
+ The assumption here is that you are using a virtual environment (virtualenv) along with pip.
+ There are various components that can be installed with data-transport. By default SQL support is always available
+
+ by default sql libraries handle Sqlite3+, Microsoft SQLServer, MySQL, Mariadb, PostgreSQL, duckdb
+ nosql : installs libraries to handle mongodb, cloudant & couchdb
+ cloud : libraries to handle bigquery, databricks, S3, nextcloud
+ warehouse : installs libraries to handle apache iceberg, spark, apache drill
+ other : installs libraries to handle http/https, RabbitMQ, Files
+
+
+ You can specify any combination of nosql, cloud, warehouse, other or all; SQL support is provided by default.
+
+
+ A command line interface (CLI) is installed and can be tested by running the following to see what
+
+
+ $ transport --help
+
+
+
+
+
+
+
+
+
+
+
\ No newline at end of file
diff --git a/www/html/_notes/intro.html b/www/html/_notes/intro.html
new file mode 100644
index 0000000..462af66
--- /dev/null
+++ b/www/html/_notes/intro.html
@@ -0,0 +1,53 @@
+
+
+
+
+
Quick Start
+ By default basic installation has
+
+
Add a label entry to registry :
+
• CLI, learn how
+
• Graphical User Interface, learn how
+
Import in python code
+
+
+
+
+
+
diff --git a/www/html/_notes/plugins-intro.html b/www/html/_notes/plugins-intro.html
new file mode 100644
index 0000000..90ff0ae
--- /dev/null
+++ b/www/html/_notes/plugins-intro.html
@@ -0,0 +1 @@
+ These are basic python functions with a single argument (data:pd.DataFrame). The functions can be used as a pipeline to be called in the context of pre/post processing.
\ No newline at end of file
diff --git a/www/html/_notes/product.html b/www/html/_notes/product.html
new file mode 100644
index 0000000..8b13789
--- /dev/null
+++ b/www/html/_notes/product.html
@@ -0,0 +1 @@
+
diff --git a/www/html/_notes/registry.html b/www/html/_notes/registry.html
new file mode 100644
index 0000000..16b7cce
--- /dev/null
+++ b/www/html/_notes/registry.html
@@ -0,0 +1,76 @@
+
+
+
What is the registry
+
data-transport uses a registry to store database authentication information, which is referenced by a human-readable label.
+
+
The code uses the labels instead of usernames, passwords ...
+
This makes it possible to share notebooks (Jupyter, Zeppelin, ...) without disclosing sensitive information
+
+
+
Initialize Registry
+
+ The registry can be initialized through the command line (CLI) or in code.
+ A registry file will be created in your home folder at $HOME/.data-transport/transport-registry.json
+
+
+
+
+ Command line (CLI)
+
+
+
+ # creates folders and files needed
+ $ transport registry reset < your email >
+
+
+
+ Create a file http-auth.json and it should contain the following
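The content of http-auth.json is not shown at this step. A minimal sketch, assuming the file holds the same two attributes (provider, url) as the in-code example on this page, could create it like this:

```python
import json

# Assumption: the auth-file carries the same attributes as the in-code example
_auth = {
    "provider": "http",
    "url": "https://raw.githubusercontent.com/codeforamerica/ohana-api/master/data/sample-csv/addresses.csv",
}

# Write the auth-file to the current folder so it can be added to the registry
with open("http-auth.json", "w") as f:
    json.dump(_auth, f, indent=2)
```

Other providers would need their own attributes; the wizard is the place to look those up.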
+
+
+
+ Add the created file http-auth.json to the registry under the label address-db
+
+ $ transport registry --add address-db http-auth.json
+
+
+
+
+
+
+
+ In code
+
+
+import transport
+import io
+import json
+#
+# transport.registry.exists()
+_email = 'steve@the-phi.com'
+transport.registry.init(_email)
+
+#
+# Adding the entry to the registry now that it is initialized
+_authStr = {"provider":"http",
+"url":"https://raw.githubusercontent.com/codeforamerica/ohana-api/master/data/sample-csv/addresses.csv"
+}
+file = io.StringIO(json.dumps(_authStr))
+transport.registry.set('address-db',file)
+
+
+
+
+
+
+
+
\ No newline at end of file
diff --git a/www/html/_notes/run-etl.html b/www/html/_notes/run-etl.html
new file mode 100644
index 0000000..2de09c3
--- /dev/null
+++ b/www/html/_notes/run-etl.html
@@ -0,0 +1,23 @@
+
+
+
Run ETL (transport)
+
+ For a simple ETL from the command line (CLI), you can generate a sample ETL configuration:
+
+ $ transport generate ./my-etl-config.json
+
+
+ The configuration file specifies the source and targets for the data.
+ The source is https and the targets are a CSV file and an SQLite 3+ database.
+
+
+ Perform the ETL Job given the configuration specifications
+
+ $ transport apply ./my-etl-config.json
+
+
+
+
+
+
+
\ No newline at end of file
diff --git a/www/html/_notes/source-code.html b/www/html/_notes/source-code.html
new file mode 100644
index 0000000..ea95299
--- /dev/null
+++ b/www/html/_notes/source-code.html
@@ -0,0 +1,90 @@
+
+
+
+
+
+
+
Collaborative development
+
+ 0. In this scenario we assume the registry has been initialized and that an entry has been added (CLI).
+
+ # transport registry add --help
+ $ transport registry add address-db http-auth.json
+
+
+
+
+The python code would look like the following :
+
+import transport
+
+#
+# We are assuming here that the label address-db is an entry in the registry
+
+dbreader = transport.get.reader(label='address-db') # No database credentials
+_df = dbreader.read(sql="SELECT * FROM books where postal_code like '946%' ")
+print (_df.head())
+
+
+
+
+1. Alternatively it is possible to directly use the authentication file dubbed "auth-file".
+
+
+ import transport
+
+ #
+ # We are assuming here that the auth-file exists at the path given below
+
+ dbreader = transport.get.reader(auth_file='/home/me/http-auth.json') # No database credentials
+ _df = dbreader.read(sql="SELECT * FROM books where postal_code like '946%' ")
+ print (_df.head())
+
+
+
+
+
+
+
+
Non-collaborative development
+
+ In this scenario, we are using connectivity parameters in the code. We do NOT recommend this if the code will be used/shared.
+
+
+
+ import transport
+
+ #
+ # In this scenario we are reading a CSV file over http
+ url= "https://raw.githubusercontent.com/codeforamerica/ohana-api/master/data/sample-csv/addresses.csv"
+ _args = {"provider":"http","url":url}
+ dbreader = transport.get.reader(**_args) # connectivity parameters are in the code
+ _df = dbreader.read(sql="SELECT * FROM books where postal_code like '946%' ")
+ print (_df.head())
+
+
+
+
+
+
Learn more
+ It is possible to initialize the registry and run ETL from your code as well as from the command line (CLI). We compiled examples in notebooks available in our code repository.
+
+
+
+
\ No newline at end of file
diff --git a/www/html/_notes/supported.html b/www/html/_notes/supported.html
new file mode 100644
index 0000000..450ad57
--- /dev/null
+++ b/www/html/_notes/supported.html
@@ -0,0 +1,49 @@
+
+
+
+
+
+
Supported Databases
+
+
+
+
+
+
+
+
+
+
+
diff --git a/www/html/_plugins/info.py b/www/html/_plugins/info.py
new file mode 100644
index 0000000..dfa52e7
--- /dev/null
+++ b/www/html/_plugins/info.py
@@ -0,0 +1,49 @@
+import transport
+import info
+from multiprocessing import Process
+import pandas as pd
+from io import StringIO
+import requests
+import cms
+import os
+
+@cms.Plugin(mimetype='application/json')
+def about (**_args):
+ return {'license':info.__license__,'author':transport.__app_name__, 'version':transport.__version__, 'supported':transport.supported().to_html(index=False,col_space=0,justify='left').replace("\n","").replace('border="1"','')}
+#
+# loading notebooks from github
+@cms.Plugin(mimetype='application/json')
+def _data (**_args) :
+ _csv = f"""provider,label,doc,url,notebook
+etl,ETL,"Built-in ETL CLI program",https://healthcareio.the-phi.com/data-transport,etl.ipynb
+mongodb,MongoDB,"mongodb is a NoSQL database developed by mongodb",https://mongodb.com,mongodb.ipynb
+mysql,MySQL,"mysql is a relational database developed and maintained by Oracle",https://www.mysql.com,mysql.ipynb
+postgresql,PostgreSQL,"postgresql is an object-relational database developed and maintained by PostgreSQL Global Development Group",https://www.postgresql.org,postgresql.ipynb
+mssqlserver,MS SQL Server,"SQL Server is a relational database server developed by Microsoft",https://www.microsoft.com,mssqlserver.ipynb
+sqlite,sqlite,"sqlite is a portable relational database",https://www.sqlite.org,sqlite.ipynb
+s3,AWS S3,"AWS Simple Storage Service",https://aws.amazon.com/s3,s3.ipynb
+"""
+ _df = pd.read_csv(StringIO(_csv))
+ return _df.drop_duplicates().to_dict(orient='records')
+# MEM_PRODUCT_FILE = StringIO(f"""_,community,enterprise
+# NoSQL,1,1
+# RDBMS/SQL,1,1
+# Cloud databases,1,1
+# Other (MQTT; Files; https),1,1
+# Warehouse,1,1
+# """)
+@cms.Plugin(mimetype="text/html")
+def product(**_args):
+ _config = _args['config']
+ path = os.sep.join([_config['layout']['location'],_config['layout']['root'],'_assets','data','products.csv'])
+ _df = pd.read_csv(path)
+
+ return _df.to_html(index=0).replace("&gt;",">").replace("&lt;","<").replace("no",'').replace("yes",'')
+
+@cms.Plugin(mimetype="application/json")
+def registry (**_args):
+ if not transport.registry.isloaded() :
+ transport.registry.load()
+ _repo = transport.registry.DATA
+ _data = [{'label':key,'provider':_repo[key]['provider'],'plugins':([] if 'plugins' not in _repo[key] else _repo[key]['plugins'])} for key in _repo if 'provider' in _repo[key]]
+ return _data
\ No newline at end of file
diff --git a/www/html/about/design-and-credit.md b/www/html/about/design-and-credit.md
new file mode 100644
index 0000000..ee58d49
--- /dev/null
+++ b/www/html/about/design-and-credit.md
@@ -0,0 +1,13 @@
+### What is data-transport
+
+**data-transport** is a framework intended to read from and write to any supported database transparently. The framework lowers the barrier to adoption by using **Python Pandas** and **SQLAlchemy** as foundational components of its architecture. The architecture
+
+1. allows seamless reads and writes regardless of the database vendor (SQL, NoSQL, Cloud ...)
+2. separates reads from writes to avoid accidental tampering with data
+3. works with pandas for reads and writes
+4. supports plugins (functions) to apply against the data as pre/post processing
+5. supports a lightweight Extract Transform Load (ETL) tool as a command line interface (CLI)
+
+### Credits
+
+**Data-transport** was designed and developed at the [Health Information Privacy Laboratory](https://hiplab.mc.vanderbilt.edu) at Vanderbilt University Medical Center with the input of all co-workers, interns and post-doctoral fellows.
diff --git a/www/html/about/feedback.html b/www/html/about/feedback.html
new file mode 100644
index 0000000..e3c2d28
--- /dev/null
+++ b/www/html/about/feedback.html
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/www/html/about/license.html b/www/html/about/license.html
new file mode 100644
index 0000000..5bf2030
--- /dev/null
+++ b/www/html/about/license.html
@@ -0,0 +1,37 @@
+
+
+
+
+
+ , version
+
+
+
+
+
+
MIT License
+
+
+
+
Supported Databases
+
+
+
+
+
+
\ No newline at end of file
diff --git a/www/html/contact.html b/www/html/contact.html
new file mode 100644
index 0000000..bd0fb2e
--- /dev/null
+++ b/www/html/contact.html
@@ -0,0 +1,8 @@
+
Thank you,
+
For considering {{layout.header.title}},
+Please feel free to direct bugs/inquiries to :
+
+
steve.l.nyemba@vumc.org
+
or info@the-phi.com
+
clone the source code here
+
\ No newline at end of file
diff --git a/www/html/docs/html/bigquery.html b/www/html/docs/html/bigquery.html
new file mode 100644
index 0000000..96eaeb0
--- /dev/null
+++ b/www/html/docs/html/bigquery.html
@@ -0,0 +1,7672 @@
+
+
+
+
+
+bigquery
+
+
+
+
+
+
+
+
+
+
+
+
+
+
Ensure you have a Google Bigquery service account key on disk
+
The service key location is set as an environment variable BQ_KEY
+
The dataset will be automatically created within the project associated with the service key
+
+
The cell below creates a dataframe that will be stored within Google Bigquery
+
+
+
+
+
+
+
+
+
In [3]:
+
+
+
#
+# Writing to Google Bigquery database
+#
+import transport
+from transport import providers
+import pandas as pd
+import os
+
+PRIVATE_KEY=os.environ['BQ_KEY']#-- location of the service key
+DATASET='demo'
+_data=pd.DataFrame({"name":['James Bond','Steve Rogers','Steve Nyemba'],'age':[55,150,44]})
+bqw=transport.get.writer(provider=providers.BIGQUERY,dataset=DATASET,table='friends',private_key=PRIVATE_KEY)
+bqw.write(_data,if_exists='replace')#-- default is append
+print(['data transport version ',transport.__version__])
+
The cell below reads the data that has been written by the cell above and computes the average age within a Google Bigquery (simple query).
+
+
Basic read of the designated table (friends) created above
+
Execute an aggregate SQL against the table
+
+
NOTE
+
By design read objects are separated from write objects in order to avoid accidental writes to the database.
+Read objects are created with transport.get.reader whereas write objects are created with transport.get.writer
+ name age
+0 James Bond 55
+1 Steve Rogers 150
+2 Steve Nyemba 44
+--------- STATISTICS ------------
+ _counts f0_
+0 3 83.0
+
+
+
+
+
+
+
+
+
+
+
+
+
An auth-file is a file that contains database parameters used to access the database.
+For code in shared environments, we recommend
+
+
Having the auth-file stored on disk
+
and the location of the file is set to an environment variable.
The example below reads data from an http source (github) and will copy the data to a csv file and to a database. This example illustrates the one-to-many ETL features.
+
+
+
+
+
+
+
+
+
In [2]:
+
+
+
#
+# Copying data from an http source to a CSV file and an SQLite database
+#
+import transport
+from transport import providers
+import pandas as pd
+import os
+
+#
+#
+source={"provider":"http","url":"https://raw.githubusercontent.com/codeforamerica/ohana-api/master/data/sample-csv/addresses.csv"}
+target=[{"provider":"files","path":"addresses.csv","delimiter":","},{"provider":"sqlite","database":"sample.db3","table":"addresses"}]
+
+_handler=transport.get.etl(source=source,target=target)
+_data=_handler.read()#-- all etl begins with data being read
+_data.head()
+
The cell below reads the data that has been written by the cell above and computes the average age within a mongodb pipeline. The code in the background executes an aggregation using db.runCommand
+
+
Basic read of the designated collection find=<collection>
+
Executing an aggregate pipeline against a collection aggregate=<collection>
+
+
NOTE
+
By design read objects are separated from write objects in order to avoid accidental writes to the database.
+Read objects are created with transport.get.reader whereas write objects are created with transport.get.writer
Ensure Microsoft SQL Server is installed and you have access, i.e. account information
+
The target database must be created beforehand.
+
We created an authentication file that will contain user account and location of the database
+
+
The cell below creates a dataframe that will be stored in a Microsoft SQL Server database.
+
NOTE This was not tested with a cloud instance
+
+
+
+
+
+
+
+
+
In [ ]:
+
+
+
#
+# Writing to Microsoft SQL Server database
+#
+import transport
+from transport import providers
+import pandas as pd
+import os
+
+AUTH_FOLDER=os.environ['DT_AUTH_FOLDER']#-- location of the folder holding the auth-files
+MSSQL_AUTH_FILE=os.sep.join([AUTH_FOLDER,'mssql.json'])
+
+_data=pd.DataFrame({"name":['James Bond','Steve Rogers','Steve Nyemba'],'age':[55,150,44]})
+msw=transport.get.writer(provider=providers.MSSQL,table='friends',auth_file=MSSQL_AUTH_FILE)
+msw.write(_data,if_exists='replace')#-- default is append
+print(['data transport version ',transport.__version__])
+
The cell below reads the data that has been written by the cell above and computes the average age within an MS SQL Server (simple query).
+
+
Basic read of the designated table (friends) created above
+
Execute an aggregate SQL against the table
+
+
NOTE
+
By design read objects are separated from write objects in order to avoid accidental writes to the database.
+Read objects are created with transport.get.reader whereas write objects are created with transport.get.writer
+
+
+
+
+
+
+
+
+
In [ ]:
+
+
+
import transport
+from transport import providers
+import os
+AUTH_FOLDER=os.environ['DT_AUTH_FOLDER']#-- location of the folder holding the auth-files
+MSSQL_AUTH_FILE=os.sep.join([AUTH_FOLDER,'mssql.json'])
+
+msr=transport.get.reader(provider=providers.MSSQL,table='friends',auth_file=MSSQL_AUTH_FILE)
+_df=msr.read()
+_query='SELECT COUNT(*) _counts, AVG(age) from friends'
+_sdf=msr.read(sql=_query)
+print(_df)
+print('\n--------- STATISTICS ------------\n')
+print(_sdf)
+
+
+
+
+
+
+
+
+
+
+
+
+
An auth-file is a file that contains database parameters used to access the database.
+For code in shared environments, we recommend
+
+
Having the auth-file stored on disk
+
and the location of the file is set to an environment variable.
The cell below reads the data that has been written by the cell above and computes the average age within a MySQL (simple query).
+
+
Basic read of the designated table (friends) created above
+
Execute an aggregate SQL against the table
+
+
NOTE
+
By design read objects are separated from write objects in order to avoid accidental writes to the database.
+Read objects are created with transport.get.reader whereas write objects are created with transport.get.writer
The cell below reads the data that has been written by the cell above and computes the average age within PostgreSQL (simple query).
+
+
Basic read of the designated table (friends) created above
+
Execute an aggregate SQL against the table
+
+
NOTE
+
By design read objects are separated from write objects in order to avoid accidental writes to the database.
+Read objects are created with transport.get.reader whereas write objects are created with transport.get.writer
We have set up our demo environment with the label aws passed to reference our s3 access_key and secret_key and file (called friends.csv). In the cell below we will write the data to our aws s3 bucket named com.phi.demo
The cell below reads the data that has been written by the cell above and computes the average age using standard pandas functions.
+
+
Basic read of the designated file friends.csv
+
Compute average age using standard pandas functions
+
+
NOTE
+
By design read objects are separated from write objects in order to avoid accidental writes to the database.
+Read objects are created with transport.get.reader whereas write objects are created with transport.get.writer
The cell below reads the data that has been written by the cell above and computes the average age within PostgreSQL (simple query).
+
+
Basic read of the designated table (friends) created above
+
Execute an aggregate SQL against the table
+
+
NOTE
+
By design read objects are separated from write objects in order to avoid accidental writes to the database.
+Read objects are created with transport.get.reader whereas write objects are created with transport.get.writer
+ Preview notebooks with database providers/vendors
+ The notebooks show how to use data-transport as a library
+
+
+
+
+
+
+
Generated: auth-file
+
+ An auth-file is a file used to store database parameters.
+ Copy the code above to the auth-file and fill with appropriate values
+ Attributes with a value of zero (i.e. 0) are optional
+
+
+
+
+ Note:
+ The list of database providers/vendors above is exhaustive
+ To generate files for database providers/vendors, click here
+
+
+
+
+
+
+
+
+
Jupyter Notebook preview
+
+
+
+
+
+
+
+
\ No newline at end of file
diff --git a/www/html/docs/plugins.html b/www/html/docs/plugins.html
new file mode 100644
index 0000000..f239221
--- /dev/null
+++ b/www/html/docs/plugins.html
@@ -0,0 +1,166 @@
+
+
+
+
+
+
Plugins: Usage & Development
+
+
+
+
+
+
Plugins: Registry
+
+ The plugins registry is a registry of plugins intended to be used in pre/post processing. This feature comes in handy :
+
In a collaborative environment (Jupyter-x; Zeppelin; AWS Service Workbench)
+
+
+
+
+
+
Plugins: Architecture & Design
+ Plugins follow a plugin architecture built on the Iterator design pattern. They function as a pipeline, i.e. they are executed sequentially in the order in which they are expressed in the parameter. Effectively, the output of one function is the input to the next.
+
+
+
+
Data Transport UML Plugin Component View
+
+
+
+
+
+
+
Quick Start
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
The code here shows a function that will be registered as "autoincrement".
+
The data will always be a pandas.DataFrame
+
For the sake of this example the file will be my-plugin.py
+
+
+
+
+import transport
+import numpy as np
+
+_index = 0
+
+@transport.Plugin(name='autoincrement')
+def _incr (_data):
+    global _index
+    # offset the new ids by the number of rows already processed
+    _data['_id'] = _index + np.arange(_data.shape[0])
+    _index += _data.shape[0]
+    return _data
+
+
+
+
+
+
+
+
+
+
+
+ data-transport comes with a built-in command line interface (CLI). It allows plugins to be registered and reused.
+
+
Registered functions are stored in $HOME/.data-transport/plugins/code
+
Any updates to my-plugin.py will require re-registering the file
+
Additional plugin registry functions (list, test) are available
+
+
+
+
+
+ $ transport plugin-add demo ./my-plugin.py
+
+
+ The following command lets data-transport report what it knows about the function, i.e. its real name and the name to be used in code.
+
+ $ transport plugin-test demo.autoincrement
+
+
+
+
+
+
+
+
+
+ Once registered, the plugins are ready for use within code or configuration file (auth-file).
+
\ No newline at end of file
diff --git a/www/html/docs/quick_start.html b/www/html/docs/quick_start.html
new file mode 100644
index 0000000..75e29f7
--- /dev/null
+++ b/www/html/docs/quick_start.html
@@ -0,0 +1,114 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
0. Installation
+
+ Install using pip or from code and test installation from https://dev.the-phi.com/git/qcms/cms
+
+
+
+
1. Initialize registry
+
+ Keep database authentication parameters out of sight, in a collaborative environment
+
+
+
+
+
2. Import into Code
+
+ How to use {{layout.header.title}} in python source code (notebooks, or files)
+
+
+
+
+
3. Source code
+
+ {{layout.header.title}} is available under the MIT license on github. Contribute, share at will ;-)
+
+
+
+
+
+
+
+
+
\ No newline at end of file
diff --git a/www/html/docs/source-code.html b/www/html/docs/source-code.html
new file mode 100644
index 0000000..e69de29
diff --git a/www/html/docs/transport.html b/www/html/docs/transport.html
new file mode 100644
index 0000000..a205659
--- /dev/null
+++ b/www/html/docs/transport.html
@@ -0,0 +1,186 @@
+
+
+
+
+
+
ETL: Introduction
+ Extract, Transform & Load (ETL) consists of copying data from one database to one or many others. This can be done in two different ways:
+
+
Command Line Interface (CLI), driven by JSON configuration
+
Or within custom python code
+
+ The ETL process will take advantage of registries for plugins and labeled database connectivity to perform pre/post processing tasks.
+
+
+
+
+
ETL: Command Line Interface
+
+ The configuration file needed to run the ETL is a JSON formatted file where each entry contains:
+
+
source with the content of an auth-file
+
target with list of elements of an auth-file
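A configuration of this shape can be sketched from the source and target used in the ETL notebook. The top-level layout (a JSON list of entries) is an assumption drawn from the description above; the output of transport generate remains the authoritative template.

```python
import json

# Assumption: the file is a list of entries, each with one source and a list of targets
_config = [
    {
        "source": {
            "provider": "http",
            "url": "https://raw.githubusercontent.com/codeforamerica/ohana-api/master/data/sample-csv/addresses.csv",
        },
        "target": [
            {"provider": "files", "path": "addresses.csv", "delimiter": ","},
            {"provider": "sqlite", "database": "sample.db3", "table": "addresses"},
        ],
    }
]

with open("demo-etl.json", "w") as f:
    json.dump(_config, f, indent=2)
```

One source with two targets is the one-to-many case: the same data is written to a CSV file and to an SQLite database.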
+
+
+
+ The CLI (transport), is capable of generating a demo ETL :
+
+
with source: reads CSV data from github
+
and target: writes the data to CSV & SQLite3 database
+ The command-line interface should be instructed to run the ETL by calling the apply function.
+
+
+
+ $ transport apply ./demo-etl.json
+
+
+
+ Additional parameters can be invoked by providing the --help switch
+
+
+
+
+ $ transport apply --help
+
+
+
+
+
+
+ The following examples show simple configuration files that do NOT require any database to be installed. Feel free to change and edit them at your own discretion.
+
+
+
Example # 1: Basic ETL
+
+
+
+
+
+
+
+
+
+
+
+ data-transport comes with a CLI integrated that will
+
+
generate an ETL configuration file
+
+ $ transport generate ./demo-etl.json
+
+
+
+
NOTE: The configuration file supports labels and/or plugins; these have to be added manually
+
+
+
+
Copy the content and save it to a file "demo-etl.json"
\ No newline at end of file
diff --git a/www/html/index.html b/www/html/index.html
new file mode 100644
index 0000000..d2acc3e
--- /dev/null
+++ b/www/html/index.html
@@ -0,0 +1,194 @@
+
+
+
+
+
+
+
+
+
+
Read, Write & Stream Data
+
Anywhere
+
+ version
+
+
+
+
+
+
+
+
+
Open-Source
+
+
Built with python, easy install and available under MIT license on github
+ Uses a registry to keep sensitive connectivity information out of notebooks (Zeppelin, JupyterLab ...)
+
+
+
+
+
Pipelines
+
+
+ Apply user-defined functions across database technologies for consistent results
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ info@the-phi.com
+
Contact US
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
\ No newline at end of file
diff --git a/www/html/setup/configurator.html b/www/html/setup/configurator.html
new file mode 100644
index 0000000..d6ddca5
--- /dev/null
+++ b/www/html/setup/configurator.html
@@ -0,0 +1,83 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
Source: Database Technology
+
Select a database technology as a source
+
+
+
+
+
+
+
+
+
\ No newline at end of file
diff --git a/www/html/wizard/wizard.html b/www/html/wizard/wizard.html
new file mode 100644
index 0000000..c30e39d
--- /dev/null
+++ b/www/html/wizard/wizard.html
@@ -0,0 +1,199 @@
+
+
+
+
+
+
Wizard: auth file generator
+
+
This wizard generates an auth-file, a template file used to set up data-transport database connectivity. It supports best practice by keeping sensitive information out of code.
+
+ search for the database provider / vendors
+ click on the vendor and copy the generated code to a file
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ 0 found
+
+
+
+
+
+
+
+
+
+
+
+
+ Note :
+ Copy the code above to the auth-file and fill with appropriate values
+ Attributes with a value of zero (i.e. 0) are optional
+
+
+
+
+
+
+
+
+
+
Prerequisites
+
+
Familiarity with JSON format
+
Understand your current database security access policy
+ Ensure your policy (permissions) matches your use case
+
+
+
+
Things to know
+
+
Values assigned to attributes
+ value of one i.e 1 suggests a value must be provided
+ value of zero i.e 0 suggests the attribute is optional and can be removed
+
+
+ Supported databases (or database providers) to use in search
+
+