In a collaborative environment (Jupyter-x; Zeppelin; AWS Service Workbench)
Plugins: Architecture & Design
Plugins follow a plugin architecture built on the Iterator design pattern. They function as a pipeline, i.e., they are executed sequentially in the order in which they are listed in the parameter. Effectively, the output of one function becomes the input to the next.
Data Transport UML Plugin Component View
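The sequential behaviour described above can be illustrated with a minimal sketch that does not use data-transport itself. The step functions `drop_missing` and `add_total`, the helper `run_pipeline`, and the column names are all hypothetical and exist only for illustration; the sketch assumes only that each step takes a pandas.DataFrame and returns one.

```python
from functools import reduce

import pandas as pd


def drop_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical step 1: remove rows that contain missing values."""
    return df.dropna().reset_index(drop=True)


def add_total(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical step 2: derive a column from the cleaned data."""
    out = df.copy()
    out["total"] = out["price"] * out["qty"]
    return out


def run_pipeline(df: pd.DataFrame, steps) -> pd.DataFrame:
    """Apply each step in order; each output becomes the next input."""
    return reduce(lambda data, step: step(data), steps, df)


frame = pd.DataFrame({"price": [2.0, None, 5.0], "qty": [3, 1, 2]})
result = run_pipeline(frame, [drop_missing, add_total])
print(result)
```

The order of the list matters: reversing the two steps here would compute `total` on the row with the missing price before that row is dropped.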
Quick Start
The code here shows a function that will be registered as "autoincrement".
The data will always be a pandas.DataFrame
For the sake of this example the file will be my-plugin.py
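import transport
import numpy as np

# Running offset so that IDs continue across successive calls
_index = 0

@transport.Plugin(name='autoincrement')
def _incr(_data):
    global _index
    # Assign consecutive IDs starting at the current offset
    _data['_id'] = _index + np.arange(_data.shape[0])
    # Advance the offset by the number of rows just processed
    _index += _data.shape[0]
    return _data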
data-transport comes with a built-in command line interface (CLI). It allows plugins to be registered and reused.
Registered functions are stored in $HOME/.data-transport/plugins/code
Any updates to my-plugin.py will require re-registering the file
Additional plugin registry functions (list, test) are available
$ transport plugin-add demo ./my-plugin.py
The following command shows what data-transport knows about the function, i.e., its real name and the name to be used in code.
$ transport plugin-test demo.autoincrement
Once registered, the plugins are ready for use within code or a configuration file (auth-file).