{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "#### Writing to mongodb\n", "\n", "Ensure MongoDB is installed on the system. The cell below creates a DataFrame and stores it in MongoDB." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2.0.0\n" ] } ], "source": [ "#\n", "# Writing to mongodb database\n", "#\n", "import transport\n", "from transport import providers\n", "import pandas as pd\n", "# Sample data to store\n", "_data = pd.DataFrame({\"name\":['James Bond','Steve Rogers','Steve Nyemba'],'age':[55,150,44]})\n", "# Writer for the 'friends' collection in the 'demo' database\n", "mgw = transport.factory.instance(provider=providers.MONGODB,db='demo',collection='friends',context='write')\n", "mgw.write(_data)\n", "print (transport.__version__)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Reading from mongodb\n", "\n", "The cell below reads the data written by the cell above and computes the average age within a MongoDB aggregation pipeline. Behind the scenes, the aggregation is executed using **db.runCommand**.\n", "\n", "- Basic read of the designated collection: **find**\n", "- Executing an aggregate pipeline against a collection: **aggregate**\n", "\n", "**NOTE**\n", "\n", "**transport.factory.instance** and **transport.instance** are equivalent. The former lets maintainers know that a factory design pattern is used." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " name age\n", "0 James Bond 55\n", "1 Steve Rogers 150\n", "--------- STATISTICS ------------\n", " _id _counts _mean\n", "0 0 2 102.5\n" ] } ], "source": [ "\n", "import transport\n", "from transport import providers\n", "# Reader for the 'friends' collection in the 'foo' database (the write cell above targets db='demo')\n", "mgr = transport.instance(provider=providers.MONGODB,db='foo',collection='friends')\n", "# Basic read of the whole collection\n", "_df = mgr.read()\n", "# Aggregation pipeline: count the documents and average the 'age' field\n", "PIPELINE = [{\"$group\":{\"_id\":0,\"_counts\":{\"$sum\":1}, \"_mean\":{\"$avg\":\"$age\"}}}]\n", "_sdf = mgr.read(aggregate='friends',pipeline=PIPELINE)\n", "print (_df)\n", "print ('--------- STATISTICS ------------')\n", "print (_sdf)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The cell below shows the content of an auth_file. If the dataset/table in question is not to be shared, you can use an auth_file that contains the information associated with the parameters.\n", "\n", "**NOTE**:\n", "\n", "The auth_file is intended to be **JSON** formatted." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'host': 'klingon.io',\n", " 'port': 27017,\n", " 'username': 'me',\n", " 'password': 'foobar',\n", " 'db': 'foo',\n", " 'collection': 'friends',\n", " 'authSource': '',\n", " 'mechanism': ''}" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "{\n", " \"host\":\"klingon.io\",\"port\":27017,\"username\":\"me\",\"password\":\"foobar\",\"db\":\"foo\",\"collection\":\"friends\",\n", " \"authSource\":\"\",\"mechanism\":\"\"\n", "}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": 
"python", "pygments_lexer": "ipython3", "version": "3.9.7" } }, "nbformat": 4, "nbformat_minor": 2 }