data-maker/pipeline.py

#!/usr/bin/env python3
	@staticmethod
		df = pd.read_gbq(SQL,credentials=credentials,dialect='standard')
		else :

	def shuffle(self,_args):
			for _item in _schema :
				if _item['type'] in ['DATE','TIMESTAMP','DATETIME'] :
					_df[_item['name']] = _df[_item['name']].astype(str)
			writer.write(_df,schema=_schema,table=args['from'])
		else:
			writer.write(_df,table=args['from'])
		_cast = {}
			args['data'] = args['data'][ list(set(df.columns)- set(_cols))]
				# we need to format the fields here to make sure we have something cohesive
				if set(df.columns) & set(_df.columns) :
		
		# info = {"full":{_id:_fname,"rows":_args['data'].shape[0]},"partial":{"path":_pname,"rows":data_comp.shape[0]} }
		# if partition :
		# 	info ['partition'] = int(partition)
		# logger.write({"module":"generate","action":"write","input":info} )
		args['batch_size'] = 2000 #if 'batch_size' not in args else int(args['batch_size'])

				job.name = 'Trainer # ' + str(index)
			time.sleep(2)