Database operations¶

All operations available for databases. Supported databases: all that SqlAlchemy support

Connect¶

Db.connect(url: str)¶

Connect to the database and set it as main database

Parameters:	url (str) – path to the database, uses the Sqlalchemy format
Example:	`ds.connect("sqlite:///mydb.slqite")`

Load data¶

Db.load(table: str)¶

Set the main dataframe from a table’s data

Parameters:	table (str) – table name
Example:	`ds.load("mytable")`

Db.load_django(query: django query)¶

Load the main dataframe from a django orm query

Parameters:	query (django query) – django query from a model
Example:	`ds.load_django(Mymodel.objects.all())`

Db.load_django_(query: django query) → Ds¶

Returns a DataSwim instance from a django orm query

Parameters:	query (django query) – django query from a model
Returns:	a dataswim instance with data from a django query
Return type:	Ds
Example:	`ds2 = ds.load_django_(Mymodel.objects.all())`

Infos¶

Db.tables()¶

Print the existing tables in a database

Example:	`ds.tables()`

Db.tables_() → list¶

Return a list of the existing tables in a database

Returns:	list of the table names
Return type:	list
Example:	`tables = ds.tables_()`

Db.table(name: str)¶

Display info about a table: number of rows and columns

Parameters:	name (str) – name of the table
Example:	`tables = ds.table("mytable")`

Insert¶

Insert.to_db(table: str, dtypes: List[sqlalchemy.sql.sqltypes.SchemaType] = None)¶

Save the main dataframe to the database

Parameters:	table (str) – the table to create dtypes (List[SchemaType], optional) – SqlAlchemy columns type, defaults to None, will be infered if not provided

Insert.update_table(table: str, pks: List[str] = ['id'], mirror: bool = True)¶

Update records in a database table from the main dataframe

Parameters:	table (str) – table to update pks (List[str], optional :param mirror: delete the rows not in the new datataset) – if rows with matching pks exist they will be updated, otherwise a new row is inserted in the table, defaults to [“id”]

Insert.insert(table: str, records: dict, create_cols: bool = False, dtypes: List[sqlalchemy.sql.sqltypes.SchemaType] = None)¶

Insert one or many records in the database from a dictionary: or a list of dictionaries

Parameters:	table (str) – the table to insert into records (dict) – a dictionnary or list of dictionnaries of the data to insert create_cols (bool, optional) – create the columns if they don’t exist, defaults to False dtypes (SchemaType, optional) – list of SqlAlchemy table types, defaults to None. The types are infered if not provided

Insert.upsert(table: str, record: dict, create_cols: bool = False, dtypes: List[sqlalchemy.sql.sqltypes.SchemaType] = None, pks: List[str] = ['id'])¶

Upsert a record in a table

Parameters:

table (str) – the table to upsert into
record (dict) – dictionary with the data to upsert
create_cols (bool, optional) – create the columns if it doesn’t exist, defaults to False
dtypes (List[SchemaType], optional) – list of SqlAlchemy column types, defaults to None
pks (List[str], optional) – if rows with matching pks exist they will be updated, otherwise a new row is inserted in the table, defaults to [“id”]

Relations¶

Relation.relation(table: str, origin_field: str, search_field: str, destination_field: str = None, id_field: str = 'id')¶

Add a column to the main dataframe from a relation foreign key

Parameters:

table (str) – the table to select from
origin_field (str) – the column name in the origin table to search from, generally an id column
search_field (str) – the column name in the foreign table
destination_field (str, optional) – name of the column to be created with the data in the datframe, defaults to None, will be named as the origin_field if not provided
id_field (str, optional) – name of the primary key to use, defaults to “id”

example: ds.relation("product", "category_id", "name")

Relation.relation_(table: str, origin_field: str, search_field: str, destination_field=None, id_field='id') → pandas.core.frame.DataFrame¶

Returns a DataSwim instance with a column filled from a relation: foreign key

Parameters:	table (str) – the table to select from origin_field (str) – the column name in the origin table to search from, generally an id column search_field (str) – the column name in the foreign table destination_field (str, optional) – name of the column to be created with the data in the datframe, defaults to None, will be named as the origin_field if not provided id_field (str, optional) – name of the primary key to use, defaults to “id”
Returns:	a pandas DataFrame
Return type:	DataFrame

InfluxDb¶

influx_init (url, port, user, pwd, db)

Initialize an Influxdb database client

influx_to_csv (measurement, batch_size=5000)

Batch export data from an Influxdb measurement to csv