Database operations

All operations available for databases. Supported databases: all that SqlAlchemy support

Connect

Db.connect(url: str)

Connect to the database and set it as main database

Parameters:url (str) – path to the database, uses the Sqlalchemy format
Example:ds.connect("sqlite:///mydb.slqite")

Load data

Db.load(table: str)

Set the main dataframe from a table’s data

Parameters:table (str) – table name
Example:ds.load("mytable")
Db.load_django(query: django query)

Load the main dataframe from a django orm query

Parameters:query (django query) – django query from a model
Example:ds.load_django(Mymodel.objects.all())
Db.load_django_(query: django query) → Ds

Returns a DataSwim instance from a django orm query

Parameters:query (django query) – django query from a model
Returns:a dataswim instance with data from a django query
Return type:Ds
Example:ds2 = ds.load_django_(Mymodel.objects.all())

Infos

Db.tables()

Print the existing tables in a database

Example:ds.tables()
Db.tables_() → list

Return a list of the existing tables in a database

Returns:list of the table names
Return type:list
Example:tables = ds.tables_()
Db.table(name: str)

Display info about a table: number of rows and columns

Parameters:name (str) – name of the table
Example:tables = ds.table("mytable")

Insert

Insert.to_db(table: str, dtypes: List[sqlalchemy.sql.sqltypes.SchemaType] = None)

Save the main dataframe to the database

Parameters:
  • table (str) – the table to create
  • dtypes (List[SchemaType], optional) – SqlAlchemy columns type, defaults to None, will be infered if not provided
Insert.update_table(table: str, pks: List[str] = ['id'], mirror: bool = True)

Update records in a database table from the main dataframe

Parameters:
  • table (str) – table to update
  • pks (List[str], optional :param mirror: delete the rows not in the new datataset) – if rows with matching pks exist they will be updated, otherwise a new row is inserted in the table, defaults to [“id”]
Insert.insert(table: str, records: dict, create_cols: bool = False, dtypes: List[sqlalchemy.sql.sqltypes.SchemaType] = None)
Insert one or many records in the database from a dictionary
or a list of dictionaries
Parameters:
  • table (str) – the table to insert into
  • records (dict) – a dictionnary or list of dictionnaries of the data to insert
  • create_cols (bool, optional) – create the columns if they don’t exist, defaults to False
  • dtypes (SchemaType, optional) – list of SqlAlchemy table types, defaults to None. The types are infered if not provided
Insert.upsert(table: str, record: dict, create_cols: bool = False, dtypes: List[sqlalchemy.sql.sqltypes.SchemaType] = None, pks: List[str] = ['id'])

Upsert a record in a table

Parameters:
  • table (str) – the table to upsert into
  • record (dict) – dictionary with the data to upsert
  • create_cols (bool, optional) – create the columns if it doesn’t exist, defaults to False
  • dtypes (List[SchemaType], optional) – list of SqlAlchemy column types, defaults to None
  • pks (List[str], optional) – if rows with matching pks exist they will be updated, otherwise a new row is inserted in the table, defaults to [“id”]

Relations

Relation.relation(table: str, origin_field: str, search_field: str, destination_field: str = None, id_field: str = 'id')

Add a column to the main dataframe from a relation foreign key

Parameters:
  • table (str) – the table to select from
  • origin_field (str) – the column name in the origin table to search from, generally an id column
  • search_field (str) – the column name in the foreign table
  • destination_field (str, optional) – name of the column to be created with the data in the datframe, defaults to None, will be named as the origin_field if not provided
  • id_field (str, optional) – name of the primary key to use, defaults to “id”

example: ds.relation("product", "category_id", "name")

Relation.relation_(table: str, origin_field: str, search_field: str, destination_field=None, id_field='id') → pandas.core.frame.DataFrame
Returns a DataSwim instance with a column filled from a relation
foreign key
Parameters:
  • table (str) – the table to select from
  • origin_field (str) – the column name in the origin table to search from, generally an id column
  • search_field (str) – the column name in the foreign table
  • destination_field (str, optional) – name of the column to be created with the data in the datframe, defaults to None, will be named as the origin_field if not provided
  • id_field (str, optional) – name of the primary key to use, defaults to “id”
Returns:

a pandas DataFrame

Return type:

DataFrame

InfluxDb

influx_init (url, port, user, pwd, db)

Initialize an Influxdb database client

influx_to_csv (measurement, batch_size=5000)

Batch export data from an Influxdb measurement to csv