Example usage

To use magmaviz in a project:

from magmaviz.boxplot import boxplot
from magmaviz.corrplot import corrplot
from magmaviz.histogram import histogram
from magmaviz.scatterplot import scatterplot

Toy Dataset

from vega_datasets import data
cars = data.cars()
cars.head(3)
                        Name  Miles_per_Gallon  Cylinders  Displacement  Horsepower  Weight_in_lbs  Acceleration        Year  Origin
0  chevrolet chevelle malibu              18.0          8         307.0       130.0           3504          12.0  1970-01-01     USA
1          buick skylark 320              15.0          8         350.0       165.0           3693          11.5  1970-01-01     USA
2         plymouth satellite              18.0          8         318.0       150.0           3436          11.0  1970-01-01     USA

Boxplot

We can create boxplots to view the distribution of a variable across one or more categories using the boxplot() function. The function labels the plot automatically from the column names that are supplied, and has an option to facet.

boxplot(cars, 'Miles_per_Gallon', 'Origin')

Correlation plot

We can create a correlation plot of all the numeric features in the dataframe by calling corrplot(). The values shown are the Pearson correlation coefficients. A diverging color scheme is used so that positive and negative correlations are easy to tell apart. There is an option to print the values as a list.

corrplot(cars, print_corr=False)
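As a reminder of what each cell in the correlation plot represents, here is a minimal pure-Python sketch of the Pearson correlation coefficient. It is not how corrplot() computes its values internally (that is not shown here); the helper name pearson is hypothetical, and the inputs are the Horsepower and Miles_per_Gallon values from the three rows of the toy dataset shown above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Hypothetical helper for illustration only; not part of magmaviz.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Values taken from the first three rows of the cars dataset
horsepower = [130.0, 165.0, 150.0]
mpg = [18.0, 15.0, 18.0]
r = pearson(horsepower, mpg)
print(round(r, 3))
```

The coefficient always lies between -1 and 1. For these three rows it is negative, meaning higher horsepower goes with lower mileage, which is the kind of relationship a diverging color scheme makes visible at a glance.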

Histogram

To generate a histogram for a categorical feature with an aggregation function, we can call histogram(). The aggregation functions include average, count and distinct, among others. The full list of accepted aggregation functions is provided in the documentation.

histogram(cars, x='Horsepower', y='count()')
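To make the aggregation options concrete, the sketch below shows in plain Python what count, average and distinct compute for each group. It only illustrates the idea and is not magmaviz's implementation; the aggregate helper and the sample rows are made up for this example.

```python
from collections import defaultdict
from statistics import mean

# Made-up (category, value) pairs, for illustration only
rows = [
    ("USA", 130.0),
    ("USA", 165.0),
    ("Japan", 95.0),
    ("Japan", 95.0),
    ("Europe", 46.0),
]


def aggregate(rows, func):
    """Group values by category and apply func to each group.

    Hypothetical helper; not part of magmaviz.
    """
    groups = defaultdict(list)
    for category, value in rows:
        groups[category].append(value)
    return {category: func(values) for category, values in groups.items()}


counts = aggregate(rows, len)                          # number of rows per group
averages = aggregate(rows, mean)                       # mean value per group
distincts = aggregate(rows, lambda v: len(set(v)))     # unique values per group

print(counts)
print(averages)
print(distincts)
```

Each bar in an aggregated histogram corresponds to one of these per-group results, with the chosen aggregation deciding the bar height.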

Scatter plot

A scatter plot of two numeric features can be created using the scatterplot() function. The style of the points, the labels of the plot and the scale are customizable. The x and y axis names are generated automatically if not supplied. Customizations of the points include opacity, shape and size.

scatterplot(cars, 'Horsepower', 'Acceleration', c='Origin')