anazalea/pySankey

pySankey

Uses matplotlib to create simple Sankey diagrams flowing only from left to right.

Requirements

Requires python-tk (for Python 2.7) or python3-tk (for Python 3.x). You can install the other requirements with:

    pip install -r requirements.txt
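Before installing the rest, it can help to confirm that Tk is importable. The following is a minimal sketch for Python 3 only; `tk_available` is a hypothetical helper written for this README, not part of pySankey.

```python
def tk_available():
    """Return True if the tkinter module can be imported (Python 3 name)."""
    try:
        import tkinter  # noqa: F401  (imported only to test availability)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    if tk_available():
        print("tkinter is available")
    else:
        print("tkinter is missing; install python3-tk first")
```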

Example

With fruits.txt:

     true       predicted
0    blueberry  orange
1    lime       orange
2    blueberry  lime
3    apple      orange
...  ...        ...
996  lime       orange
997  blueberry  orange
998  orange     banana
999  apple      lime

1000 rows × 2 columns

You can generate a Sankey diagram with this code:

import pandas as pd
from pysankey import sankey

pd.options.display.max_rows = 8
df = pd.read_csv(
    'pysankey/fruits.txt', sep=' ', names=['true', 'predicted']
)
colorDict = {
    'apple':'#f71b1b',
    'blueberry':'#1b7ef7',
    'banana':'#f3f71b',
    'lime':'#12e23f',
    'orange':'#f78c1b'
}
sankey(
    df['true'], df['predicted'], aspect=20, colorDict=colorDict,
    fontsize=12, figureName="fruit"
)
# Result is in "fruit.png"

[Figure: Fruity Alchemy]
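To see what the diagram summarizes, the rows can be tabulated by hand: each distinct (true, predicted) pair is one flow, and its size is the number of rows with that pair. The sketch below uses only the first four rows shown above. `flow_counts` and `missing_colors` are hypothetical helpers, not pySankey functions, and the color check assumes that every label should have an entry in `colorDict`.

```python
from collections import Counter

# First four rows of the table above, as (true, predicted) pairs.
pairs = [
    ("blueberry", "orange"),
    ("lime", "orange"),
    ("blueberry", "lime"),
    ("apple", "orange"),
]

colorDict = {
    "apple": "#f71b1b",
    "blueberry": "#1b7ef7",
    "banana": "#f3f71b",
    "lime": "#12e23f",
    "orange": "#f78c1b",
}


def flow_counts(pairs):
    """Count how many rows go from each left label to each right label."""
    return Counter(pairs)


def missing_colors(pairs, color_dict):
    """Return the labels (from either side) that have no color assigned."""
    labels = {label for pair in pairs for label in pair}
    return sorted(labels - set(color_dict))


if __name__ == "__main__":
    for (left, right), count in flow_counts(pairs).items():
        print(left, "->", right, count)
    print("labels without a color:", missing_colors(pairs, colorDict))
```

Running a check like `missing_colors` before plotting is a cheap way to catch a misspelled label in the color mapping.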

You can also weight each flow, for example by revenue. With customers-goods.csv:

,customer,good,revenue
0,John,fruit,5.5
1,Mike,meat,11.0
2,Betty,drinks,7.0
3,Ben,fruit,4.0
4,Betty,bread,2.0
5,John,bread,2.5
6,John,drinks,8.0
7,Ben,bread,2.0
8,Mike,bread,3.5
9,John,meat,13.0

import pandas as pd
from pysankey import sankey

df = pd.read_csv(
    'pysankey/customers-goods.csv', sep=',',
    # header=0 skips the header row shown above; names then replaces it
    header=0, names=['id', 'customer', 'good', 'revenue']
)
sankey(
    left=df['customer'], right=df['good'], rightWeight=df['revenue'], aspect=20,
    fontsize=20, figureName="customer-good"
)
# Result is in "customer-good.png"

[Figure: Customer goods]
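Weighting means that a row contributes its revenue to a flow instead of counting as 1. The sketch below sums the ten rows shown above per (customer, good) pair using only the standard library. It illustrates the idea and is not pySankey's internal code; `weighted_flows` and `totals_by` are hypothetical names.

```python
import csv
import io
from collections import defaultdict

CSV_TEXT = """\
,customer,good,revenue
0,John,fruit,5.5
1,Mike,meat,11.0
2,Betty,drinks,7.0
3,Ben,fruit,4.0
4,Betty,bread,2.0
5,John,bread,2.5
6,John,drinks,8.0
7,Ben,bread,2.0
8,Mike,bread,3.5
9,John,meat,13.0
"""


def weighted_flows(text):
    """Sum revenue per (customer, good) pair; DictReader consumes the header row."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(text)):
        totals[(row["customer"], row["good"])] += float(row["revenue"])
    return dict(totals)


def totals_by(flows, side):
    """Sum flow weights per label on one side: 0 for customer, 1 for good."""
    out = defaultdict(float)
    for pair, weight in flows.items():
        out[pair[side]] += weight
    return dict(out)


if __name__ == "__main__":
    flows = weighted_flows(CSV_TEXT)
    print(totals_by(flows, 0))  # revenue per customer
    print(totals_by(flows, 1))  # revenue per good
```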

Package development

Lint

pylint pysankey

Testing

python -m unittest

Coverage

coverage run -m unittest
coverage html
# Open htmlcov/index.html in a browser