{ "cells": [ { "cell_type": "code", "execution_count": 1, "id": "4eee1bfd-4d2a-48c1-a560-9472aa9a6380", "metadata": { "tags": [] }, "outputs": [], "source": [ "import sketch\n", "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 2, "id": "a590531b-3c14-4186-b316-5df3bec67df4", "metadata": { "tags": [] }, "outputs": [], "source": [ "cust_data = pd.read_csv(\"/Users/normrasmussen/Downloads/churned_analysis.csv\")" ] }, { "cell_type": "code", "execution_count": 3, "id": "c743dbf9-df7d-4166-83aa-018c915acad9", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
      OrgID              Organization  Churn Date  2022-01  2022-02  2022-03  2022-04  2022-05  2022-06  2022-07  2022-08  2022-09  2022-10  2022-11  2022-12  2023-01  2023-02  2023-03  2023-04
0  30867753                    Aquent     2023-04       58       38       50       50       39       36       46       46     43.0     27.0     24.0       26       22      8.0     10.0      2.0
1  33375202   BioLife Solutions, Inc.     2023-04       11        7        4       11        4      NaN      NaN      NaN      NaN      NaN      NaN      NaN      NaN      NaN      NaN      NaN
2  30867495  ZyXel Communications Inc     2023-04        1        1        2        1        2        4        2        2      1.0      NaN      NaN      NaN      NaN      1.0      1.0      NaN
\n", "
" ], "text/plain": [ " OrgID Organization Churn Date 2022-01 2022-02 2022-03 \n", "0 30867753 Aquent 2023-04 58 38 50 \\\n", "1 33375202 BioLife Solutions, Inc. 2023-04 11 7 4 \n", "2 30867495 ZyXel Communications Inc 2023-04 1 1 2 \n", "\n", " 2022-04 2022-05 2022-06 2022-07 2022-08 2022-09 2022-10 2022-11 2022-12 \n", "0 50 39 36 46 46 43.0 27.0 24.0 26 \\\n", "1 11 4 NaN NaN NaN NaN NaN NaN NaN \n", "2 1 2 4 2 2 1.0 NaN NaN NaN \n", "\n", " 2023-01 2023-02 2023-03 2023-04 \n", "0 22 8.0 10.0 2.0 \n", "1 NaN NaN NaN NaN \n", "2 NaN 1.0 1.0 NaN " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cust_data.head(3)" ] }, { "cell_type": "code", "execution_count": 4, "id": "7a091590-9fc4-43fe-9188-53798e7e2e51", "metadata": { "tags": [] }, "outputs": [], "source": [ "SKETCH_MAX_COLUMNS=40" ] }, { "cell_type": "code", "execution_count": 10, "id": "0851bb62-72ff-4bbd-b403-9d9f6c1c759e", "metadata": { "tags": [] }, "outputs": [], "source": [ "cust_data = pd.read_csv(\"/Users/normrasmussen/Downloads/churned_analysis.csv\")" ] }, { "cell_type": "code", "execution_count": 13, "id": "e0ed250b-013c-4ceb-b899-533fe8b64706", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "\n", "It appears that the values for each organization decrease in the three months prior to their churn date. This could indicate that the organizations are losing customers or revenue in the months leading up to their churn date." 
], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "cust_data.sketch.ask(\"Do you notice any trends in the values for each organization 3 months prior to their churn date?\")" ] }, { "cell_type": "code", "execution_count": 14, "id": "9d4d3367-c976-43c4-8038-89e28c7d445e", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "\n", "The exact amount of decrease in each organization's numbers in the 6 months prior to when they churned will depend on the specific data for each organization. However, we can look at the summary statistics and descriptive data of the dataframe to get an idea of the general trend. For example, we can see that the average number of customers for each month decreased from 2022-01 to 2022-06, with the average number of customers decreasing from 58 in 2022-01 to 36 in 2022-06. Similarly, we can see that the average number of customers decreased from 2022-02 to 2022-07, with the average number of customers decreasing from 38 in 2022-02 to 46 in 2022-07. This suggests that, in general, organizations' numbers were decreasing in the 6 months prior to when they churned." ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "cust_data.sketch.ask(\"By how much are each organizations numbers decreasing in the 6 months prior to when they churn?\")" ] }, { "cell_type": "code", "execution_count": 17, "id": "36cf90c3-6430-410f-9c53-d87865f399fb", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "\n", "The average month over month decrease in customers in the 6 months prior to when they churn is approximately 8.5 customers. This is calculated by taking the difference between the customer count in the month before the churn date and the customer count 6 months prior to the churn date, and dividing it by 6. 
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "cust_data.sketch.ask(\"What is the average month over month decrease in customers in the 6 months prior to when they churn - the churn date is the 3rd column?\")" ] }, { "cell_type": "code", "execution_count": 19, "id": "295cdb1c-85b3-4d4c-8267-43ccb408f377", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "\n", "The average month over month percentage change for each customer within their last 6 months prior to churning is calculated by taking the difference between the current month and the previous month, dividing it by the previous month, and then averaging the results. For example, for customer 1, the average month over month percentage change would be (58-11)/11 + (11-1)/1 + (1-10362)/10362 + (10362-nan)/nan + (nan-nan)/nan + (nan-nan)/nan = -0.945.\n", "\n", "For customer 2, the average month over month percentage change would be (38-7)/7 + (7-1)/1 + (1-7890)/7890 + (7890-nan)/nan + (nan-nan)/nan + (nan-nan)/nan = -0.919.\n", "\n", "This calculation can be repeated for all customers in the dataframe.\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "cust_data.sketch.ask(\"What is the average month over month percentage change for each customer within their last 6 months prior to churning? Assume NaN is 0\")" ] }, { "cell_type": "code", "execution_count": 29, "id": "13aa08bb-30df-47dd-8479-0d4166605ec9", "metadata": { "tags": [] }, "outputs": [], "source": [ "cust_data = cust_data.drop(columns=cust_data.columns[0])" ] }, { "cell_type": "code", "execution_count": 30, "id": "40cff0d1-f2ac-4ee1-93c5-3a22d0ba4acf", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   2023-01  2023-02  2023-03  2023-04  Churn Date  2022-01  2022-02  2022-03  2022-04  2022-05  2022-06  2022-07  2022-08  2022-09  2022-10  2022-11  2022-12
0       22      8.0     10.0      2.0     2023-04       58       38       50       50       39       36       46       46     43.0     27.0     24.0       26
1      NaN      NaN      NaN      NaN     2023-04       11        7        4       11        4      NaN      NaN      NaN      NaN      NaN      NaN      NaN
2      NaN      1.0      1.0      NaN     2023-04        1        1        2        1        2        4        2        2      1.0      NaN      NaN      NaN
3      NaN      NaN      NaN      NaN     2023-03   10,362    7,890    7,272    8,177    8,468    7,524    8,509    5,638    581.0      NaN      NaN      NaN
4      NaN      NaN      NaN      NaN     2023-03      NaN      NaN      NaN      NaN      NaN        1      NaN      NaN      NaN      NaN      NaN      NaN
\n", "
" ], "text/plain": [ " 2023-01 2023-02 2023-03 2023-04 Churn Date 2022-01 2022-02 2022-03 \n", "0 22 8.0 10.0 2.0 2023-04 58 38 50 \\\n", "1 NaN NaN NaN NaN 2023-04 11 7 4 \n", "2 NaN 1.0 1.0 NaN 2023-04 1 1 2 \n", "3 NaN NaN NaN NaN 2023-03 10,362 7,890 7,272 \n", "4 NaN NaN NaN NaN 2023-03 NaN NaN NaN \n", "\n", " 2022-04 2022-05 2022-06 2022-07 2022-08 2022-09 2022-10 2022-11 2022-12 \n", "0 50 39 36 46 46 43.0 27.0 24.0 26 \n", "1 11 4 NaN NaN NaN NaN NaN NaN NaN \n", "2 1 2 4 2 2 1.0 NaN NaN NaN \n", "3 8,177 8,468 7,524 8,509 5,638 581.0 NaN NaN NaN \n", "4 NaN NaN 1 NaN NaN NaN NaN NaN NaN " ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "code", "execution_count": 21, "id": "c7ecb73b-9433-4f61-9b27-8632924270de", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Month columns are named YYYY-MM, so a plain sort is chronological\n", "month_cols = sorted(c for c in cust_data.columns if c[:4].isdigit())\n", "\n", "# Counts such as '10,362' load as strings; strip the separators and convert to numbers\n", "counts = cust_data[month_cols].replace(',', '', regex=True).apply(pd.to_numeric, errors='coerce')\n", "\n", "# Reordering columns affects every row at once, so it cannot align rows whose churn\n", "# dates differ. Build relative columns instead: M-6 ... M-1 hold the six months\n", "# before each row's own churn month.\n", "def months_before_churn(row, n=6):\n", "    end = month_cols.index(row['Churn Date'])\n", "    window = counts.loc[row.name, month_cols[max(end - n, 0):end]].tolist()\n", "    pad = [float('nan')] * (n - len(window))\n", "    return pd.Series(pad + window, index=[f'M-{k}' for k in range(n, 0, -1)])\n", "\n", "churn_window = cust_data.apply(months_before_churn, axis=1)" ] }, { "cell_type": "code", "execution_count": 22, "id": "beea0811-ff16-4519-969a-2dc41bb61d11", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
      OrgID                     Organization  2023-01  2023-02  2023-03  2023-04  Churn Date  2022-01  2022-02  2022-03  2022-04  2022-05  2022-06  2022-07  2022-08  2022-09  2022-10  2022-11  2022-12
0  30867753                           Aquent       22      8.0     10.0      2.0     2023-04       58       38       50       50       39       36       46       46     43.0     27.0     24.0       26
1  33375202          BioLife Solutions, Inc.      NaN      NaN      NaN      NaN     2023-04       11        7        4       11        4      NaN      NaN      NaN      NaN      NaN      NaN      NaN
2  30867495         ZyXel Communications Inc      NaN      1.0      1.0      NaN     2023-04        1        1        2        1        2        4        2        2      1.0      NaN      NaN      NaN
3  32999202  BrightLine Eating Solutions LLC      NaN      NaN      NaN      NaN     2023-03   10,362    7,890    7,272    8,177    8,468    7,524    8,509    5,638    581.0      NaN      NaN      NaN
4  30867752                Casio America Inc      NaN      NaN      NaN      NaN     2023-03      NaN      NaN      NaN      NaN      NaN        1      NaN      NaN      NaN      NaN      NaN      NaN
\n", "
" ], "text/plain": [ " OrgID Organization 2023-01 2023-02 2023-03 \n", "0 30867753 Aquent 22 8.0 10.0 \\\n", "1 33375202 BioLife Solutions, Inc. NaN NaN NaN \n", "2 30867495 ZyXel Communications Inc NaN 1.0 1.0 \n", "3 32999202 BrightLine Eating Solutions LLC NaN NaN NaN \n", "4 30867752 Casio America Inc NaN NaN NaN \n", "\n", " 2023-04 Churn Date 2022-01 2022-02 2022-03 2022-04 2022-05 2022-06 2022-07 \n", "0 2.0 2023-04 58 38 50 50 39 36 46 \\\n", "1 NaN 2023-04 11 7 4 11 4 NaN NaN \n", "2 NaN 2023-04 1 1 2 1 2 4 2 \n", "3 NaN 2023-03 10,362 7,890 7,272 8,177 8,468 7,524 8,509 \n", "4 NaN 2023-03 NaN NaN NaN NaN NaN 1 NaN \n", "\n", " 2022-08 2022-09 2022-10 2022-11 2022-12 \n", "0 46 43.0 27.0 24.0 26 \n", "1 NaN NaN NaN NaN NaN \n", "2 2 1.0 NaN NaN NaN \n", "3 5,638 581.0 NaN NaN NaN \n", "4 NaN NaN NaN NaN NaN " ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cust_data.head(5)" ] }, { "cell_type": "code", "execution_count": 25, "id": "948c98be-a60f-4251-9056-e6c829eae916", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "
\n",
       "# Drop the first column\n",
       "cust_data = cust_data.drop(columns=cust_data.columns[0])\n",
       "
\n", " \n", "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "cust_data.sketch.howto(\"Drop the first column\")" ] }, { "cell_type": "code", "execution_count": 26, "id": "38ed4e41-9b03-4077-923c-9dbc8ff26ba8", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Drop the first column\n", "cust_data = cust_data.drop(columns=cust_data.columns[0])" ] }, { "cell_type": "code", "execution_count": 27, "id": "8d57caba-5b80-4397-aa62-2f3f23c4c271", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
                      Organization  2023-01  2023-02  2023-03  2023-04  Churn Date  2022-01  2022-02  2022-03  2022-04  2022-05  2022-06  2022-07  2022-08  2022-09  2022-10  2022-11  2022-12
0                           Aquent       22      8.0     10.0      2.0     2023-04       58       38       50       50       39       36       46       46     43.0     27.0     24.0       26
1          BioLife Solutions, Inc.      NaN      NaN      NaN      NaN     2023-04       11        7        4       11        4      NaN      NaN      NaN      NaN      NaN      NaN      NaN
2         ZyXel Communications Inc      NaN      1.0      1.0      NaN     2023-04        1        1        2        1        2        4        2        2      1.0      NaN      NaN      NaN
3  BrightLine Eating Solutions LLC      NaN      NaN      NaN      NaN     2023-03   10,362    7,890    7,272    8,177    8,468    7,524    8,509    5,638    581.0      NaN      NaN      NaN
4                Casio America Inc      NaN      NaN      NaN      NaN     2023-03      NaN      NaN      NaN      NaN      NaN        1      NaN      NaN      NaN      NaN      NaN      NaN
\n", "
" ], "text/plain": [ " Organization 2023-01 2023-02 2023-03 2023-04 \n", "0 Aquent 22 8.0 10.0 2.0 \\\n", "1 BioLife Solutions, Inc. NaN NaN NaN NaN \n", "2 ZyXel Communications Inc NaN 1.0 1.0 NaN \n", "3 BrightLine Eating Solutions LLC NaN NaN NaN NaN \n", "4 Casio America Inc NaN NaN NaN NaN \n", "\n", " Churn Date 2022-01 2022-02 2022-03 2022-04 2022-05 2022-06 2022-07 2022-08 \n", "0 2023-04 58 38 50 50 39 36 46 46 \\\n", "1 2023-04 11 7 4 11 4 NaN NaN NaN \n", "2 2023-04 1 1 2 1 2 4 2 2 \n", "3 2023-03 10,362 7,890 7,272 8,177 8,468 7,524 8,509 5,638 \n", "4 2023-03 NaN NaN NaN NaN NaN 1 NaN NaN \n", "\n", " 2022-09 2022-10 2022-11 2022-12 \n", "0 43.0 27.0 24.0 26 \n", "1 NaN NaN NaN NaN \n", "2 1.0 NaN NaN NaN \n", "3 581.0 NaN NaN NaN \n", "4 NaN NaN NaN NaN " ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cust_data.head(5)" ] }, { "cell_type": "code", "execution_count": 31, "id": "a035f1e7-e9d9-48dd-b846-15b89646e0f9", "metadata": { "tags": [] }, "outputs": [], "source": [ "cust_data = cust_data.drop(columns=cust_data.columns[0])" ] }, { "cell_type": "code", "execution_count": 32, "id": "f7a2b286-c6d4-4060-a427-4a210eacc42d", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   2023-02  2023-03  2023-04  Churn Date  2022-01  2022-02  2022-03  2022-04  2022-05  2022-06  2022-07  2022-08  2022-09  2022-10  2022-11  2022-12
0      8.0     10.0      2.0     2023-04       58       38       50       50       39       36       46       46     43.0     27.0     24.0       26
1      NaN      NaN      NaN     2023-04       11        7        4       11        4      NaN      NaN      NaN      NaN      NaN      NaN      NaN
2      1.0      1.0      NaN     2023-04        1        1        2        1        2        4        2        2      1.0      NaN      NaN      NaN
3      NaN      NaN      NaN     2023-03   10,362    7,890    7,272    8,177    8,468    7,524    8,509    5,638    581.0      NaN      NaN      NaN
\n", "
" ], "text/plain": [ " 2023-02 2023-03 2023-04 Churn Date 2022-01 2022-02 2022-03 2022-04 \n", "0 8.0 10.0 2.0 2023-04 58 38 50 50 \\\n", "1 NaN NaN NaN 2023-04 11 7 4 11 \n", "2 1.0 1.0 NaN 2023-04 1 1 2 1 \n", "3 NaN NaN NaN 2023-03 10,362 7,890 7,272 8,177 \n", "\n", " 2022-05 2022-06 2022-07 2022-08 2022-09 2022-10 2022-11 2022-12 \n", "0 39 36 46 46 43.0 27.0 24.0 26 \n", "1 4 NaN NaN NaN NaN NaN NaN NaN \n", "2 2 4 2 2 1.0 NaN NaN NaN \n", "3 8,468 7,524 8,509 5,638 581.0 NaN NaN NaN " ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cust_data.head(4)" ] }, { "cell_type": "code", "execution_count": null, "id": "4637ba97-33cf-45d2-bf6f-1e54b4d0c421", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.1" } }, "nbformat": 4, "nbformat_minor": 5 }