Data Analysis Using Funneling
Data mining is a technology that blends traditional data analysis methods with suitable algorithms to process and analyze huge amounts of data. Data mining covers the tasks of information discovery, while data analysis is the process of inspecting, transforming, and modeling that data with the goal of discovering useful information that supports crucial decision-making.
One of the many practical applications of data analysis is data funneling. Data funneling is a type of analysis where ….
Many software companies need to make important decisions based on how customers flow through their software. Funneling helps them here.
Let us take the example of an e-commerce shopping website, which has various screens, events, and flows. The purpose is to calculate the number of users who pass through each of the defined data nodes of the model. We can define a funnel in advance, which makes it easier to trace only that path. The alternative is to compute counts for the entire user flow, covering every possible way a user can travel through the model. We’ll discuss the all-flow method later.
In the predefined-funnel method, the calculation is based on the funnel path and the counts along that path. For each user, we follow the path they took and count how far they progressed through the funnel definition.
Core Logic
Let’s assume we have a user thread, that is, the sequence of nodes a user traveled over a particular period of time, and a defined set of funnel nodes to compare it against.
We’ll consider four main steps of our example e-commerce website:
A — Login screen
B — Screen Tab
C — Check Out
D — Payment Page
So the funnel here is defined as A -> B -> C -> D, in that exact order.
We also have the data of one user who navigated the website in various ways, and we use it to analyze our model.
User traveled nodes : A -> B -> E -> G -> C -> E -> C -> E -> G -> E -> C -> D
The Strict Path
In the strict path, the user thread must follow exactly the path defined in the funnel. If the thread takes a different route at any point before the end of the funnel, the funnel breaks there, and we record the data only up to the point the user reached. The counts are calculated accordingly.
Here, the user thread starts with node-A at index-0, then goes to node-B at index-1 (as defined in the funnel), but after that it goes to node-E, where node-C was expected. So the funnel breaks here.
The final count will be: A-1, B-1, C-0, D-0
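The strict rule can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function name strict_funnel_counts is hypothetical, and it assumes that the thread is compared with the funnel position by position from index 0 and that funnel step names are distinct.

```python
def strict_funnel_counts(path, funnel):
    """Count funnel steps reached when the user must follow the funnel exactly.

    Assumption: the comparison starts at index 0 of the user thread and
    stops at the first node that differs from the expected funnel step.
    """
    counts = {step: 0 for step in funnel}
    for visited, expected in zip(path, funnel):
        if visited != expected:
            break  # the funnel breaks at the first unexpected node
        counts[expected] = 1
    return counts


funnel = ["A", "B", "C", "D"]
path = ["A", "B", "E", "G", "C", "E", "C", "E", "G", "E", "C", "D"]
print(strict_funnel_counts(path, funnel))  # {'A': 1, 'B': 1, 'C': 0, 'D': 0}
```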
The Loose Path
In the loose funnel, the user thread is free to travel anywhere along its path as long as it visits the funnel nodes in order. Here, we traverse the user thread to check how far the user has progressed through the funnel, or whether they completed it. Once the user has completed the funnel, we can stop and record the data.
The user thread starts with node-A at index-0, then goes to node-B at index-1 (as defined in the funnel), then to node-E, which is not the expected node, but we don’t break the funnel. Instead, we look further along the same thread to find out whether the user reaches the next point in the funnel. So we ignore node-E and move on to node-G, which we also ignore because it isn’t the next checkpoint in the funnel either. Next comes node-C (as defined in the funnel). In the same way, we keep ignoring the irrelevant nodes the user has traveled and look for the next checkpoint defined in the funnel. If the funnel is not completed earlier, we need to look up to the last node of the thread to see how far the user has progressed in funnel order.
Here, the user reaches node-D at index-11. Once the funnel is completed, we can stop traversing the thread.
So, the final count will be: A-1, B-1, C-1, D-1
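The loose traversal described above can be sketched as follows. The function name loose_funnel_counts is hypothetical, and the sketch assumes distinct funnel step names; it simply skips nodes that are not the next expected checkpoint.

```python
def loose_funnel_counts(path, funnel):
    """Count funnel steps reached when detours between steps are allowed."""
    counts = {step: 0 for step in funnel}
    next_step = 0  # index of the funnel checkpoint we are waiting for
    for node in path:
        if next_step == len(funnel):
            break  # funnel completed, no need to traverse further
        if node == funnel[next_step]:
            counts[node] = 1
            next_step += 1
        # any other node is ignored; the funnel does not break
    return counts


funnel = ["A", "B", "C", "D"]
path = ["A", "B", "E", "G", "C", "E", "C", "E", "G", "E", "C", "D"]
print(loose_funnel_counts(path, funnel))  # {'A': 1, 'B': 1, 'C': 1, 'D': 1}
```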
This process is repeated for every user, and the cumulative counts give the final result of the funnel analysis.
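As a rough sketch of that aggregation, the snippet below computes how many funnel steps each user reached (using the loose rule) and adds the per-user results together. The names steps_reached and funnel_totals, and the user ids and paths other than the thread from the example, are made up for illustration.

```python
def steps_reached(path, funnel):
    """Return how many funnel steps the user reached, in order (loose rule)."""
    reached = 0
    for node in path:
        if reached == len(funnel):
            break
        if node == funnel[reached]:
            reached += 1
    return reached


def funnel_totals(paths_by_user, funnel):
    """Sum the per-user funnel results into one count per funnel step."""
    totals = {step: 0 for step in funnel}
    for path in paths_by_user.values():
        # every step up to the user's depth counts once for that user
        for step in funnel[:steps_reached(path, funnel)]:
            totals[step] += 1
    return totals


funnel = ["A", "B", "C", "D"]
# Hypothetical sample data: only user-1 comes from the example in the text.
paths_by_user = {
    "user-1": ["A", "B", "E", "G", "C", "E", "C", "E", "G", "E", "C", "D"],
    "user-2": ["A", "E", "B"],
    "user-3": ["E", "G"],
}
print(funnel_totals(paths_by_user, funnel))  # {'A': 2, 'B': 2, 'C': 1, 'D': 1}
```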
Threshold time
A user may visit some funnel nodes at one point in time and complete the rest of the funnel path only after a long gap. But how long should we wait? We need to set a threshold time, and only the data within that window is considered by the funneling logic. The clock starts when the user hits the first action of the funnel, and the logic checks the nodes traveled until the threshold time has elapsed.
Companies usually want to see daily funnel statistics for their users, so a commonly used threshold is one day. In practice, though, a day is too long to wait for a user’s actions. This is because the average uninterrupted time spent on an app is 15 minutes, and the average total time spent on an application is 2 hours a day.
So a reasonable choice is to wait one hour for the user to complete the actions of the funnel. After one hour, we record however far the user has traveled through the funnel.
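One way to sketch the threshold is to attach a timestamp to each node and stop counting once the window has elapsed. This is an illustration under stated assumptions: the function name steps_within_threshold and the sample timestamps are hypothetical, the events are assumed to be sorted by time, the loose rule is used for matching, and an event exactly at the deadline is treated as still inside the window.

```python
from datetime import datetime, timedelta


def steps_within_threshold(events, funnel, threshold=timedelta(hours=1)):
    """Return how many funnel steps were reached within the threshold.

    events: list of (node, timestamp) pairs, assumed sorted by timestamp.
    The clock starts when the first funnel step is hit.
    """
    reached = 0
    deadline = None  # unknown until the first funnel action is seen
    for node, ts in events:
        if deadline is not None and ts > deadline:
            break  # threshold elapsed: keep whatever was reached so far
        if reached < len(funnel) and node == funnel[reached]:
            if reached == 0:
                deadline = ts + threshold
            reached += 1
            if reached == len(funnel):
                break  # funnel completed inside the window
    return reached


funnel = ["A", "B", "C", "D"]
start = datetime(2024, 1, 1, 9, 0)  # hypothetical sample timestamps
events = [
    ("A", start),
    ("B", start + timedelta(minutes=10)),
    ("E", start + timedelta(minutes=20)),
    ("C", start + timedelta(minutes=50)),
    ("D", start + timedelta(minutes=90)),  # arrives after the one-hour window
]
print(steps_within_threshold(events, funnel))  # 3
print(steps_within_threshold(events, funnel, timedelta(hours=2)))  # 4
```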
Another perspective
The method discussed above is only one of several approaches to computing a funnel from users' data. There are usually several ways through the app flow to reach a particular node, so each of those paths will naturally have a different count.