Variable is the Passage of Time

We live in an age where time is the scarcest resource. And it’s been that way since I can remember. Even as a middle school kid, I approached the hours of each day like lines on a budget sheet. The concept of doing something just to do it didn’t exist. There had to be a proven result that would somehow pay off in the next chapter of my life. And that mentality only got stronger as I entered the business world.

Given all the emphasis we place on time, it’s no surprise that time can be very significant when used as a data point in our studies. I discovered this recently when playing with some user data that my buddy, who has run a home cleaning app for two years, sent me to process. I wanted to discover any patterns that might be lurking in the amount of time between a potential customer’s downloading of the app and their last usage of it.

The data set has one column with each user’s download date and another with the date they last used the app. I used the simple base R command below to calculate the difference in days between these two columns and store it as a new variable in the data set:

# Days from download to last use (last use minus download, so values are not negative)
app$date_diff <- as.Date(as.character(app$lastuse), format="%Y-%m-%d")-
                  as.Date(as.character(app$download), format="%Y-%m-%d")

I was happy that this command did not require any packages beyond base R, which left less room for inexplicable errors.
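To make the calculation concrete, here is a minimal sketch on a toy data frame. The column names download and lastuse follow the description above, but the three rows are made up for illustration and are not from the real app data. Wrapping the result in as.numeric() is my own addition here, on the assumption that plain day counts are easier to summarize later than a difftime object.

```r
# Hypothetical toy data: column names follow the article, values are invented.
app <- data.frame(
  download = c("2016-01-01", "2016-02-15", "2016-03-10"),
  lastuse  = c("2016-01-31", "2016-02-15", "2016-05-09"),
  stringsAsFactors = FALSE
)

# Convert the text columns to Date objects (base R only).
download_date <- as.Date(app$download, format = "%Y-%m-%d")
lastuse_date  <- as.Date(app$lastuse,  format = "%Y-%m-%d")

# Subtracting two Date vectors gives a difftime measured in days;
# as.numeric() turns it into plain numbers.
app$date_diff <- as.numeric(lastuse_date - download_date)

print(app)
```

A user who last opened the app on the day they downloaded it gets a difference of 0, which may be worth treating as its own group in some analyses.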

Long story short, I had already separated some users whose behavior stood out to me into cohorts. The cohort I found most interesting included those who downloaded the app and input their credit card information (which is manual and not quick to do), but did not end up booking a clean. I found that this cohort averaged 65 days between downloading the app and last using it. The large number of users who downloaded the app but never input their credit card information and never booked stopped using it, on average, after 44 days. I am still playing with this data, so I am sure that there is more to be discovered. For now, though, my interpretation of this discrepancy is that the cohort who did input their credit card information had the app on their minds for longer than those who didn’t and may be more receptive to additional messaging encouraging them to book.
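A comparison like this can be sketched in base R with tapply(). The cohort labels and day counts below are hypothetical placeholders rather than the real figures, and the cohort column is an assumed name for however the groups are stored in the data set.

```r
# Hypothetical data: cohort labels and day counts are invented for illustration.
app <- data.frame(
  cohort    = c("card_no_booking", "card_no_booking",
                "no_card_no_booking", "no_card_no_booking", "no_card_no_booking"),
  date_diff = c(10, 30, 5, 10, 15),
  stringsAsFactors = FALSE
)

# Average number of days between download and last use, per cohort.
cohort_means <- tapply(app$date_diff, app$cohort, mean)

print(cohort_means)
```

Because a mean can be pulled around by a few very long-lived users, it may also be worth swapping mean for median in the same call to check that the gap between cohorts holds up.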


Do you have an instance in which the difference between two dates proved significant to one of your projects? How did you measure this difference and in what way did you apply it within your analysis? I want to know. Hit me with your responses on Twitter at @DikayoData or by email at dikayo@dikayodata.com.

Danielle Oberdier