How would you calculate the change of popularity over time?

fpianz · October 20, 2020, 1:52pm

Hi,

I’m working with some colleagues on a project and we are unsure about the most appropriate way to calculate the popularity of a certain product. I’d like to hear your opinion on the options we have (or your suggestions for others, if you have them!).

Let’s say we want to know how the popularity of pizza in Milano changes over the years. We know how many pizzas are served every year in total (tot_pizza), how many of each kind of pizza (e.g. n_margherita), and how many restaurants there are (n_restaurants). All these variables can change year by year.

How would you calculate the average popularity of a kind of pizza in a year?

Please comment explaining the reasons for your choice.

Option 1: margherita_popularity = n_margherita / n_restaurants
Option 2: margherita_popularity = n_margherita / tot_pizza
Option 3: margherita_popularity = n_margherita / tot_pizza * n_restaurants
Option 4: margherita_popularity = n_margherita / tot_pizza / n_restaurants
Other

folgert · October 20, 2020, 3:11pm

I’d say other, if you have some information about the frequency with which particular pizzas are eaten at different restaurants. You could then try some measure of dispersion, like Juilland’s D, which should reflect both the spread (i.e. across how many restaurants) and the variance in frequency. My intuition would be that pizzas that are frequently eaten at multiple restaurants are more popular than others.
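
To make the idea concrete, here is a minimal sketch of Juilland’s D in Python, using the common formulation D = 1 - V / sqrt(n - 1), where V is the coefficient of variation (population standard deviation divided by the mean) of the counts across n restaurants. The function name and the per-restaurant counts are made up for illustration, and the sketch assumes the restaurants are comparable in size, which real data may not satisfy.

```python
from math import sqrt
from statistics import mean, pstdev


def juilland_d(counts):
    """Juilland's D for one pizza, given its count at each restaurant.

    Values near 1 mean the pizza is spread evenly across restaurants;
    values near 0 mean it is concentrated in a single restaurant.
    Assumes restaurants of comparable size (hypothetical simplification).
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need counts for at least two restaurants")
    avg = mean(counts)
    if avg == 0:
        raise ValueError("the pizza was never sold")
    # Coefficient of variation, using the population standard deviation.
    variation = pstdev(counts) / avg
    return 1 - variation / sqrt(n - 1)


# Hypothetical counts: 20 margheritas sold across 4 restaurants.
evenly_spread = [5, 5, 5, 5]
concentrated = [20, 0, 0, 0]

print(round(juilland_d(evenly_spread), 3))
print(round(juilland_d(concentrated), 3))
```

Both lists contain the same total number of pizzas, so a plain share of total sales would not distinguish them, while D does.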

melvin.wevers · October 20, 2020, 3:36pm

Also, for products that are consumed infrequently (less popular ones), a small spike might already indicate a burst of popularity, whereas for popular products fluctuations may be less indicative of changes in popularity. Maybe this alludes to the perception of popularity. Kleinberg’s burst algorithm might be useful to examine trends in pizza popularity.

If I had to choose from the options, I’d choose option 3, I think.

fpianz · October 20, 2020, 10:22pm

Thanks! I didn’t know about measures of dispersion; they look like a very useful solution.

fpianz · October 20, 2020, 10:25pm

I read this some time ago but I didn’t make the connection to our case… thanks a lot!

folgert · October 21, 2020, 5:45am

This article by Stefan Gries discusses all kinds of dispersion measures: http://www.stgries.info/research/2008_STG_Dispersion_IJCL.pdf It might be interesting for your case (which is probably not only about Margherita pizza ;)).

melvin.wevers · October 21, 2020, 6:49am

Almost feels like time for a reading list topic

fotis · October 22, 2020, 2:12pm

Probably I am slow, but what is the difference between option 3 and option 4?

fpianz · October 22, 2020, 11:55pm

I think they are both imprecise, but option 3 says something more about the popularity of the pizza in space or among producers, e.g. if a pizza starts spreading to restaurants in a new area of the city. Let’s say that 20 margheritas out of 100 total pizzas were sold in both 2018 and 2019. If in 2018 they were sold in 4 restaurants and in 2019 in 5, the popularity would increase from 0.8 to 1.

Option 4 is weighted more with respect to the potential reach. If in 2019 there is one more restaurant, all pizzas have more chances to be sold because more people can eat at the same time. With 20/100 margheritas sold in 5 rather than 4 restaurants, the popularity would decrease from 0.05 to 0.04, because an increase in potential sales did not correspond to an actual increase in sales, maybe because people started buying more pepperoni pizza.
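
For anyone who wants to check the arithmetic, here is a small Python sketch that evaluates all four options from the poll on the example numbers (20 margheritas out of 100 pizzas, with 4 restaurants in 2018 and 5 in 2019). The function name and the dictionary layout are just for illustration; the variable names follow the original question.

```python
def popularity_options(n_margherita, tot_pizza, n_restaurants):
    """Return the four candidate popularity scores from the poll."""
    share = n_margherita / tot_pizza  # fraction of all pizzas sold
    return {
        "option_1": n_margherita / n_restaurants,  # sales per restaurant
        "option_2": share,                         # share of total sales
        "option_3": share * n_restaurants,         # share scaled up by spread
        "option_4": share / n_restaurants,         # share per restaurant
    }


# Example numbers from the post: same sales, one extra restaurant in 2019.
scores_2018 = popularity_options(n_margherita=20, tot_pizza=100, n_restaurants=4)
scores_2019 = popularity_options(n_margherita=20, tot_pizza=100, n_restaurants=5)

for name in scores_2018:
    print(f"{name}: 2018 = {scores_2018[name]:.2f}, 2019 = {scores_2019[name]:.2f}")
```

With identical sales in both years, option 2 does not move at all, option 3 goes up and options 1 and 4 go down, so the choice really depends on whether an extra restaurant should count as more reach or as more competition.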

P.S. I hope food talk falls within the humanities

folgert · October 23, 2020, 8:58am

It definitely does! Didn’t you write something about food, @melvin.wevers?

melvin.wevers · October 26, 2020, 1:57pm

Indeed, discourse on soft drinks in newspapers. I applied burst detection to car advertisements to detect innovations and their stickiness in advertising discourse.