How would you calculate the change in popularity over time?


I’m working with some colleagues on a project and we are unsure about the most appropriate way to calculate the popularity of a certain product. I’d like to hear your opinion on the options we have (or your suggestions for others, if you have any!).

Let’s say we want to know how the popularity of pizza in Milano changes over the years. We know how many pizzas are served in total every year (tot_pizza), how many of each kind of pizza (e.g. n_margherita), and how many restaurants there are (n_restaurants). All these variables can change from year to year.

How would you calculate the average popularity of a kind of pizza in a year?

Please comment explaining the reasons for your choice.

  • Option 1: margherita_popularity = n_margherita / n_restaurants
  • Option 2: margherita_popularity = n_margherita / tot_pizza
  • Option 3: margherita_popularity = n_margherita / tot_pizza * n_restaurants
  • Option 4: margherita_popularity = n_margherita / tot_pizza / n_restaurants
  • Other



I’d say other, if you have some information about how often particular pizzas are eaten at different restaurants. You could then try some measure of dispersion, like Juilland’s D, which should reflect both the spread (i.e. how many restaurants) and the variance in frequency. My intuition would be that pizzas that are frequently eaten at multiple restaurants are more popular than others.
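
To make the idea a bit more concrete, here is a minimal Python sketch of Juilland’s D as it is usually given, D = 1 − V / √(n − 1), where V is the coefficient of variation (population standard deviation divided by the mean) of the counts across n parts. Treating each restaurant as one "part" of roughly comparable size is my assumption, and the function name and the example counts are made up for illustration.

```python
from math import sqrt
from statistics import mean, pstdev


def juilland_d(counts_per_restaurant):
    """Juilland's D for one kind of pizza.

    counts_per_restaurant: how many of that pizza each restaurant sold.
    Assumes restaurants are of roughly comparable size (equal-sized parts).
    Returns a value near 1 for an even spread and near 0 when all sales
    are concentrated in a single restaurant.
    """
    n = len(counts_per_restaurant)
    if n < 2:
        raise ValueError("need at least two restaurants")
    avg = mean(counts_per_restaurant)
    if avg == 0:
        raise ValueError("D is undefined when nothing was sold")
    # Coefficient of variation, using the population standard deviation.
    v = pstdev(counts_per_restaurant) / avg
    return 1 - v / sqrt(n - 1)


# Hypothetical data: the same 20 margheritas, spread differently.
even_spread = juilland_d([5, 5, 5, 5])
one_place_only = juilland_d([20, 0, 0, 0])
print(f"even spread: {even_spread:.2f}, one restaurant only: {one_place_only:.2f}")
```

With these made-up counts the evenly spread pizza scores 1 and the pizza sold in a single restaurant scores 0, even though both sold 20 in total, which is the kind of difference the plain ratios cannot show.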


Also, for products that are consumed infrequently (i.e. less popular ones), a small spike might already indicate a burst of popularity, whereas for popular products fluctuations may be less indicative of changes in popularity. Maybe this has to do with how popularity is perceived. Kleinberg’s burst algorithm might be useful to examine trends in pizza popularity.

If I had to choose from the options, I think I’d choose Option 3.


Thanks! I didn’t know about measures of dispersion; they look like a very useful solution.

I read this some time ago but I didn’t make the connection to our case… thanks a lot!


This article by Stefan Gries discusses all kinds of dispersion measures. It might be interesting for your case (which is probably not only about Margherita pizza ;)).


Almost feels like time for a reading list topic :slight_smile: :books:


Probably I am slow, but what is the difference between option 3 and option 4?

I think they are both imprecise, but option 3 tells you something more about the popularity of the pizza in space or among producers, e.g. if a pizza starts spreading to restaurants in a new area of the city. Let’s say that 20 margheritas out of 100 total pizzas were sold in both 2018 and 2019. If they were sold in 4 restaurants in 2018 and in 5 in 2019, the popularity would increase from 0.8 to 1.

Option 4 is weighted more towards the potential reach. If in 2019 there is one more restaurant, all pizzas have more chances to be sold because more people can eat at the same time. With 20/100 margheritas sold in 5 rather than 4 restaurants, the popularity would decrease from 0.05 to 0.04, because an increase in potential sales did not correspond to an actual increase in sales, maybe because people started buying more pepperoni pizza.
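
For anyone who wants to check the arithmetic, here is a small Python sketch that computes all four options from the poll for the 2018 and 2019 figures used above. The function name is hypothetical, and I am reading Option 3 left to right, i.e. as (n_margherita / tot_pizza) * n_restaurants, which is what the numbers above assume.

```python
def popularity_options(n_margherita, tot_pizza, n_restaurants):
    """Return the four candidate popularity measures from the poll."""
    share = n_margherita / tot_pizza  # Option 2: share of all pizzas sold
    return {
        "option_1": n_margherita / n_restaurants,  # sales per restaurant
        "option_2": share,
        "option_3": share * n_restaurants,  # share scaled up by restaurant count
        "option_4": share / n_restaurants,  # share scaled down by restaurant count
    }


# Example figures from the discussion: same sales, one more restaurant in 2019.
y2018 = popularity_options(n_margherita=20, tot_pizza=100, n_restaurants=4)
y2019 = popularity_options(n_margherita=20, tot_pizza=100, n_restaurants=5)

for name in y2018:
    print(f"{name}: {y2018[name]:.2f} -> {y2019[name]:.2f}")
```

Running it shows Option 3 going up (0.80 to 1.00) and Option 4 going down (0.05 to 0.04) for exactly the same sales, while Option 2 does not move at all, so the choice mostly depends on whether an extra restaurant should count for or against the pizza.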

P.S. I hope food talk falls within the humanities :stuck_out_tongue:

It definitely does! Didn’t you write something about food, @melvin.wevers?


Indeed, discourse on soft drinks in newspapers. I applied burst detection to car advertisements to detect innovations and their stickiness in advertising discourse.
