[Pic heavy] Inferring general characteristics from Touhou Popularity Poll

Drake

I'm making the following post as a proxy for the user tryourbreast, who currently cannot post it themselves because new users are required to pass a captcha and MotK is currently having issues with captchas.



i.e. because titling this "Yet another Touhou popularity poll analysis" would make people think it's about predictions again.



tl;dr version, quoting this post's end:

Quote
This post shows a few characteristics of the Touhou popularity poll:

1. Touhou popularity poll results follow an exponential distribution. For roughly every 10% of rank percentile, net votes halve
2. There is a clear boundary between god tier and top tier characters (which is kind of well-known), and an even clearer boundary between the middle tier and bottom tier (which is not)
3. People put higher-ranked characters on top (as top vote) more frequently than lower-ranked characters. This reverses for everything after the 3rd choice. The turning point is around 25th place
3.5. Because of the above, lower-ranked characters should get buffed quite a bit in the new 12th popularity poll, since the character slots have increased from 5 to 7
4. Top vote rates differ from character to character in terms of amount and stability, and are not correlated with stages (but later works do receive lower top vote rates until several years later), so perhaps people are being indecisive over this precious slot





Touhou popularity polls are kind of like the stock market or horse racing, in that there are two topics that keep coming up:

1. How did my favourite character perform? Did she get a rank increase, or decrease? Is it reasonable?
2. What's the prediction about the upcoming popularity poll results?

This goes so far overboard that even the word "analysis" is used to mean "prediction" most of the time, which gets people arguing over whether a given "analysis" is "objective" or not.


Yet, obviously, the poll is run not for the above reasons, but as a census to obtain the characteristics of the Touhou community. In this spirit, what we really should be doing is to extract interpretation-free characteristics from the poll results.
Somebody has already kindly made some graphs for us (1 2 3 4), but, just like votes, the more the better.


p.s. This is a revised and improved version of what I did 6 months ago when I analysed the results of the 11th popularity poll (and posted it in an obscure place), with more afterthought, perspectives and rigor.
The main polls featured are the thwiki Touhou popularity poll, and another one held by Niconico which filled the gap of 2014. They are relatively good (and interesting!) data.



0. Introduction

Before we can even start working, we have to make one important decision: what should the vote distribution look like? This is a profound question, because giving a convincing answer requires knowledge of the matter itself. Plotting any set of data is easy - you can also plot the thwiki Touhou news poll in minutes - but plotting it the right way is hard.
For virtually every graph out in the wild, everything is plotted on a linear scale, like this pretty graph. It's not exactly wrong: every data point is correct. It's just not a good way to present the data, because:
1. We know from the beginning that poll results don't follow a linear relationship, so using a linear scale blinds us. It's like using a telescope to read books when what you need is a pair of glasses;
2. It does not tell us how the votes should vary with character rank. Do they decay like a quadratic? Logarithmically? Exponentially? Or something more complex? Sure, we can stare at the graph looking for odd spots until the next poll (which is 12 months away), but that's not going to tell us what the distribution is supposed to look like, and hence won't help us find the weird spots that characterise this particular set of data.


The Touhou popularity vote belongs to a category of distributions called rank-size distributions. Generally speaking, rank-size distributions follow a power law, which means that if you plot them on a log-log scale they'll be linear. In some cases people plot log rank against linear size, and find that things become linear there instead.

(For those who aren't familiar with scales: a log-log scale means that when you multiply one side by, say, 2, the other side is also multiplied by a fixed ratio.
A log-linear scale, however, means that if you add a fixed number to the linear side, the log side is multiplied by a fixed ratio.)

Exponential, power and logarithmic relationships are the three simplest ones out there, so they're the most natural things to throw at an unknown set of data to see if anything sticks. Which is what I've done:


Plotting (4th to 11th poll) and fitting (green: 11th poll, blue: 5th poll) the previous poll results with: log-linear (exponential), log-log (power) and linear-log scale (logarithmic). Points in percentage and rank in percentiles.

Power law plot is obviously a no-go. Linear-log scale may look okay at first, until you realise that it does not generalise well to the earlier polls, especially with the tail.
(For reference, see the plots from Wikipedia's rank-size distribution page linked above. They fit very well even for the tail.)
Only the exponential plot actually does well all the way till the last few entries.

In essence, we found that log size and linear rank describe the data the best. This means that we should treat the data as an exponential decay of points and votes with respect to rank.
(p.s This is also how the three Chinese polls have been, so it's not specific to the Japanese.)
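
As a rough sketch of the comparison above, one can fit a straight line on each of the three scales and see which one explains the data best. The data below is synthetic (votes that halve every 10% of rank percentile), not real poll results, and `r_squared` and `compare_scales` are illustrative names of my own, not part of the original analysis.

```python
import math

def r_squared(xs, ys):
    """R^2 of a least-squares straight line of ys against xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy * sxy / (sxx * syy)

def compare_scales(rank_pct, vote_pct):
    """Return the straight-line R^2 on each of the three candidate scales."""
    log_rank = [math.log(r) for r in rank_pct]
    log_vote = [math.log(v) for v in vote_pct]
    return {
        "exponential": r_squared(rank_pct, log_vote),  # linear rank, log votes
        "power": r_squared(log_rank, log_vote),        # log rank, log votes
        "logarithmic": r_squared(log_rank, vote_pct),  # log rank, linear votes
    }

# Synthetic poll (assumption): 100 characters, votes halve every 10% of rank.
n = 100
rank_pct = [(i + 1) / n for i in range(n)]
raw = [2 ** (-r / 0.1) for r in rank_pct]
total = sum(raw)
vote_pct = [v / total for v in raw]

scores = compare_scales(rank_pct, vote_pct)
best = max(scores, key=scores.get)
print(best, scores)
```

With real poll data you would replace `rank_pct` and `vote_pct` with the normalised ranks and vote shares; the scale with the highest R^2 all the way into the tail is the one worth keeping.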


The observant may have noticed that both axes run from 0 to 1. This is for comparing results of different polls: absolute ranks and votes are meaningless across different polls. You may have had 3000 people voting previously but a whopping 100000 this time, so you have to normalise before you can compare them; and 10th out of 20 is not as impressive as 10th out of 100.
By default, I'll be using point/vote percentages and percentiles for everything, unless otherwise noted.
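
A minimal sketch of this normalisation, assuming the rank percentile is simply rank divided by the number of ranked characters and that ties are ignored. The vote counts are made up; `normalise` is an illustrative helper name.

```python
def normalise(votes):
    """Turn raw vote counts into (rank percentile, vote share) pairs, best first."""
    ordered = sorted(votes, reverse=True)
    total = sum(ordered)
    n = len(ordered)
    # Assumption: percentile = rank / number of characters, so it runs up to 1.
    return [((i + 1) / n, v / total) for i, v in enumerate(ordered)]

# Two hypothetical polls with very different turnouts but the same shape.
small_poll = normalise([1500, 900, 400, 200])         # 3000 votes in total
large_poll = normalise([50000, 30000, 12000, 8000])   # 100000 votes in total

print(small_poll[0], large_poll[0])

# 10th place means different things in polls of different sizes.
print(10 / 20, 10 / 100)
```

After normalisation the top entry of both polls sits at the same point, even though the raw counts differ by more than an order of magnitude.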

By the way, net votes instead of points will also be used unless otherwise noted, because points are a mixture of net votes and true votes (TV). Mixing them together makes the data no longer orthogonal, and separating the effects of the two parts is not easy.



To conclude:
1. Poll results follow an exponential distribution of net votes versus rank
2. Stop looking at absolute vote count and rank! Only the relative part matters.



1. Distribution characteristics

Reminder: We'll be looking at this from this point on:


With our wavelength set correctly, we can proceed to look for the obvious things that we can see from the distribution.
The first obvious observation is the common sudden drop at the 60% rank percentile:


Jumped the shark, I guess

In fact, all previous polls also show a drop from the beginning, then a bump at around the 30-60% rank percentile, until around this 60% spot where a sudden drop occurs, and everything becomes normal. With the linear scale we were using before, these two features are hard to see (the absolute difference is only significant for the first 20 places after all).


Things really get interesting when it's plotted in a different way:


Votes drift. Higher ranked characters are brighter. Also, you're not supposed to predict future ranks from there

(Note that I've also put the Niconico poll results in. This means that it's definitely not an accounting bug on the thwiki side, since the two polls operate independently.)


There are three important segments, which I've marked in red, orange and yellow:
The red region is the place where votes between different ranks are suddenly concentrated, and competition is fierce. In technical terms, it has a very high character density. This part accounts for the bending away from the mean distribution, i.e. the bump.
(Incidentally, Meiling has always been roughly the top of this messy part. So the fandom notion "Meiling wall" isn't baseless at all.)
The orange region only opened after the 8th poll, with Yorihime sitting in the middle. It's sitting somewhere between a genuine feature and a fake one, so I'll leave it at that;
The yellow region (at roughly the 60% percentile), however, is really intriguing. It's a massive gap where votes can drop by over 30% in just a few places. It's a big chasm which only a few have crossed: Tokiko, Star Sapphire, Gengetsu and Toyohime, and despite all of this, it has remained intact and in place. And see that one jiggling about in the middle? Of all things that could be there, it's Unzan.



To be more rigorous, though, it'd be good if we could actually compute the character density, instead of eyeballing and pointing at every minor detail that seems suspicious, like that gap just under the red region in the 11th poll. As the Wikipedia article on rank-size distribution mentions, such distributions are actually quantile functions. So we can sample the average density of points over exponentially spaced intervals (this comes from the previous conclusion!) across previous polls, take a moving average to smooth out noise, and get our density:


The reconstructed distribution.


Plotting on the original graph. Turn your head 90 degrees CW when reading the density, Seija said
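
One possible reading of the density computation described above, as a minimal sketch: count how many characters fall into bins that are equally spaced in log(vote share), then smooth the counts with a moving average. The bin count, window size and the synthetic vote shares are all assumptions of mine, as are the helper names `log_density` and `moving_average`.

```python
import math

def log_density(vote_shares, n_bins=10):
    """Count characters per bin, with bin edges equally spaced in log(vote share)."""
    logs = [math.log(v) for v in vote_shares]
    lo, hi = min(logs), max(logs)
    if hi == lo:
        raise ValueError("all vote shares are identical; no density to compute")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in logs:
        # The largest value lands exactly on the upper edge, so clamp it.
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts

def moving_average(values, window=3):
    """Centered moving average; the window shrinks at both ends of the list."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Synthetic poll (assumption): 100 characters, votes halve every 10 places.
shares = [2 ** (-(i + 1) / 10) for i in range(100)]
counts = log_density(shares, n_bins=10)
smoothed = moving_average(counts, window=3)
print(counts)
print(smoothed)
```

For a pure exponential decay the density comes out flat; bumps (crowded regions) and gaps (chasms) in real data would show up as peaks and dips in `smoothed`.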


So, where does the 60% percentile drop come from, and why did the drop move forward in the 11th poll? I suspect it's actually the boundary between Windows characters on one side, and (most of) the PC-98 characters and everything else on the other.
Basically, it's like this:

Every poll introduces new Windows characters. However, we also have official comics which have lots of non-human minor characters, who are introduced into the poll as well for the sake of completeness. This is how you get a youkai fortune teller whose rank even surpassed Unzan's.
Since ZUN has released roughly one new work every year since MoF, and the polls are held at the same pace, each time 6-8 characters are thrown above this 60% percentile. Meanwhile, two things happen on the lower side: first, new minor characters are introduced; and second, which was dominant in the earlier polls, bottom characters are included back into the ranking because they no longer get 0 votes, as a result of a massive increase in the number of voters. Back in the 4th/5th poll, quite a few PC-98 characters suffered from being left out of the final ranking.
The two sides roughly balance to give a 6:4 ratio at this boundary.
However, at the time of the 11th poll, LoLK was not ready yet, so there were no new Windows shmup characters for us; ULiL gave us only 1 character on top (Sumireko) while introducing all the urban legends, which became minor characters; meanwhile, the official comics (WaHH and FS) were contributing full force to the latter. As a result, the boundary became imbalanced, and shifted from 60% to 57%.



In conclusion, we established in this part that:
1. There is a clear boundary between god tier and top tier characters (which is well-known),
2. And an even clearer boundary between the middle tier and bottom tier (which isn't well-known).




2. True votes (TV) characteristics and choice ranks distribution

I'll be calling true votes TV subsequently. True vote rates (ratio between true votes and total votes for a particular character) are called TV rates.

To begin with, only TV rates are worth considering. Absolute numbers do not make sense (they are affected by total votes).

The overall average TV rate, of course, depends on how many characters you can choose. Even though some people only vote for their first choice and skip everything else, the actual ratio of total votes to voters has been around 4.8 to 4.9 throughout the years, so for most purposes assuming an average TV rate of 20% is good enough, leaving you with an absolute error of 0.4-0.8%, which is not a big deal.
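
The arithmetic behind that error estimate, as a small sketch. It reads the 4.8 to 4.9 figure as the average number of votes cast per voter and assumes every voter casts exactly one true vote, so the average TV rate is just the reciprocal; `average_tv_rate` is an illustrative name.

```python
def average_tv_rate(votes_per_voter):
    """Average TV rate, assuming each voter casts exactly one true vote."""
    return 1.0 / votes_per_voter

# Difference between the actual average rate and the convenient 20% figure.
errors = {ratio: average_tv_rate(ratio) - 0.20 for ratio in (4.8, 4.9)}

for ratio, err in errors.items():
    rate = average_tv_rate(ratio)
    print(f"{ratio} votes per voter: average TV rate {rate:.2%}, error vs 20%: {err:.2%}")
```

The two errors come out at roughly 0.8% and 0.4%, which is where the quoted range comes from.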

(By the way, normalising true votes by any other method is most likely wrong. If a character isn't voted for by someone, it's not going to appear as their first choice either.)


TV rates have no apparent trend. It's as if people are very indecisive about the 1st slot:

Plotted for all characters that placed between 1st and 100th in the 11th poll. Brighter color = higher rank.

Even the stability of TV rates varies from character to character. This would mean that TV rates are essentially unpredictable.



There are two ways one can model the voting process:
- Consider everyone's 1st choice only, then 2nd choice only, then 3rd choice only, etc...
- Consider one voter's choices, then another voter's choices, then yet another, etc...

The second one is way harder than the first because character choices are not independent - if you put Koishi as your 1st choice, chances are that you're also putting Satori somewhere close to her.
Modeling the poll this way requires knowledge of the relationships between pairs of characters, and (unfortunately) the official results only reveal the 6 most significant rates for this (and none in the past), so this is practically impossible.

The first one, however, is much easier, once you also assume that the choice ranks are immutable, so that they're independent of each other, and limiting the number of choices only cuts the list down to its first n items.
There are arguments against this notion (e.g. my true love is A, but if I can choose 7 I'll choose ABCDEFG, and since G is the newest one I'll put my TV on G), but I'd argue that these do not generalise, so in the end they'd be insignificant anyway.

Moreover, in actual Touhou popularity polls, the number of choices one can make is arbitrary. The rules may say that you can choose only 1 character, or they may be generous and let you choose up to 10. So modeling the voting process in the first way helps us cope with this change.
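
A toy sketch of the first model, under the immutability assumption stated above: every voter has a fixed ordered preference list, and a poll with n slots simply keeps the first n entries. The ballots are invented (letters stand for characters, as in the ABCDEFG example), and `tally_by_choice_rank` is an illustrative name.

```python
from collections import Counter

def tally_by_choice_rank(ballots, max_choices):
    """Return one Counter per choice rank (index 0 = 1st choice, i.e. TV)."""
    tallies = [Counter() for _ in range(max_choices)]
    for ballot in ballots:
        # Limiting the number of slots only truncates the preference list.
        for position, character in enumerate(ballot[:max_choices]):
            tallies[position][character] += 1
    return tallies

# Hypothetical preference lists; some voters fill fewer slots than allowed.
ballots = ["ABCDEFG", "BAC", "GABCDEF", "A"]

tallies5 = tally_by_choice_rank(ballots, 5)
tallies7 = tally_by_choice_rank(ballots, 7)

print(tallies5[0])   # first-choice (TV) tally
print(tallies7[5:])  # what the extra 6th and 7th slots add
```

Under this model, going from 5 to 7 slots leaves the first five tallies untouched and only appends two new ones, which is what makes the per-choice-rank view convenient when the rules change.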

In this regard, if there is anything I like about the Niconico poll in particular, it's that it separates the 2nd/3rd choice from 4th/5th choice.



I shall present TV rate ranks versus net vote ranks first, because there's something I've never seen anyone point out before:


11th vote: A main diagonal

This is shocking because it has a correlation coefficient of 0.58 (!). What this tells us is that most of the characters on top also get higher TV rates, despite a common belief that the two should be unrelated, or even anti-correlated, because the less popular characters are (supposedly) mostly voted for by people who really like them. (Well, maybe except Rinnosuke and Wriggle.)

Plotting with actual TV rates instead:


Quadratic fit. We're plotting from 1st to 100th place because TV rate errors are too big for characters with too few votes.
Correlation coefficient: -0.57

It's even clearer that the top characters are getting higher TV rates than others.
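
For reference, a minimal sketch of how the two kinds of correlation above can be computed: rank against rank (as in the first plot) and rank against the actual rate (as in the second). The six data points are made up purely for illustration, ties are not handled, and `pearson` and `to_ranks` are helper names of my own.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def to_ranks(values):
    """Rank 1 = largest value. Ties are not handled in this sketch."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks

# Hypothetical characters: net votes and TV rates (not real poll data).
net_votes = [9000, 7000, 5000, 3000, 2000, 1000]
tv_rates = [0.30, 0.22, 0.25, 0.15, 0.18, 0.12]

vote_rank = to_ranks(net_votes)
tv_rank = to_ranks(tv_rates)

rank_vs_rank = pearson(vote_rank, tv_rank)   # positive: a "main diagonal"
rank_vs_rate = pearson(vote_rank, tv_rates)  # negative: rate falls as rank number grows
print(rank_vs_rank, rank_vs_rate)
```

Note the sign flip between the two: a larger rank number means a worse placement, so "top characters get higher TV rates" shows up as a positive rank-rank correlation but a negative rank-rate one.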


What about choice ranks below 1st place? As I've mentioned, the Niconico poll gives us good data on this:


The correlations with rank are: 0.69 (1st), -0.40 (2nd/3rd), -0.45 (4th/5th)

This gives us another result: the rates only balance out at choice ranks around 2nd and 3rd (a bit closer to 3rd), while choice ranks below that behave the opposite way to TVs: top characters actually get fewer of them.
And we can safely assume that it only gets magnified as we go down the choice ranks.



In conclusion, we established in this part that:
1. TV rates, as a whole, are unlikely to have any big trends, except maybe for a few outliers.
2. However, people put higher-ranked characters on top more frequently than lower-ranked characters. This trend reverses after the 3rd choice rank

This has a profound implication for the increase of character slots in the new 12th poll: by opening up the 6th and 7th choice ranks, we're adding new choice rank distributions which benefit lower-ranked characters more than higher-ranked ones, and which also benefit outliers with insanely low TV rates, e.g. Kokoro (ranked 14th while having a TV rate as low as 10%).

Moreover, although this is just speculation, the turning point of TV rates at around 25th place may be related to the boundary between god tier and top tier discussed above. I can't think of a way to prove this with currently publicly available data, however.




3. Sorted poll results according to work and stage

From this point things get less interesting (i.e. more specific), so I'll skim over most of this part.


Net votes sorted by work and stage, 11th only.

It's not exactly news that SDM has particularly high total votes, while recent works get fewer votes the more recent they are.
SA is high up there because, well, Koishi and Satori.

Though, I should mention that stage actually has relevance to net vote counts.
(EX/PH > 5,6 > 3 > 4 > 2 > 1)
Actual net votes for stage 4 should be higher because there are 3 Prismriver sisters and 2 Tsukumo sisters, who all performed quite poorly.




Rank sorted by work and stage, 11th only.

It's also not big news that SDM has a much higher average rank than every other work.
As for variance, SA has the largest, then PCB.




Net votes time series, sorted by work and stage. Darker dots are more recent. Accounted for deflation.

Net vote percentages deflate over time as more characters are thrown into the killing field, where they compete for critical resources, i.e. votes. To account for this deflation, the percentages have to be normalised so that the total net vote percentage of the works that appeared in an earlier poll is the same as in this poll.
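
A sketch of one way to read this deflation correction: take the characters that appear in both the older poll and the current one, and rescale the older poll's shares so that this common group sums to the same total as it does now. The character names and shares are placeholders, and `deflate` is an illustrative name, not the author's actual procedure.

```python
def deflate(old_shares, new_shares):
    """Rescale old_shares so characters common to both polls keep the same total share."""
    common = [name for name in old_shares if name in new_shares]
    old_total = sum(old_shares[name] for name in common)
    new_total = sum(new_shares[name] for name in common)
    factor = new_total / old_total
    return {name: share * factor for name, share in old_shares.items()}

# Hypothetical shares: character D only exists in the newer poll.
old_poll = {"A": 0.50, "B": 0.30, "C": 0.20}
new_poll = {"A": 0.40, "B": 0.25, "C": 0.15, "D": 0.20}

adjusted = deflate(old_poll, new_poll)
print(adjusted)
```

After the rescaling, A, B and C together take up the same 80% in both polls, so any remaining movement of an individual character is relative to its peers rather than an effect of newcomers diluting everyone.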




TV rates sorted by work and stage, 11th only.

It's not exactly unknown that overall TV rates for recent works are lower.
Apart from outliers (Wriggle), I guess the only interesting thing is how the TV rates for UFO are so close together.

Also, TV rates are unaffected by stages.



TV rates time series, sorted by work and stage. Darker dots are more recent

It's, again, not exactly unknown that overall TV rates for recent works rise back as time goes on, but it might be useful to somebody, who knows? Even things like this are actually very valuable now.

Also, TV rates are unaffected by stages.



I don't think there are conclusions to be drawn there, except we verified that:
1. Overall TV rates for recent works are lower, but rise back as time goes on
2. Different stage bosses have different average net votes. In general, everything from stage 5 up is privileged



4. Misc

Trivia stuff, which might be worth a one-liner mention.

- Correlation coefficient of TV rates and stage for all Windows characters is, surprisingly, 0.03. You'd expect that only to happen by stage and work (for obvious reasons).

- TD has crazily high female vote percentages, except for Kyouko. Every other work is quite close to the average.

- I've looked at the Chinese popularity polls and the Reddit /r/touhou popularity polls too. I didn't talk about them because both polls' voter counts are too small to infer useful things from. As a rule of thumb, if you have a lot of characters with fewer than 20 votes or so, the poor resolution and random fluctuation in people's minds will render the calculation useless. A voter count of 50000 should be good enough.




5. Summary

This post shows a few characteristics of the Touhou popularity poll:

1. Touhou popularity poll results follow an exponential distribution. For roughly every 10% of rank percentile, net votes halve
2. There is a clear boundary between god tier and top tier characters (which is kind of well-known), and an even clearer boundary between the middle tier and bottom tier (which is not)
3. People put higher-ranked characters on top (as top vote) more frequently than lower-ranked characters. This reverses for everything after the 3rd choice. The turning point is around 25th place
3.5. Because of the above, lower-ranked characters should get buffed quite a bit in the new 12th popularity poll, since the character slots have increased from 5 to 7
4. Top vote rates differ from character to character in terms of amount and stability, and are not correlated with stages (but later works do receive lower top vote rates until several years later), so perhaps people are being indecisive over this precious slot



And that's all, have fun waiting for/predicting the results!




Lollipop

that's a lot of info :V (but it is useful)