While I agree with using a team system, I suspect that a point evaluation system that doesn't relate to an opponent actually prevents less skilled players from making as large a contribution, and restricts the choice of categories more than playing against others does.
For example, if survival is scored the same across difficulties, obviously there'd be no incentive to play Lunatic survival since you could play Easy instead. Therefore Lunatic survival would need to be worth much more than Easy. Aside from balancing issues (what would an Easy perfect correspond to? A Hard 1cc?), that means the Easy survival players wouldn't be able to contribute much to their team, since their runs would be worth so little in comparison.
Meanwhile, if players are ranked based on how well they do against their opponent(s), the difficulty gap wouldn't matter-- since the Lunatic survival player wouldn't be competing against the Easy player, both runs can be given the same value.
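To make the difference concrete, here's a rough sketch in Python. The point values and the names (`ABSOLUTE_POINTS`, `absolute_team_score`, `matchup_team_score`) are made up purely for illustration; this isn't a proposed rule set, just the two ideas side by side.

```python
# Hypothetical per-difficulty values for an absolute point system.
# These numbers are invented for illustration only.
ABSOLUTE_POINTS = {"Easy": 10, "Lunatic": 100}


def absolute_team_score(runs):
    """Sum the fixed per-difficulty points for each cleared run."""
    return sum(ABSOLUTE_POINTS[difficulty] for difficulty in runs)


def matchup_team_score(results):
    """Award one point per match-up won, whatever the category was."""
    return sum(1 for won in results if won)


# Absolute system: the Easy clear is worth a small fraction of the team total.
print(absolute_team_score(["Easy", "Lunatic"]))

# Match-up system: each player faced an opponent in their own category,
# so an Easy win and a Lunatic win count the same.
print(matchup_team_score([True, True]))
```

Under these assumed values, the Easy player supplies 10 of the team's 110 points in the absolute system, but half of the team's wins in the match-up system.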
Additionally, I don't think an absolute point system is reasonable from an organization standpoint. It'd force every possible category to be evaluated from the start of the contest, instead of only the categories that people actually compete in. This is especially relevant for scoring, where you can't have a single formula that applies across all games. There are plenty of scoring categories where % of WR wouldn't be a good starting point-- for example, in LLS Extra and HRtP almost every scoring-oriented run is at 90% of WR or higher, and in many Extra stages survival (spell bonuses and remaining life count) makes up a significant proportion of your score before you even include scoring techniques.
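As a quick illustration of why % of WR behaves so differently between categories, here's a small sketch. All the scores below are invented placeholders, not real records, and `percent_of_wr` is just a hypothetical helper.

```python
def percent_of_wr(score, world_record):
    """Return a score as a percentage of the world record."""
    return 100.0 * score / world_record


# Invented numbers: one category where plain survival already yields most of
# the score, and one where scoring techniques make most of the difference.
categories = {
    "survival-heavy": {"wr": 1_000_000, "casual_clear": 900_000, "strong_run": 970_000},
    "technique-heavy": {"wr": 1_000_000, "casual_clear": 300_000, "strong_run": 800_000},
}

for name, cat in categories.items():
    casual = percent_of_wr(cat["casual_clear"], cat["wr"])
    strong = percent_of_wr(cat["strong_run"], cat["wr"])
    gap = strong - casual
    print(f"{name}: casual {casual:.0f}%, strong {strong:.0f}%, gap {gap:.0f} points")
```

With these made-up figures, a casual clear and a strong run are only a few percentage points apart in the first category but far apart in the second, so a single % of WR formula would reward the same amount of effort very differently.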
So either every possible category would need to be given its own balancing formula, which I don't think is practical, or only a limited selection of games or categories could be chosen, which seems to miss the point of allowing free selection.
Even if that could be balanced, it forces everyone to play their specialty if they have one, with no option of improving at something they're less experienced at or even trying something new (against someone else of similar general skill who's also trying it for the first time). In other words, it creates the "wrong" categories you seem to be trying to avoid. For example, if I were participating in this potential contest where you can choose any category but they're all ranked together, I'd have no reason to submit to anything other than DDC Easy (or SoEW, which I don't enjoy), since I can score at/near WR level there, so choosing anything else would cost my team points. In contrast, in a match-up version, I'd specifically be unable to choose DDC Easy scoring since there'd be no one to play against, so I'd need to improve at a different category instead. (It'd be unlikely I'd participate if I'm busy organizing it, but this is for the sake of an example.) I don't think the specialized players need an incentive to play the categories they're already focused on.
As a person who isn't close to the WR in any category, I guess I've never really felt that beating a score set by a fellow player is really that different to beating a score set by yourself.
If you're only aiming for self-improvement, why participate in a contest? While I primarily aim for simply setting PBs, the few times I've had a chance to play against someone else have provided a nice incentive, especially when multiple players are submitting back-and-forth replays over a period of time.
And evaluating submissions based on the player's improvement, while it seems nice, is an extremely abusable concept-- people could break the system by choosing a category they've never played before.
Thanks for the ideas though.
On a different note, I've been wondering about a system with more than two teams. But that might run into the same balancing issues, since each team wouldn't play the exact same set of categories. Depends on how many people sign up, I guess-- if the teams are large enough it might average out.