Possible explanation!
This is PROBABLY from the iTunes video, which has 5.1 surround sound. The encode probably got messed up so that only the center channel is heard (the music comes from the L and R speakers).
If this guess is correct, then anyone with the iTunes videos can hear vocals-and-sounds-only or instrumental versions of the songs (and even the entire episodes) just by disabling certain audio channels (if that's even possible; I know nothing about surround sound)!
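For anyone who wants to try the channel-disabling idea, here is a rough sketch of what it could look like in Python. It assumes the audio has already been decoded to interleaved 16-bit PCM with the usual 6-channel WAV order (front left, front right, center, LFE, surround left, surround right); other containers may order the channels differently. The function name `mute_channels` and the sample values are made up for illustration.

```python
from array import array

# Assumed channel order for a 6-channel (5.1) WAV. This is an assumption about
# the decoded audio, not something confirmed for the iTunes files.
CHANNELS_5_1 = ("FL", "FR", "FC", "LFE", "SL", "SR")


def mute_channels(samples, muted, layout=CHANNELS_5_1):
    """Return a copy of interleaved PCM samples with the named channels zeroed."""
    n = len(layout)
    if len(samples) % n != 0:
        raise ValueError("sample count is not a multiple of the channel count")
    muted_idx = {layout.index(name) for name in muted}
    out = array(samples.typecode, samples)
    for i in range(len(out)):
        # Sample i belongs to channel i % n because the data is interleaved.
        if i % n in muted_idx:
            out[i] = 0
    return out


# Two frames of made-up samples, interleaved FL, FR, FC, LFE, SL, SR.
frames = array("h", [10, 20, 30, 40, 50, 60, 11, 21, 31, 41, 51, 61])

# Keep only the center channel (where the vocals would be, if the guess holds).
vocals_only = mute_channels(frames, {"FL", "FR", "LFE", "SL", "SR"})

# Drop only the center channel to get an instrumental-style mix.
instrumental = mute_channels(frames, {"FC"})
```

If the guess about the mix is right, `vocals_only` would hold the vocals and sound effects, and `instrumental` would hold everything else. How cleanly that splits depends on how each song was mixed.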
EDIT (update): Two different YouTube users [url=http://www.youtube.com/watch?v=tbco1kFU3F0]have[/url] uploaded the musicless song, and this user has been making karaoke, musicless, and instrumental versions of other songs (they aren't all perfectly splittable, though).
Edit 2: Updated first link.