Given that sound travels at about 300 m/s and that the arena might be 150 m deep, the basic laws of physics say that by the time the sound reaches the back of the arena it's going to be about half a second behind the video. (Half a second is a pretty noticeable delay.) So my guess is that the video was deliberately lagged by that fraction of a second so that, for the people at the back of the hall (who needed the video), the two would be in sync.
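For anyone who wants to check the arithmetic, here is a rough sketch of it in Python. The function name `acoustic_delay_s` is made up for illustration, and the 150 m depth is only my guess at the arena size. The second call uses 343 m/s, the usual textbook figure for sound in air at about 20 °C, to show that the rounder 300 m/s doesn't change the conclusion much.

```python
def acoustic_delay_s(distance_m: float, speed_of_sound_m_s: float = 300.0) -> float:
    """Return how many seconds sound takes to travel distance_m.

    The default of 300 m/s is the rough figure used in the text above;
    it is an approximation, not a measured value.
    """
    if speed_of_sound_m_s <= 0:
        raise ValueError("speed of sound must be positive")
    return distance_m / speed_of_sound_m_s


# Guessed arena depth of 150 m with the rough 300 m/s figure.
rough = acoustic_delay_s(150.0)

# Same distance with the textbook 343 m/s (air at about 20 degrees C).
textbook = acoustic_delay_s(150.0, 343.0)

print(f"rough estimate:    {rough:.2f} s")
print(f"textbook estimate: {textbook:.2f} s")
```

Either way the delay at the back comes out somewhere between 0.4 and 0.5 seconds, so lagging the video by about that much would line it up for the back rows.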
I've never really thought about this before, but it's pretty obvious when you do.