This paper offers a plausible syntactic structure for the complement of the perception verb 'see' in Korean (po-) and English. The core proposal is that the perception verb complement is an incomplete clause, lower than TP and higher than vP, which is syntactically realized as an Event Phrase (EventP), and that this EventP contains a Voice Phrase. It is further suggested that the EventP involves an event operator, which is controlled by the event argument assigned by the matrix perception verb; this accounts for the fact that the event time is simultaneous with the perception time. It is shown that these proposals account for a range of syntactic and semantic properties of perception verb complements.