Recognizing Human-Object Interactions in Videos