Logistic Regression is Still Alive and Effective: The 3rd YouTube 8M Challenge Solution of the IVUL-KAUST team

Abstract

In this report, we present our solution to the 3rd YouTube-8M Video Understanding Challenge, whose task is the temporal localization of topics within a video. Our team achieved 9th place on the Public Leaderboard and 11th place on the Private Leaderboard, with a difference of 4.5 × 10⁻⁴ from the 10th gold medal winner. Overall, we train 20 different models independently and use their ensemble to predict segment scores. We then combine these predictions with a video classifier to generate the final score for each segment. Depending on the model, we use a one-loss or a two-loss training strategy to make full use of both video-level and segment-level annotations. Furthermore, we adopt a teacher-student model and deep clustering to generate pseudo-labels, increasing the amount of fully annotated data.
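As a rough illustration of the ensembling step described above, the sketch below averages per-segment class scores over several independently trained models and then rescales them with video-level classifier scores. The abstract does not state the fusion rule or the ensemble weights, so the uniform default weights, the multiplicative fusion, and the function names `ensemble_segment_scores` and `fuse_with_video_scores` are assumptions made for illustration only, not the team's actual implementation.

```python
import numpy as np


def ensemble_segment_scores(model_scores, weights=None):
    """Weighted mean of segment scores over models.

    model_scores: array-like of shape (n_models, n_segments, n_classes).
    weights: optional per-model weights; uniform if omitted (an assumption).
    Returns an array of shape (n_segments, n_classes).
    """
    scores = np.asarray(model_scores, dtype=float)
    if weights is None:
        weights = np.ones(scores.shape[0])
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    # Contract the model axis: sum_m weights[m] * scores[m, :, :]
    return np.tensordot(weights, scores, axes=1)


def fuse_with_video_scores(segment_scores, video_scores):
    """Scale each segment's class scores by the video-level class scores.

    Multiplicative fusion is a hypothetical choice; the source only says
    that a video classifier is used alongside the segment ensemble.
    """
    video_scores = np.asarray(video_scores, dtype=float)
    return segment_scores * video_scores[None, :]


# Toy example: 2 models, 2 segments, 2 classes.
toy_model_scores = [
    [[0.2, 0.4], [0.6, 0.8]],
    [[0.4, 0.6], [0.8, 1.0]],
]
toy_video_scores = [1.0, 0.5]

averaged = ensemble_segment_scores(toy_model_scores)
final = fuse_with_video_scores(averaged, toy_video_scores)
```

Here `averaged` holds the ensemble mean for each segment and class, and `final` down-weights classes that the video-level classifier considers unlikely for the whole video.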

Publication
International Conference on Computer Vision