Title: Referring to Objects in Videos Using Spatio-Temporal Identifying Descriptions
Authors: Peratham Wiriyathammabhum; Abhinav Shrivastava; Vlad Morariu; Larry Davis
Date issued: 2019-06
Type: text
Genre: conference publication
Host title: Proceedings of the Second Workshop on Shortcomings in Vision and Language
Editors: Raffaella Bernardi; Raquel Fernandez; Spandana Gella; Kushal Kafle; Christopher Kanan; Stefan Lee; Moin Nabi
Publisher: Association for Computational Linguistics
Place: Minneapolis, Minnesota
Citekey: wiriyathammabhum-etal-2019-referring
DOI: 10.18653/v1/W19-1802
URL: https://aclanthology.org/W19-1802/
Pages: 14-25