ANTGPT: CAN LARGE LANGUAGE MODELS HELP LONG-TERM ACTION ANTICIPATION FROM VIDEOS?

1. long-term action anticipation

predict the actor's future behaviour from (make-into-verb-noun-seq (observe-video))
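
The composition above can be read as a two-stage pipeline: turn the observed video into a sequence of verb-noun pairs, then predict the future sequence from it. The following is only a minimal sketch of that interface. The names `Action`, `make_into_verb_noun_seq`, `anticipate`, and `repeat_last` are hypothetical, and the per-segment labels are assumed to be already recognized (the recognition model itself is not shown).

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Action:
    """One action label, e.g. Action("cut", "tomato")."""
    verb: str
    noun: str


def make_into_verb_noun_seq(segment_labels: Sequence[Tuple[str, str]]) -> List[Action]:
    # Assumption: each observed video segment already has a (verb, noun) label.
    return [Action(verb, noun) for verb, noun in segment_labels]


# A predictor maps (observed actions, horizon) to a list of future actions.
Predictor = Callable[[List[Action], int], List[Action]]


def anticipate(observed: List[Action], predictor: Predictor, horizon: int) -> List[Action]:
    # Long-term anticipation: ask for `horizon` future actions, not just one.
    return predictor(observed, horizon)


def repeat_last(observed: List[Action], horizon: int) -> List[Action]:
    # Placeholder baseline (not from the paper): repeat the last observed action.
    return [observed[-1]] * horizon if observed else []
```

Any real predictor, top-down or bottom-up, would take the place of `repeat_last` behind the same signature.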

2. approach

2.1. top-down

infer the actor's goal, then plan the next action toward it
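
A toy sketch of the top-down idea, under strong assumptions: goals are looked up in a small hand-written table of action scripts (`GOAL_SCRIPTS`, hypothetical), the goal is whichever script overlaps most with the observed actions, and the plan is the first script step not yet observed. This is an illustration of "infer goal, then plan", not the method used in the paper.

```python
from typing import Dict, List, Optional, Tuple

Step = Tuple[str, str]  # (verb, noun)

# Hypothetical goal scripts, invented for illustration only.
GOAL_SCRIPTS: Dict[str, List[Step]] = {
    "make tea": [("take", "kettle"), ("fill", "kettle"),
                 ("boil", "water"), ("pour", "water")],
    "make salad": [("take", "knife"), ("cut", "tomato"), ("mix", "salad")],
}


def infer_goal(observed: List[Step]) -> str:
    # Pick the goal whose script shares the most steps with the observation.
    seen = set(observed)
    return max(GOAL_SCRIPTS, key=lambda goal: len(seen & set(GOAL_SCRIPTS[goal])))


def plan_next(observed: List[Step], goal: str) -> Optional[Step]:
    # The next action is the first scripted step that has not been observed yet.
    for step in GOAL_SCRIPTS[goal]:
        if step not in observed:
            return step
    return None  # script already completed
```

For example, after observing `("take", "kettle")` and `("fill", "kettle")`, the inferred goal is "make tea" and the planned next step is `("boil", "water")`.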

2.2. bottom-up

predict the next action autoregressively by modeling temporal dynamics
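
A minimal sketch of autoregressive prediction, assuming the simplest possible temporal model: a bigram table of "which action most often follows which". Each predicted action is appended to the history and used to predict the one after it. The functions `fit_bigram` and `rollout` are hypothetical names, and the bigram model is a stand-in chosen for brevity, not the model from the paper.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, List, Sequence


def fit_bigram(sequences: Sequence[Sequence[str]]) -> DefaultDict[str, Counter]:
    # Count how often each action is followed by each other action.
    table: DefaultDict[str, Counter] = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            table[prev][nxt] += 1
    return table


def rollout(table: DefaultDict[str, Counter], observed: Sequence[str], horizon: int) -> List[str]:
    # Autoregressive loop: each prediction is fed back in as the newest history item.
    history = list(observed)
    future: List[str] = []
    for _ in range(horizon):
        if not history:
            break
        successors = table.get(history[-1])
        if not successors:
            break  # no training evidence for what follows this action
        nxt = successors.most_common(1)[0][0]
        history.append(nxt)
        future.append(nxt)
    return future
```

Because each step conditions only on what came before, errors made early in the rollout propagate to later predictions, which is the usual trade-off of the bottom-up approach.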

2.2.1. auto

Author: Linfeng He

Created: 2024-04-03 Wed 19:37