DETECTION OF DEEPFAKE VIDEOS USING LONG DISTANCE ATTENTION
Abstract
With the rapid development of deepfake techniques in recent years, facial video forgery can produce highly deceptive video content and poses serious security threats, which makes the detection of such fake videos increasingly pressing and challenging. Most existing detection methods treat the problem as a vanilla binary classification task. In this work, it is instead approached as a special fine-grained classification problem, since the differences between real and fake faces can be extremely subtle. It is observed that most existing face forgery methods leave common artefacts in both the spatial domain, in the form of generative defects, and the temporal domain, in the form of inter-frame inconsistencies. Accordingly, a spatial-temporal model is proposed that consists of two components, one detecting global forgery traces in the spatial domain and the other in the temporal domain. Both components are built on a novel long-distance attention mechanism: the spatial component captures artefacts within a single frame, while the temporal component captures artefacts across consecutive frames. Both components produce patch-based attention maps. This attention mechanism provides a more global view, which helps to extract local statistical information and assemble global information more effectively. Finally, as in prior fine-grained classification methods, the attention maps guide the network to focus on pivotal regions of the face. Experimental results on several publicly available datasets show that the proposed method achieves state-of-the-art performance, and that the proposed long-distance attention mechanism can effectively capture the pivotal regions of face forgery.
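To make the idea of patch-based long-distance attention concrete, the following is a minimal sketch, assuming PyTorch, of how attention computed between all pairs of patches in a frame could produce a map that re-weights backbone features. The module name, the 1x1-convolution query/key projection, the patch size, and the pooling scheme are illustrative assumptions for exposition, not the authors' implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class LongDistanceAttention(nn.Module):
        # Illustrative sketch: every patch attends to every other patch,
        # regardless of spatial distance, yielding a patch-based attention map.
        def __init__(self, channels, patch=16):
            super().__init__()
            self.patch = patch
            self.to_qk = nn.Conv2d(channels, 2 * channels, kernel_size=1)

        def forward(self, feat):
            # feat: (B, C, H, W) feature map from a CNN backbone; H and W are
            # assumed divisible by the patch size.
            b, c, h, w = feat.shape
            p = self.patch
            q, k = self.to_qk(feat).chunk(2, dim=1)
            # Pool each p x p patch to one token, preserving local statistics
            # while allowing global (long-distance) interactions.
            q = F.avg_pool2d(q, p).flatten(2).transpose(1, 2)  # (B, N, C)
            k = F.avg_pool2d(k, p).flatten(2).transpose(1, 2)  # (B, N, C)
            attn = torch.softmax(q @ k.transpose(1, 2) / c ** 0.5, dim=-1)
            # Average incoming attention into one weight per patch, then expand
            # it back to an H x W map that re-weights the backbone features.
            weight = attn.mean(dim=1).view(b, 1, h // p, w // p)
            amap = F.interpolate(weight, size=(h, w), mode="nearest")
            return feat * amap

    # Example: re-weight a batch of 64-channel feature maps.
    feats = torch.randn(2, 64, 64, 64)
    out = LongDistanceAttention(channels=64)(feats)
    print(out.shape)  # torch.Size([2, 64, 64, 64])

The design mirrors the claim in the abstract: pooled patch tokens carry local statistical information, while the all-pairs attention assembles that information globally, so patches that betray forgery traces can be emphasized wherever they sit in the frame. A temporal counterpart would apply the same scheme across patches drawn from consecutive frames rather than a single frame.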