Deepfake Fraud in Banking and Finance: Exploiting AI for Identity Theft

By: C S Nakul Kalyan; Asia University

Abstract

This study examines the emerging threat of deepfake technologies in the financial and banking sector, where Artificial Intelligence (AI) is increasingly misused to commit fraud and identity theft. Realistic manipulations are created with existing techniques such as WaveNet (for voice cloning) and StyleGAN (for face re-creation). CNN- and RNN-based detectors, which look for abnormal patterns in audio-visual data, achieve a detection accuracy of more than 95%. To increase security, measures such as real-time detection and improved regulatory frameworks are considered. The proposed GAN-based detection framework addresses challenges such as evolving deepfake generation and data scarcity in order to secure banking and financial systems.

Keywords

Deepfake detection, financial fraud, identity theft, GAN, anomaly detection, and Artificial Intelligence (AI).

Introduction

The development of Artificial Intelligence (AI) has made digital payments and services in the financial and banking sectors easier, as highlighted in existing research [1][4]. At the same time, these rapid improvements pose a critical threat: deepfake technologies have become an efficient tool for identity theft and financial crime [2][7]. They enable scammers to produce realistic audio and video manipulations that imitate real people in order to deceive banks, customers, and senior officials [6]. In one notable incident, a Hong Kong-based financial company lost about $25 million to a deepfake-enabled scam [1]. This article proposes a GAN-based detection framework that incorporates advanced machine learning algorithms to secure the finance and banking sectors, and it provides a defense mechanism against AI-driven fraud that builds on existing frameworks while addressing their limitations.

Proposed Methodology

This section describes the methodology used to counter threats in the finance and banking sector, with a focus on AI-enabled identity theft. Recent studies cover data collection, deepfake production and generation frameworks, security measures, and evaluation protocols [3][6]. The proposed framework combines Generative Adversarial Networks (GANs), Machine Learning (ML), and advanced forensic analysis tools to detect manipulations in banking and financial fraud scenarios. The proposed method includes:

Data Collection and Preparation

To replicate real-world deepfake fraud scenarios, data has been collected from multiple sources, such as those used in [3][5]. The dataset contains more than 10,000 samples, split equally into real and fake cases, as described in the following segments.

Real-World Financial Images and Videos

More than 5,000 data samples have been collected from sources such as Google Open Images, Kaggle financial transaction datasets, and real and simulated scenarios from payment and banking platforms such as Alipay [3]. The data consists of facial images from online payment verifications, covering ages from 6 to 75 years, a gender ratio of approximately 1:1, and multiple ethnicities (including Asian, Caucasian, African, and Hispanic), at an image resolution of 1024 × 1024 pixels.

Synthetic Deepfake Samples

Apart from the real-world samples, an additional 5,000+ synthetic samples have been generated using advanced GAN frameworks such as StyleGAN and deepfake tools [6]. The deepfake scenarios closely imitate real-world identity theft in the financial sector, such as manipulated face swaps in video calls, fake images, and forged authorization reports [5]. The first step in preparing the data is web scraping, followed by preprocessing steps such as standardization and normalization. The images are converted into tensors for model input, and unstructured data is structured using regular expressions. Class imbalance, a frequent issue in fraud detection, is addressed by rebalancing the proportions of fraudulent and legitimate transactions seen during training [3]. Ethical considerations have been addressed by aligning with GDPR: no personal data or real identities have been used [2][4]. Descriptive statistics for the real and deepfake payment images are shown in Table 1.

Table 1: Descriptive Statistics

| Category | Real Payment Images | Deepfake Images |
|---|---|---|
| Number of Samples | 5,000 | 5,000 |
| Age Range | 6 – 75 years | 6 – 75 years |
| Gender Ratio | Near 1:1 | Near 1:1 |
| Ethnicity | Asian, Caucasian, African, Hispanic | Same as real images |
| Resolution | 1024×1024 pixels | 1024×1024 pixels |
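The preprocessing steps named in this subsection (normalization, standardization, and compensating for class imbalance) can be sketched in plain Python. This is a minimal illustration and not the pipeline used in the study: the function names, the [0, 255] pixel range, the inverse-frequency weighting scheme, and the 8:2 toy label list are all assumptions introduced here.

```python
from collections import Counter

def normalize_pixels(pixels, max_value=255.0):
    """Scale raw pixel intensities from [0, max_value] to [0.0, 1.0]."""
    return [p / max_value for p in pixels]

def standardize(values):
    """Shift and scale values to zero mean and unit variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = var ** 0.5 or 1.0  # fall back to 1.0 for constant inputs
    return [(v - mean) / std for v in values]

def inverse_frequency_weights(labels):
    """Per-class weights n / (k * count), so rarer classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}

# Hypothetical imbalanced label list: 8 real samples, 2 fake samples.
weights = inverse_frequency_weights(["real"] * 8 + ["fake"] * 2)
```

Class weights of this kind are one common alternative to resampling; which rebalancing method the study applied is not specified beyond the description above.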

Deepfake Generation Techniques

Deepfake content is generated through audio and video synthesis in order to replicate real-world identity theft in banking and financial applications (such as impersonating authorized executives in video conferences to approve financial transfers). The generated content is used mainly to train the detection framework so that it predicts real-world banking fraud more accurately [1][6].

Audio Generation

To generate high-quality audio deepfakes, methods such as WaveNet and Parallel WaveNet have been employed. Voice conversion techniques based on Dynamic Time Warping (DTW) and Gaussian Mixture Models (GMMs) have been used to transfer voice and speech characteristics, including mel-frequency cepstral coefficients (MFCCs). Tools such as Descript and Resemble AI have been used to perform real-time voice replication from small samples (about 5-10 seconds of audio), which makes it possible to reproduce fraudulent activities such as financial scam calls [7].
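As background on DTW, the sketch below computes the classic dynamic time warping cost between two sequences that may differ in tempo. It is a textbook formulation over scalar features chosen for brevity; real systems align multi-dimensional frames such as MFCC vectors, and the function name and absolute-difference cost are assumptions of this example rather than details from the source.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats a frame
                cost[i][j - 1],      # b advances, a repeats a frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Two sequences that differ only by a repeated frame, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, align at zero cost, which is why DTW tolerates differences in speaking rate.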

Video and Image Generation

GAN-based models such as StyleGAN and CycleGAN have been used to perform facial reenactment and face swapping [3][6]. Long Short-Term Memory (LSTM) networks and other Recurrent Neural Networks (RNNs) have been used to reduce temporal inconsistencies across frames, which makes lip synchronization and head movements smooth. With these advancements, deepfakes are generated to commit financial fraud, including the creation and misuse of celebrity deepfakes [4][5]. When the generator produces sufficiently advanced deepfakes, they can bypass the discriminator-style security checks used in banking systems, so that the scammer's false identity is verified and the fraud goes through.

Deepfake Detection Methods

The detection framework combines machine learning and deep learning and focuses on identifying manipulations in the financial and banking sectors [3][7]. A hybrid GAN-based detection framework has been developed using the dataset described above, which is labeled with real and fraudulent classes. Figure 1 presents the architectural overview of the proposed GAN-based detection system, where the Generator G(Z) creates synthetic samples from random noise Z, and the Discriminator D(X) distinguishes between real and fake content through adversarial training.

Figure 1: GAN-Based Detection Framework
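For reference, adversarial training between the Generator G(Z) and the Discriminator D(X) is conventionally written as the standard GAN minimax objective. The source does not state the exact loss it uses, so the textbook form below is an assumption about this framework; the symbols $p_{\text{data}}$ and $p_{Z}$ (the data and noise distributions) are introduced here for illustration.

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{X \sim p_{\text{data}}}\left[\log D(X)\right]
  + \mathbb{E}_{Z \sim p_{Z}}\left[\log\left(1 - D(G(Z))\right)\right]
```

The discriminator is rewarded for assigning high probability to real samples and low probability to generated ones, while the generator is rewarded for making D(G(Z)) large.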

Audio Detection

To detect audio manipulations, signal processing and machine learning techniques have been used to examine audio pitch, timing, and spectrograms [6][7]. Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have been used to detect inconsistencies in the frequency content of the audio signal (mel-spectrograms). These techniques are intended to catch the small variations and glitches associated with scams such as voice cloning, phone number mismatches, and other financial and banking fraud.
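A mel-spectrogram groups frequency bins into bands that are evenly spaced on the mel scale. The sketch below shows that spacing using the widely used HTK-style conversion formula; the function names and the 0 to 8000 Hz range are assumptions made for this example, and the source does not say which mel variant or band count it uses.

```python
import math

def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(f_min, f_max, n_bands):
    """Edges (in Hz) of n_bands filters spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]

# Hypothetical setup: 10 bands between 0 Hz and 8 kHz.
edges = mel_band_edges(0.0, 8000.0, 10)
```

The resulting edges are dense at low frequencies and sparse at high ones, which roughly mirrors how pitch is perceived.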

Video and Image Detection

To detect visual manipulations, Support Vector Machines (SVMs) and Random Forests have been used to classify extracted characteristics (such as optical flow and image inconsistencies) [6]. To detect spatial-temporal manipulations (such as lip-sync and eye-blinking irregularities), deep learning models such as CNNs, RNNs, CNN-RNN hybrids, and Temporal Graph Convolutional Networks (T-GCN) have been used [3]. Pre-trained networks such as ResNet have been used to differentiate real from deepfake content and to extract features for the encoder-decoder in the main GAN framework. Figure 2 demonstrates the encoder-decoder mechanism employed in the detection framework, where original facial images are compressed into latent representations through the encoders (Encoder-1, Encoder-2) and subsequently reconstructed through the corresponding decoders (Decoder-1, Decoder-2).

Figure 2: Encoder-Decoder Architecture

Integrated Fraud Detection

Audio and visual signals have been integrated into a single model, and graph-based anomaly detection (GAD) has been used to detect manipulations in transaction networks [3]. The system uses a threshold of 0.18 as the minimum level of manipulation observed on online payment platforms at which activity is flagged as suspicious and potentially linked to major illegal activity. Training has been carried out through adversarial optimization, with backpropagation used to improve the accuracy of the discriminator. The hyperparameters, a learning rate of 0.001 and a batch size of 32, have been tuned using grid search.
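The grid search and the thresholding step can be sketched as follows. This is a hedged illustration only: `toy_validation_score` is a hypothetical stand-in for training and validating a model, deliberately constructed to peak at the reported values (learning rate 0.001, batch size 32), and the candidate grid and the `>=` comparison in `is_suspicious` are assumptions. Only the 0.18 threshold and the two tuned values come from the text.

```python
from itertools import product

def grid_search(score_fn, grid):
    """Return the best (params, score) over every combination in grid."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_validation_score(params):
    """Hypothetical score; a real run would train and validate a model."""
    lr_penalty = abs(params["learning_rate"] - 0.001)
    batch_penalty = abs(params["batch_size"] - 32) / 1000
    return -(lr_penalty + batch_penalty)

# Assumed candidate grid; the source does not list the values searched.
grid = {"learning_rate": [0.01, 0.001, 0.0001], "batch_size": [16, 32, 64]}
best_params, best_score = grid_search(toy_validation_score, grid)

ANOMALY_THRESHOLD = 0.18  # value reported in the text

def is_suspicious(anomaly_score, threshold=ANOMALY_THRESHOLD):
    """Flag activity whose anomaly score reaches the threshold."""
    return anomaly_score >= threshold
```

Grid search evaluates every combination, so its cost grows multiplicatively with each added hyperparameter; with the 3 × 3 grid above it runs nine evaluations.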

Mitigation Strategies

Drawing on regulatory and ethical debates, the deepfake prevention methods are listed below [2][4][6]:

The comprehensive framework addresses multiple challenges through targeted mitigation strategies, as detailed in Table 2. These strategies are aligned with industry best practices and regulatory requirements.

Table 2: Deepfake Detection Challenges and Mitigation Strategies

Note: The impact measurements represent improvements over baseline models without the specified mitigation strategies. All strategies were implemented in the proposed framework to achieve the final >95% detection accuracy.

Real-Time Interdiction

This mitigation strategy embeds the detection framework in banking APIs so that a process is blocked immediately when it is detected as fraudulent, using tools such as SoftVC VITS for cross-checking and verification during video conferences and audio calls [6].
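How a detector's output might gate a banking API call can be sketched as below. The `Decision` type, the `interdict` function, and the 0.5 blocking threshold are hypothetical; they are not taken from the source or from any real banking API.

```python
from dataclasses import dataclass

@dataclass
class Decision:
    allowed: bool
    reason: str

def interdict(transaction_id, fake_probability, block_threshold=0.5):
    """Block a request when the detector's fake probability is too high."""
    if not 0.0 <= fake_probability <= 1.0:
        raise ValueError("fake_probability must be in [0, 1]")
    if fake_probability >= block_threshold:
        return Decision(False, f"{transaction_id}: blocked, suspected deepfake")
    return Decision(True, f"{transaction_id}: passed media check")

# Hypothetical usage with made-up detector scores.
blocked = interdict("tx-1", 0.9)
passed = interdict("tx-2", 0.1)
```

In a deployed system the threshold would trade off blocked fraud against falsely rejected customers, so it would need to be set from validation data rather than fixed in code.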

Regulatory Frameworks

This strategy incorporates a forensic analysis framework (such as using blockchains for transaction auditing) to mitigate data shortage issues and to prioritize transparency and accountability [1][4]. Multi-stakeholder collaboration between the financial industry and regulators has been simulated to improve the model's detection accuracy against evolving scams such as SIM-swapping attacks [2]. Figure 3 illustrates the complete attack cycle of a SIM-swapping fraud scenario, in which attackers manipulate service providers to gain unauthorized access to a victim's financial accounts.

Figure 3: SIM Swap Attack Process: Illustrating the sequential steps of identity theft through SIM card manipulation in financial fraud scenarios

Ethical and Privacy Enhancements

To avoid bias across social groups, the models have been integrated with separate fairness specifications, which include privacy-preserving methods such as federated learning (FL) for handling sensitive data [5].
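The core aggregation step of federated learning can be illustrated with federated averaging (FedAvg), in which only model parameters, not raw customer data, leave each participant. The sketch below is a minimal illustration; the flat list-of-floats parameter format and the two example clients are assumptions, and the source does not state which FL algorithm it uses.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local sample counts."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Two hypothetical clients holding 1 and 3 local samples respectively.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Weighting by sample count means the client with more local data pulls the global parameters further toward its own.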

Evaluation and Metrics

The performance of the model has been evaluated using a 70-20-10 split of the above-mentioned dataset into training, validation, and testing sets. The performance metrics are as follows:
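A 70-20-10 split of 10,000 samples yields 7,000 training, 2,000 validation, and 1,000 test samples. The sketch below shows one way to produce such a split; the function name, the shuffling step, and the fixed seed are assumptions, since the source does not describe how samples were assigned.

```python
import random

def split_indices(n_samples, fractions=(0.7, 0.2, 0.1), seed=0):
    """Shuffle indices and cut them into train/validation/test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * fractions[0])
    n_val = round(n_samples * fractions[1])
    return (
        indices[:n_train],
        indices[n_train:n_train + n_val],
        indices[n_train + n_val:],  # remainder goes to the test set
    )

train_idx, val_idx, test_idx = split_indices(10_000)
```

Because the dataset is split equally between real and fake samples, a stratified split that preserves that ratio in each part would be a natural refinement.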

Accuracy and Detection Rate

A prediction accuracy of more than 95% has been achieved for deepfake detection, with precision, recall, and F1-score used to account for false positives and false negatives in financial and banking scams [3][7]. A comparative analysis of different detection architectures is presented in Table 3, which demonstrates the effectiveness of the proposed GAN-based framework against baseline models.

Table 3: Performance Comparison of Detection Models

| Model Architecture | Precision (%) | Recall (%) | F1-Score (%) | AUC-ROC | Inference Time (sec/sample) |
|---|---|---|---|---|---|
| CNN Only | 89.2 | 87.5 | 88.3 | 0.87 | 0.8 |
| RNN Only | 85.7 | 84.3 | 85.0 | 0.84 | 1.2 |
| SVM + Random Forest | 82.4 | 80.1 | 81.2 | 0.81 | 1.5 |
| Hybrid CNN-RNN | 92.8 | 91.4 | 92.1 | 0.91 | 1.0 |
| Proposed GAN-based Framework | 96.3 | 95.7 | 96.0 | 0.94 | 0.9 |

Note: All models were evaluated on the same test dataset (10% of 10,000 samples). The proposed GAN-based framework demonstrates superior performance across all metrics while maintaining competitive inference time suitable for real-time fraud detection applications.
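The F1-score is the harmonic mean of precision and recall, F1 = 2PR / (P + R). As a consistency check, the sketch below recomputes F1 from the precision and recall columns of Table 3 and prints it next to the reported value; the dictionary simply restates the table, and the function name is an assumption.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)

# (precision %, recall %, reported F1 %) as listed in Table 3
table3 = {
    "CNN Only": (89.2, 87.5, 88.3),
    "RNN Only": (85.7, 84.3, 85.0),
    "SVM + Random Forest": (82.4, 80.1, 81.2),
    "Hybrid CNN-RNN": (92.8, 91.4, 92.1),
    "Proposed GAN-based Framework": (96.3, 95.7, 96.0),
}

for model, (p, r, reported) in table3.items():
    print(f"{model}: computed F1 = {f1_score(p, r):.1f}, reported = {reported}")
```

For every row, the recomputed value agrees with the reported F1 column to one decimal place.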

Area Under the Curve (AUC)

ROC-AUC curves have been generated to assess the model's performance in detecting adversarial attacks in the banking sector. The AUC values exceed 0.90, which indicates a strong detection framework [3].
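AUC can be read as the probability that a randomly chosen fake sample receives a higher score than a randomly chosen real one. The sketch below computes it directly from that pairwise definition; the label convention (1 = fake, 0 = real) and the toy scores are assumptions, and the quadratic loop is meant for illustration rather than large datasets.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    # Each (positive, negative) pair scores 1 for a win and 0.5 for a tie.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical detector scores: one fake sample is ranked below a real one.
auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])
```

In this toy case three of the four fake/real pairs are ordered correctly, giving an AUC of 0.75; a value of 0.5 would correspond to random ranking.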

Other Metrics

The multi-class detection framework uses Mean Average Precision (MAP), and its computational efficiency (an inference time of under 1 second per sample) makes it suitable for real-time application [6]. Five-fold cross-validation has been used to confirm predictive ability, alongside component impact (ablation) testing, in which the RNNs' temporal accuracy decreased by 15%. Synthetic augmentation has been used to overcome limitations such as data scarcity [3]. This methodology leaves room for further development of the detection framework against deepfake-enabled identity theft and scams in the banking and finance industries.
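In k-fold cross-validation, each sample is used for validation exactly once and for training in the remaining folds. The sketch below generates the index sets for k = 5; the function name and the contiguous, unshuffled folds are simplifying assumptions, and in practice the indices would usually be shuffled first.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train, validation) index lists for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size

# Hypothetical example: 10 samples in 5 folds of 2 validation samples each.
folds = list(k_fold_indices(10, k=5))
```

Averaging a metric over the five validation folds gives a more stable estimate than a single held-out split.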

Conclusion

This study shows the major threat posed by advanced AI-driven deepfake technologies, which are used for identity theft and fraudulent activities in the finance and banking sectors. The proposed method, which uses WaveNet and StyleGAN to generate deepfakes and a hybrid framework combining CNNs with RNNs to detect them, achieves a prediction accuracy of more than 95% in detecting malicious inconsistencies. Prevention methods such as real-time interdiction and regulatory frameworks can help to handle challenges such as data scarcity and evolving fraud patterns. To improve security in digital financial and banking environments, future research should focus on developing flexible and secure frameworks, like the GAN-based detection model, that offer a higher detection rate and protect sensitive data.

References

  1. Jon Bateman. Deepfakes and synthetic media in the financial system: Assessing threat scenarios. Carnegie Endowment for International Peace, 2022.
  2. Indra Jaya Gunawan and Sylvia Janisriwati. Legal analysis on the use of deepfake technology: threats to Indonesian banking institutions. Law and Justice, 8(2):192–210, 2023.
  3. Zong Ke, Shicheng Zhou, Yining Zhou, Chia Hong Chang, and Rong Zhang. Detection of AI deepfake and fraud in online payments using GAN-based models. In 2025 8th International Conference on Advanced Algorithms and Control Engineering (ICAACE), pages 1786–1790. IEEE, 2025.
  4. Leo SF Lin. Examining the role of deepfake technology in organized fraud: Legal, security, and governance challenges. Frontiers in Law, 4:6–17, 2025.
  5. Anil Kumar Pakina, Deepak Kejriwal, Anshul Goel, and Tejaskumar Dattatray Pujari. AI-generated synthetic identities in fin tech: Detecting deep fakes KYC fraud using behavioral biometrics. IOSR Journal of Computer Engineering, 25(3):26–37, 2023.
  6. Akshay Shetye, Nilakshi Jain, Shwetambari Borade, and Vineet Kumar. Deepfake technologies in financial fraud: Generation, detection, and mitigation strategies. In 2025 Global Conference in Emerging Technology (GINOTECH), pages 1–6. IEEE, 2025.
  7. Damilola Bartholomew Sholademi. Leveraging AI for detecting deep fakes and combating financial fraudulent identity schemes. 2022.
  8. Al-Ayyoub, M., AlZu’bi, S., Jararweh, Y., Shehab, M. A., & Gupta, B. B. (2018). Accelerating 3D medical volume segmentation using GPUs. Multimedia Tools and Applications, 77(4), 4939-4958.
  9. Gupta, S., & Gupta, B. B. (2015, May). PHP-sensor: a prototype method to discover workflow violation and XSS vulnerabilities in PHP web applications. In Proceedings of the 12th ACM international conference on computing frontiers (pp. 1-8).
  10. Gupta, S., & Gupta, B. B. (2018). XSS-secure as a service for the platforms of online social network-based multimedia web applications in cloud. Multimedia Tools and Applications, 77(4), 4829-4861.

Cite As

Kalyan C S N (2025) Deepfake Fraud in Banking and Finance: Exploiting AI for Identity Theft, Insights2Techinfo, pp.1
