Exploring Graph Analytics for Detecting Financial Fraud

In the realm of finance, numerous institutions rely on rule-based frameworks to identify fraudulent activities. Yet, as deception strategies continue to evolve, it is imperative to advance our detection techniques to effectively counteract these threats.

Our research indicates that by strategically utilizing appropriate data and analytical tools, we can markedly improve our capacity to detect patterns in transaction flows and prioritize accounts that warrant further scrutiny. Our team ventured into network science to analyze the complex interactions between accounts, revealing how funds traverse through a network.

A deeper understanding of connection levels, elaborated upon in Section III, can illuminate how fraud operates within a network. For example, fraudsters frequently use intermediary accounts to mask their identities and launder money across various channels. Detecting these intermediaries early can significantly reduce potential harm and losses. One of the methodologies we adopted involves assessing node centralities. These measures quantify the number of connections associated with each account, identify accounts acting as brokers or bridges, and assess how quickly an account can reach others and thereby move funds across the network, as detailed in Section IV.

Furthermore, we explored the relationship between these centrality metrics and instances of fraud across different connection levels. By broadening our analysis to encompass three levels of connections, we gained a more nuanced understanding of potential weaknesses and their implications for the network.

This article is structured as follows: (1) an overview of fraud, (2) the current state of fraud in the Philippines, (3) the application of Python packages and graph databases for fraud pattern analysis, and (4) metrics for fraud detection.

I. Understanding Fraud

There is a growing need for awareness regarding financial crimes in the Philippines. To prevent such crimes, it is vital to make fraud patterns easy to identify, raise public awareness, and educate individuals so that they can transact more safely. Thus, we begin by defining fraud and its various forms.

According to Article 1338 of the Philippine Civil Code, fraud involves one party using deceptive words or actions to persuade another into an agreement they wouldn't otherwise have accepted. This deception typically involves a cunning scheme with fraudulent intent, often executed through the concealment or omission of critical information (Supreme Court E-library, 2013).

Some common forms of fraud found in financial institutions include:

  1. Card cloning: The unauthorized replication of credit, debit, or ATM cards to make purchases or withdraw funds from another's account.
  2. Phishing/scams: Instances where individuals impersonate legitimate institutions, sending messages to trick victims into divulging personal data, potentially leading to identity theft.
  3. Identity theft: The unauthorized use of someone else's personal information to conduct illicit transactions.
  4. Unofficial online lending: Fraudsters masquerading as lenders from reputable institutions or private entities, deceiving victims into engaging in harmful transactions.
  5. Money mules: Individuals who, knowingly or unknowingly, facilitate fraudulent activities by allowing their accounts to be used in what may be a larger laundering scheme.
  6. Loan application fraud: Employees manipulating processes to secure loan approvals illicitly (Bangko Sentral ng Pilipinas, 2023).

II. The Current Landscape of Fraud in the Philippines

Having defined fraudulent activities, it is crucial to assess their impact and prevalence in the Philippines, and understand the necessity of addressing these issues.

In 2022, approximately 8.7% of digital transactions in the Philippines were flagged as potentially fraudulent, significantly exceeding the global average and ranking the country third highest in a study by TransUnion (TransUnion, 2023).

Moreover, according to the Anti-Money Laundering Council (AMLC), the rise in suspicious transaction reports (STRs) in 2021 was linked to the rapid growth of digital banking and electronic wallets. This increase coincided with remarkable growth rates for electronic fund transfers through PESONet and InstaPay, reaching 164% and 223% in the first half of 2021, respectively. The Financial Intelligence Unit reported that 89% of suspicious transactions in 2021 were associated with money mules, with the remaining 11% spanning from 2016 to 2021, accounting for 99% of the total monetary value over those six years (PHP 505 billion).

Given these pervasive issues, it is recommended that organizations equip themselves with the right tools and innovative methods for detecting fraud while maintaining a seamless consumer experience.

III. Utilizing Python Packages and Graph Databases for Fraud Detection

To enhance fraud detection capabilities, our team explored the principles of network science.

While traditional fraud detection systems used by banks are prevalent, they often exhibit limitations, focusing primarily on basic features such as demographics or direct transactions (Figure 1a), neglecting more complex connections (Figures 1b and 1c). This shortcoming can lead to the oversight of numerous fraudulent activities. By examining higher-level connections, we can identify intricate fraud patterns or collusion involving multiple entities. For instance, analyzing third-level connections (Figure 1c) can uncover fraud schemes entailing multiple layers of deception.

To bolster our system, we incorporated network analysis to identify complex relationships among transactions, scrutinizing suspicious activities across three levels of connections. This enhanced evaluation included attributes such as account holder names, transaction dates, amounts, and fraud labels.
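As a minimal sketch of this setup, transactions can be loaded into a NetworkX directed graph and the accounts around a given account grouped by connection level. The transaction records, account names, and the `connections_by_level` helper are hypothetical and only illustrate the idea; they are not the data or code used in our analysis.

```python
import networkx as nx

# Hypothetical transactions: (sender, receiver, date, amount, is_fraud).
transactions = [
    ("A", "B", "2023-01-05", 5000.0, False),
    ("B", "C", "2023-01-06", 4800.0, True),
    ("C", "D", "2023-01-06", 4700.0, True),
    ("D", "E", "2023-01-07", 4600.0, False),
    ("F", "A", "2023-01-08", 1200.0, False),
]

# Each account is a node; each transfer is a directed edge with attributes.
G = nx.DiGraph()
for sender, receiver, date, amount, is_fraud in transactions:
    G.add_edge(sender, receiver, date=date, amount=amount, is_fraud=is_fraud)


def connections_by_level(graph, account, max_level=3):
    """Group accounts by hop distance from `account` (direction ignored)."""
    undirected = graph.to_undirected()
    hops = nx.single_source_shortest_path_length(
        undirected, account, cutoff=max_level
    )
    levels = {level: set() for level in range(1, max_level + 1)}
    for node, distance in hops.items():
        if distance > 0:  # skip the starting account itself
            levels[distance].add(node)
    return levels


levels = connections_by_level(G, "A")
for level, accounts in levels.items():
    print(level, sorted(accounts))
```

Here account E lies four hops from A, so it falls outside the three levels examined. A plain `DiGraph` keeps only one edge per sender-receiver pair; if repeated transfers between the same two accounts matter, `nx.MultiDiGraph` would be the more natural container.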

Some of the Python packages we utilized include:

  • NetworkX: A widely-used library within the Python ecosystem, known for its user-friendly interface and compatibility with other machine learning libraries. It is instrumental in fraud detection by modeling relationships between entities as graphs. Algorithms like centrality measures and community detection, detailed in Section IV, can reveal patterns indicative of fraudulent behavior. For example, anomalies may be identified through nodes exhibiting unusual influence or tightly-knit communities within a network.
  • igraph: Accessible in both Python and R, igraph excels in handling large-scale networks requiring high-performance computations. It computes essential network features, such as betweenness and the other centrality measures discussed further in Section IV, to identify key nodes and potential intermediaries in fraudulent schemes. The community detection algorithms in igraph assist analysts in uncovering collusion and enhancing fraud detection efforts.

Fraud detection encompasses various techniques aimed at identifying and preventing deceit. Some methods mentioned earlier include centrality measures, community detection, anomaly detection, pattern recognition, and integrating network analysis with machine learning models for a more sophisticated approach.

Graph centrality measures are particularly valuable for several reasons. A study by Yoo et al., 2023, demonstrated that incorporating centrality measures in Medicare fraud detection improved precision by 4%, recall by 24%, and F1-score by 14% compared to graph neural network models. Similarly, Prusti et al., 2021, found that adding centrality features led to an average increase of up to 6% in machine learning evaluation metrics. Centrality measures provide insights into the significance and influence of account holders within a network, revealing critical relationships and highlighting major players in fraudulent activities that might otherwise go unnoticed.

In the subsequent sections, we will delve deeper into centrality measures and community detection techniques utilized in fraud detection.

IV. Metrics for Fraud Detection

(i) Centrality Measures

Understanding centrality measures within a network is a crucial strategy for flagging suspicious accounts. By identifying central nodes (or account holders) that represent influential accounts, targeted interventions can be executed. Below, we will discuss unusual alterations in centrality measures that may signify potentially fraudulent behavior, and how these influential account holders affect the optimization of money flow within a coordinated money laundering scheme.

We will analyze the following network, illustrated in Figure 2, to explore each centrality measure and determine which nodes—1, 2, or 3—serve as the central point of the network.

(a) Degree Centrality

Degree centrality highlights the most interconnected nodes.

  • It quantifies the number of connections a node possesses in a network.
  • It identifies prominent nodes, indicating their significance based on the number of edges linked to them.

As shown, Node 1 exhibits the highest degree centrality, making it the account with the most connections. This node may be a focal point for fraud investigations, as a highly connected account could serve as a central hub for coordinating fraudulent activities and exert considerable influence over money transfers within the network.
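The calculation can be sketched with NetworkX as follows. The edge list is a hypothetical toy network, not the one in Figure 2; it is chosen only so that one node clearly has the most connections.

```python
import networkx as nx

# Hypothetical toy network (not the network shown in Figure 2).
G = nx.Graph()
G.add_edges_from([(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (3, 6)])

# Raw degree: the number of connections each account has.
raw_degree = dict(G.degree())

# NetworkX's degree centrality divides the degree by (n - 1),
# the maximum number of connections a node could have.
normalized = nx.degree_centrality(G)

most_connected = max(raw_degree, key=raw_degree.get)
print(most_connected, raw_degree[most_connected], normalized[most_connected])
```

In this toy network, node 1 is linked to four of the five other nodes, so its normalized degree centrality is 4/5.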

(b) Closeness Centrality

Closeness centrality measures how swiftly a node can reach all others in a network.

  • It is the reciprocal of the sum of distances to other nodes.
  • It emphasizes nodes that are proximal to all others.

At higher connection levels, nodes with high closeness centrality sit within a few steps of a large portion of the network, so money can reach them, or leave them, quickly. Node 3, for instance, has the highest closeness centrality: its average distance to the other nodes is the shortest. An account in this position can move funds efficiently across the network and may be flagged as suspicious, since it could orchestrate fraudulent activities by connecting other deceitful account holders.
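To make the definition concrete, the sketch below computes closeness directly from shortest-path distances and compares it with NetworkX's built-in function. The network is a hypothetical toy example rather than the one in the figures. Note that NetworkX multiplies the reciprocal of the summed distances by (n - 1), which turns it into the reciprocal of the average distance.

```python
import networkx as nx

# Hypothetical toy network (connected, undirected).
G = nx.Graph()
G.add_edges_from([(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (3, 6)])


def closeness_from_distances(graph, node):
    """(n - 1) divided by the sum of shortest-path distances from `node`."""
    distances = nx.single_source_shortest_path_length(graph, node)
    total = sum(distances.values())  # the distance from a node to itself is 0
    return (len(graph) - 1) / total


manual = {node: closeness_from_distances(G, node) for node in G}
library = nx.closeness_centrality(G)

closest = max(manual, key=manual.get)
print(closest, round(manual[closest], 3))
```

The manual formula matches the library only because this toy graph is connected; for disconnected graphs NetworkX applies an additional scaling factor by default.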

(c) Betweenness Centrality

Betweenness centrality identifies nodes serving as crucial intermediaries or gatekeepers within a network.

  • It measures how frequently a node appears in the shortest paths between others.
  • For every pair of other nodes, it takes the fraction of shortest paths between them that pass through the node, then sums these fractions over all pairs.

In the following figure, traversing from the blue node to the green node through Node 1 is considered a valid path.

As illustrated, Node 3 holds the highest betweenness centrality, placing it in a strategically significant position to regulate or manipulate transaction flows between seemingly disconnected groups. Within a fraud network, investigating accounts with high betweenness can disrupt transactions and reveal hidden connections that rely on those accounts. Sudden increases in volume or unexpected transfers through these accounts should be flagged as suspicious, since early detection helps explain how the rest of the network stays connected and how efficiently funds move through it.
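A small sketch of the gatekeeper effect, using NetworkX on a hypothetical network in which two tightly linked groups of accounts can reach each other only through an account labeled "X". The node names are illustrative assumptions.

```python
import networkx as nx

# Two hypothetical groups of three accounts, joined only through "X".
G = nx.Graph()
G.add_edges_from([
    ("a1", "a2"), ("a2", "a3"), ("a1", "a3"),   # group A
    ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),   # group B
    ("a3", "X"), ("X", "b1"),                   # X bridges the groups
])

# normalized=False returns the raw count of node pairs whose
# shortest paths run through each node.
scores = nx.betweenness_centrality(G, normalized=False)

broker = max(scores, key=scores.get)
print(broker, scores[broker])
```

Every shortest path between a group A account and a group B account passes through "X" (3 x 3 = 9 pairs), while "a1" lies on no shortest path at all. Removing "X" would cut the two groups off from each other.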

(d) PageRank Centrality

PageRank centrality evaluates a node's significance based on the quality and quantity of its connections.

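  • It highlights influential nodes within the network: a node ranks highly when it receives links from nodes that themselves rank highly, with each source's contribution divided among its outbound links. It is defined on directed networks and can be applied to undirected ones by treating each edge as running in both directions.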
  • Originally developed by Google co-founders to rank web pages, PageRank can also be leveraged for fraud detection, identifying entities that exert disproportionate influence.
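The idea can be sketched in plain Python as a power iteration. This is a simplified illustration under stated assumptions: the damping factor of 0.85, the fixed iteration count, and the account names are choices made for the example, and the function is not a substitute for a library implementation.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Simplified PageRank by power iteration.

    `out_links` maps every account to the list of accounts it sends
    money to; every receiver must also appear as a key.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by accounts with no outgoing transfers is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n
            for node in nodes
        }
        # Each account passes its rank on, split across its receivers.
        for node in nodes:
            targets = out_links[node]
            for target in targets:
                new_rank[target] += damping * rank[node] / len(targets)
        rank = new_rank
    return rank


# Hypothetical transfers: A, B and C all send to M, which forwards to Z.
out_links = {"A": ["M"], "B": ["M"], "C": ["M"], "M": ["Z"], "Z": []}
scores = pagerank(out_links)
for account in sorted(scores, key=scores.get, reverse=True):
    print(account, round(scores[account], 3))
```

In this toy example the collecting account M outranks the senders, and Z ranks highest even though it has a single inbound link, because that link comes from an account that is itself highly ranked.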

(ii) Community Detection

Various techniques can be utilized to identify groups or communities of nodes in a network that share denser connections among themselves than with the rest of the network. Each method has its strengths and can be employed for fraud detection.

Two primary approaches for understanding clusters or communities include the Divisive method (Girvan-Newman algorithm) and the Agglomerative method (Louvain algorithm). They differ in their community formation and merging processes, with the choice of algorithm depending on network size, granularity of community detection, and specific fraud characteristics being targeted. A combination of both approaches may also prove effective.

(a) Louvain Algorithm

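The Louvain algorithm takes a bottom-up approach: every node starts in its own community, and nodes and communities are iteratively merged whenever the merge increases modularity, a score that compares the density of connections inside communities with what would be expected at random. It optimizes the identification of densely connected groups (Figure 10) and is suitable for large networks.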

Fraudulent behaviors often involve collaboration among multiple individuals. The Louvain algorithm can reveal communities with unusual transaction volumes or sudden structural shifts over time, highlighting behavioral patterns that differ from legitimate activities. Additionally, accounts that do not align with any established community or span multiple communities, like node 0 (Figure 11), may be flagged as anomalies warranting further investigation.
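A minimal sketch with NetworkX, assuming a version that provides `louvain_communities` (2.8 or later). The network is hypothetical: two groups of four fully interlinked accounts joined by a single transfer. The last step lists accounts with a connection into another community, one simple way to surface accounts that span communities.

```python
import networkx as nx

# Two hypothetical groups of four accounts, fully linked internally.
group_a = ["a1", "a2", "a3", "a4"]
group_b = ["b1", "b2", "b3", "b4"]

G = nx.Graph()
for group in (group_a, group_b):
    G.add_edges_from(
        (u, v) for i, u in enumerate(group) for v in group[i + 1:]
    )
G.add_edge("a4", "b1")  # a single transfer linking the two groups

# The seed fixes the random node ordering so the run is repeatable.
communities = nx.community.louvain_communities(G, seed=42)

# Map each account to its community, then find accounts that have
# at least one neighbor in a different community.
membership = {
    node: index
    for index, members in enumerate(communities)
    for node in members
}
bridging = sorted(
    node for node in G
    if any(membership[nbr] != membership[node] for nbr in G[node])
)
print([sorted(c) for c in communities], bridging)
```

On real transaction data the communities are rarely this clean, so bridging accounts are better treated as candidates for review than as confirmed anomalies.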

(b) Girvan-Newman Algorithm

The Girvan-Newman algorithm (Girvan and Newman, 2002) adopts a contrasting approach to Louvain by recursively dividing the network into smaller communities. It is effective for detecting communities with distinct boundaries.

The procedure iteratively removes the edge with the highest betweenness centrality, recomputing betweenness after each removal; this disrupts inter-community flows and reveals the network structure. For instance, the edge between nodes [0,31] has the highest betweenness score and is removed first (Figure 12). After ten iterations, the edge between nodes [2,13] is eliminated.

As previously mentioned, edges with high betweenness signify vital connections within a network. The Girvan-Newman approach uncovers communities that may assist in identifying potential fraud rings or collusion (El Ayeb et al., 2020).
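The edge-removal step can be sketched with NetworkX on a hypothetical network of two triangles joined by one edge. This is not the network shown in Figure 12; it is small enough that the first removed edge can be checked by hand.

```python
import networkx as nx

# Hypothetical network: two triangles joined by a single edge.
G = nx.Graph()
G.add_edges_from([
    (0, 1), (1, 2), (0, 2),   # first group
    (3, 4), (4, 5), (3, 5),   # second group
    (2, 3),                   # the only link between the groups
])

# Edge betweenness: how many shortest paths use each edge.
edge_scores = nx.edge_betweenness_centrality(G, normalized=False)
first_removed = max(edge_scores, key=edge_scores.get)

# girvan_newman yields successively finer partitions;
# the first item is the first split of the network.
first_split = next(nx.community.girvan_newman(G))
communities = sorted(sorted(members) for members in first_split)

print(first_removed, communities)
```

All nine shortest paths between the two triangles use the joining edge, so it has the highest betweenness and is removed first, leaving the two triangles as separate communities. Iterating further over the generator would keep splitting these groups into smaller ones.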

These algorithms are beneficial when examining higher connection levels, as fraudulent schemes often involve collaboration between multiple entities. Analyzing connections beyond the first degree provides a more nuanced insight into complex relationships within the network, enhancing our fraud detection efforts.

References

  • G.R. No. 171428: ALEJANDRO V. TANKEH, PETITIONER, VS. DEVELOPMENT BANK OF THE PHILIPPINES, STERLING SHIPPING LINES, INC., RUPERTO V. TANKEH, VICENTE ARENAS, AND ASSET PRIVATIZATION TRUST, RESPONDENTS. Decision. Supreme Court E-Library. elibrary.judiciary.gov.ph/thebookshelf/showdocs/1/56359#:~:text=Under%20Article%201338%20of%20the,would%20not%20have%20agreed%20to.
  • Bangko Sentral ng Pilipinas [Consumer Protection and Market Conduct Office Strategic Communication and Advocacy]. “PROTECT YOURSELF FROM FRAUD AND SCAM.” Bangko Sentral Ng Pilipinas, www.bsp.gov.ph/Media_and_Research/Primers%20Faqs/Protect_yourself_from_Fraud_and_Scam. Accessed 21 Dec. 2023.
  • TransUnion. “TransUnion Report Finds Digital Fraud Attempts Fall 18% in the Philippines but Rise 80% Globally From Pre-Pandemic Levels.” TransUnion, 31 Mar. 2023, newsroom.transunion.ph/transunion-report-finds-digital-fraud-attempts-fall-18-in-the-philippines-but-rise-80-globally-from-pre-pandemic-levels. Accessed 27 Dec. 2023.
  • Agcaoili, Lawrence. “AMLC Warns Money Mules Scams Rising in Philippines.” Philstar.com, 12 Feb. 2023, www.philstar.com/business/2023/02/13/2244465/amlc-warns-money-mules-scams-rising-philippines#:~:text=In%20a%20report%20on%20money,the%20first%20quarter%20of%202022.
  • PageRank U.S. Patent — Method for node ranking in a linked database — Patent number 6,285,999
  • Arasteh, M., Alizadeh, S. A fast divisive community detection algorithm based on edge degree betweenness centrality. Appl Intell 49, 689–702 (2019). https://doi.org/10.1007/s10489-018-1297-9
  • Chaudhary, L., Singh, B. (2019). Community Detection Using an Enhanced Louvain Method in Complex Networks. In: Fahrnberger, G., Gopinathan, S., Parida, L. (eds) Distributed Computing and Internet Technology. ICDCIT 2019. Lecture Notes in Computer Science(), vol 11319. Springer, Cham. https://doi.org/10.1007/978-3-030-05366-6_20
  • Girvan, M., & Newman, M. E. (2002). Community structure in social and biological networks. Proceedings of the national academy of sciences, 99(12), 7821–7826.
  • Safa El Ayeb, Baptiste Hemery, Fabrice Jeanne, Estelle Pawlowski Cherrier. Community Detection for Mobile Money Fraud Detection. 2020 Seventh International Conference on Social Networks Analysis, Management and Security (SNAMS), Dec 2020, Paris, France. https://doi.org/10.1109/SNAMS52053.2020.9336578
  • Prusti, D., Das, D. & Rath, S.K. Credit Card Fraud Detection Technique by Applying Graph Database Model. Arab J Sci Eng 46, 1–20 (2021). https://doi.org/10.1007/s13369-021-05682-9
  • Y. Yoo, J. Shin and S. Kyeong, “Medicare Fraud Detection Using Graph Analysis: A Comparative Study of Machine Learning and Graph Neural Networks,” in IEEE Access, vol. 11, pp. 88278–88294, 2023, doi: 10.1109/ACCESS.2023.3305962.
