apidays LIVE Hong Kong 2021 - Federated Learning for Banking by Isaac Wong, WeBank

apidays LIVE Hong Kong 2021 - API Ecosystem & Data Interchange
August 25 & 26, 2021

Federated Learning for Banking
Isaac Wong, AI Solution Architect at WeBank

  1. Federated Learning (FL) in Action: Business Use Case and Value Proposition
  2. Agenda
     • Brief Intro of WeBank
     • Top Strategic Technology Trend
     • Why Federated Learning
     • Proven Use Cases of Federated Learning
     • Federated Learning Benchmarking
     • Q & A
  3. WeBank: China’s First Digital Bank
  4. WeBank: China’s 1st Digital Bank
     • Founded in 2014; driven by technology and innovation
     • First digital bank: 1st
     • Individuals & SMEs served by WeBank: 270+ mn individuals and 1.8+ mn SMEs
     • IT staff: >56%
     • High concurrent transaction processing, peak daily transactions: 750 mn
  5. Leading Technological Capabilities: ABCD (Artificial Intelligence, Blockchain, Cloud Computing, Big Data)
     Optimize the value of FinTech: (Efficiency × UX × Scale) / (Cost × Risk)
     • Chatbot: handles 98% of inquiries
     • Remote KYC: FAR ~1 in a million
     • Quality Control: 100% coverage rate
     • FL for Credit Scoring: AUC increased by 12%, cost reduced by 5%-10%
     • AI Risk Mgmt.: diverse alternative data sources (e.g. satellite, GPS, sentiment)
     • ARM Infrastructure: stable and secure
     • Distributed Architecture: 12k standardized servers
     • Precision Marketing: customer acquisition cost reduced by 93%
     • Smart Risk Mgmt.: 100K+ variables; 387 models of 44 types
     • Supplier of China’s Blockchain-based Service Network
     • A/C Mgmt. & Reconciliation: financial-grade, 170 million+ transactions w/o error
     • Arbitration Chain: stored 3 billion+ records; dispute resolution reduced from 6+ months to ~7 days
     • Supply Chain Finance Platform: served over 10k companies
     • Mainland-Macau Health Code: improved cross-border travel efficiency
     • Macau Smart City: improved government service efficiency by 50%
     • Copyrights Platform: stored over 5 million press releases
  6. Top Strategic Technology Trend
  7. Hype Cycle for Privacy, 2021
     • Hype cycle: in H1 2020, researchers published more than 1,000 papers on FL, compared with just 180 papers in total in 2018.
     • Google searches for the term are surging.
  8. Gartner Top Strategic Technology Trends for 2021
     • Privacy is becoming a bigger issue, and new regulations will force organizations to be more concerned about privacy protection.
     • Gartner believes that by 2025, half of large organizations will implement privacy-enhancing computation (PEC) for multiparty data analytics use cases.
  9. Why Federated Learning
  10. Limits of Traditional ML: Suffocation of Data Collaboration Limits the Effectiveness of ML
     Ways blocked between collaborators:
     • Buying data (ILLEGAL): directly buying data from third-party companies is being banned around the world and violates privacy.
     • Using desensitized data (INEFFECTIVE): obtaining and using desensitized data between corporations gives no guarantee of modeling outcome or performance.
     • Combining results (RISKY): when results from models built separately on different data sources are used, each company bears its own risk for the results.
     Unresolved issue:
     • Companies cannot buy data directly under stricter laws. Further audit and privacy concerns make companies unwilling to collaborate.
     • Bank A: financial data, including credit reports, transaction history and fraud detection.
     • Social Media B: user data, including user portraits, activity history, interest labels and consumption habits.
     Current challenges: unwillingness to share data within departments/subsidiaries
     • Consumers Dept, SME Dept and Corporate Dept do not contribute their data to a common data platform.
     • The parent company finds it hard to build a universal data platform.
  11. FL Resolves Limitations of Traditional ML
     • FL is a distributed machine learning framework that helps multiple parties (e.g. multiple departments, subsidiaries or organizations) build models effectively and collaboratively, in compliance with user privacy and data security rules as well as government policies and regulations.
     • FL deployed for a single Financial Services Institute (FSI): FSI A with Dept/Subs 1, 2, …, N and Data Partners 1, 2, …, N.
     • FL network deployed for multiple FSIs: an Operator with FSI A, FSI B, …, FSI N and Data Providers 1, 2, …, N.
  12. Categorization of FL
     • Horizontal FL: large overlap of features between the two data sets; the federation "aggregates" samples (IDs).
     • Vertical FL: large overlap of sample IDs (users) between the two data sets; the federation "aggregates" features.
     • Diagram labels: Data from A, Data from B, Data from C, Labels.
  13. Proven Use Cases of FL
  14. WeBank Serves More SMEs with FL Technology
     • 01 Challenge: limited SME data. It is difficult for WeBank to achieve large-scale growth in its SME credit business.
     • 02 Goal: achieve inclusive finance. WeBank hopes to serve more SMEs that are underserved in financial services.
     • 03 Result: number of SME customers: 1.88 Mn+.
     • FL technology, WeBank side: ID, Y (Overdue); loan size: 2Bn+; number of customers: 300K+.
     • FL technology, a large invoicing company: ID, X (invoice data for SMEs).
  15. Strengthen Anti-Money Laundering (AML) in the Banking Industry
     • Due to data security requirements, financial institutions such as banks and insurance companies model data locally.
     • With FL, the models of various institutions can be combined to break the barriers between data and improve the accuracy of the AML system and the efficiency of reviewers.
     • Horizontal FL (Bank 1, Bank 2, Bank 3): expand AML samples and build a baseline AML model. Bank transaction data: transfers, payments, ...
     • Vertical FL (Internet company): expand the dimension of customer characteristics to further optimize the model effect. Mobile payment and geolocation data: e-commerce shopping, map tracks, ...
  16. Enhance Bank Credit Risk Control Capabilities
     • 01 Challenge: unresolved data privacy challenge. The lack of data privacy protection mechanisms prohibits the use of external data.
     • 02 Goal: ramping up risk control capability. The bank aims to improve its credit risk control capabilities by leveraging external data to fulfill regulatory requirements.
     • 03 Result: enhanced retail credit model, while fulfilling the self-built risk control regulatory requirements.
     • FL technology, a bank: ID, Y (Overdue); loan size: 10Bn+; number of customers: 1Mn+.
     • FL technology, a large Internet company: ID, X (Internet behavior data).
  17. Optimize Pricing for the Insurance Industry
     Assist reinsurance companies in establishing an auto insurance pricing model for insurance companies:
     • Vertical federation (Internet company) introduces and mines Internet big data "from the human factor". Internet behavior data: trip data, consumption data, information preference, driving violation data, ...
     • Horizontal federation (Insurance company 1, 2, …, N) expands the scale of the insurers’ traditional factor data set, enhancing risk analysis of car owners. Insurance company data: underwriting data, claim data, Internet of Vehicles data, ...
  18. FL Benchmarking
  19. FL Illustration: Give Me Some Credit (GMSC)
     • Uses a public dataset: https://www.kaggle.com/c/GiveMeSomeCredit/data
     • Credit scoring algorithms predict the probability that somebody will experience financial distress in the next two years.
     • Public best AUC: 0.86390
     Data summary:
     • Data set name: Give Me Some Credit
     • No. records: 150K
     • Target variable: With Default (Y/N)
     • Explanatory variables: 10
       1. Age
       2. Debt Ratio
       3. Monthly Income
       4. No. Time 30-59 Days Past Due Not Worse
       5. No. Time 60-89 Days Past Due Not Worse
       6. No. Times 90 Days Late
       7. Revolving Utilization Of Unsecured Lines
       8. No. Open Credit Lines And Loans
       9. No. Real Estate Loans Or Lines
       10. No. Dependents
  20. FL Illustration Scope: Binary Classification for Credit Default Prediction
     • POC data attributes: X = explanatory variables, Y = target variable.
     • Party A data: 150K records (ID1, ID2, ID3, ..., ID 150K), 6 X and Y: 1. Debt Ratio 2. Monthly Income 3. No. Time 30-59 Days Past Due Not Worse 4. No. Time 60-89 Days Past Due Not Worse 5. No. Times 90 Days Late 6. Age
     • Party B data: 150K records, 4 X: 1. No. Dependents 2. No. Open Credit Lines And Loans 3. No. Real Estate Loans Or Lines 4. Revolving Utilization Of Unsecured Lines
     • Data preprocessing (diagram labels): Participate, Read Data, Intersect.
     • Modelling: a binary classification machine learning model that predicts credit default; train/validation ratio 8:2.
     • Algorithms: Tree, LR, SBT, SVM, NN. Algorithm demonstrated: 1. Secure Gradient Boost (SBT).
  21. FL with Federated AI Technology Enabler (FATE) Enterprise Version
     • FATE supports the Secure Boosting Tree (SBT) algorithm, which uses homomorphic encryption and is similar to XGBoost/GBDT.
     • Screens shown: Algo configuration, Training Process, Modelling Dashboard.
  22. Training Result Comparison
     • Partial local result (Local ML): Bank Data, 150k IDs, 6 X | 1 Y (label). XGBoost, iter: 70, max_depth: 5. Train AUC 0.821, Validation AUC 0.802.
     • FL result (Federated ML): Bank Data, 150k IDs, 6 X | 1 Y (label), plus 3rd Party Data, 150k IDs, 4 X. Secure Boosting Tree, iter: 50, max_depth: 5. Train AUC 0.879, Validation AUC 0.862 (validation +7.5% improvement over the local result).
     • Centralized result (Centralized ML): All Data, 150k IDs, 10 X | 1 Y (label). XGBoost, iter: 70, max_depth: 5. Train AUC 0.878, Validation AUC 0.862.
  23. FATE Credentials (Standard, Recommendation, Publication, Award & Certification, Vision)
     • WeBank led the standard "Federated Learning Architecture" in AIOSS.
     • WeBank is the founding member of IEEE P3652.1 and is pushing an IEEE standard for Federated Learning applications.
     • WeBank published the first book on "Federated Learning".
     • FL won the AAAI-20 award for Best Industrial Application.
     • The FATE platform obtained certifications from CAICT on both FL and MPC compliance tests.
     • HKMA encourages banks in Hong Kong to co-create a digital framework on advanced technologies such as Federated Learning.
     • WeBank introduced Federated Learning cases for Anti-Money Laundering to PBOC Shenzhen, and the regulator encouraged use cases of this technology.
  24. Q & A
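
The horizontal/vertical distinction on slide 12 can be illustrated with a small sketch. It only inspects the logical layout of two toy tables; in a real federation the parties would not pool raw data like this. The function name, the 0.5 overlap threshold and the sample records are all assumptions made for illustration.

```python
from typing import Dict

Record = Dict[str, float]
Table = Dict[str, Record]  # sample ID -> feature values


def classify_partition(a: Table, b: Table, threshold: float = 0.5) -> str:
    """Label two toy tables as a 'vertical' or 'horizontal' FL setting.

    The threshold is an arbitrary illustrative cut-off, not a standard value.
    """
    ids_a, ids_b = set(a), set(b)
    feats_a = set().union(*(r.keys() for r in a.values()))
    feats_b = set().union(*(r.keys() for r in b.values()))
    id_overlap = len(ids_a & ids_b) / len(ids_a | ids_b)
    feat_overlap = len(feats_a & feats_b) / len(feats_a | feats_b)
    if id_overlap >= threshold and feat_overlap < threshold:
        return "vertical"    # same users, different features
    if feat_overlap >= threshold and id_overlap < threshold:
        return "horizontal"  # same features, different users
    return "neither"


# Hypothetical records: a bank and a data partner that share customers.
bank = {"ID1": {"debt_ratio": 0.3, "age": 40.0},
        "ID2": {"debt_ratio": 0.7, "age": 29.0}}
partner = {"ID1": {"dependents": 2.0}, "ID2": {"dependents": 0.0}}
# A second bank with the same columns but different customers.
other_bank = {"ID3": {"debt_ratio": 0.1, "age": 51.0},
              "ID4": {"debt_ratio": 0.5, "age": 35.0}}

print(classify_partition(bank, partner))
print(classify_partition(bank, other_bank))
```

The bank/partner pair matches the vertical pattern used in the credit scoring cases, and the bank/other-bank pair matches the horizontal pattern used to expand AML samples.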
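
The preprocessing on slide 20 (intersect the two parties' IDs, then split 8:2 into train and validation) can be sketched as follows. This is a simplified stand-in: a plain set intersection is used here, whereas a production federation would align IDs privately. The function names, the seed and the sample IDs are assumptions for illustration.

```python
import random
from typing import Iterable, List, Tuple


def intersect_ids(ids_a: Iterable[str], ids_b: Iterable[str]) -> List[str]:
    """Return the IDs present at both parties, in sorted order."""
    return sorted(set(ids_a) & set(ids_b))


def train_validation_split(
    ids: List[str], train_ratio: float = 0.8, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Shuffle the shared IDs reproducibly and split them by train_ratio."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical ID lists: Party A holds ID1..ID10, Party B holds ID3..ID12.
party_a_ids = [f"ID{i}" for i in range(1, 11)]
party_b_ids = [f"ID{i}" for i in range(3, 13)]

shared = intersect_ids(party_a_ids, party_b_ids)
train_ids, validation_ids = train_validation_split(shared)
print(len(shared), len(train_ids), len(validation_ids))
```

Only the rows whose IDs survive the intersection take part in training, which is why both parties in the illustration report the same 150K records.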
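
Slide 21 states that the Secure Boosting Tree algorithm uses homomorphic encryption but does not name a scheme. The toy below assumes an additively homomorphic scheme in the style of Paillier, with deliberately tiny primes, to show the one property that matters here: a party can add encrypted values it cannot read. It is not FATE's implementation and is not secure; every name and parameter is illustrative.

```python
import math
import random


def keygen(p: int = 101, q: int = 113):
    """Toy Paillier-style key pair. The primes are far too small for real use."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1
    mu = pow(lam, -1, n)  # valid shortcut because g = n + 1 (Python 3.8+)
    return (n, g), (lam, mu)


def encrypt(pub, m: int, rng: random.Random) -> int:
    """Encrypt an integer 0 <= m < n with fresh randomness r."""
    n, g = pub
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c: int) -> int:
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(pub, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)


rng = random.Random(0)
pub, priv = keygen()

# Hypothetical integer-scaled values held by the party that owns the labels.
values = [17, 25, 300]
ciphertexts = [encrypt(pub, v, rng) for v in values]

# Another party sums the ciphertexts without being able to decrypt them.
total_ct = ciphertexts[0]
for ct in ciphertexts[1:]:
    total_ct = add_encrypted(pub, total_ct, ct)

print(decrypt(pub, priv, total_ct))  # 342
```

Real-valued quantities would need fixed-point encoding before encryption, and plaintext sums must stay below n; both details are omitted from this sketch.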
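
The "+7.5% improvement" on slide 22 appears to be the relative gain in validation AUC of the federated model over the local model. The arithmetic can be checked with the figures from the slide; the helper name is an assumption.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Relative change of `new` with respect to `baseline`."""
    return (new - baseline) / baseline


# Validation AUC values as reported on slide 22.
local_val_auc = 0.802        # bank data only, XGBoost
federated_val_auc = 0.862    # bank + third-party data, Secure Boosting Tree
centralized_val_auc = 0.862  # all data pooled, XGBoost

gain = relative_improvement(local_val_auc, federated_val_auc)
print(f"{gain:.1%}")  # 7.5%

# Gap between the federated and the centralized model on validation data.
print(round(federated_val_auc - centralized_val_auc, 3))  # 0.0
```

In absolute terms the gain is 0.060 AUC, and the federated model matches the centralized validation AUC to three decimal places.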
