Carlos Gershenson; Gerardo Iñiguez; Carlos Pineda; Rita Guerrero; Eduardo Islas; Omar Pineda; Martín Zumaya
Mexico has kept a centralized digital record of all taxable transactions since 2014. We received an anonymized dataset of more than 80M contributors (taxpayers, both individuals and companies) and almost 7B monthly aggregations of invoices among contributors between January 2015 and December 2018. With these data, we built temporal networks (monthly and yearly) in which nodes represent contributors and directed links represent one or more invoices issued by contributor A to contributor B in a given time slice. We also received a list of almost 10K contributors that had already been identified as tax evaders. These are called “EFOS” (empresas facturadoras de operaciones simuladas, Spanish for “businesses that invoice simulated operations”). EFOS fabricate invoices for nonexistent products or services so that recipients can deduct them and thereby evade taxes.
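As an illustration of how monthly and yearly network slices can be derived from aggregated invoice records, here is a minimal Python sketch. The record layout (issuer, recipient, month, amount), the sample values, and the name `build_slices` are assumptions made for this example; they are not the format of the actual dataset or the code used in the study.

```python
from collections import defaultdict

# Hypothetical monthly aggregations: (issuer, recipient, "YYYY-MM", amount invoiced).
records = [
    ("A", "B", "2015-01", 1200.0),
    ("B", "C", "2015-01", 300.0),
    ("A", "B", "2015-02", 500.0),
    ("C", "A", "2015-02", 50.0),
]

def build_slices(records, key=lambda month: month):
    """Group records into time slices of weighted directed links.

    Returns {slice_label: {issuer: {recipient: total_amount}}}.
    `key` maps a "YYYY-MM" string to the label of the slice it belongs to.
    """
    slices = defaultdict(lambda: defaultdict(dict))
    for issuer, recipient, month, amount in records:
        links = slices[key(month)][issuer]
        links[recipient] = links.get(recipient, 0.0) + amount
    return slices

monthly = build_slices(records)                                # one network per month
yearly = build_slices(records, key=lambda month: month[:4])    # one network per year
```

Here `monthly["2015-01"]["A"]["B"]` holds the amount A invoiced to B in January 2015, while `yearly["2015"]["A"]["B"]` sums that link over the whole year.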
Our analysis of the network properties around EFOS showed that their interaction patterns differ from those of the majority of contributors. For example, invoicing loops are overrepresented among EFOS and their clients.
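To make the notion of an invoicing loop concrete, the following Python sketch lists short directed cycles that pass through a given contributor. The toy graph, the length cutoff `max_len`, and the function name `loops_through` are assumptions for illustration only; the study itself may define and count loops differently.

```python
def loops_through(graph, start, max_len=3):
    """Return simple directed cycles with at most max_len links that pass through start.

    `graph` maps each issuer to the list of recipients it invoiced.
    Each cycle is returned as the list of nodes visited, beginning at `start`.
    """
    found = []

    def walk(node, path):
        for nxt in graph.get(node, ()):
            if nxt == start:
                found.append(path[:])          # the path closes back on start
            elif nxt not in path and len(path) < max_len:
                walk(nxt, path + [nxt])        # extend the path by one link

    walk(start, [start])
    return found

# Hypothetical invoicing graph: E exchanges invoices with X (a 2-loop)
# and takes part in the 3-loop E -> Y -> Z -> E; W only invoices E.
graph = {"E": ["X", "Y"], "X": ["E"], "Y": ["Z"], "Z": ["E"], "W": ["E"]}

loops_e = loops_through(graph, "E")
loops_w = loops_through(graph, "W")
```

Comparing the number of loops found for known EFOS with the number found for other contributors is one simple way to quantify an overrepresentation of this kind.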
We also used two methods to classify suspicious contributors: deep neural networks and random forests. We trained each method on part of the 10K list and tested it on the rest; each achieved an accuracy above 0.9. We then applied each classifier to the complete dataset of contributors, and each flagged more than 100K suspects. More than 40K suspects were flagged by both methods. We further reduced the number of suspects by focusing on those at a close network distance from known EFOS. This allowed us to produce a list of highly suspicious contributors, sorted by the amount of tax evaded, for the authorities to investigate. With our methods, we identified previously undetected tax evasion estimated to be on the order of USD 10 billion per year by about 10K contributors.
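The filtering steps described here (keep suspects flagged by both classifiers, keep those close to known EFOS, sort by amount) can be sketched in Python as follows. This is only an illustration under stated assumptions: the classifier outputs are given as plain sets, distance is measured in hops on an undirected neighbor map, the cutoff `max_dist=2` is arbitrary, and the names `distances_from` and `shortlist` and all sample values are hypothetical.

```python
from collections import deque

def distances_from(sources, neighbors, max_dist):
    """Multi-source breadth-first search: hops to the nearest source, up to max_dist."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        if dist[node] == max_dist:
            continue                            # do not explore beyond the cutoff
        for nxt in neighbors.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def shortlist(suspects_a, suspects_b, known, neighbors, amount, max_dist=2):
    """Suspects flagged by both methods, within max_dist of a known evader,
    sorted by estimated amount in descending order."""
    both = (set(suspects_a) & set(suspects_b)) - set(known)
    dist = distances_from(known, neighbors, max_dist)
    near = [c for c in both if c in dist]
    return sorted(near, key=lambda c: amount.get(c, 0.0), reverse=True)

# Hypothetical toy data: a chain E1 - p - r - s - t, plus q attached to E1.
neighbors = {
    "E1": ["p", "q"], "p": ["E1", "r"], "q": ["E1"],
    "r": ["p", "s"], "s": ["r", "t"], "t": ["s"],
}
known_efos = ["E1"]
flagged_by_nn = {"p", "q", "r", "t"}       # assumed output of the first classifier
flagged_by_rf = {"p", "r", "t", "z"}       # assumed output of the second classifier
estimated_amount = {"p": 10.0, "r": 50.0, "t": 99.0}

ranking = shortlist(flagged_by_nn, flagged_by_rf, known_efos, neighbors, estimated_amount)
```

In this toy case `t` is flagged by both classifiers but lies four hops from the known evader, so it is dropped even though its estimated amount is the largest.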
Full report published in Spanish and publicly available: Gershenson, C., Iñiguez, G., Pineda, C., Guerrero, R., Islas, E., Pineda, O., & Zumaya, M. (2019). Evasión en IVA: Análisis de redes. Estudio contratado por el SAT. http://omawww.sat.gob.mx/gobmxtransparencia/Paginas/documentos/estudio_opiniones/Evasion_en_IVA_Analisis_de_Redes.pdf
See also: Identifying tax evasion in Mexico with tools from network science and machine learning https://arxiv.org/abs/2104.13353