Development economists are harnessing satellite imagery, mobile phone metadata, government tax records and even leaked documents to answer questions that were previously out of reach.
Development economists have more data at their fingertips today than ever before. Research featured on VoxDev has showcased how new data sources – from satellites, phones, leaks and more – are transforming research. In this blog, I highlight a range of VoxDev articles that draw on novel data sources to answer interesting questions.
New insights from mobile phone data
The spread of phones in low-income countries has created vast streams of metadata, enabling economists to fill data gaps where reliable surveys are missing.
During the COVID-19 pandemic, Togo lacked up-to-date poverty data to target emergency mobile money transfers. Emily Aiken, Suzanna Bellue, Joshua Blumenstock, Dean Karlan and Christopher Udry partnered with the government to test different targeting methods for this Novissi cash transfer programme. They found that targeting using mobile phone metadata was more accurate than geographic or occupation-based targeting, making it the most accurate option available under the circumstances. However, they also found that evaluating the programme’s impact with the same mobile data was substantially more challenging than using it to target poverty.
Mobile phone data can also proxy social phenomena that are hard to measure. Oeindrila Dube, Joshua Blumenstock, and Michael Callen document that during Islamic prayer times in Afghanistan, especially the Maghrib prayer window after sunset, phone activity drops sharply as people pause to pray. By analysing an eight-year dataset comprising millions of unique mobile phone numbers, they quantify this ‘Maghrib dip’, a consistent 25% drop in call volume. The size of the dip serves as a behavioural measure of religiosity. Using this novel indicator, they find that violent insurgent attacks led to decreases in religiosity, while income shocks from climate disasters increased it.
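To make the idea of a ‘dip’ concrete, here is a minimal sketch of how a relative drop in call volume could be computed. It assumes the dip is the fall in average calls inside the prayer window relative to a baseline period outside it; the function name and the call counts are hypothetical, and the authors’ actual measure may be constructed differently.

```python
from statistics import mean


def relative_dip(window_counts, baseline_counts):
    """Fractional drop in mean call volume inside a window, relative to a baseline.

    Illustrative assumption: the dip is 1 - (window mean / baseline mean).
    """
    if not window_counts or not baseline_counts:
        raise ValueError("both series must be non-empty")
    baseline = mean(baseline_counts)
    if baseline <= 0:
        raise ValueError("baseline call volume must be positive")
    return 1 - mean(window_counts) / baseline


# Made-up calls per minute, for illustration only.
baseline_calls = [400, 420, 380, 400]  # minutes outside the prayer window
window_calls = [310, 290, 300, 300]    # minutes inside the prayer window

dip = relative_dip(window_calls, baseline_calls)
print(f"Relative dip in call volume: {dip:.0%}")  # prints 25% for these made-up numbers
```

Tracking how a measure like this changes across places and over time is what would allow it to serve as a behavioural indicator.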
Studying people’s movement within countries is challenging for a host of reasons: it is tough to track migrants in panel datasets; most censuses and surveys only record longer-term, more permanent moves; details on origins and destinations are often missing; and travel surveys often focus on commuting rather than non-work trips. Paul Blanchard, Douglas Gollin and Martina Kirchberger develop a new approach using smartphone app data to address these issues and shed light on high-frequency mobility. They use ‘pings’, which record where and when a phone connects to the internet, to show that populations in Nigeria, Kenya and Tanzania are highly mobile, frequently travelling far from home and to cities.
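As a rough illustration of how location pings can be turned into a mobility measure, the sketch below computes the share of a device’s pings recorded more than a chosen distance from its home location. The home coordinates, the pings, the 10 km threshold and the function names are all hypothetical; the authors’ own definitions of ‘home’ and of a trip may differ.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def share_far_from_home(pings, home, threshold_km):
    """Share of pings located more than threshold_km away from the home point."""
    if not pings:
        raise ValueError("need at least one ping")
    far = sum(1 for lat, lon in pings if haversine_km(home[0], home[1], lat, lon) > threshold_km)
    return far / len(pings)


# Hypothetical device: home in one city, with one ping in a distant city.
home = (6.52, 3.38)
pings = [(6.52, 3.38), (6.53, 3.39), (9.06, 7.49)]

print(share_far_from_home(pings, home, threshold_km=10))
```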
How AI is unlocking data: Policy analysis with Large Language Models
Governments produce volumes of policy documents, laws, and regulations that contain valuable information, but take a very long time to process. Now, with advances in large language models (LLMs), it’s easier for economists to convert text into data. On VoxDev, we have recently featured cutting-edge research where LLMs help researchers classify and analyse industrial policies.
Reka Juhász, Nathan Lane, Emily Oehlsen and Verónica C. Pérez (2025) set out to measure industrial policy systematically by analysing the text of policy announcements. They use LLMs to classify thousands of policy documents from the Global Trade Alert database according to whether each policy is “industrial policy” in intent, and they have made the resulting dataset of industrial policy interventions by country and year publicly available online. Their results reveal a surge in industrial policy worldwide since the 2010s, especially in advanced economies.
Another example zeroes in on China’s industrial policies. Using over three million government documents from 2000–2022, Hanming Fang, Ming Li and Guangli Lu guided an LLM to identify which documents contained industrial policies and to extract structured details from them. They ended up with a dataset of almost 800,000 industrial policy records, which yielded four key facts about China’s industrial policy.
A bird's-eye view of development: How satellite imagery has opened new frontiers
One of the most striking recent advances has come from satellites. Satellite data may no longer be considered particularly groundbreaking, but researchers continue to innovate in how it can be used.
Remotely sensed data offers a powerful way to observe places where on-the-ground data is scarce. For example, night-time lights captured by satellites have become a widely used proxy for economic activity and growth. Dave Donaldson and Adam Storeygard note that satellite data can track everything from urban land use to agriculture and environmental change, overcoming many gaps in official data. Satellites are also illuminating the process of urbanisation. Gordon Hanson and Amit Khandelwal show how daytime and nightlight imagery complement each other when identifying and tracking urban markets from space.
More recently, Dev Patel joined the VoxDev podcast to discuss how he combined machine learning with satellite imagery to map floods in Bangladesh, creating a detailed flood exposure database where traditional sources fall short. This allows Patel to measure how floods impact local economies and how households adapt over time. His work finds persistent reductions in night-light activity (a proxy for economic activity) up to seven years after a flood, as well as changes in occupations and schooling as people adjust.
Administrative data and tax records
Researchers are increasingly working with governments to access their administrative data. For example, VoxDev has featured several studies leveraging administrative tax data – especially VAT (Value-Added Tax) records – to understand firm-to-firm trade and tax evasion.
In Rwanda, the government introduced electronic invoicing (e-invoicing) to improve revenue mobilisation. Christos Kotsogiannis, Luca Salvadori, John Karangwa and Innocente Murasi use Rwanda’s tax authority data on VAT filings from audits conducted between 2012 and 2019 to measure the impact of this digital reform. They find that e-invoicing can increase tax compliance if designed and implemented efficiently and accompanied by an effective tax audit strategy.
Another example is a study of India’s VAT system and its effect on supply chains. Lucie Gadenne and Roland Rathelot use administrative tax data covering 180,000 firms in the Indian state of West Bengal to map who trades with whom. They document a striking pattern: firms’ tax status (VAT-registered or not) strongly predicts their trading partners. VAT-paying firms mostly trade with each other, while non-VAT firms stick to the informal network, effectively segmenting supply chains.
One organisation helping to encourage more of this type of research is SA-TIED, which provides researchers with access to comprehensive anonymised tax data in the Southern African region. More details here: Southern Africa – Towards Inclusive Economic Development.
Learning about tax evasion from leaks and unconventional sources
Some data comes to light in unexpected ways. Data leaks and new survey methods have become a valuable resource for economists studying illicit behaviour.
Juliana Londoño-Vélez matched tax data from Colombia with the Panama Papers microdata, the information leaked from the offshore services firm Mossack Fonseca. Merging leaked offshore account data with official tax filings enabled Londoño-Vélez to study the impacts of wealth taxes. Her results highlight how hard enforcement is when wealthy individuals underreport their wealth, but also show that disclosure incentives and greater enforcement can improve tax collection.
Economists have also become creative with survey methods, which can be designed to elicit sensitive information such as tax evasion. Christopher Hoy, Filip Jolevski and Anthony Obeyesekere used a double list experiment in surveys of Indonesian firms to measure tax evasion indirectly. They find that about 25% of formal firms self-report evading taxes, a figure far higher than what official audits alone suggest.
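For readers unfamiliar with list experiments, the sketch below shows the standard textbook estimator for a double list design. Respondents report only how many items on a list apply to them; each group sees the sensitive item on one of two lists, and the prevalence estimate is the average of the two differences in mean counts. The function name and the responses are made up for illustration, and this is not necessarily the exact specification used in the study.

```python
from statistics import mean


def double_list_estimate(g1_list_a, g1_list_b, g2_list_a, g2_list_b):
    """Estimate the prevalence of a sensitive item from a double list experiment.

    Assumed design (illustrative):
      - Group 1 sees list A WITH the sensitive item and list B WITHOUT it.
      - Group 2 sees list A WITHOUT the sensitive item and list B WITH it.
    Each argument is a list of item counts reported by respondents.
    """
    est_a = mean(g1_list_a) - mean(g2_list_a)  # treated minus control on list A
    est_b = mean(g2_list_b) - mean(g1_list_b)  # treated minus control on list B
    return (est_a + est_b) / 2


# Made-up item counts, for illustration only.
group1_a = [2, 3, 2, 3]  # list A with the sensitive item
group1_b = [2, 2, 2, 2]  # list B without it
group2_a = [2, 2, 2, 3]  # list A without it
group2_b = [3, 2, 2, 2]  # list B with the sensitive item

print(double_list_estimate(group1_a, group1_b, group2_a, group2_b))  # 0.25 for these made-up numbers
```

Because no respondent ever states directly whether the sensitive item applies to them, the design can reduce the incentive to misreport.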
Demographic and Health Surveys: Old but gold - and hopefully saved by donors!
Traditional household surveys remain vital. The Demographic and Health Surveys (DHS) in particular are an invaluable resource that has been used countless times in the research featured on VoxDev.
Here I’ve highlighted some of the key policy insights researchers have garnered from this vital source of data:
- Cash transfers reduce adult and child mortality rates in low- and middle-income countries: Aaron Richterman and Harsha Thirumurthy.
- Having a firstborn daughter - as opposed to a son - significantly shapes the trajectory of a woman’s life in Africa: Garance Genicot and Maria Hernandez-de-Benito.
- Electronic waste dumping is causing a health crisis in Ghana and Nigeria, claiming the lives of newborns and infants living nearby: Stefania Lovo and Sam Rawlings.
- Approximately 1.5 million lives have been saved by Gavi’s vaccine funding: Kartini Shastry and Daniel Tortorice.
Unfortunately, the US government recently announced it was terminating funding for the DHS. If you want to learn more about the importance of this data source, make sure to check out Saloni Dattani’s recent piece on Our World in Data. I also want to highlight Wilson King’s new research, which finds that when DHS surveys started covering a new country, published economic research on that country increased significantly, by 7 percentage points.
Better data, better policy
Research on VoxDev paints an exciting picture of how economists are innovating with data to understand development. The toolkit now includes satellites that monitor night lights and floods, mobile phone metadata, administrative records, leaked records that expose hidden wealth, LLMs that read policy texts, and more.
As this blog has shown, there are many ways better data can inform policymakers and directly improve policy. Another particularly clear example comes from Muhammad Haseeb and Kate Vyborny’s research in Pakistan, which shows that data-driven targeting made social assistance more pro-poor and more politically sustainable, while reducing favouritism.
To learn more about the data being used in the field of macro development, check out this recent virtual STEG course: Data in Macro Development.