Going Beyond Earth Engine: Scalable Downloads of Google’s 2.5D Building Dataset
Google’s Open Buildings dataset has become a powerful resource for understanding urban form and built-up environments at a global scale. Its latest 2.5D version adds another layer of insight, literally, by including building height, count, and presence bands, downloadable at a resolution of 0.5 meters. These rich spatial layers are invaluable for analysing patterns of urban growth, inequality, infrastructure access, and more.

However, working with this dataset at the national or multi-country scale presents a major challenge: Google Earth Engine (GEE) export limits. While GEE is excellent for visualisation and targeted analysis, it becomes impractical for downloading gigabytes or even terabytes of high-resolution raster tiles across large territories. Researchers often hit quota ceilings, timeout errors, or file size restrictions that make it difficult to extract complete data for countries or regions.
To overcome this, we created a simple but powerful tool as part of the DEPRIMAP project – a Python Jupyter notebook that allows users to download the entire Google 2.5D building dataset for one or more countries and for a year of their choice, without relying on Earth Engine at all. By using publicly available .tif links provided by Google, this workflow enables a smooth, repeatable, and scalable download process.
GitHub repository link: https://github.com/saiga143/google-2.5d-bulk-download

Challenges with GEE
Google Earth Engine (GEE) is a powerful cloud-based platform widely used by researchers and analysts for working with geospatial datasets. It offers access to planetary-scale data archives and enables scripting, visualisation, and analysis in a browser environment. However, when it comes to exporting high-resolution data over large areas, it quickly runs into limitations.
For the 2.5D buildings dataset, these limitations become especially apparent:
- Export size constraints: GEE restricts the size of exports (both in terms of pixel count and total file size). National-level exports for countries with dense urban coverage often exceed these limits.
- Quota and task failures: Users are limited in the number of concurrent and total export tasks. Long-running exports are prone to timeout or failure, particularly when attempting to mosaic or clip thousands of tiles at once.
- Manual processing burden: Even if the export technically works, the user must often script complex batch processing workflows to tile the exports, manage retries, and download them manually.
- Lack of control over storage: GEE exports must be staged to Google Drive (in most cases) or Google Cloud, which adds another layer of configuration and often leads to syncing problems and storage limits for large file sets.
This creates a bottleneck for projects that require complete, offline access to building data across entire countries or multiple regions – for example, comparative studies of urban inequality, morphological change analysis, or slum detection models operating at a national scale.
Recognising this gap, we came across a Google Cloud Storage bucket that hosts this data and developed a script that bypasses GEE completely, using direct download links hosted by Google instead. These links point to the raw .tif files, already split and organised for global access. All that’s needed is a streamlined way to fetch them at scale, which is exactly what our notebook does.
How to use the notebook
Once you have downloaded the repository, extract the zipped URL folders and combine their contents into a single folder named ‘urls’. From there, using the notebook is straightforward. The script is written entirely in Python and runs within a Jupyter notebook environment, making it easy to inspect, modify, and execute step by step.
Here’s a quick walkthrough of the key components:
a) Set the country codes and year
The first step is to specify which countries and which year you want to download data for.

In the code:
selected_countries = ["KEN", "BRA"]
target_year = "2023"
- Each country should be entered as its ISO3 code (you can refer to the ‘country_codes.csv’ file in the repo).
- The year must be between 2016 and 2023, inclusive (the period for which data is available).
So, if you want to download data for Kenya and Brazil for the year 2023, you enter ‘KEN’ and ‘BRA’ in selected_countries and ‘2023’ in target_year, as shown in the code snippet above.
Set valid paths for the url_folder and output_base variables in the code (as shown in the example image above).
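As a rough illustration, the settings can be sanity-checked before any download starts. The helper below is not part of the notebook: `validate_settings` is a hypothetical name, and the check only tests the shape of each code (three uppercase letters), not whether it appears in ‘country_codes.csv’.

```python
import re

# Example values; the variable names mirror the notebook's settings.
selected_countries = ["KEN", "BRA"]
target_year = "2023"

def validate_settings(countries, year):
    """Return a list of problems found in the settings (empty if none).

    Hypothetical helper, not taken from the notebook.
    """
    problems = []
    for code in countries:
        # ISO3 codes are exactly three uppercase ASCII letters.
        if not re.fullmatch(r"[A-Z]{3}", code):
            problems.append(f"'{code}' is not a three-letter ISO3 code")
    # The dataset covers 2016 to 2023, both years included.
    if not (re.fullmatch(r"\d{4}", year) and 2016 <= int(year) <= 2023):
        problems.append(f"year '{year}' is outside 2016-2023")
    return problems

for problem in validate_settings(selected_countries, target_year):
    print("Check your settings:", problem)
```

Running a check like this first means a typo such as ‘Kenya’ instead of ‘KEN’ is caught immediately rather than partway through a long run.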
b) Understand the output structure
The notebook creates a sub-folder for each country-year combination, and stores all downloaded .tif files there.

This means you get a clearly organised directory with no overlaps, and the script automatically skips tiles that have already been downloaded, making it safe to re-run or resume after interruptions.
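A minimal sketch of how such a layout and skip check could work is given below. The folder naming (`<ISO3>_<year>`) and the function names are assumptions made for illustration; the notebook’s actual naming may differ.

```python
from pathlib import Path

def tile_destination(output_base, country, year, url):
    """Build the local path for one tile.

    Assumption: one sub-folder per country-year pair, named '<ISO3>_<year>'.
    """
    folder = Path(output_base) / f"{country}_{year}"
    # Use the last segment of the URL as the file name.
    filename = url.rstrip("/").rsplit("/", 1)[-1]
    return folder / filename

def needs_download(path):
    """Return True unless a non-empty file already exists at path."""
    path = Path(path)
    return not (path.exists() and path.stat().st_size > 0)
```

Treating empty files as missing is a design choice in this sketch: it means a zero-byte file left behind by an interrupted run is fetched again on the next run.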
c) Where do the URLs come from?
You don’t need to hunt for links manually. Each country’s tile URLs are already listed in text files inside the ‘urls’ folder.

Each line in these .txt files is a direct HTTPS link to a raster tile hosted by Google.
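Reading such a file takes only a few lines of Python. The sketch below is illustrative rather than the notebook’s own code: `read_tile_urls` is a hypothetical name, and the path you pass in depends on how the .txt files in your ‘urls’ folder are named.

```python
from pathlib import Path

def read_tile_urls(txt_path):
    """Return the HTTPS links from one URL file, in order, without duplicates."""
    urls = []
    seen = set()
    for line in Path(txt_path).read_text(encoding="utf-8").splitlines():
        url = line.strip()
        # Skip blank lines, anything that is not an HTTPS link, and repeats.
        if url.startswith("https://") and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```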
d) Run the notebook
Once everything is set, run the notebook cell by cell. The download process uses progress bars (via tqdm) to show status as tiles are fetched.

The download logic ensures:
- Already downloaded tiles are skipped.
- Errors (e.g., missing URLs) are printed, but do not crash the script.
- You can stop and restart the notebook at any point.
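The three behaviours above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the notebook’s implementation: the function names are hypothetical, the progress bar is left out, and the temporary ‘.part’ file is a design choice of this sketch.

```python
import shutil
import urllib.request
from pathlib import Path

def fetch_to_file(url, dest):
    """Stream one URL to disk via a temporary '.part' file."""
    part = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url, timeout=60) as response, open(part, "wb") as out:
        shutil.copyfileobj(response, out)
    # Rename only after the download has completed.
    part.replace(dest)

def download_tiles(urls, folder, fetch=fetch_to_file):
    """Download each URL into folder; return (downloaded, skipped, failed)."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    downloaded = skipped = failed = 0
    for url in urls:
        dest = folder / url.rsplit("/", 1)[-1]
        if dest.exists():
            # Already on disk from an earlier run.
            skipped += 1
            continue
        try:
            fetch(url, dest)
            downloaded += 1
        except Exception as error:
            # Report the problem and carry on with the next tile.
            print(f"Failed: {url} ({error})")
            failed += 1
    return downloaded, skipped, failed
```

Because a tile only receives its final name once it is complete, stopping the run mid-download leaves a ‘.part’ file rather than a truncated .tif, so the next run fetches that tile again. Wrapping `urls` in `tqdm(urls)` would add a progress bar of the kind the notebook shows.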
At the end, you’ll have complete .tif file sets for each selected country and year, stored locally and ready for analysis. The data is downloaded at a resolution of 0.5 meters. These are three-band rasters with the layers:
- fractional_building_count – fractional building count per pixel.
- building_height – estimated building height per pixel.
- building_presence – confidence layer ranging between 0 and 1.
Who is this for & What to keep in mind
The notebook is designed for anyone working with the Google 2.5D Open Buildings dataset at the national or multi-country scale. If you’ve struggled with GEE limits or just want to build a local archive of the data for your region of interest, this tool is for you.
It can be particularly useful for:
- Researchers studying urban change, informal settlements, infrastructure, or building morphology.
- Development organisations working in multiple countries needing consistent, high-resolution building data.
- Students or analysts working in settings with limited cloud compute access.
- GIS practitioners who want raw .tif files they can mosaic, analyse, and share offline.
⚠️ A few practical considerations
While the notebook is flexible and robust, keep these limitations in mind:
- Storage size: Some countries (especially large, urbanised ones like India or Nigeria) may require 200-400 GB of space or more.
- Internet bandwidth: The script downloads hundreds or thousands of tiles per country, so a stable connection is essential.
- No preprocessing: The notebook only handles downloading. Mosaicking, visualisation, or analysis is left to the user’s workflow.
Despite these caveats, the notebook offers a simple, reproducible, and scalable solution to a long-standing challenge: getting high-resolution global building data out of the cloud and into your hands.
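On the storage point, it can help to check free disk space before starting a large country. The snippet below is a small sketch using Python’s standard library; `has_free_space` is a hypothetical helper, and the 400 GB figure in the usage lines is simply the upper end of the range quoted above, not a measured size.

```python
import shutil

def has_free_space(path, required_gb):
    """Return True if the drive holding path has at least required_gb free."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return free_gb >= required_gb

# Example: warn before downloading a large country into the current folder.
if not has_free_space(".", 400):
    print("Less than 400 GB free here; consider another drive.")
```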
Closing Thoughts
This tool was born out of necessity. While working on the DEPRIMAP project, which involves large-scale spatial analysis of deprived urban areas, we quickly realised that working with high-resolution building datasets at the country scale was nearly impossible through conventional GEE workflows. We needed something reliable, lightweight, and offline-capable.
The notebook presented here is our response to that need. By leveraging Google’s public hosting of .tif files and automating the download process, we hope this tool enables researchers, students, and practitioners to build consistent datasets across countries, without technical bottlenecks or quota frustrations.
Whether you’re building models of urban form, analysing slum morphology, or just exploring new geospatial methods, we believe this tool can help make your workflow a little smoother.
🔗 GitHub Repository
https://github.com/saiga143/google-2.5d-bulk-download