{"id":1370,"date":"2025-07-15T15:36:25","date_gmt":"2025-07-15T15:36:25","guid":{"rendered":"https:\/\/sola.kau.se\/deprimap\/?p=1370"},"modified":"2025-07-15T16:00:34","modified_gmt":"2025-07-15T16:00:34","slug":"google-25d-download","status":"publish","type":"post","link":"https:\/\/sola.kau.se\/deprimap\/2025\/07\/15\/google-25d-download\/","title":{"rendered":"Going Beyond Earth Engine: Scalable Downloads of Google’s 2.5D Building Dataset"},"content":{"rendered":"
Google’s Open Buildings dataset has become a powerful resource for understanding urban form and built-up environments at a global scale. Its latest 2.5D version<\/a> adds another layer of insight, literally, by including building height, count, and presence layers, downloadable as raster bands at a resolution of 0.5 meters. These rich spatial layers are invaluable for analysing patterns of urban growth, inequality, infrastructure access, and more.<\/p>\n\n\n\n However, working with this dataset at the national or multi-country scale presents a major challenge: Google Earth Engine (GEE) export limits<\/strong>. While GEE is excellent for visualisation and targeted analysis, it becomes impractical for downloading gigabytes or even terabytes of high-resolution raster tiles across large territories. Researchers often hit quota ceilings, timeout errors, or file size restrictions that make it difficult to extract complete data for countries or regions.<\/p>\n\n\n\n To overcome this, we created a simple but powerful tool as part of the DEPRIMAP project – a Python Jupyter notebook that allows users to download the entire Google 2.5D building dataset for one or more countries and for a year of their choice, without relying on Earth Engine at all. By using publicly available .tif links provided by Google, this workflow enables a smooth, repeatable, and scalable download process.<\/p>\n\n\n\n GitHub repository link:<\/strong> https:\/\/github.com\/saiga143\/google-2.5d-bulk-download<\/a><\/p>\n\n\n\n Google Earth Engine (GEE) is a powerful cloud-based platform widely used by researchers and analysts for working with geospatial datasets. It offers access to planetary-scale data archives and enables scripting, visualisation, and analysis in a browser environment. 
However, when it comes to exporting high-resolution data over large areas<\/strong>, it quickly runs into limitations.<\/p>\n\n\n\n For the 2.5D buildings dataset, these limitations become especially apparent:<\/p>\n\n\n\n This creates a bottleneck for projects that require complete, offline access to building data across entire countries or multiple regions – for example, comparative studies of urban inequality, morphological change analysis, or slum detection models operating at a national scale.<\/p>\n\n\n\n Recognising this gap, we found a Google Cloud Storage bucket that hosts this data and developed a script that bypasses GEE entirely, using direct download links hosted by Google instead. These links point to the raw .tif files, already split and organised for global access. All that’s needed is a streamlined way to fetch them at scale, which is exactly what our notebook does.<\/p>\n\n\n\n Once you have downloaded the repository, extracted the zipped URL folders, and combined their contents into a single folder named ‘urls’, using the notebook is straightforward. 
The script is written entirely in Python and runs within a Jupyter notebook environment, making it easy to inspect, modify, and execute step-by-step.<\/p>\n\n\n\n Here’s a quick walkthrough of the key components:<\/p>\n\n\n\n a) Set the country codes and year<\/strong><\/p>\n\n\n\n The first step is to specify which countries and which year you want to download data for.<\/p>\n\n\n\n In the code:<\/p>\n\n\n\n For example, to download data for Kenya and Brazil for the year 2023, put ‘KEN’ and ‘BRA’ in selected_countries and set target_year to ‘2023’, as shown in the above code snippet.<\/p>\n\n\n\n Set the url_folder and output_base variables in the code to valid paths on your machine (as shown in the example image above).<\/p>\n\n\n\n b) Understand the output structure<\/strong><\/p>\n\n\n\n The notebook creates a sub-folder for each country-year combination, and stores all downloaded .tif files there.<\/p>\n\n\n\n This means you get a clearly organised directory with no overlaps, and the script automatically skips tiles that have already been downloaded<\/strong>, making it safe to re-run or resume after interruptions.<\/p>\n\n\n\n c) Where do the URLs come from?<\/strong><\/p>\n\n\n\n You don’t need to hunt for links manually: each country’s tile URLs are already listed in text files like:<\/p>\n\n\n\n Each line in these .txt files is a direct HTTPS link to a raster tile hosted by Google.<\/p>\n\n\n\n d) Run the notebook<\/strong><\/p>\n\n\n\n Once everything is set, run the notebook cell by cell. The download process uses progress bars (via tqdm) to show status as tiles are fetched.<\/p>\n\n\n\n The download logic ensures:<\/p>\n\n\n\n At the end, you’ll have complete .tif file sets for each selected country and year, stored locally and ready for analysis. The data is downloaded at a resolution of 0.5 meters. 
These are three-band rasters with the layers:<\/p>\n\n\n\n The notebook is designed for anyone working with the Google 2.5D Open Buildings dataset at the national or multi-country scale. If you’ve struggled with GEE limits or just want to build a local archive of the data for your region of interest, this tool is for you.<\/p>\n\n\n\n It can be particularly useful for:<\/p>\n\n\n\n \u26a0\ufe0f A few practical considerations<\/strong><\/p>\n\n\n\n While the notebook is flexible and robust, keep these limitations in mind:<\/p>\n\n\n\n Despite these caveats, the notebook offers a simple, reproducible, and scalable solution to a long-standing challenge: getting high-resolution global building data out of the cloud and into your hands.<\/p>\n\n\n\n This tool\/script was born out of necessity. While working on the DEPRIMAP project, which involves large-scale spatial analysis of deprived urban areas, we quickly realised that working with high-resolution building datasets at the country scale was nearly impossible through conventional GEE workflows. We needed something reliable, lightweight, and offline-capable.<\/p>\n\n\n\n The notebook presented here is our response to that need. By leveraging Google’s public hosting of .tif files and automating the download process, we hope this tool enables researchers, students, and practitioners to build consistent datasets across countries, without technical bottlenecks or quota frustrations.<\/p>\n\n\n\n Whether you’re building models of urban form, analysing slum morphology, or just exploring new geospatial methods, we believe this tool can help make your workflow a little smoother.<\/p>\n\n\n\n \ud83d\udd17 GitHub Repository<\/strong><\/p>\n\n\n\n https:\/\/github.com\/saiga143\/google-2.5d-bulk-download<\/a><\/p>\n\n\n\n
<\/figure>\n\n\n\n
<\/figure>\n\n\n\n
\n\n\n\nChallenges with GEE<\/strong><\/h3>\n\n\n\n
\n
\n\n\n\nHow to use the Notebook?<\/strong><\/h3>\n\n\n\n
<\/figure>\n\n\n\nselected_countries = [\"KEN\", \"BRA\"]\ntarget_year = \"2023\"<\/strong><\/code><\/pre>\n\n\n\n\n
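As a rough illustration of the skip-and-resume behaviour described in this walkthrough, here is a minimal sketch that uses only the Python standard library. It is not the notebook's actual code: the function names read_urls and download_tiles, the ".part" temporary suffix, and the file and folder naming in the commented usage lines are assumptions made for illustration, and the tqdm progress bar is left out.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve


def read_urls(url_file):
    """Return the non-empty lines of a URL list file, one link per line."""
    with open(url_file, encoding="utf-8") as fh:
        return [line.strip() for line in fh if line.strip()]


def download_tiles(urls, out_dir, fetch=urlretrieve):
    """Download each URL into out_dir, skipping files that already exist.

    fetch(url, destination) defaults to urllib's urlretrieve; it is a
    parameter so the loop can be tried without network access.
    Returns a (downloaded, skipped) tuple of counts.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    downloaded = skipped = 0
    for url in urls:
        # Use the last path component of the URL as the local file name.
        destination = out_dir / Path(urlparse(url).path).name
        if destination.exists():
            skipped += 1
            continue
        # Write to a temporary name first so an interrupted transfer
        # is not mistaken for a finished tile on the next run.
        partial = destination.with_name(destination.name + ".part")
        fetch(url, str(partial))
        partial.replace(destination)
        downloaded += 1
    return downloaded, skipped


# Example wiring (hypothetical file and folder names, not the notebook's):
# for country in selected_countries:
#     urls = read_urls(Path(url_folder) / f"{country}_{target_year}.txt")
#     download_tiles(urls, Path(output_base) / f"{country}_{target_year}")
```

Because a tile only receives its final name after the transfer completes, re-running the loop fetches just the tiles that are missing, which is one simple way to get the safe-to-resume property mentioned in this post.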
<\/figure>\n\n\n\n
<\/figure>\n\n\n\n
<\/figure>\n\n\n\n\n
\n
\n\n\n\nWho is this for & What to keep in mind<\/strong><\/h3>\n\n\n\n
\n
\n
\n\n\n\nClosing Thoughts<\/strong><\/h3>\n\n\n\n