When: 19 July 2025

Where: Vancouver, Canada

Official workshop schedule on ICML website: ICML.cc - TerraBytes 2025

Submission website: OpenReview - TerraBytes 2025

About the workshop

Earth Observation (EO) presents unique challenges and opportunities that set it apart from other fields of machine learning (ML) and computer vision (CV) (Rolf et al., 2024). EO data is abundant, repeatedly covering a large but limited environment: our planet. The colocation and evolution of these observations form a rich, emergent source of information that is multimodal and multitemporal in nature. However, due to both local and global changes, especially climate change, the statistical distribution of EO data is inherently non-stationary (Tarasiou et al., 2023). These properties break some of the usual assumptions of ML, so EO data requires special care in data handling and modeling (Mai et al., 2022). In addition, current EO datasets are sampled with spatio-temporal biases. Some areas, e.g., the Global South, are strongly under-represented within EO datasets (Cornebise et al., 2022). In optical imagery, cloud cover is undesirable, which leads datasets to remove cloudy images at the risk of biasing their geographical coverage (Tiede et al., 2021). Addressing these distributional biases is of primary importance, as they affect the performance and reliability of models for downstream applications in ecology, geosciences, agriculture, urban planning, etc. (Kattenborn et al., 2022).

TerraBytes is an initiative to address these challenges. At the intersection of data curation, data archiving, and representation learning, this workshop will foster a holistic discussion covering the major steps of the EO pipeline, from downlinked satellite data through training paradigms to downstream applications.

Call for submissions

We invite submissions on the following topics:

Submission process

Submission to the TerraBytes workshop is double-blind. The workshop accepts short research papers (4 pages, excluding references) and full-length papers (8 pages, excluding references). Authors are required to follow the standard ICML paper format (LaTeX style files). Short research papers can describe work in progress, opinion pieces, or datasets, as well as research papers relevant to TerraBytes’ topics of interest that have already been published in another venue (conference or journal) in the last 6 months. For already published works, please submit a summarized four-page short paper. We are especially looking to broadcast journal papers to a larger audience during the workshop.

All submissions should be submitted through OpenReview before the deadline (see important dates below).


Note that the workshop will not publish proceedings under the ICML conference. However, authors of accepted papers will be able to opt in to a special issue in PMLR composed of the accepted full-length papers. Please note that accepted papers will not go through another reviewing process to be published in the PMLR volume. Also, all papers accepted for the PMLR proceedings are expected to be original contributions (i.e. no dual submissions are allowed). All submissions will be reviewed by at least two non-conflicting reviewers. All papers will be presented during a dedicated poster session. Full-length papers and selected short papers will also be presented as spotlight presentations throughout the day.

It is expected that at least one author of each accepted paper will register for the workshop and present the article during the event. Online presentations will be considered for presenters who are unable to attend, e.g. due to visa or personal constraints. If you have any questions, do not hesitate to contact the organizers (see contacts below).

Important dates


Program


Morning session

Afternoon session

Speakers


Sujit Roy (NASA IMPACT)

Dr. Sujit Roy works as a Lead AI Researcher and Computer Scientist at NASA’s Interagency Implementation and Advanced Concepts Team (IMPACT), where he leads the development of foundation models for analyzing satellite imagery and enhancing weather forecasting and heliophysics, resulting in multiple practical scientific applications. Prior to his tenure at NASA IMPACT, Dr. Roy contributed to the field of Explainable AI at the University of Manchester. He received his PhD (UKIERI fellowship) in Computer Science from Ulster University in collaboration with the Indian Institute of Technology Kanpur, India. In his PhD, he contributed to the domain of Computational Neuroscience by developing algorithms that advance MEG- and EEG-based decoding of motor imagery for practical brain-computer interfaces.

Razvan Cosac (ESA Ground Segment)

Razvan has a background in Earth Sciences and has been working in the Earth observation domain for the past 12 years. He joined the European Space Agency (ESA) in 2013 as a Young Graduate Trainee working as part of the Long Term Data Preservation programme. He then continued his career in industry, working on various ESA projects regarding the processing, consolidation, archiving and dissemination of EO satellite imagery. Razvan rejoined ESA in 2019 as Copernicus Ground Segment Engineer, where he contributed to the Copernicus Space Component (CSC) ESA Earth Observation Framework (EOF) and data management operations, focusing on the data access and dissemination activities. Razvan is currently managing several ESA CSC EOF service operations contracts in this domain, the largest and most impactful one being the coordination of the Copernicus Data Space Ecosystem (CDSE).

Erin Trochim (University of Alaska Fairbanks)

Erin Trochim is a geospatial data scientist developing better data and decisions for energy and its uses in the north. She values a collaborative approach to research. During 2022, she was a project lead for the University of Washington’s Data Science for Social Good program and is a co-investigator for the NSF Research Experience for Undergraduates at ACEP. PredictFest, a hackathon event founded by Trochim, was most recently supported by NSF’s Navigating the New Arctic initiative. She was an inaugural Google Cloud Research Innovator in 2021. Trochim earned a Ph.D. in remote sensing and hydrology from UAF, supported by a NASA Earth and Space Science graduate fellowship. Her postdoctoral experience included producing permafrost information for policy application through the NSF SEARCH program with the Alaska Climate Adaptation Science Center. Present energy-related projects include the Railbelt Decarbonization study, the Arctic Energy Atlas, and creating environmental data for marine and hydrokinetic applications. Trochim is from Whitehorse, Yukon, Canada. Winter is her favorite season; her huskies keep her active with skijoring long distances, which helps during masters ski club training.

Accepted papers

Long papers

Label-Efficient Hyperspectral Image Classification via Spectral FiLM Modulation of Low-Level Pretrained Diffusion Features
Yuzhen Hu, Biplab Banerjee, Saurabh Prasad

Smoothing Continual Segmentation Oscillations with Latent Domain PPCA Decoder
Marie-Ange Boum, Pierre Fournier, Dawa Derksen, Stéphane Herbin

High-Resolution LFMC Maps for Wildfire Risk From Multimodal Earth Observation Data
Patrick Alan Johnson, Gabriel Tseng, Yawen Zhang, Heather Heward, Virginia Sjahli, Favyen Bastani, Joseph Redmon, Patrick Beukema

Optimizing Cloud-to-GPU Throughput for Deep Learning With Earth Observation Data
Akram Zaytar, Caleb Robinson, Girmaw Abebe Tadesse, Tammy Glazer, Gilles HACHEME, Anthony Ortiz, Rahul M Dodhia, Juan M Lavista Ferres

Shaping Fine-Tuning of Geospatial Foundation Models: Effects of Label Availability and Temporal Resolution
Giovanni Castiglioni, Nicolás Gonzalo Isla Fernández, Cristian Buc Calderon, Javiera Castillo Navarro, Sébastien Lefèvre, Valentin Barriere

The Cloud-Based Geospatial Benchmark: Challenges and LLM Evaluation
Jeffrey A. Cardille, Renee Johnston, Simon Ilyushchenko, Johan Kartiwa, Zahra Shamsi, Matthew Abraham, Khashayar Azad, Kainath Ahmed, Emma Bergeron Quick, Nuala Caughie, Noah Jencz, Karen Dyson, Andrea Puzzi Nicolau, Maria Fernanda Lopez-Ornelas, David Saah, Michael Brenner, Subhashini Venugopalan, Sameera S Ponda

AirCast: Improving Air Pollution Forecasting Through Multi-Variable Data Alignment
Vishal Nedungadi, Muhammad Akhtar Munir, Marc Rußwurm, Ron Sarafian, Ioannis N. Athanasiadis, Yinon Rudich, Fahad Shahbaz Khan, Salman Khan

Resampling Augmentation for Time Series Contrastive Learning: Application to Remote Sensing
Antoine Saget, Baptiste Lafabregue, Antoine Cornuéjols, Pierre Gançarski

Deploying geospatial foundation models in the real world
Christina Butsko, Gabriel Tseng, Kristof Van Tricht, Giorgia Milli, David Rolnick, Ruben Cartuyvels, Inbal Becker Reshef, Zoltan Szantoi, Hannah Kerner

Where are the Whales: A Human-in-the-loop Detection Method for Identifying Whales in High-resolution Satellite Imagery
Caleb Robinson, Kimberly T. Goetz, Christin B. Khan, Meredith Sackett, Kathleen Leonard, Rahul M Dodhia, Juan M Lavista Ferres

Using Multiple Input Modalities can Improve Data-Efficiency and O.O.D. Generalization for ML with Satellite Imagery
Arjun Rao, Esther Rolf

Short papers

High-Performance Lightweight Vision Models for Land Cover Classification with Coresets and Compression
Tushar Shinde

Enhancing Generative Seismic Modeling via Proposed Paired Dataset Construction Method
Jaehyuk Lee, Jaeheun Jung, Hanyoung Kim, Chang-Hae Jung, Donghun Lee

Less is More? Data Specialization for Self-Supervised Remote Sensing Models
Alvard Barseghyan, Ani Vanyan, Hakob Tamazyan, Evan Shelhamer, Hrant Khachatrian

Open-source federated learning across multi cloud environment
Leonardo P. Tizzei, Gabrielle Nyirjesy, Levente J Klein, Theodore van Kessel, Maciel Zortea, Marcus Freitag, ILDAR KHABIBRAKHMANOV, Hendrik Hamann, Kamal Das

A Multimodal Deep Learning Framework for Locating Nomadic Pastoralists to Strengthen Public Health Outreach
Benjamin Liu, Stace Maples, Jessie Kong, Francesco Fava, Nathaniel Jensen, Philemon Chelanga, Sergio Charles, Theodore Utomo, Aritro Chatterjee, Kevin Zhu, James Hassell, Lance W. Robinson, Luke Glowacki, Michele Barry, Hannah Wild

GeoCrossBench: Cross-Band Generalization for Remote Sensing
Hakob Tamazyan, Ani Vanyan, Alvard Barseghyan, Anna Khosrovyan, Evan Shelhamer, Hrant Khachatrian

The Missing Piece: Standardising for AI-ready Earth Observation Datasets
Cesar Aybar, Julio Contreras, Chen Ma, Oscar J. Pellicer-Valero, Gonzalo Mateo-García, Luis Gómez-Chova, Gustau Camps-Valls, Nils Lehmann, Mikolaj Czerkawski, David Montero, Miguel D. Mahecha, Ingrid Aybar

Towards Scalable Foundation Model for Multi-modal and Hyperspectral Geospatial Data
Haozhe Si, Yuxuan Wan, Minh N. Do, Deepak Vasisht, Han Zhao, Hendrik Hamann

Galileo: Learning Global & Local Features of Many Remote Sensing Modalities
Gabriel Tseng, Anthony Fuller, Marlena Reil, Henry Herzog, Patrick Beukema, Favyen Bastani, James R Green, Evan Shelhamer, Hannah Kerner, David Rolnick

End-to-End Reconstruction of High-Resolution Temperature Data Using Physics-Guided Deep Learning
Shengjie Liu, Lu Zhang, Siqin Wang

Domain Adaptation for Onboard Cloud Segmentation in Thermal Earth Observation: Transfer Learning from Landsat to a CubeSat Constellation
Niklas Wölki, Lukas Kondmann, Christian Mollière, Martin Langer, Julia Gottfriedsen, Martin Werner

Landsat-Bench: Datasets and Benchmarks for Landsat Foundation Models
Isaac Corley, Lakshay Sharma, Ruth Crasto

Towards LLM Agents for Earth Observation
Chia Hsiang Kao, Wenting Zhao, Shreelekha Revankar, Samuel Speas, Snehal Bhagat, Rajeev Datta, Cheng Perng Phoo, Utkarsh Mall, Carl Vondrick, Kavita Bala, Bharath Hariharan

People


Organizers


Program committee


Primary Contacts: Valerio Marsocci (valerio.marsocci@esa.int), Nicolas Audebert (nicolas.audebert@ign.fr)

Sponsors


We thank CopernicusLAC for their sponsorship of this event.

CopernicusLAC Chile sponsor
IEEE IADF sponsor