Dataloader test
MARS-S2L test plume examples 
- Last Modified: 26-04-2026
- Author: Gonzalo Mateo-Garcia
Overview
This notebook demonstrates how to inspect validated plume examples from the test split of the MARS-S2L dataset directly from the public Hugging Face repository using the marss2l package.
Specifically, it shows how to:
- Load the public image metadata from the Hugging Face dataset.
- Select the test_2023 split used for held-out model evaluation.
- Build a DatasetPlumes object for analysis without simulation.
- Randomly plot positive plume examples from the test dataset.
Important
- The test split is intended for evaluation and qualitative inspection of held-out plume cases.
- For end-to-end downloading, retrieval, inference, and quantification on a specific scene, use the download_and_inference.ipynb example instead.
- The MARS-S2L database, trained models and tutorials in this package are released under a Creative Commons non-commercial share-alike licence.
Install marss2l package
Install the published package before running the notebook:
pip install marss2l
This notebook uses the marss2l dataset loaders to read metadata and imagery directly from the public Hugging Face repository, so the package must be installed in the active environment first.
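Before running the remaining cells, it can help to confirm that the package is visible in the active environment. The snippet below is a minimal sketch that uses only the Python standard library; the helper name installed_version is hypothetical and is not part of marss2l.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Report whether marss2l can be found in the environment running this notebook.
version = installed_version("marss2l")
if version is None:
    print("marss2l is not installed; run `pip install marss2l` first.")
else:
    print(f"marss2l {version} is installed.")
```

If the message reports a missing package even though pip succeeded, the notebook kernel is probably using a different environment from the one where pip was run.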
import matplotlib
import os
import logging
from marss2l.utils import setup_stream_logger, fs_from_path, pathjoin
logger = logging.getLogger(__name__)
setup_stream_logger(logger, level=logging.DEBUG)
matplotlib.rcParams['mathtext.fontset'] = 'stix'
matplotlib.rcParams['font.family'] = 'STIXGeneral'
from marss2l import loaders
from marss2l.huggingface import CSV_PATH_DEFAULT_HF, CSV_PLUME_PATH_DEFAULT_HF
# path_images = "../data/validated_images_all.csv"
# path_prepend_data = "../data/"
# Load images and plumes from HuggingFace
path_images = CSV_PATH_DEFAULT_HF
fs = fs_from_path(path_images)
dataframe_data_traintest = loaders.read_csv(path_images,
                                            add_columns_for_analysis=False,
                                            fs=fs)
dataframe_data_traintest.shape
from marss2l.loaders import DatasetPlumes
from marss2l import dataframe_image_plumes
split = "test_2023"
mode = "test"
dataframe_image_test, _, _ = dataframe_image_plumes.load_dataframe_split(
    dataframe_or_csv_path=dataframe_data_traintest,
    dataframe_or_csv_path_plumes=None,
    dataframe_or_csv_path_sources=None,
    split=split,
    fs=fs,
    logger=logger,
    all_locs=None,
    load_plumes=False
)
dataset = DatasetPlumes(mode=mode,
                        strprependlogs=split,
                        device="cpu",
                        image_dataframe=dataframe_image_test,
                        do_simulation=False,
                        analysis_mode=True,
                        fs=fs)
dataset.image_dataframe
Plot plumes in the test dataset
import matplotlib.pyplot as plt
import numpy as np
size = 5
# Draw `size` distinct positive (plume) examples from the test split
plume_index = dataset.image_dataframe[dataset.image_dataframe.isplume].index
idx_to_plot = np.random.choice(plume_index, replace=False, size=size)
for _i, idx in enumerate(idx_to_plot):
    fig, axs = dataset.plot_item(dataset[idx], text_prepend=f"{_i+1}/{len(idx_to_plot)}")
    plt.show()
    plt.close(fig)
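The plotting cell draws a different random set of plumes on every run, and sampling without replacement fails when fewer positive examples exist than the requested size. The sketch below shows one way to make the draw reproducible and to clamp the sample size. It uses the standard library on toy data; the helper sample_plume_indices and the toy lists are hypothetical stand-ins for the index and the isplume column of dataset.image_dataframe.

```python
import random
from typing import List, Sequence


def sample_plume_indices(indices: Sequence[int], is_plume: Sequence[bool],
                         size: int, seed: int = 0) -> List[int]:
    """Pick up to `size` distinct indices whose plume flag is True."""
    if len(indices) != len(is_plume):
        raise ValueError("indices and is_plume must have the same length")
    candidates = [idx for idx, flag in zip(indices, is_plume) if flag]
    # Clamp the sample size so a small split does not raise an error.
    k = min(size, len(candidates))
    # A dedicated generator keeps the draw reproducible without touching global state.
    return random.Random(seed).sample(candidates, k)


# Toy stand-in for the dataframe index and its isplume column (assumed values).
toy_indices = [10, 11, 12, 13, 14, 15]
toy_is_plume = [True, False, True, True, False, True]
picked = sample_plume_indices(toy_indices, toy_is_plume, size=3, seed=42)
print(picked)
```

With NumPy, the equivalent reproducible draw is np.random.default_rng(seed).choice(candidates, size=k, replace=False), which also avoids relying on the global random state.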
Licence
The marss2l package is published under the GNU Lesser General Public License v3 (LGPL v3).
The MARS-S2L database and all pre-trained models are released under a Creative Commons non-commercial share-alike licence. Using the models or data in commercial pipelines requires written consent from UNEP IMEO.
marss2l tutorials and notebooks are released under a Creative Commons non-commercial share-alike licence.
If you find this work useful, please cite:
@article{allen_2025,
title = {Artificial intelligence for methane detection: from continuous monitoring to verified mitigation},
author = {Allen, Anna and Mateo-Garcia, Gonzalo and Irakulis-Loitxate, Itziar and Martin, Manuel Montesino-San and Watine, Marc and Requeima, James and Gorroño, Javier and Randles, Cynthia and Mokalled, Tharwat and Guanter, Luis and Turner, Richard E. and Cifarelli, Claudio and Caltagirone, Manfredi},
url = {http://arxiv.org/abs/2511.21777},
doi = {10.48550/arXiv.2511.21777},
month = nov,
year = {2025}
}