Computer Vision and Pattern Recognition 4
♻ ☆ Fully Automated Segmentation of Fiber Bundles in Anatomic Tracing Data MICCAI 2025
Kyriaki-Margarita Bintsi, Yaël Balbastre, Jingjing Wu, Julia F. Lehman, Suzanne N. Haber, Anastasia Yendiki
Anatomic tracer studies are critical for validating and improving diffusion
MRI (dMRI) tractography. However, large-scale analysis of data from such
studies is hampered by the labor-intensive process of annotating fiber bundles
manually on histological slides. Existing automated methods often miss sparse
bundles or require complex post-processing across consecutive sections,
limiting their flexibility and generalizability. We present a streamlined,
fully automated framework for fiber bundle segmentation in macaque tracer data,
based on a U-Net architecture with large patch sizes, foreground-aware
sampling, and semi-supervised pre-training. Our approach eliminates common
errors such as mislabeling terminals as bundles, improves detection of sparse
bundles by over 20%, and reduces the False Discovery Rate (FDR) by 40% compared
to the state-of-the-art, all while enabling analysis of standalone slices. This
new framework will facilitate the automated analysis of anatomic tracing data
at a large scale, generating more ground-truth data that can be used to
validate and optimize dMRI tractography methods.
comment: Accepted at CDMRI, MICCAI 2025
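Foreground-aware sampling matters here because labeled fiber bundles occupy only a small fraction of each histological slide, so uniformly sampled training patches would be dominated by background. A minimal sketch of one common biased-sampling scheme is below; the fraction and the uniform fallback are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def sample_patch_centers(label_map, n_patches, fg_fraction=0.7, rng=None):
    """Pick patch centers, biasing a fraction of draws toward foreground pixels.

    Illustrative sketch of foreground-aware sampling; `fg_fraction` and the
    uniform fallback are assumptions, not the paper's exact scheme.
    """
    rng = np.random.default_rng(rng)
    h, w = label_map.shape
    fg = np.argwhere(label_map > 0)            # coordinates of labeled (bundle) pixels
    centers = []
    for _ in range(n_patches):
        if len(fg) > 0 and rng.random() < fg_fraction:
            idx = rng.integers(len(fg))
            centers.append(tuple(fg[idx]))     # foreground-biased draw
        else:
            centers.append((int(rng.integers(h)), int(rng.integers(w))))  # uniform draw
    return centers
```

With a sparse label map, most sampled patches are still guaranteed to contain foreground, which is the point of the bias.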
♻ ☆ Hyperspectral Image Generation with Unmixing Guided Diffusion Model
We address hyperspectral image (HSI) synthesis, a problem that has garnered
growing interest yet remains constrained by the conditional generative
paradigms that limit sample diversity. While diffusion models have emerged as a
state-of-the-art solution for high-fidelity image generation, their direct
extension from RGB to hyperspectral domains is challenged by the high spectral
dimensionality and strict physical constraints inherent to HSIs. To overcome
these challenges, we introduce a diffusion framework explicitly guided by
hyperspectral unmixing. The approach integrates two collaborative components:
(i) an unmixing autoencoder that projects generation from the image domain into
a low-dimensional abundance manifold, thereby reducing computational burden
while maintaining spectral fidelity; and (ii) an abundance diffusion process
that enforces non-negativity and sum-to-one constraints, ensuring physical
consistency of the synthesized data. We further propose two evaluation metrics
tailored to hyperspectral characteristics. Comprehensive experiments, assessed
with both conventional measures and the proposed metrics, demonstrate that our
method produces HSIs with both high quality and diversity, advancing the state
of the art in hyperspectral data generation.
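The physical constraints the abstract describes, non-negativity and sum-to-one over endmember abundances, together with the linear mixing model, can be sketched as follows. The softmax parameterization is one common way to enforce the abundance simplex; the paper's exact mechanism may differ.

```python
import numpy as np

def to_abundances(logits):
    """Map unconstrained values onto the abundance simplex (non-negative,
    sum-to-one) via a softmax over the endmember axis. An illustrative
    parameterization, not necessarily the paper's."""
    z = logits - logits.max(axis=-1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def reconstruct(abundances, endmembers):
    """Linear mixing model: each pixel spectrum is the abundance-weighted
    sum of endmember spectra."""
    return abundances @ endmembers
```

Generating in this low-dimensional abundance space, then mixing back to full spectra, is what reduces the computational burden relative to diffusing directly over hundreds of spectral bands.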
♻ ☆ Vehicle detection from GSV imagery: Predicting travel behaviour for cycling and motorcycling using Computer Vision
Kyriaki Kokka, Rahul Goel, Ali Abbas, Kerry A. Nice, Luca Martial, SM Labib, Rihuan Ke, Carola Bibiane Schönlieb, James Woodcock
Transportation influences health by shaping exposure to physical activity, air
pollution, and injury risk. Comparative data on cycling and motorcycling
behaviours are scarce, particularly at a global scale. Street view imagery, such
as Google Street View (GSV), combined with computer vision, is a valuable
resource for efficiently capturing travel behaviour data. This study
demonstrates a novel approach using deep learning on street view images to
estimate cycling and motorcycling levels across diverse cities worldwide. We
utilized data from 185 global cities; mode shares of cycling and motorcycling
were estimated using travel surveys or censuses. We used GSV images to
detect cycles and motorcycles in sampled locations, using 8000 images per city.
The YOLOv4 model, fine-tuned using images from six cities, achieved a mean
average precision of 89% for detecting cycles and motorcycles. A global
prediction model was developed using beta regression with city-level mode
shares as the outcome and log-transformed counts of GSV-detected images with
cycles and motorcycles as explanatory variables, while controlling for
population density. We found strong correlations between GSV motorcycle counts
and motorcycle mode share (0.78) and moderate correlations between GSV cycle
counts and cycling mode share (0.51). Beta regression models predicted mode
shares with $R^2$ values of 0.614 for cycling and 0.612 for motorcycling,
achieving median absolute errors (MDAE) of 1.3% and 1.4%, respectively.
Scatterplots demonstrated consistent prediction accuracy, though cities like
Utrecht and Cali were outliers. The model was applied to 60 cities globally for
which we did not have recent mode share data. We provided estimates for some
cities in the Middle East, Latin America and East Asia. With computer vision,
GSV images capture travel modes and activity, providing insights alongside
traditional data sources.
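The mean structure of the beta regression described above can be sketched as a logit-linear model in the log-transformed GSV counts and population density. The coefficients below are hypothetical placeholders for illustration, not the fitted values from the study.

```python
import numpy as np

def predict_mode_share(gsv_count, pop_density, b0=-4.0, b1=0.5, b2=-0.1):
    """Beta-regression mean structure with a logit link:
        logit(mu) = b0 + b1 * log(1 + GSV count) + b2 * log(density)
    Coefficients are hypothetical placeholders, not the study's estimates.
    """
    eta = b0 + b1 * np.log1p(gsv_count) + b2 * np.log(pop_density)
    return 1.0 / (1.0 + np.exp(-eta))   # inverse logit: a mode share in (0, 1)
```

The logit link guarantees predictions stay inside (0, 1), which is why beta regression suits a bounded outcome like mode share better than ordinary least squares.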
♻ ☆ WIPES: Wavelet-based Visual Primitives
Pursuing a continuous visual representation that offers flexible frequency
modulation and fast rendering speed has recently garnered increasing attention
in the fields of 3D vision and graphics. However, existing representations
often rely on frequency guidance or complex neural network decoding, leading to
spectrum loss or slow rendering. To address these limitations, we propose
WIPES, a universal Wavelet-based vIsual PrimitivES for representing
multi-dimensional visual signals. Building on the spatial-frequency
localization advantages of wavelets, WIPES effectively captures both the
low-frequency "forest" and the high-frequency "trees." Additionally, we develop
a wavelet-based differentiable rasterizer to achieve fast visual rendering.
Experimental results on various visual tasks, including 2D image
representation, 5D static and 6D dynamic novel view synthesis, demonstrate that
WIPES, as a visual primitive, offers higher rendering quality and faster
inference than INR-based methods, and outperforms Gaussian-based
representations in rendering quality.
comment: IEEE/CVF International Conference on Computer Vision 2025
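The idea of representing a signal as a weighted sum of spatially and spectrally localized wavelet primitives can be illustrated with a toy 1D analogue. The Ricker wavelet and the rendering loop below are a simplified stand-in; WIPES itself handles multi-dimensional signals through a differentiable rasterizer.

```python
import numpy as np

def ricker(x, center, scale):
    """Ricker ('Mexican hat') wavelet: localized in both space and frequency,
    which is the property WIPES exploits to capture 'forest' and 'trees'."""
    u = (x - center) / scale
    return (1.0 - u**2) * np.exp(-0.5 * u**2)

def render(x, primitives):
    """Render a 1D signal as a weighted sum of wavelet primitives.

    A toy analogue for illustration; each primitive is a
    (weight, center, scale) triple.
    """
    out = np.zeros_like(x, dtype=float)
    for weight, center, scale in primitives:
        out += weight * ricker(x, center, scale)
    return out
```

Broad-scale primitives capture the low-frequency trend while narrow-scale primitives add high-frequency detail, mirroring the spatial-frequency localization the abstract describes.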