Merged
4 changes: 2 additions & 2 deletions _pages/Schedule_WACV_2026.md
@@ -12,9 +12,9 @@ nav_order: 4
 | Time | Schedule |
 | --- | --- |
 | **13:00-13:10** | **Opening Remarks** |
-| **13:10-13:40** | **Keynote Presentation** <br> **Dr. Lars Hammarstrand** |
+| **13:10-13:40** | **Keynote: Learning 3D Worlds Without Labels: Occupancy, Semantics, and Dynamics at Scale** <br> **Dr. Lars Hammarstrand** |
 | **13:40-13:50** | **Paper 5:** *Lightweight Multi-Scale Fusion for Real-Time Autonomous Driving Segmentation* <br> *(Remote)* |
-| **13:50-14:00** | **Paper 7:** *FROST-Drive: Scalable and Efficient End-to-End Driving with a Frozen Vision Encoder* |
+| **13:50-14:00** | **Paper 7:** *FROST-Drive: Scalable and Efficient End-to-End Driving with a Frozen Vision Encoder* <br> *(Remote)* |
 | **14:00-14:10** | **Paper 8:** *Efficient Visual Question Answering Pipeline for Autonomous Driving via Scene Region Compression* <br> *(Remote)* |
 | **14:10-14:20** | **Paper 10:** *Benchmarking Vision-Language Models for Traffic Scene Understanding in Inclement Winter Weather: The AWDB Benchmark* |
 | **14:20-14:30** | **Paper 11:** *Role of Language-Guidance in Knowledge Distillation for Semantic Segmentation Under Limited Field-Of-View Autonomous Driving* |
12 changes: 12 additions & 0 deletions _pages/WACV_2026.md
@@ -76,6 +76,9 @@ TBD -->

 ### Keynote Speakers
 <div class="row projects pt-1 pb-1">
+<div class="col-sm-4">
+{% include people.html name="Lars Hammarstrand" affiliation="Chalmers University of Technology" img="assets/img/larsHammarstand.jpg" %}
+</div>
 <div class="col-sm-4">
 {% include people.html name="Tong Shen" affiliation="CMU" img="assets/img/TongShen.jpg" %}
 </div>
@@ -86,7 +89,16 @@

 ### Keynote Talks
 
+#### Learning 3D Worlds Without Labels: Occupancy, Semantics, and Dynamics at Scale (Lars Hammarstrand)
+
+**Abstract:**
+Scalability in autonomous driving is hindered by the high cost of 3D semantic annotations. We present two self-supervised paradigms, GASP (WACV 2026) and QueryOcc (CVPR 2026), which model environments as continuous 4D occupancy fields using independent spatio-temporal queries. By deriving supervision from future sensor data and Vision Foundation Models (VFMs), GASP enables scalable LiDAR pre-training that encodes geometric and semantic scene evolution, significantly enhancing downstream mapping and prediction. We extend these principles to camera-only systems with QueryOcc, leveraging VFMs to construct pseudo-semantic point clouds for explicit 3D occupancy prediction. Together, these works demonstrate how broad 4D pre-training can be distilled into precise, label-efficient scene understanding for autonomous mobility.
+
+**Bio:**
+Lars Hammarstrand is a professor in Signal Processing at Chalmers University of Technology, where he leads research on robust perception for autonomous systems. His work focuses on self-supervised and semi-supervised learning for 3D scene understanding, including geometry, semantic occupancy, and dynamic motion modeling from raw sensor streams. Hammarstrand's research spans both foundational methods and practical systems, with applications in autonomous driving, mapping, and localisation.
+
 #### Teaching Trackers to Think: From Appearance Matching to Agentic Perception (Tong Shen)
 
 **Abstract:**
 Deep learning trackers have made remarkable progress on standard benchmarks, yet they still fail catastrophically at critical moments, such as during occlusions, near visually similar distractors, or through rapid motion. Humans, by contrast, track effortlessly through these challenges. We don't just match appearances; we reason about identity ("that rabbit has no black markings"), predict through occlusion ("it should reappear on the other side"), and recognize our own mistakes ("wait, I'm following the wrong one"). We hypothesize that this cognitive dimension, world knowledge and reasoning beyond what any tracking dataset can provide, is what makes human tracking robust, and that bridging this gap is the key to approaching human-level performance.

Binary file added assets/img/larsHammarstand.jpg