RGB-W: When Vision Meets Wireless

Alexandre Alahi, Albert Haque, Li Fei-Fei
Computer Science Department, Stanford University
International Conference on Computer Vision, December 2015


Inspired by the recent success of RGB-D cameras, we propose enriching RGB data with an additional, quasi-free modality: the wireless signal emitted by individuals' cell phones, referred to as RGB-W. The received signal strength acts as a rough proxy for depth and a reliable cue to a person's identity. Although the measured signals are noisy, we demonstrate that combining visual and wireless data significantly improves localization accuracy. We introduce a novel image-driven representation of wireless data that embeds all received signals onto a single image. We then evaluate the ability of this additional data to (i) locate persons within a sparsity-driven framework and (ii) track individuals with a new confidence measure on the data association problem. Our solution outperforms existing localization methods. It can be applied to the millions of currently installed RGB cameras to better analyze human behavior and offer the next generation of high-accuracy location-based services.
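To illustrate why received signal strength can serve as a rough depth proxy, the sketch below inverts the standard log-distance path-loss model. This is not the paper's actual method; the reference power `rss_at_1m` and the `path_loss_exponent` are illustrative values that would need per-environment calibration in practice.

```python
def rss_to_distance(rss_dbm, rss_at_1m=-40.0, path_loss_exponent=2.5):
    """Estimate distance (meters) from received signal strength (dBm).

    Inverts the log-distance path-loss model:
        RSS(d) = RSS(1 m) - 10 * n * log10(d)
    where n is the path-loss exponent. Both parameters here are
    illustrative defaults, not values from the RGB-W paper.
    """
    return 10 ** ((rss_at_1m - rss_dbm) / (10 * path_loss_exponent))


# A weaker signal maps to a larger estimated distance:
# rss_to_distance(-40.0) -> 1.0 m, rss_to_distance(-65.0) -> 10.0 m
```

Because real measurements are noisy (multipath, body shadowing), such estimates are coarse, which is why the paper treats RSS as a rough cue to be fused with vision rather than a standalone range sensor.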


Full Paper: [pdf] (1.2 MB)

Bibtex Citation

@inproceedings{alahi2015rgbw,
  title         = {RGB-W: When Vision Meets Wireless},
  author        = {Alexandre Alahi and Albert Haque and Li Fei-Fei},
  booktitle     = {International Conference on Computer Vision},
  year          = {2015}
}

RGB-W Dataset





Sequence Name   Length (mm:ss)   # Frames   # People   # W Devices   Modalities      Download
conference-1    01:53             1,697     5          5             rgb, depth, W   zip (116 MB)
conference-2    05:18             4,782     12         12            rgb, depth, W   zip (379 MB)
conference-3    23:31            21,165     1          2             rgb, depth, W   zip (1.32 GB)
conference-4    06:27             4,832     1          2             rgb, depth, W   zip (357 MB)
conference-5    06:03             4,525     2          2             rgb, depth, W   zip (290 MB)
patio-1         07:22             6,636     4          4             rgb, depth, W   zip (474 MB)
patio-2         04:36             4,144     2          2             rgb, depth, W   zip (258 MB)
all             55:10            47,781     --         --            rgb, depth, W   zip (3.23 GB)

