Holopix50k: A Large-Scale In-the-wild Stereo Image Dataset

Presented at CVPR 2020 Workshop on Computer Vision for AR/VR

Owen Hua, Puneet Kohli, Pritish Uplavikar *, Anand Ravi *, Saravana Gunaseelan, Jason Orozco, Edward Li

* denotes equal contribution


Abstract. With the mass-market adoption of dual-camera mobile phones, leveraging stereo information in computer vision has become increasingly important. Current state-of-the-art methods utilize learning-based algorithms, where the amount and quality of training samples heavily influence results. Existing stereo image datasets are limited either in size or subject variety. Hence, algorithms trained on such datasets do not generalize well to scenarios encountered in mobile photography. We present Holopix50k, a novel in-the-wild stereo image dataset, comprising 49,368 image pairs contributed by users of the Holopix™ mobile social platform. In this work, we describe our data collection process and statistically compare our dataset to other popular stereo datasets. We experimentally show that using our dataset significantly improves results for tasks such as stereo super-resolution and self-supervised monocular depth estimation. Finally, we showcase practical applications of our dataset to motivate novel works and use cases.

Dataset samples

The following video showcases some of the left-right image pairs from the Holopix50k dataset.

The class diversity of the Holopix50k dataset can be seen in the images below.

Holopix50k diversity

Downloading the dataset

You can find the download instructions for Holopix50k here.
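Since the dataset consists of left-right image pairs, a first step after downloading is usually to match each left view with its right view. The sketch below shows one way to do that, and it rests on assumptions this page does not confirm: that the two views sit in separate `left` and `right` directories, and that filenames share an identifier followed by `_left` or `_right`. The function name `pair_stereo_images` and its parameters are hypothetical; adjust them to the layout described in the download instructions.

```python
from pathlib import Path
from typing import List, Tuple


def pair_stereo_images(
    left_dir: Path,
    right_dir: Path,
    left_suffix: str = "_left",    # assumed filename suffix, not confirmed
    right_suffix: str = "_right",  # assumed filename suffix, not confirmed
) -> List[Tuple[Path, Path]]:
    """Match left and right images that share the same base identifier."""
    # Index right-view files by identifier (file stem minus the suffix).
    right_by_id = {
        p.stem[: -len(right_suffix)]: p
        for p in right_dir.iterdir()
        if p.is_file() and p.stem.endswith(right_suffix)
    }

    pairs = []
    for left in sorted(left_dir.iterdir()):
        if not (left.is_file() and left.stem.endswith(left_suffix)):
            continue
        image_id = left.stem[: -len(left_suffix)]
        right = right_by_id.get(image_id)
        if right is not None:  # skip left images with no matching right view
            pairs.append((left, right))
    return pairs
```

Calling `pair_stereo_images(Path("train/left"), Path("train/right"))` on such a layout would return a sorted list of `(left, right)` path tuples; the `train/...` paths are placeholders. Left images without a matching right view are skipped without raising an error, so a partial download still yields usable pairs.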


If you find the dataset or the benchmarks provided in this paper useful in your research, please cite this work using the following BibTeX:

@inproceedings{hua2020holopix50k,
  author = {Yiwen Hua and Puneet Kohli and Pritish Uplavikar and Anand Ravi and Saravana Gunaseelan and Jason Orozco and Edward Li},
  title = {Holopix50k: A Large-Scale In-the-wild Stereo Image Dataset},
  booktitle = {CVPR Workshop on Computer Vision for Augmented and Virtual Reality, Seattle, WA, 2020.},
  month = {June},
  year = {2020}
}



The Holopix50k dataset was crowd-sourced from the Holopix™ mobile social platform. Holopix was created in 2018 by Leia Inc. as a Lightfield image-sharing social media platform.


If you have any questions about the dataset, feel free to contact any of the authors listed above at {first}.{last}@leiainc.com.