OVRSEEN: Auditing Network Traffic and Privacy Policies in Oculus VR

Rahmadi Trimananda¹, Hieu Le¹, Hao Cui¹, Janice Tran Ho¹, Anastasia Shuba², and Athina Markopoulou¹

¹University of California, Irvine   ²Independent Researcher
Abstract
Virtual reality (VR) is an emerging technology that enables
new applications but also introduces privacy risks. In this
paper, we focus on Oculus VR (OVR), the leading platform in
the VR space, and we provide the first comprehensive analysis
of personal data exposed by OVR apps and the platform itself,
from a combined networking and privacy policy perspective.
We experimented with the Quest 2 headset and tested the
most popular VR apps available on the official Oculus and the
SideQuest app stores. We developed OVRSEEN, a method-
ology and system for collecting, analyzing, and comparing
network traffic and privacy policies on OVR. On the network-
ing side, we captured and decrypted network traffic of VR
apps, which was previously not possible on OVR, and we
extracted data flows, defined as
⟨app, data type, destination⟩.
Compared to the mobile and other app ecosystems, we found
OVR to be more centralized and driven by tracking and an-
alytics, rather than by third-party advertising. We show that
the data types exposed by VR apps include personally identi-
fiable information (PII), device information that can be used
for fingerprinting, and VR-specific data types. By comparing
the data flows found in the network traffic with statements
made in the apps’ privacy policies, we found that approxi-
mately 70% of OVR data flows were not properly disclosed.
Furthermore, we extracted additional context from the privacy
policies, and we observed that 69% of the data flows were
used for purposes unrelated to the core functionality of apps.
1 Introduction
Virtual reality (VR) technology has created an emerging mar-
ket: VR hardware and software revenues are projected to in-
crease from $800 million in 2018 to $5.5 billion in 2023 [50].
Among VR platforms, Oculus VR (OVR) is a pioneering, and
arguably the most popular one: within six months of its October 2020 release, an estimated five million Quest 2 headsets were
sold [16, 20]. VR technology enables a number of applica-
tions, including recreational games, physical training, health
therapy, and many others [49].
VR also introduces privacy risks: some are similar to those
on other Internet-based platforms (e.g., mobile phones [12,13],
IoT devices [3, 17], and Smart TVs [35, 64]), but others are
unique to VR. For example, VR headsets and hand controllers
are equipped with sensors that may collect data about the
user’s physical movement, body characteristics and activity,
voice activity, hand tracking, eye tracking, facial expressions,
and play area [25, 34], which may in turn reveal information
about our physique, emotions, and home. The privacy aspects
of VR platforms are currently not well understood [2].
To the best of our knowledge, our work is the first large
scale, comprehensive measurement and characterization of
privacy aspects of OVR apps and platform, from a combined
network and privacy policy point of view. We set out to char-
acterize how sensitive information is collected and shared
in the VR ecosystem, in theory (as described in the privacy
policies) and in practice (as exhibited in the network traffic
generated by VR apps). We center our analysis around the
concept of data flow, which we define as the tuple
⟨app, data type, destination⟩
extracted from the network traffic. First,
we are interested in the sender of information, namely the
VR app. Second, we are interested in the exposed data types,
including personally identifiable information (PII), device in-
formation that can be used for fingerprinting, and VR sensor
data. Third, we are interested in the recipient of the infor-
mation, namely the destination domain, which we further
categorize into entity or organization, first vs. third party w.r.t.
the sending app, and ads and tracking services (ATS). Inspired
by the framework of contextual integrity [38], we also seek
to characterize whether the data flows are appropriate or not
within their context. More specifically, our notion of context
includes: consistency, i.e., whether actual data flows extracted
from network traffic agree with the corresponding statements
made in the privacy policy; purpose, extracted from privacy
policies and confirmed by destination domains (e.g., whether
they are ATS); and other information (e.g., “notice and con-
sent”). Our methodology and system, OVRSEEN, is depicted
in Fig. 1. Next, we summarize our methodology and findings.
Network traffic: methodology and findings.
We were
able to explore 140 popular, paid and free, OVR apps; and
then capture, decrypt, and analyze the network traffic they
generate in order to assess their practices with respect to col-
lection and sharing of personal data on the OVR platform.
OVRSEEN collects network traffic without rooting the
Quest 2, by building on the open-source AntMonitor [51],
which we had to modify to work on the Android 10-based Oculus OS.
[Figure 1 diagram. Left: network traffic collection on the Oculus Quest 2 (app, Frida agent, AntMonitor; Unity and Unreal libraries; Frida client on a PC) produces raw data (PCAPNG, JSON), from which data flows ⟨app, data type, destination⟩, data type exposures, and the ATS ecosystem are analyzed (steps 1-3). Right: privacy policies (Oculus/Facebook, Unity, Unreal, third parties) feed a privacy policy analyzer with data and entity ontologies, yielding collection statements ⟨app, data type, entity⟩ via Polisis and an improved PoliCheck (steps 4-5); a translation step adds context: consistency, purpose, other (steps 6-7).]
Figure 1: Overview of OVRSEEN. We select the most popular apps from the official Oculus and SideQuest app stores. First, we experiment with them and analyze their network traffic: (1) we obtain raw data in PCAPNG and JSON; (2) we extract data flows ⟨app, data type, destination⟩; and (3) we analyze them w.r.t. data types and ATS ecosystem. Second, we analyze the same apps’ (and their used libraries’) privacy policies: (4) we build VR-specific data and entity ontologies, informed both by network traffic and privacy policy text; and (5) we extract collection statements ⟨app, data type, entity⟩ from the privacy policy. Third, we compare the two: (6) using our improved PoliCheck, we map each data flow to a collection statement, and we perform network-to-policy consistency analysis. Finally, (7) we translate the sentence containing the collection statement into a text segment that Polisis can use to extract the data collection purpose. The end result is that data flows, extracted from network traffic, are augmented with additional context, such as consistency with policy and purpose of collection.
Furthermore, we successfully addressed new technical
challenges for decrypting network traffic on OVR. OVRSEEN
combines dynamic analysis (using Frida [42]) with binary
analysis to find and bypass certificate validation functions,
even when the app contains a stripped binary [63]. This was a
challenge specific to OVR: prior work on decrypting network
traffic on Android [35,52] hooked into standard Android SDK
functions and not the ones packaged with Unity and Unreal,
which are the basis for game apps.
We extracted and analyzed data flows found in the col-
lected network traffic from the 140 OVR apps, and we made
the following observations. We studied a broad range of 21
data types that are exposed and found that 33 and 70 apps
send PII data types (e.g., Device ID, User ID, and Android
ID) to first- and third-party destinations, respectively (see Ta-
ble 3). Notably, 58 apps expose VR sensory data (e.g., physi-
cal movement, play area) to third parties. We used state-of-
the-art blocklists to identify ATS and discovered that, unlike
other popular platforms (e.g., Android and Smart TVs), OVR
exposes data primarily to tracking and analytics services, and
has a less diverse tracking ecosystem. Notably, the blocklists
identified only 36% of these exposures. On the other hand,
we found no data exposure to advertising services, as ads on OVR are still in an experimental phase [41].
Privacy policy: methodology and findings.
We provide
an NLP-based methodology for analyzing the privacy policies
that accompany VR apps. More specifically, OVRSEEN maps
each data flow (found in the network traffic) to its correspond-
ing data collection statement (found in the text of the privacy
policy), and checks the consistency of the two. Furthermore,
it extracts the purpose of data flows from the privacy pol-
icy, as well as from the ATS analysis of destination domains.
Consistency, purpose, and additional information provide con-
text, in which we can holistically assess the appropriateness
of a data flow [38]. Our methodology builds on, combines,
and improves state-of-the-art tools originally developed for
mobile apps: PolicyLint [4], PoliCheck [5], and Polisis [19].
We curated VR-specific ontologies for data types and entities,
guided by both the network traffic and privacy policies. We
also interfaced between different NLP models of PoliCheck
and Polisis to extract the purpose behind each data flow.
Our network-to-policy consistency analysis revealed that
about 70% of data flows from VR apps were not disclosed
or consistent with their privacy policies: only 30% were con-
sistent. Furthermore, 38 apps did not have privacy policies,
including apps from the official Oculus app store. Some app
developers also had the tendency to neglect declaring data
collected by the platform and third parties. We found that by
automatically including these other parties’ privacy policies
in OVRSEEN's network-to-policy consistency analysis, 74%
of data flows became consistent. We also found that 69%
of data flows have purposes unrelated to the core function-
ality (e.g., advertising, marketing campaigns, analytics), and
only a handful of apps are explicit about notice and consent.
OVRSEEN's implementation and datasets are made available
at [59].
Overview.
The rest of this paper is structured as follows.
Section 2 provides background on the OVR platform and its
data collection practices that motivate our work. Section 3
provides the methodology and results for OVRSEEN's network traffic analysis. Section 4 provides the methodology and results for OVRSEEN's policy analysis, network-to-policy
consistency analysis, and purpose extraction. Section 5 dis-
cusses the findings and provides recommendations. Section 6
discusses related work. Section 7 concludes the paper.
2 Oculus VR Platform and Apps
In this paper, we focus on Oculus VR (OVR), a representative state-of-the-art VR platform. A pioneer and leader in the VR space, OVR was bought by Facebook in 2014 [16] (we refer to both as “platform-party”), and it remains the most popular VR platform today. Facebook has integrated more social features and analytics into OVR and now even requires users to sign in using a Facebook account [39].
We used the latest Oculus device, Quest 2, for testing. Quest
2 is completely wireless: it can operate standalone and run
apps, without being connected to other devices. In contrast,
e.g., Sony Playstation VR needs to be connected to a Playsta-
tion 4 as its game controller. Quest 2 runs Oculus OS, a variant
of Android 10 that has been modified and optimized to run
VR environments and apps. The device comes with a few
pre-installed apps, such as the Oculus browser. VR apps are
usually developed using two popular game engines called
Unity [62] and Unreal [15]. Unlike traditional Android apps
that run on Android JVM, these 3D app development frame-
works compile VR apps into optimized (i.e., stripped) binaries
to run on Quest 2 [63].
Oculus has an official app store and a number of third-
party app stores. The Oculus app store offers a wide range
of apps (many of them are paid), which are carefully curated
and tested (e.g., for VR motion sickness). In addition to the
Oculus app store, we focus on SideQuest—the most popular
third-party app store endorsed by Facebook [32]. In contrast
to apps from the official store, apps available on SideQuest
are typically at their early development stage and thus are
mostly free. Many of them transition from SideQuest to the
Oculus app store once they mature and become paid apps. As
of March 2021, the official Oculus app store has 267 apps
(79 free and 183 paid), and the SideQuest app store has 1,075
apps (859 free and 218 paid).
Motivation: privacy risks in OVR.
VR introduces pri-
vacy risks, some of which are similar to other Internet-based
platforms (e.g., Android [12, 13], IoT devices [3, 17], Smart
TVs [35, 64], etc.), while others are unique to the VR plat-
form. For example, VR headsets and hand controllers are
equipped with sensors that collect data about the user’s physi-
cal movement, body characteristics, voice activity, hand track-
ing, eye tracking, facial expressions, and play area [25,34,36],
which may in turn reveal sensitive information about our
physique, emotions, and home. Quest 2 can also act as a fit-
ness tracker, thanks to the built-in Oculus Move app that
tracks time spent actively moving and the amount of calories
burned across all apps [40]. Furthermore, Oculus has been
continuously updating their privacy policy with a trend of
increasingly collecting more data over the years. Most no-
tably, we observed a major update in May 2018, coinciding
with the GDPR implementation date. Many apps have no pri-
vacy policy, or fail to properly include the privacy policies of
third-party libraries. Please see
Appendix A in [58]
for more
detail on observations that motivated our study, and Section 6
on related work. The privacy risks on the relatively new VR
platform are not yet well understood.
Goal and approach: privacy analysis of OVR.
In this pa-
per, we seek to characterize the privacy risks introduced when
potentially-sensitive data available on the device are sent by
the VR apps and/or the platform to remote destinations for var-
ious purposes. We followed an experimental and data-driven
approach, and we chose to test and analyze the most popular
VR apps. In Section 3, we characterize the actual behavior
exhibited in the network traffic generated by these VR apps
and platform. In Section 4, we present how we downloaded
the privacy policies of the selected VR apps, the platform,
and relevant third-party libraries, used NLP to extract and
analyze the statements made about data collection, analyzed
their consistency when compared against the actual data flows
found in traffic, and extracted the purpose of data collection.
App corpus.
We selected OVR apps that are widely used by
players. Our app corpus consists of 150 popular paid and free
apps from both the official Oculus app store and SideQuest.
In contrast, previous work typically considered only free apps
from the official app store [12,13,35,64]. We used the number
of ratings/reviews as the popularity metric, and considered
only apps that received at least 3.5 stars. We selected three
groups of 50 apps each: (1) the top-50 free apps and (2) the
top-50 paid apps from the Oculus app store, and (3) the top-50
apps from the SideQuest store. We selected an equal number
of paid and free apps from the Oculus app store to gain insight
into both groups equally. We purposely did not just pick the
top-100 apps, because paid apps tend to receive more reviews
from users and this would bias our findings towards paid apps.
Specifically, this would make our corpus consist of 90% paid
and 10% free apps.
Our app corpus is representative of both app stores. Our top-
50 free and top-50 paid Oculus apps constitute close to 40%
of all apps on the Oculus app store, whereas the total number
of downloads of our top-50 SideQuest apps is approximately
45% of all downloads for the SideQuest store. Out of these
150 apps, selected for their popularity and representativeness,
we were able to decrypt and analyze the network traffic for
140 of them for reasons explained in Section 3.2.1.
3 OVRSEEN: Network Traffic
In this section, we detail our methodology for collecting
and analyzing network traffic. In Section 3.1, we present
OVRSEEN's system for collecting network traffic and high-
light our decryption technique. Next, in Section 3.2, we de-
scribe our network traffic dataset and the extracted data flows.
In Section 3.3, we report our findings on the OVR ATS ecosys-
tem by identifying domains that were labeled as ATS by pop-
ular blocklists. Finally, in Section 3.4, we discuss data type exposures in the extracted data flows in context, based on whether their destination is an ATS or not.
3.1 Network Traffic Collection
In this section, we present OVRSEEN's system for collecting and decrypting the network traffic that apps generate (step 1 in Fig. 1). It is important to mention that OVRSEEN does not re-
quire rooting Quest 2, and as of June 2021, there are no known
methods for doing so [21]. Since the Oculus OS is based on
Android, we enhanced AntMonitor [51] to support the Oculus
OS. Furthermore, to decrypt TLS traffic, we use Frida [42], a
dynamic instrumentation toolkit. Using Frida to bypass cer-
tificate validation specifically for Quest 2 apps presents new
technical challenges, compared to Android apps that have a
different structure. Next, we describe these challenges and
how we address them.
Traffic collection.
For collecting network traffic,
OVRSEEN integrates AntMonitor [51]—a VPN-based
tool for Android that does not require root access. It runs
completely on the device without the need to re-route
traffic to a server. AntMonitor stores the collected traffic
in PCAPNG format, where each packet is annotated (in
the form of a PCAPNG comment) with the name of the
corresponding app. To decrypt TLS connections, AntMonitor
installs a user CA certificate. However, since Oculus OS
is a modified version of Android 10, and AntMonitor only
supports up to Android 7, we made multiple compatibility
changes to support Oculus OS. In addition, we enhanced
the way AntMonitor stores decrypted packets: we adjust the
sequence and ack numbers to make packet re-assembly by
common tools (e.g., tshark) feasible in post-processing.
We will submit a pull request to AntMonitor’s open-source
repository, so that other researchers can make use of it, not
only on Quest 2, but also on other newer Android devices.
For further details, see Appendix B.1 in [58].
TLS decryption.
Newer Android devices, such as Quest
2, pose a challenge for TLS decryption: as of Android 7,
apps that target API level 24 (Android 7.0) and above no
longer trust user-added certificates [7]. Since Quest 2 cannot
be rooted, we cannot install AntMonitor’s certificate as a sys-
tem certificate. Thus, to circumvent the mistrust of AntMoni-
tor’s certificate, OVRSEEN uses Frida (see Fig. 1) to intercept
certificate validation APIs. To use Frida in a non-rooted envi-
ronment, we extract each app and repackage it to include and
start the Frida server when the app loads. The Frida server
then listens to commands from a Frida client that is running
on a PC using ADB. Although ADB typically requires a USB
connection, we run ADB over TCP to be able to use Quest 2
wirelessly, allowing for free-roaming testing of VR apps.
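To make this setup concrete, the sketch below shows a PC-side Frida client attaching to an instrumented app over the network. It is a minimal illustration, not OVRSEEN's actual tooling: the device address, the "Gadget" process name (how an embedded frida-gadget typically appears), and the script file name are all assumptions.

```python
# Minimal sketch (not OVRSEEN's actual code): attach a PC-side Frida
# client to a repackaged app on the Quest 2 over TCP. The device
# address and the "Gadget" process name are illustrative assumptions.
import frida

DEVICE_ADDR = "192.168.1.50:27042"  # hypothetical headset address

device = frida.get_device_manager().add_remote_device(DEVICE_ADDR)
session = device.attach("Gadget")

# Load the certificate-validation bypass script (hypothetical file name).
with open("bypass_cert_validation.js") as f:
    script = session.create_script(f.read())

def on_message(message, data):
    print("[frida]", message)  # print anything the script send()s back

script.on("message", on_message)
script.load()
input("Hooks installed; interact with the app, then press Enter.\n")
session.detach()
```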
OVRSEEN uses the Frida client to load and inject our cus-
tom JavaScript code that intercepts various APIs used to ver-
ify CA certificates. In general, Android and Quest 2 apps
use three categories of libraries to validate certificates: (1)
the standard Android library, (2) the Mbed TLS library [61]
provided by the Unity SDK, and (3) the Unreal version of the
OpenSSL library [14]. OVRSEEN places Frida hooks into
the certificate validation functions provided by these three
libraries. These hooks change the return value of the inter-
cepted functions and set certain flags used to determine the
validity of a certificate to ensure that AntMonitor’s certificate
is always trusted. While bypassing certificate validation in
the standard Android library is a widely known technique [9],
bypassing validation in Unity and Unreal SDKs is not. Thus,
we developed the following technique.
Decrypting Unity and Unreal.
Since most Quest 2 apps
are developed using either the Unity or the Unreal game en-
gines, they use the certificate validation functions provided
by these engines instead of the ones in the standard Android
library. Below, we present our implementation of certificate
validation bypassing for each engine.
For Unity, we discovered that the main function that
is responsible for checking the validity of certificates
is
mbedtls_x509_crt_verify_with_profile()
in the
Mbed TLS library, by inspecting its source code [6]. This
library is used by the Unity framework as part of its SDK.
Although Unity apps and its SDK are written in C#, the final
Unity library is a C++ binary. When a Unity app is pack-
aged for release, unused APIs and debugging symbols get
removed from the Unity library’s binary. This process makes
it difficult to hook into Unity’s functions since we cannot
locate the address of a function of interest without having
the symbol table to look up its address. Furthermore, since
the binary also gets stripped of unused functions, we can-
not rely on the debug versions of the binary to look up ad-
dresses because each app will have a different number of
APIs included. To address this challenge, OVRSEEN auto-
matically analyzes the debug versions of the non-stripped
Unity binaries (provided by the Unity engine), extracts
the function signature (i.e., a set of hexadecimal numbers)
of mbedtls_x509_crt_verify_with_profile(), and then
looks for this signature in the stripped version of the binary
App Store     Apps   Domains   eSLDs   Packets   TCP Flows
Oculus-Free     43        85      48     2,818       2,126
Oculus-Paid     49        54      35     2,278       1,883
SideQuest       48        57      40     2,679       2,260
Total          140       158      92     7,775       6,269

Table 1: Network traffic dataset summary. Note that the same domains and eSLDs can appear across the three groups of “App Store”, so their totals are based on unique counts.
to find its address. This address can then be used to create the
necessary Frida hook for an app. The details of this automated
binary analysis can be found in Appendix B.2 in [58].
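A simplified version of this signature search might look as follows. This is a sketch under the assumption that the function's leading bytes are stable across the debug and release builds of the same library version (the actual system uses a more robust signature, described in Appendix B.2 in [58]); the file names and the 64-byte signature length are illustrative.

```python
# Sketch: locate a function in a stripped Unity binary by byte signature.
# We take the function's first bytes from a non-stripped debug build of
# the same library version, then search for them in the stripped binary.
from elftools.elf.elffile import ELFFile  # pyelftools

FUNC = "mbedtls_x509_crt_verify_with_profile"
SIG_LEN = 64  # arbitrary signature length for this illustration

def vaddr_to_offset(elf, vaddr):
    # Map a virtual address to a file offset via the PT_LOAD segments.
    for seg in elf.iter_segments():
        if seg["p_type"] == "PT_LOAD" and \
           seg["p_vaddr"] <= vaddr < seg["p_vaddr"] + seg["p_filesz"]:
            return vaddr - seg["p_vaddr"] + seg["p_offset"]
    raise ValueError("address not in any PT_LOAD segment")

def extract_signature(debug_so_path):
    # Read FUNC's first SIG_LEN bytes from the non-stripped binary.
    with open(debug_so_path, "rb") as f:
        elf = ELFFile(f)
        symtab = elf.get_section_by_name(".symtab")
        sym = symtab.get_symbol_by_name(FUNC)[0]
        f.seek(vaddr_to_offset(elf, sym["st_value"]))
        return f.read(SIG_LEN)

def find_in_stripped(stripped_so_path, signature):
    # Return the offset of the signature within the stripped binary.
    data = open(stripped_so_path, "rb").read()
    off = data.find(signature)
    if off < 0:
        raise ValueError("signature not found; library versions differ?")
    return off

# Usage (hypothetical paths):
# sig = extract_signature("libunity_debug.so")
# off = find_in_stripped("libunity_stripped.so", sig)
```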
For Unreal, we discovered that the main function that is
responsible for checking the validity of certificates is the func-
tion
x509_verify_cert()
in the OpenSSL library, which
is integrated as part of the Unreal SDK. Fortunately, the
Unreal SDK binary file comes with a partial symbol table
that contains the location of x509_verify_cert(), and thus,
OVRSEEN can set a Frida hook for it.
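Putting the pieces together for the Mbed TLS (Unity) case, a bypass hook could look like the sketch below, injected with the Frida client shown earlier. This is illustrative rather than OVRSEEN's exact script: the module name "libunity.so" and the resolved offset are assumptions, and argument index 5 corresponds to the uint32_t *flags output parameter in the Mbed TLS 2.x signature of mbedtls_x509_crt_verify_with_profile().

```python
# Sketch: force mbedtls_x509_crt_verify_with_profile() to report success.
# The offset is assumed to come from the signature search above, and
# "libunity.so" is the assumed name of the Unity engine binary.
BYPASS_JS = r"""
var base = Module.findBaseAddress("libunity.so");
var verifyAddr = base.add(%s);   // offset from the binary analysis step

Interceptor.attach(verifyAddr, {
    onEnter: function (args) {
        this.flags = args[5];    // save pointer to the validation flags
    },
    onLeave: function (retval) {
        this.flags.writeU32(0);  // 0 means "no validation errors"
        retval.replace(0);       // 0 means "verification succeeded"
        send("certificate check bypassed");
    }
});
"""

def load_bypass_script(session, verify_offset_hex):
    """Create and load the bypass hook for a resolved function offset."""
    script = session.create_script(BYPASS_JS % verify_offset_hex)
    script.load()
    return script
```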
3.2 Network Traffic Dataset
3.2.1 Raw Network Traffic Data
We used OVRSEEN to collect network traffic for 140¹ apps
in our corpus during the months of March and April 2021. To
exercise these 140 apps and collect their traffic, we manually
interacted with each one for seven minutes. Although there are
existing tools that automate the exploration of regular (non-
gaming) mobile apps (e.g., [28]), automatic interaction with
a variety of games is an open research problem. Fortunately,
manual testing allows us to customize app exploration and
split our testing time between exploring menus within the app
to cover more of the potential behavior, and actually playing
the game, which better captures the typical usage by a human
user. As shown by prior work, such testing criteria lead to
more diverse network traffic and reveal more privacy-related
data flows [22, 47, 64]. Although our methodology might not
be exhaustive, it is in line with prior work [35, 64].
Table 1 presents the summary of our network traffic dataset.
We discovered 158 domains and 92 eSLDs in 6,269 TCP flows
that contain 7,775 packets. Among the 140 apps, 96 were
developed using the Unity framework, 31 were developed
using the Unreal framework, and 13 were developed using
other frameworks.
¹The remaining 10 apps were excluded for the following reasons: (1) six
apps could not be repackaged; (2) two apps were browser apps, which would
open up the web ecosystem, diverting our focus from VR; (3) one app was
no longer found on the store—we created our lists of top apps one month
ahead of our experiments; and (4) one app could not start on the Quest 2 even
without any of our modifications.
3.2.2 Network Data Flows Extracted
We processed the raw network traffic dataset and identified
1,135 data flows: ⟨app, data type, destination⟩. Next, we de-
scribe our methodology for extracting that information.
App names.
For each network packet, the app name
is obtained by AntMonitor [51]. This feature required
a modification to work on Android 10, as described in
Appendix B.1 in [58].
Data types.
The data types we extracted from our network
traffic dataset are listed in Table 3 and can be categorized into
roughly three groups. First, we find personally identifiable in-
formation (PII), including: user identifiers (e.g., Name, Email,
and User ID), device identifiers (Android ID, Device ID, and
Serial Number), Geolocation, etc. Second, we found system
parameters and settings, whose combinations are known to
be used by trackers to create unique profiles of users [35, 37],
i.e., Fingerprints. Examples include various version informa-
tion (e.g., Build and SDK Versions), Flags (e.g., indicating
whether the device is rooted or not), Hardware Info (e.g., De-
vice Model, CPU Vendor, etc.), Usage Time, etc. Finally, we
also find data types that are unique to VR devices (e.g., VR
Movement and VR Field of View) and group them as VR Sen-
sory Data. These can be used to uniquely identify a user or
convey sensitive information—the VR Play Area, for instance,
can represent the actual area of the user’s household.
We use several approaches to find these data types in
HTTP headers and bodies, and also in any raw TCP seg-
ments that contain ASCII characters. First, we use string
matching to search for data that is static by nature. For exam-
ple, we search for user profile data (e.g., User Name, Email,
etc.) using our test OVR account and for any device iden-
tifiers (e.g., Serial Number, Device ID, etc.) that can be re-
trieved by browsing the Quest 2 settings. In addition, we
search for their MD5 and SHA1 hashes. Second, we utilize
regular expressions to capture more dynamic data types. For
example, we can capture different Unity SDK versions using
UnityPlayer/[\d.]+\d
. Finally, for cases where a packet
contains structured data (e.g., URL query parameters, HTTP
Headers, JSON in HTTP body, etc.), we split the packet into
key-value pairs and create a list of unique keys that appear
in our entire network traffic dataset. We then examine this
list to discover keys that can be used to further enhance our
search for data types. For instance, we identified that the
keys “user_id” and “x-playeruid” can be used to find User
IDs.
Appendix C.1 in [58]
provides more details on our data
types.
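To make the matching concrete, here is a simplified sketch of the three techniques (literal values plus their MD5/SHA1 hashes, regular expressions, and key-value inspection). The specific values and keys are examples, not OVRSEEN's full lists.

```python
# Sketch: search a decrypted payload for exposed data types using
# (1) literal values and their MD5/SHA1 hashes, (2) regular expressions,
# and (3) known key names in structured key-value data.
import hashlib
import json
import re

# (1) Static values from a test account and device settings (hypothetical).
STATIC_VALUES = {
    "email": "testuser@example.com",
    "serial_number": "1WMHH000X00000",
}

# (2) Patterns for dynamic data types, e.g., Unity SDK versions.
PATTERNS = {
    "sdk_version": re.compile(r"UnityPlayer/[\d.]+\d"),
}

# (3) Keys observed in structured data that carry known data types.
KEY_HINTS = {"user_id": "user_id", "x-playeruid": "user_id"}

def variants(value):
    """A value may appear in plaintext or as an MD5/SHA1 hex digest."""
    raw = value.encode()
    return [value,
            hashlib.md5(raw).hexdigest(),
            hashlib.sha1(raw).hexdigest()]

def find_data_types(payload: str):
    found = set()
    for dtype, value in STATIC_VALUES.items():
        if any(v in payload for v in variants(value)):
            found.add(dtype)
    for dtype, pat in PATTERNS.items():
        if pat.search(payload):
            found.add(dtype)
    try:  # structured bodies: inspect keys of JSON objects
        obj = json.loads(payload)
        if isinstance(obj, dict):
            found.update(KEY_HINTS[k] for k in obj if k in KEY_HINTS)
    except ValueError:
        pass
    return found
```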
Destinations.
To extract the destination fully qualified do-
main name (FQDN), we use the HTTP Host field and the TLS
SNI (for cases where we could not decrypt the traffic). Using
tldextract, we also identify the effective second-level domain
(eSLD) and use it to determine the high level organization
that owns it via Crunchbase. We also adopt similar labeling
Figure 2:
Top-10 platform and third-party (a) eSLDs and (b) ATS FQDNs.
They are ordered by the number of apps that
contact them. Each app may have a few first-party domains: we found that 46 out of 140 (33%) apps contact their own eSLDs.
methodologies from [64] and [5] to categorize each destina-
tion as either first-, platform-, or third-party. To perform the
categorization, we also make use of collected privacy poli-
cies (see Fig. 1 and Section 4), as described next. First, we
tokenize the domain and the app’s package name. We label a
domain as first-party if the domain’s tokens either appear in
the app’s privacy policy URL or match the package name’s
tokens. If the domain is part of cloud-based services (e.g.,
vrapp.amazonaws.com), we only consider the tokens in the
subdomain (vrapp). Second, we categorize the destination as
platform-party if the domain contains the keywords “oculus”
or “facebook”. Finally, we default to the third-party label.
This means that the data collection is performed by an entity
that is not associated with app developers nor the platform,
and the developer may not have control of the data being
collected. The next section presents further analysis of the
destination domains.
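The party labeling just described can be summarized by the following sketch, assuming tldextract and simple token comparison; the cloud eSLD list is illustrative, and the real pipeline additionally maps eSLDs to organizations via Crunchbase.

```python
# Sketch (simplified): categorize a destination FQDN as first-,
# platform-, or third-party, following the rules described above.
import re
import tldextract

def tokens(s):
    return {t for t in re.split(r"[^a-z0-9]+", s.lower()) if t}

def label_party(fqdn, package_name, policy_url,
                cloud_eslds=("amazonaws.com",)):  # illustrative list
    ext = tldextract.extract(fqdn)
    esld = ext.registered_domain  # effective second-level domain

    # First party: domain tokens appear in the privacy policy URL or
    # match the package name's tokens. For cloud-hosted services, only
    # the subdomain tokens are considered.
    domain_tokens = tokens(ext.subdomain) if esld in cloud_eslds \
        else tokens(ext.domain) | tokens(ext.subdomain)
    if domain_tokens & (tokens(policy_url) | tokens(package_name)):
        return "first"

    # Platform party: Oculus/Facebook domains.
    if "oculus" in fqdn or "facebook" in fqdn:
        return "platform"

    return "third"  # default

# e.g., label_party("vrapp.amazonaws.com", "com.example.vrapp",
#                   "https://example.com/privacy") == "first"
```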
3.3 OVR Advertising & Tracking Ecosystem
In this section, we explore the destination domains found in
our network traffic dataset (see Section 3.2.2). Fig. 2a presents
the top-10 eSLDs for platform and third-party. We found that,
unlike the mobile ecosystem, the presence of third-parties
is minimal and platform traffic dominates in all apps (e.g.,
oculus.com, facebook.com). The most prominent third-party
organization is Unity (e.g., unity3d.com), which appears in
68 out of 140 apps (49%). This is expected since 96 apps in
our dataset were developed using the Unity engine (see Sec-
tion 3.2.1). Conversely, although 31 apps in our dataset were
developed using the Unreal engine, it does not appear as a ma-
jor third-party data collector because Unreal does not provide
its own analytics service. Beyond Unity, other small players
include Alphabet (e.g., google.com, cloudfunctions.net) and
Amazon (e.g., amazonaws.com). In addition, 87 out of 140
apps contact four or fewer third-party eSLDs (62%).
Identifying ATS domains.
To identify ATS domains, we
apply the following popular domain-based blocklists: (1) Pi-
Hole’s Default List [43], a list that blocks cross-platform ATS
domains for IoT devices; (2) Mother of All Adblocking [8],
a list that blocks both ads and tracking domains for mobile
devices; and (3) Disconnect Me [10], a list that blocks track-
ing domains. For the rest of the paper, we will refer to the
above lists simply as “blocklists”. We note that there are no
blocklists that are curated for VR platforms. Thus, we choose
blocklists that balance between IoT and mobile devices, and
one that specializes in tracking.
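Applying a domain-based blocklist amounts to suffix-matching each FQDN against the blocked entries, roughly as sketched below; the list file names are hypothetical.

```python
# Sketch: flag ATS destinations by suffix-matching FQDNs against
# domain-based blocklists (Pi-hole Default, MoaAB, Disconnect Me).
def load_blocklist(*paths):
    blocked = set()
    for path in paths:  # one blocked domain per line; '#' for comments
        with open(path) as f:
            for line in f:
                line = line.strip().lower()
                if line and not line.startswith("#"):
                    blocked.add(line)
    return blocked

def is_ats(fqdn, blocked):
    """True if the FQDN or any parent domain appears in the blocklists."""
    labels = fqdn.lower().split(".")
    return any(".".join(labels[i:]) in blocked for i in range(len(labels)))

# Usage (hypothetical file names):
# blocked = load_blocklist("pihole.txt", "moaab.txt", "disconnect.txt")
# is_ats("perf-events.cloud.unity3d.com", blocked)
```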
OVR ATS ecosystem.
The majority of identified ATS do-
mains relate to social and analytics-based purposes. Fig. 2b
provides the top-10 ATS FQDNs that are labeled by our block-
lists. We found that the prevalent platform-related FQDNs
along with Unity, the prominent third party, are labeled as
ATS. This is expected: domains such as graph.oculus.com
and perf-events.cloud.unity3d.com are utilized for social
features like managing leaderboards and app analytics,
respectively. We also consider the presence of organiza-
tions based on the number of unique domains contacted.
The most popular organization is Alphabet, which has
13 domains, such as google-analytics.com and firebase-
settings.crashlytics.com. Four domains are associated with
Facebook, such as graph.facebook.com. Similarly, four are
from Unity, such as userreporting.cloud.unity3d.com and
config.uca.cloud.unity3d.com. Other domains are associated
with analytics companies that focus on tracking how users
interact with apps (e.g., whether they sign up for an ac-
count) such as logs-01.loggly.com, api.mixpanel.com, and
api2.amplitude.com. Lastly, we provide an in-depth compari-
son to other ecosystems in Section 5.1.
Missed by blocklists.
The three blocklists that we use in
OVRSEEN are not tailored for the Oculus platform. As a
result, there could be domains that are ATS related but not
labeled as such. To that end, we explored and leveraged data
flows to find potential domains that are missed by blocklists.
In particular, we start from data types exposed in our network
traffic, and identify the destinations where these data types
are sent to. Table 2 summarizes third-party destinations that
FQDN                                             Organization               Data Types
bdb51.playfabapi.com                             Microsoft                          11
sharedprod.braincloudservers.com                 bitHeads Inc.                       8
cloud.liveswitch.io                              Frozen Mountain Software            7
datarouter.ol.epicgames.com                      Epic Games                          6
9e0j15elj5.execute-api.us-west-1.amazonaws.com   Amazon                              5

Table 2: Top-5 third-party FQDNs that are missed by blocklists, based on the number of data types exposed.
collect the most data types and are not already captured by
any of the blocklists. We found the presence of 11 different
organizations, not caught by blocklists, including: Microsoft,
bitHeads Inc., and Epic Games—the company that created
the Unreal engine. The majority are cloud-based services that
provide social features, such as messaging, and the ability to
track users for engagement and monetization (e.g., promotions
to different segments of users). We provide additional FQDNs
missed by blocklists in Appendix C.2 in [58].
3.4 Data Flows in Context
The exposure of a particular data type, on its own, does not
convey much information: it may be appropriate or inappropri-
ate depending on the context [38]. For example, geolocation
sent to the GoogleEarth VR or Wander VR app is necessary
for the functionality, while geolocation used for ATS purposes
is less appropriate. The network traffic can be used to partly
infer the purpose of data flows, e.g., depending on whether
the destination being first-, third-, or platform-party; or an
ATS. Table 3 lists all data types found in our network traffic,
extracted using the methods explained in Section 3.2.2.
Third party.
Half of the apps (70 out of 140) expose data
flows to third-party FQDNs, 36% of which are labeled as
ATS by blocklists. Third parties collect a number of PII data
types, including Device ID (64 apps), User ID (65 apps), and
Android ID (31 apps), indicating cross-app tracking. In addi-
tion, third parties collect system, hardware, and version info
from over 60 apps—denoting the possibility that the data
types are utilized to fingerprint users. Further, all VR specific
data types, with the exception of VR Movement, are collected
by a single third-party ATS domain belonging to Unity. VR
Movement is collected by a diverse set of third-party desti-
nations, such as google-analytics.com, playfabapi.com and
logs-01.loggly.com, implying that trackers are becoming in-
terested in collecting VR analytics.
Platform party.
Our findings on exposures to platform-
party domains are a lower bound since not all platform traffic
could be decrypted (see Section 7). However, even with lim-
ited decryption, we see a number of exposures whose destina-
tions are five platform-party FQDNs. Although only one of these
                           Apps           FQDNs          % Blocked
Data Types (21)         1st 3rd  Pl.   1st 3rd  Pl.   1st  3rd  Pl.
PII
  Device ID               6  64    2     6  13    1     0   38  100
  User ID                 5  65    0     5  13    0    20   38    -
  Android ID              6  31   18     6   7    2    17   43   50
  Serial Number           0   0   18     0   0    2     -    -   50
  Person Name             1   7    0     1   4    0     0   50    -
  Email                   2   5    0     2   5    0     0   20    -
  Geolocation             0   5    0     0   4    0     -   50    -
Fingerprint
  SDK Version            23  69   20    34  28    4     6   46    0
  Hardware Info          21  65   19    25  23    3     4   39   33
  System Version         16  62   19    20  21    3     5   43   33
  Session Info            7  66    2     7  13    1    14   46  100
  App Name                4  65    2     4  10    1    25   40  100
  Build Version           0  61    0     0   3    0     -  100    -
  Flags                   6  53    2     6   8    1     0   50  100
  Usage Time              2  59    0     2   4    0     0   50    -
  Language                5  28   16     5   9    1     0   56    0
  Cookies                 5   4    2     5   3    1     0   33  100
VR Sensory Data
  VR Play Area            0  40    0     0   1    0     -  100    -
  VR Movement             1  24    2     1   6    1     0   67  100
  VR Field of View        0  16    0     0   1    0     -  100    -
  VR Pupillary Distance   0  16    0     0   1    0     -  100    -
Total                    33  70   22    44  39    5     5   36   20

Table 3: Data types exposed in the network traffic dataset. Column “Apps” reports the number of apps that send the data type to a destination; column “FQDNs” reports the number of FQDNs that receive that data type; and column “% Blocked” reports the percentage of FQDNs blocked by blocklists. Using sub-columns, we denote party categories: first (1st), third (3rd), and platform (Pl.) parties.
FQDNs is labeled as ATS by the blocklists, other platform-
party FQDNs could be ATS domains that are missed by block-
lists (see Section 3.3). For example, graph.facebook.com is an
ATS FQDN, and graph.oculus.com appears to be its counter-
part for OVR; it collects six different data types in our dataset.
Notably, the platform party is the sole party responsible for
collecting a sensitive hardware ID that cannot be reset by the
user—the Serial Number. In contrast to OVR, the Android
developer guide strongly discourages its use [18].
First party.
Only 33 apps expose data flows to first-party
FQDNs, and only 5% of them are labeled as ATS. Interest-
ingly, the blocklists tend to have higher block rates for first-
party FQDNs if they collect certain data types, e.g., Android
ID (17%), User ID (20%), and App Name (25%). Popular
data types collected by first-party destinations are Hardware
Info (21 apps), SDK Version (23 apps), and System Version
(16 apps). For developers, this information can be used to
prioritize bug fixes or improvements that would impact the
most users. Thus, it makes sense that only ~5% of first-party
FQDNs that collect this information are labeled as ATS.
Summary.
The OVR ATS ecosystem is young when com-
pared to Android and Smart TVs. It is dominated by tracking
domains for social features and analytics, but not by ads. We
have detailed 21 different data types that OVR sends to first-,
third-, and platform-parties. State-of-the-art blocklists only
captured 36% of exposures to third parties, missing some
sensitive exposures such as Email, User ID, and Device ID.
4 OVRSEEN: Privacy Policy Analysis
In this section, we turn our attention to the intended data
collection and sharing practices, as stated in the text privacy
policy. For example, from the text “We may collect your email address and share it for advertising purposes”, we want to ex-
tract the collection statement (“we”, which implies the app’s
first-party entity; “collect” as action; and “email address” as
data type) and the purpose (“advertising”). In Section 4.1.1,
we present our methodology for extracting data collection
statements, and comparing them against data flows found in
network traffic for consistency. OVRSEEN builds and im-
proves on state-of-the-art NLP-based tools: PoliCheck [5]
and PolicyLint [4], previously developed for mobile apps.
In Section 4.1.2, we present our VR-specific ontologies for
data types and entities. In Section 4.1.3, we report network-
to-policy consistency results. Section 4.2 describes how we
interface between the different NLP models of PoliCheck and
Polisis to extract the data collection purpose and other context
for each data flow.
Collecting privacy policies.
For each app in Section 3, we
also collected its privacy policy on the same day that we
collected its network traffic. Specifically, we used an auto-
mated Selenium [56] script to crawl the webstore and ex-
tracted URLs of privacy policies. For apps without a policy
listed, we followed the link to the developer’s website to find
a privacy policy. We also included eight third-party policies
(e.g., from Unity, Google), referred to by the apps’ policies.
For the top-50 free apps on the Oculus store, we found that
only 34 out of the 43 apps have privacy policies. Surprisingly,
for the top-50 paid apps, we found that only 39 out of 49
apps have privacy policies. For the top-50 apps on SideQuest,
we found that only 29 out of 48 apps have privacy policies.
Overall, among apps in our corpus, we found that only 102
(out of 140) apps provide valid English privacy policies. We
treated the remaining apps as having empty privacy policies,
ultimately leading OVRSEEN to classify their data flows as
omitted disclosures.
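The crawling step can be sketched as follows; the store URL and the link-matching heuristic are illustrative assumptions, not the exact script.

```python
# Sketch: collect a privacy policy URL from an app's store page with
# Selenium. The page structure and URL below are hypothetical.
from selenium import webdriver
from selenium.webdriver.common.by import By

def find_policy_url(store_page_url):
    driver = webdriver.Chrome()
    try:
        driver.get(store_page_url)
        # Heuristic: look for an anchor whose text mentions "privacy".
        for a in driver.find_elements(By.TAG_NAME, "a"):
            if "privacy" in (a.text or "").lower():
                return a.get_attribute("href")
        return None  # fall back to checking the developer's website
    finally:
        driver.quit()

# Usage (hypothetical URL):
# find_policy_url("https://www.oculus.com/experiences/quest/<app-id>/")
```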
4.1 Network-to-Policy Consistency
Our goal is to analyze text in the app’s privacy policy, extract
statements about data collection (and sharing), and compare
them against the actual data flows found in network traffic.
4.1.1 Consistency Analysis System
OVRSEEN builds on state-of-the-art tools: PolicyLint [4] and
PoliCheck [5]. PolicyLint [4] provides an NLP pipeline that
takes a sentence as input. For example, it takes the sentence
“We may collect your email address and share it for advertising
purposes”, and extracts the collection statement “(entity: we,
action: collect, data type: email address)”. More generally,
PolicyLint takes the app’s privacy policy text, parses sentences
and performs standard NLP processing, and eventually ex-
tracts data collection statements defined as the tuple
P = ⟨app, data type, entity⟩
, where app is the sender and entity is the
recipient performing an action (collect or not collect) on the
data type. PoliCheck [5] takes the app’s data flows (extracted
from the network traffic and defined as
F = ⟨data type, entity⟩)
and compares it against the stated P for consistency.
PoliCheck classifies the disclosure of
F
as clear (if the data
flow exactly matches a collection statement), vague (if the
data flow matches a collection statement in broader terms),
omitted (if there is no collection statement corresponding to
the data flow), ambiguous (if there are contradicting collection
statements about a data flow), or incorrect (if there is a data
flow for which the collection statement states otherwise). Fol-
lowing PoliCheck’s terminology [5], we further group these
five types of disclosures into two groups: consistent (clear and
vague disclosures) and inconsistent (omitted, ambiguous, and
incorrect) disclosures. The idea is that for consistent disclo-
sures, there is a statement in the policy that matches the data
type and entity, either clearly or vaguely. Table 4 provides
real examples of data collection disclosures extracted from
VR apps that we analyzed.
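The disclosure-type decision can be summarized with the following sketch, where subsumes(a, b) tests whether term a equals or is a hypernym of b in the ontologies (see Section 4.1.2); this is a simplification of PoliCheck's logic, not its actual code.

```python
# Sketch: classify the disclosure of a data flow F = (data_type, entity)
# against collection statements P = (action, data_type, entity) from the
# app's privacy policy, following PoliCheck's five disclosure types.
def classify(flow, statements, subsumes):
    f_data, f_entity = flow
    collect, not_collect = [], []
    for action, p_data, p_entity in statements:
        if subsumes(p_data, f_data) and subsumes(p_entity, f_entity):
            (collect if action == "collect" else not_collect).append(
                (p_data, p_entity))
    if collect and not_collect:
        return "ambiguous"    # contradicting statements cover the flow
    if not_collect:
        return "incorrect"    # policy states the data is not collected
    if (f_data, f_entity) in collect:
        return "clear"        # exact match of data type and entity
    if collect:
        return "vague"        # matched only via broader terms
    return "omitted"          # no statement covers the flow

# Example: a flow (email address -> third party) matched only by
# "we collect pii ... third party" is a vague (consistent) disclosure.
```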
Consistency analysis relies on pre-built ontologies and syn-
onym lists used to match (i) the data type and destination
that appear in each F with (ii) any instance of P that discloses the same (or a broader) data type and destination².
OVRSEEN's adaptation of ontologies specifically for VR is
described in Section 4.1.2. We also improved several aspects
of PoliCheck, as described in detail in Appendix D.1 in [58].
First, we added a feature to include a third-party privacy policy
for analysis if it is mentioned in the app’s policy. We found
that 30% (31/102) of our apps’ privacy policies reference
third-party privacy policies, and the original PoliCheck would
mislabel third-party data flows from these apps as omitted.
Second, we added a feature to more accurately resolve first-
party entity names. Previously, only first-person pronouns
(e.g., “we”) were used to indicate a first-party reference, while
some privacy policies use company and app names in first-
party references. The original PoliCheck would incorrectly
²For example (see Fig. 3a), “email address” is a special case of “contact
info” and, eventually, of “pii”. There is a clear disclosure w.r.t. data type if
the “email address” is found in a data flow and a collection statement. A
vague disclosure is declared if the “email address” is found in a data flow
and a collection statement that uses the term “pii” in the privacy policy. An
omitted disclosure means that “email address” is found in a data flow, but
there is no mention of it (or any of its broader terms) in the privacy policy.
Disclosure Type  Privacy Policy Text                                    Action : Data Collection Statement (P)        Data Flow (F)
Consistent
  Clear          “For example, we collect information ..., and a        collect : ⟨com.cvr.terminus,                  ⟨usage time, we⟩
                 timestamp for the request.”                            usage time, we⟩
  Vague          “We will share your information (in some cases         collect : ⟨com.HomeNetGames.WW1oculus,        ⟨serial number, oculus⟩
                 personal information) with third-parties, ...”         pii, third party⟩                             ⟨android id, oculus⟩
Inconsistent
  Omitted        -                                                      collect : ⟨com.kluge.SynthRiders, -, -⟩       ⟨system version, oculus⟩
                                                                                                                      ⟨sdk version, oculus⟩
                                                                                                                      ⟨hardware information, oculus⟩
  Ambiguous      “..., Skydance will not disclose any Personally        collect : ⟨com.SDI.TWD, pii, third party⟩     ⟨serial number, oculus⟩
                 Identifiable Information to third parties ...                                                        ⟨android id, oculus⟩
                 your Personally Identifiable Information will be
                 disclosed to such third parties and ...”
  Incorrect      “We do not share our customer’s personal               not_collect : ⟨com.downpourinteractive.       ⟨device id, unity⟩
                 information with unaffiliated third parties ...”       onward, pii, third party⟩                     ⟨user id, oculus⟩

Table 4: Examples to illustrate the types of disclosures identified by PoliCheck. A data collection statement (P) is extracted from the privacy policy text and is defined as the tuple P = ⟨app, data type, entity⟩. A data flow (F) is extracted from the network traffic and is defined as F = ⟨data type, entity⟩. During the consistency analysis, each P can be mapped to zero, one, or more F.
recognize these first-party references as third-party entities
for 16% (16/102) of our apps’ privacy policies.
4.1.2 Building Ontologies for VR
Ontologies are used to represent subsumptive relationships
between terms: a link from term A to term B indicates that A is
a broader term (hypernym) that subsumes B. There are two on-
tologies, namely data and entity ontologies: the data ontology
maps data types and entity ontology maps destination entities.
Since PoliCheck was originally designed for Android mobile
app’s privacy policies, it is important to adapt the ontologies
to include data types and destinations specific to VR’s privacy
policies and actual data flows.
VR data ontology.
Fig. 3a shows the data ontology we de-
veloped for VR apps. Leaf nodes correspond to all 21 data
types found in the network traffic and listed in Table 3. Non-
leaf nodes are broader terms extracted from privacy policies
and may subsume more specific data types, e.g., “device iden-
tifier” is a non-leaf node that subsumes “android id”. We built
a VR data ontology, starting from the original Android data
ontology, in a few steps as follows. First, we cleaned up the
original data ontology by removing data types that do not
exist on OVR (e.g., “IMEI”, “SIM serial number”, etc.). We
also merged similar terms (e.g., “account information” and
“registration information”) to make the structure clearer. Next,
we used PoliCheck to parse privacy policies from VR apps.
When PoliCheck parses the sentences in a privacy policy, it
extracts terms and tries to match them with the nodes in the
data ontology and the synonym list. If PoliCheck does not find
a match for the term, it will save it in a log file. We inspected
each term from this log file, and added it either as a new node
in the data ontology or as a synonym to an existing term in
the synonym list. Finally, we added new terms for data types
identified in network traffic (see Section 3.4) as leaf nodes in
the ontology. Most notably, we added VR-specific data types
(see VR Sensory Data category shown in Table 3): “biomet-
ric info” and “environment info”. The term “biometric info”
includes physical characteristics of human body (e.g., height,
weight, voice, etc.); we found some VR apps that collect
user’s “pupillary distance” information. The term “environ-
ment information” includes VR-specific sensory information
that describes the physical environment; we found some VR
apps that collect user’s “play area” and “movement”. Table 5
shows the summary of the new VR data ontology. It consists
of 63 nodes: 39 nodes are new in OVRSEEN's data ontology.
Overall, the original Android data ontology was used to track
12 data types (i.e., 12 leaf nodes) [5], whereas our VR data
ontology is used to track 21 data types (i.e., 21 leaf nodes)
appearing in the network traffic (see Table 3 and Fig. 3a).
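For instance, a small fragment of such an ontology and its subsumption test could be represented as follows; the nodes shown are drawn from the examples above, and the encoding is illustrative rather than PoliCheck's actual data structure.

```python
# Sketch: a fragment of the VR data ontology as parent -> children edges,
# with a recursive subsumption test ("pii" subsumes "email address").
ONTOLOGY = {
    "pii": ["contact info", "device identifier"],
    "contact info": ["email address", "person name"],
    "device identifier": ["android id", "device id", "serial number"],
    "biometric info": ["pupillary distance"],
    "environment info": ["vr play area", "vr movement"],
}

def subsumes(broad, narrow):
    """True if `broad` equals `narrow` or is one of its hypernyms."""
    if broad == narrow:
        return True
    return any(subsumes(child, narrow) for child in ONTOLOGY.get(broad, []))

assert subsumes("pii", "email address")         # vague-disclosure matching
assert not subsumes("biometric info", "email address")
```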
VR entity ontology.
Entities are names of companies and
other organizations which refer to destinations. We use a list
of domain-to-entity mappings to determine which entity each
domain belongs to (see Appendix D.1 in [58])—domain ex-
traction and categorization as either first-, third-, or platform-
party are described in detail in Section 3.2.2. We modified
the Android entity ontology to adapt it to VR as follows: (1)
we pruned entities that were not found in privacy policies of
VR apps or in our network traffic dataset, and (2) we added
new entities found in both sources. Table 5 summarizes the
new entity ontology. It consists of 64 nodes: 21 nodes are new
in OVRSEEN's entity ontology. Fig. 3b shows our VR entity
ontology, in which we added two new non-leaf nodes: “plat-
form provider” (which includes online distribution platforms
or app stores that support the distribution of VR apps) and
“api” (which refers to various third-party APIs and services
that do not belong to existing entities). We identified 16 new
entities that were not included in the original entity ontology.
We visited the websites of those new entities and found that:
three are platform providers, four are analytic providers, and
12 are service providers; these become the leaf nodes of “api”.
We also added a new leaf node called “others” to cover a few
data flows, whose destinations cannot be determined from the
IP address or domain name.
(a) Data Ontology (b) Entity Ontology
Figure 3:
Ontologies for VR data flows.
Please recall that each data flow, F, is defined as
F = ⟨data type, entity⟩
. We started
from the PoliCheck ontologies, originally developed for Android (printed in gray). First, we eliminated nodes that did not appear
in our VR network traffic and privacy policies. Then, we added new leaf nodes (printed in black) based on new data types found
in the VR network traffic and/or privacy policies text. Finally, we defined additional non-leaf nodes, such as “biometric info” and
“api”, in the resulting VR data and entity ontologies.
Platform            Data Ontology   Entity Ontology
Android [5]             38 nodes        209 nodes
OVR (OVRSEEN)           63 nodes         64 nodes
New nodes in OVR        39 nodes         21 nodes

Table 5: Comparison of PoliCheck and OVRSEEN ontologies. Nodes include leaf nodes (21 data types and 16 entities) and non-leaf nodes (see Fig. 3).
Summary.
Building VR ontologies has been non-trivial.
We had to examine a list of more than 500 new terms and
phrases that were not part of the original ontologies. Next, we
had to decide whether to add a term into the ontology as a new
node, or as a synonym to an existing node. At the same time,
we had to remove certain nodes irrelevant to VR and merge
others because the original Android ontologies were partially
machine-generated and not carefully curated.
4.1.3 Network-to-Policy Consistency Results
We ran OVRSEEN's privacy policy analyzer to perform
network-to-policy consistency analysis. Please recall that we
extracted 1,135 data flows from 140 apps (see Section 3.2.2).
OVR data flow consistency.
In total, 68% (776/1,135) of data flows are classified as inconsistent disclosures. The large majority of them, 97% (752/776), are omitted disclosures, which are not declared at all in the apps’ respective privacy policies.
Fig. 4 presents the data-flow-to-policy consistency analysis
results. Out of 93 apps which expose data types, 82 apps have
at least one inconsistent data flow. Among the remaining
32% (359/1,135) consistent data flows, 86% (309/359) are
classified as vague disclosures. They are declared in vague
terms in the privacy policies (e.g., the app’s data flows contain
the data type “email address”, whereas its privacy policy only
declares that the app collects “personal information”). Clear
disclosures are found in only 16 apps.
Data type consistency.
Fig. 5a reports network-to-policy
consistency analysis results by data types—recall that in Sec-
tion 3.2.2 we grouped all the exposed data types into three
categories: PII, Fingerprint, and VR Sensory Data. The PII
category accounts for 22% (250/1,135) of all data flows.
Among the three categories, PII has the best consistency: 57%
(142/250) of data flows in this category are classified as consis-
tent disclosures. These data types are well understood and also
treated as PII in other platforms. On Android [5], it is reported
that 59% of PII flows were consistent—this is similar to our
observation on OVR. The Fingerprint category constitutes
69% (784/1,135) of all data flows: around 25% (199/784) of
data flows in this category are classified as consistent disclo-
sures. The VR Sensory Data category constitutes around 9%
(101/1,135) of all data flows—this category is unique to the
VR platform. Only 18% (18/101) of data flows in this category
are consistent—this indicates that the collection of data types
in this category is not properly disclosed in privacy policies.
Figure 4:
Summary of network-to-policy consistency analysis results.
Columns whose labels are in parentheses provide
aggregate values: e.g., column “(platform)” aggregates the columns “oculus” and “facebook”; column “(other 3rd parties)”
aggregates the subsequent columns. The numbers count data flows; each data flow is defined as ⟨app, data type, destination⟩.
Entity consistency.
Fig. 5b reports our network-to-policy consistency results, by entities. First-party data flows constitute 10% (113/1,135) of all data flows, and 54% (61/113) of them are classified as consistent disclosures. Third-party and platform data flows constitute the remaining 90% (1,022/1,135) of all data flows—surprisingly, only 29% (298/1,022) of these third-party and platform data flows are classified as consistent disclosures.
Unity is the most popular third-party entity, with 66%
(746/1,135) of all data flows. Only 31% (232/746) of these
Unity data flows are classified as consistent, while the ma-
jority (69%) are classified as inconsistent disclosures. Plat-
form (i.e., Oculus and Facebook) data flows account for 11%
(122/1,135) of all data flows; only 28% (34/122) of them are
classified as consistent disclosures. Other less prevalent enti-
ties account only around 14% (154/1,135) of all data flows.
Referencing Oculus and Unity privacy policies.
Privacy
policies can link to each other. For instance, when using Quest
2, users can be expected to have consented to the Oculus privacy
policy (for OVR). Likewise, when app developers utilize a
third-party engine (e.g., Unity), their privacy policies should
include the Unity privacy policy. To the best of our knowledge,
this aspect has not been considered in prior work [5, 27, 69].
Interestingly, when we included the Oculus and Unity
privacy policies (when applicable) in addition to the app’s
own privacy policy, we found that the majority of platform
(116/122 or 96%) and Unity (725/746 or 97%) data flows get
classified as consistent disclosures. Thus, 74% (841/1,135) of
all data flows get classified as consistent disclosures. Fig. 6
shows the comparison of the results from this new experiment
with the previous results shown in Fig. 5b. These show that
data flows are properly disclosed in Unity and Oculus privacy
policies even though the app developers’ privacy policies
usually do not refer to these two important privacy policies.
Furthermore, we noticed that the Oculus and Unity privacy
policies are well-written and clearly disclose collected data
types. As discussed in [5], developers may be unaware of their
responsibility to disclose third-party data collections, or they
may not know exactly how third-party SDKs in their apps
collect data from users. This is a recommendation for future
improvement.
Validation of PoliCheck results (network-to-policy consis-
tency).
To test the correctness of PoliCheck when applied
to VR apps, we manually inspected all data flows from apps
that provided a privacy policy, and checked their consistency
with corresponding collection statements in the policy. Three
authors had to agree on the consistency result (one of the five
disclosure types) of each data flow. We found the following.
First, we considered multi-class classification into consis-
tent, omitted and incorrect disclosures, similar to PoliCheck’s
evaluation [5]. The performance of multi-class classification
USENIX Association 31st USENIX Security Symposium 3799
(a)
(b)
Figure 5: Network-to-policy consistency analysis results ag-
gregated by (a) data types, and (b) destination entities.
can be assessed using micro-averaging or macro-averaging of metrics across classes. Micro-averaging is more appropriate for imbalanced datasets and was also used for the consistency analysis of Android apps [5] and Alexa skills [27]. On our VR dataset, we obtained 84% micro-averaged precision, recall, and F1-score. (In multi-class classification, every misclassification is a false positive for one class and a false negative for another class; thus, micro-averaged precision, recall, and F1-score are all equal; see Appendix D.2 in [58].) This is comparable to the corresponding numbers when applying PoliCheck to mobile apps [5] and Alexa skills [27], which reported 90.8% and 83.3% (micro-averaged) precision/recall/F1-score, respectively. For completeness, we also computed the macro-averaged precision, recall, and F1-score: 74%, 89%, and 81%, respectively (see Table 8 in [58]).

Second, we considered the binary classification case (i.e., we treat inconsistent disclosures as positive samples and consistent disclosures as negative samples). We obtained 77% precision, 94% recall, and 85% F1-score (see Appendix D.2 in [58] for more details). Overall, PoliCheck, along with our improvements for OVRSEEN, works well on VR apps. One caveat: precision is lower when distinguishing between clear and vague disclosures. Our validation shows that 23% of vague disclosures were actually clearly disclosed; this is because OVRSEEN's privacy policy analyzer inherits a limitation of PoliCheck's NLP model, which cannot extract data types and entities from a collection statement that spans multiple sentences.
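The following sketch, with made-up labels rather than our dataset, illustrates the two averaging schemes (it assumes scikit-learn is installed) and why the micro-averaged precision, recall, and F1-score coincide for single-label multi-class classification.

from sklearn.metrics import precision_recall_fscore_support

# Toy ground-truth vs. predicted disclosure types (one label per flow).
y_true = ["clear", "vague", "omitted", "omitted", "incorrect", "clear"]
y_pred = ["clear", "omitted", "omitted", "vague", "incorrect", "vague"]

for avg in ("micro", "macro"):
    p, r, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average=avg, zero_division=0)
    print(f"{avg}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")

# Micro-averaging counts each misclassification once as a false positive
# (for the predicted class) and once as a false negative (for the true
# class), so aggregate precision, recall, and F1 are all equal.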
Figure 6: Referencing Oculus and Unity privacy policies. Comparison of the results from the ideal case (including the Unity and Oculus privacy policies by default) with the previous actual results (including only the app's privacy policy and any third-party privacy policies linked explicitly therein).
4.2 Data Collection in Context

Consistent (i.e., clear or even vague) disclosures are desirable because they notify the user about a VR app's data collection and sharing practices. However, they are not sufficient to determine whether an information flow is within its context or social norms. This context includes (but is not limited to) the purpose and use, notice and consent, whether the collection is legally required, and other aspects of the "transmission principle" in the terminology of contextual integrity [38]. In the previous section, we discussed the consistency of the network traffic w.r.t. the privacy policy statements, which provides some context. In this section, we extract additional context: we focus on the purpose of data collection.
Purpose.
We extract purpose from the app's privacy policy using Polisis [19], an online privacy policy analysis service based on deep learning. Polisis annotates privacy policy text with purposes at the text-segment level. We developed a translation layer to map the purposes annotated by Polisis onto consistent data flows (see Appendix D.3 in [58]). This mapping is possible only for data flows with consistent disclosures, since we need the policy to extract the purpose of a data flow. We were able to process 293 (out of 359) consistent data flows, which correspond to 141 text segments annotated by Polisis. (Polisis did not process the text segments that correspond to the remaining 66 consistent data flows: it did not annotate them and simply reported that their texts were too short to analyze.) Out of the 293 data flows, 69 correspond to text segments annotated as "unspecific", i.e., Polisis extracted no purpose. The remaining 224 data flows correspond to text segments annotated with purposes. Since a data flow can be associated with multiple purposes, we expanded the 224 into 370 data flows, so that each data flow has exactly one purpose. Polisis identified nine distinct purposes (including advertising, analytics, personalization, legal, etc.; see Fig. 7).
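The translation layer itself is specified in Appendix D.3 in [58]; below is only a schematic reconstruction under simplifying assumptions: Polisis output is reduced to (data types, purposes) pairs per segment, a flow inherits the purposes of every segment that discloses its data type, and all names and values are illustrative.

# Polisis-style annotations: (data types mentioned, purposes stated).
segments = [
    ({"android id", "device id"}, {"analytics", "advertising"}),
    ({"email address"}, {"basic feature"}),
]

# Consistent data flows: <app, data type, destination>.
flows = [("ExampleApp", "android id", "unity"),
         ("OtherApp", "serial number", "oculus")]

def attach_purposes(flows, segments):
    expanded = []
    for app, data_type, dest in flows:
        matching = [purposes for types, purposes in segments
                    if data_type in types]
        purposes = set().union(*matching) if matching else {"unspecific"}
        # Emit one tuple per purpose, mirroring the 224 -> 370 expansion.
        expanded += [(app, data_type, dest, p) for p in sorted(purposes)]
    return expanded

for tup in attach_purposes(flows, segments):
    print(tup)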
[Figure 7 (alluvial diagram): columns App, Data Type, Destination (Entity), and Purpose; each node label carries its data-flow count, e.g., App: Oculus 251, SideQuest 119; Data Type: user id 51; Destination: unity 255; Purpose: advertising 119, analytics 70.]
Figure 7: Data flows in context. We consider the data flows (⟨app, data type, destination⟩) found in the network traffic and, in particular, the 370 data flows associated with consistent disclosures. We analyze these in conjunction with their purpose as extracted from the privacy policy text and depict the augmented tuples ⟨app, data type, destination, purpose⟩ in the above alluvial diagram. The diagram is read from left to right; for example: (1) out of 251 data flows from the Oculus app store, no more than 51 data flows collect User ID and send it to various destinations; (2) the majority of User ID is collected by Unity; and (3) Unity is responsible for the majority of data flows with the purpose of advertising. Finally, the color scheme of the edges helps keep track of the flows. From App to Data Type, the color indicates the app store: blue for Oculus apps and gray for SideQuest apps. From Data Type to Destination, the color indicates the type of data collected: PII and VR Sensory Data flows are in orange, while Fingerprinting flows are in green. From Destination to Purpose, we use blue to denote first-party destinations and red to denote third-party destinations.
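Diagrams like Fig. 7 can be generated directly from the augmented tuples. Below is a minimal sketch using plotly's Sankey trace; the node labels follow the figure, but the three link counts and the output file name are illustrative assumptions, not our full dataset.

import plotly.graph_objects as go

# Node indices: 0 = Oculus apps, 1 = user id, 2 = unity, 3 = advertising.
labels = ["Oculus apps", "user id", "unity", "advertising"]
fig = go.Figure(go.Sankey(
    node=dict(label=labels),
    link=dict(source=[0, 1, 2],  # App -> Data Type -> Destination -> Purpose
              target=[1, 2, 3],
              value=[51, 40, 30]),  # illustrative counts only
))
fig.write_html("data_flows_in_context.html")  # arbitrary output path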
To further understand whether data collection is essential to app functionality, we distinguish between purposes that support core functionality (i.e., basic features, security, personalization, legal purposes, and merger) and those unrelated to core functionality (i.e., advertising, analytics, marketing, and additional features) [33]. Intuitively, core functionality denotes the services that users expect from an app, such as reading articles in a news app or making a purchase in a shopping app. We found that only 31% (116/370) of all data flows are related to core functionality, while 69% (254/370) are unrelated. Interestingly, 81% (94/116) of the core-functionality data flows are associated with third-party entities, indicating that app developers rely on cloud services. On the other hand, data collected for purposes unrelated to core functionality can be used, e.g., for marketing emails or cross-platform targeted advertisements; indeed, 83% (211/254) of these data flows are associated with third-party tracking entities, partly corroborating our ATS findings in Section 3.3. In OVR, data types can be collected for tracking purposes and used for ads on other media (e.g., the Facebook website) rather than on the Quest 2 device itself.
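Concretely, this split amounts to a fixed purpose-to-category mapping, as in the sketch below; the category sets follow the grouping above (based on [33]), while the flow tuples are invented examples.

CORE = {"basic feature", "security", "personalization", "legal", "merger"}
NON_CORE = {"advertising", "analytics", "marketing", "additional feature"}

def split_by_core(purposed_flows):
    """purposed_flows: iterable of (app, data type, destination, purpose)."""
    core = [f for f in purposed_flows if f[3] in CORE]
    non_core = [f for f in purposed_flows if f[3] in NON_CORE]
    return core, non_core

flows = [("ExampleApp", "android id", "unity", "advertising"),
         ("ExampleApp", "email address", "1st party", "basic feature")]
core, non_core = split_by_core(flows)
print(f"{len(core)} core-functionality flow(s), {len(non_core)} unrelated")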
Next, we looked into the data types exposed for different purposes. The majority of data flows related to core functionality (56% or 65/116) expose PII data types, while Fingerprinting data types appear in most (66% or 173/254) data flows unrelated to core functionality. We found that 15 data types are collected for core functionality: besides PII, these comprise Fingerprinting (41% or 48/116 data flows) and VR Sensory Data (3% or 3/116 data flows). We found that 19 data types are collected for purposes unrelated to core functionality: besides Fingerprinting, these comprise PII (26% or 65/254 data flows) and VR Sensory Data (6% or 16/254 data flows). Interestingly, VR Movement, VR Play Area, and VR Field of View are mainly used for "advertising", while VR Movement and VR Pupillary Distance are used for "basic features", "security", and "merger" purposes [19].
Validation of Polisis results (purpose extraction).
To validate the results pertaining to purpose extraction, we read all 141 text segments previously annotated by Polisis. We then manually annotated each text segment with one or more purposes (based on the nine distinct purposes identified by Polisis). Three authors looked at each segment independently and then agreed upon its annotation. We then compared our annotations with the purposes output by Polisis for the same segments. We found that this purpose extraction has 80% precision, 79% recall, and 78% F1-score (micro-averaged). (Note that this is multi-label classification; thus, unlike the multi-class classification for PoliCheck, the micro-averaged precision, recall, and F1-score differ.) These micro-averaged results are directly comparable to Polisis' results in [19]: OVRSEEN's purpose extraction works well on VR apps. For completeness, we also computed the macro-averaged precision, recall, and F1-score: 81%, 78%, and 78%, respectively. Table 9 in Appendix D.3 in [58] reports the precision, recall, and F1-score for each purpose class, along with their micro- and macro-averages.
5 Discussion
5.1 VR-Specific Considerations
VR tracking has unique aspects and trends compared to other
ecosystems, including but not limited to the following.
VR ads.
The VR advertising ecosystem is currently in its infancy. Our analysis of destinations in the network traffic revealed that ad-related activity was entirely absent from OVR at the time of our experiments. Facebook only recently (June 2021) started testing on-device ads for Oculus [41]. Ads on VR platforms will be immersive experiences instead of flat visual images; for example, Unity uses separate virtual rooms for ads [60]. We expect tracking to expand further once advertising comes to VR (e.g., to include tracking how users interact and behave within the virtual ad space). As VR advertising and tracking evolve, our OVRSEEN methodology, system, and datasets will continue to enable analysis that was not previously possible on any VR platform.
Comparison to other ecosystems.
Our analysis showed that the major players in the OVR tracking ecosystem are currently Facebook and Unity (see Figs. 2 and 5). More established ecosystems, such as mobile and Smart TVs, are dominated by Alphabet [23, 64] and have a more diverse playing field of trackers (e.g., Amazon, Comscore Inc., and Adobe), spanning hundreds of tracking destinations [23, 52, 64]. OVR currently has only a few players (e.g., Facebook, Unity, Epic, and Alphabet). OVRSEEN can be a useful tool for the continued study of this developing ecosystem.
Sensitive data.
Compared to other devices, such as mobile phones, Smart TVs, and IoT devices, the data that can be collected by a VR headset is arguably more sensitive. For example, OVR has access to various biometric information (e.g., pupillary distance, hand geometry, and body motion tracking data) that can be used to identify users and even infer their health [40]. A study by Miller et al. [34] demonstrated the feasibility of identifying users with a simple machine learning model using less than five minutes of body motion tracking data from a VR device. Our experiments found evidence of apps collecting data types that are unique to VR, including biometric-related data types (see Section 3.2.2). While the numbers we found are small so far, given the developing VR tracking ecosystem, it is important to have a system such as OVRSEEN to detect increasing collection of sensitive data over time.
Generalization.
Within OVR, we used OVRSEEN to analyze only the 140 apps in our corpus. However, we believe that it can be applied to other OVR apps, as long as they are built according to OVR standards. Beyond OVR, the network traffic analysis and network-to-policy consistency analysis can also be applied to other platforms, as long as their network traffic can be decrypted, as was the case in prior work on Android, Smart TVs, etc. [35, 45, 51, 64].
5.2 Recommendations
Based on our findings, we provide recommendations for the
OVR platform and developers to improve their data trans-
parency practices.
Provide a privacy policy.
We found that 38 of the 140 popular apps (19 of them from the official Oculus app store) did not provide a privacy policy at all. Furthermore, 97% of inconsistent data flow disclosures were omitted disclosures due to these 38 apps missing privacy policies (see Section 4). We recommend that the OVR platform require developers to provide a privacy policy for their apps, especially those available on the official Oculus app store.
Reference other parties' privacy policies.
Developers are not the only ones collecting data while an app is used: third parties (e.g., Unity, Microsoft) and the platform party (e.g., Oculus/Facebook) can also collect data. We found that 81 out of 102 app privacy policies did not reference the policies of the third-party libraries used by the app. We recommend that developers reference third-party and platform-party privacy policies. If they do, the consistency of disclosures becomes quite high: up to 74% of the data flows in the network traffic we collected (see Section 4.1.3). This indicates that, at least at this early stage, the VR ecosystem is better behaved than the mobile tracking ecosystem.
Notice and consent.
We found that fewer than 10 of the 102 apps that provide a privacy policy explicitly ask users to read it and consent to data collection (e.g., for analytics purposes) upon first opening the app. We recommend that developers provide notice and ask for users' consent (e.g., when a user launches the app for the first time) for data collection and sharing, as required by privacy laws such as the GDPR [67].
Notifying developers.
We contacted Oculus as well as the developers of the 140 apps that we tested. We provided courtesy notifications of the specific data flows and consistency issues we identified in their apps, along with recommendations. We received 24 responses (see the details in Appendix E in [58]). Developers were, in general, appreciative of the information and willing to adopt our recommendations to improve their privacy policies. Several indicated that they did not have the training or tools to ensure consistent disclosures.
6 Related Work
Privacy in Context.
The framework of "Privacy in Context" [38] specifies the following aspects of information flow: (1) actors: sender, recipient, subject; (2) type of information; and (3) transmission principle. The goal is to determine whether an information flow is appropriate within its context. The "transmission principle" is key in determining the appropriateness of a flow and may include the purpose of data collection, notice and consent, legal requirements, etc. [38]. In this paper, we seek to provide context for the data flows (⟨app, data type, destination⟩) found in the network traffic. We primarily focus on network-to-policy consistency and the purpose of data collection, and we briefly comment on notice and consent. Most prior work on network analysis only characterized destinations (first vs. third parties, ATS, etc.) or the data types exposed, without additional context. One exception is MobiPurpose [22], which inferred the data collection purposes of mobile (not VR) apps using network traffic and app features (e.g., URL paths, app metadata, domain names); its authors noted that "the purpose interpretation can be subjective and ambiguous". Our notion of purpose is explicitly stated in the privacy policies and/or indicated by the destination domain matching ATS blocklists. Shvartzshnaider et al. introduced the contextual integrity (CI) framework to understand and evaluate privacy policies [54]; they, however, relied on manual inspection rather than automation.
Privacy of various platforms.
The research community has investigated privacy risks on various platforms using static or dynamic code analysis and, most relevant to us, network traffic analysis. Enck et al. performed static analysis of Android apps [13] and discovered PII misuse (e.g., personal/phone identifiers) and ATS activity. TaintDroid [12] first introduced taint tracking for mobile apps. Ren et al. [46] comprehensively evaluated information exposure on smart home IoT devices. Moghaddam et al. and Varmarken et al. observed the prevalence of PII exposure and ATS activity on Smart TVs [35, 64]. Lentzsch et al. [27] comprehensively evaluated Alexa, a voice assistant platform. Ren et al. [47], Razaghpanah et al. [45], and Shuba et al. [51-53] developed tools for analyzing network traffic generated by mobile apps and inspecting it for privacy exposures and ATS activity. Our work is the first to perform network traffic analysis on the emerging OVR platform, using dynamic analysis to capture and decrypt network traffic on the device; this is more challenging for Unity- and Unreal-based apps because, unlike prior work that dealt with standard Android APIs, we had to deal with stripped binary files (i.e., no symbol table). Augmented reality (AR) is another platform the research community has focused on in the past decade [1, 24, 26, 44, 48, 66]. While AR augments our perception of and interaction with the real world, VR replaces the real world with a virtual one. Nevertheless, some AR privacy issues are similar to those in VR, since the platforms share similar sensors, e.g., motion sensors.
Privacy of VR.
Although there is work on the security aspects of VR devices (e.g., authentication and attacks on virtual keyboards) [11, 29-31], the privacy of VR is currently not fully understood. Adams et al. [2] interviewed VR users and developers about security and privacy concerns, and learned that they were concerned about the data collection potentially performed by VR devices (e.g., sensors, the device being always on) and that they did not trust VR manufacturers (e.g., Facebook owning Oculus). Miller et al. [34] presented a study on the implications of VR technology's ability to track body motions. Our work is motivated by these concerns but goes beyond user surveys to analyze the data collection practices exhibited in network traffic and stated in privacy policies.
Privacy policy analysis.
Privacy policy and consistency analysis in various app ecosystems [4, 5, 19, 55, 65, 68, 69] is becoming increasingly automated. Privee [68] is a privacy policy analyzer that uses NLP to classify the content of a website privacy policy using a set of binary questions. Slavin et al. used static code analysis, ontologies, and information flow analysis to analyze privacy policies of mobile apps on Android [55]. Wang et al. applied similar techniques to check for privacy leaks from user-entered data in GUIs [65]. Zimmeck et al. also leveraged static code analysis for privacy policy consistency analysis [69]; they improved on previous work by attempting to comply with legal requirements (e.g., first vs. third party, negative policy statements, etc.). In Section 4, we leverage two state-of-the-art tools, namely PoliCheck [5] and Polisis [19], to perform data-flow-to-policy consistency analysis and to extract the data collection purpose, respectively. PoliCheck was built on top of PolicyLint [4], a privacy policy analyzer for mobile apps; it analyzes both positive and negative data collection (and sharing) statements and detects contradictions. Lentzsch et al. [27] used PoliCheck off the shelf with a data ontology crafted for Alexa skills. OVRSEEN focuses on OVR and improves on PoliCheck in several ways, including VR-specific ontologies, referencing third-party policies, and extracting data collection purposes.
7 Conclusion

Summary.
We presented the first comprehensive study of privacy aspects of Oculus VR (OVR), the most popular VR platform. We developed OVRSEEN, a methodology and system to characterize the data collection and sharing practices of the OVR ecosystem by (1) capturing and analyzing the data flows found in the network traffic of 140 popular OVR apps, and (2) providing additional context via privacy policy analysis that checks for consistency and identifies the purpose of data collection. We make OVRSEEN's implementation and datasets publicly available at [59]. An extended version of this paper, including appendices, can be found in [58].
Limitations and future directions.
On the networking side, we were able to decrypt, for the first time, the traffic of OVR apps, but the OVR platform itself is closed and we could not decrypt most of its traffic. In future work, we will explore addressing this limitation through further binary analysis. On the privacy policy side, PoliCheck and Polisis rely on different underlying NLP models, with inherent limitations and incompatibilities; this motivates future work on a unified privacy policy and context analyzer.
Acknowledgment

This project was supported by NSF Awards 1815666 and 1956393. We would like to thank our shepherd, Tara Whalen, and the USENIX Security 2022 reviewers for their feedback, which helped significantly improve the paper. We would also like to thank Yiyu Qian for his help with part of our data collection process.
References

[1] A. Acquisti, R. Gross, and F. D. Stutzman. Face Recognition and Privacy in the Age of Augmented Reality. Journal of Privacy and Confidentiality, 6(2):1, 2014.
[2] D. Adams, A. Bah, C. Barwulor, N. Musaby, K. Pitkin, and E. M. Redmiles. Ethics Emerging: the Story of Privacy and Security Perceptions in Virtual Reality. In SOUPS, Aug. 2018.
[3] O. Alrawi, C. Lever, M. Antonakakis, and F. Monrose. SoK: Security Evaluation of Home-Based IoT Deployments. In IEEE SP, 2019.
[4] B. Andow, S. Y. Mahmud, W. Wang, J. Whitaker, W. Enck, B. Reaves, K. Singh, and T. Xie. PolicyLint: Investigating Internal Privacy Policy Contradictions on Google Play. In USENIX Security, Aug. 2019.
[5] B. Andow, S. Y. Mahmud, J. Whitaker, W. Enck, B. Reaves, K. Singh, and S. Egelman. Actions Speak Louder than Words: Entity-Sensitive Privacy Policy and Data Flow Analysis with PoliCheck. In USENIX Security, Aug. 2020.
[6] ARMmbed. mbedtls: x509_crt.c. https://github.com/ARMmbed/mbedtls/blob/development/library/x509_crt.c, 2021.
[7] C. Brubaker and Android Security team. Changes to trusted certificate authorities in Android Nougat. https://android-developers.googleblog.com/2016/07/changes-to-trusted-certificate.html, July 2016.
[8] BSDgeek_Jake (XDA Developer). MoAAB: Mother of all ad-blocking. https://forum.xda-developers.com/showthread.php?t=1916098, 2019.
[9] P. Cipolloni. Universal Android SSL pinning bypass with Frida. https://techblog.mediaservice.net/2017/07/universal-android-ssl-pinning-bypass-with-frida/, July 2017.
[10] Disconnect, Inc. disconnect-tracking-protection: Canonical repository for the Disconnect services file. https://github.com/disconnectme/disconnect-tracking-protection, 2021.
[11] R. Duezguen, P. Mayer, S. Das, and M. Volkamer. Towards Secure and Usable Authentication for Augmented and Virtual Reality Head-Mounted Displays. arXiv preprint arXiv:2007.11663, 2020.
[12] W. Enck, P. Gilbert, B.-G. Chun, L. P. Cox, J. Jung, P. McDaniel, and A. N. Sheth. TaintDroid: An Information-Flow Tracking System for Realtime Privacy Monitoring on Smartphones. In OSDI, Oct. 2010.
[13] W. Enck, D. Octeau, P. McDaniel, and S. Chaudhuri. A Study of Android Application Security. In USENIX Security, Aug. 2011.
[14] Epic Games, Inc. OpenSSL (Unreal version). https://github.com/EpicGames/UnrealEngine/tree/master/Engine/Source/ThirdParty/OpenSSL, 2021.
[15] Epic Games, Inc. Unreal Engine. https://www.unrealengine.com/, 2021.
[16] Facebook. Facebook to acquire Oculus. https://about.fb.com/news/2014/03/facebook-to-acquire-oculus/, March 2014.
[17] E. Fernandes, J. Jung, and A. Prakash. Security Analysis of Emerging Smart Home Applications. In IEEE SP, 2016.
[18] Google. Android developers - best practices for unique identifiers. https://developer.android.com/training/articles/user-data-ids, 2021.
[19] H. Harkous, K. Fawaz, R. Lebret, F. Schaub, K. G. Shin, and K. Aberer. Polisis: Automated Analysis and Presentation of Privacy Policies Using Deep Learning. In USENIX Security, Aug. 2018.
[20] S. Hayden. Oculus Quest 2 surpasses original Quest in monthly active users. https://www.roadtovr.com/oculus-quest-2-monthly-active-users/, January 2021.
[21] D. Heaney. The Oculus Quest 2 'jailbreak' seems to be fake. https://uploadvr.com/oculus-quest-2-jailbreak-seems-fake/, November 2020.
[22] H. Jin, M. Liu, K. Dodhia, Y. Li, G. Srivastava, M. Fredrikson, Y. Agarwal, and J. I. Hong. Why Are They Collecting My Data? Inferring the Purposes of Network Traffic in Mobile Apps. In ACM IMWUT, 2018.
[23] K. Kollnig, A. Shuba, R. Binns, M. V. Kleek, and N. Shadbolt. Are iPhones Really Better for Privacy? Comparative Study of iOS and Android Apps. arXiv preprint arXiv:2109.13722, 2021.
[24] A. Kotsios. Privacy in an Augmented Reality. International Journal of Law and Information Technology, 23(2):157-185, 2015.
[25] B. Lang. Where to change Quest 2 privacy settings and see your VR data collected by Facebook. https://www.roadtovr.com/oculus-quest-2-privacy-facebook-data-collection-settings/, October 2020.
[26] K. Lebeck, K. Ruth, T. Kohno, and F. Roesner. Towards Security and Privacy for Multi-User Augmented Reality: Foundations with End Users. In IEEE SP, 2018.
[27] C. Lentzsch, S. J. Shah, B. Andow, M. Degeling, A. Das, and W. Enck. Hey Alexa, is this Skill Safe?: Taking a Closer Look at the Alexa Skill Ecosystem. In NDSS, 2021.
[28] Y. Li, Z. Yang, Y. Guo, and X. Chen. DroidBot: a Lightweight UI-Guided Test Input Generator for Android. In IEEE/ACM ICSE-C, 2017.
[29] Z. Ling, Z. Li, C. Chen, J. Luo, W. Yu, and X. Fu. I Know What You Enter on Gear VR. In IEEE CNS, 2019.
[30] S. Luo, A. Nguyen, C. Song, F. Lin, W. Xu, and Z. Yan. OcuLock: Exploring Human Visual System for Authentication in Virtual Reality Head-mounted Display. In NDSS, 2020.
[31] F. Mathis, J. H. Williamson, K. Vaniea, and M. Khamis. Fast and Secure Authentication in Virtual Reality using Coordinated 3D Manipulation and Pointing. ACM ToCHI, 2021.
[32] L. Matney. The Oculus Quest's unofficial app store gets backing from Oculus founder Palmer Luckey. https://techcrunch.com/2020/09/23/the-oculus-quests-unofficial-app-store-gets-backing-from-oculus-founder-palmer-luckey/, September 2020.
[33] E. McCallister, T. Grance, and K. Scarfone. Guide to Protecting the Confidentiality of Personally Identifiable Information (PII). Technical Report NIST Special Publication (SP) 800-122, 2010.
[34] M. R. Miller, F. Herrera, H. Jun, J. A. Landay, and J. N. Bailenson. Personal Identifiability of User Tracking Data during Observation of 360-degree VR Video. Scientific Reports, 2020.
[35] H. Mohajeri Moghaddam, G. Acar, B. Burgess, A. Mathur, D. Y. Huang, N. Feamster, E. W. Felten, P. Mittal, and A. Narayanan. Watching You Watch: The Tracking Ecosystem of Over-the-Top TV Streaming Devices. In ACM CCS, 2019.
[36] Mozilla Corporation and Individual mozilla.org contributors. Privacy & security guide: Oculus Quest 2 VR headset. https://foundation.mozilla.org/en/privacynotincluded/oculus-quest-2-vr-headset/, November 2020.
[37] Mozilla Corporation and Individual mozilla.org contributors. What is fingerprinting and why you should block it. https://www.mozilla.org/en-US/firefox/features/block-fingerprinting/, 2021.
[38] H. Nissenbaum. Privacy in Context - Technology, Policy, and the Integrity of Social Life. 2010.
[39] Oculus. A single way to log into Oculus and unlock social features. https://www.oculus.com/blog/a-single-way-to-log-into-oculus-and-unlock-social-features/, August 2020.
[40] Oculus. Track your fitness in VR with Oculus Move. https://support.oculus.com/move/, 2021.
[41] Oculus Blog. Testing In-Headset VR Ads. https://www.oculus.com/blog/testing-in-headset-vr-ads, Sep 2021.
[42] Ole André V. Ravnås. Frida - dynamic instrumentation toolkit for developers, reverse-engineers, and security researchers. https://frida.re/, 2021.
[43] Pi-hole. Pi-hole: Network-wide ad blocking. https://pi-hole.net/, 2021.
[44] P. A. Rauschnabel, J. He, and Y. K. Ro. Antecedents to the Adoption of Augmented Reality Smart Glasses: A Closer Look at Privacy Risks. Journal of Business Research, 92:374-384, 2018.
[45] A. Razaghpanah, R. Nithyanand, N. Vallina-Rodriguez, S. Sundaresan, M. Allman, C. Kreibich, and P. Gill. Apps, Trackers, Privacy, and Regulators: A Global Study of the Mobile Tracking Ecosystem. In NDSS, 2018.
[46] J. Ren, D. J. Dubois, D. Choffnes, A. M. Mandalari, R. Kolcun, and H. Haddadi. Information Exposure From Consumer IoT Devices: A Multidimensional, Network-Informed Measurement Approach. In IMC, 2019.
[47] J. Ren, A. Rao, M. Lindorfer, A. Legout, and D. Choffnes. ReCon: Revealing and Controlling PII Leaks in Mobile Network Traffic. In MobiSys, 2016.
[48] F. Roesner, T. Kohno, and D. Molnar. Security and Privacy for Augmented Reality Systems. CACM, 57(4):88-96, 2014.
[49] S. Rogers. Virtual reality for good use cases: From educating on racial bias to pain relief during childbirth. https://www.forbes.com/sites/solrogers/2020/03/09/virtual-reality-for-good-use-cases-from-educating-on-racial-bias-to-pain-relief-during-childbirth/, March 2020.
[50] P. Sarnoff. The VR in the enterprise report: How retailers and brands are illustrating VR's potential in sales, employee training, and product development. https://www.businessinsider.com/virtual-reality-for-enterprise-sales-employee-training-product-2018-12, December 2018.
[51] A. Shuba, A. Le, E. Alimpertis, M. Gjoka, and A. Markopoulou. AntMonitor: A System for On-Device Mobile Network Monitoring and its Applications. arXiv preprint arXiv:1611.04268, 2016.
[52] A. Shuba and A. Markopoulou. NoMoATS: Towards Automatic Detection of Mobile Tracking. In PETS, 2020.
[53] A. Shuba, A. Markopoulou, and Z. Shafiq. NoMoAds: Effective and Efficient Cross-App Mobile Ad-Blocking. In PETS, 2018.
[54] Y. Shvartzshnaider, N. Apthorpe, N. Feamster, and H. Nissenbaum. Going Against the (Appropriate) Flow: a Contextual Integrity Approach to Privacy Policy Analysis. In HCOMP, 2019.
[55] R. Slavin, X. Wang, M. B. Hosseini, J. Hester, R. Krishnan, J. Bhatia, T. D. Breaux, and J. Niu. Toward a Framework for Detecting Privacy Policy Violations in Android Application Code. In ACM/IEEE ICSE, 2016.
[56] Software Freedom Conservancy. SeleniumHQ browser automation. https://www.selenium.dev/, 2021.
[57] Spatial Systems, Inc. Spatial: Virtual spaces that bring us together. https://spatial.io/, 2021.
[58] R. Trimananda, H. Le, H. Cui, J. T. Ho, A. Shuba, and A. Markopoulou. OVRseen: Auditing Network Traffic and Privacy Policies in Oculus VR. arXiv preprint arXiv:2106.05407, 2021.
[59] UCI Networking Group. OVRseen project page. https://athinagroup.eng.uci.edu/projects/ovrseen/.
[60] Unity. The Virtual Room ad: a real way to make money in VR. https://create.unity3d.com/virtual-room-ad, 2021.
[61] Unity Technologies. mbed tls: An open source, portable, easy to use, readable and flexible SSL library. https://github.com/Unity-Technologies/mbedtls, 2021.
[62] Unity Technologies. Unity - the leading platform for creating interactive, real-time content. https://unity.com/, 2021.
[63] Unity Technologies. Unity manual: Managed code stripping. https://docs.unity3d.com/Manual/ManagedCodeStripping.html, 2021.
[64] J. Varmarken, H. Le, A. Shuba, A. Markopoulou, and Z. Shafiq. The TV is Smart and Full of Trackers: Measuring Smart TV Advertising and Tracking. In PETS, 2020.
[65] X. Wang, X. Qin, M. Bokaei Hosseini, R. Slavin, T. D. Breaux, and J. Niu. GUILeak: Tracing Privacy Policy Claims on User Input Data for Android Applications. In IEEE/ACM ICSE, 2018.
[66] B. Wassom. Augmented Reality Law, Privacy, and Ethics: Law, Society, and Emerging AR Technologies. 2014.
[67] B. Wolford. What is GDPR, the EU's new data protection law? https://gdpr.eu/what-is-gdpr/, 2019.
[68] S. Zimmeck and S. M. Bellovin. Privee: An Architecture for Automatically Analyzing Web Privacy Policies. In USENIX Security, Aug. 2014.
[69] S. Zimmeck, Z. Wang, L. Zou, R. Iyengar, B. Liu, F. Schaub, S. Wilson, N. M. Sadeh, S. M. Bellovin, and J. R. Reidenberg. Automated Analysis of Privacy Requirements for Mobile Apps. In NDSS, 2017.