Funding the production of quality online content is a pressing problem for
content producers. The most common funding method, online advertising, is rife
with well-known performance and privacy harms, and an intractable subject-agent
conflict: many users do not want to see advertisements, depriving the site of
needed funding.
Because of these negative aspects of advertisement-based funding, paywalls
are an increasingly popular alternative for websites. This shift to a
"pay-for-access" web has potentially huge implications for the web
and society. Instead of a system where information (nominally) flows freely,
paywalls create a web where high quality information is available to fewer and
fewer people, leaving the remaining web users with less information, which
may also be less accurate and of lower quality. Despite the potential
significance of a move from an "advertising-but-open" web to a "paywalled" web,
we find this issue understudied.
This work addresses this gap in our understanding by measuring how widely
paywalls have been adopted, what kinds of sites use paywalls, and the
distribution of policies enforced by paywalls. Our findings include that
(i) paywall use is accelerating (2x more paywalls every 6 months),
(ii) paywall adoption differs by country (e.g. 18.75% in the US, 12.69% in
Australia), (iii) paywalls change how users interact with sites (e.g. higher
bounce rates, fewer incoming links), (iv) the median cost of annual paywall
access is $108 per site, and (v) paywalls are in general trivial to circumvent.
Finally, we present the design of a novel, automated system for detecting
whether a site uses a paywall, through the combination of runtime browser
instrumentation and repeated programmatic interactions with the site. We intend
this classifier to augment future longitudinal measurements of paywall use and
behavior.