This is the fifth post in a series of excerpts from my paper, Ethical and Legal Considerations of reCAPTCHA, to be presented at PST 2012. The paper’s primary purpose is to provoke thought and discussion. I’ve signed a document prohibiting me from publishing the final copy of the paper, but I am allowed to post the paper as originally submitted for consideration, so here it is…
Google and Facebook are two companies that employ reCAPTCHA. These companies have large market shares, at least regionally, in one or more businesses. In the former case, users may find it difficult to escape the digital conglomerate’s reach, all the more so when they take advantage of its single sign-in system. In the latter, lock-in is achieved through the value provided via the network effect and the difficulty in porting identities to a new social network.
The use of reCAPTCHA by large companies such as these is an anticompetitive practice. These companies can increase their market share in one domain or form a coercive monopoly in another. For example, Google has forced users who wish to participate in its dominant ecosystem (and others, through its reCAPTCHA API) to solve reCAPTCHAs. It is hard to evade Google’s reach, as its business includes search, advertising (AdSense), e-mail (Gmail), blogs (Blogger and Blogspot), video (YouTube), imaging (Picasa), office productivity software (Google Docs), operating systems (Android), and browsers (Chrome). In US v. Microsoft, Microsoft was found guilty of tying Internet Explorer to Windows in order to maintain its monopoly on “Intel compatible PC operating systems” through an attempt “to monopolize the browser market” (United States Court of Appeals 2003). By having users solve OCR reCAPTCHAs, Google can create a coercive monopoly (see below) on digitized texts to further cement its lead in the search and on-line advertising businesses. With a reCAPTCHA implementation rather than GWAP Google Image Tagger, Google could also create a coercive monopoly on image searches, again bolstering its search and advertising businesses.
The coercive monopolies mentioned above are created because Google can afford not to charge for other services while minimizing its own R&D costs by exploiting users through reCAPTCHA. Google can thus engage in dumping at minimal cost to itself, since the true cost of business is paid by users of its other systems (via reCAPTCHA). Startups are effectively shut out because they cannot compete on cost or brand recognition. Thus, by redesigning reCAPTCHAs for different tasks, large companies can create coercive monopolies in (virtually) any domain where access to a large pool of human intelligence working on tiny tasks provides a large competitive advantage.
There are several reasons why reCAPTCHA falls afoul of labour laws in different jurisdictions. Some people are employed to correct the output of OCR software, so reCAPTCHA solvers are performing marketable work. Indeed, Human Intelligence Tasks (HITs) on Amazon Mechanical Turk, among which a reCAPTCHA task would not be out of place, are taxable; the work is considered to be contract work.
In the United States and other countries, there are laws governing the use of child labour. Often, parental consent is required and children need to be of a minimum age. Furthermore, at least in the United States, there are restrictions on what hours of the day children may be employed. Because the fact that work is being performed is obscured, solvers of reCAPTCHA are unable to give consent, and it is likely that children are solving reCAPTCHAs at night, when laws prohibit the employment of minors.
Current American case law permits search engines to profit from content generated by the intellectual property of others without compensating them monetarily: documents published openly on the World Wide Web, even with a copyright notice, should have no expectation of protection against being crawled and cached (United States District Court for the Eastern District of Pennsylvania 2008). Are the fruits of the labour of reCAPTCHA solvers analogous? In Parker v. Yahoo, the court determined that a webmaster knew he had the option of opting out of caching and crawling, for example, via a robots.txt file or by erecting a password-protected pay-wall. However, the plaintiff had chosen not to do so. On the issue of caching, the court briefly dismissed a claim of infringement on the basis that the other defendant, Google, by including a disclaimer about the original source of information when accessing cached documents, was “merely [providing] an archival copy of the original web page”.
In contrast to web content publishers, reCAPTCHA solvers, in addition to the problem of consent, are not given the opportunity to opt out in the way a publisher can with a robots.txt file. Furthermore, the work completed by solvers is not attributed back to them. In the case of Google’s reCAPTCHA, the end result is supposedly not for profit. This puts further legal pressure on their reCAPTCHAs. If users are not paying for a service via reCAPTCHA, Google is coercing users to perform labour, since it is usually the service that they are attempting to access that is the true revenue generator (perhaps indirectly through ads). In this case, because Wr is not performed as payment for a service, the user is provided with no compensation for his/her efforts. Without the ability to opt out, as with a robots.txt file, this is coercion, much like requiring a restaurant patron to leave a gratuity after paying for a meal.
Moreover, the use of visitor numbers (traffic) to value some Internet companies means the audience is being treated as property of the company (although that property can be lost, traded, or purchased). By forcing people to work, reCAPTCHA borders on slavery if Wc is not considered a payment for a service. This second legal argument falls apart if solving a reCAPTCHA does constitute a payment.
If reCAPTCHA is considered a payment via barter trade, then there are tax implications. Even though no money changes hands, barter is still taxable in many jurisdictions, including the United States (Internal Revenue Service 2010). Therefore, unless an organization receiving the benefit of reCAPTCHA does not need to pay taxes (e.g., it is a charitable organization), taxes should be paid on the work performed by reCAPTCHA solvers unless Wr is optional (see §Labour laws). When deciding whether an organization that employs reCAPTCHA should receive charitable status, the potential for commercial exploitation should also be carefully considered. Even if OCR reCAPTCHA is not being used, for example, to digitize ledgers for a for-profit accounting firm, OCR reCAPTCHAs could be used to extract training data for commercial OCR software.