Rise of the AI and Data Economy

By Ethan Widlansky (PO’22)

The Senate Small Business Committee, chaired by Marco Rubio, convened a hearing in late February to discuss its latest report “‘The Made in China 2025’ industrial plan.” Spurned by many free-market Republicans, the report considers “not whether states should organize their economies, but how they should organize them.” A self-avowed free-market apostle as a candidate in the 2016 election, Rubio pulled a political 180 in his report. What in the world could have provoked this change? The short answer: artificial intelligence (AI). The long answer? It will follow.

The Small Business Committee report discusses Chinese President Xi Jinping’s flagship “Made in China 2025” policy agenda, which targets 10 high-value industry sectors for “global dominance.” The report’s authors fret most over China’s growing interest and investment in AI; they don’t want to see China douse the American tech sector’s dynamism in an emerging sector that many analysts see as the future of computing. By some liberal estimates cited in Rubio’s report, “China currently has more than 30 times more capital invested in quantum research than in the U.S.” Ultimately, Rubio concludes: “comparing areas of China’s success to America’s relative decline can help identify areas for creative reform.” Reform of what? “Infonomics.”

What is infonomics—or more simply, the information economy—and where did it come from?

Infonomics is defined as an “emerging discipline of managing and accounting for information with the same or similar rigor and formality as other traditional assets (e.g., financial, physical, intangible, human capital).” The term has its roots in the fallout from the dot-com bubble’s burst in the early 2000s, when companies “badly needed a way to make money… [and] gathering data for targeted advertising was the quickest fix.” We generate data and metadata (data about data) with our every digital interaction: these keystrokes, your online article view, and that fun hyperlink you clicked in the first paragraph were (probably) all tracked. And that’s just a tiny fraction of the data we generate in a day. You might share this article with your friends after you read it, take a couple of Snapchat selfies, mindlessly scroll through Facebook until your class or shift is over, and then navigate home. Firms collect all of this activity automatically, in part because they are lazy and don’t want to catalog and delete extraneous data, and in part because data are a new kind of money, the newly minted currency of our information economy. For example, companies like Google, Facebook, Microsoft, Tencent, Alibaba, and Amazon discovered not long ago that big data could feed artificial intelligence algorithms and create lucrative services, like “suggested for you.” The more data fed into an algorithm, the better the model that algorithm can generate, so companies are always pining for more information.

Uber isn’t worth $68 billion only because its current services are profitable; it’s worth $68 billion also because it collects reams of data from suppliers (drivers) and consumers (passengers) and uses that data to create more and better software, algorithms, advertisements, and services. In the same vein, most people see Tesla as a fancy electric car-maker. But the company is far more: “its latest models collect mountains of data, which allow the firm to optimize its self-driving algorithms and update the software accordingly.” Investors aren’t just purchasing a share of a firm and its tangible products; they’re purchasing possibilities (and, maybe, big returns on their investment). States like Illinois, Ohio, Texas, and Florida have capitalized on this new market, too. They sell driver’s license data, including photos, to private buyers. Some states even sell voter registration data (not a good look, post-Russia hack).

What are the determinants and dynamics in this new market? How is it different from existing markets?

There is no stock exchange on which the more than 180 zettabytes (180 followed by 21 zeros) of data expected to exist by 2025 can be sold and traded. There are, however, a couple of properties of “data” that make it fundamentally different from other more-traditional assets[1] and thus tough to trade in existing markets. For one, it’s unclear to whom data belong. Consider an autonomous car: do the navigation and sensor data collected belong to the driver, the carmaker, the sensor supplier, or the vehicle itself? Data effectively blur Adam Smith’s clear distinction between supply and demand.

Right now, tech firms have a monopsony on data; they act, somewhat paradoxically, as monopolistic consumers, “purchasing” and profiting from data from their end users (who somehow both consume the firm’s service and involuntarily sell their data). Pending legislation, which I will discuss in the conclusion, may change this lopsided relationship.

Data’s second aberrant economic property is that they centralize. Whereas many technologies democratize (think of the internet, which is widely accessible), firms pack as many data as possible into one place, usually proprietary hardware and software systems; more data in one place make better services and models. This practice is more profitable than selling data on an open market that doesn’t yet exist. There are some major exceptions. China, for example, requires that tech companies store their critical data on government-operated servers.

Is data the new oil? And how might this market evolve?

The hype around the information economy stems largely from economists’ and journalists’ analogies to oil, an industry that fueled Western economies during the 19th and 20th centuries and continues to do so in the 21st. Some economists argue that we should let the new “digital” invisible hand establish platforms and conventions in this new economic sector. After all, the highly centralized Standard Oil standardized speculation, extraction, refining, and production in the 19th century by monopoly.

The Economist pronounces that data “are to this century what oil was to the last one: a driver of growth and change. Flows of data have created new infrastructure, new business, new monopolies, new politics and—crucially—new economics.” Wired author, Antonio Garcia Martinez, counters, however, that “oil is literally a liquid, fungible, and transportable commodity… Data, by contrast… are functionally static.” Martinez has a point; bits and bytes aren’t barrels. But quibbling over an imperfect analogy won’t get us anywhere; we have entered an uncharted and undoubtedly important stretch of the digital age and need to find a way to deal with it—not just describe it.

Some reformers want proactive and preemptive government action and regulation; markets, even free ones, by definition cannot exist without rules. American private industry supports government involvement so long as it accelerates centralization. Intel—the American silicon chip-maker—implored Congress to mandate government adoption of AI systems, investment in research, and greater public-private integration.

Some futurists envision solutions coming from the private sector, like data banks or services powered by blockchain technology (the same thing that’s behind Bitcoin), by which consumers can keep track of their data and “invest” them like a portfolio. Policymakers like Rubio, however, aren’t convinced that more private platforms and technology will enable a fair, efficient information economy.

What can we do?

We can’t do a whole lot. There are services that advertise control over your online data like Citizen Me, but these services promise a little too much too soon; the world of data production and collection is too large and complicated for any one service to navigate. As students and citizens, the most we can do is educate ourselves and make our voices heard by the powers that be.

Regulators, for their part, must be “as inventive as the companies they keep an eye on.” And, more importantly, they must look out for everyone, not just Silicon Valley. “Inventive” might look like running algorithms and simulations to keep pace with the private sector, setting technical standards to which firms large and small must accede if they want access to a pot of public data (an extreme example being China’s “National Credit Information Sharing Platform”), and creating data cooperatives in which consumers can decide what they want to do with their data. But some progressives go even further and offer a solution that strikes at the heart of the unfair covenant between end-user and firm.

Possible Solution: Data as Labor

Today, data is treated as an asset. But tomorrow, we might classify it as labor. California governor Gavin Newsom endorsed legislation that forces companies like Google and Facebook to pay a “‘data dividend’ to customers if they use their own personal information. The idea being that if data is the new oil, then the people now who own it… should have a piece of the huge profits being made from it.” Embedded in Newsom’s favored policy proposal is customer ownership over the digital work they do for firms. In their paper on “Treat[ing] Data as Labor,” authors Ibarra, Goff, Hernandez, Lanier, and Weyl insist that we, the end users, play a big role in developing targeted ads, better services, and AI algorithms, and should get paid for it. Data as labor may seem a little outlandish; Instagram likes and YouTube views don’t feel like work. If past is prologue, however, argues Weyl, the 19th century might inform an evolution of data from asset to labor: individuated labor (like work completed by farm hands and factory workers) was once thought of as owned by a foreman, master, or firm before it metamorphosed into compensated labor as we know it.

Firms may argue that they use those Instagram likes and YouTube views to make your experience better, so that your “compensation” is a better service. But many of us never asked for these changes; firms just told us what we wanted. Additionally, once Google and Facebook have our data, they don’t limit its use to improving services; firms use our data not only to make our digital lives easier but also to make money by furnishing algorithms for other conglomerates, governments, and militaries. It doesn’t seem unreasonable to ask that we get a small piece of that billion-dollar pie.

Putting people in charge of their data may even strengthen “essential” labor market stability and boost productivity. Customers will care more about how and what data they “produce” if they’re paid for it. We’ll self-educate and produce more and better data. Newly-empowered customers may also put effective moral pressure on firms to not enter into nefarious AI contracts with their data.

Data as labor stands in contrast to many white papers and policy proposals that, to date, focus on what we can do to make the supply side more competitive. This is a democracy, and our strength emanates from the whole, not just the government and cash cows. Democratizing data, which means involving the rest of us “consumers” in our data and being transparent about where and how it’s used, may be the edge we need to fight against creeping data centralization, its accompanying authoritarian impulse, and competition from abroad.

Of course, things are decidedly more complicated than I have described. Firms, argues Martinez, don’t “owe” us anything, reminding readers that “it’s not like Zuckerberg the Paparazzo snapped a photo of you and monetized your image.” Martinez locates consent at the center of his claim; we agree to use Google and Facebook, and that contract allows them to use our data. But that consent is hardly free. I didn’t have to get Facebook at Pomona College, but I wouldn’t be able to access campus events if I hadn’t signed up. I don’t have to use Google for my internet searches, but professors and peers alike expect that I be able to retrieve near-immediate information whether I’m doing research or need to figure out how old the Jonas Brothers are. And then there’s the much-talked-about “fear of missing out” (FOMO) driving up social media use in general. Let’s not kid ourselves: data as labor would be a grueling legislative campaign.

Problems await even if we do succeed in enacting data as labor. For example, Weyl and his cohort envision customers, newly responsible for their data, self-organizing into data unions. Codifying these digital collectives and contracts is a daunting legal task. As the Harvard Business Review reports, AI is already complicating law and agreements. Like these unions and their concomitant laws, data as labor would likely create other difficult externalities.

In perspective with China

Kai-Fu Lee, a venture capitalist and former head of Google China, asserts that if the future of AI is “technical—major improvements to core algorithms—then the advantage goes to the United States. If it’s implementation—smart infrastructure or policy adaptation—then the advantage goes to China.” Solutions like data as labor, however, don’t fall into this deterministic binary; Lee fails to consider innovation in policy, not just programming. Ideas like democratizing data, joined with Rubio’s more pragmatic, supply-side-oriented plan for more government involvement and investment, may be the edge the United States needs in its AI contest with China. The Chinese may have more data, but we’d have better data. The Chinese may have the “leapfrog effect” and authoritarian macroeconomic control, but we have democracy. But who’s counting?

[1] Davos 2011 classified data as “a new asset class”
