Governments and technology companies are increasingly collecting massive amounts of personal data, which has resulted in new laws, countless investigations, and calls for tougher legislation to protect privacy.
But despite these concerns, economics tells us that society needs more data sharing, not less, because the benefits of publicly available data often outweigh the costs. Public access to accurate health data has accelerated the development of life-saving medical treatments such as the mRNA coronavirus vaccines produced by Moderna and Pfizer. Better economic data could also substantially improve policy responses to the next crisis.
Data is increasingly fueling innovation, and it needs to be used for the common good while simultaneously protecting individual privacy. This is a new area of policymaking, and it requires a careful approach.
The pandemic has brought sharp focus to the growing dominance of big tech companies that consume vast amounts of data. Companies that track digital life, from online retailers to home entertainment providers, collect and use data to forecast demand for their products, set prices, cut costs, and outperform traditional competitors.
Data provides a record of what has already happened, but its main value comes from improving forecasts. Companies like Amazon choose products and prices based on what was purchased. Your data improves their decision-making and increases corporate profits.
Private companies also rely on public data to support their businesses. Redfin and Zillow have transformed the real estate industry thanks to their access to public property databases. Investment banks and consulting firms create economic forecasts and sell them to clients using unemployment and earnings data collected by the Department of Labor. In 2013, one study estimated that public data contributed at least $3 trillion annually across seven sectors of the global economy.
The recurring refrain of the digital age is that “data is the new oil,” but this metaphor is imprecise. Data really is the fuel of the information economy, but it is more like solar energy than oil: a renewable resource that can benefit everyone simultaneously without being depleted.
One of the best examples of the transformative power of open data is the Human Genome Project, a US government-led effort that began in 1990 to map the entire sequence of human DNA by 2005. Before that, private laboratories targeted specific genes and patented them for research purposes or for commercial applications such as developing drugs to treat genetic diseases. Instead of protecting their discoveries, laboratories participating in the Human Genome Project posted their data on a public website within 24 hours of mapping a sequence and made it available free of charge, an arrangement known as the Bermuda Principles.
This commitment to open data saved lives and ushered in an era of scientific advances in genetics. A study by the economist Heidi Williams compared the Human Genome Project with a contemporaneous gene sequencing effort by the company Celera. When Celera was the first to map a gene, it protected its intellectual property by requiring other companies to negotiate licensing agreements or pay high fees before using the data.
Years later, the genes Celera mapped first had led to far fewer innovations and commercial products than those placed immediately in the public domain. One study estimates that the $3.8 billion public investment in the Human Genome Project generated $796 billion in benefits and, in 2010 alone, led to 310,000 new jobs.
The data sharing standards set by the Bermuda Principles also accelerated the development of coronavirus vaccines. A Chinese laboratory announced the discovery of the new coronavirus on January 9, 2020, and released its genome sequence to the public over the following weekend. By the end of January, labs were developing vaccines based on that genome sequence, even though they did not yet have a sample of the virus. Without a commitment to publish the data, the coronavirus vaccines would have taken months longer.
To be sure, the use of consumer genetic data raises serious privacy concerns. It is common practice to remove identifiers such as surnames from genetic data before they are released to the public. However, researchers have sometimes been able to identify individuals anyway by combining anonymized gene sequences with genealogical databases and other public information, such as age and state of residence. These problems can be addressed with stronger protections, but doing so requires constant vigilance.
Privacy cannot be guaranteed with complete certainty, so risks must be minimized and balanced against the benefits of the innovations that may arise from increased data availability.
A similar logic applies to economic data. Consider, for example, the American policy response to the coronavirus. The Paycheck Protection Program, part of the Coronavirus Aid, Relief, and Economic Security Act, provided hundreds of billions of dollars in forgivable loans to small businesses. Despite the large amount of aid available, demand for loans greatly exceeded supply. Ideally, the loans would have been allocated according to anticipated need, but the Treasury had no information on the companies’ financial health.
In the absence of good data, loans were channeled through local banks acting as intermediaries, and those banks lent disproportionately to companies with which they had strong connections. Economists estimate that the program spent from $150,000 to $377,000 per job created, a high price for a program that guaranteed support for only a few months.
A better program would have targeted the businesses and regions most in need, using current data from the companies themselves. This data already exists, but only behind corporate walls. It should be anonymized as thoroughly as possible and then aggregated for public use, so that policymakers and local business owners can direct relief to those who need it most.
Federal data legislation needs a dual mandate, balancing privacy concerns against the social benefits of increased access to data. Two legislative proposals in Congress, introduced by Senators Kirsten Gillibrand and Sherrod Brown, call for the creation of a federal agency dedicated to protecting consumer data. The agency would receive complaints, conduct investigations, and closely monitor emerging technologies that threaten individual privacy.
This data protection agency could be combined with Data.gov, a government website created in 2009 that collects and hosts hundreds of thousands of data sets for public use. Together they could form a kind of federal data library, to democratize knowledge in the digital age.
Just as traditional libraries curate and organize their collections, a digital library could add, organize, and compile new data sources for public use. The federal data library could also take the lead in developing and using new tools such as differential privacy, a technology designed to preserve important features of data while protecting individual identities.
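To make the idea of differential privacy concrete, here is a minimal sketch of one common technique, the Laplace mechanism, applied to a simple count. It is an illustration under stated assumptions, not a description of how any agency actually publishes data: the function name `private_count`, the `revenue_drop` field, and the sample records are all hypothetical.

```python
import random


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy `predicate`.

    Adding or removing one record changes a count by at most 1, so
    Laplace noise with scale 1/epsilon is added to the true count.
    Smaller epsilon means more noise and stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    # The difference of two independent exponential draws with rate
    # epsilon follows a Laplace distribution with scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


if __name__ == "__main__":
    # Hypothetical records: 30 of 100 firms report a drop in revenue.
    firms = [{"revenue_drop": True}] * 30 + [{"revenue_drop": False}] * 70
    noisy = private_count(firms, lambda f: f["revenue_drop"], epsilon=0.5)
    print(f"Published count: {noisy:.1f}")
```

The published figure stays close to the true total, which is what a policymaker needs, while the added noise makes it hard to tell whether any single firm is in the data.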
The increasing value of data as an economic resource requires a new way of thinking. Strict privacy safeguards are needed, and so is making socially valuable data available for the public good.
© The New York Times Foundation 2021