U.S. employers have a growing demand for so-called data scientists who can analyze and manipulate the mountains of information generated and stored in the Internet age.
Harvard Business Review last year called this profession "the sexiest job of the 21st century."
One measure of demand: Hours billed for work in statistical analysis grew by 522 percent in the first quarter compared with the same period in 2011, according to data from oDesk Corp., which runs an online service connecting employers with remote freelancers. Time billed on oDesk for all categories of work in the same time span grew by 135 percent.
Jobs centered on data have been falling into Ana Bertran Ortiz's lap since she finished her electrical engineering Ph.D. in 2007.
The jobs all come from her command of statistics. She got her start designing algorithms for a NASA lab to measure sea levels from outer space and orchestrate the landing of the Mars rover. She's researched how to design flight paths to get more information from radar signals, and helped hone a mobile application that forecasts weather in 10-minute increments.
She's now working on software that automatically diagnoses glitches in the networks that house the world's ever-expanding trove of information.
She earns more than $100,000 at Virtual Instruments, a San Jose-based company that monitors the health of data storage networks.
Unlike statisticians of a previous generation, data scientists work with information sets so big - far too large and unwieldy to fit into an Excel spreadsheet - that they need to write extensive computer code to extract the right segments.
Often, the data is on a scale that requires multiple servers just to access the numbers. After that, the analysts run calculations - correlations, regressions, t-tests, machine learning algorithms - to discover the patterns they're looking for.
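For readers unfamiliar with those calculations, the following is a minimal sketch of three of them (a correlation, a simple regression and a t-test) written in plain Python. The function names are hypothetical and the numbers are invented for illustration; none of it comes from the analysts or companies in this story, and real work at this scale would typically rely on specialized libraries and far larger data.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def least_squares(xs, ys):
    """Slope and intercept of the ordinary least-squares line y = slope*x + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def welch_t(a, b):
    """Welch's t-statistic for the difference in means of two samples."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(
        va / len(a) + vb / len(b)
    )


if __name__ == "__main__":
    # Invented example data, for illustration only.
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [2.1, 3.9, 6.2, 8.1, 9.8]
    print("correlation:", pearson(x, y))
    print("slope, intercept:", least_squares(x, y))
    print("t-statistic:", welch_t(x, y))
```

The t-statistic here is only the first step of a t-test; turning it into a p-value requires the t-distribution, which is omitted to keep the sketch short.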
Some freelancers bill more than $100 an hour for these services.
The scope of data collection is widening in the private and public sectors, a shift that was highlighted recently when the Guardian and Washington Post disclosed the existence of secret U.S. government programs that collect data on U.S. residents' telephone calls and foreign nationals' Internet activity. James R. Clapper, the director of national intelligence, subsequently confirmed the existence of the programs.
The national security industry is among the biggest employers of big-data professionals, says an analysis by Burning Glass, a Boston-based job-matching company. One of the best-known companies specializing in big-data analysis is Palantir Technologies Inc., which made its name offering terrorism analysis software to the Pentagon and the Central Intelligence Agency.
The challenge for employers is that there aren't enough data scientists, and the multiplying trove of information is likely to deepen the shortage. By 2020, the digital data created, replicated and consumed in a single year will reach 40,000 exabytes, or 40 trillion gigabytes, according to a December study by technology research firm IDC. That's roughly a 300-fold increase from the 130 exabytes in 2005.
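The arithmetic behind those figures can be checked in a few lines. This sketch assumes decimal units (1 exabyte = 1 billion gigabytes), which is an assumption about how the totals are expressed rather than something the study is quoted as stating; the function names are hypothetical.

```python
# Assumption: decimal (SI) units, so 1 exabyte = 10**9 gigabytes.
GIGABYTES_PER_EXABYTE = 10**9


def exabytes_to_gigabytes(exabytes):
    """Convert exabytes to gigabytes under the decimal-unit assumption."""
    return exabytes * GIGABYTES_PER_EXABYTE


def fold_increase(new, old):
    """How many times larger `new` is than `old`."""
    return new / old


projected_2020_eb = 40_000  # figure cited in the article
baseline_2005_eb = 130      # figure cited in the article

print(exabytes_to_gigabytes(projected_2020_eb))  # 40000000000000, i.e. 40 trillion
print(round(fold_increase(projected_2020_eb, baseline_2005_eb)))  # 308
```

The ratio works out to about 308, which matches the roughly 300-fold increase cited.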
By 2018, the U.S. may face a shortage of as many as 190,000 people with deep analytical skills and 1.5 million managers and analysts who know how to use big data to make decisions, McKinsey Global Institute said in a report in 2011.
"It's so cross-functional and you need multiple skills - you need programming, you need statistics, you need visualization, you need database skills," said Harpinder Singh Madan, co-founder at Slice, a Palo Alto-based startup that helps consumers track and analyze their e-mailed receipts. "The bottom line is that there's no institution that trains for this."
Businesses are improvising, pulling people from all kinds of backgrounds that require an understanding of statistics. The recruits range from former nuclear physicists to neurosurgeons and marine biologists, many of whom hold doctorates in their previous fields.