- Associated Press - Saturday, October 11, 2014

PITTSBURGH (AP) - Imagine a stack of Bibles, more than 5 billion of them, reaching from the Earth to the Moon: That’s roughly the amount of patient medical data that will be stored by UPMC’s health information systems come the end of 2016.

Even today, big data presents big challenges for health care systems, not the least of which is how and where to store all the computerized patient data that has been collected over the last decade. But greater challenges loom.

“Genomic data will be overwhelming, from a data storage perspective,” said Christian Carmody, senior vice president of UPMC’s information services division and president of the UPMC-led health information exchange known as ClinicalConnect.

One full human genetic sequence translates into more than 1 terabyte of data (or 244,000 King James Bibles’ worth of information). As genomic sequences move out of the university research setting and into everyday patient care, hospitals are developing methods of storing those sequences so they are readily accessible to clinicians, as well as researchers.

That means that down the road, DNA data might be stored in places where it can be retrieved quickly by clinicians - in the front of the store, so to speak, and not in the stockroom. (“Stockroom” storage might include linear tape magnetic filing systems, kind of like an 8-track, and back-up hard drives. Front-room storage includes spinning-disc hard drives and high-performance, solid-state flash memory drives.)

Beyond DNA sequences, digital images - X-rays, MRI scans, video clips and other forms of unstructured data - soon will be virtually tethered to every computerized patient record. As the resolution and clarity of those images improve, the storage requirements increase as well, Carmody said.

The third big storage challenge lies in portable medical devices, phone apps and wireless, wearable sensors that collect all sorts of health data and transmit it remotely to a patient’s doctor and to a patient’s own medical record.

“We’re going to have new data sources coming at us from all over the place,” Carmody said. “They’re going to start flooding health care.”

To manage the coming flood, the Pittsburgh health system has partnered with IBM to improve its data storage system, building a private cloud to help it store, sort and - most importantly - compress and transmit the data. UPMC also leans on IBM for the actual hardware and software; the leasing partnership means UPMC doesn’t have to absorb the entirety of the capital costs associated with buying such a massive data farm.

UPMC and IBM (and Oracle, Informatica and dbMotion) feed and build the storage farm, which is housed in UPMC’s Forbes Tower in Oakland and at back-up sites in Shadyside and Bloomfield.

UPMC also is spending $100 million to help organize that data so clinicians and researchers can mine it for trends. The ambitious analytics effort should help improve population health at UPMC, and the techniques and technologies could be spread to - and sold to - other hospital systems.

In other words, health information technology and data storage are not just a sunk cost but a potential revenue generator for both UPMC and its tech partners, if the effort proves cost-effective.

And it has, according to IBM.

“UPMC has leveraged new technologies (to) create efficiencies in the way they store data,” said Jeff Barber, IBM’s vice president of software-defined storage. “This has resulted in tremendous savings, by utilizing the storage they’ve got, (better) than the industry average.”

Already the data partnership shows promise. Last year, for example, UPMC announced that, through its data analytics effort, its researchers had been examining the genomic information of 140 patients previously treated for breast cancer, trying to figure out if there is a molecular difference between pre-menopausal and post-menopausal breast cancers.

Performance-wise, the news is good, too.

One of the tasks of the UPMC-IBM team is slowing the growth of the hospital system’s data needs by finding efficient ways to compress data, then decode it instantly when it needs to be read by a clinician.

“We have not seen any data-quality issues,” Carmody said. “We’re in the process of testing and validating this compression system across the board. If we can prove that it doesn’t degrade the performance of online transaction,” it will be adopted more fully.

In a clinical setting, that retrieval and translation process must take no more than a few milliseconds. For most companies, “the response time isn’t as big of a deal,” Barber said. But “when you’re bringing somebody into the operating room who has been hit by a car, it’s kind of a big deal.”

So far, the compression technique is working, and UPMC hasn’t added any data storage capacity in the last 10 months. That’s a huge achievement in the health care universe, where storage needs can double every year or two.

Kevin Muha, UPMC’s director of enterprise architecture and strategic technical manager, said that before this year, the health system had been ordering new data storage drives every quarter, “like clockwork.”

Right now, UPMC’s health data - not including human genome records - consumes about 6 petabytes’ worth of information. But not all storage is created equal, or used equally, Muha said. While data compression is a relatively new technique, UPMC and IBM have been spending years perfecting other methods of reducing storage needs.

The most obvious method is to use existing storage more efficiently. It’s called “data virtualization,” and it allows UPMC to make better use of the fallow, unused bits of memory and storage. Instead of keeping a patient record in one spot, on a specific drive, data is scattered throughout the storehouse, yet easily aggregated by the end user when it comes to everyday transactions.

And about those 5 billion or so Bibles - one Bible equals 4.3 megabytes of data. That means there are 238 Bibles in one gigabyte, 244,000 in one terabyte, 250 million in one petabyte and 5.2 billion in 21 petabytes, which is the amount of storage UPMC’s private cloud might need by the end of 2016.
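The Bible arithmetic above can be checked with a short sketch. The 4.3-megabyte figure comes from the article; the use of binary units (1 gigabyte = 1,024 megabytes) and the names `BIBLE_MB` and `bibles_in` are assumptions made for illustration, chosen because binary units reproduce the article's rounded numbers.

```python
# Size of one Bible in megabytes, as stated in the article.
BIBLE_MB = 4.3

# Assumption: binary units (1 GB = 1,024 MB), which match the article's figures.
MB_PER_GB = 1024
MB_PER_TB = 1024 ** 2
MB_PER_PB = 1024 ** 3


def bibles_in(megabytes: float) -> float:
    """Return how many 4.3 MB Bibles fit in the given number of megabytes."""
    return megabytes / BIBLE_MB


if __name__ == "__main__":
    print(f"1 gigabyte:   {bibles_in(MB_PER_GB):,.0f} Bibles")        # about 238
    print(f"1 terabyte:   {bibles_in(MB_PER_TB):,.0f} Bibles")        # about 244,000
    print(f"1 petabyte:   {bibles_in(MB_PER_PB):,.0f} Bibles")        # about 250 million
    print(f"21 petabytes: {bibles_in(21 * MB_PER_PB):,.0f} Bibles")   # about 5.2 billion
```

With decimal units (1 gigabyte = 1,000 megabytes) the counts come out a few percent lower, so the headline figure of roughly 5 billion Bibles holds either way.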

It’s not an impossible challenge: Facebook stores more than 300 petabytes of data, much of it photos and video. EBay’s data warehouse, as of last year, held 90 petabytes of customer and transaction information.

But bigger does not mean better, and the goal - particularly in health care - is not to build the world’s biggest data warehouse.

In fact, the goal is just the opposite: less storage, less hardware, less space consumed, all of which leads to lower costs.

Carmody estimated that UPMC, thanks to its fine-tuning, has avoided $40 million in costs over the last decade, money that would have been spent on new storage.

“If you look at health care, you have two dynamics that are working at cross purposes,” Barber said.

On one hand, new federal regulations concerning health IT adoption are “requiring that a lot more data be stored than ever before,” he said. On the other hand, UPMC and all other health providers are being squeezed by insurers, the federal government and employers to reduce costs.

So building the biggest, most expensive data farm isn’t necessarily the best option for a hospital in the way that it might be for Facebook.

“Data is going to continuously explode,” Carmody said. “It’s a fun and interesting challenge we have ahead of us, managing this data. It’s our job to ensure that it’s there when they need it, and it’s complete and it’s accurate.”





Information from: Pittsburgh Post-Gazette, https://www.post-gazette.com


