Despite the vast amount of medical data available today, researchers must recognize that EHR data can present various limitations and isn't always helpful in improving patient care, Google Research Scientist Kathryn Rough said during a presentation at DATAx San Francisco, held May 14-15, Innovation Enterprise reports.
Five exabytes of data "would contain all the words ever spoken by everyone on Earth," Ms. Rough said. In 2011, health data in the U.S. alone reached 150 exabytes, she added.
"[EHR data] is messy and complex — and it was not intended for research purposes — and as much potential as there is there, we have to be careful in how we use it," Ms. Rough said.
According to Ms. Rough, the healthcare industry should be aware of the following challenges that medical data presents:
1. Data quality issues, such as data entry errors and important information locked in unstructured text, can affect the accuracy of a medical dataset.
2. Rule-out diagnoses, as well as errors in data processing and upcoding, can also cause problems.
3. Patients can be lost to follow-up, and researchers can overemphasize the statistical significance of data.
4. When reporting medical data, it's crucial that researchers are transparent, Ms. Rough said.
"At the bare minimum, when we are analyzing medical data we need to really explain what's been done, so that the reader can believe the [analysis] that has taken place," she said. "It's crucial to transparently report, thoughtfully address limitations and not exaggerate findings."