The discovery that cigarettes cause cancer greatly improved human health. But that discovery didn’t happen in a lab or spring from clinical trials. It came from careful analysis of mounds of data.
Imagine what we could learn today from big-data analysis of everyone’s health records: our conditions, treatments and outcomes. Then throw in genetic data, information on local environmental conditions, exercise and lifestyle habits and even the treasure troves accumulated by Google and Facebook...
So why isn’t it already happening? ... the full potential of health-care data analysis is blocked by regulation ... medical-data regulations go far beyond what’s needed to prevent concrete harm to consumers, and underestimate the data’s enormous value ...
I'll post the whole thing in 30 days. In addition to Roam, Tafi and Datavant are two other companies I'm aware of that are working on this issue.
Bob Borek, Head of Marketing at Datavant, wrote to describe their effort to retain rich data while protecting privacy:
We connect de-identified patient data. In short, as part of the process of de-identification, we create encrypted tokens that are built from the underlying PHI. The encrypted tokens allow patient records to be joined across data sets on a de-identified basis, without ever revealing the underlying PHI. In contrast to the Safe Harbor method, which - as you correctly point out - removes much of the information that would make data analytically valuable, our approach can be certified under HIPAA's Expert Determination method, allowing our clients to both join data for analysis and respect patient privacy. We're already seeing exciting new use cases, from rare disease patient finding to designing real-world evidence trials; from payers and providers building targeted intervention programs to life sciences companies forming go-to-market strategies around intelligent physician targeting.
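To make the token idea concrete, here is a minimal sketch of how records might be joined on a token instead of on names and birth dates. It is not Datavant's actual method, which the email does not spell out: the keyed hash (HMAC-SHA256), the secret key, the field names, and the sample records are all assumptions chosen for illustration.

```python
import hashlib
import hmac

# Hypothetical secret key, held by the de-identification service and never by analysts.
SECRET_KEY = b"example-key-not-for-production"


def make_token(first_name, last_name, dob):
    """Build a keyed-hash token from normalized identifying fields."""
    normalized = "|".join(s.strip().lower() for s in (first_name, last_name, dob))
    return hmac.new(SECRET_KEY, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(records, keep_fields):
    """Replace identifying fields with a token and keep only the analytic fields."""
    out = []
    for r in records:
        row = {"token": make_token(r["first_name"], r["last_name"], r["dob"])}
        row.update({f: r[f] for f in keep_fields})
        out.append(row)
    return out


def join_on_token(left, right):
    """Inner-join two de-identified data sets on their tokens."""
    index = {row["token"]: row for row in right}
    return [{**l, **index[l["token"]]} for l in left if l["token"] in index]


# Made-up sample data: two sources that describe the same person slightly differently.
claims = [
    {"first_name": "Ann", "last_name": "Lee", "dob": "1970-01-02", "diagnosis": "asthma"},
    {"first_name": "Raj", "last_name": "Patel", "dob": "1985-06-30", "diagnosis": "diabetes"},
]
pharmacy = [
    {"first_name": " ann", "last_name": "LEE ", "dob": "1970-01-02", "drug": "albuterol"},
]

linked = join_on_token(deidentify(claims, ["diagnosis"]), deidentify(pharmacy, ["drug"]))
```

The analyst who receives `linked` sees a diagnosis and a drug on the same row, but no name or birth date. A real system would also need to handle misspellings, key management, and the risk that the remaining fields identify someone on their own, which is what an Expert Determination review is meant to assess.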
Update 2: The FDA Sentinel Initiative implements one approach to these issues. The data stays secure, but you're allowed to make queries, i.e. basically to run regressions on the FDA server.
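The "send the query to the data" idea can be sketched in a few lines. This is not the Sentinel system's real interface. The class name, the field names, the minimum-row threshold, and the data are all hypothetical. The point is only that the server holds the rows and hands back summary statistics.

```python
class SecureDataServer:
    """Toy data custodian: rows stay inside, only regression summaries leave."""

    MIN_ROWS = 5  # assumed disclosure threshold; a real system would set its own rules

    def __init__(self, rows):
        self._rows = rows  # never returned to the caller

    def regress(self, y_field, x_field):
        """Run a one-variable least-squares regression and return only aggregates."""
        xs = [r[x_field] for r in self._rows]
        ys = [r[y_field] for r in self._rows]
        n = len(xs)
        if n < self.MIN_ROWS:
            raise ValueError("too few rows to release a result")
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        if sxx == 0:
            raise ValueError("regressor has no variation")
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        slope = sxy / sxx
        return {"n": n, "slope": slope, "intercept": mean_y - slope * mean_x}


# Made-up rows in which outcome = 2 * dose + 1 exactly.
server = SecureDataServer([{"dose": d, "outcome": 2 * d + 1} for d in range(6)])
result = server.regress("outcome", "dose")
```

The researcher gets `result` (a row count, a slope, and an intercept) and never touches an individual record. Repeated, carefully chosen queries can still leak information, so a production system needs more safeguards than a simple row-count check.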