A Distributed Computing Solution for Privacy-Preserving Genome-Wide Association Studies

Date
2024
Authors
Pedro Gabriel Ferreira
Cláudia Vanessa Brito
Abstract
Breakthroughs in sequencing technologies have led to an exponential growth of genomic data, providing unprecedented biological insights and new therapeutic applications. However, analyzing such large amounts of sensitive data raises key concerns regarding data privacy, specifically when the information is outsourced to third-party infrastructures for data storage and processing (e.g., cloud computing). Current solutions for data privacy protection resort to centralized designs or cryptographic primitives that impose considerable computational overheads, limiting their applicability to large-scale genomic analysis.

We introduce Gyosa, a secure and privacy-preserving distributed genomic analysis solution. Unlike previous work, Gyosa follows a distributed processing design that enables handling larger amounts of genomic data in a scalable and efficient fashion. Further, by leveraging trusted execution environments (TEEs), namely Intel SGX, Gyosa allows users to confidentially delegate their GWAS analysis to untrusted third-party infrastructures. To overcome the memory limitations of SGX, we implement a computation partitioning scheme within Gyosa. This scheme reduces the number of operations done inside the TEEs while safeguarding the users’ genomic data privacy. By integrating this security scheme in Glow, Gyosa provides a secure and distributed environment that facilitates diverse GWAS studies. The experimental evaluation validates the applicability and scalability of Gyosa, reinforcing its ability to provide enhanced security guarantees. Further, the results show that, by distributing GWAS computations, one can achieve a practical and usable privacy-preserving solution.