Award Abstract # 2028008
SBIR Phase I: Accelerating Understanding of COVID-19 Biology and Treatment Via Scaled Medical Record and Biosimulation Analytics

NSF Org: TI
Translational Impacts
Recipient: ONAI INC.
Initial Amendment Date: May 21, 2020
Latest Amendment Date: May 21, 2020
Award Number: 2028008
Award Instrument: Standard Grant
Program Manager: Anna Brady
abrady@nsf.gov
 (703)292-7077
TI
 Translational Impacts
TIP
 Dir for Tech, Innovation, & Partnerships
Start Date: June 1, 2020
End Date: November 30, 2020 (Estimated)
Total Intended Award Amount: $256,000.00
Total Awarded Amount to Date: $256,000.00
Funds Obligated to Date: FY 2020 = $256,000.00
History of Investigator:
  • Guha Jayachandran (Principal Investigator)
    info@onai.com
Recipient Sponsored Research Office: Onai Inc.
7291 CORONADO DR
SAN JOSE
CA  US  95129-4582
(650)429-8622
Sponsor Congressional District: 17
Primary Place of Performance: Onu Technology, Inc.
7280 Blue Hill Dr., Suite 10
San Jose
CA  US  95129-3624
Primary Place of Performance Congressional District: 17
Unique Entity Identifier (UEI): CLR5VEPH3MC1
Parent UEI:
NSF Program(s): SBIR Phase I
Primary Program Source: 010N2021DB R&RA CARES Act DEFC N
Program Reference Code(s): 096Z, 8032
Program Element Code(s): 5371
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.041
Note: This Award includes Coronavirus Aid, Relief, and Economic Security (CARES) Act funding.

ABSTRACT

The broader impact/commercial potential of this Small Business Innovation Research (SBIR) project is to address information needs of the COVID-19 crisis by rapidly integrating research findings describing the chemistry of the virus and its treatment. The proposed project will deploy advanced computational methods at participating medical institutions to make patient records immediately available for study while maintaining institutional and patient privacy. While the initial focus is on ameliorating COVID-19, the proposed solution can be applied more generally to accelerate epidemiological studies, improving scientific knowledge and public health with faster timelines and lowered costs for personnel, computing capabilities, and data storage.

This SBIR Phase I project proposes to rapidly expand and accelerate the accessibility of clinical and computational data to improve understanding of COVID-19. The proposed innovation will use cryptographic techniques, notably multiparty computation, to facilitate privacy-preserving cross-institutional querying of COVID-19 medical records. Improved access to petabytes of computational (simulation and model) data will speed research by allowing researchers around the world to probe the data. The effort will adapt and deploy decentralized computation techniques to enable distributed storage of many petabytes of virus molecular dynamics simulation data across computers around the world, in a verifiable manner that enables data analysis at the data location. The proposed dashboard will allow for secure queries of a combined dataset of participating institutions to quickly yield insight about the effect of various pre-existing conditions and medications on COVID-19. The effort will include verification and validation.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The project developed approaches to enable improved access to, and storage of, distributed data associated with 2019-nCoV and COVID-19. Notably, it successfully developed technology to enable cryptographically secure querying of COVID-19 medical records across multiple institutions without any participating institution ever revealing any sensitive information or results to any other participant. Also notably, it successfully created technology to enable decentralized storage of datasets, customizing it for the storage of molecular dynamics simulations of 2019-nCoV. 

Medical records may be analyzed to gain insight into which factors affected patients' prognoses. For example, what is the impact of age? Blood type? Pre-existing conditions? Traditionally, such studies are performed over extended periods of time within a single institution. If collaboration across institutions were desired, researchers or staff would have to form a collaboration agreement with each other institution and manually and painstakingly compile data sufficient to answer each question of interest. Furthermore, such studies risk exposure of sensitive patient records and necessarily expose at least aggregate information from each institution, such as its number of cases, age distributions, or more. With the technology developed in this project, on the other hand, participating institutions only learn the overall global result of the query. Overcoming various challenges, the solution ensures there is no exposure of sensitive information from each institution; any given institution's data remains local and private within that institution. A program called RECovER (Records Evaluation for COVID-19 Emergency Research) was initiated; any healthcare institution is welcome to join this collaborative COVID-19 research effort. The project also detailed how distributed ledger (blockchain) technology not only may be used to facilitate and secure a multiparty computation between a limited number of hospitals, but also can potentially serve in scenarios with large numbers of data owners.
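The idea that institutions learn only the global result of a query can be illustrated with additive secret sharing, one common building block of multiparty computation. The sketch below is an illustration only and is not taken from the project: the report does not describe the protocol actually used, and the modulus PRIME and the function names share and secure_sum are assumptions. It also assumes every party follows the protocol honestly, and it simulates all parties in a single process.

```python
import secrets

# Assumed field modulus for the sketch; any prime larger than the
# largest possible total would do.
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split an integer into n additive shares that sum to it mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def secure_sum(local_counts):
    """Simulate a secure sum over one private count per institution.

    Institution i splits its count into shares and sends share j to
    institution j. Each institution publishes only the sum of the shares
    it received, so no individual count is ever revealed.
    """
    n = len(local_counts)
    distributed = [share(count, n) for count in local_counts]
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % PRIME
        for j in range(n)
    ]
    return sum(partial_sums) % PRIME


# Hypothetical per-institution case counts matching some query.
total = secure_sum([12, 7, 30])
```

Each individual share is uniformly random, so a single share carries no information about the count it came from; only the combination of all partial sums yields the total.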

The decentralized data storage solution, too, focuses on applying cryptography to enable greater data accessibility and faster research. With the technology developed, anyone in the world can provide storage space and prove that they are retaining assigned data. Distributed computing efforts (including those of collaborators on the project) have generated massive amounts of 2019-nCoV data, raising the issue of where to store it. Centralized data storage solutions do not scale cost-effectively to keep up with the data generated. Multiple data integrity techniques, with and without error-correcting codes, were explored in the project, and fully functional distributed storage software was developed. Differing manners in which distributed ledgers could interact with various integrity methods were formulated. In addition to tackling the challenge of storing 2019-nCoV simulation data through decentralization, benchmarks were performed to probe the types of data analysis that can be performed remotely, where data is stored.
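One way a storage provider can prove that it is retaining assigned data is a challenge-response check against a Merkle tree: the data owner keeps only the root hash, asks for a randomly chosen chunk, and verifies the returned chunk against the root. The sketch below is a minimal illustration under that assumption. The report does not name the integrity techniques the project used, and the function names build_tree, prove, and verify, as well as the chunk contents, are hypothetical.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(chunks):
    """Return the Merkle tree as a list of levels, leaves first.

    A level with an odd number of nodes is padded by repeating its last
    node when computing the next level.
    """
    level = [_h(b"\x00" + c) for c in chunks]  # prefix separates leaves from inner nodes
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        level = [_h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Storage provider: collect the sibling hashes for the chunk at index."""
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        path.append(level[index ^ 1])
        index //= 2
    return path


def verify(root, chunk, index, path):
    """Data owner: recompute the root from the returned chunk and path."""
    node = _h(b"\x00" + chunk)
    for sibling in path:
        if index % 2 == 0:
            node = _h(b"\x01" + node + sibling)
        else:
            node = _h(b"\x01" + sibling + node)
        index //= 2
    return node == root


# Hypothetical simulation chunks; the owner keeps only the root hash.
chunks = [b"frame-%d" % i for i in range(5)]
levels = build_tree(chunks)
root = levels[-1][0]
ok = verify(root, chunks[3], 3, prove(levels, 3))
```

The owner stores 32 bytes regardless of dataset size, and each challenge costs a number of hashes logarithmic in the number of chunks. A production scheme would need more than this, for example unpredictable challenges and protection against a provider that fetches the data from elsewhere on demand.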

In a relatively short period of time, a system was implemented to enable secure querying of COVID-19 records across institutions and a system was implemented for verifiable decentralized data storage. The team expresses gratitude to all institutions that helped form and test these solutions with urgency. The greater data accessibility these technologies enable is not only of value for COVID-19 research but can also offer benefits in biomedical research more broadly and other areas involving distributed data.

 


Last Modified: 12/31/2020
Modified by: Guha Jayachandran

Please report errors in award information by writing to: awardsearch@nsf.gov.
