Deeper Dive
I first heard about the glymphatic system, the brain’s waste clearance system, while browsing neuroscience articles. A body system linked to a wide range of neurological disorders, from epilepsy to Alzheimer’s disease, seemed like the sort of thing that scientists everywhere should be working on. As someone who has been reading neuroscience textbooks since childhood, I was surprised that I hadn’t heard of it before. Upon further investigation, I learned that the glymphatic system was so newly discovered that it wasn’t in my textbooks, and that scientists worldwide were studying it under new grants and projects. However, there is a long path from discovering a new system to developing drugs that act on it. Since few drugs are known to treat irregularities in the glymphatic system, drug discovery in this area has the potential to make a meaningful impact on the treatment of a wide array of neurological disorders. I had spent the previous year conducting research at the intersection of machine learning and neuroscience for concussion diagnostics, and I wanted to apply my machine learning background to aid scientists in this endeavor. Over the next year, I developed two tools: GlymDS, the first database dedicated to the glymphatic system, which catalogs the proteins and genes associated with it and the drugs known to interact with them; and GlymRx, a novel, multi-model machine learning system designed to find new drug-gene interactions in the glymphatic system.
One of the biggest challenges in developing my project was finding the right data to train my machine learning models on. After spending weeks searching for glymphatic-specific databases, I realized that there simply weren’t any. It became clear that I would have to create my own, and so I developed GlymDS, a database containing 400 newly assembled protein and drug structures. GlymDS made my project possible, and I hope it can also serve as a resource for other glymphatic system researchers in the future, both on its own and in combination with my machine learning system, GlymRx. In data-driven biomedical science, access to clean, high-quality data is crucial for computation.
The importance of my GlymRx machine learning system is twofold. First, it identified drugs never before linked to the glymphatic system that could become future treatments for glymphatic-related neurological disorders such as Alzheimer’s disease or Parkinson’s disease. Second, GlymRx provides a proof of concept for an early-stage drug discovery machine learning system in a young field of study. Its machine learning methods could be adapted for early-stage research targeting a variety of biological systems. GlymRx narrowed the field of available drugs by ~98%, a narrowing process that can typically take up to six years, so methods such as these could be very valuable in early-stage drug discovery across disciplines.