Intel and the Michael J. Fox Foundation collaborate on a Parkinson’s research solution using wearable technology, Intel algorithms, Big Data analytics, and the Cloudera distribution of Hadoop.
Parkinson’s disease (PD), the second most common neurodegenerative disorder after Alzheimer’s, is estimated to affect one million people in the United States and perhaps as many as seven million globally. There is currently no cure; medications, surgery, and multidisciplinary management can provide relief, but they address only some of the symptoms patients face and are effective only for a limited time. Many also introduce serious side effects that can be as disabling as the disease itself.
Intel is working with the Michael J. Fox Foundation for Parkinson’s Research (MJFF) on tech-enabled solutions to gather relevant data about PD, analyze that data to identify patterns and make generalizations, and use the insights gained to accelerate the development of therapeutic breakthroughs and potentially even a cure for the disease.
When seeking a cure for an “incurable” disease, we often face a knowledge deficit. It’s difficult to solve a problem we don’t fully understand, and there is still so much about Parkinsonism that remains a mystery. So the first step is to gather as much data as we can. That is why MJFF is actively seeking volunteers for many clinical trials. Data science tells us, however, that in order to extract meaningful knowledge from data, we must have relevant data to work with. In other words, we have to sort through a lot of haystacks to find a few needles.
The goal for researchers is to identify patterns—to create “order” from the massive chaos of raw data that such an endeavor generates. Identifying patterns, making generalizations about those patterns, and translating them into quantifiable symptoms of the disease is part of the big data analysis procedure.
Today Parkinson’s research lacks an objective way to measure symptoms, so a second challenge is cataloging and quantifying measurable, observable symptoms for analysis.
In the early stages of PD, patients may experience sleep disorders and olfactory loss, but the most obvious symptoms are movement-related— shaking, rigidity, slowness of movement, and difficulty walking. These motor symptoms result from the death of dopamine-generating cells in the brain, the cause of which is still unknown. In later stages of the disease, cognitive and behavioral problems may arise, including dementia and depression.
To better understand the conditions under which these symptoms manifest themselves, Intel is developing a configurable solution for wearable monitors that will passively track a patient’s motor functions along with self-reported information the patient enters via a smartphone app called Fox Insight Mobile. Created by Intel, Fox Insight Mobile tracks movement and provides an electronic “diary” that patients use to enter medication times/dosages and how they are feeling throughout the day. Unlike a paper diary, the electronic diary engages the patient by providing useful feedback and information.
Gathering data automatically from wearable devices is one thing, but expecting patients to provide input manually is another. Because few doctors can see their PD patients every day, active user involvement is critical. How can we keep patients actively involved?
In the past, patient contribution of data has required a tedious paper-and-pencil solution, where patients keep a diary of their feelings and medication dosages throughout the day, for weeks or months. These diaries are often criticized as unreliable because patients tend to lose interest in entering information, which makes the record sporadic and inconsistent from one patient to the next.
To entice patients to voluntarily and consistently provide us with information about how they are feeling and when they are taking medications (information we cannot automatically acquire from a wearable device), we must make this process easy and personally useful to the patient. Because the Fox Insight Mobile app reminds patients to take medicine and provides information that tracks their progress, they’ll be more likely to use it, which adds a positive side benefit: the more valuable we can make the app to end users, the more end users we might attract, which means more data.
More data is good, but it presents a technical challenge: the sheer quantity we have to work with, from acquisition to storage to analysis. Tracking patients for long periods (twenty-four hours a day, seven days a week, for months, maybe years) requires a system that can collect a massive amount of data and make use of all of it once it is gathered and stored.
The monitoring devices
An inertial measurement unit (IMU) is an electronic device that measures velocity, orientation, and gravitational forces, using triaxial accelerometers and gyroscopes. Traditionally installed in boats and aircraft, IMUs have been adapted for monitoring human movement. Many wearable IMUs on the market, for example, help users with athletic performance tuning and physical therapy. These devices can determine whether a person’s motion is intentional (as from walking or running), accidental/incidental, or caused by involuntary tremors.
For our latest clinical trial, we are employing an “off-the-shelf” wristworn smartwatch with a triaxial accelerometer and an application we built for it, and a smartphone with the Intel-developed Fox Insight Mobile app installed. We calibrate the devices, with each patient performing a series of normal activities. These devices automatically pair with each other and share their data with MJFF servers.
These IMUs serve two functions: For the patient, they help track activity level and medication usage, provide reminders, and monitor tremors. For research purposes, they gather the data, both automatic (from the wearables performing calculations intrinsically) and manual (from the end-users entering information via the Fox Insight Mobile app) and pass the data to an enterprise data hub (EDH) for Big Data analysis. This central repository of information is available to researchers worldwide.
Our solution’s flexibility allows researchers to tailor data collection to the specific requirements of each project. One can configure which sensor (accelerometer or gyroscope) to collect data from, how frequently to collect data, and most importantly whether to collect all of the raw data or just the calculated data (activity levels, tremors, etc.). For long-term studies, for example, where patients may be wearing the devices for months, using only calculated data for tremor, activity levels, and other algorithms might be more appropriate. For short-term clinical trials, however, where we are trying to develop new algorithms or gather data for later comprehensive analysis, we would probably collect all of the raw data.
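As a minimal sketch of what such a per-project configuration could look like, the Python fragment below models the three choices described above: sensor, collection frequency, and raw versus calculated data. The class name, field names, and the 50 Hz rate are hypothetical; the actual configuration format used in the trials is not described here.

```python
from dataclasses import dataclass

# Hypothetical names: the real configuration schema is not published here.
SENSORS = ("accelerometer", "gyroscope")


@dataclass(frozen=True)
class CollectionConfig:
    sensor: str           # which sensor to read
    sample_rate_hz: int   # how frequently to collect data
    send_raw: bool        # True: raw samples; False: calculated values only

    def __post_init__(self):
        if self.sensor not in SENSORS:
            raise ValueError(f"unknown sensor: {self.sensor}")
        if self.sample_rate_hz <= 0:
            raise ValueError("sample_rate_hz must be positive")


# A months-long study might transmit only calculated values ...
long_term = CollectionConfig("accelerometer", sample_rate_hz=50, send_raw=False)
# ... while a short algorithm-development trial keeps every raw sample.
short_term = CollectionConfig("accelerometer", sample_rate_hz=50, send_raw=True)
```

Validating the settings when the object is created means a misconfigured trial fails early, before any data is collected.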
Intel has developed several algorithms for these devices, including activity level, tremor, nighttime tracking, and gait detection.
• Activity level. This algorithm measures the intensity of a wristworn device’s movement, computed as the average of the absolute values of acceleration over 30-second intervals (after filtering out frequencies typical of tremor). The Fox Insight Mobile app shows users their activity levels on a graph that depicts activity over time. It also provides a daily summary of active time.
This algorithm does not capture the physical intensity of activities like bicycle riding or walking on a treadmill, but we can augment the measurements with additional sensors, such as a heart rate monitor or gyroscope, to quantify these activities.
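The activity-level calculation described above can be sketched in a few lines of Python. This is an illustration rather than Intel's implementation: it assumes the input is a single stream of acceleration values from which gravity and tremor frequencies have already been filtered, and the function name is hypothetical.

```python
def activity_levels(accel, sample_rate_hz, window_s=30):
    """Mean absolute acceleration for each complete window of window_s seconds.

    accel: sequence of acceleration values, assumed to be filtered upstream
    (gravity and tremor frequencies already removed).
    A trailing partial window is ignored.
    """
    window = int(sample_rate_hz * window_s)
    if window <= 0:
        raise ValueError("window must contain at least one sample")
    levels = []
    for start in range(0, len(accel) - window + 1, window):
        chunk = accel[start:start + window]
        levels.append(sum(abs(a) for a in chunk) / window)
    return levels
```

Each returned value corresponds to one point on an activity-over-time graph; summing the windows above some threshold would give a daily active-time figure.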
• Tremor. We recognize and quantify hand tremor through frequency analysis, particularly of amplitudes within the 4 to 12 Hz range, and subtract these typical tremor frequencies from the activity level measurement. A 5-second segment with a high average difference between these values is considered a tremor point. We aggregate these occurrences into “tremor minutes” and provide the user with a graphical overview of daily tremor symptoms.
Very weak tremors and short tremor episodes are difficult to detect (false negatives, or Type II errors), and some activities, such as driving over a bumpy road or brushing one’s teeth, can be misclassified as tremor (false positives, or Type I errors). To reduce both kinds of error, we rely on the controlled data we collect when we calibrate the devices for each patient at the commencement of the trial.
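To make the frequency-analysis idea concrete, the sketch below flags a 5-second segment as a tremor point when most of its spectral power falls in the 4 to 12 Hz band, then counts the minutes that contain at least one such segment. This is a simplified stand-in for the method described above, not the study's algorithm: the band-power ratio, the 0.5 threshold, and the definition of a tremor minute are all assumptions made for illustration.

```python
import cmath
import math


def band_power_ratio(segment, sample_rate_hz, low_hz=4.0, high_hz=12.0):
    """Fraction of non-DC spectral power between low_hz and high_hz."""
    n = len(segment)
    mean = sum(segment) / n
    centered = [x - mean for x in segment]
    total = band = 0.0
    for k in range(1, n // 2 + 1):  # skip the DC bin
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        power = abs(coeff) ** 2
        total += power
        if low_hz <= k * sample_rate_hz / n <= high_hz:
            band += power
    return band / total if total > 0 else 0.0


def is_tremor_segment(segment, sample_rate_hz, threshold=0.5):
    # threshold is a hypothetical value, not one taken from the study
    return band_power_ratio(segment, sample_rate_hz) >= threshold


def tremor_minutes(samples, sample_rate_hz, segment_s=5):
    """Count minutes that contain at least one tremor segment (assumed rule)."""
    seg = int(sample_rate_hz * segment_s)
    minutes = set()
    for start in range(0, len(samples) - seg + 1, seg):
        if is_tremor_segment(samples[start:start + seg], sample_rate_hz):
            minutes.add(start // (sample_rate_hz * 60))
    return len(minutes)
```

A plain discrete Fourier transform keeps the example dependency-free; a production system would more likely use an FFT library and per-patient thresholds derived from the calibration data.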
• Gait detection. This algorithm is based on supervised learning of labeled accelerometer data collected from patients. The data is transformed into aggregate features in the time and frequency domains, and a decision tree model is used to categorize 5-second segments into walking/nonwalking groups. The output is used to calculate a personalized threshold for high activity level, and as an input to the nighttime tracking algorithm.
The model accuracy on the validation set is 98.5%. False-positive cases usually involve periodic movements with frequencies similar to those of walking, and false-negative cases are usually affected by extensive hand movements that impair the ability to detect walking periodicity.
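For readers less familiar with how such a classifier is scored, the sketch below computes accuracy and counts false positives and false negatives from per-segment predictions and ground-truth labels. The function name and the labels in the usage note are hypothetical; they are not data from the trial.

```python
def walking_metrics(predicted, actual):
    """Accuracy plus false-positive / false-negative counts.

    predicted, actual: equal-length sequences of booleans,
    one per 5-second segment (True means walking).
    """
    if len(predicted) != len(actual):
        raise ValueError("label sequences must have the same length")
    if not actual:
        raise ValueError("need at least one labeled segment")
    false_pos = sum(1 for p, a in zip(predicted, actual) if p and not a)
    false_neg = sum(1 for p, a in zip(predicted, actual) if a and not p)
    correct = len(actual) - false_pos - false_neg
    return {
        "accuracy": correct / len(actual),
        "false_positives": false_pos,
        "false_negatives": false_neg,
    }
```

For example, `walking_metrics([True, True, False, False], [True, False, True, False])` reports an accuracy of 0.5 with one false positive and one false negative.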
• Nighttime tracking. PD patients commonly have difficulty falling asleep and staying asleep, and they experience nighttime motor symptoms, such as movement during rapid eye movement (REM) sleep and periodic limb movement (PLM).
Existing sleep-tracking apps don’t always fit the PD patient’s needs, as most of them are designed for people who do not suffer from sleep disorders. The Fox Insight Mobile app provides an analysis of sleep quality based on the movements of the user during the night. We distinguish among three levels of movement during sleeping or waking states: minimal movements, such as when the person is at full rest or is lying still; moderate movements, including tossing/turning, periodic limb movements, and sipping/drinking; and extensive movements, such as getting out of bed or performing strong or violent movements during sleep. These calculations are based on the quantity, duration, and type of movements, in addition to activity level.
We will continue to refine these algorithms and develop others that will help Parkinson’s research.
The Fox Insight Mobile app
Another important piece of this study is Intel’s Fox Insight Mobile smartphone application (Figure 1), currently available on the Android* platform. In addition to being the conduit to the data warehouse, Fox Insight Mobile brings value directly to patients on a daily basis, showing them their activity levels and helping them with nonanalytical features like medication reminders.
Patients create reminders for each medication they are taking, with specific days, times, and amounts, but Fox Insight Mobile also provides feedback that could prompt optimum medication dosages and times, based on analysis of the individual’s symptoms and responses (recorded from the wearable IMUs).
Armed with personalized information and graphs about their activity levels and medication history, patients can compare medication dosages/frequencies to physical activity, allowing them to manage their regimen to suit their personal preferences and needs. To motivate them to increase their physical activity, data summaries reveal low activity cycles and help users visualize their exercise regimens.
Once they log on to Fox Insight Mobile, users will be able to use the application to report activities (such as taking a dosage of a specific medicine) or log how they feel. This electronic diary simplifies reporting and reduces patient subjectivity by limiting entries to four emoticons (poor, fair, good, or very good). Our approach makes this data more objective, and standardizing this way allows better global analysis.
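A diary record of this kind could be modeled as follows. This is only a sketch: the class and field names are hypothetical, and coding the four emoticons as the integers 1 through 4 is an assumption made so that entries can be compared and averaged.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import IntEnum


class Feeling(IntEnum):
    """The four allowed diary values (numeric coding is an assumption)."""
    POOR = 1
    FAIR = 2
    GOOD = 3
    VERY_GOOD = 4


@dataclass(frozen=True)
class DiaryEntry:
    patient_id: str        # pseudonymous ID, never a name
    recorded_at: datetime  # timezone-aware timestamp
    feeling: Feeling


def record_feeling(patient_id, level, now=None):
    """Build a diary entry; Feeling(level) rejects anything outside 1..4."""
    when = now or datetime.now(timezone.utc)
    return DiaryEntry(patient_id, when, Feeling(level))
```

Restricting input to a closed set of values is what makes entries comparable across patients, and the timestamp is what lets researchers line each entry up with sensor data.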
Coupled with empirical data from multiple triaxial sensors, these timestamped records of behavior will help researchers correlate patients’ activities, feelings, and medications, to devise meaningful hypotheses that can later be tested through normal scientific methods.
Figure 1 Fox Insight Mobile screens. In addition to standard icons for time, battery life, signal strength, etc., the Fox Insight Mobile app adds a few other icons to the black bar at the top to show whether the wristworn IMUs and smartphone are transmitting data to the cloud. ➊ The welcome screen tells you how many hours of data you’ve contributed to our research and offers an easy navigational tool to edit your medication schedule, view activity levels, and record data. ➋ The Record screen lets you report your medication use and how you feel, providing an electronic diary that logs your personal subjective overall state, which researchers can later correlate with IMU sensor data from wristworn smartwatches. ➌ The Medication Schedule lets you add medications and times of the day Fox Insight Mobile should remind you to take them. ➍ My Stats is a read-only screen that displays a graphical representation of your activity levels, aligned with medication times, so you can see how your behavior aligns with medication. This screen also creates graphs for tremors and nighttime activity levels, so you can see how medications
Using Wearable Technology to Advance Parkinson’s Research
Figure 2 System architecture. This figure shows Fox Insight Mobile’s basic blocks, specifically for the clinical trial we are currently conducting, but also applicable and customizable for other trials. 1 User. There are two types of users: the patient and the researcher/clinician or data analyst. These users log onto the system through different APIs to perform different functions. The patient interacts with the Fox Insight Mobile phone app, while the researcher analyzes the Big Data or downloads it for local analysis.
2 IMUs. Inertial measurement units provide the raw data. In this example, we are using a smartphone and a smartwatch, although we can use any quantity and type of device. These devices communicate with each other and one—in this case, the smartwatch—sends the captured data to the EDH via MQTT messaging.
3 WebAPIs. Web-based application programming interfaces allow patients participating in a clinical trial to log on and interact with the tools, and allow researchers to log on and make calculations, perform analyses, and export data.
4 Stream. The messaging framework for transmitted data is based on MQTT (Message Queue Telemetry Transport), a lightweight publish-subscribe messaging protocol that travels on top of TCP/IP. We use Mosquitto*, an open source message broker that implements the MQTT protocol.
5 Intel algorithms. Our algorithms make use of an Intel-developed Java library and the Akka* open source toolkit, which allows us to process data locally on a smartphone or in transit within the messaging framework.
6 Enterprise data hub. Cloudera provides the secure data warehouse and staging area. For this project, we store metadata in one database, and all other “Big Data” (raw accelerometer data, measurements, tremors, etc.) in a distributed Apache* HBase database. Commodity server storage allows a virtually unlimited number of patients to stream a virtually unlimited amount of data. In addition to the SQL tools included with Cloudera Enterprise, we are also using Phoenix* on top of HBase, which allows us to query HBase with SQL.
A key reason to use a Big Data platform is its analytics capabilities. Cloudera’s parallel and batch processing support lets hundreds of concurrent users analyze the data in near real time.
In addition, Cloudera Enterprise provides industry-leading data security and compliance-readiness.
Architecturally, our solution divides into two role-based categories: One for the patient and one for the researcher. (See Figure 2.)
On the patient side, users log on to the Fox Insight Mobile app on their smartphones through a mobile API and manually add data to the raw or calculated data stream that is already automatically transmitted by the IMUs they are wearing.
On the research side, researchers, clinicians, and data analysts can pull data from the platform using secured RESTful APIs¹ to view and analyze the data. Apache* Spark is used to manipulate the data before export.
In between the IMUs and the EDH is where we interpret and analyze this information.
Our messaging framework is based on MQTT (Message Queue Telemetry Transport), a lightweight publish-subscribe messaging protocol that travels on top of TCP/IP. MQTT is ideal for wearables because of its small code footprint and economical use of network bandwidth. Because its “pub-sub” messaging pattern requires a message broker to distribute messages to interested clients, we also use Mosquitto*, an open source MQTT broker.
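To illustrate the publish side of this pattern, the sketch below builds the two things an MQTT publisher hands to its client library: a topic name and a compact byte payload. The topic layout (`trial/<device>/<kind>`) and the JSON field names are hypothetical, since the message format used in the trials is not described here. No network call is made; in practice an MQTT client library such as Eclipse Paho would send the result to the broker.

```python
import json


def build_message(device_id, kind, timestamp_ms, values):
    """Return (topic, payload_bytes) for one sensor reading.

    The topic layout and JSON keys are illustrative assumptions.
    """
    for level in (device_id, kind):
        # '/' separates topic levels; '+' and '#' are subscription
        # wildcards and are not allowed in a topic used for publishing.
        if not level or any(ch in level for ch in "/+#"):
            raise ValueError(f"invalid topic level: {level!r}")
    topic = f"trial/{device_id}/{kind}"
    payload = json.dumps({"t": timestamp_ms, "v": list(values)},
                         separators=(",", ":"))
    return topic, payload.encode("utf-8")
```

Hierarchical topics let a subscriber choose its scope: a filter such as `trial/+/accel` would match accelerometer readings from every device, while `trial/watch-42/#` would match everything from one device.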
Another advantage of using this messaging method to transmit data is that it allows us to process the raw data either locally on the IMU or in transit, within the message packet itself. We are able to do this because we use a standalone Java library for our algorithms and Akka* (an open source toolkit for building message-driven applications on a Java virtual machine), which allows us to make calculations on the data while it is in the stream, rather than on the smartphone or smartwatch. This flexibility allows project leaders to customize their research projects: to design a trial, for example, where the phone app converts raw data and sends only the processed data on to the database; or sends the raw data, to be analyzed and forwarded on the fly within the protocol stream; or sends the data without any calculations at all, to be stored raw in the EDH.
1. REST = Representational state transfer. A RESTful API is an application program interface (API) that uses HTTP requests to GET, PUT, POST, and DELETE data.
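The three trial designs just described (process on the phone, process in the stream, or store raw) differ only in where the algorithm runs. A minimal sketch, with hypothetical names and a stand-in algorithm, makes the difference explicit by returning both what the phone transmits and what the EDH ends up storing:

```python
from enum import Enum


class Placement(Enum):
    """Where the algorithm runs (names are illustrative)."""
    ON_PHONE = "on_phone"
    IN_STREAM = "in_stream"
    NONE = "none"


def mean_abs(samples):
    # Stand-in for any of the algorithms; not the production code.
    return sum(abs(s) for s in samples) / len(samples)


def run_pipeline(raw, placement, algorithm=mean_abs):
    """Return (what the phone transmits, what the EDH stores)."""
    if placement is Placement.ON_PHONE:
        processed = algorithm(raw)
        return processed, processed
    if placement is Placement.IN_STREAM:
        return raw, algorithm(raw)
    return raw, raw
```

Processing on the phone minimizes bandwidth, processing in the stream keeps device code simple, and storing raw data preserves everything for later algorithm development.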
The enterprise data hub sits at the other end of this data stream. This is the secure data warehouse and staging area where data is stored and advanced analysis takes place.
Data acquisition and analysis
Using a series of Java library tools, we have developed several algorithms to interpret raw data from the IMUs. We constantly refine these algorithms and create new ones, as analysis findings indicate promising areas for future research.
We currently identify more than 100 features for data acquisition from the IMUs—such as average acceleration (every 5 seconds), range of accelerometer, variance, zero-crossing, variables that describe the spectrum of the signal, and so on—and we perform calculations for all of them. We use walking sessions as an agreed upon “active” activity baseline, and use a band-pass filter to remove noise, particularly tremor, which we don’t want erroneously interpreted as a high level of activity.
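A few of the time-domain features named above can be computed as in the following sketch. It covers only four of the features, treats each 5-second segment as a plain list of acceleration values, and uses hypothetical names; the spectral features and the production Java implementation are not shown.

```python
def segment_features(segment):
    """Time-domain features for one segment of acceleration samples."""
    n = len(segment)
    if n == 0:
        raise ValueError("segment must not be empty")
    mean = sum(segment) / n
    centered = [x - mean for x in segment]
    # Count sign changes of the mean-removed signal
    # (a value of exactly zero is treated as non-negative).
    zero_crossings = sum(
        1 for a, b in zip(centered, centered[1:]) if (a < 0) != (b < 0)
    )
    return {
        "mean_abs": sum(abs(x) for x in segment) / n,
        "range": max(segment) - min(segment),
        "variance": sum(c * c for c in centered) / n,
        "zero_crossings": zero_crossings,
    }
```

Features like these form the input rows for a supervised model: one row per segment, one column per feature.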
For example, for a recent clinical trial on the effectiveness of L-DOPA (a common PD drug), patients equipped with Fox Insight Mobile IMUs take the medication and perform a series of physical tasks eight times, to measure the impact the medication has over time.
The Fox Insight Mobile platform’s flexibility will allow researchers conducting subsequent trials to add sensors or include specialized algorithms, as such needs arise. To date, we have implemented activity level, activity recognition, and tremor recognition algorithms into the monitoring system. If we later decide to expand monitoring to include sleep analysis, dyskinesia (detection and quantification), or nonmotor symptoms, all we would need to do is add the necessary algorithms in a subsequent release, and include whatever additional sensors these algorithms may require for the next trial. For example, we might need to add a heart rate sensor to include sleep, REM sleep, and deep sleep pattern analysis. Other algorithms might take advantage of physiological metrics such as skin temperature, perspiration, calorie burning, blood flow, etc.
In later data modeling and evaluation phases of the project, analysts will connect these explanatory variables with the actual activity that was recorded at the time. Evaluation involves examining the validity of the extracted patterns.
Using Big Data to find these patterns and make generalizations can lead to insights about PD (and other diseases, for that matter), but it can also lead to Type I errors (“false positives”) as a result of what is known as the “multiple comparisons” problem. Because even randomly generated data can sometimes produce interesting patterns, independent of causality, researchers and clinicians will have to pursue these “hunches” with controlled experiments to verify that the results are reproducible. A larger data set makes chance patterns less likely to persist, but every additional comparison is another opportunity for a false positive, so researchers must still follow up with scientific methods to verify that the insights are valid and the relationships causal.
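The scale of the multiple-comparisons problem is easy to quantify under one simplifying assumption: that the comparisons are statistically independent. In that case the probability of at least one false positive among m tests, each run at significance level α, is 1 − (1 − α)^m. The sketch below computes that value and the Bonferroni-corrected per-test level, which is one standard remedy; it is not necessarily the one used in this project.

```python
def familywise_error_rate(alpha, comparisons):
    """P(at least one false positive), assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** comparisons


def bonferroni_alpha(alpha, comparisons):
    """Per-test significance level that keeps the overall rate near alpha."""
    return alpha / comparisons
```

With 100 independent comparisons at α = 0.05, the chance of at least one false positive is about 99.4%, which is why an interesting pattern found by exploratory analysis is a hypothesis to test rather than a result.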
The Big Data picture
Looking at the sheer volume of information we are able to gather, it becomes clear why Cloudera is a critical piece of this project. The wearable IMUs we are using are capable of recording tremors, sleep patterns, gait, and balance, producing between 150 and 300 samples per second per device of raw data. With 1,000 concurrent users, each wearing two or more devices, we require a data warehouse capable of handling a large volume of data every day. Cloudera Enterprise can manage such a heavy workload for several reasons.
First, Cloudera uses Apache* HBase, a distributed, scalable database that runs on top of HDFS (Hadoop Distributed File System), a fault-tolerant and self-healing file system. HDFS stores data on commodity machines, which makes it affordable and easy to add new servers or hot-swap failed drives on the fly as the volume of data increases and the need for more space grows. In short, a Hadoop-based solution offers lower costs and faster conclusions.
Second, Cloudera Enterprise employs YARN and Apache* Spark to divide heavy workloads into multiple tasks that can be offloaded to multiple servers and executed in parallel. This can reduce processing time on large volumes of data significantly, from days to hours. Apache* Hive data warehouse software facilitates querying and managing large datasets in this type of distributed storage, and provides a mechanism to query the data with HiveQL.
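A back-of-the-envelope calculation shows what these figures imply. The sketch below uses the numbers stated above (1,000 users, two devices each, 150 to 300 samples per second) and assumes continuous recording. The 6 bytes per sample (three axes at 16 bits each, with no timestamps or overhead) is an assumption for illustration, not a figure from the project.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(users, devices_per_user, samples_per_second):
    """Total samples generated per day under continuous recording."""
    return users * devices_per_user * samples_per_second * SECONDS_PER_DAY


def bytes_per_day(samples, bytes_per_sample=6):
    # 6 bytes = three axes x 16-bit values; an assumed encoding
    return samples * bytes_per_sample


low = samples_per_day(1000, 2, 150)    # 25,920,000,000 samples
high = samples_per_day(1000, 2, 300)   # 51,840,000,000 samples
```

Under these assumptions the system would ingest roughly 26 to 52 billion samples per day, or about 156 to 311 GB of raw sensor values daily before any metadata or replication.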
Third, within an EDH, storage and computation coexist on the same physical nodes in the cluster, so data doesn’t need to travel to the compute location for execution. This data proximity allows unified Big Data applications combining batch, streaming, and interactive analytics to process exceedingly large amounts of data unencumbered by traditional bottlenecks like network bandwidth.
For this stage of our research trials, researchers will be exporting the data from the EDH to analyze locally, but later stages could take full advantage of Cloudera’s Big Data analytics functionality. In terms of analysis, Cloudera tools like Search and Impala would help researchers discover new patterns and associations. Impala is a massively parallel SQL query engine that runs directly on data stored in Hadoop, so analysts can explore large and diverse data sets interactively, in full fidelity, without first moving them to a separate system and without compromising system performance.
Thanks to these features, an EDH will accommodate the terabytes of data such research might generate, while allowing analysts to perform Big Data analytics in near-real-time.
With Cloudera, all data can be secured and encrypted, which helps protect against privacy violations and hacking. The data is anonymized at the smartphone source with a confidential ID and is encrypted in real time in transit, at rest, and during analysis, thanks to AES encryption optimization built into the latest Intel® Xeon® Processor family. Cloudera’s Navigator Encrypt provides high-performance transparent encryption for regulatory compliance (such as HIPAA). Navigator Encrypt allows only authorized database accounts with assigned rights, via applications on approved network clients, to access sensitive data on a server. Operating system users without access to Navigator Encrypt keys cannot read the encrypted data.
Navigator Key Trustee (Cloudera’s enterprise-grade, software-based universal key management solution for managing encryption keys, certificates, and other sensitive Hadoop security assets) keeps keys separate from encrypted data and prevents cloud/OS administrators, hackers, and unauthorized personnel from accessing cryptographic keys and sensitive data.
Cloudera’s comprehensive security package includes complete governance—data protection, integrated authentication, authorization, encryption, key management, audit, and lineage—allowing you to track data and manage user interactions.
Healthcare is arguably the world’s last major industry to fully adopt information technology practices. As the largest single slice of GDP in the US and in many western nations, the healthcare industry could reap hundreds of billions of dollars annually in cost savings by going digital. Piloting a Big Data approach to treatment discovery will highlight the possibilities to the healthcare industry. To do that, we need more data. And to get more data, we need more patients.
We want to fit 1,000 concurrent users with these devices by the end of 2015, making sure that the data they produce is valid and valuable. Our latest trial, taking place in the Netherlands, marks the fourth collaborative effort worldwide between Intel and the Michael J. Fox Foundation, which is sponsoring this research project and many others. Through these trials and others to follow, we hope to enable breakthroughs in Parkinson’s disease research through Big Data analytics technologies.
Analysts rarely discover new insights the first time they examine their data. Usually an initial inspection hints at a more promising approach. With a few adjustments to the computations, the information may begin to look more meaningful on subsequent runs.
Hundreds of skilled neurologists, mathematicians, and data analysts across the globe are looking for rich data sets like ours on which to exercise their expertise and come up with innovative ideas: the first step in a long journey toward a better understanding of the disease and ultimately a cure.
At Intel, we hope that providing a secure archive of such data, together with our contributions to algorithm design, will help accelerate those first-stage discoveries. As we delve deeper into the patterns of PD data analysis, we expect to create novel, PD-related algorithms (for example, on/off detection and objective measurements). With a wider arsenal of algorithms, we should also be able to monitor symptoms in nonclinical trials (analyzing raw data while transmitting processed data).
Although the immediate goal is to improve the quality of life for Parkinson’s sufferers and lead clinical research scientists to potential cures, the information we learn from these trials will undoubtedly also help people with other Parkinsonian disorders, and the tools, methods, and algorithms should be applicable to clinical trials for other afflictions and for other scientific discoveries in general.
For more information about this innovative research partnership, including a webcast with Diane Bryant (VP of Intel’s Data Center Group) and Todd Sherer of MJFF, see our Intel Newsroom article “The Michael J. Fox Foundation and Intel Join Forces to Improve Parkinson’s Disease Monitoring and Treatment through Advanced Technologies” (http://newsroom.intel.com/community/intel_newsroom/blog/2014/08/13).
For more information on participating in clinical trials like this one, visit foxtrialfinder.org and sign up.