Summary
Lyme disease is expanding geographically in the US. Johns Hopkins researchers used Google Health Trends data to analyze internet searches about Lyme disease and track their geographic patterns over time, with the goal of better identifying where Lyme disease risk is present. The approach proved to be an effective tool that could complement traditional disease surveillance (case reports, lab test results, insurance claims data, electronic health records) and help inform the public and health care providers about high-incidence, underreported, and emerging regions that form the leading edge of disease expansion.
Why was this study done?
The incidence of Lyme disease has been significantly underreported (by a factor of 8-10x) by traditional surveillance methods. Additional methods are needed to identify existing endemic areas that are underrecognized and emerging endemic regions of increasing Lyme disease risk.
Lyme disease is the most common vector-borne disease in the US, with around 476,000 new cases each year, and it is expanding geographically. Misdiagnosis, delayed diagnosis, illness invalidation, and treatment delays are common, especially in regions where incidence is considered minimal and awareness among clinicians is likely correspondingly low. Delays in treatment may increase the risk of chronic illness. Because traditional CDC surveillance misses many of these cases, innovative methods are needed to identify underreported and emerging endemic areas and to better inform health care providers of present and increasing risks.
How was this study done?
Researchers at our Center in the Johns Hopkins University School of Medicine, in collaboration with Frank C. Curriero, PhD, and Cara Wychgram, MPP, in the Johns Hopkins Bloomberg School of Public Health, used data from Google Health Trends to compare geographic locations of Lyme disease queries in the 2011-2021 timeframe.
The team investigated designated market area (DMA)-level trends in searches for “Lyme disease” and “tick bite” for the 2011-2019 and 2020-2021 (COVID-19 pandemic) periods. A longitudinal ecological analysis, using modified Poisson generalized estimating equation regression models, assessed how well search rates predicted CDC-reported Lyme disease incidence rates between 2011 and 2019.
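For readers curious about what such a model looks like in practice, the following is a minimal sketch rather than the study's actual analysis. It assumes a single predictor (search rate in 100-unit steps), a population offset, and an independence working correlation, with a cluster-robust "sandwich" standard error computed by DMA, which is the general idea behind a modified Poisson approach. The function name, the synthetic DMA data, and the noise pattern are all hypothetical; the real study controlled for environmental factors that are not modeled here.

```python
import math

def fit_poisson_rate_model(rows, iterations=25):
    """Fit log E[cases] = b0 + b1 * x + log(population) by Newton's method.

    rows: list of (cluster_id, x, cases, population), where x is the search
    rate expressed in 100-unit steps. Returns (b0, b1, robust_se_b1).
    """
    total_cases = sum(r[2] for r in rows)
    total_pop = sum(r[3] for r in rows)
    b0, b1 = math.log(total_cases / total_pop), 0.0  # start at the overall rate

    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for _cluster, x, y, pop in rows:
            mu = pop * math.exp(b0 + b1 * x)  # expected cases for this row
            g0 += y - mu                      # score for the intercept
            g1 += (y - mu) * x                # score for the slope
            h00 += mu
            h01 += mu * x
            h11 += mu * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta += H^{-1} * score, with H the 2x2 information matrix
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det

    # Cluster-robust variance: sum residual scores within each DMA first,
    # so repeated yearly observations of one DMA are not treated as independent.
    h00 = h01 = h11 = 0.0
    scores = {}
    for cluster, x, y, pop in rows:
        mu = pop * math.exp(b0 + b1 * x)
        h00 += mu
        h01 += mu * x
        h11 += mu * x * x
        s = scores.setdefault(cluster, [0.0, 0.0])
        s[0] += y - mu
        s[1] += (y - mu) * x
    det = h00 * h11 - h01 * h01
    var_b1 = sum(((h00 * s1 - h01 * s0) / det) ** 2 for s0, s1 in scores.values())
    return b0, b1, math.sqrt(var_b1)

# Hypothetical synthetic data: 6 DMAs observed for 4 years each.
TRUE_RATE_RATIO = 1.10  # assumed effect per 100-unit increase in search rate
rows = []
for dma in range(6):
    for year in range(4):
        search_rate = 50 * dma + 20 * year        # made-up search-rate units
        x = search_rate / 100.0
        population = 100_000 * (dma + 1)
        noise = 1 + 0.05 * (-1) ** (dma + year)   # deterministic +/-5% wobble
        # Non-integer "counts" are fine for the estimating equations used here.
        cases = population * math.exp(-9.0) * TRUE_RATE_RATIO ** x * noise
        rows.append((dma, x, cases, population))

b0, b1, se_b1 = fit_poisson_rate_model(rows)
rate_ratio = math.exp(b1)
print(f"rate ratio per 100 units: {rate_ratio:.3f} (robust SE of log ratio: {se_b1:.4f})")
```

Exponentiating the slope gives a rate ratio, which is how a result such as "a 10% increase per 100 units" is typically read off a Poisson model. A full analysis would normally rely on an established statistics package rather than a hand-written fit.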
What were the major findings?
The research showed that Google searches for “Lyme disease” were a significant predictor of CDC-reported cases. “Each 100-unit increase in the search rate was significantly associated with a 10% increase in incidence rates” after controlling for environmental factors. The Google Health Trends tool picked up Lyme disease risk years in advance of other methods of surveillance.
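To make the reported association concrete, here is a small illustrative calculation. It assumes the relationship is log-linear, as it would be in a Poisson regression, so that the 10% figure compounds multiplicatively across larger changes in search rate. The function name and the example increases are hypothetical and are not taken from the study.

```python
def incidence_multiplier(search_rate_increase, ratio_per_100_units=1.10):
    """Multiplicative change in incidence implied by a log-linear model."""
    return ratio_per_100_units ** (search_rate_increase / 100.0)

for increase in (50, 100, 200):
    pct = (incidence_multiplier(increase) - 1) * 100
    print(f"+{increase} search-rate units -> about {pct:.1f}% higher incidence")
```

Under this assumption, a 200-unit increase corresponds to roughly a 21% higher incidence rate rather than 20%, because the effect multiplies instead of adding.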
Results revealed an expanding area of elevated Lyme disease search activity along the edges of the established northeastern Lyme disease regions (e.g., western New York, western Pennsylvania, West Virginia, Ohio, Kentucky). Searches were also stronger than expected relative to CDC-reported incidence in Michigan, Ohio, Virginia, West Virginia, Kentucky, and North Carolina, indicating that incidence is likely underreported there. Several of the emerging regions are not presently considered “high risk states” based on reported case data. For example, North Carolina is still considered a low-incidence state based on reported case numbers but is trending toward becoming endemic. The Google Health Trends data is picking up this leading-edge risk ahead of other metrics.
In southern Oregon and northern California, the strong, regular Lyme disease internet query patterns do not suggest a leading-edge phenomenon but rather consistent underreporting of high incidence in these regions, which lie within states considered low-incidence.
What is the impact of this work?
Cases of Lyme disease are significantly underreported to the CDC. Over 63,000 cases of Lyme disease were reported to the CDC in 2022, yet other widely accepted methods based on insurance claims data and electronic medical records indicate around 476,000 cases annually. Reported incidence data not only substantially underestimates annual cases but also lags by years. This likely contributes to a lack of awareness of disease risk in many regions, which can lead to misdiagnosis and delayed treatment. As the burden of Lyme disease expands geographically in the US, additional methods are needed to complement traditional disease surveillance avenues (case reports, lab test results, insurance claims data, and electronic health records).
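The size of the gap between these two figures can be checked with simple arithmetic. The sketch below uses only the two numbers quoted above; because they are approximate ("over 63,000" and around 476,000) and come from different methods, the result should be read as a rough indication rather than a precise estimate. The variable names are illustrative.

```python
reported_2022 = 63_000       # cases reported to the CDC in 2022 (stated as "over 63,000")
estimated_annual = 476_000   # estimate from insurance claims and medical records

ratio = estimated_annual / reported_2022
share_reported = reported_2022 / estimated_annual

print(f"Estimated cases are roughly {ratio:.1f} times the reported count")
print(f"Reported cases capture roughly {share_reported:.0%} of the estimate")
```

On these figures, even after the 2022 reporting change, reported cases account for only a small fraction of the estimated annual total.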
The study finds that geographic Lyme disease risk, including leading-edge areas, can be identified in near real time using Google Health Trends data. This dynamic tool can be used in conjunction with other surveillance metrics to inform the public and health care providers about high-incidence, underreported, and emerging regions of disease risk.
In 2022, the CDC revised its surveillance case definition for high-incidence states, requiring only a positive laboratory test and no longer calling for additional clinical follow-up. Low-incidence states, however, are still required to provide clinical information along with a positive lab test. Reporting in high-incidence states jumped after the CDC revision, but despite this improvement, cases overall are still woefully underreported. The ongoing stringent reporting requirements for low-incidence states likely contribute to undercounting, including in regions of high or increasing incidence, and perpetuate a lack of physician recognition of Lyme disease risk in those areas.
Google Health Trends can identify signals in these regions, such as southern Oregon and northern California, that passive surveillance may not capture well. Its findings of underreported regions (in southern Oregon and northern California) and leading-edge areas (in the Northeast, Mid-Atlantic, and Upper Midwest) highlight regional opportunities to improve surveillance and increase public and physician awareness of Lyme disease.
“The public health contribution of Google Health Trends data may be significant in informing the public and health care providers about emerging risks in their geographic regions.” Integrating this data with other surveillance sources (case reports, lab test results, insurance claims data, electronic health records) should help improve the identification of regions where Lyme disease is prevalent and spreading.
This research was supported by:
This work was funded by a gift from the David P. Nolan Charitable Fund (to FCC, no grant number). The funder was not involved in data collection and analysis, decision to publish, or preparation of the manuscript.