Sunday, August 16, 2009

Report My Flu

Having worked in public health informatics for many years, I have often grappled with the problem of collecting the timely information needed to make decisions about current health problems.

The current and coming flu epidemic is a 'perfect storm' in this respect. We have some information about the flu, but there is much we don't know. We don't know why some people have mild illness while others have severe illness. We don't know how to treat the flu beyond the fact that some antivirals seem to be effective if taken early in the illness. There are many other potential treatments, both supportive and curative, that may be effective, but we don't have good information about how well they work. There is also the problem that flu viruses tend to mutate frequently, so the illness can quickly change character. The public health community desperately needs real-time information on the flu, and there is no good source of this data. Better case reporting can help sort out which treatments are useful.

Normal flu reporting takes place through various organizational mechanisms such as the US CDC, state and local health departments, and research projects. Internationally, the WHO collects some information, and various government and research organizations collect case reports. This information varies widely in quality and completeness, and it tends to arrive slowly.

I have started a project to 'crowd-source' the collection of flu case reports by allowing individuals to report flu cases themselves. This has the potential to build a valuable record of flu cases and to provide information that is useful for guiding treatment, assessing severity, and looking for changes in the virus.

There are many potential problems with this crowd-sourced approach, and I will be the first to admit that it may not yield useful information. Some of the problems can be addressed by good system design, and I will attempt to incorporate the best of my knowledge into the initial data collection design. However, in true crowd-sourcing fashion, I expect that the best design will evolve through suggestions from many of the very smart people out there.

You have probably noticed that I am writing in the first-person singular. That is because, at this time, I am the only person involved in this project. I hope to attract others to the project in time. I should also note that although I am employed by a large public health organization, which must remain nameless, this project is not officially sanctioned by that organization. I doubt that any large public health organization could undertake this type of project, given the many uncertainties and the unusual mode of data collection and analysis.

This project will operate with a few basic underlying principles:
- privacy and security
- open access to anonymous data

First, privacy and security.
The first principle is that individual patient information and information on the submitters of data will be kept private and secure. This is an absolute requirement. The data will be hosted in the US, which has very strict health data privacy and security regulations, and we have an obligation even beyond these regulations to ensure that no personally identifiable information is released.

Second, open access to anonymous raw data.
Besides crowd-sourcing the collection of data, I also plan to crowd-source the analysis of the data. This will hopefully attract bright minds to the task. We will not place any restrictions on access to the anonymous raw data. This, too, is very unconventional in the research world. Most researchers guard their raw data jealously, and it is almost unheard of to release it. I think open access will improve the quality of analysis. Most researchers spend a lot of time cleaning and adjusting their data to 'improve' it. Unfortunately, this process often has the unintended consequence of distorting the data and hiding or obscuring findings. (I adopted this policy after listening to a persuasive lecture by Tim Berners-Lee.)

As should be clear from this post, this project has just started and currently consists only of a placeholder web site, www.reportmyflu.org, which I hope to update soon with more information and data collection software. In true crowd-sourcing style, I hope to attract help with this task. Please comment on this post.