Tuesday, April 22, 2008

Personal Health Profiler™: Part 3

An article last week in ZDNet Healthcare, titled “Creating personal health record value from the bottom up,” focused on my recent posts about our Personal Health Profiler™ (PHPro) system. It describes the PHPro as a spreadsheet-based system in which “data is nested so you can drill into the detail. Links and databases can be added automatically so that when someone clicks on a condition the data says they have, they get real advice on what to do. … The goal [is] to link personal data to actionable information so you become a better-informed health consumer.” That’s true.

The article then goes on to say that a major difference between the PHPro and current-day personal health records (PHRs) is that we build the PHPro “from the bottom-up, rather than the top-down” and, historically, “…top-down solutions usually get built-out first, because there’s motivation to build them. And bottom-up solutions challenge them later.” While one could argue that the PHPro came first (since its development began over two decades ago), this notion of a top-down/bottom-up distinction caught my attention.

Top-Down / Bottom-Up

According to Wikipedia: “Top-down and bottom-up are strategies of information processing and knowledge ordering, mostly involving software … A top-down approach is essentially breaking down a system to gain insight into its compositional sub-systems…[that are] then refined in yet greater detail…until the entire specification is reduced to base elements. … A bottom-up approach is essentially piecing together systems to give rise to grander systems. … In a bottom-up approach the individual base elements of the system are first specified in great detail. These elements are then linked together to form larger subsystems, which then in turn are linked, sometimes in many levels, until a complete top-level system is formed. This strategy often resembles a ‘seed’ model, whereby the beginnings are small but eventually grow in complexity and completeness.”

Based on these definitions, a top-down approach to PHR development focuses on defining the main components of the overall system and then defining the smaller parts needed to make it work. A convenient way to do this is by examining existing top-down PHRs (and even electronic medical records) to determine what data are typically collected, what user interfaces are typically used, what types of reports are typically generated, what technology standards are typically followed, etc. Differentiating one PHR from another can be done by making modifications to certain parts. The result is that all top-down PHRs closely resemble each other and evolve gradually over time through a series of relatively minor changes. In other words, they are “continuous” (“non-disruptive”) technologies offering small incremental improvements to the status quo.

A bottom-up approach to PHR development, in contrast, is a process focusing on defining the fine details first, and then building up from those details to create the complete system. For me, the bottom-up process went something like this:
  • The first thing we did was to research, define, organize (categorize), and compile lists of data likely to be necessary for understanding the whole person fully. I reasoned this would be an ongoing process since these lists would have to evolve considerably over time as health science generated new knowledge and healthcare professionals across all healthcare disciplines provided their input. That is, before architecting the PHPro’s technology, I wanted to be sure that whatever the technology would ultimately be, it must be able to:
    • Collect every possible piece of information that could help gain deeper knowledge and understanding of how a person’s mind (psychology), body (physiology), and environment (both social and physical surroundings) interact and affect one’s physical and mental/emotional health and wellbeing. The biggest challenge here, by the way, was in defining the information required for comprehending a person’s psychosocial and mind-body functioning, since understanding a person’s thoughts, emotions, behaviors, social interactions, and environmental influences--and how this all relates to one’s biology and physical health--requires a great deal more information than focusing solely on understanding a particular medical condition.
    • Enable people to gain and use this knowledge and understanding to help prevent and treat biomedical, psychological, and mind-body problems.

  • As these lists evolved, we began determining how best to present this information, through interactive reports, in ways that increase awareness and understanding, and help support decisions. We reasoned that there would have to be a wide variety of reports, each focused on the knowledge needs and decision needs of people with different roles and responsibilities. That’s because the knowledge needs of people trying to understand and receive support for dealing with an existing health problem or personal life crisis differ significantly from those of the various healthcare professionals. In other words, there are big differences in the information needed by a consumer interested in self-help for a stressful life event, a person who is working with a wellness coach for help managing a chronic condition through lifestyle change, a patient looking for guidance in deciding on the best treatment option for a medical problem, a primary care physician trying to coordinate a patient’s care, a medical specialist (e.g., cardiologist or oncologist) treating a particular physical condition, a mental health professional treating a behavioral problem, etc. Thus, the reports generated by the PHPro would have to come in a wide variety of types that would have to evolve considerably over time.
  • As the reports were being defined, we began developing technical software processes for collecting, storing and distributing the information in a secure and cost-effective manner, and for generating the reports described above.
One thing this bottom-up approach taught me early on is that the data collected, reports generated, and technical methods used must be able to evolve continually as health knowledge grows, new technologies emerge, and standards change. That meant the PHPro had to be a very flexible and adaptive system.

Furthermore, when I began this process in 1981, there was no Internet, the first personal computers were just coming to market, and it was decades before the idea of a PHR was even being discussed in the healthcare industry. That meant we had to discover an original way to build the PHPro. It also meant that we had to find a very cost-efficient way to operate the system since, back then, computer memory, speed and data storage capacity were tiny compared to today.

Having been intrigued by the spreadsheet’s power, efficiency, ease-of-use, and “plasticity” (like molding clay into different forms), we began building the PHPro using spreadsheets in unique ways.

Disruptive Innovation

I've referred to the PHPro as a “discontinuous/disruptive” innovation. This means it uses a radically different technological approach to developing personal health records, compared to existing dominant technologies or status quo products in a market. Unfortunately, disruptive innovations often go unnoticed, or they are ignored for many years. When they are finally recognized, businesses with a stake in maintaining conventional technologies tend to see them as threats and try to lock them out of the market. In fact, my idea of using spreadsheets as the foundation of a PHR application has been ridiculed and dismissed by conventional software developers in the past. This could be because they don’t realize how spreadsheets can be used in novel ways, they are fearful they might lose business to a simpler and less expensive technology, they don’t want to learn a new way of developing software, or for other such reasons.

Nevertheless, I’ve persisted … and here’s why …

Why Spreadsheets?

There are many huge advantages to using the spreadsheet for PHRs and other health information technologies, as long as you know how to handle the challenges. Spreadsheets, after all, have been around for decades, making them one of the most sound and solid types of software ever created. They are efficient, low-cost, easy-to-use, and infinitely flexible (i.e., they can be molded into unlimited types of applications). In addition, spreadsheets have powerful data collection and sharing, computation, model-building, reporting, and automation capabilities. In other words, they offer a quick and easy way to obtain, organize, synthesize, analyze, evaluate, distribute, and display information.
On the down side, spreadsheets must be examined and controlled in order to prevent errors and unauthorized changes. The PHPro does all this in innovative ways.

One thing most people fail to realize is that a spreadsheet is much more than just a big electronic grid with charts. The truth is, a spreadsheet application has three major components:
  • One component is its electronic user forms that enable a person to input information manually, display it, and update it.
  • A second component is its code modules (“macros”), which automate processes for such things as:
    • Obtaining data from databases (i.e., running “queries”) and sending data to databases
    • Extracting data from data streams (transmitted packets of data), from XML documents, and from other text-based files using custom “parsing” modules
    • Performing computations
    • Formatting (“rendering”) information for presentation
    • Transmitting information over the Internet (e.g., using encrypted e-mail attachments)
    • Connecting to other software applications
    • and more.
  • The third component is its sheets (grids) of interconnected spreadsheet cells that work in conjunction with the code (macro) modules. A spreadsheet cell is an electronic “container” that stores, uses and displays numbers, text (up to 10 pages worth), pictures, hyperlinks (to documents and web sites), mathematical and logical formulas, and more. In addition, the contents of any cells can be copied, shared, moved, sorted and filtered, hidden or displayed, and formatted in many different ways (e.g., the color, type and size of the text and numbers in a cell can be set, as can the color and style of a cell’s interior and borders).
These cells and code modules make spreadsheets excellent vehicles for developing robust health information applications from the bottom up, as I describe next.

Examples of how the PHPro uses Spreadsheets in Novel Ways

To exemplify what I’ve just written about spreadsheets, following are four groups of screen shots showing how the PHPro uses spreadsheet forms, grids and modules in innovative ways that deliver a unique range of capabilities and benefits. They explain these processes:
  1. Data definition and collection
  2. Data organization and analysis
  3. Information storage and sharing
  4. Report generation.
1. First comes data definition and collection. As I said earlier, when I started developing the PHPro, my goal was to create a software application able to manage every piece of relevant information over people’s lifetimes. This information would have the potential to help consumers/patients and their healthcare professionals develop a full and deep understanding of the person’s strengths, weaknesses, risks, problems, health trends (changes over time) and preferences, as well as the suitable options for prevention and treatment. I also wanted the information to be “self-actionable” by providing instruction, insight & guidance and warnings & alerts, which would promote a better quality of life by motivating and enabling the person to help him/herself deal with physical, emotional and behavioral concerns.

To accomplish this monumental task, I spent many years researching the healthcare literature and examining health questionnaires. During this process, I built “evergreen” (continually evolving) spreadsheet grids containing lists of questions to be answered by a consumer/patient, as well as by healthcare professionals. The PHPro was then designed to manage people’s answers to these questions, along with data obtained from databases, web sites, data streams and electronic documents (including XML files). Figure 1 (below) shows a small section of one of the PHPro’s Question Grids.

This bottom-up path led to the invention of my patented CP Split™ technology and other “discontinuous/disruptive” innovations, which became key components of the PHPro.

Figure 1
Referring to Figure 1:
  • Column A contains a unique ID number for each question.
  • Columns B, D and E are used to control the branching logic (when the response to one question determines the subsequent question to be presented).
  • Column C is a symbol that identifies the response scale to use (e.g., Yes-No, Yes-No-Uncertain, select one item from a list, select multiple items from a list, use a 1-9 scale, enter unstructured text, etc.)
  • Column F contains the text for each question.
  • Starting in column G and going to the right are the items a person may select in response to the question.
For example, on row 180, the symbol in column C designates a 9-point scale, with “NOT AT ALL” on one end (as indicated in column G) and “A GREAT DEAL” on the other end (as indicated in column H). The “BRL” in column B, the number 4 in column E, and the ID number in column D, all instruct the software to branch (jump to) question “103 01 20 10” if the person’s response is less than 4 (on the 1-9 scale). Note that some of the cells are colored, which gives developers a visual depiction of the types of content in those cells.
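
To make the branching rule concrete, here is a minimal Python sketch of one Question Grid row and the “BRL” test described above. It is only an illustration, not the PHPro’s actual macro code: the class and function names, the ID given to row 180, the scale symbol "S9", the placeholder question text, and the fall-through ID are all assumptions, since the post does not state them.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class QuestionRow:
    """One row of a Question Grid; fields are named after columns A to G+."""
    question_id: str               # column A
    branch_code: Optional[str]     # column B, e.g. "BRL"
    scale: str                     # column C (symbol naming the response scale)
    branch_target: Optional[str]   # column D (ID of the question to jump to)
    branch_value: Optional[int]    # column E (threshold for the branch test)
    text: str                      # column F
    options: list = field(default_factory=list)  # columns G onward


def next_question_id(row, response, default_next):
    """Return the ID of the question to present after `row` is answered."""
    # "BRL" is read here as: branch if the response is less than column E.
    if row.branch_code == "BRL" and response < row.branch_value:
        return row.branch_target
    return default_next


row_180 = QuestionRow(
    question_id="103 01 20 05",    # hypothetical; the post does not give this ID
    branch_code="BRL",
    scale="S9",                    # hypothetical symbol for the 1-9 scale
    branch_target="103 01 20 10",
    branch_value=4,
    text="(question text)",        # placeholder
    options=["NOT AT ALL", "A GREAT DEAL"],
)

print(next_question_id(row_180, 2, "103 01 20 06"))  # response below 4: branches
print(next_question_id(row_180, 7, "103 01 20 06"))  # response of 4 or more: falls through
```

Keeping the branch rule as data in the row, rather than in the code, is what lets new questions be added by editing the grid alone.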

Figure 2 (below) shows a series of screen shots depicting a type of user form the PHPro system uses for manual data input. Macros automate the process by which the forms read the Question Grid above and present the questions and response options to the person; they also collect and store the person’s responses.


Figure 2

As with all the PHPro components, this patented data collection process is very flexible:
  • New questions are added by simply inserting them as new rows into the Question Grid
  • Questions are modified by typing the changes into the Question Grid (and adjusting the ID number accordingly)
  • Questions are removed by deleting their corresponding rows from the Question Grid.
Note that entirely new Question Grids can be constructed at any time in the same manner. In fact, entire libraries of Question Grids can be developed for use by people with different roles and in different situations.

In any case, as the questions are answered, the PHPro automatically stores the person’s responses in a list containing the question ID (in a cell of column A) and the person’s response next to it (in column B). Together these comprise the “raw data,” as shown in Figure 3 (below). Note that data not manually entered (e.g., data queried from databases, extracted from documents, or streamed from medical devices) can be added automatically to the manually input data using custom macros.
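
As a rough illustration of this two-column raw-data list, the Python sketch below merges manually entered answers with automatically obtained values. It is not the PHPro’s macro code: the function name, the sample IDs and values, and the rule that a later value for the same ID replaces an earlier one are assumptions made for the example.

```python
def merge_raw_data(manual, automatic):
    """Combine two lists of (question ID, response) pairs into one raw-data list.

    A later entry for the same ID replaces an earlier one; this overwrite
    rule is an assumption made for the sketch.
    """
    merged = dict(manual)
    merged.update(automatic)
    # Sort by ID so the result reads like a two-column sheet (A = ID, B = response).
    return sorted(merged.items())


manual_answers = [("103 01 20 10", 3), ("101 01 01 01", "Yes")]  # hypothetical values
device_readings = [("201 05 02 01", 118)]                         # hypothetical values

for question_id, response in merge_raw_data(manual_answers, device_readings):
    print(question_id, response, sep="\t")
```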


Figure 3

2. Next comes data organization and analysis. Once the raw data are collected, the PHPro uses another spreadsheet grid, the Publisher Spreadsheet Grid shown in Figure 4 (below), whose cells contain an assortment of formulas (which are not visible in the screen shot). These formulas, along with the grid’s macros, automatically transform the raw data into structured information ready for report writing. Most of the rules (algorithms) for analyzing the data are included in this spreadsheet and others to which it is linked. These rules may contain criteria identifying when certain data indicate the existence of a health problem (e.g., when a lab test is abnormal, when someone’s emotional state or cognitions reflect a serious psychological concern, when a reported symptom may be due to an adverse medication side-effect, etc.).

Since this involves technical spreadsheet model-building, I’m not going to take the time to explain what exactly is in this spreadsheet. Suffice it to say that the data in this publisher spreadsheet are organized and calculated in a predefined manner that corresponds to the PHPro reports.
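
Although the actual formulas are not shown, the general idea of a rule that flags a value as abnormal can be sketched in a few lines of Python. This is a hypothetical illustration, not one of the PHPro’s rules: the function name, the item IDs, and the reference ranges are all made up for the example and have no clinical meaning.

```python
def flag_out_of_range(raw_data, reference_ranges):
    """Return the IDs whose numeric value lies outside its (low, high) range."""
    flagged = []
    for item_id, value in raw_data.items():
        if item_id not in reference_ranges:
            continue  # no rule is defined for this item
        low, high = reference_ranges[item_id]
        if value < low or value > high:
            flagged.append(item_id)
    return flagged


# Made-up IDs, values, and ranges, for illustration only.
raw_data = {"LAB-A": 130, "LAB-B": 4.1, "Q-17": 6}
reference_ranges = {"LAB-A": (70, 99), "LAB-B": (3.5, 5.0)}

print(flag_out_of_range(raw_data, reference_ranges))
```

Holding the ranges in a separate table, as here, mirrors the idea that rules can be revised as health knowledge changes without touching the collected data.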


Figure 4

3. Then the contents of the Publisher Spreadsheet Grid are stored and shared. The contents are now stored in another file, without any macros, formulas or formats. A section of the stored grid, which is called a “Content File,” is shown in Figure 5 (below). The Content File (which can be converted easily to a delimited text file) is encrypted to protect the data inside. It can be retrieved any time the data need to be updated, and whenever the person wants to view their information. And if they want, people can share any portions of their Content File with individuals they authorize.
Note that the Content File can be saved in any location the person wants and its security is compliant with HIPAA regulations. This addresses a concern raised in a recent NY Times article, titled “Warning on Storage of Health Records”. The article discusses how the benefits of personal health records stored in Web-based databases are offset by concerns about risk to privacy. One way to diminish this risk is by putting health records directly in the hands of the individual to whom they belong; that way individuals have complete control over who (if anyone) gets to see their personal information. The PHPro Content Files enable such protection.
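
The conversion of a values-only grid to delimited text can be sketched with Python’s standard csv module. This is an illustration under stated assumptions, not the PHPro’s file format: the function names, the "|" delimiter, and the sample rows are hypothetical. Encryption is deliberately left out; a real system would apply a vetted cryptography library to the resulting text before saving it.

```python
import csv
import io


def to_delimited_text(rows, delimiter="|"):
    """Write plain cell values (no formulas or formats) as delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def from_delimited_text(text, delimiter="|"):
    """Read delimited text back into a list of rows (all values as strings)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


content_rows = [["103 01 20 10", "3"], ["101 01 01 01", "Yes"]]  # hypothetical values
text = to_delimited_text(content_rows)
print(text)
```

Because the file holds only plain values, any tool that reads delimited text can open it once it has been decrypted.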


Figure 5

4. Now comes report generation. Figure 6 (below) portrays a piece of the PHPro report, which I discussed in my initial post on the topic. Only this time I’m showing columns B through F, which are hidden in the actual report.


Figure 6

The cells in these columns contain numeric data that have been extracted from the PHPro Content File (described above). Other cells use these data to determine what rows should be visible and how the data should be displayed. For example, the series of blue boxes in cells K352 and K380, which indicate the amount of distress the person experiences in two situations, are created by formulas in those cells that use data in column E to determine the number of boxes to display.
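
The box display can be mimicked in Python by repeating a character a given number of times, much as a spreadsheet’s REPT function repeats text. This is only a guess at the general technique, not the actual formula in the report: the function name, the box character, and the cap of nine boxes (chosen to match the 1-9 scale) are assumptions.

```python
def distress_boxes(level, box="■", max_boxes=9):
    """Return a string of `level` boxes, clamped to the range 0..max_boxes."""
    count = max(0, min(int(level), max_boxes))
    return box * count


# The numbers stand in for values that would be read from column E.
for level in (3, 7):
    print(level, distress_boxes(level))
```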
A few other things about using spreadsheets for the PHPro reports:
  • In addition to numbers, text and symbols, a PHPro report can contain multiple images (including pictures and charts).
  • A report can contain buttons and links that automatically retrieve and display external information from the Web, as well as from electronic documents stored in a person’s own computer or in other computers via networks.
  • Changing a report is similar to modifying the Question Grid: add rows, delete rows, and change the words, formulas and formats of any cells in any rows. A wide variety of charts (graphs) can also be easily added and removed.

Web-Enabling the PHPro

The PHPro was originally built as a stand-alone desktop application. We are now in the process of making it web-enabled as well, so anyone with a browser and Internet connection can use it. I will have more to say about this in future posts.

Fertilizing the Seed

The Wikipedia definition quoted earlier in this post included the statement that the bottom-up strategy often resembles a “seed” model in which an application’s small beginning eventually grows in complexity and completeness. This requires that the “seed” be nourished (fertilized). The PHPro is designed to grow and evolve continually through a collaborative process in which consumers/patients, sick-care and well-care professionals, research scientists, educators, software developers and others provide ideas and content that are incorporated into the system. Because it is built with highly efficient and flexible spreadsheets, uses a library of categorized data definitions (similar to a library’s Dewey Decimal system), has a modular structure, and can interoperate with most (all?) other software systems, the PHPro can be molded and expanded into ever-more-powerful knowledge tools that are tailored to the needs of just about anyone. And best of all, this can be done for little cost and with little hassle, which is an important consideration in today’s difficult economic climate.

My hope is that these posts will help motivate people from all groups to join our team of collaborators and grow the seed we’ve been nourishing into a complete, diversified personal health knowledge system that has a positive impact on the health and wellbeing of all people.
