Friday, December 18, 2009

Combining Cloud Computing, Client-Server and Novel Pub/Sub Mesh Node Network Architectures (Part 1 of 2)


There are at least three key architectures for deploying health IT software programs and exchanging patient information:
  1. Cloud computing
  2. Client-server
  3. Publish/subscribe mesh node networks.
Each of these architectures has its own use cases and, as I discuss below, I've concluded that using all three provides the best solution for operating a national health information network (NHIN).

Cloud Computing Architecture Described

Here's a definition of the cloud computing architecture that I believe captures its essence: Cloud computing architectures store software programs and data in servers that are accessed over the Internet via services (e.g., software as a service, SaaS); web browsers provide end users an online interface to those services. The servers—along with the software and centralized databases they contain—may be owned and managed by third-party vendors (public cloud), by the end user's organization (private cloud), or by both (hybrid cloud). In other words, typical cloud computing enables applications to be accessed online from a Web browser, while the software and data are stored on remote servers. Also see this link.

Client-Server Architecture Described

Then there's the client-server architecture, in which the data are stored in centralized or distributed databases on servers controlled by the end user's organization or by third-party vendors. Each client sends data to, and receives data from, those databases. As with the cloud, the data are not stored locally on the end users' computers.

Novel Publish/Subscribe Mesh Node Network Architecture Described

Then there is the publish/subscribe (pub/sub) mesh node network architecture, which I've been promoting. A "node" in this architecture is a connection point in a network, and a "mesh node network" is a federated/distributed node structure in which any node can communicate with any other node, much as telephones do in the telephone system.

Each node consists of a desktop (standalone) software program installed on a personal computer/laptop/notebook, smart phone, or other computerized device that accesses patient data from any appropriate data store (hard drive, flash drive, smart card, memory stick, etc.). This desktop software gives each node both publisher (sender) and subscriber (receiver) functionality. Thus, a publishing node sends data to its subscribing nodes (i.e., the nodes subscribing to it). Such data exchange can be done asynchronously in near real time, which means data can be sent as soon as they are ready, without waiting for the receiver to signal that it is ready to receive them. Note that a node may be partially automated (i.e., require some human input) or it may operate in a completely automated (unmanned) manner.
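
As a rough illustration of the asynchronous pub/sub behavior just described, here is a minimal Python sketch. It is not the CPS software: the `Node` class, its method names, and the in-memory inbox (standing in for e-mail or another transport) are all assumptions made for this example.

```python
from collections import deque


class Node:
    """Hypothetical node with both publisher and subscriber roles."""

    def __init__(self, name):
        self.name = name
        self.subscribers = []  # nodes that subscribe to this node
        self.inbox = deque()   # delivered messages not yet retrieved

    def subscribe_to(self, publisher):
        # Register this node as a subscriber of the given publisher.
        publisher.subscribers.append(self)

    def publish(self, payload):
        # Asynchronous delivery: place the message in each subscriber's
        # inbox and return at once, without waiting for a "ready" signal.
        for node in self.subscribers:
            node.inbox.append((self.name, payload))

    def retrieve(self):
        # The subscriber collects its messages whenever it is ready.
        messages = list(self.inbox)
        self.inbox.clear()
        return messages


clinic = Node("clinic")
lab = Node("lab")
pcp = Node("pcp")

pcp.subscribe_to(clinic)
pcp.subscribe_to(lab)

clinic.publish({"patient_id": "p-001", "bp": "120/80"})
lab.publish({"patient_id": "p-001", "glucose": 95})

received = pcp.retrieve()
```

Because `publish` only queues the message, a publisher is never blocked by a subscriber that is offline or busy; the subscriber drains its inbox later with `retrieve`.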

This node-to-node architecture is, in many ways, unlike typical peer-to-peer (P2P) file-sharing systems such as Gnutella and Napster. As described in the italicized link above, my proposed system uses a unique underlying technology, the patented CP Split™ (CPS) software method I invented 12 years ago, by which:
  • CPS publishing nodes use data grid templates to create maximally efficient CPS data files. These dense small files are encrypted and shipped inexpensively and securely by e-mail (or other communication protocols).
  • CPS subscribing nodes retrieve the CPS data files in near real time, store them locally, and use corresponding data grid templates to consume (open and use) those data files. They then present (display) customized reports or export the data to other data stores (e.g., local databases).
See this link for a depiction of how CPS nodes provide a simple, low-cost, secure way to exchange health data.
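
The CPS file format itself is not specified in this post, so the following Python sketch is not the CPS method. It only illustrates, under assumptions of its own, the general idea that a template shared by publisher and subscriber lets the sender ship values without field names. The `TEMPLATE` list, the `|` delimiter, and the `encode`/`decode` helpers are hypothetical, and encryption is left out.

```python
import json

# Hypothetical shared template: an ordered list of field names held by
# both publisher and subscriber, so the names never travel with the data.
TEMPLATE = ["patient_id", "birth_date", "systolic", "diastolic"]


def encode(record, template):
    """Publisher side: write only the values, in template order."""
    return "|".join(str(record[field]) for field in template)


def decode(line, template):
    """Subscriber side: reattach field names using the same template."""
    return dict(zip(template, line.split("|")))


record = {
    "patient_id": "p-001",
    "birth_date": "1970-03-04",
    "systolic": 120,
    "diastolic": 80,
}

packed = encode(record, TEMPLATE)
restored = decode(packed, TEMPLATE)

# Self-describing JSON text for the same record, for size comparison only.
self_describing = json.dumps(record)
```

In this toy version every value comes back as a string and values may not contain the delimiter; a real format would need typing and escaping rules.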

In addition to their asynchronous pub/sub functionality (described above), CPS nodes have other critical capabilities including:
  • Data transformation
  • Universal translation
  • Composite reporting.
These capabilities, which are described below, enable CPS node networks to exchange patient information easily no matter what data structures and terminology standards are used, and no matter where the original data are stored. Data transformation and universal translation provide a means for modifying (transforming, translating) data as they pass between the nodes, so that all subscriber nodes receive from their publisher nodes the right data, in the right format (structure) and language, and with the right terms (semantics). The nodes can therefore accommodate all data standards, as well as any non-standardized (local or domain/discipline-specific) data. Composite reporting, in turn, combines data sent from multiple publisher nodes to a single subscriber node in order to generate reports containing information from multiple sources.

Data Transformation Capabilities

CPS nodes use their extensive data transformation capabilities when data structures have to be modified to allow disparate databases to exchange their data. This happens when, for example, the databases have (a) different table and field names (e.g., one database may use the field "birth_date" and another "dob") or (b) different data formats/syntax (e.g., whether the birth date is written month/day/year, as in the U.S., or day/month/year, as in Europe, and how many digits or characters are used in the date).
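
The two mismatches named above, field names and date formats, can be sketched in a few lines of Python. This is an illustrative example only; the `transform` function, its parameters, and the sample record are assumptions, not part of the CPS software.

```python
from datetime import datetime

# Hypothetical mapping from the source database's field names to the
# names the receiving database expects.
FIELD_MAP = {"dob": "birth_date"}


def transform(record, field_map, date_fields, src_fmt, dst_fmt):
    """Rename fields and re-format date values for the receiver."""
    out = {}
    for name, value in record.items():
        new_name = field_map.get(name, name)
        if new_name in date_fields:
            # Parse with the sender's format, write with the receiver's.
            value = datetime.strptime(value, src_fmt).strftime(dst_fmt)
        out[new_name] = value
    return out


# Source record uses "dob" and day/month/year (4 March 1970).
source = {"dob": "04/03/1970", "name": "Jane Doe"}

# Receiver wants "birth_date" in month/day/year.
result = transform(source, FIELD_MAP, {"birth_date"}, "%d/%m/%Y", "%m/%d/%Y")
# result == {"birth_date": "03/04/1970", "name": "Jane Doe"}
```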

One way to deal with the issue of exchanging data between incompatible databases is to force everyone to use the same data standard. One such method is to transform all data to XML format, using a common "schema" (data structure), before shipping those data between databases. While the CPS nodes can exchange XML files, there are distinct advantages to having a publisher node either (a) convert XML data into CPS data files before shipping them to other nodes or (b) convert data from a database directly into a CPS data file without any XML. As I discussed at this link, two advantages of the CPS data files over XML are: (1) they are easily 10 to 20 times smaller than XML files containing the same data (that is, one-tenth to one-twentieth the size) and (2) they are much simpler and more efficient to process (i.e., parse and render) than XML.

As the data are written to the CPS data file, the publishing node transforms them according to the data format needed by each subscribing node. The publisher then ships the transformed data to the subscribing node. The transformation process requires that the publisher node be notified in advance of the data transformations required by each subscribing node. This notification can happen when a subscribing node registers with a publishing node, or when a subscribing node requests data from the publishing node.
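
One way to picture this registration step is a publisher that keeps a per-subscriber profile and applies it at send time. The Python sketch below is a simplified assumption of how that could look, limited to field renaming; the `Publisher` class, the `register` and `prepare` methods, and the subscriber names are invented for illustration.

```python
class Publisher:
    """Hypothetical publisher that stores each subscriber's requirements."""

    def __init__(self):
        # subscriber id -> field-rename map supplied at registration
        self.profiles = {}

    def register(self, subscriber_id, field_map):
        # Called when a subscriber registers (or first requests data),
        # so the publisher knows the required transformations in advance.
        self.profiles[subscriber_id] = field_map

    def prepare(self, record):
        """Return one transformed copy of the record per subscriber."""
        outgoing = {}
        for subscriber_id, field_map in self.profiles.items():
            outgoing[subscriber_id] = {
                field_map.get(name, name): value
                for name, value in record.items()
            }
        return outgoing


publisher = Publisher()
publisher.register("hospital_a", {"dob": "birth_date"})
publisher.register("clinic_b", {})  # clinic_b accepts the data as-is

outgoing = publisher.prepare({"patient_id": "p-001", "dob": "1970-03-04"})
```

Each subscriber receives its own copy, so adding a new subscriber with different needs does not change what existing subscribers receive.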

Universal Translation Capabilities

CPS nodes use their universal translation capabilities when publishers use different languages or terms than their subscribers. This is most likely to happen when people in loosely coupled professional and social networks share information; that is, when diverse groups of individuals exchange data from diverse data sources. One such example is the vast diversity of people who will exchange a great variety of data across a national health information network. Sometimes language translation is necessary (e.g., English to Spanish), which is fairly straightforward. Other times different people use different terms (their local terminology standards) to refer to the same concept (thing or idea), which can be a complex problem.

One common strategy for dealing with complex terminology problems is to discard local terminology standards by forcing everyone to adopt the same global terminology standards. This is done by agreeing on one set of terms (semantics) for a particular thing. While such global standards for health-related terms can foster widespread communication between people from different regions, organizations and disciplines, there is a serious downside to eliminating the local standards people rely upon: important information is lost because semantic precision and nuance are reduced.

Take, for example, the term "high blood pressure;" there are 126 different terms referring to this concept of elevated blood pressure levels. These terms include "malignant hypertension," which refers to very high blood pressure with swelling of the optic nerve behind the eye; it's a condition usually accompanied by other organ damage such as heart failure, kidney failure, and hypertensive encephalopathy. "Pregnancy-induced hypertension," on the other hand, is when blood pressure rises during pregnancy (also called toxemia or preeclampsia). These are very different types of hypertension. So, while referring to a person's condition using a global standard term such as "hypertension" clearly conveys that the person has high blood pressure, the standardized term loses important details found in the more detailed local standard terms. These lost details, in turn, could very well affect treatment decisions and outcomes. So, there is a good reason to have multiple terms that refer to the same health-related concept.

An advantage of the nodes' universal translation capabilities, therefore, is that they enable end users to keep existing local standards, support the evolution of those standards, and use term translation to ensure everyone gets the information needed using the terms they need and understand. In a node network, this can be accomplished in a way similar to data transformation. Instead of transforming the data's structure, however, the publishing node replaces specific terms with the alternate terms required by the subscribing nodes.
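
A small Python sketch can make the term-replacement idea concrete. It assumes, purely for illustration, that each subscriber supplies a lookup table of the terms it wants substituted; the `TERM_MAPS` table, the subscriber names, and the `translate_terms` function are hypothetical.

```python
# Hypothetical per-subscriber term tables. A statistics registry wants the
# broad global term, while a cardiology node keeps the precise local terms.
TERM_MAPS = {
    "registry": {
        "malignant hypertension": "hypertension",
        "pregnancy-induced hypertension": "hypertension",
    },
    "cardiology": {},
}


def translate_terms(record, subscriber, term_maps, fields=("diagnosis",)):
    """Replace terms in the chosen fields with the subscriber's terms."""
    mapping = term_maps.get(subscriber, {})
    out = dict(record)  # the publisher's own record is left untouched
    for field in fields:
        if field in out:
            out[field] = mapping.get(out[field], out[field])
    return out


local_record = {"patient_id": "p-001", "diagnosis": "malignant hypertension"}

for_registry = translate_terms(local_record, "registry", TERM_MAPS)
for_cardiology = translate_terms(local_record, "cardiology", TERM_MAPS)
```

The publisher's stored record keeps its detailed local term; only the copy sent to a subscriber that asked for the broader term is changed.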

Composite Reports

The nodes can also generate composite reports composed of data sent from multiple publisher nodes to a single subscriber node. The subscriber node takes the information from those sources and combines it into a single integrated patient health report that is tailored to the needs of the individual.

For example, let's say a primary care physician (PCP) wants to keep track of the treatment a patient is receiving from several specialists. The PCP's node, which serves as the subscriber, would send a request for certain data to all the patient's specialists. Upon receipt of the data request, the specialists' nodes, which serve as the publishers, retrieve the requested data from their different electronic health record databases, transform and translate the data as necessary, and then send the data automatically to the PCP's node. The PCP's node then incorporates the data into a composite report tailored to the PCP's needs and preferences, and presents the report on screen for the PCP to view. The PCP's subscriber node could also be instructed to request data from the patient's node, which is connected to the patient's personal health record, and, upon receipt, add the data to the same report. Likewise, a patient node could create composite reports in a similar manner from data sent by multiple provider nodes.
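
The PCP scenario above can be sketched as a subscriber-side merge. The Python below is only an illustration under simple assumptions: each publisher's submission arrives as a (source name, record) pair, and the `build_composite` function and all sample data are made up for this example.

```python
def build_composite(patient_id, submissions):
    """Combine records from several publisher nodes into one report."""
    report = {"patient_id": patient_id, "sections": {}}
    for source, record in submissions:
        # Ignore anything that belongs to a different patient.
        if record.get("patient_id") != patient_id:
            continue
        # One report section per source, without repeating the patient id.
        report["sections"][source] = {
            name: value
            for name, value in record.items()
            if name != "patient_id"
        }
    return report


# Data as received by the PCP's subscriber node from three publishers.
submissions = [
    ("cardiologist", {"patient_id": "p-001", "bp": "150/95"}),
    ("endocrinologist", {"patient_id": "p-001", "hba1c": 7.2}),
    ("cardiologist", {"patient_id": "p-002", "bp": "118/76"}),
]

report = build_composite("p-001", submissions)
```

A patient node could call the same kind of function on data sent by its provider nodes to build its own composite view.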

So, which architecture is best: cloud, client-server, the novel pub/sub node network, or a combined solution that embraces all three? I contend that it's the last of these, whereby the benefits of each architecture are realized in different use cases.

In my next post, I will share my thoughts about this multifaceted solution by presenting possible use cases.

2 comments:

chillyzhosting said...

Cloud hosting has become the latest form of shared computing and shared hosting. Whenever we use free services like Gmail, Flickr or Photobucket, we are using cloud hosting services unknowingly.

Dr. Steve Beller said...

This is true. And security, even of Gmail, is an ongoing concern with public clouds (see http://bit.ly/7xxon1).

So, public clouds do have value, but NOT for storing your personal health information if you're concerned about security and privacy.