Thursday, June 6, 2013

The Cost of Free Software.... and Non-free

There is continual debate (and unfortunately confusion) about the cost, use and value of "free" software. I am hoping to clear up a few things with this post.

First, we should define:
Free Software is Free Open Source Software (FOSS). This software is free of cost to use. It also has free access to the source code which allows you to make changes to the software and allows you to distribute these changes to anyone for them to also use the software.
A program is free software if the program's users have the four essential freedoms:

  • The freedom to run the program, for any purpose (freedom 0).
  • The freedom to study how the program works, and change it so it does your computing as you wish (freedom 1). Access to the source code is a precondition for this.
  • The freedom to redistribute copies so you can help your neighbor (freedom 2).
  • The freedom to distribute copies of your modified versions to others (freedom 3). By doing this you can give the whole community a chance to benefit from your changes. Access to the source code is a precondition for this.

Non-Free Software places restrictions on one or more of the four freedoms. There might be restrictions on where and when you can run the software. The source code might not be available for examination and changes. You might not be able to distribute copies of the software. Someone else controls the software.

It's not (just) about the cost.
Yes, free software is free in that you don't have to pay for the software. However, many people fail to look beyond cost to the other values of free software. Most people don't think that they will ever need the source code to modify the software. But... most people at some point will think that it would be nice if the software did something new or different. Free software allows anyone to make changes to the software and to distribute these changes. Once a free software project reaches a certain point where it becomes useful, there is usually a community of developers and users who exchange ideas on how to improve the software and incorporate those ideas into new versions of the software.
Most people don't consider that they will spend a lot of time working with software to learn how to use it and in entering data. What happens if non-free software stops working or the company supporting it goes out of business or stops responding to questions and suggestions from users? The users' investment in time in learning the software and entering data will be wasted. Their data may also be inaccessible.
What happens if the data one is collecting is confidential and must be stored on computers under the direct control of your organization? Some software uses cloud servers. Some non-free software does not give you the option of running your own cloud server. You may not be able to use this software if your data must be under your direct control.

What about the cost of free software?
Free software is free. This means that it doesn't cost money to use it.
However, we all know that the cost of implementing a software system includes more than the cost of the software itself.  There is the cost of the hardware to run the software, the cost of training, the cost of support and maintenance.

Let's look at the total cost of implementing software. This is sometimes called "total cost of ownership" or TCO. We'll use the example of software which allows users to collect data using a mobile device. This is a common function in most development organizations. This is also a common configuration for modern software which is usually composed of two components: a client piece which is installed on a PC or mobile device and a server piece which holds the data.

Fortunately, we have a lot of choices for mobile data collection software. There is a good comparison of 24 different mobile data collection software systems at Humanitarian Nomad.
I'll compare the cost and the advantages of three different data collection platforms. Two of these are in the Humanitarian Nomad data set and the third is not. I've chosen these systems for comparison primarily because I am familiar with them.
These systems are:

  • Open Data Kit (ODK), which is a set of software applications (one for the mobile device and one for the data collection server). You can set up your own server or use Formhub as a public server to collect and analyze your data.
  • Magpi (formerly EpiSurveyor) which is a software application which runs on mobile devices to collect data. You must use the data collection server provided by Magpi. There is no option to set up your own.
  • DHIS2 (District Health Information System) software is much more comprehensive than the two options above in that it is designed to collect data from multiple sites and aggregate this data up through many levels of an organization. It also includes a set of sophisticated analysis tools (including mapping). It is free software which runs on a server and is accessed through a web browser. Mobile data collection can be through a mobile device web browser or through a mobile java client.

I won't try to do a full TCO comparison but will look at the most common costs and, in particular, at where these costs differ among the three systems: namely, software cost and server cost. All of these systems allow data collection on mobile devices such as phones and tablets. All of them send their data to a server for aggregation and analysis.




                       | ODK   | ODK Formhub | Magpi                          | DHIS2
Software cost          | Free  | Free        | $0.20-$0.25 per form submitted | Free
Server hardware cost   | Self* | Free        | Included in service            | Self*
Per use cost           | Free  | Free        | (Free for 500 forms/mo.)       | Free
Free software          | Yes   | Yes         | No                             | Yes
User-controlled Server | Yes   | No          | No                             | Yes

The server cost is one of the differences here. If you use ODK and Formhub, there is no cost for the software or the server. If you use Magpi, the software and server are provided as part of their service. It is free for limited use (20 forms, 100 questions, 500 forms/month) but anything more than that costs $0.20 to $0.25 per form submitted. You must "pre-purchase" blocks of forms for $5,000 or $10,000. There is no option to set up your own server, so if you have confidential data that must be under your direct control (either by law or by your organization's policy), you cannot do this with Magpi.
*The cost to set up a server for ODK or DHIS2 will vary depending on the skills and resources of your organization. The server hardware itself can cost anywhere from a few hundred dollars to several thousand dollars, depending on capacity. There is also the option of hosting the server on a cloud service where you rent the server by the month, which usually costs about $100 a month.
The server software for both ODK and DHIS2 is free and if your organization has IT skills, you can install the server software yourself. If not, you will need to hire someone. I am not highly technically skilled and I have set up servers for both ODK and DHIS2 and it took me about a day (including some head scratching time). If you have to pay someone, figure a day's work (although I have seen skilled people set these servers up in just a few hours). This can be done remotely for either a cloud server or a server on your site, so there is no need for travel. Most software is updated periodically and it is wise to keep your software current. This can be done in a few hours a few times a year.
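To make the arithmetic above concrete, here is a rough Python sketch that compares a year of per-form hosted pricing with a year of running free server software on a rented cloud server. The per-form rate, free tier, block price and cloud rental figure come from this post. The consultant day rate and the number of maintenance days are hypothetical placeholders, and I am assuming that every form is billed once you exceed the free tier and that blocks are bought in $5,000 units. Treat it as an illustration to adapt, not as a statement of anyone's actual pricing rules.

```python
import math

# Figures quoted in this post
FREE_FORMS_PER_MONTH = 500      # free tier limit
RATE_PER_FORM = 0.25            # upper end of the $0.20-$0.25 range
BLOCK_PRICE = 5_000             # smallest pre-purchase block
CLOUD_SERVER_PER_MONTH = 100    # "about $100 a month"
SETUP_DAYS = 1                  # "about a day" to install the server

# Hypothetical values, chosen only for illustration
DAY_RATE = 500                  # assumed consultant day rate in dollars
MAINTENANCE_DAYS_PER_YEAR = 2   # assumed: a few hours, a few times a year


def hosted_per_form_cost(forms_per_month: int) -> int:
    """Yearly cost of a per-form hosted service bought in whole blocks.

    Assumption: above the free tier, every submitted form is billed.
    """
    if forms_per_month <= FREE_FORMS_PER_MONTH:
        return 0
    yearly_charge = forms_per_month * 12 * RATE_PER_FORM
    blocks = math.ceil(yearly_charge / BLOCK_PRICE)
    return blocks * BLOCK_PRICE


def self_hosted_cost() -> int:
    """First-year cost of free server software on a rented cloud server."""
    hosting = CLOUD_SERVER_PER_MONTH * 12
    labor = (SETUP_DAYS + MAINTENANCE_DAYS_PER_YEAR) * DAY_RATE
    return hosting + labor


if __name__ == "__main__":
    for volume in (500, 1_000, 2_000):
        print(volume, hosted_per_form_cost(volume), self_hosted_cost())
```

Change the day rate and the monthly form volume to match your own situation. The point is only that both sides of the comparison can be written down and checked.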

So, yes, free software is indeed less expensive to implement. It also gives you more options for configuration, use and features. Most importantly, it gives you the assurance that you (and you only) control your data and your software so that you are not suffering at the whim of another organization which may have different priorities.

I hope this discussion has shed some light on the subject of the cost of free software. There are many other costs including training and there are, of course, differences in features between software choices. All of these should be evaluated carefully.



Saturday, June 1, 2013

The Cost of Free Software

A recent post (more like a rant) by Joel Selanikio on Datadyne's web site:

Global development: where “free” means “expensive”
JOEL SELANIKIO ON 29 MAY 2013
In global development, I often hear people talking about “free” technology, particularly free software.  Well, as part of the team that has created Magpi, I’m a great believer in free software.  But my definition of free — “doesn’t cost any money to use” —  seems to be very different from the definition used in global development discussions.  Which is odd since “free” seems like a pretty basic concept.

Right off the bat, Joel misses the point of Free Open Source Software (FOSS). Yes, it is free of cost which he seems to understand but it is much more. To quote Richard Stallman:
"Free software means that you, as a user, have four essential freedoms: (0) to run the program as you wish, (1) to study and change the source code so it does what you wish, (2) to redistribute exact copies, and (3) to redistribute copies of your modified versions."

FOSS is much more than just free of cost. It gives you control over the software. You are not beholden to another who may put undesirable restrictions on your use of the software. This is especially important when you will typically invest a large amount of time and resources in configuring the software, training people to use the software and entering data into the software. You don't want to wake up and find that someone else has put restrictions on the software preventing you from using it as you wish.

Joel goes on to say:
"So what part of “free” doesn’t global development understand?  As it turns out, quite a lot — because international development consultants often use the word “free” when recommending systems that are VERY expensive to implement (usually open-source systems)!  Even worse, they seem to think it’s the end-users’ fault if they don’t understand that when a consultant says “free”, they mean “it will require many consultants”."

I can only say to Joel, "So what part of "free" doesn't Joel understand?"

As an international development consultant, I have never misled people about the cost of software. I often recommend FOSS software and I always point out that it will cost money to configure the software, train people and enter data. These costs for FOSS software are usually comparable to the costs for proprietary software (although they can be less since there is usually an active community of users willing to help).

Joel then goes on to compare the cost of his Magpi survey software with the DHIS2 (District Health Information System) software. This is an odd comparison since these software packages perform very different functions. Magpi (formerly EpiSurveyor) is survey software (designed to collect data from surveys), while DHIS2 is used at clinical facilities to keep track of clinic statistics and individual patient information and to roll these numbers up for analysis in monitoring and evaluation of programs, management of services, planning and policy. It is much more complex software that performs many more functions. Really apples and oranges.

Anyway, if you look at Joel's cost comparisons, he is really trying to point out the difference in cost between setting up a server on site versus having external hosting ("software as a service" SAAS). Joel compares the cost of Magpi's "free" (there's that word again) service with the cost of setting up a server in the host country.
OK... a few obvious problems here right off the bat... First, Magpi's "free" service level is only free if you don't plan on using it very much. People who use the software a lot will have to pay $5,000 or $10,000 a year... so not really free. He then vastly overestimates the cost to set up a server for DHIS2. Since I recently contracted to have this done, I know exactly what it costs and it is nowhere near his inflated estimates. The cost was less than $20,000 for an on-site consultant to install, configure and train people, and the ongoing cost will be only a few thousand dollars a year (if that) for maintenance, upgrades, etc. Much less than the $50,000 a year he estimates.
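A quick way to see how far apart these numbers are is to add them up over a few years. The Python sketch below uses the figures from this post: a setup cost of at most $20,000 and the disputed estimate of $50,000 a year. The $3,000 yearly maintenance value is my own stand-in for "a few thousand dollars a year", so treat it as an assumption.

```python
SETUP_COST = 20_000          # upper bound from this post ("less than $20,000")
YEARLY_MAINTENANCE = 3_000   # assumed stand-in for "a few thousand dollars a year"
ESTIMATED_YEARLY = 50_000    # the yearly estimate disputed in this post


def actual_cumulative(years: int) -> int:
    """One-time setup plus yearly maintenance, summed over the given years."""
    return SETUP_COST + YEARLY_MAINTENANCE * years


def estimated_cumulative(years: int) -> int:
    """The disputed flat yearly estimate, summed over the given years."""
    return ESTIMATED_YEARLY * years


if __name__ == "__main__":
    for years in (1, 3, 5):
        print(years, actual_cumulative(years), estimated_cumulative(years))
```

Even with the setup cost taken at its upper bound, the gap widens every year because the one-time cost is not repeated.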

It would be much better to compare Magpi to Open Data Kit (ODK, opendatakit.org), which is also survey software. ODK seems to have more features and a more active support community but for purposes of discussion, we'll assume they are equivalent. ODK Collect and Aggregate are FOSS and there are free options for SAAS hosting, including FormHub (formhub.org) and Enketo (enketo.org). There is also a nice link to Google App Engine if you want to host your own in the cloud. They are part of the JavaRosa project. Unlike Magpi, all of the code and services are "Free" as in free of cost for use and hosting, or you can set up your own server.

Magpi says that it is "free" (of cost) but only for limited use. Anything more will cost you. Also, Magpi doesn't provide the source code and you can't host your own server, so you don't have control of the software or your data. It is therefore not free, but rather subject to Magpi's terms and conditions.

So, to get back to DHIS2 which was inexplicably compared to Magpi... Yes, you can run DHIS2 in a SAAS "cloud" and that is exactly what a number of countries are doing. Yes, it does cost something to run the cloud servers (but much less than Joel's estimates).

The software industry has long recognized that the "cost" of software purchase is only a part of the total cost of ownership. In fact, they have always promoted the idea of TCO studies when contemplating any software change. It would be naive to expect otherwise.  I don't know what set Joel off on his rant but it really doesn't do anyone any good to spread FUD.
Software has costs. FOSS software doesn't have a purchase price but does have many of the other costs of any software installation. FOSS software also gives you the freedom to control the software and your data which is the most important part of "free".

Free is a pretty basic concept and it's surprising that so many people get it wrong.

-----

Some additional background information from Wikipedia:
Technology deployment can include the following as part of TCO:
  • Computer hardware and programs
    • Network hardware and software
    • Server hardware and software
    • Workstation hardware and software
    • Installation and integration of hardware and software
    • Purchasing research
    • Warranties and licenses
    • License tracking - compliance
    • Migration expenses
    • Risks: susceptibility to vulnerabilities, availability of upgrades, patches and future licensing policies, etc.
  • Operation expenses
    • Infrastructure (floor space)
    • Electricity (for related equipment, cooling, backup power)
    • Testing costs
    • Downtime, outage and failure expenses
    • Diminished performance (i.e. users having to wait, diminished money-making ability)
    • Security (including breaches, loss of reputation, recovery and prevention)
    • Backup and recovery process
    • Technology training
    • Audit (internal and external)
    • Insurance
    • Information technology personnel
    • Corporate management time
  • Long term expenses
    • Replacement
    • Future upgrade or scalability expenses
    • Decommissioning
In the case of comparing TCO of existing versus proposed solutions, consideration should be put towards costs required to maintain the existing solution that may not necessarily be required for a proposed solution. Examples include the cost of manual processing that is only required to support lack of existing automation, and extended support personnel.