Friday, September 23, 2011

Making Data Public

What sort of information should organizations make public?  Phrased like this, the question usually means: what type of information is displayed on the company's web site?  Information for the web has been prepackaged, so to speak, for consumption by human readers.  Perhaps the more interesting question, the one I'm interested in anyway, is this: what type of raw data should be exposed through an API?  Who would use such data and for what purpose?

Pre-formatted information is difficult to make sense of in large volumes.  Google manages it for its search index by crawling monumental assemblages of HTML pages and other web resources.  But those of us who don't have enormous computing centers and the software capable of deciphering these data sets need something a little more primitive: something that'll let our software draw its own conclusions about public data.
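To make the difference concrete, here is a minimal sketch in Python.  The product listing, the page markup, and the JSON field names are all made up for illustration; no real site or API is implied.  It pulls the same prices out of a page written for people and out of a response written for software.

```python
import json
from html.parser import HTMLParser

# Hypothetical product listing as it might appear on a web page, formatted for people.
HTML_PAGE = """
<ul>
  <li><b>Widget</b> - only $4.50!</li>
  <li><b>Gadget</b> - only $12.00!</li>
</ul>
"""

# The same hypothetical data as an API might return it, structured for software.
API_RESPONSE = '[{"name": "Widget", "price": 4.50}, {"name": "Gadget", "price": 12.00}]'


class PriceScraper(HTMLParser):
    """Pulls prices out of the page by guessing at its layout."""

    def __init__(self):
        super().__init__()
        self.prices = []

    def handle_data(self, data):
        # Fragile: depends on the exact wording and the "$" sign in the markup.
        if "$" in data:
            amount = data.split("$")[1].rstrip("!").strip()
            self.prices.append(float(amount))


def prices_from_html(page):
    scraper = PriceScraper()
    scraper.feed(page)
    return scraper.prices


def prices_from_api(body):
    # No guessing: the field names say what each value means.
    return [item["price"] for item in json.loads(body)]
```

Both functions return the same numbers for this sample, but the HTML version stops working as soon as someone rewords the page or changes the currency symbol.  The structured version only depends on the field names staying put.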

Who wants to expose their data?
The internet changed things for all organizations, large and small.  Having a place on the web is no longer a nice-to-have; you simply cannot compete without one.  For one thing, consumers have come to expect a place where they can gather information on products and services.  Also, the web is a social animal now.  Companies need a social presence on the web if they're to engage with their customers.

This is just web technology, and read-only web technology at that: information on what the business does, how it's a cut above the competition, and so forth.  It's just the ignition for the social correspondence that follows.  I won't bother going into the whole social end of it because its importance is fairly obvious.

What isn't so obvious, however, is how organizations choose the specific information they publish on their websites.  Obviously, publishing damaging information will come back to haunt the organization.  Hence the reluctance to publish anything at all.  So why take the chance of making more information available to consumers when it's much safer to make less available?

A competitive edge
Making useful data public will absolutely give your organization a competitive advantage.  That of course assumes that you have both interesting data to make available and the know-how to design an API that other software makers can work with.  If you've got those two things, you're ahead of the game because you're enabling third-party software to be developed on your behalf.  This software directly impacts how existing and potential customers connect with your organization.

Take Apple's App Store, for instance.  There is obviously no shortage of applications available for users to install and use.  In fact, the number of applications available in the App Store is amazingly imbalanced relative to other device makers' software markets.

Developers who make these applications aren't doing it because of the interesting data Apple is exposing through an API; they're doing it because of the iPhone's popularity.  There is a much larger potential user base.  So how does this relate to making your data available to the general public through an API?  This is the type of following you want from developers.  How useful would the iPhone be without its plethora of third-party applications?  It would still be a great device, but it would be missing a lot of things that users want.

Another aspect of the competitive advantage of having third-party software built for you is that developers will use your data in ways you haven't thought of.  Could Apple really have thought of all the applications available for their devices?  Probably, but not in a time frame of just a few years.

Fear of over-exposure
If our companies weren't scared of exposing data items they shouldn't, they probably would have done it years ago, right?  I don't think that's necessarily true, because we're still only discovering the neat things we can build with other organizations' data.

How do you protect yourself from exposing too much information, perhaps handing a competitive edge to your competition?  I'll tell you what won't protect you: declining to build an API for interesting information.  The reality is that weaknesses are easy to spot, and every company in the world has at least one.  If the competition is smart enough to exploit these weaknesses, they won't need an API to do it.

So listen to your customers.  What kind of applications would they find useful?  Or, more generally, what information would they find beneficial?  Expose that information through an API.  Developers will build cool things if the data is valuable.  Who knows what new ideas these applications will in turn bring forth?