As recently as five years ago, many of the identity products on the web were built on proprietary, single-use systems that fit the needs of the site or service implementing them but nothing else. The main purpose of these products was simply to log a person in and associate actions and configuration settings with that user.
Much has changed in the years since. Basic auth, passing a username and password through an HTTP request, became a popular way for a company to extend its identity influence to application integrators beyond its own platform; Twitter was one well-known service that used this system for its authentication. From this basic premise, services that abstracted out a user's authentication credentials, like OpenID, grew in popularity. At the same time, authorization systems (which allow a user to give an application permission to access their details and act on their behalf) began targeting major holes in the identity industry. The specification market for identity and identity mining services began to grow.
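As a concrete illustration of the basic auth premise, the credentials are simply base64-encoded and attached to every HTTP request. A minimal sketch in Python (the function name here is illustrative, not from any particular library):

```python
import base64

def basic_auth_header(username, password):
    # HTTP Basic auth: base64-encode "username:password" and send it
    # in the Authorization header of every request.
    credentials = f"{username}:{password}".encode()
    return "Basic " + base64.b64encode(credentials).decode()

# Example: basic_auth_header("user", "pass") -> "Basic dXNlcjpwYXNz"
```

Because the credentials travel with every request (base64 is encoding, not encryption), this approach only makes sense over TLS, which is part of why the industry moved toward delegated schemes.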
Flash forward to the present: brick-and-mortar stores like Target now use the information they mine from their customers to build purchasing profiles and determine appropriate products for them. The identity industry has changed dramatically.
In this post we will look at many of the current and historical trends in the industry from an implementer's point of view, drawing on years in the industry and thousands of developer integrations with identity systems, and explore where we sit in the world of identity and identity data mining.
The Slow Decline of OpenID and OAuth 1.0a
OpenID was a truly wonderful concept at one time: an authentication system that abstracted out user credentials so that a service didn't need to maintain a heavyweight identity database itself, allowing integrators to leverage the vast identity databases of major OpenID providers like Yahoo, Google, and others with massive numbers of existing users.
At the same time, OAuth 1 (eventually patched into the currently used OAuth 1.0a specification after a session fixation attack vector was discovered) gained popularity as a primary method for applications to provide authorization capabilities, again allowing users to give an application or service the ability to do things on their behalf. This was an incredibly popular premise during the heyday of social applications and platforms.
Speaking as someone who has integrated hundreds of partners and developers from all sorts of development backgrounds, I can easily say that the biggest problem with these services was the complexity of integration and the amount of time an average build-out would take. The services were so complex for regular integrators that they would tend to follow step-by-step guides to build their auth platforms without really understanding how the services worked, let alone being able to repair issues, a problem compounded by providers' tendency toward incredibly poor error messaging.
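To give a sense of where that complexity came from: every OAuth 1.0a request had to carry a signature computed over a normalized "signature base string". A rough sketch of just the signing step, simplified from the specification (real libraries handle many more edge cases, such as duplicate parameters and body encoding):

```python
import base64
import hashlib
import hmac
import urllib.parse

def pct(value):
    # OAuth 1.0a mandates strict RFC 3986 percent-encoding
    return urllib.parse.quote(value, safe="~-._")

def sign_request(method, url, params, consumer_secret, token_secret=""):
    # 1. Percent-encode and sort every request parameter
    normalized = "&".join(
        f"{pct(k)}={pct(v)}" for k, v in sorted(params.items())
    )
    # 2. Build the signature base string: METHOD&URL&PARAMS, each encoded
    base_string = "&".join([method.upper(), pct(url), pct(normalized)])
    # 3. The HMAC-SHA1 key is "consumer_secret&token_secret"
    #    (the token secret may be empty before a token is issued)
    key = f"{pct(consumer_secret)}&{pct(token_secret)}"
    digest = hmac.new(key.encode(), base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()
```

Getting any of the encoding, sorting, or key-construction details wrong produced an opaque "invalid signature" error with no hint as to which step failed, which is exactly the debugging experience integrators struggled with.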
The only saving grace for many OpenID/OAuth providers was the use of SDKs and libraries to abstract away the complexities of the products. This is the method Twitter uses for its current OAuth 1.0a integration. The problem is that it takes either a large amount of engineering resources to create and maintain the SDKs yourself, or a very strong community that creates them for you (which is incredibly rare).
In my personal experience, what I saw happen more often than not was the creation of SDKs in just a few languages, with very lackluster support. Developers would then need support in other languages and invest a significant amount of time building a one-off integration specifically for their company, using assorted libraries and samples that may or may not have been up to date with their needs. Many of these partner and developer integrations would take weeks and required engineers with a firm understanding of the technology.
In the end, there are a number of reasons why we are seeing the decline in use of OpenID and OAuth 1.0a:
- The integrations and specifications were too complex for most people to implement. Sure, the specification was solid from a security standpoint, but what point is there in having a secure specification if no one wants to use it?
- OpenID was treated as a second-class citizen. Most of the major OpenID providers would tend to hide their integrations deep in the documentation. For many people, simply authenticating a user and gaining a small amount of profile information wasn't everything they wanted. This led several companies to build implementations of the OpenID / OAuth hybrid specification, trying to combine authentication and authorization mechanisms into one comprehensive identity and social product. What it was in reality was taking the complexities of two very complicated specifications and making people understand both, with a twist.
- Other specifications came into the market that simplified the approach to authentication and authorization.
In Support of the OAuth 2 Premise
Many of you may have seen posts in the past about some of the issues with the OAuth 2 specification and the working group itself. While I agree with some of the technical criticisms of OAuth 2 (such as its reliance on unsigned bearer tokens), and with concerns about the compromises the specification made to appease certain companies, OAuth 2 gave us something that OAuth 1.0a never did: a simple integration.
I started working with OAuth 2 shortly after Facebook first integrated it into its Open Graph. What I found shocking was that I could start reading the documentation and, within 15 minutes, have an application created and the entirety of the OAuth 2 integration code written from scratch. The first time I tried to integrate OAuth 1.0a, the work took significantly more time, caused more headaches, and prompted more angry outbursts. It was then that I knew a huge part of my job working with developers on their auth integrations had been rendered moot, and I couldn't have been happier!
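That 15-minute integration is plausible because the OAuth 2 authorization code flow boils down to two steps: redirect the user to a consent page, then exchange the returned code for a token. A minimal sketch (the endpoint URLs here are hypothetical placeholders; real values come from the provider's documentation):

```python
import urllib.parse

# Hypothetical provider endpoints, used for illustration only
AUTHORIZE_URL = "https://provider.example.com/oauth2/authorize"
TOKEN_URL = "https://provider.example.com/oauth2/token"

def build_authorize_url(client_id, redirect_uri, scope, state):
    # Step 1: send the user to the provider's consent page
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,  # CSRF protection; verify it on the redirect back
    }
    return AUTHORIZE_URL + "?" + urllib.parse.urlencode(params)

def build_token_request(client_id, client_secret, code, redirect_uri):
    # Step 2: POST these fields to TOKEN_URL to exchange the code
    # for an access token
    return {
        "grant_type": "authorization_code",
        "code": code,
        "client_id": client_id,
        "client_secret": client_secret,
        "redirect_uri": redirect_uri,
    }
```

Compare this with the signing machinery OAuth 1.0a required: there is no signature base string, no nonce bookkeeping, and no HMAC, just a redirect and one HTTPS POST.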
What gave OAuth 2 its real lift was that Facebook began supporting it heavily during the draft days, before the specification was even finalized. Many other companies, like Gowalla, followed this example and saw how the increased ease of adoption meant there was no longer a need for extensive support of the product, freeing up numerous highly technical resources to do other things.
Working at PayPal, I have had a chance to work with OAuth 2 in numerous integrations due to its integration into PayPal Access, which provides a business and merchant auth solution by delivering a mechanism for logging in a user with their PayPal account. The user information obtained is verified and business oriented in nature, as opposed to the social sign-in mechanisms that provide user-curated profile information that isn't valuable for the nature of merchant or business transactions.
What I've noticed in these developer integrations is that I have to do next to no teaching, development, or hands-on integration work beyond providing a link to some simple code on GitHub that I created. The samples are so simple that they can be replicated in any language quickly and easily.
Even though OAuth 2 provided developers with a very simplified approach to integration, a few issues have arisen. During the days of OAuth 1.0a and OpenID there was a clear differentiation between authentication (logging a person in) and authorization (giving a service permission to do things on your behalf). OpenID Connect was supposed to serve as the successor to OpenID's authentication service, but it seemed at times that the working group and companies could not reach agreement or bring the work to a valuable conclusion, which left OAuth 2 alone in the space. In Facebook's early adoption, as well as many other early integrations, OAuth 2 came to be used as a dual auth system handling both roles, which meant that identity was deeply bound to the context of an application.
From the integrations I have been involved in with PayPal Access, many integrators are looking for a more decoupled auth flow, or at least the ability to use the service for true authentication capabilities, such as the simple concept of logging a user out rather than revoking an application's access. Many of the failure points in OAuth 2 integrations came down to simple confusion among integrators about what the capabilities of OAuth 2 should and could be.
Building on an Authentication Base with OpenID Connect
Earlier in 2012, OpenID Connect finally came out of draft mode and started to be integrated by early adopters like PayPal (into PayPal Access). Whatever the project was originally supposed to be, what it became, in its simplest form, was an enhancement on top of OAuth 2 that supplies the much-needed identity endpoints and core authentication features requested in OAuth 2 integrations. The OAuth 2 and OpenID Connect specifications are so closely aligned that switching between them can be done on the fly with little effort.
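The core of what OpenID Connect layers on top of OAuth 2 is the ID token: a signed JWT whose middle segment carries identity claims (subject, issuer, expiry) about the authenticated user. A sketch of reading those claims, for illustration only; a real integration must verify the token's signature against the provider's published keys before trusting anything in it:

```python
import base64
import json

def b64url_decode(segment):
    # JWT segments strip base64 padding; restore it before decoding
    segment += "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment)

def read_id_token_claims(id_token):
    # WARNING: illustration only. Production code must validate the
    # signature, issuer, audience, and expiry before using the claims.
    header_seg, payload_seg, signature_seg = id_token.split(".")
    return json.loads(b64url_decode(payload_seg))
```

The point is that identity arrives as data in the token itself, rather than the integrator having to treat "has an access token" as a proxy for "is logged in", which was exactly the confusion that plagued bare OAuth 2 deployments.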
The OpenID Connect specification has been criticized by some for being overly complicated, due to the fact that it had to be split into six separate specifications, and for perverting its original idea by appeasing too many individual opinions.
What OpenID Connect has granted us at PayPal is a mechanism for providing full authentication and authorization capabilities to allow our users to use the product in both an application and a sign-in sense, but not to have to bind themselves to both in order to get one.
What's Left of Some Other Interesting Identity Data Mining Techniques
During the age of the big players in the identity space, there was also room for many other identity and identity data mining technologies. There are two I worked with quite a bit that still have some measure of usefulness in the current identity landscape: WebFinger and the Open Graph Protocol.
If any of you are old enough to remember the golden age of computing, you may be familiar with the finger protocol. It was a method for gaining insights about a user (typically within a company) by using their e-mail address as the identifier (instead of a username and password). This allowed you to look up any information on a person that they deemed publicly accessible. WebFinger attempted to bring this concept to the web, allowing e-mail addresses to be used as sources of public identity information for users.
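The mechanics are straightforward: the e-mail address itself tells you which domain to ask. A sketch of building a lookup URL in the style later standardized as the `/.well-known/webfinger` endpoint (early implementations discovered the endpoint through host-meta documents instead, so treat this path as an assumption about the modern form):

```python
import urllib.parse

def webfinger_url(email):
    # The e-mail address doubles as the identifier: ask the address's
    # own domain for the user's public identity document.
    user, _, domain = email.partition("@")
    resource = urllib.parse.quote(f"acct:{email}", safe="")
    return f"https://{domain}/.well-known/webfinger?resource={resource}"
```

A GET to the resulting URL returns a JSON document of links and properties the user has made public, no permission grant required, which is exactly what made it interesting to identity data miners.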
Google and Yahoo! were two major companies that implemented the specification, even though their implementations were very well hidden. The greatest boon for identity data miners was that they could look up information about users, including valuable insights such as a portable contacts data object, without the person even having to give express permission to do so.
In the world of identity today, WebFinger is pretty well defunct. The interesting thing though is that Google still has a valid integration that allows you to pull down portable contact information on people via their e-mail address. Information on how to do this is available from one of my previous presentations (slides 35-40).
Open Graph Protocol
The Open Graph Protocol is an open specification heavily influenced by Facebook. Its primary use is to provide relevant meta information about a web page that a service can then mine to build sharable objects. I know what most of you are thinking: what does this have to do with identity? Well, the trend and future of identity (as we saw in the case of Target at the beginning of the article) is all about mining relevant information about a person based on the things they interact with online. This allows us to build entity relationships for a user to help automatically determine their interests.
Metadata services never really found a foothold on the web until the Open Graph Protocol. What made it different was that when embedding Facebook services like the Like and Share buttons, Facebook had you add Open Graph meta tags to your page so that its services could mine the information you deemed relevant for the page.
What's great about this integration is that the metadata is freely mineable, and given the reach Facebook has across websites, there is now a wealth of relevant entity information that can be mined on user interaction. Over time, the Open Graph Protocol has become a very valuable tool for identity entity data mining, and scraping the data is simple and straightforward.
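To show how little scraping is involved: Open Graph data is just `<meta property="og:...">` tags in a page's head, so a few lines with Python's standard HTML parser recover the whole object (a minimal sketch; a production scraper would also fetch the page and handle malformed markup):

```python
from html.parser import HTMLParser

class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..."> tags from a page's markup."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        prop = attrs.get("property", "")
        if prop.startswith("og:") and "content" in attrs:
            self.tags[prop] = attrs["content"]

def scrape_open_graph(html):
    # Returns a dict like {"og:title": ..., "og:type": ...}
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags
```

Feed it the HTML of any page carrying Like or Share buttons and you get back the exact entity description (title, type, image, URL) that the site owner curated for Facebook.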
Browser Integration Trends
One of the newer developments in the realm of identity comes to us from Mozilla, through its Persona project. Building on the earlier BrowserID protocol, Mozilla has developed a whole identity service in Persona that is meant to transform the way identity is thought of.
The concept is really interesting, and the fact that they are trying to take identity to a new level really speaks volumes about what's happening in the identity industry.
This simplification is not without its issues, though. For truly useful user information to be automatically obtainable through Persona, the protocol needs to leverage the identity systems of the e-mail providers that users sign in with. This is the model the world has treated as standard since the adoption of basic auth, OpenID, OAuth, and the like. With that said, I just don't see that happening in the near term.
The Future of Identity
The trends we are seeing in the industry around identity, some of which have been alluded to in this article, have gone from the simple mechanisms of sign-on to the more complex mechanisms of identity data mining through entity associations. Personality and likeness profiles are being built by computing the actions and interactions a person makes on a site or service, seamlessly and without their involvement, all in the name of personalization.
All of our technology advancements have been leading to this point: simplification and abstraction of the technology, paired with a desire on the user side for personalized, easy experiences. What we're seeing now are true advancements in data mining and analysis to drive real relevance for users.
Image Credit: Phillip Martin
Jonathan LeBlanc (@jcleblanc) is an Emmy Award-winning software engineer, developer evangelist at PayPal, and author of the O'Reilly book "Programming Social Applications". He blogs at www.nakedtechnologist.com and you can see his GitHub contributions here.