How Apache Ranger and Chuck Norris help secure Hadoop

The Hadoop ecosystem has always been a bag of parts -- at least it was before Apache Ranger came to town.

The Hadoop security project called Ranger was supposedly named in tribute to Chuck Norris in his "Walker, Texas Ranger" role. The project has its roots in XA Secure, which was acquired by Hortonworks, then renamed Argus before settling in at the Apache Software Foundation as Ranger.

When Hadoop started, it was a set of loosely coupled parts primarily used in the back end of the big Internet companies like Yahoo. These parts were wrapped into distributions and marketed as Hadoop by the likes of MapR, Cloudera, and Hortonworks.

Such piecemeal architecture isn't unusual in the world of open source or even in the wider world of commercial software. It does, however, create security challenges. Some will read this as "it's insecure," but that isn't necessarily the case -- though it can be. The real problem is twofold: How do you authenticate users to every part of this system of parts? And once you have authenticated them, how do you authorize them to do only what you mean to allow?

Each part of Hadoop has its own LDAP and Kerberos authentication setup, as well as its own means and rules of authorization (in most cases, totally separate implementations of the same thing). This means you get to configure Kerberos or LDAP for each individual part, then define the authorization rules in each separate configuration. Apache Ranger provides a plug-in for each of these parts of Hadoop and a common authentication repository, and it lets you define policies in one centralized location.
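
To make the idea concrete, here is a toy sketch of the pattern in Python: one central policy store, plus a thin plug-in per component that asks the store whether a request is allowed. This is not Ranger's actual API or policy model. The names PolicyStore, Plugin, and is_allowed, the prefix-matching rule, and the sample users and paths are all hypothetical, chosen only to illustrate why defining rules in one place beats repeating them in every component's configuration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Policy:
    service: str        # component the policy applies to, e.g. "hdfs" or "hive"
    resource: str       # resource prefix covered by the policy
    users: frozenset    # users the policy grants access to
    actions: frozenset  # actions those users may perform


@dataclass
class PolicyStore:
    """Hypothetical central store holding the policies for every service."""
    policies: list = field(default_factory=list)

    def add(self, service, resource, users, actions):
        self.policies.append(
            Policy(service, resource, frozenset(users), frozenset(actions))
        )

    def for_service(self, service):
        return [p for p in self.policies if p.service == service]


class Plugin:
    """Hypothetical per-component hook that consults the central store."""

    def __init__(self, service, store):
        self.service = service
        self.store = store

    def is_allowed(self, user, action, resource):
        # Deny by default; allow only if some policy for this service matches.
        return any(
            resource.startswith(p.resource)
            and user in p.users
            and action in p.actions
            for p in self.store.for_service(self.service)
        )


# All rules are defined once, in one place.
store = PolicyStore()
store.add("hdfs", "/data/sales", {"alice"}, {"read", "write"})
store.add("hive", "sales_db.orders", {"alice", "bob"}, {"select"})

# Each component gets its own plug-in, but they share the same store.
hdfs_plugin = Plugin("hdfs", store)
hive_plugin = Plugin("hive", store)

print(hdfs_plugin.is_allowed("alice", "read", "/data/sales/2015.csv"))  # True
print(hdfs_plugin.is_allowed("bob", "read", "/data/sales/2015.csv"))    # False
print(hive_plugin.is_allowed("bob", "select", "sales_db.orders"))       # True
```

The point of the sketch is the shape, not the details: the components keep enforcing access themselves, but none of them owns the rules. Changing who can read the sales data means editing one store rather than several unrelated configuration files.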