Wednesday, 22 October 2014

Hadoop Versions Explained

There is a lot of confusion around Hadoop versions. You see versions like 0.20+, 1.0+, and 2.0+. There is the old MR1 and the new MR2 with YARN. The Apache release page lists the Hadoop releases, and on that page you can see releases spanning 0.23+, 1.0+, and 2.0+.

There are a few blog posts that try to explain the evolution of the Hadoop version tree. The most complete ones are here and here. The author has multiple posts that reflect new versions, and we can expect to see more updates on his blog. Other good posts that explain the history of the versions are here from Cloudera's blog and here from Hortonworks' blog.

To summarize the version tree:

  • Version 0.20.205 was renamed to Hadoop 1.0.0. Later 1.0+ releases continue from there. This line provides the old MR1 API.
  • Version 0.23 (not 0.22) serves as the basis for the Hadoop 2.0+ releases. This line provides the new MR2 API and YARN.
  • Version 0.23 also continues to get releases in its own tree. As I understand it, this tree gets only point releases and does not receive any new features.
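
As a quick illustration of the summary above, here is a minimal sketch that maps a version string to one of these release lines. The function name `hadoop_line` and the returned labels are my own, made up for this post; they are not part of Hadoop or any Hadoop tool, and the mapping covers only the versions discussed here.

```python
import re


def hadoop_line(version: str) -> str:
    """Map a Hadoop version string to a release line (illustrative only)."""
    # Take the leading "major.minor[.patch]" part; suffixes such as "-alpha" are ignored.
    match = re.match(r"^(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major = int(match.group(1))
    minor = int(match.group(2))
    patch = int(match.group(3)) if match.group(3) else 0

    # 0.20.205 was renamed to 1.0.0, so both belong to the 1.x line.
    if major == 1 or (major, minor, patch) == (0, 20, 205):
        return "1.x line (MR1)"
    if major == 2:
        return "2.x line (MR2 + YARN)"
    if (major, minor) == (0, 23):
        return "0.23.x line (own tree, point releases)"
    # Anything else is outside the scope of this summary.
    return "not covered here"


if __name__ == "__main__":
    for v in ["0.20.205.0", "1.2.1", "2.5.1", "0.23.11"]:
        print(v, "->", hadoop_line(v))
```

If you are not sure which version a cluster is running, the `hadoop version` command prints it, and you can pass that string to a helper like this one.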
I hope this helps to clear up the confusion.






