Thursday, May 10, 2012

Using Maven to configure a Hadoop project

In this post, I describe the benefits of using Maven to configure a Hadoop project, and how to do it.

First, the benefits: many are described here, but what helped me most was being able to write test code and to build and deploy jars with a single command.

Now the how: the first step is to run the following command, which creates the project directory (along with the package structure and pom.xml):
mvn archetype:generate -DgroupId=com.mycompany.example1 -DartifactId=example1 -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false

Notice that an example1 project directory is created (matching the artifactId), with a package hierarchy under src/main/java/com/mycompany/example1 (matching the groupId).

Now open the pom.xml inside example1 and add the following inside the <dependencies></dependencies> tags.
<dependency>
  <groupId>junit</groupId>
  <artifactId>junit</artifactId>
  <version>3.8.1</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <version>1.0.1</version>
</dependency>
<dependency>
  <groupId>org.jmock</groupId>
  <artifactId>jmock-junit4</artifactId>
  <version>2.5.1</version>
</dependency>
<dependency>
  <groupId>org.jmock</groupId>
  <artifactId>jmock-legacy</artifactId>
  <version>2.5.1</version>
</dependency>

Specifying the above dependencies in pom.xml (the Project Object Model) makes Maven download the required jars from the Maven repository: in this case hadoop-core 1.0.1, junit 3.8.1, and jmock 2.5.1, along with their transitive dependencies.
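
The two jmock entries above have no <scope>, so Maven treats them as compile-scope dependencies. If jMock is only used from test classes (an assumption about your project, not something this post requires), a possible variant marks them as test scope, the same way junit is declared:

```xml
<!-- Assumption: jMock is referenced only from classes under src/test/java -->
<dependency>
  <groupId>org.jmock</groupId>
  <artifactId>jmock-junit4</artifactId>
  <version>2.5.1</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.jmock</groupId>
  <artifactId>jmock-legacy</artifactId>
  <version>2.5.1</version>
  <scope>test</scope>
</dependency>
```

With this variant the jmock jars are available while compiling and running tests but are not treated as part of the project's compile-time classpath.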

Now we are all set. Open the project in your favorite IDE and start coding!

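Check the dependency structure using Maven's dependency tree:
mvn dependency:tree
The output should look like the following. (This capture is from a project with the coordinates com.xoom.example3:example3, so the first line of your output will show your own groupId and artifactId instead.)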
[INFO] [dependency:tree {execution: default-cli}]
[INFO] com.xoom.example3:example3:jar:1.0-SNAPSHOT
[INFO] +- junit:junit:jar:3.8.1:test (scope not updated to compile)
[INFO] +- org.apache.hadoop:hadoop-core:jar:1.0.1:compile
[INFO] |  +- commons-cli:commons-cli:jar:1.2:compile
[INFO] |  +- xmlenc:xmlenc:jar:0.52:compile
[INFO] |  +- commons-httpclient:commons-httpclient:jar:3.0.1:compile
[INFO] |  |  \- commons-logging:commons-logging:jar:1.0.3:compile
[INFO] |  +- commons-codec:commons-codec:jar:1.4:compile
[INFO] |  +- org.apache.commons:commons-math:jar:2.1:compile
[INFO] |  +- commons-configuration:commons-configuration:jar:1.6:compile
[INFO] |  |  +- commons-collections:commons-collections:jar:3.2.1:compile
[INFO] |  |  +- commons-lang:commons-lang:jar:2.4:compile
[INFO] |  |  +- commons-digester:commons-digester:jar:1.8:compile
[INFO] |  |  |  \- commons-beanutils:commons-beanutils:jar:1.7.0:compile
[INFO] |  |  \- commons-beanutils:commons-beanutils-core:jar:1.8.0:compile
[INFO] |  +- commons-net:commons-net:jar:1.4.1:compile
[INFO] |  +- org.mortbay.jetty:jetty:jar:6.1.26:compile
[INFO] |  |  \- org.mortbay.jetty:servlet-api:jar:2.5-20081211:compile
[INFO] |  +- org.mortbay.jetty:jetty-util:jar:6.1.26:compile
[INFO] |  +- tomcat:jasper-runtime:jar:5.5.12:compile
[INFO] |  +- tomcat:jasper-compiler:jar:5.5.12:compile
[INFO] |  +- org.mortbay.jetty:jsp-api-2.1:jar:6.1.14:compile
[INFO] |  |  \- org.mortbay.jetty:servlet-api-2.5:jar:6.1.14:compile
[INFO] |  +- org.mortbay.jetty:jsp-2.1:jar:6.1.14:compile
[INFO] |  |  \- ant:ant:jar:1.6.5:compile
[INFO] |  +- commons-el:commons-el:jar:1.0:compile
[INFO] |  +- net.java.dev.jets3t:jets3t:jar:0.7.1:compile
[INFO] |  +- net.sf.kosmosfs:kfs:jar:0.3:compile
[INFO] |  +- hsqldb:hsqldb:jar:1.8.0.10:compile
[INFO] |  +- oro:oro:jar:2.0.8:compile
[INFO] |  +- org.eclipse.jdt:core:jar:3.1.1:compile
[INFO] |  \- org.codehaus.jackson:jackson-mapper-asl:jar:1.0.1:compile
[INFO] |     \- org.codehaus.jackson:jackson-core-asl:jar:1.0.1:compile
[INFO] +- org.jmock:jmock-junit4:jar:2.5.1:compile
[INFO] |  +- org.jmock:jmock:jar:2.5.1:compile
[INFO] |  |  +- org.hamcrest:hamcrest-core:jar:1.1:compile
[INFO] |  |  \- org.hamcrest:hamcrest-library:jar:1.1:compile
[INFO] |  \- junit:junit-dep:jar:4.4:compile
[INFO] \- org.jmock:jmock-legacy:jar:2.5.1:compile
[INFO]    +- org.objenesis:objenesis:jar:1.0:compile
[INFO]    \- cglib:cglib-nodep:jar:2.1_3:compile
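
The tree is also useful for spotting transitive jars you may not want. As a rough sketch only, this is how one of the jars listed under hadoop-core could be excluded in pom.xml. The choice of hsqldb is purely illustrative; whether your job can run without it is an assumption you would need to verify yourself.

```xml
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <version>1.0.1</version>
  <exclusions>
    <!-- Illustrative only: drops the transitive hsqldb jar shown in the tree -->
    <exclusion>
      <groupId>hsqldb</groupId>
      <artifactId>hsqldb</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Running mvn dependency:tree again after such a change should show the excluded artifact gone from under hadoop-core.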
