I was introduced to Pentaho Kettle very recently and was immediately excited to get my hands dirty. I have been playing around with the Hadoop ecosystem for quite a while now, and Pentaho for Big Data drew my attention.
It took me quite a while to get the hang of the concepts (well, not that long: it has been 2 days!).
I already had a Hadoop 1.1.1 cluster running, and my job now was to integrate Kettle with it. The Spoon UI that is bundled with Kettle helps to design jobs and transformations. I had a really hard time getting the Spoon UI to open on my AWS EC2 Ubuntu instance. I do not want to go into those issues here, as they are still not solved :-(. It may be worth a separate post once I have the solution.
In short, I suspect that it could be a video graphics driver issue.
Spoon would have taken care of my needs end to end, from designing the jobs and transformations to running them. Unfortunately, the Ubuntu issue forced me to use Spoon on Windows and then use the generated .kjb and .ktr files on Ubuntu. Luckily, Kettle comes with very useful scripts to run jobs and transformations (Kitchen and Pan respectively). Cool, at least I could integrate Kettle with HDFS and Hive successfully.
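To make the Kitchen/Pan split concrete, here is a rough sketch of a small Python wrapper that picks the right script from the file extension: Kitchen for jobs (.kjb) and Pan for transformations (.ktr). The install path `/opt/data-integration` and the helper names `build_kettle_command` and `run_kettle_file` are my own assumptions for illustration, not anything that ships with Kettle. The `-file=` and `-level=` options are the usual way of pointing the scripts at a file and setting the log level, but do check them against your Kettle version.

```python
import os
import subprocess

# Assumption: Kettle (PDI) is unpacked here; change this to match your install.
KETTLE_HOME = "/opt/data-integration"

# Kitchen runs jobs (.kjb); Pan runs transformations (.ktr).
RUNNERS = {".kjb": "kitchen.sh", ".ktr": "pan.sh"}


def build_kettle_command(path, level="Basic", kettle_home=KETTLE_HOME):
    """Return the argument list for running a .kjb or .ktr file (hypothetical helper)."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in RUNNERS:
        raise ValueError("expected a .kjb or .ktr file, got: " + path)
    script = os.path.join(kettle_home, RUNNERS[ext])
    return [script, "-file=" + path, "-level=" + level]


def run_kettle_file(path, level="Basic", kettle_home=KETTLE_HOME):
    """Run the file with Kitchen or Pan and return the script's exit code."""
    return subprocess.call(build_kettle_command(path, level, kettle_home))
```

With this, something like `run_kettle_file("/home/ubuntu/load_hive.kjb")` would call Kitchen, while the same call with a .ktr file would go through Pan. The file name here is made up; use whatever Spoon generated for you.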
Using Spoon from Windows has its caveats. Certain design steps try to connect to the running instances of Hadoop, HBase etc., which in my case is not possible as my Windows PC resides in a private network.
After 2-3 days of struggle, I am at least happy that a few things worked.