The Hortonworks Blog

More from Julien Ledem

This was originally published on my blog; I’m re-posting it here on request from the fine people at Hortonworks.

1. Introduction

This a follow-up on my previous post about implementing PageRank in Pig using embedding. I also talked about this in a presentation to the Pig user group.

One of the best features of embedding is how it simplifies writing UDFs and using them right away in the same script without superfluous declarations.…

In this post I’m going to give a very simple example of how to use Pig; embedded in Python to implement the PageRank; algorithm. It goes in a little more details on the same example given in the presentation I gave at the Pig user meetup. On the same topic, Daniel published a nice K-Means implementation using the same embedding feature. This was originally published on my blog; I’m re-posting it here on request from the fine people at Hortonworks.…