Friday, August 16, 2019

Apache Pig Run Modes

Apache Pig executes in two modes: Local Mode and MapReduce Mode.

Local Mode

  • It runs in a single JVM and is used for development, experimentation, and prototyping.
  • Here, everything is installed and run on a single machine (localhost).
  • Local mode works on the local file system. The input and output data are stored in the local file system.
The command to start the Grunt shell in local mode:
  $ pig -x local

MapReduce Mode

  • The MapReduce mode is also known as Hadoop Mode.
  • It is the default mode.
  • In this mode, Pig translates Pig Latin into MapReduce jobs and executes them on the cluster.
  • It can run against a pseudo-distributed or fully distributed Hadoop installation.
  • Here, the input and output data are present on HDFS.
The command for MapReduce mode:
  $ pig
Or,
  $ pig -x mapreduce
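
The two invocations above differ only in the value passed to -x. As a minimal sketch, the shell function below maps a mode name to those arguments and prints the resulting command lines without running them. The function name pig_mode_args is hypothetical and is not part of Pig; only the -x local and -x mapreduce flags come from the commands shown above.

```shell
# Hypothetical helper (not part of Pig): map a mode name to `pig` arguments.
# With no argument it falls back to mapreduce, mirroring Pig's default mode.
pig_mode_args() {
  case "${1:-mapreduce}" in
    local)     echo "-x local" ;;
    mapreduce) echo "-x mapreduce" ;;
    *)         echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Print the full command line for each mode instead of executing it.
for mode in local mapreduce; do
  echo "pig $(pig_mode_args "$mode")"
done
```

Printing the command first is a cheap way to check which mode a wrapper script would use before it touches a cluster.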

Ways to Execute a Pig Program

A Pig program can be executed in the following ways, in both local and MapReduce mode:
  • Interactive Mode - In this mode, Pig runs in the Grunt shell. To start the Grunt shell, run the pig command. Once the shell is running, we can enter Pig Latin statements and commands interactively at the prompt.
  • Batch Mode - In this mode, we run a script file, conventionally with a .pig extension. These files contain Pig Latin statements.
  • Embedded Mode - In this mode, Pig Latin statements are embedded in a program written in a host language such as Java or Python. We can also define our own functions, called UDFs (User Defined Functions), in these languages and call them from Pig Latin.
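
Batch mode can be sketched as follows, assuming a word-count task that the original post does not mention. The file names input.txt, wordcount.pig, and wordcount_out and the sample data are illustrative assumptions. The script is written to a temporary directory and is run in local mode only if a pig launcher is found on the PATH.

```shell
# Work in a throwaway directory so the relative paths in the script resolve there.
workdir=$(mktemp -d)

# Sample input (illustrative data, not from the original post).
printf 'hello pig\nhello hadoop\n' > "$workdir/input.txt"

# Batch-mode script: a plain text file of Pig Latin statements.
cat > "$workdir/wordcount.pig" <<'EOF'
-- Count how often each word occurs in input.txt.
lines   = LOAD 'input.txt' AS (line:chararray);
words   = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
grouped = GROUP words BY word;
counts  = FOREACH grouped GENERATE group, COUNT(words);
STORE counts INTO 'wordcount_out';
EOF

# Run the script in local mode only if the pig launcher is available.
if command -v pig >/dev/null 2>&1; then
  (cd "$workdir" && pig -x local wordcount.pig)
else
  echo "pig not found; script written to $workdir/wordcount.pig"
fi
```

The same statements could be typed one at a time at the Grunt prompt in interactive mode; the script file simply lets them run unattended.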
