Feb 15, 2016

Apache Pig Exercises: 22. List total sales per department store and per customer



In this post, the sample Apache Pig scripts compute total sales grouped by department store and grouped by customer.

The examples and exercise scripts were created with Apache Pig r0.14.0, the current version at the time of writing.

@ Test data structure and sample data:

person, dstore, spent
A, S, 3.3
A, S, 4.7
B, S, 1.2
B, T, 3.4
C, Z, 1.1
C, T, 5.5
D, R, 1.1

@ Apache Pig Script:

a) List total sales per department store:

grunt> 
data = LOAD 'Documents/store_transactions.txt' using PigStorage(',') as (person:chararray, dstores:chararray, spent:float);

grp = FOREACH (GROUP data BY dstores) {
    /* inside the group, the original columns must be dereferenced through the bag alias, e.g. data.spent */
    GENERATE group, COUNT(data.person) AS visitors, (FLOAT)SUM(data.spent) AS revenue;
};

dump grp;

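@ Apache Pig Output on Grunt Shell: Each tuple holds the department store, the number of visits (one per input row, so a repeat customer is counted each time), and the total sales. The leading space in each store name comes from the space after each comma in the input file.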

( R,1,1.1)
( S,3,9.2)
( T,2,8.9)
( Z,1,1.1)
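
The visitors column above counts rows, not people: store S has three transactions but only two distinct customers (A and B). If a count of unique customers is wanted, one possible approach is a nested DISTINCT, sketched below. The relation names clean and stats and the output field names are illustrative assumptions, not part of the original exercise. TRIM is used to drop the space that PigStorage keeps after each comma.

```pig
-- Assumption: same input file and schema as the script above.
data = LOAD 'Documents/store_transactions.txt' USING PigStorage(',')
       AS (person:chararray, dstores:chararray, spent:float);

-- PigStorage does not strip whitespace, so trim the text fields first.
clean = FOREACH data GENERATE TRIM(person) AS person, TRIM(dstores) AS dstores, spent;

stats = FOREACH (GROUP clean BY dstores) {
    -- unique people seen at this store
    customers = DISTINCT clean.person;
    GENERATE group AS dstore,
             COUNT(customers) AS unique_customers,
             COUNT(clean) AS transactions,
             (float)SUM(clean.spent) AS revenue;
};

DUMP stats;
```

With the sample data, unique_customers and transactions should differ only for store S, where customer A appears twice.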


@ Apache Pig Script:

b) List total sales per customer:

data = LOAD 'Documents/store_transactions.txt' using PigStorage(',') as (person:chararray, dstores:chararray, spent:float);

grp = FOREACH (GROUP data BY person) {
    -- inside the group, the original columns must be dereferenced through the bag alias, e.g. data.spent
    GENERATE group, COUNT(data.person) AS transactions, (FLOAT)SUM(data.spent) AS revenue;
};

dump grp;

@ Apache Pig Output on Grunt Shell: Each tuple holds the customer id, the number of transactions for that customer, and the total sales.

(A,2,8.0)
(B,2,4.6000004)
(C,2,6.6)
(D,1,1.1)
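
The 4.6000004 total for customer B is a single-precision rounding artifact: 1.2 and 3.4 have no exact 32-bit float representation, and the sum is cast back to float. A minimal sketch of one way to reduce the visible error is to load spent as double and drop the cast. The relation name totals is an illustrative assumption. Doubles are still binary floating point, so this narrows the error rather than removing it.

```pig
-- Assumption: same input file as above; only the type of spent changes.
data = LOAD 'Documents/store_transactions.txt' USING PigStorage(',')
       AS (person:chararray, dstores:chararray, spent:double);

-- SUM over a double column returns a double, so no cast is needed.
totals = FOREACH (GROUP data BY person)
         GENERATE group AS person,
                  COUNT(data) AS transactions,
                  SUM(data.spent) AS revenue;

DUMP totals;
```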

_________________
Thank you!
