In this post, a sample Apache Pig script computes total sales grouped either by department store or by customer.
The examples and exercise scripts were written for Apache Pig r0.14.0, the current version at the time of writing.
@ Test data structure and sample data (the first row names the columns and is not part of the input file):
person, dstore, spent
A, S, 3.3
A, S, 4.7
B, S, 1.2
B, T, 3.4
C, Z, 1.1
C, T, 5.5
D, R, 1.1
@ Apache Pig Script:
a) List total sales per department store:
grunt>
data = LOAD 'Documents/store_transactions.txt' using PigStorage(',') as (person:chararray, dstores:chararray, spent:float);
grp = FOREACH (GROUP data BY dstores) {
/* columns of the original relation are reached by dereferencing the grouped bag: data.person, data.spent */
GENERATE group, COUNT(data.person) AS visitors, (FLOAT)SUM(data.spent) AS revenue;
};
dump grp;
@ Apache Pig Output on Grunt Shell: each tuple holds the department store, the number of purchases (visits, not distinct customers), and the total sales.
( R,1,1.1)
( S,3,9.2)
( T,2,8.9)
( Z,1,1.1)
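The leading space in each store name above, as in ( R,1,1.1), comes from the sample data, which has a space after every comma: PigStorage(',') splits on the comma only and keeps the space as part of the chararray field. One possible way to strip it is sketched below with Pig's built-in TRIM function. The aliases clean and by_store are illustrative and not part of the original script.

```pig
data = LOAD 'Documents/store_transactions.txt' USING PigStorage(',') AS (person:chararray, dstores:chararray, spent:float);
-- TRIM removes leading and trailing whitespace from the chararray fields
clean = FOREACH data GENERATE TRIM(person) AS person, TRIM(dstores) AS dstores, spent;
-- same aggregation as script (a), run on the trimmed relation
by_store = FOREACH (GROUP clean BY dstores) GENERATE group AS dstore, COUNT(clean) AS purchases, (FLOAT)SUM(clean.spent) AS revenue;
DUMP by_store;
```

With the whitespace removed, the group keys should print as R, S, T, and Z without the leading space.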
@ Apache Pig Script:
b) List total sales per customer:
data = LOAD 'Documents/store_transactions.txt' using PigStorage(',') as (person:chararray, dstores:chararray, spent:float);
grp = FOREACH (GROUP data BY person) {
GENERATE group, COUNT(data.person) AS visits, (FLOAT)SUM(data.spent) AS revenue; -- columns of the original relation are dereferenced through the bag: data.person, data.spent
};
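dump grp;
@ Apache Pig Output on Grunt Shell: each tuple holds the customer, the number of purchases, and the total amount spent.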
(A,2,8.0)
(B,2,4.6000004)
(C,2,6.6)
(D,1,1.1)
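The nested FOREACH (GROUP ...) form used in both scripts can also be written as separate statements, which may be easier to read and extend. Below is a minimal sketch, assuming data has already been loaded as in the scripts above. The aliases by_person, totals, and ranked are illustrative, and the ORDER step is an addition that is not part of the original script.

```pig
-- step 1: one tuple per customer, holding a bag of that customer's rows
by_person = GROUP data BY person;
-- step 2: aggregate each bag; COUNT(data) counts the rows in the bag
totals = FOREACH by_person GENERATE group AS person, COUNT(data) AS purchases, SUM(data.spent) AS spent_total;
-- step 3 (optional): highest spenders first
ranked = ORDER totals BY spent_total DESC;
DUMP ranked;
```

Without the (FLOAT) cast, SUM over a float column yields a double, so the printed totals may show more decimal digits than the output above. The value 4.6000004 for customer B in that output is a single-precision rounding artifact of the float type, not a data error.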
_________________
Thank you!