In this post, the sample Apache Pig script handles nulls in the commission column and displays all Employee records, showing 0.0 for the commission wherever it is null.
Using Apache Pig version r0.15.0.
@ Test data structure:
For the file structures, please refer to the APACHE PIG ~ ALL SAMPLE TABLES and STRUCTURES post; see the reference section at the bottom of this post for more.
@ Sample data:
Employees data table:
@ Apache Pig Script:
a) List a 0.0 value for commission where it is null: *** Note: in the Emp table above, the commission column is null wherever no value is available ***
grunt>
data = LOAD 'Documents/store_transactions.txt' USING PigStorage(',') AS (empno:int, ename:chararray, job:chararray, mgr:int, hiredate:chararray, sal:float, comm:float, deptno:int);
/* Replace nulls in the commission column with 0.0; the float literal keeps both branches of the conditional the same type */
all_recs = FOREACH data GENERATE empno, ename, job, mgr, hiredate, sal, (comm IS NOT NULL ? comm : 0.0F) AS comm, deptno;
dump all_recs;
@ Apache Pig Output on Grunt Shell: In the output, nulls in the commission column are replaced with the value 0.0
(7369,SMITH,CLERK,7902,1980-12-17,800.0,0.0,20)
(7499,ALLEN,SALESMAN,7698,1981-02-20,1600.0,300.0,30)
(7521,WARD,SALESMAN,7698,1981-02-22,1250.0,500.0,30)
(7566,JONES,MANAGER,7839,1981-04-02,2975.0,0.0,20)
(7654,MARTIN,SALESMAN,7698,1981-09-28,1250.0,1400.0,30)
(7698,BLAKE,MANAGER,7839,1981-05-01,2850.0,0.0,30)
(7782,CLARK,MANAGER,7839,1981-06-09,2450.0,0.0,10)
(7788,SCOTT,ANALYST,7566,1982-12-09,3000.0,0.0,20)
(7839,KING,PRESIDENT,,1981-11-17,5000.0,0.0,10)
(7844,TURNER,SALESMAN,7698,1981-09-08,1500.0,0.0,30)
(7876,ADAMS,CLERK,7788,1983-01-12,1100.0,0.0,20)
(7900,JAMES,CLERK,7698,1981-12-03,950.0,0.0,30)
(7902,FORD,ANALYST,7566,1981-12-03,3000.0,0.0,20)
(7934,MILLER,CLERK,7782,1982-01-23,1300.0,0.0,10)
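Once the nulls are replaced, the output alone cannot tell a commission that was really 0.0 from one that was substituted. As a minimal sketch, assuming the relation data from the script above is still defined in the same Grunt session, the two cases can be separated before the replacement. The relation names null_comm and has_comm are made up for illustration and are not part of the original script.

```pig
-- Sketch only: assumes `data` was loaded as in the script above.
-- `null_comm` and `has_comm` are hypothetical names used for illustration.
null_comm = FILTER data BY comm IS NULL;      -- rows where the commission is missing
has_comm  = FILTER data BY comm IS NOT NULL;  -- rows with a commission value, including a genuine 0.0
DUMP null_comm;
```

Running DUMP has_comm; in the same way lists the remaining rows; together the two relations cover every record in data.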
@ Apache Pig Reference/s:
- https://pig.apache.org
- http://pig.apache.org/docs/r0.15.0/
Thank you!