In this post, the sample Apache Pig script lists employees who joined before or after 1981, that is, in any year other than 1981.
This post uses Apache Pig version r0.15.0.
@ Test data structure:
For the file structures, please refer to the post APACHE PIG ~ ALL SAMPLE TABLES and STRUCTURES. For more, see the reference section at the bottom of this post.
@ Sample data:
Employees data table:
@ Apache Pig Script:
a) List employees who joined before or after 1981:
grunt>
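-- Load the comma-separated employee file with an explicit schema
data = LOAD 'Documents/tbl_EMP.txt' USING PigStorage(',') as (empno:int, ename:chararray, job:chararray, mgr:int, hiredate:chararray, sal:float, comm:float, deptno:int);
-- Parse hiredate and add the joining year as an extra column named yrs
all_recs = foreach data generate empno, ename, job, mgr, hiredate, (int)GetYear(ToDate(hiredate, 'yyyy-M-dd')) as yrs, sal, comm, deptno;
-- Keep only the rows whose joining year is earlier or later than 1981
rec_fltr = filter all_recs by (yrs < 1981 OR yrs > 1981);
-- Sort the remaining rows by joining year and print them
rec_ordr = order rec_fltr by yrs;
dump rec_ordr;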
@ Apache Pig Output on Grunt Shell:
(7369,SMITH,CLERK,7902,1980-12-17,1980,800.0,,20)
(7934,MILLER,CLERK,7782,1982-01-23,1982,1300.0,,10)
(7788,SCOTT,ANALYST,7566,1982-12-09,1982,3000.0,,20)
(7876,ADAMS,CLERK,7788,1983-01-12,1983,1100.0,,20)
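The filter above returns both groups in a single relation. If the two groups are wanted separately, one possible variant is Pig's SPLIT operator, sketched below. It assumes the same file path and schema as the script above. The relation names with_year, joined_before and joined_after are illustrative and not part of the original script.

```pig
-- Sketch only: same input file and schema as the script above (assumed).
data = LOAD 'Documents/tbl_EMP.txt' USING PigStorage(',') AS (empno:int, ename:chararray, job:chararray, mgr:int, hiredate:chararray, sal:float, comm:float, deptno:int);

-- Derive the joining year, keeping only a few columns for readability.
with_year = FOREACH data GENERATE empno, ename, hiredate, GetYear(ToDate(hiredate, 'yyyy-M-dd')) AS yrs;

-- Route each row to the relation whose condition it satisfies.
-- Rows with yrs equal to 1981 match neither condition and are dropped.
SPLIT with_year INTO joined_before IF yrs < 1981, joined_after IF yrs > 1981;

DUMP joined_before;
DUMP joined_after;
```

Going by the output shown above, joined_before should hold the single 1980 row (SMITH) and joined_after the three rows from 1982 and 1983, each reduced to the four projected columns.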
----------------------------------------------------------------------------------------------------------------------------------------------------------
@ Apache Pig References:
- https://pig.apache.org
- http://pig.apache.org/docs/r0.15.0/