Description:
Work in teams to perform data ingestion from disparate sources, develop complex event processing using the Hadoop framework, review data quality and data definitions, and perform data cleansing and data management tasks
Role Summary Description:
· Support client engagements focused on Data and Advanced Business Analytics across diverse domains, including solution development
· Develop Spark applications and MapReduce jobs
· Develop streaming/real-time complex event processing on the Hadoop framework
· Interface with different databases (SQL, NoSQL)
· Manage data quality by reviewing data for errors introduced during data input, data transfer, or by storage limitations
· Perform data management to ensure data definitions, lineage, and sources are suitable for analysis
· Work in a multidisciplinary team to understand available data sources, needs, and downstream uses
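As an illustration of the data-quality responsibility described above, a minimal sketch in plain Python (all field names and validation rules here are hypothetical, not part of the role definition) might flag malformed records before they enter a pipeline:

```python
# Hypothetical row-level validation sketch: detect records with
# missing or malformed fields before ingestion.

def validate_row(row, required_fields=("id", "timestamp", "amount")):
    """Return a list of data-quality issues found in one record."""
    issues = []
    for field in required_fields:
        if not row.get(field):
            issues.append(f"missing {field}")
    # Example type check: 'amount' should parse as a number.
    amount = row.get("amount")
    if amount:
        try:
            float(amount)
        except (TypeError, ValueError):
            issues.append("amount is not numeric")
    return issues

clean, rejected = [], []
for row in [{"id": "1", "timestamp": "2024-01-01", "amount": "9.5"},
            {"id": "2", "timestamp": "", "amount": "abc"}]:
    problems = validate_row(row)
    (clean if not problems else rejected).append((row, problems))
```

In practice this kind of check would run inside a Spark job or an AWS Glue script, but the logic is the same: classify each record as clean or rejected and record why it was rejected.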
Functional/Technical Skills:
· Bachelor's or higher degree in computer science, information systems, or business
· 1-4 years of software engineering and D&A (Data & Analytics) solution development experience
· Experience with AWS EMR, Redshift, AWS Data Pipeline, AWS Glue, and PySpark, plus Pig/Hive on any Hadoop distribution (HDP/CDH/MapR)
· Proficient in SQL, in addition to one or more modern programming languages such as Java, Scala, or Python
· Experience working with databases (SQL, NoSQL)
· Experience working with data validation, cleaning, and munging
· Experience with at least one cloud provider: AWS, Azure, or GCP
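As a small illustration of the SQL-plus-Python combination listed above, the following self-contained sketch uses Python's built-in sqlite3 module (the table and column names are invented for the example) to run a simple aggregation:

```python
import sqlite3

# In-memory database with a toy 'events' table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, event_type TEXT, value REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [("u1", "click", 1.0), ("u1", "purchase", 20.0), ("u2", "click", 1.0)],
)

# SQL aggregation: total value per user, highest first.
rows = conn.execute(
    "SELECT user_id, SUM(value) AS total "
    "FROM events GROUP BY user_id ORDER BY total DESC"
).fetchall()
# rows -> [('u1', 21.0), ('u2', 1.0)]
conn.close()
```

The same GROUP BY pattern carries over directly to Redshift, Hive, or Spark SQL; only the connection layer changes.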
Qualifications
KPMG is looking for B.E. / B.Tech profile candidates.