Hadoop Developer - Comcast
New Jersey - Email me on Indeed: indeed.com/r/Pravarshi-R/8269aeeecb18f151
Over 5 years of programming experience in analysis, design, development, testing, and deployment of large-scale distributed data processing using Hadoop, Pig, Hive, and Java, as well as various other software applications, with an emphasis on object-oriented programming.
About 5 years of work experience in Big Data analytics.
Strong hands-on experience with Big Data technologies including Hadoop (HDFS and MapReduce), Pig, Hive, HBase, ZooKeeper, and Sqoop.
Developed MapReduce programs to perform data transformation.
Hands-on experience writing MapReduce jobs in the Hadoop ecosystem, including Hive and Pig, for file formats such as JSON and XML.
Hands-on experience developing batch file processing in Hadoop.
Hands-on experience installing, configuring, and using ecosystem components such as Hadoop MapReduce, HDFS, Pig, Hive, and Sqoop.
Experience in high-performance computing.
Experience with distributed systems, large-scale non-relational data stores, MapReduce systems, data modeling, and big data systems.
Experience installing, configuring, and administering Hadoop clusters for major Hadoop distributions.
Experience working with Hadoop in standalone, pseudo-distributed, and fully distributed modes.
Hands-on experience in product development with Hadoop applications.
Experience in working with NoSQL databases like HBase.
Experience importing and exporting data between databases such as MySQL and Oracle and HDFS and Hive using Sqoop.
Hands-on experience writing Pig Latin scripts, working with the Grunt shell, and scheduling jobs with Oozie.
In-depth understanding of data structures and algorithms.
Strong written, oral, interpersonal, and presentation skills.
Implemented unit tests using JUnit and MRUnit during projects.
Strong desire and ability to perform at a high level in a fast-paced, flexible environment.
Excellent analytical and problem-solving skills, with the ability to interact with individuals at all levels and to work both as part of a team and independently.
A quick learner, organized and highly motivated, with a keen interest in emerging technologies.
Willing to relocate: Anywhere
Comcast - Philadelphia, PA - January 2015 to Present
Comcast is implementing a Data Quality project whose purpose is data cleansing. The raw data is split into chunks, keyed, and validated for data quality. Once the data is validated, it is used for analysis. The goal of this project was to analyze the overall ratings of one of the company's products.
Developed various data cleansing features such as schema validation, row count, and data profiling using MapReduce jobs.
Created Hive tables to store logs whenever a MapReduce job is executed.
Created a Hive aggregator to update the Hive table after running the data profiling job.
Extracted data from Teradata to HDFS using Sqoop.
Analyzed the data by running Hive queries.
Implemented partitioning, dynamic partitioning, and bucketing in Hive.
Developed Hive queries to process the data and generate data cubes for visualization.
Environment: Hadoop YARN architecture, MapReduce, HDFS, Hive, Pig, Java (JDK 1.7), SQL, Cloudera Manager, Sqoop, Oozie, Eclipse.
Comcast - Philadelphia, PA - January 2014 to October 2014
The purpose of the project is to create an Enterprise Data Hub so that various business units can use the data from Hadoop for data analytics. The solution is based on Cloudera Hadoop. The data will be stored in the Hadoop file system and processed using MapReduce jobs. Tasked with creating a solution to analyze volumes of stocks traded for a potential