Sr. Data Engineer

16 04 2024
Pune, India
IS&Digital
Regular

- - - - - - - - - - - -

A Data Engineer is a data professional with extensive knowledge of and expertise in “big data” technologies and frameworks, who works with Python, Scala, or Java. Data Engineers are software engineers who design and build data systems, integrate data from various sources, and manage big data. They are experienced in writing Spark code with Python or Scala, writing SQL queries, Linux/Unix scripting, and deploying and maintaining databases.

  • She/He should be a good problem solver and proficient in at least one of the following languages: Java, Scala, or Python.
  • She/He should have hands-on experience with at least one relational database system. Experience with NoSQL, Hive, etc. is an added advantage.
  • She/He also contributes to defining the data policy and structuring the data life cycle within the applicable regulatory framework, in collaboration with the Chief Data Officer.
  • Her/His scope of intervention centers on application systems in the data management and processing domain, and on platforms such as Big Data, IoT, etc.
  • She/He is responsible for overseeing and integrating data of various types originating from these different sources, and for confirming the quality of the data entering the Data Lake (receiving data, deleting duplicates, etc.).
    • Captures the structured and unstructured data produced within different applications or outside the entity
    • Integrates the components
    • Structures the data (semantics, etc.)
    • Maps the available components
    • Cleans up the data (deleting duplicates, etc.)
    • Validates the data
    • Where appropriate, creates the data repository
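
As an illustration of the "cleans up the data" and "validates the data" steps listed above, here is a minimal sketch in plain Python. The record layout, the REQUIRED_FIELDS tuple, and the function names are assumptions made for this example only; they are not part of the role description, and a production pipeline for this role would more likely use Spark or Pandas.

```python
# Hypothetical schema: every record must carry a non-empty "id" and "name".
REQUIRED_FIELDS = ("id", "name")


def deduplicate(records):
    """Drop exact duplicate records, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for rec in records:
        # A sorted tuple of items gives a hashable, order-independent key.
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def validate(records):
    """Split records into (valid, rejected) based on REQUIRED_FIELDS."""
    valid, rejected = [], []
    for rec in records:
        if all(rec.get(field) not in (None, "") for field in REQUIRED_FIELDS):
            valid.append(rec)
        else:
            rejected.append(rec)
    return valid, rejected


# Illustrative input only: one exact duplicate and one record with an empty name.
raw = [
    {"id": 1, "name": "sensor-a"},
    {"id": 1, "name": "sensor-a"},
    {"id": 2, "name": ""},
]

clean, rejected = validate(deduplicate(raw))
```

Keeping rejected records in a separate list, rather than discarding them, makes it possible to report data quality issues back to the source system.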

SKILLS PREFERRED :-

Technical :-

  • Technical expertise related to data.
  • Expert in writing Spark code with Python.
  • Experience in using Python modules such as Pandas, NumPy, and scikit-learn.
  • Expert in writing SQL queries.
  • Specialist in working with data from database management systems.
  • Technical expertise in building ETL data pipelines and extracting data from different sources such as Azure storage, HDFS, Kafka topics, structured and unstructured files, and Hive.
  • A Data Engineer understands how to apply technologies to solve big data problems and to develop innovative big data solutions. To do this, the Data Engineer should have extensive knowledge of different programming or scripting languages [Python].
  • Good understanding of streaming (Kafka/Storm/Kinesis).
  • Good understanding of Big Data components (HDFS, YARN, MapReduce, Spark, Oozie).
  • Good understanding of Azure components (ADF, ADB, ADLS).
  • Good understanding of version control (Git, GitHub, Azure DevOps).
  • Competency in a cloud environment is a must [Azure preferred].
  • Should have experience in transforming raw/unstructured data into clean data [Data Quality].

Behavioural :-

  • Planning & Organizing
  • Teamwork & Collaboration
  • Customer Focus
  • Continuous learning
  • Ownership attitude
  • Problem-solving expertise