Job Description
Job Responsibilities
- Work closely with other data scientists and engineers to put models into production
- Define and promote data engineering best practices within the team
- Educate the team and the rest of the company about data science
- Build and maintain batch and near-real-time ETL pipelines, as well as analytical and machine learning workflows
- Create large-scale solutions that are optimized for performance
- Communicate with other teams about data availability and coding standards
- Proactively suggest new technologies and architectures that can improve the performance of our data platform
- Build, test, and maintain database pipeline architectures
- Work with management to understand the company's goals
Job Requirements
- Bachelor’s/Master’s degree in Engineering or Computer Science (or equivalent experience)
- 3+ years of relevant experience as a data engineer
- Demonstrable experience working with big data, including BigQuery
- Proficiency in Python, SQL, Google Cloud, and AWS
- Expertise in ETL and ELT
- Experience with Hive, Hadoop, and Spark is a plus
- Prior experience building data lakes and data warehouses is desirable
- Knowledge of pandas, PySpark, NoSQL, and PostgreSQL
- Experience working with large data sets and optimizing database performance
- A proven track record of designing data ingestion and ETL processes and working with the related technologies
- Must be a team player with excellent collaboration skills
- Excellent written and verbal communication skills in English