We are looking for an experienced Data Engineer, preferably with strong Python skills, who will be responsible for managing ETL/ELT and performant data storage and retrieval. Our projects ingest data from multiple sources, wrangle the disparate data into a unified schema, and then provide a final database or cloud storage layer for reporting efforts by other groups. We are looking for candidates proficient in the related AWS services and databases.
Your primary focus will be the development of server-side logic, data ingestion, data wrangling, and algorithm development. Major technologies involved include AWS, Python 3, Spark, Pandas, and MySQL.
Skills And Responsibilities
- Development of new RDBMS schema to handle the addition of new datasets.
- Applicable AWS proficiency
- Comfortable with containerization and environment tooling (Docker, Vagrant, etc.)
- Ability to write intermediate to advanced SQL for data wrangling and reporting efforts.
- Development of Python/Pandas code to wrangle multiple datasets covering a full spectrum of ETL tasks including entity resolution.
- Occasional Linux server management including the review or management of log files, crontab, security configuration, etc.
- Familiarity with machine learning topics to support supervised and unsupervised classification efforts.
- Data exploration, analysis, and reporting skills with an eye towards developing a narrative using Jupyter Notebook.
- Working understanding of REST APIs.
- Developing techniques to work with both tabular and hierarchical data.
The ideal candidate:
- Is motivated by a passion to create highly fault-tolerant apps with excellent design practices
- Enjoys collaborating with other engineers on architecture and sharing designs with the team
- Has experience collaborating with team members and communicating code patterns
- Interacts with others using sound judgment, good humor, and consistent fairness in a fast-paced environment
Founded by freelance engineers, Blue Orange Digital wanted to bring an engineering-first approach to the development agency model. We aim to work on projects that use the latest and greatest technologies. We care about the products we build and only work with clients who understand that good applications come from happy engineers and team members. We’re headquartered in NYC and DC with additional remote engineers across the US.
Job Category: Data Engineer