Activities
- Implement ETL/ELT processes using various tools and programming languages (Scala, Python) against our MPP databases: StarRocks, Vertica, and Snowflake (a minimal pipeline sketch follows this list)
- Work with the Hadoop team to optimize Hive and Iceberg tables
- Contribute to the existing Data Lake and Data Warehouse initiative using Hive, Spark, Iceberg, Presto/Trino
- Analyze business requirements, design and implement required data models
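
For illustration, a minimal batch ETL sketch in Scala showing the kind of Spark-to-Iceberg pipeline this role involves. It assumes an Iceberg catalog named "lake" is configured for the Spark session; the source path and table names are hypothetical.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object OrdersEtl {
      def main(args: Array[String]): Unit = {
        // Assumes an Iceberg catalog named "lake" is configured on the cluster.
        val spark = SparkSession.builder()
          .appName("orders-daily-etl")
          .getOrCreate()

        // Extract: read raw JSON landed in the data lake (path is hypothetical).
        val raw = spark.read.json("s3a://landing/orders/dt=2024-01-01/")

        // Transform: basic cleansing and typing.
        val orders = raw
          .filter(col("order_id").isNotNull)
          .withColumn("order_ts", to_timestamp(col("order_ts")))
          .withColumn("amount", col("amount").cast("decimal(18,2)"))

        // Load: append into the target Iceberg table (name is hypothetical).
        orders.writeTo("lake.dwh.orders").append()

        spark.stop()
      }
    }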
Skills
- English C1 - Advanced
- BA/BS in Computer Science or a related field
- 1+ years of experience with MPP databases such as StarRocks, Vertica, or Snowflake
- 3+ years of experience with relational databases (RDBMS) such as Oracle, MSSQL, or PostgreSQL
- Programming background in Scala, Python, Java, or C/C++
- Strong in any of the major Linux distributions: RHEL, CentOS, or Fedora
- Experience working in both OLAP and OLTP environments
- Experience working in on-premises environments, not just the cloud
Desired (nice to have)
- Experience with Elasticsearch or ELK stack
- Working knowledge of streaming technologies such as Kafka
- Working knowledge of orchestration tools such as Oozie and Airflow
- Experience with Spark (PySpark, Spark SQL, Spark Streaming, etc.); a brief streaming sketch follows this list
- Experience using ETL tools such as Informatica, Talend and/or Pentaho
- Understanding of Healthcare data
- A Data Analyst or Business Intelligence background would be a plus
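
For illustration, a minimal Scala sketch of a Kafka-to-Iceberg stream using Spark Structured Streaming. It assumes the spark-sql-kafka connector is on the classpath and an Iceberg catalog named "lake" is configured; the broker, topic, table, and checkpoint names are hypothetical.

    import org.apache.spark.sql.SparkSession

    object EventsStream {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("events-kafka-stream")
          .getOrCreate()

        // Read a Kafka topic as a streaming DataFrame (broker and topic are hypothetical).
        val events = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "events")
          .load()
          .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS payload", "timestamp")

        // Continuously append micro-batches into an Iceberg table (name is hypothetical).
        events.writeStream
          .option("checkpointLocation", "s3a://checkpoints/events/")
          .toTable("lake.dwh.events")
          .awaitTermination()
      }
    }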
Additional Information
- 4,000 USD/month
- Remote