Spark Application Architect – 100% Remote

Matlen Silver
November 4, 2021
Plymouth Meeting, Pennsylvania
Remote Work: Yes


• Work on a team of experienced Hadoop architects, engineers, and administrators managing thousands of nodes across many clusters around the globe.
• Partner with internal tenant application development and architecture teams to enable them to be more successful in utilizing our clusters.
• Embed with our tenant teams to gather and share successful design patterns and opportunities for data and code reuse when applicable.
• Identify opportunities to optimize Spark code, Impala queries, Hive tables, and workflows for better scalability, reliability, and performance in existing applications.
• Identify opportunities to stabilize our clusters through user application changes that eliminate small files, memory overallocation, data skew, and other bad practices.
• Implement Proof-of-Concept applications to demonstrate best practices, new design patterns, or new technology adoption.
• Review requirements for new user initiatives to design in best practices from the start.
• Collect existing documentation from application architects to give our team more transparency into our users' use cases and technology stacks.
• Document high-level summaries of tenant use cases and data dependencies.
• Present findings to both tenant and cluster admin teams to enable knowledge transfer.

• You are experienced in software architecture and design patterns
• You have delivered architectural artifacts and the software solutions derived from them
• You have designed big data infrastructure on-prem and in the cloud
• You have strong communication and documentation skills
• You enjoy building proofs of concept and performing demos
• You have experience working with healthcare, PHI, PII, or other sensitive data

Technical Experience Requirements:
• Spark (most important), Scala, Spark SQL
• Hive, ETL/ELT Pipelines, Impala
• Hadoop, MapReduce, YARN, Big Data, NoSQL
• Cloudera CDH/CDP or Hortonworks HDP
• Oozie or Airflow
• Kafka or other messaging
• Cloud (Azure, AWS)
• Unix / Bash
• Python
• REST APIs, Postman

Good to Haves:
• Cloudera CDP
• Hive ACID, Kudu
• Atlas API
• Python3, PySpark
• Java, Kotlin, Spring Boot
• Node.js, React.js
• Gradle, GitLab CI
• MuleSoft
