Staff Data Engineer
staff · version v1-S4RexL
88a1a037-7ba2-4ded-b933-5451787037d5
Competencies
weights sum to 1.00

| Name | ID | Weight | Definition |
|---|---|---|---|
| Distributed Data Systems Design | distributed_systems_architecture | 0.35 | Design and own end-to-end data pipelines handling 10 TB+/day. Architect schema evolution, partitioning strategies, and data contracts that support analytics, ML, and billing simultaneously without downtime. |
| Query Performance & Optimization | sql_query_optimization | 0.20 | Read EXPLAIN plans, identify bottlenecks (join order, missing indexes, cardinality issues), and rewrite queries for cost and latency. Optimize warehouse spend across analytics and feature serving workloads. |
| Streaming & Batch Processing Frameworks | streaming_framework_mastery | 0.20 | Hands-on expertise with Kafka, Flink, Spark, or Beam in production. Build reliable ingestion pipelines that maintain correctness (exactly-once semantics) and handle backfills without service interruption. |
| Technical Leadership & Mentorship | technical_leadership | 0.15 | Review design docs and mentor 4 mid-level engineers to ship higher-quality systems. Raise team capability through code review and architectural guidance on data platform decisions. |
| Data Infrastructure Cost Optimization | cost_optimization | 0.10 | Identify and execute cost reduction opportunities in warehouse infrastructure spending (six-figure monthly spend). Model trade-offs between compute, storage, and query latency. |
Scoring Weights
Adjust how much each competency contributes to the final score. Changes apply to all future evaluations for this job spec.
distributed_systems_architecture: 35% (0.35)
sql_query_optimization: 20% (0.20)
streaming_framework_mastery: 20% (0.20)
technical_leadership: 15% (0.15)
cost_optimization: 10% (0.10)
Total weight: 1.00 ✓ balanced
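The page does not say how the final score is derived from the weights. As a minimal sketch, assuming the final score is a plain weighted sum of per-competency scores on a 0 to 100 scale, the "balanced" check and the score could look like the following. The function names `weights_balanced` and `final_score`, the tolerance, and the sample scores in the usage note are hypothetical; only the weights come from the table above.

```python
import math

# Weights copied from the competency table in this job spec.
WEIGHTS = {
    "distributed_systems_architecture": 0.35,
    "sql_query_optimization": 0.20,
    "streaming_framework_mastery": 0.20,
    "technical_leadership": 0.15,
    "cost_optimization": 0.10,
}


def weights_balanced(weights, tol=1e-9):
    """Return True when the weights sum to 1.00 within a small tolerance."""
    return math.isclose(sum(weights.values()), 1.0, abs_tol=tol)


def final_score(scores, weights=WEIGHTS):
    """Weighted sum of per-competency scores (assumed to be on a 0-100 scale).

    ASSUMPTION: the real scoring formula is not documented on this page;
    a weighted sum is only the simplest reading of "contributes to the
    final score".
    """
    if not weights_balanced(weights):
        raise ValueError("weights must sum to 1.00")
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing scores for: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)
```

Under this assumption, a candidate scoring 90, 70, 80, 60, and 50 on the five competencies (in table order) would land at roughly 75.5 overall, and raising one weight without lowering another would fail the balance check.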
Interview questions
AI-generated from the JD · auto-published · edit anytime
3 currently · candidate sees this many
Behavioral round (Round 02)
Voice Q&A · pre-generated from the JD · edit anytime
4 prompts · Cal walks through each in order during Round 02
Live performance
Aggregated from candidates who took this job spec.
Interviews: 1 (0 completed)
Avg score: n/a (out of 100)
Completion: 0% (0 of 1)
Avg score per competency
Distributed Data Systems Design · weight 0.35 · avg 0.00 · n=2
Query Performance & Optimization · weight 0.20 · avg 0.00 · n=2
Must-haves
- required: 8+ years in data engineering or distributed systems
- required: Experience designing for schema evolution and zero-downtime backfills
- required: Hands-on proficiency with at least one of: Spark, Flink, Beam, or dbt-at-scale
- required: Proven track record mentoring engineers and improving team output