Position: Data Engineering Lead – Vendor
Experience: 9–12 Years  
Role Summary  
We are looking for an experienced Data Engineering Lead (Vendor) to support our platform modernization program. This lead will work with internal teams to migrate applications from Cloudera CDH to a Kubernetes-based global data platform and will ensure timely delivery of high-quality data engineering solutions.
Key Responsibilities  
• Provide technical leadership for migration projects from Cloudera (Spark, Hive, Kafka, Control-M) to a Kubernetes-based stack (Spark 3.5, DBT, Airflow, MinIO/S3, Kafka, Solace); a minimal sketch of the target stack appears after this list.
• Lead a small team, together with internal engineers, to complete project deliverables.
• Participate in design and architecture discussions and in migration planning with internal leads.
• Build and review high-performance, production-ready data pipelines.
• Ensure adherence to standards, compliance, and governance requirements.
• Provide status reporting, escalations, and delivery tracking to stakeholders.
• Design and implement a migration/acceleration framework to automate end-to-end migration.
• Continuously enhance the framework to ensure stability and scalability and to support diverse use cases and scenarios.
• Work with the owners of the various data applications to enable and support the migration process.
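To make the target stack concrete, the following is a minimal, illustrative PySpark sketch of the kind of job such a migration produces: a Spark 3.5 application that reads a legacy Parquet dataset and rewrites it as an Iceberg table on MinIO/S3 via the s3a connector. The catalog, bucket, and table names are hypothetical placeholders, and the job assumes the iceberg-spark-runtime and hadoop-aws jars are on the classpath.

```python
# Illustrative sketch only: a Spark 3.5 job for the target stack, rewriting a
# legacy Parquet dataset as an Iceberg table on MinIO/S3. All names (catalog
# "lake", buckets, table) are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("cdh-migration-example")
    # Register an Iceberg catalog backed by object storage.
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lake.type", "hadoop")
    .config("spark.sql.catalog.lake.warehouse", "s3a://warehouse/")
    # Point the s3a connector at a MinIO endpoint; credentials would come from
    # Kubernetes secrets or an IAM-style mechanism in a real deployment.
    .config("spark.hadoop.fs.s3a.endpoint", "http://minio:9000")
    .config("spark.hadoop.fs.s3a.path.style.access", "true")
    .getOrCreate()
)

# Read a legacy Hive-era dataset and rewrite it as an Iceberg table.
df = spark.read.parquet("s3a://landing/events/")
df.writeTo("lake.db.events").createOrReplace()
spark.stop()
```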
Required Skills  
• 9–12 years of experience in data engineering and big data ecosystems.
• Strong hands-on expertise in Spark, Hive, Kafka, and Solace.
• Working experience with Kubernetes deployments and containerized data workloads.
• Proficiency in Python, Scala, and/or Java.
• Experience with orchestration tools (Airflow, Control-M) and SQL transformation frameworks (DBT preferred); a minimal orchestration sketch follows this list.
• Familiarity with object stores (S3, MinIO).
• Hands-on experience with data lakehouse table formats (Iceberg, Delta Lake, Hudi).
• Prior experience leading vendor or distributed teams on enterprise projects.
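For orientation, here is a minimal, illustrative Airflow sketch of the orchestration pattern this skill set implies: a DAG that submits a Spark job and then runs DBT transformations. The DAG id, commands, and paths are hypothetical placeholders; a real deployment would likely use the Spark-on-Kubernetes and DBT provider operators rather than plain BashOperator.

```python
# Illustrative sketch only: an Airflow DAG that chains a Spark submission and a
# DBT run. All identifiers and paths are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="cdh_migration_pipeline",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Submit the Spark 3.5 job to the Kubernetes cluster.
    spark_job = BashOperator(
        task_id="spark_ingest",
        bash_command=(
            "spark-submit --master k8s://https://kubernetes.default.svc "
            "--deploy-mode cluster /opt/jobs/ingest_events.py"
        ),
    )
    # Run SQL transformations with DBT once the raw data has landed.
    dbt_run = BashOperator(
        task_id="dbt_transform",
        bash_command="dbt run --project-dir /opt/dbt/analytics",
    )
    spark_job >> dbt_run
```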