361 Lead Data Architect jobs in the United Arab Emirates
Lead Data Architect
Posted today
Job Description
A leading investment institution is seeking a Data Engineering and Operations Lead to drive the development and operation of a modern, scalable enterprise data platform. This senior leadership role is critical to enabling advanced analytics, AI, and business intelligence across investment and operational domains.
You will bring a strategic mindset, deep technical expertise, and a strong focus on data integrity and operational excellence.
Key Responsibilities
Strategic Leadership
- Define and execute the data engineering vision aligned with business and analytics goals.
- Advise senior leadership and collaborate with data science and governance teams to drive innovation.
- Represent the function in digital transformation and data modernization initiatives.
Architecture & Platform Engineering
- Design and manage a cloud-native architecture using Snowflake, Azure, and Databricks.
- Implement medallion architecture and data lakehouse patterns.
- Ensure secure, modular, and interoperable data systems.
Data Integration & Mastering
- Lead ingestion and harmonization of market, partner, and third-party data.
- Establish frameworks for data mastering, hierarchy management, and lifecycle observability.
Advanced Data Engineering
- Build complex transformation workflows using dbt, Databricks, and Python.
- Enable enriched data layers for analytics, AI, and reporting.
Governance & Observability
- Embed data quality, lineage, and privacy into engineering workflows.
- Utilize tools like Collibra, Purview, Monte Carlo, and Great Expectations.
Business Intelligence & AI Enablement
- Develop semantic models and governed datasets in Power BI.
- Support AI systems and retrieval-augmented generation (RAG) pipelines.
Required
Education
- Bachelor's or Master's in Computer Science, Data Engineering, Information Systems, or related field.
Experience
- 10+ years in enterprise data engineering, with 5+ years in leadership.
- Strong background in financial services or investment management.
- Proven experience with market data, GP data, and external vendor feeds.
- Deep expertise in Snowflake, Azure Data Services, and Databricks.
Technical Skills
- Expert in SQL and Python for pipeline development.
- Advanced orchestration with dbt, Airflow, or Azure Data Factory.
- Experience with semantic modeling, data mastering, and AI-ready data services.
- Familiarity with Power BI, metadata enrichment, and MLOps practices.
Lead Data Architect
Posted today
Job Description
Data Infrastructure and Analytics Role
We are seeking an experienced Data Scientist to strengthen our data capabilities. This role will play a crucial part in enhancing our data infrastructure, maintaining high data quality, and delivering actionable insights across departments.
Responsibilities include:
- Driving strategic decisions with data-driven insights
- Developing and implementing efficient data infrastructure and tooling solutions
- Providing ad-hoc analytics support to critical business units
- Conducting research and analysis to uncover key findings
- Improving data collection and analytics processes across teams
- Ensuring dashboards provide clarity, usability, and real-time visibility
Lead Data Architect
Posted today
Job Description
Strategic Data Platform Leadership
- Define and implement an enterprise-wide data architecture strategy that supports interoperability, AI/ML readiness, and regulatory compliance
- Lead the evolution of our AWS-based data lake architecture, supporting structured, semi-structured, and unstructured data types, especially FHIR-formatted JSON healthcare data
Cloud Data Lake & Storage Optimization
- Design and maintain scalable, secure, and cost-effective data lakes using Amazon S3, AWS Glue, Athena, Redshift, and Lake Formation
- Leverage Mountpoint for S3 to enable high-performance, POSIX-compliant access to S3 objects, including vectorized data files
- Optimize data storage and retrieval strategies for performance and cost-efficiency, including partitioning, file formats (e.g., Parquet, ORC), and compression techniques
AI/ML Enablement and Vector Infrastructure
- Collaborate with data science teams to implement embedding models, vectorization pipelines, and real-time inference architectures
- Design and manage vector storage systems (e.g., S3-based, FAISS, Pinecone, or Amazon OpenSearch) to support semantic search, retrieval-augmented generation (RAG), and intelligent data access
- Ensure vectorized data pipelines are aligned with model training, evaluation, and deployment strategies
Healthcare Data Architecture & Interoperability
- Architect systems to ingest, process, and store FHIR-compliant JSON data from EHRs, APIs, and HL7 sources
- Ensure conformance with healthcare interoperability standards and optimize for queryability and downstream analytics
- Implement data normalization and enrichment pipelines for use in both clinical and operational contexts
Security, Compliance & Governance
- Lead efforts to ensure data security at rest and in transit using AWS-native encryption, IAM, VPC controls, and bucket policies
- Implement and manage data access controls, audit logging, and role-based security models across AWS environments
- Oversee data governance including lineage, cataloging, and stewardship with tools such as AWS Glue Data Catalog, Lake Formation, or third-party platforms
Team Leadership & Cross-Functional Collaboration
- Build and lead a high-performing team of data architects and engineers
- Work closely with stakeholders from engineering, data science, product, and compliance teams to deliver data initiatives
- Bachelor's or Master's in Computer Science, Data Engineering, or related field
- 8–12+ years of experience in data architecture with 3–5 years in a technical leadership role
- Proven experience architecting AWS-based data lakes and analytics pipelines
- Deep understanding of healthcare data standards (FHIR, HL7) and working with FHIR JSON objects in large-scale systems
- Expertise with embedding and vectorization models, semantic search, and managing vector storage solutions
- Hands-on experience with Amazon S3, Mountpoint for S3, and optimizing S3-based workloads for performance and cost
- Strong background in data security, encryption, access control, and compliance frameworks (HIPAA, HITRUST)
Preferred Qualifications
- AWS certifications (e.g., AWS Certified Big Data or Data Analytics – Specialty)
- Familiarity with open-source vector databases (e.g., FAISS, Weaviate) and MLOps pipelines
- Experience in clinical systems integration, claims processing, or population health analytics
- This is an independent contractor position.
- Job Type: Full-time
- Location: Remote
- Hours: Available during standard US business hours (9am-5pm EST or 8:30am-4:30pm EST)
This job description is intended to describe the general requirements for the position. It is not a complete statement of duties, responsibilities or requirements. Other duties not listed here may be assigned as necessary to ensure proper operations of the department.