Principal Data Scientist
Company: University of Texas MD Anderson Cancer Center
Location: Houston
Posted on: October 31, 2024
Job Description:
The mission of The University of Texas M. D. Anderson Cancer
Center is to eliminate cancer in Texas, the nation, and the world
through outstanding programs that integrate patient care, research,
prevention, and education. Core to the success of our mission is
the ability to orchestrate multidimensional data, data analytics,
and machine learning to create sustainable impact within a
framework of responsible AI. We are building a dynamic team to
drive machine learning operations in order to accelerate the impact
of AI across the enterprise, driving long-lasting improvements in
cancer care.

We are seeking a Principal Data Scientist to lead the
development and support of innovative generative AI models across
the organization. This role is at the heart of our endeavor to
pioneer generative AI healthcare solutions, aimed at
revolutionizing healthcare operations, enhancing patient outcomes,
and making substantial contributions to the fields of medical and
AI research. The selected candidate will be responsible for
developing individual and comprehensive multi-modal foundational
and generative AI models, utilizing their deep expertise in
algorithm architecture, machine learning methodologies, and
scientific processes. This effort is supported by an extensive
repository of contextually relevant data, including medical
imaging, electronic health records (EHR), pathology, operational
data, and other pertinent healthcare data.

The successful applicant
will engage in close collaboration with clinical and business
professionals to identify use cases, select appropriate tools and
technologies, and define metrics of impact, ensuring that our
generative AI solutions are relevant, efficacious, and safe for
use. The successful candidate will work alongside a team of data
scientists and machine learning engineers to guarantee the seamless
deployment, accessibility, and ongoing maintenance of AI models
within our infrastructure. The successful candidate must be capable
of nurturing a culture of innovation, promoting team unity, and
driving technological progress to integrate AI seamlessly
throughout our enterprise, guaranteeing its ethical application and
optimizing for impact.
Key Responsibilities:
- Generative AI Development: Innovate and develop
state-of-the-art machine learning technologies, focusing on
generative AI and multimodal models suitable for complex
healthcare applications.
- Foundation Model Development: Lead the development and
implementation of advanced foundational AI models, concentrating on
the domains of imaging, text, structured data, time-series, and
various healthcare-related data types. These models should enable
and enhance generative AI applications across multiple use
cases.
- Collaborative Integration & Validation: Work closely with
clinical experts, business stakeholders, data scientists, and
machine learning engineers to gather requirements, deploy, and
maintain foundation and generative AI models in production
environments, ensuring they are effectively validated and
integrated into enterprise use.
- Academic Collaboration & Translation: Engage with academic data
scientists and clinical researchers to explore novel AI approaches
and use-cases, facilitating the transition from research algorithms
to practical healthcare solutions.
- Operational Excellence & Compliance: Document and manage
detailed records of model development, maintain rigorous testing
and validation protocols, and ensure AI solutions are aligned with
regulatory standards and ethical guidelines.
- Leadership & Culture Development: Provide technical leadership
to a team of data scientists, fostering a culture of innovation,
continuous learning, and responsible AI development. Develop
thought leadership through presentations, publications, patents,
and participation in the tech community.
Technical Expertise:
- An in-depth understanding of machine learning algorithms and
modeling (e.g., supervised, unsupervised, semi-supervised or weakly
supervised learning, generative models, transfer learning,
optimization, large language models).
- Experience developing foundational and/or generative AI
models.
- Experience working with open-source and closed-source
generative AI models.
- Proficient in developing, evaluating, and deploying AI/ML
algorithms.
- Skilled in constructing scalable data pipelines, model artifact
management, and model performance analytics.
- Experienced with MLOps tools and processes for data, features,
code, and model management.
- Strong proficiency in Python and either C++ or C#, with
practical knowledge of TensorFlow, PyTorch, and Scikit-learn.
- Knowledgeable about AI/ML platform infrastructure, including
cloud and on-premises architectures.
- Familiar with cloud-native tools, services, and computing
environments (e.g., Azure, AWS, GCP).
Analytical Expertise:
- Demonstrated ability to handle problems with vague or abstract
definitions.
- In-depth knowledge of AI/ML Model Lifecycle Management.
- Proficient in decision-making, problem-solving, and executing
AI/ML healthcare solutions.
- Skilled at quantitatively assessing machine learning models
for performance, workflow impact, and potential risks.
- Competent in identifying risks and formulating mitigation plans
to prevent project delays.
Oral and Written Communication:
- Demonstrated ability to lead and manage data science teams and
projects.
- Experience with documenting processes, pipelines, workflows,
and machine learning experiments.
- Report project metrics, including progress, impact, and risks,
to leadership, offering strategic recommendations for AI/ML
use-case prioritization.
- Manage stakeholder relations to facilitate solution adoption
and address issues.
- Share knowledge and offer technical assistance to researchers
and colleagues.
- Deliver both technical and non-technical updates in meetings
and at professional gatherings.
Education Required: Bachelor's degree in Biomedical Engineering,
Electrical Engineering, Computer Engineering, Physics, Applied
Mathematics, Science, Engineering, Computer Science, Statistics,
Computational Biology, or related field.
Preferred Education: Doctorate (Academic)
Experience Required: Seven years of experience in scientific
software or industry programming with a concentration in scientific
computing. With a Master's degree, five years of experience
required. With a PhD, three years of experience required.
Preferred Experience: Two years in a technical leadership role,
leading the technical execution for a project, providing
mentorship, and working collaboratively within and across teams.
It is the policy of The University of Texas MD
Anderson Cancer Center to provide equal employment opportunity
without regard to race, color, religion, age, national origin, sex,
gender, sexual orientation, gender identity/expression, disability,
protected veteran status, genetic information, or any other basis
protected by institutional policy or by federal, state or local
laws unless such distinction is required by law.
Additional Information:
- Requisition ID: 166426
- Employment Status: Full-Time
- Employee Status: Regular
- Work Week: Days
- Minimum Salary: US Dollar (USD) 145,500
- Midpoint Salary: US Dollar (USD) 182,000
- Maximum Salary: US Dollar (USD) 218,500
- FLSA: exempt and not eligible for overtime pay
- Fund Type: Hard
- Work Location: Remote (within Texas only)
- Pivotal Position: No
- Referral Bonus Available?: No
- Relocation Assistance Available?: No
- Science Jobs: No