Job description
Position Overview
We are seeking a Machine Learning Software Developer to build and deploy production‑grade AI systems for our flagship clinical software. This role focuses on foundation models and agentic AI workflows that support clinical reporting, findings summarization, structured outputs, and conversational assistance. You will develop and scale LLM‑based systems using retrieval‑augmented generation, tool integration, structured outputs, and orchestration frameworks across local and cloud environments. Success in this role requires strong attention to reliability, observability, safety, and backend integration within a regulated clinical setting. The Machine Learning Software Developer - Foundational & Agentic AI will report to the Director, Artificial Intelligence.
General Responsibilities
Build and scale training pipelines in collaboration with Research Scientists, translating experimental ideas into production‑grade ML systems.
Design and deploy agentic and LLM‑powered workflows for clinical reporting, summarization, structured outputs, and conversational assistance using tool integration, function calling, and orchestration frameworks.
Develop retrieval‑augmented generation pipelines and backend services that integrate AI capabilities into a secure, scalable C++‑based platform.
Establish evaluation, observability, and monitoring practices to measure and improve quality, factuality, safety, latency, reliability, and runtime performance.
Support local and cloud deployment of models and inference services with a focus on privacy, resilience, maintainability, and strong engineering practices.
Required Skills/Experience
4+ years of experience building and deploying machine learning or AI systems in production.
Strong expertise in deep learning architectures, including Transformers and diffusion models, with proficiency in PyTorch.
Hands‑on experience building agentic and LLM‑based applications using retrieval‑augmented generation, structured outputs, function calling, workflow orchestration, and evaluation frameworks.
Experience with distributed training and optimization in HPC or cloud environments using frameworks and tools such as PyTorch Distributed, Ray, DeepSpeed, Megatron, or CUDA.
Strong Python and software engineering skills, including testing, debugging, version control, and experience building REST APIs, backend services, or microservices.
Beneficial Skills/Experience
Hands‑on experience training or fine‑tuning large foundation models in distributed compute environments.
Familiarity with multi‑agent systems, workflow engines, graph‑based orchestration frameworks, and cloud platforms such as AWS, Azure, or GCP.
Proficiency in MLOps or LLMOps tooling such as Docker, Kubernetes, MLflow, Airflow, CI/CD pipelines, or model monitoring systems.
Background in healthcare, biomedical imaging, or other regulated software environments, including translating research into product features.
Educational Requirements
Master’s or PhD in Computer Science, Artificial Intelligence, Data Science, or an equivalent combination of education, training, and experience.
About The Benefits
Competitive compensation and vacation
Flexible working arrangements
Employee Wellness Program
Professional development and tuition reimbursement program
Gratifying internal recognition/kudos programs
Annual salary review – based on company and individual performance
Fun, inclusive, ego‑free environment where diversity and individual thoughts are encouraged and valued