This role is a hybrid position that bridges the gap between real-time interoperability (Integration) and strategic analytical infrastructure (Data Engineering).
The Integration and Data Engineer links our client's core systems to its AI-driven future. You will design and manage the data pipelines that fuel our AI models: using Rhapsody and BizTalk to feed real-time clinical events into AI inference engines, and leveraging Microsoft Fabric to curate high-quality training datasets from Epic Clarity. You will work side by side with AI Engineers to develop, deploy, and monitor predictive models and GenAI applications.
Key Responsibilities
1. AI & Machine Learning Data Support
- Feature Engineering: Collaborate with AI Engineers to identify, extract, and transform clinical & non-clinical variables (features) from Epic and other systems into formats ready for machine learning.
- Vector Database Management: Support the implementation of vector stores within Microsoft Fabric to enable Retrieval-Augmented Generation (RAG) for clinical & non-clinical LLMs.
- Real-time AI Triggers: Configure Rhapsody / BizTalk to trigger AI model scoring based on specific HL7 events (see the sketch after this list).
- Model Monitoring Data: Build pipelines to capture AI model outputs and feed them back into clinical & non-clinical workflows or monitoring dashboards for performance tracking.
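
As an illustrative sketch only (the scoring endpoint URL, message structure, and field positions below are assumptions, not details of the client's environment), a downstream consumer of a Rhapsody / BizTalk route might parse an HL7 v2 result message and forward extracted features to a model-scoring service like this:

```python
import requests  # standard HTTP client; the scoring endpoint below is hypothetical

SCORING_ENDPOINT = "https://ai-inference.example.internal/score"  # placeholder URL

def parse_hl7_observations(raw_message: str) -> dict:
    """Pull OBX values out of a pipe-delimited HL7 v2 message.

    Field positions follow the HL7 v2 standard (OBX-3 = observation identifier,
    OBX-5 = value); real interfaces vary and would be mapped in Rhapsody / BizTalk.
    """
    features = {}
    for segment in raw_message.replace("\r", "\n").splitlines():
        fields = segment.split("|")
        if fields and fields[0] == "OBX" and len(fields) > 5:
            identifier = fields[3].split("^")[0]  # e.g. a LOINC code
            features[identifier] = fields[5]
    return features

def score_event(raw_message: str) -> dict:
    """Send extracted features to the (hypothetical) AI scoring service."""
    payload = {"features": parse_hl7_observations(raw_message)}
    response = requests.post(SCORING_ENDPOINT, json=payload, timeout=10)
    response.raise_for_status()
    return response.json()  # e.g. {"risk_score": 0.82}
```
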
2. Advanced Integration
- High-Velocity Streams: Develop and maintain real-time interfaces in Rhapsody and BizTalk that handle high-volume telemetry and event data for real-time AI monitoring.
- Standardization for AI: Map diverse legacy data formats to standardized, interoperable formats so that AI models receive clean, consistent data (an illustrative mapping sketch follows this list).
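
For illustration only (the legacy field layout and the target schema are assumptions), normalizing a pipe-delimited legacy record into a consistent, AI-ready structure might look like this:

```python
from dataclasses import dataclass, asdict
from datetime import datetime

@dataclass
class NormalizedResult:
    """A standardized target shape; the real canonical model would be agreed with the AI team."""
    patient_id: str
    observation_code: str
    value: float
    observed_at: str  # ISO 8601 timestamp

def normalize_legacy_row(row: str) -> NormalizedResult:
    """Map one pipe-delimited legacy row (hypothetical layout) to the standard shape."""
    patient_id, code, value, timestamp = row.split("|")
    observed_at = datetime.strptime(timestamp, "%Y%m%d%H%M").isoformat()
    return NormalizedResult(patient_id, code, float(value), observed_at)

# Example usage
record = normalize_legacy_row("MRN12345|2160-0|1.2|202406011430")
print(asdict(record))  # clean, interoperable record ready for downstream AI pipelines
```
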
3. Data Engineering & Fabric Ecosystem
- Fabric Lakehouse Design: Build and optimize lakehouse architectures in Microsoft Fabric, tailored for AI training and historical analysis.
- Epic Clarity Data Mining: Perform advanced data extraction from Clarity to build longitudinal patient records used for retrospective AI model validation.
- Pipeline Automation: Use Fabric Data Factory to automate the refresh of AI training sets, keeping training data current so that models do not degrade from data drift (see the sketch below).
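
As a minimal sketch of the kind of refresh job described above, assuming a Microsoft Fabric notebook where a `spark` session is already provided and the lakehouse table names are hypothetical placeholders, a training set could be rebuilt from curated Clarity extracts like this:

```python
# Runs inside a Microsoft Fabric notebook, where a SparkSession named `spark`
# is provided by the runtime. Table names below are hypothetical placeholders.
from pyspark.sql import functions as F

encounters = spark.read.table("clarity_curated.encounters")
labs = spark.read.table("clarity_curated.lab_results")

# Join longitudinal lab results onto encounters and keep recent history only,
# so the training set stays current between scheduled Data Factory runs.
training_set = (
    encounters.join(labs, on="patient_id", how="inner")
    .where(F.col("encounter_date") >= F.add_months(F.current_date(), -24))
)

# Overwrite the Delta table that the AI team trains against.
(
    training_set.write.mode("overwrite")
    .format("delta")
    .saveAsTable("ai_training.readmission_features")
)
```

A job like this would typically be orchestrated on a schedule by Fabric Data Factory, so the downstream models are always retrained and validated against current data.
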