Summary
We are seeking a highly skilled and motivated Senior AI Engineer to join our Continuous Integration (CI) team. The Senior AI Engineer will play a pivotal role in designing, developing, and deploying AI-driven microservices that power our next-generation enterprise platform. This role is critical to advancing our AI capabilities by leveraging frameworks such as LangChain and LangGraph, and by implementing scalable, maintainable solutions within a microservice architecture. The ideal candidate will bring deep expertise in containerization, orchestration, and multi-agent systems, contributing to the robustness and efficiency of our AI infrastructure. This position offers an opportunity to work at the intersection of AI innovation and cloud-native technologies, collaborating closely with cross-functional teams to drive continuous integration and deployment excellence.
Responsibilities
- Design, develop, and deploy AI-driven microservices using Python and advanced AI frameworks including LangChain and LangGraph.
- Architect and implement scalable multi-agent systems that enhance the intelligence and responsiveness of our enterprise platform.
- Utilize containerization technologies such as Docker to package AI microservices, ensuring consistency across development, testing, and production environments.
- Manage orchestration of containerized applications using Kubernetes, including programmatic handling of deployments through Kubernetes APIs.
- Collaborate with DevOps and platform teams to integrate AI microservices into continuous integration and continuous deployment (CI/CD) pipelines, ensuring rapid and reliable delivery.
- Implement and maintain MCP Reverse Proxy configurations to optimize routing, security, and load balancing for AI services within the enterprise deployment architecture.
- Contribute to the design and deployment of enterprise-grade AI solutions that align with organizational goals and compliance standards.
- Work closely with data scientists, software engineers, and product managers to translate AI research into production-ready services.
- Monitor, troubleshoot, and optimize the performance, scalability, and reliability of AI microservices in cloud environments.
- Participate in code reviews, knowledge sharing, and mentoring of junior engineers to foster a culture of technical excellence.
- Stay abreast of emerging AI technologies, container orchestration trends, and best practices to continuously improve the AI platform.
Requirements
Must-Have Skills
- Python: Proficient in Python programming, with experience in developing AI applications and microservices. Ability to write clean, efficient, and maintainable code.
- LangChain: Expertise in the LangChain framework for building AI applications that integrate language models with external data and tools.
- LangGraph: Experience with LangGraph for constructing and managing graph-based AI workflows and decision-making processes.
- Microservice Architecture: Strong understanding of microservice design principles, including service decomposition, API design, and inter-service communication.
- Multi-agent Systems: Proven experience in designing and deploying multi-agent AI systems that enable autonomous, collaborative, or competitive agent behaviors.
- MCP Reverse Proxy: Knowledge of MCP Reverse Proxy configurations and management to facilitate secure and efficient routing of AI microservices.
- Enterprise Platform Deployment: Familiarity with deploying AI solutions within enterprise-grade platforms, ensuring scalability, security, and compliance.
- Docker and Kubernetes (K8s) Experience: Hands-on experience with containerization using Docker and orchestration with Kubernetes, including deployment, scaling, and management of containerized AI services.
Nice-to-Have Skills
- Application-to-Application (A2A) Integration: Experience integrating AI microservices with other enterprise applications to enable seamless data and process flows.
- Advanced Embedding Strategies: Knowledge of embedding techniques to represent complex data structures and semantic information for AI models.
- Fine-Tuning: Experience fine-tuning large language models or other AI models to improve performance on domain-specific tasks.
- Evaluations: Ability to design and conduct rigorous evaluations of AI models and systems to ensure quality and effectiveness.
- Scaling with Tool Calling: Familiarity with scaling AI workflows by orchestrating external tool calls and managing dependencies.
- Programmatic Handling of Kubernetes Deployments through Kubernetes APIs: Advanced skills in automating Kubernetes operations using APIs and custom controllers.
- Sandboxed Environments for Ephemeral Code Execution: Experience creating secure, isolated environments for running transient AI code safely.
- Apache Kafka: Knowledge of event streaming platforms like Apache Kafka to support event-driven architectures and real-time data processing.
- Event-Driven Architectures: Understanding of designing AI systems that react to events asynchronously for improved responsiveness and scalability.
- Caching Large Language Model Responses: Familiarity with techniques for caching AI model outputs to reduce latency and computational costs.
- Large Language Model Memory: Experience managing memory and context in large language models to enhance conversational AI capabilities.
- Rule-Based Decision Making: Ability to implement rule-based logic to complement AI-driven decision processes.
- Graph-Based Decision Making: Expertise in leveraging graph structures for complex decision-making and knowledge representation.
- Swarm Architectures: Familiarity with swarm intelligence concepts to coordinate multiple AI agents in distributed environments.