Key Responsibilities
Agent Framework & Libraries
- Architect modular Python libraries and a CLI that expose core agent primitives: task graphs, skills, memory, and tool interfaces.
Orchestration & Scheduling
- Implement a scalable orchestration layer (Celery, Argo Workflows, Prefect, or similar) that runs multi-step CV pipelines with retry, rollback, and SLA guarantees.
- Integrate vector and hybrid search stores so agents can retrieve data during execution.
Tooling & Developer Experience
- Create CLI utilities and REST/gRPC APIs that let engineers trigger, inspect, and debug agent runs.
- Maintain CI/CD pipelines, comprehensive test suites, and infrastructure-as-code so the agent platform ships reliably on a bi-weekly cadence.
CV Toolkit Integration
- Wrap best-in-class vision components (OpenCV, TorchVision, MMDetection, Ultralytics YOLO, Albumentations, etc.) so agents can call data-prep, augmentation, model-zoo, and metric utilities on demand to meet user requirements.
Must-Have Skills
- Solid engineering foundation: 5+ years writing production software (ideally Python), strong grasp of algorithms, data structures, Git workflows, and code-review best practices.
- Agent frameworks: hands-on experience designing or extending agent stacks such as LangChain, AutoGen, CrewAI, or custom in-house task-graph engines.
- Orchestration at scale: proficiency with a workflow scheduler or task queue (Prefect, Argo Workflows, Airflow, Dagster, Celery) and the patterns for retry, rollback, and SLA tracking.
- Computer-vision pipeline know-how: practical exposure to training and evaluating CV models (classification, detection, segmentation) and understanding of data-quality pitfalls.
- Evaluation & observability: ability to build automated test/evaluation harnesses using pytest, MLflow, wandb, or equivalent, and expose metrics via Prometheus/Grafana or OpenTelemetry.
- Vector & hybrid search: experience integrating stores such as Pinecone, Weaviate, pgvector, or FAISS to power agent memory and retrieval workflows.
- Model serving & packaging: familiarity with TorchServe, Triton, BentoML, ONNX Runtime, or similar frameworks, plus Docker/Kubernetes fundamentals.
- CI/CD & IaC: competence setting up GitHub Actions/GitLab CI pipelines and Infrastructure-as-Code (Terraform, Pulumi) to keep releases predictable.
- Cloud fluency: production deployments on one or more providers (AWS, GCP, Azure) and an eye for cost/performance trade-offs.
- Clear communication: comfort writing design docs/RFCs and mentoring peers on agent architecture, testing, and deployment best practices.
Nice-to-Have Skills
- Portfolio of AI/Computer Vision/Agent projects or open-source contributions
- UI development experience (e.g., Gradio, Streamlit)
- Familiarity with ML observability tools (e.g., Grafana or Datadog)