
ABOUT THE POSITION
We are looking for a Senior AI Engineer to help design and deliver agentic AI systems that power R&D tooling for video game asset pipelines and production workflows. You will help shape the technical direction of our internal agent platform and drive engineering practices around agent loops, memory, evaluation, and safe deployment of LLM-driven applications.
This is a senior, hands-on individual contributor role: you will write code, help design the agentic architecture, and partner with stakeholders across studios to turn emerging AI capabilities into production-grade tools.
Responsibilities
Agent platform
- Build core parts of our internal agent libraries: abstractions that let teams build agents quickly and consistently.
- Help shape the central agent runtime architecture: deployment, registry, monitoring, and governance.
- Evolve the agent loop/harness (prompt orchestration, tool invocation, and sub-agent delegation) and adapt open-source patterns to our use cases.
Agent loop & harness engineering
- Drive prompting at scale: system prompts, guardrails, context poisoning mitigation, and hyperparameter tuning (context window, temperature, and top-k).
- Design agent tool interfaces (MCP servers, structured I/O, sub-agent composition) with built-in observability and telemetry.
- Evaluate and integrate local LLMs where latency, cost, or data residency require it.
Agent memory
- Build key parts of the shared memory layer: conversation history, context chaining, and episodic memory.
- Define short-term vs. long-term memory boundaries (decay/retention) and apply RBAC and tenant isolation for safe multi-team sharing.
Test- and evaluation-driven development
- Build eval harnesses and CI gates: golden traces, regression evaluations, offline/online metrics, and red-team prompts.
- Uphold evals as the unit of progress: no agent change ships without a measurable signal.
Backend & platform foundations
- Build scalable Python (FastAPI) services and secure APIs with relational and non-relational data modeling.
- Enforce RBAC, input validation, and error handling; implement caching, queues, and vector storage as workloads require.
Quality, delivery & collaboration
- Drive performance tuning, code reviews, and technical documentation within your area of the platform.
- Maintain CI/CD (Git/GitLab, Docker) and partner with cross-functional stakeholders (UI/UX, production, SRE, IT, game teams) to deliver agentic solutions.
Qualifications
Foundation (must-have software-engineering baseline)
- 3+ years of professional experience building production applications, with recent depth in AI/LLM-based systems.
- Strong proficiency in at least one of Python, TypeScript, or JavaScript. Python expertise is required for our stack (FastAPI, Pydantic, SQLAlchemy, or equivalent).
- Solid database skills across relational (PostgreSQL) and non-relational systems (e.g., MongoDB, vector databases); familiarity with caching/queues (Redis) where applicable.
- Working knowledge of RBAC, authn/authz patterns, and secure API design.
- Comfortable with Git, GitLab CI/CD, and Docker/containers.
- Proven testing mindset and experience with automated test suites (e.g., pytest).
Agent loop/harness engineering
- Demonstrated experience designing and operating agent loops in production, not just prompt-tuning a chatbot.
- Deep, practical understanding of prompting: guardrails, context poisoning/pollution, and the hyperparameters that govern model behavior (context window size, lost-in-the-middle effects, temperature, and top-k).
- Hands-on experience integrating tools into agents: MCP, structured I/O for context, and sub-agent orchestration.
- Experience with any agent development framework (e.g., LangChain, LangGraph, Claude Agent SDK, Pydantic AI, or comparable) is acceptable.
- Strong instincts for observability and telemetry in non-deterministic systems.
Agent memory
- Practical experience implementing memory for agents: history compaction, context chaining, episodic memory, and short-term vs. long-term separation.
- Familiarity with retention/decay strategies and applying RBAC to multi-tenant memory.
Evaluation & quality
- Experience with test- and eval-driven development for LLM systems: building eval sets, regression suites, and CI gates around model/prompt changes.
Communication
- English communication is a MUST: strong written and verbal English is required, and fluency is a significant plus given our globally distributed teams.
- Comfortable communicating technical decisions and tradeoffs across cross-functional stakeholders.
Nice to have
- Experience running local LLMs (e.g., via vLLM, Ollama, llama.cpp) and reasoning about the cost/latency/quality tradeoffs vs. hosted models.
- Contributions to or familiarity with open-source agent harnesses (e.g., OpenCode, OpenClaw).
- Experience with agent development frameworks (LangChain/LangGraph/Claude Agent SDK/Pydantic AI) beyond the prototype stage.
Job ID: 147798739