Responsibilities
- Design, build, and enhance AI/NLP solutions, including:
  - Clause extraction from unstructured documents.
  - Semantic comparison against standard/reference clauses.
  - Risk identification and classification.
- Process and analyze unstructured financial and legal documents (Word, PDF, scanned files), including multilingual content (e.g., German and English).
- Develop and fine-tune semantic similarity and embedding-based models to detect contextually similar clauses with different wording.
- Support creation and maintenance of a structured clause and policy repository aligned to Organization's standards (e.g., trade finance guarantees).
- Collaborate closely with Organization's SMEs, product owners, and compliance stakeholders to translate business and regulatory requirements into AI logic.
- Implement explainable AI approaches to ensure traceability and auditability of AI-generated outputs.
- Support proof-of-concept (POC) development, client demos, and iterative refinements based on feedback.
- Expose AI capabilities through secure APIs and integrate with downstream systems as required.
- Ensure adherence to data privacy, confidentiality, and secure coding standards expected in a Tier-1 bank environment.
Skill Requirements
- Bachelor's degree in Engineering/Information Technology/Computer Science or a related field.
- 6-8 years of experience in a relevant field.
- Hands-on expertise in NLP, semantic search, and document intelligence, with a strong emphasis on accuracy, explainability, and regulatory compliance, in line with Organization's risk and governance standards.
Technical Skills
Programming & Platforms: Strong proficiency in Python; working knowledge of Doc AI and Vertex AI.
NLP / GenAI
- Text preprocessing, embeddings, semantic similarity.
- Transformer-based models and LLMs.
- Prompt engineering and controlled generation techniques.
Frameworks & Libraries
- Hugging Face, spaCy, NLTK
- PyTorch or TensorFlow
- scikit-learn
Document Intelligence: PDF / Word parsing, OCR integration for scanned documents.
Backend & Integration: REST API development using FastAPI / Flask.
Data Handling: Working with structured and unstructured datasets.
Version Control: Git, basic CI/CD exposure.