AI/ML & Data Engineer

Name: Congruent Software Inc
Address: 4205 148th Ave NE Suite 200, Bellevue, WA, 98007, US
Telephone: +1-844-567-5232

Location: Chennai

Experience 

4 to 6 years in ML/NLP, preferably in document-heavy domains (finance, legal, policy)

Data Ingestion and Preprocessing: Ability to build and maintain data pipelines that ingest unstructured data from PDFs, gazettes, HTML circulars, and similar sources, and that handle data extraction, parsing, and normalization.
NLP & LLM Modeling: Ability to fine-tune or prompt-tune LLMs for summarization, classification, and change detection in regulations. Ability to develop embeddings for semantic similarity.
Knowledge Graph Engineering: Ability to design entity relationships (regulation, control, policy) and implement retrieval over Neo4j or similar graph DBs.
Information Retrieval (RAG): Ability to build RAG pipelines for natural language querying of regulations.
Annotation and Validation: Ability to annotate training data in collaboration with SMEs and to validate model outputs.
MLOps: Ability to build CI/CD for model retraining, versioning, and evaluation (precision, recall, BLEU, etc.)
API and Integration: Ability to expose ML models as REST APIs (FastAPI) for integration with the product frontend.

Languages: Python, SQL

AI/ML/NLP: Hugging Face Transformers, OpenAI API, spaCy, scikit-learn, LangChain, RAG, LLM prompt-tuning, LLM fine-tuning

Vector Search: Pinecone, Weaviate, FAISS

Data Engineering: Airflow, Kafka, OCR and PDF text extraction (Tesseract, pdfminer)

MLOps: MLflow, Docker