Job Description
We are looking for an Associate Data Engineer to join our IT Data Strategy department. This is a hybrid role based in Bangalore, Pune, or Mohali, reporting to the Principal Data Engineer.
You will be a key member of our Enterprise AI Data Platform team, designing robust unstructured data pipelines for processing and ingestion into Vector and Graph databases. You will collaborate cross-functionally to shape our data infrastructure and drive the technical evolution of our data capabilities.
What You’ll Do (Role Expectations)
- Collaborate with Data & Technical architects, integration, and engineering teams to capture data pipeline requirements and develop technical solutions
- Profile and quantify the quality of data sources while building data pipelines for integration into Vector DB, Graph DB, and the Snowflake Enterprise Warehouse
- Partner with the Data Platform Lead to design and implement data management standards and best practices
- Build in-house products that drive scalability and efficiency across the company to enable growth and operational excellence
- Develop large-scale and mission-critical data pipelines using modern cloud and big data architectures while continuously learning next-generation technologies
Who You Are (Success Profile)
- You are a problem-solver who seeks out challenges because you are energized by finding solutions, knowing that solving the hard problems delivers the biggest impact.
- You are a learner with a true growth mindset and never stop developing yourself, actively seeking feedback to become a better partner and a stronger teammate.
- You are a positive force who approaches hard problems with constructive energy and a ‘can-do’ spirit that inspires your team to stay focused on the solution.
- You are driven by innovation and have a deep curiosity for how things work, believing in the power of technology to accelerate transformation in secure and scalable ways.
- You are a pragmatic builder obsessed with creating, iterating, and shipping, balancing technical excellence with the need to deliver value to users quickly.
Minimum Qualifications
- Foundational knowledge of DBMS concepts including normalization, denormalization, relational models, ACID principles, and transactions
- Understanding of distributed data processing frameworks such as Apache Spark, Hadoop, or Apache Flink
- Proficiency in Python
- Experience with SQL and with scripting or programming languages such as Korn Shell or Scala
- Hands-on experience with Python orchestration frameworks like Airflow, Prefect, or Dagster, or specialized AI tools like LangChain and AutoGen
Preferred Qualifications
- Experience with Ray or Dask for scalable Python applications within AI/ML contexts
- Familiarity with Model Context Protocol (MCP) for agent context management
- Knowledge of Streamlit for building interactive data applications and dashboards






