Intern - Data Engineering

Location: Bangalore

Type: Internship

Number of positions available: Two

Duration: Approximately three months between May and August 2025. The period may be extended based on a candidate’s performance and/or interest.

About the Role

ARTPARK’s One-Health team tackles interconnected challenges in human, animal, and environmental health through collaborative and interdisciplinary efforts.

Working with city, state, and national governments, we support data-driven public health responses to endemic, epidemic, and climate-related threats through innovative solutions leveraging statistical and AI/ML-based approaches. 

In this role, you will have the opportunity to engage with leading experts in disease modelling, climate-health systems, engineering, and public health, both nationally and internationally, in a dynamic and highly motivated environment.

Key Responsibilities:

  1. Data extraction and cataloguing: 

    1. Extract climate, image, or epidemiological data from diverse sources. The data may be in a spatio-temporal, complex, semi-structured or unstructured format. 

    2. Integrate data into a coherent, harmonised format ready for use by advanced computational models. 

  2. Data and model pipeline development: 

    1. Develop and automate a robust, scalable data pipeline. 

    2. Develop data access mechanisms and policies.

    3. Ensure streamlined and reliable data flow. 

    4. Enable computational and simulation modellers to access and use the data in their models seamlessly, without manual intervention, and facilitate real-time processing.

  3. Exhaustive cataloguing and documentation for data and modelling work: 

    1. Catalogue details of data sources, extraction processes, and any standards and processes used.

    2. Document all data analyses, model details, results, plots, evaluations, insights, etc.

  4. Work with and support data analysts, data scientists, and/or computational epidemiologists.

  5. Leverage state-of-the-art techniques (AI, ML, LLMs, etc.) for production-grade data extraction and models.

Experience & Skills:

  1. The candidate should be pursuing a bachelor’s or master’s degree in computer science, engineering, mathematics, or a related quantitative scientific discipline.

  2. Experience with and understanding of programming, preferably in Python, is required. A candidate who is proficient in programming but unfamiliar with Python will be expected to ramp up on Python quickly.

  3. Experience with processing raw datasets into clean, structured formats is highly desirable. The candidate may have gained such experience via coursework or projects.

  4. Attention to detail while working on data and/or models is highly desirable.

  5. Experience with AI/ML techniques is desirable but not required.

  6. Experience with AWS, GitHub, databases, etc. is desirable.


About ARTPARK:

ARTPARK at IISc drives impact through innovations in AI & Robotics, by harnessing the best of research/academia, startups/industry, and government/nonprofits.

Our pioneering platform initiatives in language data & AI and health data & AI are driving national-scale impact with stakeholders such as MeitY’s Bhashini, Office of PSA, ICMR, States and Cities. At ARTPARK, you will work with the best researchers in the country and around the world in a strong data-driven environment and have the opportunity to address systemic issues and implement solutions.

These platforms are in pursuit of our vision – AI for All.

