
Python-PySpark Developer

Infosys

Full-time, On-site

Location: Hyderabad, India
Type: Full-time
Posted: 3d ago

We are looking for an experienced Python PySpark Developer to design, develop, and optimize large-scale data processing systems. The ideal candidate will work on big data platforms, build scalable ETL pipelines, and process high-volume datasets using Spark and Python.

Key Responsibilities

Data Engineering & Development

Develop and maintain data pipelines using Python and PySpark
Process and transform large datasets in distributed environments
Build scalable ETL/ELT workflows

Big Data Processing

Work with Apache Spark (PySpark) for batch and real-time processing
Optimize Spark jobs for performance and efficiency
Handle structured and unstructured data

Data Integration

Ingest data from multiple sources:

Databases (SQL/NoSQL)
APIs
Files (CSV, JSON, Parquet)

Integrate with data platforms like:

Hadoop (HDFS)
Cloud (AWS, Azure, GCP)

Performance Optimization

Tune Spark jobs (partitioning, caching, parallelism)
Optimize SQL queries and transformations
Improve data processing efficiency and cost

Collaboration & Support

Work with data engineers, data scientists, and analysts
Translate business requirements into technical solutions
Participate in code reviews and agile development practices

Monitoring & Troubleshooting

Debug and resolve issues in data pipelines
Monitor job execution and data quality
Ensure reliability and availability of data workflows
