Python Developer - Web Scraping

It's not just about what we do,
it's about who we are and how we do it

Join us as our next Python Developer! If you have the skills we need, send your resume to careers@neolumin.com.

Job Description:

We are seeking a Python Developer - Web Scraping to help us extract paginated data from websites that do not offer APIs. The role involves writing scripts that scrape data efficiently without overloading the target websites, and ensuring the integrity of the data collected.

 

Responsibilities:

  • Develop and maintain web scraping scripts to extract data from specific websites.

  • Handle pagination and scrape data across multiple pages.

  • Use tools such as BeautifulSoup, Scrapy, or Selenium to parse HTML and extract required data.

  • Ensure respectful scraping practices, including implementing rate limiting and obeying robots.txt rules.

  • Store scraped data in organized formats (CSV, JSON, etc.) or databases.

  • Monitor scraping processes and handle errors such as IP blocking or failed requests.

  • Collaborate with the team to understand data requirements and ensure the scraped data is accurate.


Core Skills:

  • Proficiency in Python, particularly for web scraping.

  • Familiarity with scraping libraries and tools such as BeautifulSoup, Scrapy, Selenium, and Requests.

  • Understanding of HTML, CSS, and how to navigate through web page structures.

  • Ability to handle pagination and scrape data efficiently from multi-page websites.

  • Knowledge of ethical scraping practices, including handling rate limits and avoiding overloading websites.

  • Familiarity with storing and processing data using formats like CSV or JSON.

  • Ability to work independently and manage your own time effectively.

 

Desirable Skills:

  • Familiarity with JavaScript, for scraping dynamically rendered content.

  • Experience using proxies or handling CAPTCHAs when required.

  • Knowledge of pandas or similar libraries for processing and cleaning data.

 

 
