Are you passionate about turning raw data into powerful business insights? At Vlex, we're looking for a **Junior Data Engineer** to help fuel our commercial and marketing strategies with high-quality, actionable datasets. In this dynamic role, you'll design and maintain automated data pipelines, apply web scraping techniques, and ensure clean, structured data flows from external sources into our internal systems. If you enjoy working with Python, solving technical challenges, and being part of a collaborative, forward-thinking team, we'd love to hear from you.

**Role Summary**

The Junior Data Engineer is responsible for building, maintaining, and optimizing automated data pipelines that provide clean, structured, and actionable datasets to support marketing and commercial initiatives. The role focuses on web scraping, data normalization, environment setup, and technical data handling to ensure a continuous flow of high-quality information from diverse external sources into our internal systems.

**Key Responsibilities**

**Automated Data Collection**

- Design and implement **automated scraping processes** for public websites (e.g., bar associations, universities, law firms) using **Python**, **Scrapy**, or APIs to generate structured contact and company databases.
- Scrape **web apps and event platforms** to capture attendee lists and participant data from global conferences and webinars.

**Scraping Environment Management**

- Set up and manage scraping environments, including **virtual machines (VMs)**, **proxy management**, and **IP rotation**, to optimize performance and avoid detection or throttling by target servers.

**Data Cleaning & Structuring**

- Transform raw scraped or imported data into usable formats by:
  - Cleaning and **normalizing datasets**
  - Generating **import-ready files** for tools like **HubSpot** or advertising platforms
- Clean and maintain existing **commercial databases**, ensuring data integrity, removing anomalies, and verifying completeness against internal standards.

**Data Extraction & Technical Collaboration**

- Run **SQL queries** to support structured data extraction, transformation, and analysis for internal consumption.
- Collaborate via **GitHub** to manage code versions, track changes, and stay aligned with other contributors.

**Required Technical Skills**

- **Python** (automation- and scraping-focused development)
- **Web scraping libraries** (Scrapy, BeautifulSoup, Selenium)
- **API consumption** and basic integration
- **Automation tools** and scripting best practices
- **Environment setup**: VMs, proxies, IP rotation strategies
- **Data structuring and cleaning** (with commercial use in mind)
- **SQL** (basic queries and data manipulation)
- **GitHub** (version control and collaboration workflows)
- **Familiarity with marketing tools** such as **HubSpot**
- **English C1**

This is a **full-time position (8:00 a.m. to 5:00 p.m.)** offered under a **permanent contract**, with **hybrid or remote work options**. **Salary is negotiable** based on experience and skills.

Job type: Full-time