Job Title: Data Engineer
Reports To: Head of the Data Department
Employment Type: Full-Time, Permanent, Remote
Netstock’s leading business intelligence products are underpinned by cutting-edge expertise in data, optimisation metrics, and phenomenal software development. Our backers are expressly investing in R&D for product innovation to drive revenue growth of 350% over the next 3 years. Our next innovation frontier is leveraging the 30+ TB of customer data under our management to develop disruptive products that none of our competitors can match. To help get us there, and to ensure that we engineer data pipelines and workflows that can support the planned scaling to 130 TB of data spread across 10,000 customer databases, we are looking for a data engineer of exceptional talent, vision, energy and ingenuity.
Responsibilities
- Technology Research and Testing: Investigate the most suitable technologies to build out a new data lake environment. Assess the pros and cons of different solutions and recommend the most efficient, cost-effective and scalable options.
- Infrastructure Liaising: Collaborate closely with the infrastructure team to communicate necessary infrastructure requirements for building and maintaining the data lake. Coordinate efforts and ensure all infrastructure needs are met.
- Data Lake Development: Design, construct, and deploy the data lake. Ensure it is built with scalability and security in mind to accommodate future data growth and needs.
- Data Pipeline Construction: Develop robust and efficient data pipelines for the ingestion, transformation, and loading of data into the data lake. This includes building pipelines from external sources (APIs) for internal analytics and benchmarking. Ensure the pipelines are optimized for both real-time and batch processing, automating as many data processes and workflows as possible.
- Data Management: Manage both internal data and customer data. This involves cataloging, storing, retrieving, and maintaining the data in a manner that facilitates easy access and understanding. Data quality and integrity are critical to ensuring that every data process runs smoothly, is optimized, and produces output that can be trusted.
- Internal Data Provision & Reporting: Engineer the internal data to create useful and insightful reports, as well as consumable datasets for both technical and non-technical internal stakeholders. Ensure data integrity and accuracy of these datasets/reports.
- Customer Data Engineering: Work with customer data for benchmarking and future AI analysis. This includes data cleaning, standardization, transformation, and loading into the appropriate databases or platforms.
- Data Security and Compliance: Uphold data privacy and security standards, and ensure that all data handling practices are compliant with relevant laws & regulations, as well as ISO27001 standards.
- Collaboration with Data Analyst: Work closely with the data analyst to understand data needs and requirements. Make sure the data is structured and cleaned in a way that allows the data analyst to effectively conduct their analysis.
- Reporting to Head of Department: Regularly report progress, challenges, and successes to the head of the data department. Maintain open lines of communication to ensure alignment with broader strategic goals.
- Future-proofing: Stay up-to-date with the latest trends, technologies, and best practices in data engineering to ensure the data infrastructure is scalable and capable of accommodating future growth.
Skills and Qualifications
- Grade 12 or equivalent (essential)
- Related tertiary qualification in Computer Science, Data Science, or a related field (desired)
- Experience in a related field e.g. DevOps, Software Development, Data Engineering (essential)
- Experience with SQL and NoSQL databases such as PostgreSQL, MySQL, MongoDB (essential)
- Proficiency in big data distributed computing frameworks like Hadoop, Spark, or Hive
- Proficiency in Python, Java, Scala, or other data-oriented programming languages (essential)
- Knowledge of data pipeline tools such as Kafka, Airflow, or NiFi (essential)
- Knowledge of cloud technologies like AWS, Google Cloud Platform (essential)
- Knowledge of data modeling and data architecture standards
- Conceptual understanding of machine learning algorithms and knowledge of deploying the models into production
- Knowledge of securing data and data privacy principles
- Familiarity with Agile methodologies
- Knowledge of ETL/ELT processes and tools
- Familiarity with Linux or Unix-based systems (desired)
Technology Stack Knowledge
- SaltStack (can learn on the job)
- Experience with a BI tool required (Power BI, Google Analytics, etc.)
- Experience with an ETL tool required (Kafka, Airflow, NiFi, etc.)
- MySQL/Percona (must know)
- PostgreSQL (must know)
- DuckDB, Apache Airflow (can learn on the job)
- Messaging queues (RabbitMQ)
- Google Workspace (can learn on the job)
Competencies
- Scripting experience: Bash, Python, etc.
- Ability to manage internal and external projects from inception to completion
- Ability to manage and support the data needs of multiple teams, stakeholders and products
- Attention to detail, ability to think logically and demonstrate strong analytical and problem solving skills
- Good judgment is essential
- Extensive technical knowledge across various technologies
- Excellent oral and written communication skills
- Good understanding and familiarity with cloud infrastructure
- Experience with database management (MySQL)
- Experience with logging and security hardening; system documentation and design
- Deep understanding of security and privacy by design principles
- Good understanding of Data Engineering principles, processes and tools
- Ability to build staging and development environments when needed
Circumstances
- Flexibility to travel within South Africa from time-to-time for team or company get-togethers.
- Netstock will provide the hardware necessary to perform this role (including UPS power).
- As this is a remote role, you’ll need access to stable, secure, high-speed fixed-line internet connectivity (Netstock provides a subsidy towards internet subscription costs).
Before applying, please make sure you read the Netstock Candidate Privacy Policy, referenced below the Privacy Policy on our website.
This position is subject to pre-employment screening; however, candidates will not be unfairly discriminated against.
We receive a high number of applications per role and therefore ONLY successful applicants will be contacted.
This role is open to residents of the Republic of South Africa. Although we may consider candidates with permanent residency, preference will be given to citizens of the Republic of South Africa.
Working with us
Netstock was founded with a clear vision: To give the hungry up-and-comers the capability to level the playing field and compete with the industry giants. Working here means embracing that “challenger” mentality: We are smart, scrappy fighters, building our edge with the agility to move faster than the big guys, pioneering smarter ways to work and innovating new ways to deliver powerfully easy-to-use technologies for our customers.
About us
Netstock is the driving force accelerating the growth of organizations worldwide. Over the last 15 years, we’ve built out a regional presence that gives us deep insights into supply chain planning factors in each industry. We continue to enhance our supply chain planning solutions, making our predictive engine smarter, accelerating automation, and adding sophisticated new capabilities such as AI and machine learning.
You can read more about Netstock’s history and our product offering at Netstock