Data Science & Engineering Foundations

Early Career · Multiple Organizations

2015 – 2021

Overview

Early career roles across insurance and pharmaceuticals, specialising in software development. These foundational experiences built expertise in analytics, statistics, and software quality assurance, establishing the technical and professional skills that would later enable specialisation in data science and cloud engineering.

Roles & Responsibilities

Software Tester · TCP Lifesystems

June 2015 – August 2018 (Part-Time) · Bromley, Surrey

I worked three days a week part-time during my A-levels and continued during all university holidays. The role was SQL-heavy software testing: I would receive a product deck, typically covering either new products and functionality or actuarial pricing changes, and had to ensure it was thoroughly tested. I was responsible for designing and creating a test suite in Selenium that ran through the scenarios in the software. The results were then verified using SQL in the back-end to ensure the changes worked as expected. These tests were then added to our regression testing suite, and SQL summaries were added to the daily reports.
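
As an illustration of the back-end verification step described above, here is a minimal Python sketch. It is not the original test code: the `policies` table, its columns, the `premium_matches` helper, and the sample values are all hypothetical, and an in-memory SQLite database stands in for the real test environment.

```python
import sqlite3


def premium_matches(conn, policy_id, expected_premium, tolerance=0.005):
    """Return True if the stored premium is within `tolerance` of the expected value."""
    row = conn.execute(
        "SELECT monthly_premium FROM policies WHERE policy_id = ?",
        (policy_id,),
    ).fetchone()
    if row is None:
        # The UI scenario never wrote a policy to the database.
        return False
    return abs(row[0] - expected_premium) <= tolerance


# Stand-in for the test environment's database (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE policies (policy_id TEXT PRIMARY KEY, monthly_premium REAL)"
)
conn.execute("INSERT INTO policies VALUES ('P-001', 42.50)")
conn.commit()

print(premium_matches(conn, "P-001", 42.50))  # True: stored value matches
print(premium_matches(conn, "P-002", 10.00))  # False: no such policy
```

In a workflow like the one described, a check of this shape would run after the Selenium scenario completes, so that the UI result and the stored data are validated together.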

I also worked on defects found in production, which typically involved SQL debugging and then recreating the issue in test environments to ensure things worked as expected.

  • Manual and automated software testing
  • Bug tracking and defect management
  • Test case development and execution
  • Quality assurance process improvement

Clinical Statistician and Data Scientist · GlaxoSmithKline

September 2018 – August 2019 · Uxbridge, Middlesex

I undertook a placement year at GSK, working in a hybrid role across two teams: the statistical data sciences team and the clinical research team. I applied statistical methods to clinical trial data, supporting pharmaceutical research and regulatory submissions, and developed expertise in statistical analysis, experimental design, and regulatory compliance requirements. One of these requirements was that, even if just one dataset or output changed, all submission documentation had to be updated to reflect the latest versions and timestamps. This meant a lot of manual work copying and pasting outputs into PowerPoint for submission. To reduce this, I developed an automated PowerPoint generator in R that built decks for regulatory submission directly from the underlying datasets, timestamping everything as required to meet regulatory criteria. This enabled people to define the outputs they required once and then regenerate the PowerPoint in seconds.
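
The "define once, regenerate in seconds" idea can be sketched as follows. This is an illustrative Python sketch only: the original tool was written in R and produced PowerPoint files, whereas this version uses the standard library alone and stops at building a timestamped manifest of slides. The `OutputSpec` class, the `build_manifest` function, and the sample file name are hypothetical.

```python
import tempfile
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass(frozen=True)
class OutputSpec:
    """One requested slide: a title and the dataset or output file behind it."""
    title: str
    source: Path


def build_manifest(specs):
    """Return one entry per slide, stamped with the source file's last-modified time."""
    manifest = []
    for spec in specs:
        modified = datetime.fromtimestamp(
            spec.source.stat().st_mtime, tz=timezone.utc
        )
        manifest.append(
            {
                "title": spec.title,
                "source": spec.source.name,
                "source_modified_utc": modified.isoformat(timespec="seconds"),
            }
        )
    return manifest


# Demo with a throwaway file standing in for a real output (hypothetical name).
with tempfile.TemporaryDirectory() as tmp:
    dataset = Path(tmp) / "summary_table.csv"
    dataset.write_text("column_a,column_b\n1,2\n")
    for entry in build_manifest([OutputSpec("Summary table", dataset)]):
        print(entry)
```

Because the specs are declared once and the timestamps are read from the files at build time, re-running the build after any dataset changes picks up the latest versions without manual edits.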

  • Statistical analysis of clinical trial data
  • Experimental design and methodology development
  • Regulatory documentation and reporting
  • Collaboration with clinical research teams

Data Analyst · iPipeline

June 2020 – July 2021 · Bromley, Surrey

Covid had just hit, and insurance companies wanted a way to act on their data and make decisions more quickly. What used to be a safe profession, such as working in a supermarket, was now deemed a high-risk profession. The typical rate-changing process could take anywhere from 3-6 weeks and was often based on an incomplete view of current risk. With the unknowns at the time, companies were keen to control their exposure to certain medical conditions that previously would not have concerned them. This led me to work with a client's underwriting team, where we built an optimised reporting database and accompanying Power BI dashboards. These gave underwriters a real-time view of their risk and the insight to iterate on rate changes in hours as opposed to weeks.

Underwriting typically relies on underwriting rules: deterministic rules that classify a policy into one of a number of categories. One of these categories is "referred to underwriting", which typically requires two underwriters to agree on a rating decision before the offer is sent to the policy holder. This involved a lot of repeated work for underwriters and took up a lot of time. We built a simple XGBoost model that took in all of the applicant's answers and predicted a rating. If the rating provided by the underwriter matched the prediction, the second underwriter's confirmation step was skipped and the quote was sent immediately. This reduced the drop-off we saw at the underwriting stage, which mattered because it was a competitive time for insurance: Covid saw a surge in people taking out life and income protection insurance policies.
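
The agreement rule described above can be sketched in a few lines of Python. This is a simplified illustration and not the production logic: `needs_second_review`, the rating labels, and `stand_in_model` are hypothetical, and the stand-in function replaces the trained XGBoost model so the example runs without any dependencies.

```python
from typing import Callable, Mapping


def needs_second_review(
    answers: Mapping[str, object],
    underwriter_rating: str,
    predict_rating: Callable[[Mapping[str, object]], str],
) -> bool:
    """Return True when the model and the first underwriter disagree.

    Agreement means the second underwriter's confirmation can be skipped
    and the quote sent straight away.
    """
    return predict_rating(answers) != underwriter_rating


def stand_in_model(answers: Mapping[str, object]) -> str:
    # Hypothetical rule standing in for the trained model, for illustration only.
    return "loaded" if answers.get("smoker") else "standard"


print(needs_second_review({"smoker": False}, "standard", stand_in_model))  # False
print(needs_second_review({"smoker": True}, "standard", stand_in_model))   # True
```

Keeping the model behind a plain callable means the routing rule can be tested on its own, independently of how the model is trained or served.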

  • Data analysis and reporting for insurance products
  • Database design and SQL query development
  • Dashboard creation and data visualisation
  • Collaboration with business stakeholders

Foundation for Future Growth

These early career experiences across insurance, pharmaceuticals, and software development provided diverse perspectives on data analysis, quality assurance, and technical problem-solving. The combination of statistical rigour from pharmaceutical research, analytical skills from insurance data analysis, and quality focus from software testing created a strong foundation for advanced work in machine learning and cloud engineering.

The cross-industry experience developed adaptability, communication skills, and the ability to translate technical concepts for diverse audiences—skills that continue to be valuable in consulting and client-facing roles.

Skills Developed

Analytics

  • Data analysis and interpretation
  • Statistical methods
  • Reporting and visualisation
  • Business intelligence

Statistics

  • XGBoost model development
  • Experimental design
  • Hypothesis testing
  • Clinical statistics

Software Quality

  • Test planning and execution
  • Regression testing
  • Quality assurance
  • Bug tracking
  • Process improvement

Technologies & Tools

SQL · R · Python · Excel · Selenium · JIRA