My name is Sakshi, and I am currently pursuing my second master’s degree at Dalhousie University. I am in the final year of the Master of Information program, which I love because it combines management and technology. Before this, I completed a master’s in business administration, specializing in finance and marketing, at Devi Ahilya University in India.
This past summer, I worked as a machine learning and artificial intelligence intern with DeepSense as part of their Data Readiness Program, and it has been an amazing step in my career development. DeepSense helped me schedule an interview with Global Spatial Technology Solutions (GSTS), a company working in the ocean sector, and I was offered the internship.
I found the project very interesting: predicting the estimated time of arrival (ETA) of vessels while taking environmental conditions (weather variables) into account.
Initially, I spent time learning the basic terminology, as everything was very new to me. Both of my supervisors at GSTS were very helpful and generous. For the first few weeks, I did exploratory data analysis (EDA) on the training data set using RStudio, and then worked my way up to data modeling.
The main challenges I faced were cleaning the data and getting to know it well. I believe that whatever position you hold in the data field, you will always have to do some data cleaning, because you might find outliers or need to change the way the data arrives from the source.
The project involved a lot of trial and error as I learned which preprocessing steps were needed to clean the data and remove outliers. For example, while plotting the linear model, I found that some vessels had a time of arrival of 30 days, and some had a negative time of arrival, which meant that the vessel was recorded as having already arrived at the port when it actually had not. Some vessels do this to reserve their spot in the port so that they don’t have to wait when they arrive. These records turned out to be outliers, so they had to be removed from the data set before the analysis could move forward.
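The outlier removal described above can be sketched in a few lines. This is only an illustration in Python, not the R code used on the project: the `VesselRecord` type, the `eta_days` field, the 30-day cutoff as a parameter, and the sample values are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class VesselRecord:
    # Hypothetical record layout; the real data set's columns are not known.
    vessel_id: str
    eta_days: float  # reported time until arrival, in days


def remove_eta_outliers(records, max_eta_days=30.0):
    """Keep records whose ETA is non-negative and below max_eta_days.

    The cutoff is an assumed parameter, chosen here so that the
    30-day values mentioned above are dropped.
    """
    return [r for r in records if 0.0 <= r.eta_days < max_eta_days]


# Made-up sample data for illustration only.
records = [
    VesselRecord("A", 2.5),
    VesselRecord("B", -1.0),   # negative ETA: recorded as arrived, but not in port
    VesselRecord("C", 30.0),   # implausibly long ETA
    VesselRecord("D", 0.75),
]

cleaned = remove_eta_outliers(records)
print([r.vessel_id for r in cleaned])
```

In practice, the right cutoff depends on the routes in the data, so it is worth checking the distribution of ETA values before fixing a threshold.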
Overall, I learned that it is always important to go through your data first and understand what the variables represent and the scale on which each one is measured, so that the data can be normalized before analysis. Exploratory data analysis (EDA) is an important step that is required before applying a machine learning model.
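As a small illustration of why scale matters, here is a sketch of min-max normalization, one common way to bring variables measured in different units onto the same 0 to 1 range. It is written in Python as an assumption-laden example: the variable names and values are hypothetical, and the project may well have used a different normalization method.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    smallest, largest = min(values), max(values)
    if largest == smallest:
        # A constant column carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - smallest) / (largest - smallest) for v in values]


# Hypothetical weather variables on very different scales.
wind_speed_knots = [5.0, 10.0, 25.0]
wave_height_m = [0.5, 1.5, 2.5]

scaled_wind = min_max_scale(wind_speed_knots)
scaled_wave = min_max_scale(wave_height_m)
print(scaled_wind)
print(scaled_wave)
```

After scaling, both variables span the same range, so neither dominates a model simply because its raw numbers are larger.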
GSTS has offered me a part-time position alongside my studies so that I can continue working on the project and move on to new projects once it is complete. I am excited about the new role, and I know my career will keep growing thanks to this opportunity.