Completed 1 Year as a Data Science Intern at Deloitte

Shubham Gautam
Apr 30, 2023

A very important phase of any Computer Science Engineer’s life is the final year of engineering, since how you use it shows whether you can apply what you have learnt in practice.

Well, being on the verge of finishing my final year, I can definitely say that I have been able to apply my skills to the work I have done over the year.

I have finally completed my 1-year Data Science Internship at Deloitte, and it has been an exhilarating and great learning experience that has definitely prepared me for what life holds in the coming years. Looking back at where my knowledge stood then and where it stands now, my ability to use Data Science tech stacks in real-life projects has grown enormously.

Note: This is the final blog of the series Data Science Internship at Deloitte. You can also read the other 2 blogs, which cover the basics of Automation in Auditing and the work I have done at Deloitte:

1st part -> https://shubhamgautam1211.medium.com/data-science-internship-at-deloitte-75e68b0bf21b

2nd part -> https://shubhamgautam1211.medium.com/completed-6-months-as-data-science-intern-at-deloitte-5e027f8690b4

If anyone reading this article is unaware of what Data Scientists actually do, here it is in a nutshell. The following are the tasks I got the opportunity to work on during my internship:

(1) Data Collection, sometimes by scraping websites

(2) Data Cleaning (most tiring but interesting task)

(3) Data Transformation by building Training Subsets

(4) Deriving Insights by Analyzing the Data

(5) Developing Algorithms to build Models, either for Forecasting Data or for Automating a Monotonous task

One point I especially want to emphasize is Data Cleaning. Everyone knows that an uncountable amount of data is generated every day, and since we at Deloitte get real-life data from huge clients, most of it is “Big Data”, which in layman's terms is data that cannot fit inside Excel. We work on Big Data on a daily basis at Deloitte, and cleaning it using Data Science tech stacks is one of the most rigorous tasks in the pipeline that we follow. There are so many issues in the data that we call it Dirty Data (it’s dirtier than you can imagine).

In order to automate the task of cleaning Big Data that cannot be opened in Excel, we were told to develop models in Python and the Spark Scala framework. With these tools, combining 100+ files took only a few minutes, which would not have been possible earlier with manual work.

Quick Pro Tip -> Files should only be combined after you standardize all of them into the same format, or the result can be wrong, so get the validation details of the dataset first before appending the files.
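To make the tip concrete, here is a minimal sketch of "standardize first, then append" using only Python's standard library. It is not the code we used at Deloitte; the function names, the column names (invoice_id, amount) and the two tiny in-memory "files" are all made up for illustration, and a real pipeline on Big Data would more likely use Spark or a dataframe library.

```python
import csv
import io


def normalize_header(name):
    """Lower-case a column name and replace spaces/hyphens with underscores."""
    return name.strip().lower().replace(" ", "_").replace("-", "_")


def combine_csv_texts(csv_texts, expected_columns):
    """Append rows from several CSV sources after validating their columns.

    csv_texts: iterable of (label, text) pairs, hypothetical stand-ins for files.
    expected_columns: the standardized column names every source must provide.
    """
    combined = []
    for label, text in csv_texts:
        reader = csv.reader(io.StringIO(text))
        header = [normalize_header(h) for h in next(reader)]
        # Validate before appending: a mismatched file is rejected, not merged.
        if sorted(header) != sorted(expected_columns):
            raise ValueError(
                f"{label}: columns {header} do not match {expected_columns}"
            )
        for row in reader:
            record = dict(zip(header, (cell.strip() for cell in row)))
            # Re-order each record so all sources share one column order.
            combined.append({col: record[col] for col in expected_columns})
    return combined


# Two made-up sources with different header spellings and column order.
sources = [
    ("file_a", "Invoice ID,Amount\n1001, 250.00\n"),
    ("file_b", "amount,invoice-id\n99.50,1002\n"),
]
rows = combine_csv_texts(sources, ["invoice_id", "amount"])
```

The important design choice is that validation fails loudly: a file with unexpected columns raises an error instead of being silently appended with values in the wrong columns.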

In my previous blog I described how effective Power BI is as visualization software. I also recently got the opportunity to do a project using Tableau.

Having worked on Tableau for some time now, I would definitely say that although its interface is a little complex, its efficacy and connectivity with Big Data are better than Power BI's, as it does not lag or hang midway, which is a small issue in Power BI. But if you want to perform Data Transformation steps along with some insightful visuals, Power BI is your go-to software, as Tableau does not give you much in the way of data cleaning or transformation options.

Lastly, one of my favorite and most complex tasks was measuring semantic similarity between 2 entities, which involved the use of a deep learning architecture. After analyzing and testing various models, my team finally decided to use the BERT model architecture, which provided reasonable accuracy, and then to research the model's hyperparameters and the fine-tuning of the activation functions further.
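For readers new to the idea, a common way to score semantic similarity is to turn each entity into an embedding vector and compare the vectors with cosine similarity. The sketch below shows only that comparison step. It is an assumption for illustration, not a description of how our team scored similarity, and the short vectors are made-up numbers rather than real BERT outputs (which typically have hundreds of dimensions).

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as not similar.
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up embeddings for three hypothetical entity names.
entity_a = [0.9, 0.1, 0.3]
entity_b = [0.8, 0.2, 0.4]   # points in nearly the same direction as entity_a
entity_c = [-0.1, 0.9, -0.5]  # points in a very different direction

score_ab = cosine_similarity(entity_a, entity_b)
score_ac = cosine_similarity(entity_a, entity_c)
```

A score close to 1 means the two vectors point in almost the same direction, so here score_ab comes out much higher than score_ac. In practice the threshold for calling two entities "the same" has to be tuned on labelled examples.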

My Professors at Bennett University have played a big role in my progress: not only did they teach the AI-ML subjects well, but they also gave me the belief and support to work on complex and challenging projects. Working at Deloitte in such a supportive team atmosphere has not only made my technical skills robust but has also improved my communication and soft skills, which I believe play a huge role in progressing in the corporate world.

PS: Learning never stops, will see you soon in the next blog!!