NHRDN CAREER FEST 2017: THE FUTURE OF JOBS: CAREERS IN DATA SCIENCE

Economic transformation and employability are key factors in making India’s demographic dividend, its young workforce, a significant competitive advantage. In this context, the National HRD Network (NHRDN) Mumbai Chapter recently organised a Career Fest at the Nehru Centre in Mumbai. The aim was to provide a common platform to bring together industry, academia and the student community, with the intention of helping students make informed career choices.

Drawing on his rich experience in data science, Vivek Shrivastava, Executive Director in Data, PwC India, spoke on career opportunities as a data scientist. Vivek has over 20 years of consulting experience in the fields of Analytics, Big Data, Data Science, Business Intelligence (BI), Information Management (IM) and Enterprise Performance Management (EPM).

Vivek is currently responsible for defining the product/solution strategy, go-to-market initiatives and incubating differentiated solutions from an industry domain perspective. Corporate Citizen brings you excerpts from the development session, where Vivek shared some pertinent aspects of his profession and took the students through the life of a data scientist.

My journey

I am a very inquisitive person by nature; if something is happening and I don’t know about it, it is very difficult for me to resist. I completed my graduation in Information Technology (IT). At that time, 25 years back, IT was not what it is today, and people were hesitant to go into IT. I was also hesitant at first and didn’t want to join IT. However, my father said, ‘No, IT is a good field, you should go for it’. He had the vision to see the future, a little bit at least. He is an engineer, so he said, ‘Go ahead and become an engineer’. Therefore, I took up software engineering. It was a good combination of software, hardware and firmware. After that, I did my master’s from BITS Pilani. I had the option to change subjects but I didn’t, because I felt that four years of software engineering was not enough. When I started working, I was working on mainframe computers; you must have read about them in your first year of ‘Fundamentals of Computer Science’. I was working on JCL (Job Control Language) and COBOL (Common Business Oriented Language) programming. Very quickly, the Word software came in, and later it was completely taken over by RDBMS (Relational Database Management System). After that, client-server technology came in, then 3-tier and web-related architectures, and after this, data warehouses and data science all came into the picture.

August 1-15, 2017 / Corporate Citizen / 23

Why should you consider data science as your career choice?

Let’s take a look at the life of a data scientist. Who are data scientists? ‘Data scientists are big data wranglers’. This is the shortest definition I can think of. A slightly more descriptive version would be, ‘they take enormous amounts of messy data points, structured and unstructured, and use their skills in maths, statistics and programming to clean and organise the data’. If you are good at maths, then you should think of this field. If you are not good at maths but are good at programming, no worries, but you should improve your maths significantly. I was above average in maths and that has helped me to stay grounded in the fundamentals of data. That is why I was able to have a career that is so diverse across different kinds of technologies; I was grounded in how data works. Even today, whenever I go to meet a client, I am very humble and honest and say that I am still learning and still educating myself. That is one area of my interest and I am going to focus on it.

Data scientists apply analytical power in depth. From here, it starts getting slightly different. A lot of the students present here are from different fields like engineering, insurance, operational research, etc. That’s good, because this particular field requires you to stay with a certain sector for a long period. It is not like being a generic engineer, where you can keep moving from one project to another. You have to specialise in a sector or an industry. The reason is that the problems one tries to solve using data are deep-rooted in a sector or an industry. Problems do not occur because there is so much data. A lot of people say, ‘There is so much data, how can we handle this?’ That is not the case. There are businesses running, there are processes being followed, those processes have exceptions, and many things happen to these exceptions across all the ERP (Enterprise Resource Planning) systems of the world. They are generating these exceptions, and that is what is causing these problems. Every industry in this world has problems, and 80% of those problems can be solved using data science.

Why spend so much time on industry knowledge?

Whatever you do as a data scientist, you will do in a context. As I said earlier, you won’t be doing generic programming; there is a context that is linked to the industry. What you are trying to do is uncover hidden solutions to business challenges. There is a very well-defined business challenge. I am a practitioner. I have been a hardcore data-focused person, and for the last 12 years I have largely been doing management-related work, with huge teams working for me doing the actual technical work.

My role

If you define my role, it is very simple: I spend time understanding clients’ problems and helping them recognise that they have these problems. In my 23-24 years of career, I feel my biggest achievement is that I can explain very complex problems in very simple words. That’s the only achievement I have. My team does the rest. The reason I am telling you this is that it is not something you can get into straight out of college. It is something you have to cultivate in yourself as you progress through your career. You have to be conscious; you have to make the effort to understand what your course does not teach you and learn it yourself by enrolling in courses. For example, if people who are trained in banking and insurance want to become data scientists today, they have to fill their skill gaps in maths and statistics, and apart from that, they have to have a keen interest in being a data scientist.

Suppose I am working in an insurance firm and I have some modelling to be done. The question is: how does the insurance industry derive pricing based on the profile of the customer? That has nothing to do with technology; it has to do with the business itself. You have to go and sit with an actuary; you have to invest that time. Your job function may not require you to do that. If someone had guided me on this at the beginning of my career, life would have been easier for me. It was very difficult for me to do this, because I was coming from a technical background. It is only because of my interest and inquisitive nature that I do these things.

I have also dabbled in various industry areas. I cannot say whether that was for better or worse, because for a large part of my career data science was not at the forefront. There were other technologies. But if I were starting my career now, I would focus on one or two related industries and make myself an expert in them, so that I could solve almost all the problems in that industry using data science. That’s the mindset with which you have to come into the industry to attract employers like us.

I ask these questions in interviews. Last year I did 142 interviews over the course of three months for the post of data scientist, and I got only one data scientist; that candidate also left. The main reason why we rejected so many candidates is that they were not focused. At this age, if you are not focused, then it is not going to work out well. You should be clear about your goals. The candidate whom we selected came in with 10 years of experience and said, ‘I am also interested in analytics and I am currently pursuing my degree in chemical engineering’. He is a human; he is not a robot that can do multiple things in depth at the same time. Another reason I can think of for rejecting so many candidates was that a candidate would say, ‘I will do the glamorous part of creating a model and do the maths, but I will not touch the data. I need three analysts who are going to clean the data for me’. Suppose I give him the team; he will not be able to explain what he wants. If you need more people, that is okay; no one will tell you to do all the work by yourself.

There are deep problems in the industry. Anyone who has a little exposure to data warehousing and data technologies would know that at the core of all this is data modelling. Data modelling is an extremely business-focused activity. Data modelling that works in a particular field does not work in any other field. One cannot say, ‘I will take the processes that work in an insurance agency and use the same processes in the IT industry’. It will not work. They are not designed to work like that.

Data scientists in a nutshell

Data scientists are better at statistics than software engineers. On the other hand, they are better at programming than statisticians. It is a very long list of requirements to meet. No one can join this field unless they are deep into maths and statistics and are able to code as well.

A day in the life of a data scientist

I was very fond of war movies when I was young. The Germans in World War II would send cryptic messages. Like Sherlock Holmes, you would first have to clear the dust and then start to solve the problem. That’s exactly how it is: you would get a one-line problem statement based on a sector. It would be very difficult for you to understand. You have to brush many layers of the problem aside. Getting down to the problem statement is an uphill task. You should be able to conduct undirected research and frame open-ended industry questions. For example, if someone had not decided, ‘I will just try to link mobiles with taxis, nothing else’, we would not have seen Uber, and we know what has come out of it. This is the result of undirected research and open-ended industry questions. There are many questions that are not solved, and there are many industries making huge amounts of money. I have never had a CXO meeting where, once I started discussing the industry-related problems they have and told them that I had some ideas about them (I am not even talking about the solution), they did not give me some more time to work on it.

Anybody who knows a little about big data will know what I am talking about. In your daily lives, you can see how much data comes your way. Earlier, people used to read newspapers and spend half an hour or so on them. Today, I have a choice; I can filter the news the way I want. A tremendous amount of data is processed to do anything like this. Sophisticated analytical programs, machine learning, statistical models: all of these things are important for you to become an expert. The reason I say you must be an expert is that this is a field of expertise. It is like becoming an astronaut and going to the moon; these are the kinds of achievements involved. Be prepared to give your 200%. You should be prepared to dig deep and get the diamonds out, as it is said. There is very little predictability in this job. You cannot say that after some time you will build processes around it, put a bot in place and let it work. It will not happen that way. In fact, 88% of data scientists have a master’s degree and 46% have a PhD. Be prepared to educate yourself; it is a long path, but there is no other way.

What is the education required?

I don’t have a specific answer. Along with your graduation degree, I would recommend enrolling in certification courses. Start participating in hackathons and similar events. Boot camps will surely help you. It is wonderful if you are able to do that. I feel that this generation is far more ready to leave the old protocols, jump into new and difficult technologies, and grasp them efficiently.

By Vineet K
