What is Data Science

  • Data science is a process, not an event: the process of using data to derive useful insights.
  • The work is carried out by professionals such as data analysts and data scientists.
  • Given a dataset, you must be able to ask questions of it and be prepared to answer them. Suppose, for example, that you have a large amount of census data.
  • What questions could you ask? How many people are married or have children? Which age groups are living abroad? What is the literacy rate?
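
The census questions above can be sketched in a few lines of Python. This is only an illustration: the records, the field names (`married`, `children`, `literate`, `abroad`) and the helper functions are all hypothetical, and a real census dataset would be far larger and would usually be loaded from a file.

```python
# Hypothetical census records; every value here is made up for illustration.
records = [
    {"age": 34, "married": True, "children": 2, "literate": True, "abroad": False},
    {"age": 22, "married": False, "children": 0, "literate": True, "abroad": True},
    {"age": 61, "married": True, "children": 3, "literate": False, "abroad": False},
    {"age": 27, "married": True, "children": 0, "literate": True, "abroad": True},
    {"age": 45, "married": False, "children": 1, "literate": True, "abroad": False},
]


def count_married(rows):
    """How many people are married?"""
    return sum(1 for r in rows if r["married"])


def count_with_children(rows):
    """How many people have at least one child?"""
    return sum(1 for r in rows if r["children"] > 0)


def literacy_rate(rows):
    """What fraction of people are literate?"""
    return sum(1 for r in rows if r["literate"]) / len(rows)


def abroad_by_age_group(rows, width=10):
    """Which age groups are living abroad? Counts people per age band."""
    groups = {}
    for r in rows:
        if r["abroad"]:
            start = (r["age"] // width) * width
            label = f"{start}-{start + width - 1}"
            groups[label] = groups.get(label, 0) + 1
    return groups


print("Married:", count_married(records))
print("With children:", count_with_children(records))
print("Literacy rate:", literacy_rate(records))
print("Abroad by age group:", abroad_by_age_group(records))
```

Each function corresponds to one question, which is the point of the exercise: the question comes first, and the code is only a way of answering it.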

What is data?

  • Data are the facts, values, figures, text, audio, and video that we generate in day-to-day life.
  • Much of it comes from our smartphones as photos, videos, and text messages. In raw form it has not yet been analyzed or used to solve business problems. For example, hours of video are uploaded to YouTube every second, and thousands of Facebook users post something every minute.
  • Information: the meaningful insights obtained from data after proper analysis.
flowchart LR
    A((Data)) --> B(Vs)
    B --> BA(Volume)
    B ---> BB(Velocity)
    B ----> BC(Veracity)
    B ---> BD(Value)
    B ----> BE(Variety)

RoadMap | Data Scientist

  • Data scientists are skilled professionals who solve real-world problems using data. They collect, analyze, manipulate, and visualize data, and extract useful information from it. A data scientist needs the following skills.
flowchart LR
    A((Data Scientist)) --> B(Skills)
    B --> BA(Statistics)
    B ---> BB(Python)
    B ----> BC(R)
    B ---> BD(Data Cleaning)
    B ----> BE(Preprocessing)
    B ----> BF(Visualization)
    B ----> BG(SQL)
    B ----> BH(Machine Learning)
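
To make a few of these skills concrete, here is a minimal sketch that touches data cleaning, statistics, and visualization using only the Python standard library. The raw values and the function names (`clean`, `summarize`, `text_bars`) are hypothetical and chosen for illustration; in practice, libraries such as Pandas and Matplotlib from the roadmap would usually do this work.

```python
import statistics

# Hypothetical raw readings as they might arrive from a source: strings,
# with a blank entry and a non-numeric entry mixed in.
raw = ["12.5", "14.0", "", "13.5", "n/a", "15.0"]


def clean(values):
    """Data cleaning: keep only the entries that parse as numbers."""
    cleaned = []
    for v in values:
        try:
            cleaned.append(float(v))
        except ValueError:
            continue  # skip blanks and non-numeric entries
    return cleaned


def summarize(values):
    """Statistics: basic descriptive summary of the cleaned values."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def text_bars(values):
    """Visualization: one crude text bar per value, one '#' per whole unit."""
    return [f"{v:5.1f} " + "#" * int(v) for v in values]


values = clean(raw)
print(summarize(values))
for line in text_bars(values):
    print(line)
```

The order of the steps matters more than the tools: clean first, then summarize, then look at the data.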

Data Science Beginners roadmap

flowchart LR
    A((Data Science)) --> B(Skills)
    B --> BA(Linear Algebra)
    B ---> BB(Basic Mathematics)
    B ----> BC(Probability)
    B ---> BD(Data Cleaning)
    B ----> BE(Preprocessing)
    B ----> BF(Visualization)
    B ----> BG(Machine Learning)
    B ----> BH(SQL)
    B ----> BI(MongoDB)
    B ----> BJ(Numpy)
    B ----> BK(Pandas)
    B ----> BL(Matplotlib)
    B ----> BM(Sklearn)
    B ----> BN(Tensorflow)
    B ----> BO(Bs4)
    B ----> BP(PowerBI)
    B ----> BQ(Tableau)
    B ----> BR(Big Data)
    B ----> BS(Hadoop)

Resources

A good command of Linux, Git, GitHub, Docker, and Kubernetes is also a plus. Some tips:

  • Strengthen your statistics.
  • Be good at programming. Python is recommended.
  • Register on Kaggle.
  • Read blogs on Towards Data Science or Medium.