sharminshanta/data-science

data-science

Data Science E-Learning

What is data science?

Data science is a multidisciplinary field that combines statistics, computer science, and domain knowledge to extract meaningful insights from data. It involves collecting, analyzing, and interpreting large amounts of structured and unstructured data to make informed decisions, predict future trends, and solve complex problems.

Key components of data science:

Data Collection:

Gathering data from various sources, such as databases, sensors, or web scraping.

Data Cleaning:

Preprocessing data to remove errors, inconsistencies, or missing values.

Data Analysis:

Applying statistical techniques and algorithms to uncover patterns, relationships, and trends.

Machine Learning:

Developing models that can learn from data to make predictions or classifications.

Data Visualization:

Presenting the data and findings in graphical formats to make the results easier to understand.

Interpretation and Decision Making:

Drawing conclusions from the analysis and applying them to solve real-world problems.

Data science is used in many industries, such as finance, healthcare, marketing, and technology, to improve operations, forecast trends, and drive innovation.

What is data science used for?

Data science is used to study data in the following ways:

  1. Descriptive Analysis: what happened?
  2. Diagnostic Analysis: why did it happen?
  3. Predictive Analysis: what is likely to happen next?
  4. Prescriptive Analysis: what should be done about it?

What do we need for Data Science in Python?

Python is a programming language widely used by data scientists. Its libraries provide large collections of mathematical functions and analytical tools. In this tutorial, we will use the following libraries:

Pandas:

This library is used for structured data operations, such as importing CSV files, creating DataFrames, and preparing data.

NumPy:

This is a mathematical library that provides a powerful N-dimensional array object, linear algebra routines, the Fourier transform, and more.

Matplotlib:

This library is used to visualize data.

SciPy:

This library builds on NumPy and provides modules for scientific computing, including linear algebra.
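As a rough illustration of how the first two libraries are typically used, the sketch below loads a small CSV with Pandas, prepares it, and then solves a small linear system with NumPy. The CSV content, column names, and matrix values are made up for this example, and it assumes `pandas` and `numpy` are installed. Matplotlib and SciPy are left out to keep the sketch short.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical CSV content; in practice this would usually be a file path.
csv_text = """name,height_cm,weight_kg
Ana,160,55
Ben,175,
Cara,168,62
Ben,175,
"""

# Pandas: import CSV data into a DataFrame, then prepare it.
df = pd.read_csv(io.StringIO(csv_text))
df = df.drop_duplicates()  # remove the repeated "Ben" row
# Fill the missing weight with the column mean (one simple strategy among many).
df["weight_kg"] = df["weight_kg"].fillna(df["weight_kg"].mean())

# NumPy: N-dimensional arrays and linear algebra.
# Solve the system 2x + y = 3 and x + 3y = 5.
matrix = np.array([[2.0, 1.0], [1.0, 3.0]])
rhs = np.array([3.0, 5.0])
solution = np.linalg.solve(matrix, rhs)

print(df)
print(solution)
```

Filling missing values with the mean is only one possible choice. Depending on the data, dropping the row or using a different estimate may be more appropriate.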

What is the data science process?

The data science process is used to solve business problems. A data scientist works with business stakeholders to understand what the business needs. Once the problem has been defined, the data scientist may solve it using the OSEMN data science process:

O – Obtain Data

S – Scrub Data

E – Explore Data

M – Model Data

N – iNterpret Results
