Intro to Data Analysis – Data Reading

This content originally appeared on DEV Community and was authored by Ondiek Elijah

With today's technological advances, data has become one of the most important assets for institutions, organizations, and other entities. As a result, there is a pressing need to put the available data to use.

Data analytics focuses on processing existing datasets and performing statistical analysis on them. It emphasizes developing techniques to capture and organize data, uncovering actionable insights for current problems, and determining the best way to communicate the results.

Data analysis is a subset of data analytics that businesses use to examine data and draw conclusions. Its steps are data gathering, data cleaning, data analysis, and data interpretation, which together help you understand what your data is telling you.

As an introduction to data analysis, this post shows how to read data that comes in various formats, such as CSV, JSON, or a database.

Table of Contents

  1. Data from a CSV file
  2. Data in SQL flavour
  3. JSON files

Reading data from a CSV file

To read data from a comma-separated values (CSV) file into a DataFrame, we use the pandas.read_csv function.

The read_csv function accepts numerous parameters; which ones you need depends on the nature of your dataset and on your goal.
Besides the mandatory filepath_or_buffer, the most frequently used parameters include sep, delimiter (an alias for sep), header, and index_col.
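
As a minimal sketch of how a couple of these parameters behave, the snippet below parses a small in-memory CSV instead of a file on disk. The sample rows and column names are made up for illustration; only read_csv and its header and index_col parameters come from pandas.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk.
csv_text = "name,calories,rating\nAll-Bran,70,59.4\nCheerios,110,50.8\n"

# header=0 treats the first line as the column names.
# index_col="name" uses that column as the row labels instead of 0, 1, 2, ...
cereals = pd.read_csv(io.StringIO(csv_text), header=0, index_col="name")

# Rows can now be looked up by cereal name.
print(cereals.loc["Cheerios", "calories"])
```

Setting index_col is optional; without it, pandas assigns a default integer index and keeps name as an ordinary column.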

Read comma separated file

The sep parameter, short for separator, tells pandas which character separates the values in our CSV file. If sep is not given, pandas assumes that the delimiter is a comma.

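# pyforest lazily imports common data science libraries, including pandas as pd
from pyforest import *
# the default separator is a comma, so sep can be omitted
df = pd.read_csv("cereal.csv")
# return the first 5 rows of the dataframe
df.head()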

Read tab separated file

from pyforest import *
df = pd.read_csv("cereal_tab.csv", sep='\t')
df.head()

Read semicolon separated file

from pyforest import *
df = pd.read_csv("cereal_semicolon.csv", sep=';')
df.head()
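
To see why the separator matters, here is a small sketch that parses the same made-up table with three different delimiters and then shows what happens when the wrong one is assumed. The rows and the parse_with helper are hypothetical and exist only for this illustration.

```python
import io

import pandas as pd

# Hypothetical rows; the first tuple is the header line.
rows = [("name", "calories"), ("All-Bran", "70"), ("Cheerios", "110")]


def parse_with(sep):
    """Join the rows with the given separator and parse them back with pandas."""
    text = "\n".join(sep.join(row) for row in rows)
    return pd.read_csv(io.StringIO(text), sep=sep)


comma_df = parse_with(",")
tab_df = parse_with("\t")
semicolon_df = parse_with(";")

# The matching separator gives the same DataFrame every time.
print(comma_df.equals(tab_df) and comma_df.equals(semicolon_df))

# Reading semicolon-separated text with the default comma separator
# leaves everything in a single column.
wrong_df = pd.read_csv(io.StringIO("name;calories\nAll-Bran;70\n"))
print(list(wrong_df.columns))
```

A single unexpected column like this is usually the first sign that sep does not match the file.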

Reading Data in SQL flavour

This section covers reading data from several SQL relational databases using pandas.

MySQL database

from pyforest import *
from sqlalchemy import create_engine

# provide a connection string/URL
db_connection_str = "mysql+mysqlconnector://mysql_username:mysql_user_password@localhost/mysql_db_name"
# produce an Engine object based on a URL
db_connection = create_engine(db_connection_str)
# read SQL query or database table into a DataFrame.
df = pd.read_sql('SELECT * FROM table_name', con=db_connection)
# return the first 5 rows of the dataframe
df.head()

Source — Stack Overflow
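
One practical caveat with connection URLs: a password containing characters such as @ or / can be misread as part of the URL structure. A common way around this is to URL-encode the password before placing it in the string. The sketch below uses only the standard library; the username, password, and database name are placeholders, not real credentials.

```python
from urllib.parse import quote_plus

# Placeholder credentials for illustration only.
username = "mysql_username"
password = "p@ss/word"  # contains characters that are special in URLs

# quote_plus percent-encodes the special characters in the password.
safe_password = quote_plus(password)

db_connection_str = (
    f"mysql+mysqlconnector://{username}:{safe_password}@localhost/mysql_db_name"
)
print(db_connection_str)
```

The resulting string can then be passed to create_engine in the same way as the connection string above.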

PostgreSQL

from pyforest import *
from sqlalchemy import create_engine
# produce an Engine object based on a postgresql database URL
engine = create_engine("postgresql:///psql_dbname")
# read SQL query or database table into a DataFrame.
df = pd.read_sql('select * from "user"',con=engine)
# return the first 5 rows of the dataframe
df.head()

SQLite database

from pyforest import *
from sqlalchemy import create_engine

# connect to a database
engine = create_engine("sqlite:///database.db")
# read database data into a pandas DataFrame
df = pd.read_sql('select * from user', engine)
# return the first 5 rows of the dataframe
df.head()
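
If you want to try the SQL workflow without setting up a database server, the following sketch builds a throwaway SQLite database in memory with Python's built-in sqlite3 module and reads it back with pd.read_sql. The table contents are invented for the example, and it passes a plain sqlite3 connection rather than an SQLAlchemy engine, which pandas also accepts for SQLite.

```python
import sqlite3

import pandas as pd

# Create a temporary in-memory database with a small, made-up user table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO user (name, age) VALUES (?, ?)",
    [("Ada", 36), ("Linus", 28), ("Grace", 45)],
)
conn.commit()

# Use a query parameter (?) instead of formatting values into the SQL string.
adults = pd.read_sql(
    "SELECT name, age FROM user WHERE age > ? ORDER BY age",
    conn,
    params=(30,),
)
conn.close()

print(adults)
```

Passing values through params keeps user input out of the SQL text itself, which is a safer habit than building queries with string concatenation.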

Reading data from JSON files

Reading data from a JSON file is as simple as reading data from a CSV file. The pandas.read_json function converts JSON into a pandas object (a Series or DataFrame). Its first parameter, path_or_buf, must be a valid JSON str, path object, or file-like object; recent pandas releases deprecate passing a literal JSON string directly, so wrap such strings in io.StringIO. The function also accepts a number of optional parameters.

from pyforest import *
df = pd.read_json('cereal_default.json')
df.head()
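
For a quick experiment without a JSON file on disk, the sketch below feeds read_json a small in-memory document. The records are made up, and the orient="records" argument is an assumption that matches this particular layout (a list of objects, one per row).

```python
import io

import pandas as pd

# Hypothetical JSON content: a list of records, one object per row.
json_text = (
    '[{"name": "All-Bran", "calories": 70},'
    ' {"name": "Cheerios", "calories": 110}]'
)

# Wrapping the string in StringIO gives read_json a file-like object.
cereals_json = pd.read_json(io.StringIO(json_text), orient="records")

print(cereals_json)
```

Other JSON layouts, such as an object keyed by column name, may need a different orient value.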

If you enjoyed this article, please leave a comment, like it, share it, and follow me on Twitter @dev_elie.



Ondiek Elijah | Sciencx (2021-10-14T12:12:23+00:00) Intro to Data Analysis – Data Reading. Retrieved from https://www.scien.cx/2021/10/14/intro-to-data-analysis-data-reading/
