Intro to Data Analysis - Module 2

Unstructured Data

Free-form data that can't be organized into rows and columns.

Define Structured Data

Data that's well organized and in formats that can be stored in a database (such as a CSV file).

Define Semi-Structured Data

Data that's partially organized and partially free-form (for example, emails).
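
As a rough illustration of the three categories above, the sketch below handles one sample of each kind using only the Python standard library. The sample records and field names are invented for illustration and are not from the course.

```python
import csv
import io
import json

# Structured: every row has the same columns (hypothetical sample data).
csv_text = "id,name,age\n1,Ana,34\n2,Ben,29\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: a few fixed fields plus a free-form body
# (hypothetical email-like record).
email_json = '{"from": "ana@example.com", "subject": "Hi", "body": "Any free-form text here."}'
email = json.loads(email_json)

# Unstructured: free-form text with no predefined fields.
note = "Met with the team today; follow up next week."

print(rows[0]["name"])       # a column can be addressed by name
print(sorted(email.keys()))  # fields exist, but the body is free-form
print(len(note.split()))     # only generic text operations apply
```

The structured rows can be loaded into a database table as-is, while the email body and the note would need extra processing before they fit into rows and columns.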

What do Query Languages do with data?

Access and manipulate data from databases.
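
A minimal sketch of what "access and manipulate" can look like in practice, using SQL through Python's built-in `sqlite3` module. The `sales` table and its values are hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 100.0), ("south", 50.0), ("north", 25.0)],
)

# Access: read only the rows that match a condition.
north = conn.execute(
    "SELECT amount FROM sales WHERE region = ? ORDER BY amount", ("north",)
).fetchall()

# Manipulate: change stored values, then aggregate them per region.
conn.execute("UPDATE sales SET amount = amount * 2 WHERE region = 'south'")
totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall()
)
conn.close()

print(north)
print(totals)
```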

What do Programming Languages do with data?

Develop applications and control their behavior.

What do Shell and Scripting languages do with data?

Automate repetitive operational tasks.

What's a Data Repository?

Data that has been:

- COLLECTED

- ORGANIZED

- ISOLATED

before being used for reporting, analytics, and archival purposes.

Name Five Types of Data Repositories

- DATABASES (Relational & Non-Relational)

- DATA WAREHOUSES

- DATA MARTS

- DATA LAKES

- BIG DATA STORES

What defines a Database?

[DATA REPOSITORY] - Defined by:

- Following a set of organizational principles

- Only storing specific data

- Using specific tools to query, organize, and retrieve data

What's a Data Warehouse?

[DATA REPOSITORY] - Consolidates incoming data in one place.

What's a Data Mart?

[DATA REPOSITORY] - Sub-section of a warehouse that isolates data for a specific use case.
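
One way to picture the warehouse/mart relationship is a database view that exposes only one department's rows. This is a toy sketch under that assumption. The `facts` table, the `sales_mart` view, and the data are hypothetical, and real data marts are often separate stores rather than views.

```python
import sqlite3

# The "warehouse": consolidated data from several departments.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE facts (department TEXT, metric TEXT, value REAL)")
warehouse.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("sales", "revenue", 1200.0),
        ("hr", "headcount", 42.0),
        ("sales", "orders", 30.0),
    ],
)

# The "mart": a sub-section isolated for one use case (sales reporting).
warehouse.execute(
    "CREATE VIEW sales_mart AS "
    "SELECT metric, value FROM facts WHERE department = 'sales'"
)

mart_rows = warehouse.execute(
    "SELECT metric, value FROM sales_mart ORDER BY metric"
).fetchall()
print(mart_rows)
```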

What's a Data Lake?

[DATA REPOSITORY] - Stores large amounts of structured, semi-structured, and unstructured data in their native format.

*Often used as staging areas.

What does a Big Data Store do? (2 parts)

[DATA REPOSITORY]

1. Distributes computational and storage infrastructure.

2. Stores, scales, and processes very large data sets.

What's the ETL process?

An automated process that converts raw data into analysis-ready data.

Describe the ETL process

EXTRACT data from the source location.

TRANSFORM raw data by cleaning, enriching, standardizing, and validating it.

LOAD the processed data into a destination system or data repository.
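
The three steps can be sketched as three small functions. This is a minimal example under stated assumptions, not a production pipeline: the raw CSV text, the `payments` table, and the function names are all hypothetical, and the transform step only cleans, standardizes, and validates (it does no enrichment).

```python
import csv
import io
import sqlite3

# Hypothetical raw source data: messy names and one invalid amount.
RAW_CSV = "name,amount\n ana ,10\nBEN,abc\ncara,5.5\n"


def extract(text):
    """EXTRACT: read raw rows from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """TRANSFORM: clean and standardize names, validate amounts."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # validation: drop rows whose amount is not a number
        clean.append((row["name"].strip().title(), amount))
    return clean


def load(records, conn):
    """LOAD: write processed records into the destination table."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute("SELECT name, amount FROM payments ORDER BY name").fetchall()
print(loaded)
```

Keeping each step in its own function makes it easy to swap the source or the destination without touching the cleaning logic.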

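Are a Data Pipeline and an ETL process the same thing? Explain.

Not exactly. The terms are sometimes used interchangeably, but a data pipeline is the broader one: it encompasses the entire process of moving data from its source to a destination such as a data lake or application, and ETL is one kind of data pipeline.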

What's Big Data?

The vast amount of data being produced by people, tools, apps, and machines.

What are The Vs of Big Data?

[V]ELOCITY

[V]OLUME

[V]ARIETY

[V]ERACITY

[V]ALUE

Velocity

The speed at which data accumulates.

Volume

The scale of the data or the physical size of stored data.

Variety

The diversity of the data, such as its sources and data types.

Veracity

The quality and origin of the data, including its consistency, completeness, integrity, and ambiguity.

Value

The potential to turn the data into tangible value.
