itm 618 week 3
What is data mining?
- The process of extracting interesting (non-trivial, implicit, previously unknown and potentially useful) knowledge or patterns from data in large databases
What are the objectives of data mining?
- Discover knowledge that characterizes general properties of data
- Discover patterns in previous and current data in order to make predictions about future data
What is an alternative name for data mining?
Knowledge discovery in databases (KDD)
In the CRISP-DM process, what do you do under the business understanding process?
- Determining business objectives: Gathering background information, compiling the business background, and defining business objectives
- Assessing the situation: Requirements, assumptions, and constraints. What sort of data are available for analysis? Do you have access to them?
- Determining data science goals: Data science goals, Data science success criteria
In the CRISP-DM process, what do you do under the data understanding process?
- Collect initial data: Existing data, purchased data, and additional data
- Describe data: Amount of data and value types
- Verify data quality: Missing data and data errors
In the CRISP-DM process, what do you do under the data preparation process?
- Select the right data: Select training examples and features. Is a given attribute relevant to your data mining goals?
- Clean data: Fill in missing data, correct data errors
- Format data: Put data in a format for training the model
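As an illustration of the "clean data" step, here is a minimal sketch that fills missing numeric values with the column mean. The records, the column names, and the helper `fill_missing_with_mean` are all made up for this example, and mean imputation is only one of several possible strategies.

```python
from statistics import mean

# Hypothetical records; None marks a missing value.
rows = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 61000},
    {"age": 46, "income": None},
]

def fill_missing_with_mean(rows, column):
    """Return new rows where missing values in `column` are replaced by the column mean."""
    observed = [r[column] for r in rows if r[column] is not None]
    column_mean = mean(observed)
    return [
        {**r, column: column_mean if r[column] is None else r[column]}
        for r in rows
    ]

# Clean both columns; the original `rows` list is left unchanged.
cleaned = fill_missing_with_mean(fill_missing_with_mean(rows, "age"), "income")
print(cleaned)
```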
In the CRISP-DM process, what do you do under the modelling process?
- Select modelling techniques: Select data types available for analysis, select an algorithm or a model, define modelling goals, state specific modeling requirements
- Set up hyperparameters and build the model: Train the model, describe the result
- Assess the model: Check for overfitting and underfitting
In the CRISP-DM process, what do you do under the evaluation process?
- Evaluate the results: Are results presented clearly? Are there any novel findings? Can the models and findings be applied to the business goals? How well do the models and findings answer the business goals? What additional questions have the modeling results raised?
- Review the process: Did each stage contribute to the value of the results? What went wrong and how can it be fixed? Are there alternative decisions that could have been made?
- Determine the next steps
In the CRISP-DM process, what do you do under the deployment process?
- Plan for deployment: Summarize models and findings, create a deployment plan for each model, identify any deployment problems and plan for contingencies
- Plan monitoring and maintenance: Identify models and findings which require support. How can accuracy and validity be evaluated? How will you determine that a model has expired? What should be done with expired models?
- Conduct a final project review
What is a model?
A simplified representation of reality created to serve a purpose. Examples include maps, prototypes, the Black-Scholes model, etc.
What is a prediction?
An estimate of an unknown value
What is a predictive model?
- A formula for estimating the unknown value of interest: the target
- The formula can be a mathematical expression or a logical statement (such as a rule)
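The two forms a predictive model can take might be sketched as follows. Both functions, their inputs, and every coefficient or threshold are invented for illustration; a real model would learn them from training data.

```python
def predict_usage_minutes(monthly_fee, num_lines):
    """Mathematical formula: estimates a numeric target (made-up coefficients)."""
    return 120.0 + 3.5 * monthly_fee + 40.0 * num_lines

def predict_will_purchase(age, has_incentive):
    """Logical statement: estimates a yes/no target (made-up rule)."""
    return has_incentive and age < 40

print(predict_usage_minutes(20, 2))
print(predict_will_purchase(30, True))
```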
What is an instance/example?
- Represents a fact or a data point
- Described by a set of attributes (fields, columns, variables, or features)
What is training data?
The input data to create the model
What are the 2 feature types?
- Numeric: Values that have a natural order, such as numbers and dates
- Categorical: Values that have no natural order, such as text labels
What are some common data mining tasks?
- Classification and class probability estimation
- Regression
- Similarity Matching
- Clustering
- Co-occurrence grouping and association rules
What is an example of a classification model?
Decision tree
What is the purpose of a regression model? Provide examples.
- It finds a function from data which relates a real-valued variable with one or more other variables
- For example, predict daily water demand
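A minimal sketch of the idea, assuming a single input variable and an ordinary least-squares straight-line fit. The temperature and demand figures are invented, and `fit_line` is a hypothetical helper written for this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up training data: daily temperature (C) and water demand (megalitres).
temps = [10, 15, 20, 25, 30]
demand = [200, 250, 300, 350, 400]

slope, intercept = fit_line(temps, demand)

# Use the fitted function to predict demand on a 35-degree day.
predicted = slope * 35 + intercept
print(slope, intercept, predicted)
```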
What is the purpose of a clustering model?
- To group data to form classes (clusters)
- Class label is unknown in the training data
- Principle: maximizing the intra-class similarity and minimizing the inter-class similarity
- Applications include market/customer segmentation
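The grouping principle can be sketched with a tiny one-dimensional k-means, which is just one of many clustering algorithms. The spend figures and starting centroids are made up, and `kmeans_1d` is a hypothetical helper; note that no class labels appear anywhere in the input.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Very small k-means for 1-D data, starting from the given centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Made-up monthly spend per customer; two segments are expected.
spend = [12, 15, 18, 90, 95, 110]
centroids, clusters = kmeans_1d(spend, centroids=[12, 110])
print(centroids, clusters)
```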
What are supervised targets?
- A supervised technique is given a specific purpose for the grouping—predicting the target.
- Supervised tasks require different techniques than unsupervised tasks, and their results are often more useful
What are the 2 main subclasses of supervised data mining?
- Classification and regression
What are the 2 main subclasses of supervised data mining distinguished by?
- They are distinguished by the type of target
What are the 2 types of subclasses of supervised data mining under classification?
- Binary
- Categorical target
What type of supervised data mining might we address the following question with?
"Which service package (S1, S2, or none) will a customer likely purchase if given incentive I?"
This is a classification problem with a three-valued target.
What type of supervised data mining might we address the following question with?
"Will this customer purchase service S1 if given incentive I?"
This is a classification problem because it has a binary target (the customer either purchases or does not).
What type of supervised data mining might we address the following question with?
"How much will this customer use the service?"
This is a regression problem because it has a numeric target. The target variable is the amount of usage (actual or predicted) per customer
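The three questions above differ only in the type of their target, which can be sketched as a rough heuristic. `task_type` is a hypothetical helper, not a standard function, and it would misread class labels that happen to be stored as numeric codes.

```python
def task_type(target_values):
    """Guess the supervised task from example target values (rough heuristic)."""
    distinct = set(target_values)
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in distinct
    )
    if is_numeric:
        return "regression"
    if len(distinct) == 2:
        return "binary classification"
    return "multi-class classification"

print(task_type(["S1", "S2", "none", "S1"]))  # which service package?
print(task_type([True, False, True]))         # will the customer purchase S1?
print(task_type([12.5, 40.0, 3.2]))           # how much usage?
```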
Explain how data mining applications can be applied to finance.
- Clustering and classification of customers for targeted marketing
- Identify customer groups or associate a new customer to an appropriate customer group
Explain how data mining applications can be applied to retail
- Discover customer shopping patterns and trends
- Re-arrange store layout
- Purchase recommendation and cross-reference of items
Explain how data mining applications can be applied to DNA analysis.
- Association analysis: identification of co-occurring gene sequences
- Most diseases are not triggered by a single gene but by a combination of genes acting together
- Association analysis may help determine the kinds of genes that are likely to co-occur together in target samples
What is dimensionality of a dataset?
- It is the sum of the dimensions of the features
- It is the sum of the number of numeric features and the number of values of the categorical features
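Following the definition above, dimensionality can be counted as below. The dataset and the helper `dimensionality` are made up for illustration, and the count of categorical values is taken from the values actually observed in the data.

```python
def dimensionality(rows, numeric, categorical):
    """Number of numeric features plus the number of distinct values of each categorical feature."""
    distinct_values = sum(len({r[c] for r in rows}) for c in categorical)
    return len(numeric) + distinct_values

# Made-up dataset: 2 numeric features, 'color' with 3 values, 'plan' with 2 values.
rows = [
    {"age": 34, "income": 52000, "color": "red", "plan": "basic"},
    {"age": 29, "income": 48000, "color": "blue", "plan": "premium"},
    {"age": 46, "income": 61000, "color": "green", "plan": "basic"},
]

dims = dimensionality(rows, numeric=["age", "income"], categorical=["color", "plan"])
print(dims)  # 2 + 3 + 2
```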
What is association analysis used for?
- It is widely used for market basket or transactional data analysis
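A minimal market basket sketch, assuming the two standard measures for association rules, support and confidence, which these notes do not define themselves. The transactions are made up, and `support` and `confidence` are helpers written for this example.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Among transactions containing `antecedent`, the fraction that also contain `consequent`."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

# Made-up shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

bread_milk_support = support({"bread", "milk"}, baskets)
bread_to_milk_confidence = confidence({"bread"}, {"milk"}, baskets)
print(bread_milk_support, bread_to_milk_confidence)
```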
Which data mining tasks are supervised methods?
- Classification
- Regression
- Causal modeling
- Similarity matching
- Link prediction
- Data reduction
Which data mining tasks are unsupervised methods?
- Similarity matching
- Link prediction
- Data reduction
- Clustering
- Co-occurrence grouping
- Profiling
What are some classical pitfalls in data mining setup?