Karl Ho
School of Economic, Political and Policy Sciences
University of Texas at Dallas
Dr. Jeyakesavan Veerasamy
Expert helpers:
Nick Haoran Weng
Kinds of Data
Quantitative vs. Qualitative
Structured vs. Semi/unstructured
Measurement
Nominal/ordinal/interval/ratio
metadata, paradata
Ackoff, R.L., 1989. From data to wisdom. Journal of Applied Systems Analysis, 16(1), pp.3-9.
| Data Model | Algorithmic Model |
|---|---|
| One assumes that the data are generated by a given stochastic data model. | The other uses algorithmic models and treats the data mechanism as unknown. |
| Small data | Complex, big data |
Data are generated in many fashions. Picture this: an independent variable x goes in one side of a box (call it nature for now) and a dependent variable y comes out the other side.
The analysis in the data modeling culture starts by assuming a stochastic data model for the inside of the black box. For example, a common data model is that the data are generated by independent draws from:
Response variable = f(predictor variables, random noise, parameters)
Reading: the response variable is a function of a series of predictor (independent) variables, plus random noise (normally distributed errors) and other parameters.
The values of the parameters are estimated from the data, and the model is then used for information and/or prediction.
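The data-model workflow described above can be sketched in a few lines. This is only an illustration under stated assumptions: f is taken to be a straight line, the noise is normal, and the names `simulate` and `fit_line` and the "true" parameter values (intercept 2, slope 3) are hypothetical, not taken from the slides.

```python
import random

def simulate(n, intercept, slope, noise_sd, seed=0):
    """Draw n independent (x, y) pairs from an assumed linear data model."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    # response = f(predictor, random noise, parameters)
    ys = [intercept + slope * x + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Estimate intercept and slope by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Assumed "nature": y = 2 + 3x + noise (illustrative values only).
xs, ys = simulate(n=500, intercept=2.0, slope=3.0, noise_sd=1.0)
intercept_hat, slope_hat = fit_line(xs, ys)

# Use the fitted model for information (the estimates) and prediction.
y_pred = intercept_hat + slope_hat * 4.0
print(intercept_hat, slope_hat, y_pred)
```

The estimates are interpretable only to the extent that the assumed model matches how the data were actually generated.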
The analysis in the algorithmic modeling culture considers the inside of the box complex and unknown. The approach is to find a function f(x), an algorithm that operates on x to predict the responses y.
The goal is to find an algorithm that accurately predicts y.
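As a contrast, the algorithmic approach can be sketched with a k-nearest-neighbours predictor, which assumes no functional form for the box and is judged only by predictive accuracy on held-out data. The choice of k-NN, the function `nature`, and all numeric settings are assumptions made for illustration.

```python
import math
import random

def knn_predict(train_x, train_y, x_new, k=5):
    """Predict y at x_new as the mean response of the k nearest training points."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x_new))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k

def nature(x):
    """Hypothetical black box; the algorithm never sees this formula."""
    return math.sin(x) + 0.1 * x * x

rng = random.Random(1)
xs = [rng.uniform(0.0, 6.0) for _ in range(400)]
ys = [nature(x) + rng.gauss(0.0, 0.1) for x in xs]

# Hold out the last 100 points to measure predictive accuracy.
train_x, train_y = xs[:300], ys[:300]
test_x, test_y = xs[300:], ys[300:]

squared_errors = [
    (knn_predict(train_x, train_y, x) - y) ** 2 for x, y in zip(test_x, test_y)
]
rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
print(rmse)
```

Here the yardstick is the held-out error rather than the plausibility of estimated parameters, which is the main practical difference from the data-model sketch.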
Supervised Learning vs. Unsupervised Learning
Source: https://www.mathworks.com
Introduction - Data theory
Data methods
Statistics
Programming
Data Visualization
Information Management
Data Curation
Spatial Models and Methods
Machine Learning
NLP/Text mining
Introduction - Data theory
Fundamentals
Data concepts
Data Generation Process (DGP)
Algorithm-based vs. Data-based approaches
Taxonomy
Data methods
Passive data
Data at will
Qualitative data
Complex data
Text data