Some interchangeable terms in statistics

The science part of HCI lies mostly in statistics. HCI researchers use statistical tools to test, for example, the effect of a new interface or interaction technique. I am not a statistician; I learned statistics partly from a user study course at university, and partly from books and the numerous free lecture slides available on the web. Along the way, I soon discovered that different tutorials, books and articles use different terms for the same thing. Let me go through some of them in this post.

1. Independent Variables (IV) and Dependent Variables (DV) 

When I started learning the basics of statistical testing a couple of years ago, IV and DV were among the first terms that I learnt. Briefly, the former can be seen as the cause and the latter as the effect or outcome. I am not going to go into details, since that is not the focus of this short article. Instead, I will list the other terms I’ve found in the literature.

IV and DV are also referred to as the following:

(1) Predictor variable and Response variable

(2) Factor (Covariate) and Measure

(3) Explanatory variable and Outcome variable

(4) Regressor and Regressand (in regression analysis)

(5) Controlled variable and Measured variable

(6) Manipulated variable and Responding variable

I want to note something about (2) and (5):

“Controlled variable” is sometimes used by some authors to mean the same thing as IV. In experiment design, however, it usually means something that is kept constant throughout an experiment: we know it would affect the results, so we hold it fixed. This is similar to what some authors refer to as a covariate. A covariate is usually not manipulated; examples are the gender, age, temperature or walking speed of a subject. We usually control for covariates as well. A factor mainly refers to a categorical variable with several levels.
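To make the distinction concrete, here is a minimal sketch of how these roles could be written down for a study. The class `ExperimentDesign`, its field names and the example study (interface, input device, screen size and so on) are all hypothetical and only serve as illustration. Crossing every level of every factor, as `conditions()` does, is just one common way to build the conditions.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class ExperimentDesign:
    """Hypothetical container for the variable roles discussed above."""

    # Factors: categorical IVs that the experimenter manipulates, each with levels.
    factors: dict[str, list[str]]
    # Controlled variables: held at one constant value for the whole experiment.
    controlled: dict[str, str] = field(default_factory=dict)
    # Covariates: recorded per participant, but not manipulated.
    covariates: list[str] = field(default_factory=list)
    # Measures: the DVs recorded in every trial.
    measures: list[str] = field(default_factory=list)

    def conditions(self) -> list[dict[str, str]]:
        """Return every combination of factor levels."""
        names = list(self.factors)
        return [dict(zip(names, combo)) for combo in product(*self.factors.values())]


# Made-up example study: two factors with 2 and 3 levels.
design = ExperimentDesign(
    factors={"interface": ["old", "new"], "input": ["mouse", "touch", "pen"]},
    controlled={"screen_size": "24 inch", "lighting": "office"},
    covariates=["age", "walking_speed"],
    measures=["completion_time", "error_count"],
)

for condition in design.conditions():
    print(condition)
```

With two levels of one factor and three of the other, the loop prints six conditions, while the controlled variables stay the same in all of them.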

Anyway, all these terms are quite common in the literature, and we need to be familiar with the different terminology.

2. The type of data

Regarding the type of data, there are also different terms:

(1) Nominal variable, categorical variable

Here people use the two terms interchangeably. When a categorical variable has only two levels, it is called a dichotomous variable.

(2) Ordinal variable

This refers to data whose categories have a logical order, e.g. Likert-scale questionnaire responses.

(3) Continuous variable, scalar variable

They refer to the same thing, but there are two sub-types: interval data (without an absolute zero) and ratio data (with an absolute zero).
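A small worked example may help with the interval/ratio distinction. The sketch below uses temperature, which is my own choice of illustration: Celsius has no absolute zero (interval), whereas Kelvin does (ratio). Differences survive the change of scale, but ratios do not.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Shift a Celsius reading onto the Kelvin scale, which has an absolute zero."""
    return celsius + 273.15


a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences are meaningful on an interval scale: both scales agree on 10 degrees.
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios are only meaningful on a ratio scale.
ratio_c = b_c / a_c  # 2.0, but 20 °C is not "twice as hot" as 10 °C
ratio_k = b_k / a_k  # about 1.035, the physically meaningful ratio

print(f"difference: {diff_c:.1f} (Celsius) vs {diff_k:.1f} (Kelvin)")
print(f"ratio:      {ratio_c:.3f} (Celsius) vs {ratio_k:.3f} (Kelvin)")
```

The "twice as much" reading only holds for the Kelvin values, which is exactly the property that separates ratio data from interval data.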

3. Mixed data 

There are also several terms for models of mixed data, i.e. mixed-effects models:

(1) Multilevel model

(2) Hierarchical linear model

(3) Random coefficient model

(4) Variance component model

(5) Nested model

(6) Mixed model

(7) Random effect model

(8) Random parameter model

(9) Split-plot design

This topic is pretty complex, but you’ll come across all of these terms quite often. Also, longitudinal data is sometimes referred to as panel data.

In short, this article summarizes some equivalent statistical terms.


Final Remark

Perhaps a hierarchical presentation of the types of variables is needed in addition to the discussion above. The following one is taken from Andy Field’s book <Discovering Statistics Using SPSS>:

Categorical variable:

  • Binary variable: There are only two categories
  • Nominal variable: There are more than two categories (e.g. whether someone is an omnivore, vegetarian, vegan or fruitarian)
  • Ordinal variable: The same as a nominal variable, but the categories have a logical order (e.g. a Likert scale)

Continuous variable:

  • Interval variable: Equal intervals on the variable represent equal differences in the property being measured (e.g. the difference between 6 and 8 is equivalent to the difference between 13 and 15)
  • Ratio variable: The same as an interval variable, but the ratios of scores on the scale must also make sense (e.g. a score of 16 on an anxiety scale means that the person is, in reality, twice as anxious as someone scoring 8)
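
The hierarchy above can also be read as a small decision procedure. The following is only a sketch of that reading: the function `classify_variable` and its parameters are hypothetical names of my own and do not come from Field’s book.

```python
from typing import Optional


def classify_variable(
    is_continuous: bool,
    n_categories: Optional[int] = None,
    ordered: bool = False,
    meaningful_ratios: bool = False,
) -> str:
    """Map a few yes/no properties of a variable onto the hierarchy above."""
    if is_continuous:
        # Continuous branch: ratio only if ratios of scores make sense.
        return "ratio" if meaningful_ratios else "interval"
    if n_categories is None or n_categories < 2:
        raise ValueError("a categorical variable needs at least two categories")
    if ordered:
        # Categories with a logical order, e.g. Likert-scale answers.
        return "ordinal"
    return "binary" if n_categories == 2 else "nominal"


print(classify_variable(False, n_categories=2))                # binary
print(classify_variable(False, n_categories=4))                # nominal (e.g. diet type)
print(classify_variable(False, n_categories=5, ordered=True))  # ordinal
print(classify_variable(True))                                 # interval
print(classify_variable(True, meaningful_ratios=True))         # ratio
```

Real data is often less clear-cut than these flags suggest, so treat the function as a memory aid rather than as a rule.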