## Basic Course Information

The basic course description and teaching schedule are available in the Curriculum Guide.

Lectures are held weekly on Tuesdays at 14-16 in Pinni B0016, starting Jan 7; see the preliminary lecture schedule below. Due to the coronavirus situation, lectures from March 17 onwards (for the time being) **will not have contact teaching** and will be arranged using the Zoom software; more information below. Lecturer: Professor Jaakko Peltonen.

**Course Contents**

Preliminary contents: properties of high-dimensional data; feature selection; linear feature extraction methods such as principal component analysis and linear discriminant analysis; graphical excellence; human perception; nonlinear dimensionality reduction methods such as the self-organizing map and Laplacian embedding; neighbor embedding methods such as stochastic neighbor embedding and the neighbor retrieval visualizer; graph visualization; graph layout methods such as LinLog.

## Course Material

The course is based on the lecture slides. A related book for the course as a whole is Nonlinear Dimensionality Reduction (John Lee, Michel Verleysen). For lecture 4 (graphical excellence), a related book is The Visual Display of Quantitative Information (Edward R. Tufte). For lecture 5 (human perception), a related book is Information Visualization: Perception for Design (Colin Ware).

**Learning Outcomes**

After the course, the student will be aware of the main approaches and issues in dimensionality reduction and visualization, will know a variety of methods applicable to these tasks, and will be able to apply some of the basic techniques.

**Passing the Course**

To pass the course, you must pass the exam and complete a sufficient number of exercises from the exercise packs. Exercise packs will be released during the course.

Preliminary grading scheme (note: preliminary information only, may change!): the exercise packs are graded in total either as 0 (fail) or as a fractional number between 1 and 5 (such as 1.34). The exam is similarly graded either as 0 (fail) or as a fractional number between 1 and 5. The total grade of the course is computed as round(0.8*ExamGrade + 0.2*ExercisesGrade), so that e.g. 4.51 rounds up to 5 and 4.49 rounds down to 4.
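As an illustration of the formula above, the following is a minimal Python sketch, not an official grade calculator. The function name `course_grade` is hypothetical. Two points are assumptions not stated in the grading scheme: a 0 (fail) in either part is treated as failing the whole course, based on the requirement to pass both parts, and an exact tie such as 4.50 is rounded up. The scheme itself only gives 4.51 and 4.49 as examples.

```python
from decimal import Decimal, ROUND_HALF_UP


def course_grade(exam_grade: float, exercises_grade: float) -> int:
    """Combine exam and exercise grades as round(0.8*exam + 0.2*exercises)."""
    # Assumption: failing either part (grade 0) fails the course,
    # since both the exam and the exercises must be passed.
    if exam_grade == 0 or exercises_grade == 0:
        return 0
    # Decimal arithmetic avoids binary floating-point surprises near x.5.
    weighted = (Decimal(str(exam_grade)) * Decimal("0.8")
                + Decimal(str(exercises_grade)) * Decimal("0.2"))
    # Assumption: exact ties (x.5) round up; the course page does not say.
    return int(weighted.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


print(course_grade(4.51, 4.51))  # 5
print(course_grade(4.49, 4.49))  # 4
print(course_grade(5.0, 1.34))   # 0.8*5 + 0.2*1.34 = 4.268, so 4
```

Because the exam carries 80% of the weight, a strong exam result dominates: in the last example, a minimal exercise grade of 1.34 still yields a total grade of 4.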

**Information about Remote Lectures (Updated March 16, 2020)**

Due to the coronavirus situation, lectures from March 17 onwards (for the time being) **will not have contact teaching**. Instead, the lectures will be arranged using the Zoom software. For each lecture, a link to the lecture will be sent before the lecture – if you have not received a Zoom link to the lecture, contact the lecturer.

To participate in a Zoom lecture, please make sure that:

- You have a laptop/desktop with a working microphone (you need a mic, even if you just want to listen in)
- You have installed either the Zoom software, or the Chrome browser. Firefox and Microsoft Edge do not work.

To install the Zoom software and to learn how to use it, please see the handbook on the university intranet (you need to log in with your university account to see the page).

To join the meeting, you will receive a link by email. Simply open the link. Your browser should ask to open in the Zoom application; do that, or if you do not have the Zoom application installed, choose “join with browser” instead (you have to use Chrome as your browser in that case).

During the lecture, I recommend keeping your microphone muted when you are not talking (you can do that from the Zoom interface); otherwise, background noise is heard by everyone.

The Zoom lectures will be recorded, and the lecture slides and the lecture recording will be available from Panopto as usual after the lecture.

**Preliminary Schedule**

The preliminary schedule below may change as the course progresses. Lecture slides for each lecture will be added to the schedule as the course progresses.

| Date | Lecture and material |
| --- | --- |
| Jan 7 | Lecture 1: Introduction, properties of high-dimensional data. Material: Lecture slides, Lecture video in Panopto |
| Jan 14 | Lecture 2: Feature selection. Material: Lecture slides, Lecture video in Panopto – part 1, Lecture video in Panopto – part 2 |
| Jan 21 | Lecture 3: Feature selection continued, and linear dimensionality reduction. Material: Lecture slides, Lecture video in Panopto |
| Jan 28 | Lecture on linear dimensionality reduction continued. Material: no new slides, Lecture video in Panopto: part 1, part 2, part 3, part 4 |
| Feb 4 | Lecture 4: Graphical excellence. Material: Lecture slides, Lecture video in Panopto |
| Feb 11 | Lecture 5: Human perception. Material: Lecture slides, Lecture video in Panopto |
| Feb 18 | Lecture on human perception continued. Material: Lecture slides, Lecture video in Panopto |
| Feb 25 | Lecture 6: Nonlinear dimensionality reduction, part 1. Material: Lecture slides, Lecture video in Panopto |
| Mar 3 | Continuation of nonlinear dimensionality reduction part 1, and beginning of part 2. Material: Lecture slides, Lecture video in Panopto |
| Mar 10 | Lecture 7: Nonlinear dimensionality reduction, continuation of part 2. Material: same slides as March 3, Lecture video in Panopto |
| Mar 17 | Lecture 8: Nonlinear dimensionality reduction, part 3. Material: Lecture slides, Lecture video in Panopto |
| Mar 24 | Lecture 9: Metric learning. Material: Lecture slides, Lecture video in Panopto |
| Mar 31 | Lecture 10: Neighbor embedding, part 1. Material: Lecture slides, Lecture video in Panopto |
| Apr 7 | Lecture 11: Neighbor embedding, part 2. Material: Lecture slides (updated Apr 14, 2020), Lecture video in Panopto |
| Apr 14 at 12:15-14 | Lecture 12: Graph visualization. Note new lecture time. Material: Lecture slides, Lecture video in Panopto |
| Apr 21 | Lectures 11-12 continued. Material: Lecture video in Panopto |
| Apr 28 | Lecture 13: Dimensionality reduction for graph layout. Material: Lecture slides, Lecture video in Panopto |
| May 5 | Recap of course material, discussion of exercise packs. Material: Lecture video in Panopto |
| May 19 | Tentative date for first exam |

**Exercise Packs**

Exercise packs will be released during the course. They can be completed using e.g. Octave, Matlab, or R.

- Exercise pack 1, return deadline **March 29, 2020** (extended deadline)
- Exercise pack 2, return deadline **May 31, 2020**

**About Octave, Matlab, and R**

Octave (GNU Octave) is free software that is very similar in operation to Matlab, and is available for several systems including Windows, Linux, and Mac OS X. For Linux, it is likely available in the software repository of your distribution, such as Ubuntu Software Center; for Windows, download it through the download page; for Mac OS X there are various alternatives, the easiest being a slightly older version at SourceForge.

Several tutorials are available online about programming in Matlab and programming in Octave. If you are familiar with R, Prof. David Hiebeler (University of Maine) has written a useful Matlab/R reference that tells how the same operations are done in both languages.

R is software for statistical computing, also available for Windows, Linux, and Mac OS X. For Linux, it might already be installed (check with “which R”) or is likely available in the software repository of your distribution, such as Ubuntu Software Center. For Windows and Mac OS X, download it through one of the many CRAN mirror sites.

A large number of R tutorials are available online (e.g. this one). If you are familiar with Matlab or Octave, Prof. David Hiebeler (University of Maine) has written a useful Matlab/R reference that tells how the same operations are done in both languages.