Lecture notes on high-dimensional data

Sep-20-2024–arXiv.org Artificial Intelligence

The text below arose from a course on 'Mathematical Data Science' that I taught twice for final-year BSc Mathematics students in the UK between 2019 and 2020. The notes presently cover the first part (roughly a third) of the course, focussing on the characteristics and peculiarities of high-dimensional data. An improved version of the notes appeared as part of the textbook [7]; we refer the reader in particular to [7, Chapters 8-12]. I would like to thank my former students who attended the course and whose feedback helped me to write these lecture notes. Concrete examples are as follows. Each user can give a rating from one to five stars for each movie. When doing medical diagnostic tests, we can represent a subject by the vector containing her/his results. These can include integers like antibody counts, real numbers like temperature, pairs of real numbers like blood pressure, or binary values like whether a subject has tested positive or negative for a certain infection. If we name the users 1, 2, 3, ..., we can represent user j in R [...] Given such a high-dimensional data set A, classical tasks to analyze the data, or to make predictions based on it, involve computing distances between data points. This can be, for example, the classical Euclidean distance (or the distance induced by any other p-norm). However, if d is very large, we are faced with the following two obstructions.
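
The distance computation mentioned in the excerpt can be illustrated with a minimal sketch. It assumes that each data point is a list of numbers of equal length; the function name `p_norm_distance` and the two rating vectors are hypothetical and do not come from the notes.

```python
import math


def p_norm_distance(x, y, p=2.0):
    """Distance between two data points induced by the p-norm (p >= 1).

    p=2 gives the Euclidean distance, p=1 the sum of absolute
    differences, and p=inf the largest absolute difference.
    """
    if len(x) != len(y):
        raise ValueError("data points must have the same dimension")
    if p < 1:
        raise ValueError("p must be at least 1")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(p):
        return float(max(diffs))
    return sum(d ** p for d in diffs) ** (1.0 / p)


# Hypothetical ratings (one to five stars) of two users for five movies.
user_1 = [5, 3, 4, 1, 2]
user_2 = [4, 3, 5, 1, 1]

euclidean = p_norm_distance(user_1, user_2, p=2)
manhattan = p_norm_distance(user_1, user_2, p=1)
maximum = p_norm_distance(user_1, user_2, p=math.inf)
```

Each call visits every coordinate once, so the cost of a single distance grows linearly with the dimension d of the data points.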

exp, gaussian, probability

Country:
- North America > United States (0.04)
- Europe
  - Germany > Hamburg (0.04)
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)

Genre:
- Research Report (1.00)
- Instructional Material > Course Syllabus & Notes (1.00)

Industry:
- Health & Medicine > Diagnostic Medicine (0.54)
- Media > Film (0.45)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (1.00)
  - Representation & Reasoning (0.93)
