One way to deal with high dimensionality is to project the data onto a lower-dimensional space. The theoretical foundation for why such projections can work is the Johnson-Lindenstrauss lemma. The lemma states that a small set of points in a high-dimensional space can be embedded into a space of much lower dimension in such a way that the distances between the points are nearly preserved; concretely, n points can be embedded into O(log n / ε²) dimensions with each pairwise distance distorted by at most a factor of (1 ± ε). We will touch on random and other projections when we talk about NLP in Chapter 9, NLP in Scala, but random projections work particularly well with nested structures and functional programming languages, because in many cases generating a random projection amounts to applying a function to compactly encoded data rather than flattening the data explicitly. In other words, the Scala definition of a random projection is a place where the functional paradigm shines. Consider the following function:
def randomProjection(data: NestedStructure): Vector = { … }
Here, the returned Vector lives in the low-dimensional space.
The map used for embedding is at least Lipschitz, and can even be taken to be an orthogonal projection.
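To make this concrete, the following is a minimal sketch of such a projection, assuming that a nested structure can be traversed as (feature name, value) pairs; NestedStructure, randomProjection, and the Map stand-in are illustrative names, and the hash-seeded rows are a common practical approximation of the i.i.d. Gaussian construction, not a definitive implementation:

import scala.util.Random

object RandomProjectionSketch {
  // Stand-in for a real nested type: anything that can be traversed
  // as (feature name, value) pairs will do.
  type NestedStructure = Map[String, Double]

  // Projects into k dimensions without flattening the data: the k
  // projection coefficients for each feature are regenerated on the
  // fly from an RNG seeded by the feature's hash, so the implicit
  // k x d Gaussian matrix is never materialized.
  def randomProjection(data: NestedStructure, k: Int): Array[Double] = {
    val result = new Array[Double](k)
    for ((feature, value) <- data) {
      val rng = new Random(feature.hashCode.toLong) // same feature, same coefficients
      var i = 0
      while (i < k) {
        result(i) += rng.nextGaussian() * value
        i += 1
      }
    }
    result.map(_ / math.sqrt(k)) // Johnson-Lindenstrauss scaling by 1/sqrt(k)
  }
}

Because the same feature name always seeds the same row of coefficients, every point is projected by the same implicit matrix, which is what the lemma requires, and the 1/sqrt(k) scaling preserves norms, and hence pairwise distances, in expectation.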