Public Interface
Reference for the public interface of UMAP.jl.
UMAP.fit — Function
fit(data[, n_components=2]; <kwargs>) -> UMAPResult
Embed data into an n_components-dimensional space. Returns a UMAPResult.
Keyword Arguments
n_neighbors::Integer = 15: the number of neighbors to consider as locally connected. Larger values capture more global structure in the data, while smaller values capture more local structure.
metric::{SemiMetric, Symbol} = Euclidean(): the metric used to calculate distances in the input space. It is also possible to pass metric = :precomputed to treat data as a precomputed distance matrix.
n_epochs::Integer = 300: the number of training epochs for embedding optimization.
learning_rate::Real = 1: the initial learning rate during optimization.
init::AbstractInitialization = UMAP.SpectralInitialization(): how to initialize the output embedding; valid options are UMAP.SpectralInitialization() and UMAP.UniformInitialization().
min_dist::Real = 0.1: the minimum spacing of points in the output embedding.
spread::Real = 1: the effective scale of embedded points. In combination with min_dist, it determines how clustered the embedded points are.
set_operation_ratio::Real = 1: interpolates between fuzzy set union and fuzzy set intersection when constructing the UMAP graph (global fuzzy simplicial set). The value of this parameter should be between 0.0 and 1.0: 1.0 indicates pure fuzzy union, while 0.0 indicates pure fuzzy intersection.
local_connectivity::Integer = 1: the number of nearest neighbors that should be assumed to be locally connected. The higher this value, the more connected the manifold becomes. This should not be set higher than the intrinsic dimension of the manifold.
repulsion_strength::Real = 1: the weighting of negative samples during the optimization process.
neg_sample_rate::Integer = 5: the number of negative samples to select for each positive sample. Higher values increase computational cost but give slightly more accurate results.
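To make the set_operation_ratio description concrete, the following is a minimal sketch of how such an interpolation can be computed. It assumes the fuzzy union is the probabilistic sum and the fuzzy intersection is the elementwise product of a membership matrix with its transpose, as in the usual description of UMAP. The helper name blend_fuzzy_sets is hypothetical and is not part of the UMAP.jl public interface.

```julia
# Hypothetical helper (not a UMAP.jl function): blend the fuzzy union and
# fuzzy intersection of a directed membership matrix A with its transpose.
# Assumes union = a + b - a*b and intersection = a*b, applied elementwise.
function blend_fuzzy_sets(A::AbstractMatrix{<:Real}, set_operation_ratio::Real)
    0 <= set_operation_ratio <= 1 ||
        throw(ArgumentError("set_operation_ratio must lie in [0, 1]"))
    At = transpose(A)
    fuzzy_intersection = A .* At
    fuzzy_union = A .+ At .- fuzzy_intersection
    # ratio = 1 gives the pure union, ratio = 0 the pure intersection
    return set_operation_ratio .* fuzzy_union .+
           (1 - set_operation_ratio) .* fuzzy_intersection
end

# Two points with asymmetric membership strengths 0.8 and 0.5.
A = [0.0 0.8;
     0.5 0.0]

pure_union = blend_fuzzy_sets(A, 1.0)         # off-diagonal entries ≈ 0.9
pure_intersection = blend_fuzzy_sets(A, 0.0)  # off-diagonal entries ≈ 0.4
halfway = blend_fuzzy_sets(A, 0.5)            # off-diagonal entries ≈ 0.65
```

With the default ratio of 1, an edge is kept if either point considers the other a neighbor. Lower values increasingly require the relationship to be mutual.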
UMAP.transform — Method
transform(result::UMAPResult, queries, knn_params, src_params, gbl_params, tgt_params, opt_params)
Transform the UMAP result for new queries. This method allows overriding the transform-time parameters by passing in configuration structs directly.
UMAP.transform — Method
transform(result::UMAPResult, queries) -> UMAPTransformResult
Use the given UMAP result to embed new points into an existing embedding. queries is a matrix or vector containing any number of points in the same space as result.data. The returned embedding is the embedding of these points in n-dimensional space, where n is the dimensionality of result.embedding. This embedding is created by finding neighbors of queries in result.embedding and optimizing cross entropy according to the membership strengths derived from these neighbors.
The transform is parameterized by the config found in result. For that reason, the type of queries must exactly match that of result.data, including being a named tuple if necessary.
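A typical fit-then-transform workflow might look like the sketch below. It uses only the signatures documented on this page. The random data, the chosen keyword values, and the assumption that each column of the matrix is one point are illustrative and should be checked against the installed version of UMAP.jl.

```julia
using UMAP

# Assumption: each column is one observation (here 200 points in 10 dimensions).
data = rand(10, 200)

# New points in the same 10-dimensional space, built with the same container
# and element type as `data`.
queries = rand(10, 20)

# Fit a 2-dimensional embedding, overriding a few of the documented keywords.
result = UMAP.fit(data, 2; n_neighbors = 10, min_dist = 0.1, n_epochs = 100)

# Embed the new points using the configuration stored in `result`.
transformed = UMAP.transform(result, queries)
```

Because the transform reuses the configuration stored in result, building queries the same way as the training data avoids type mismatches.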
UMAP.UMAPConfig — Type
Configuration struct for the UMAP algorithm.
UMAP.UMAPResult — Type
Return result of the UMAP algorithm.