\documentclass[a4paper]{article}


\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{mathptmx}

%\usepackage{ngerman}	% German language support

\usepackage{graphicx}
\usepackage{geometry}
\geometry{a4paper, top=15mm}

\usepackage{subcaption}
\usepackage[shortlabels]{enumitem}
\usepackage{amssymb}
\usepackage{amsthm}
\usepackage{mathtools}
\usepackage{braket}
\usepackage{bbm}
\usepackage{float}
\usepackage{yhmath}
\usepackage{tikz}
\usetikzlibrary{patterns,decorations.pathmorphing,positioning}
\usetikzlibrary{calc,decorations.markings}

\usepackage[backend=biber, sorting=none]{biblatex}
\addbibresource{uni.bib}

\usepackage[framemethod=TikZ]{mdframed}

\tikzstyle{titlered} =
    [draw=black, thick, fill=white,%
        text=black, rectangle,
        right, minimum height=.7cm]


\usepackage[colorlinks=true,naturalnames=true,plainpages=false,pdfpagelabels=true]{hyperref}
\usepackage[parfill]{parskip}
\usepackage{lipsum}


\usepackage{tcolorbox}
\tcbuselibrary{skins,breakable}

\pagestyle{myheadings}

\markright{Popović\hfill Tensor Methods \hfill}
\title{University of Vienna\\ Faculty of Mathematics\\
\vspace{1cm}TENSOR METHODS FOR DATA SCIENCE AND SCIENTIFIC COMPUTING
}
\author{Milutin Popovic}

\begin{document}
\maketitle
\tableofcontents
\section{Assignment 3}
\subsection{Implementing the CP-ALS Algorithm}
The main idea of the algorithm is that we have a rank $R\in \mathbb{N}$ CPD
of a tensor $A \in \mathbb{R}^{n_1 \times \cdots \times n_d}$ given in terms of
factor matrices $U_k \in \mathbb{R}^{n_k \times R}$ for $k \in \{1,\dots,d\}$,
with $d \geq 2$ and $n_1, \dots, n_d \in \mathbb{N}$. We want to find a
rank $r\in \mathbb{N}$ CPD of the tensor $A$ by taking an initial guess
for the factor matrices $V_k \in \mathbb{R}^{n_k \times r}$ and updating them
for each $k$ by optimizing the problem
\begin{align}
    \phi(V_1, \dots, V_d) \rightarrow \min_{V_k \in \mathbb{R}^{n_k\times
    r}},
\end{align}
where $\phi(V_1, \dots, V_d)$ is the error function, determined by
\begin{align}
    \phi(V_1, \dots, V_d) = \left\|\Psi_r(V_1,\dots,V_d) - \Psi_R(U_1,
    \dots,U_d) \right\|_F .
\end{align}
Here $\Psi_r$ and $\Psi_R$ denote the CPD multilinear representation maps
transforming the CP decompositions into the tensors
\begin{align}
    U &= \Psi_R(U_1, \dots, U_d),\\
    V &= \Psi_r(V_1, \dots, V_d),
\end{align}
and the updates run for $k = 1, \dots, d, d-1, \dots, 2$ sequentially,
updating $V_k$ at each step of the optimization. \textbf{This is one iteration step}.
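For reference in what follows, a minimal sketch of the representation map $\Psi$ (NumPy is assumed; the function name \texttt{psi} and the outer-product formulation are ours, and the actual implementation from the last exercise may differ):

```python
import numpy as np

def psi(factors):
    # Evaluate the CP representation map: for factor matrices U_k of
    # shape (n_k, R), return the sum over the R rank-one outer
    # products of the columns, sum_a U_1[:, a] o ... o U_d[:, a].
    R = factors[0].shape[1]
    shape = tuple(U.shape[0] for U in factors)
    T = np.zeros(shape)
    for a in range(R):
        term = factors[0][:, a]
        for U in factors[1:]:
            term = np.multiply.outer(term, U[:, a])
        T += term
    return T
```

The loop works for any order $d$, since `np.multiply.outer` appends one axis per factor.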
     86 
For $\Psi$ we will use the implementation constructed in the last exercise,
which consists of applying the Kronecker product of the rows and then
summing over them. Once we define the following matrices, we may rewrite the
optimality condition as a linear system, which is solvable since it is a
least-squares problem. We define matrices $\mathcal{V}_k \in \mathbb{R}^{n_1\cdots n_d
\times n_k\cdot r}$ and $\mathcal{U}_k \in \mathbb{R}^{n_1\cdots n_d \times
n_k\cdot R}$ for $k \in \{1, \dots,d\}$ by
\begin{align}
    \mathcal{U}_k &= \prod_{\substack{l=1 \\ l\neq k}}^{d} U_l \otimes
    \mathbbm{1}_{\mathbb{R}^{n_k\times n_k}},\\
    \mathcal{V}_k &= \prod_{\substack{l=1 \\ l\neq k}}^{d} V_l \otimes
    \mathbbm{1}_{\mathbb{R}^{n_k\times n_k}},
\end{align}
where $\prod$ denotes the Hadamard product (elementwise multiplication) and
$\otimes$ the Kronecker product. The products of these newly defined
matrices are then simply
\begin{align}
    \mathcal{V}_k^*\mathcal{V}_k &=  \prod_{\substack{l=1 \\ l\neq k}}^{d}
    (V_l^*V_l) \otimes \mathbbm{1}_{\mathbb{R}^{n_k\times n_k}},
    \label{eq: VV}\\
    \mathcal{V}_k^*\mathcal{U}_k &=  \prod_{\substack{l=1 \\ l\neq k}}^{d}
    (V_l^*U_l) \otimes \mathbbm{1}_{\mathbb{R}^{n_k\times n_k}}.
    \label{eq: VU}
\end{align}
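Identity \ref{eq: VV} can be checked numerically. The following is our own construction (NumPy assumed, $d = 3$, $k = 2$, and a column ordering of $\mathcal{V}_2$ by $(\alpha, i_k)$ with $\alpha$ outer):

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2, n3, r = 3, 2, 4, 5
V1, V3 = (rng.standard_normal((n, r)) for n in (n1, n3))

# Column (a, i) of the matrix V_2 is V1[:, a] (x) e_i (x) V3[:, a],
# where e_i is the i-th standard basis vector of R^{n2}.
I = np.eye(n2)
M = np.zeros((n1 * n2 * n3, r * n2))
for a in range(r):
    for i in range(n2):
        M[:, a * n2 + i] = np.kron(np.kron(V1[:, a], I[:, i]), V3[:, a])

# Identity (VV): M^* M equals the Hadamard product of the Gram
# matrices of the remaining factors, Kronecker-multiplied with I_{n2}.
H = (V1.T @ V1) * (V3.T @ V3)
assert np.allclose(M.T @ M, np.kron(H, I))
```

The point of the identity is that the left-hand side is an $rn_2 \times rn_2$ product of huge matrices, while the right-hand side only touches the small Gram matrices.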
Do note that $\mathcal{V}_k^*\mathcal{V}_k \in \mathbb{R}^{r\cdot n_k \times
r\cdot n_k}$ and $\mathcal{V}_k^*\mathcal{U}_k \in \mathbb{R}^{r\cdot n_k
\times R\cdot n_k}$. For convenience let us define $F_k := V_k^* U_k$ and
$G_k := V_k^* V_k$, which will become useful when talking about storing the
computation. Then we can solve for $V_k$ with the following least-squares
problem
\begin{align}
    (\mathcal{V}_k^*\mathcal{V}_k)\hat{v}_k = (\mathcal{V}_k^*\mathcal{U}_k)u_k,
\end{align}
where $\hat{v}_k = \text{vec}(\hat{V}_k)$ and $u_k = \text{vec}(U_k)$.
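Since $(H \otimes \mathbbm{1})\,\text{vec}(X) = \text{vec}(X H^*)$, the system above decouples into the small matrix equation $\hat{V}_k \big(\prod_{l\neq k} G_l\big) = U_k \big(\prod_{l\neq k} F_l\big)^*$. A sketch of the resulting micro-step (our own hypothetical NumPy code, not the cited implementation):

```python
import numpy as np

def als_update(Us, Vs, k):
    # One CP-ALS micro-step: re-solve for V_k with all other V_l fixed.
    # G_l = V_l^* V_l and F_l = V_l^* U_l are the (cross-)Gram matrices;
    # the normal equations reduce to  V_k_new @ H_V = U_k @ H_VU.T .
    r, R = Vs[0].shape[1], Us[0].shape[1]
    H_V, H_VU = np.ones((r, r)), np.ones((r, R))
    for l, (U, V) in enumerate(zip(Us, Vs)):
        if l != k:
            H_V *= V.T @ V    # Hadamard product of the G_l
            H_VU *= V.T @ U   # Hadamard product of the F_l
    # Solve H_V X = H_VU U_k^T, then V_k_new = X^T.
    return np.linalg.solve(H_V, H_VU @ Us[k].T).T
```

Each micro-step minimizes $\phi$ exactly over $V_k$ (when $H_V$ is nonsingular), so the error is non-increasing along a sweep.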
    121 
To make the computational cost linear with respect to $d$, we can compute
$\mathcal{V}_k^*\mathcal{V}_k$ and $\mathcal{V}_k^*\mathcal{U}_k$ once for
each $k$ and update them during the iteration in $k$ with the new $\hat{V}_k$,
by recomputing the products in equations \ref{eq: VV} and \ref{eq: VU}.
Additionally, we compute the error $\phi$ and $\|\nabla \frac{1}{2}
\phi^2\|_2$ after each iteration (after going through
$k=2,\dots,d,d-1,\dots,1$). The implementation is in \cite{code}.
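In the same spirit, $\phi$ itself can be evaluated from Gram matrices alone, since $\braket{\Psi(A_1,\dots,A_d), \Psi(B_1,\dots,B_d)}$ equals the sum of all entries of $\prod_l (A_l^* B_l)$; the cost is again linear in $d$ and no full tensors are formed. A sketch, with helper names of our own choosing:

```python
import numpy as np

def cp_inner(As, Bs):
    # <Psi(A_1,...,A_d), Psi(B_1,...,B_d)> equals the sum of all
    # entries of the Hadamard product of the cross-Gram matrices.
    H = np.ones((As[0].shape[1], Bs[0].shape[1]))
    for A, B in zip(As, Bs):
        H *= A.T @ B
    return H.sum()

def phi(Vs, Us):
    # Frobenius error between the two CP representations; the max
    # guards against tiny negative values caused by rounding.
    val = cp_inner(Vs, Vs) - 2.0 * cp_inner(Vs, Us) + cp_inner(Us, Us)
    return np.sqrt(max(val, 0.0))
```

Note that the two representations may have different ranks; the Hadamard product in \texttt{cp\_inner} is then rectangular ($r \times R$).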
    129 
\subsection{CP-ALS for the matrix multiplication tensor}
In this section we use the matrix multiplication tensor to experiment with
the CP-ALS algorithm we implemented in the last section. We consider $n = 2$
with $r = 7$, $n = 3$ with $r = 23$, and $n = 4$ with $r = 49$ for the
multiplication tensor and its CP decomposition. We implemented these in the
last exercise, together with their rank $R = n^3$ CP decompositions. We test
our implementation of the CP-ALS algorithm on these three multiplication
tensors with random initial guesses $V_1, V_2, V_3$, where each matrix has
columns of unit norm.
    139 
Our procedure is to try seven different random guesses and check whether
they are good enough, i.e.\ whether they avoid producing a
\textit{SingularValueError} within 10 iterations; if so, we run 10000
iterations and plot both $\phi$ and $\|\nabla \frac{1}{2} \phi^2\|_2$
against the number of iterations for each multiplication tensor. We get the
following curves.
\begin{figure}[H]
    \centering
    \includegraphics[width=0.33\textwidth]{./../src/plots/err_2.png}
    \includegraphics[width=0.33\textwidth]{./../src/plots/err_3.png}
    \includegraphics[width=0.32\textwidth]{./../src/plots/err_4.png}
    \caption{From left to right: rank $r\in\{7, 23, 49\}$ CP-ALS error for the
    $n\in \{2, 3, 4\}$ multiplication tensor over 10000 iteration steps.}
\end{figure}



\printbibliography
\end{document}