<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>seongyeon_park.log</title>
        <link>https://velog.io/</link>
        <description>General Engineer</description>
        <lastBuildDate>Mon, 10 Oct 2022 04:23:35 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <image>
            <title>seongyeon_park.log</title>
            <url>https://velog.velcdn.com/images/seongyeon_park/profile/0726e7a9-6408-4e4b-8ab8-823d47df70e8/social_profile.png</url>
            <link>https://velog.io/</link>
        </image>
        <copyright>Copyright (C) 2019. seongyeon_park.log. All rights reserved.</copyright>
        <atom:link href="https://v2.velog.io/rss/seongyeon_park" rel="self" type="application/rss+xml"/>
        <item>
            <title><![CDATA[[논문리뷰] STP-UDGAT: Spatial-Temporal-Preference User Dimensional Graph Attention Network for Next POI Recommendation (2020)]]></title>
            <link>https://velog.io/@seongyeon_park/%EB%85%BC%EB%AC%B8%EB%A6%AC%EB%B7%B0-STP-UDGAT-Spatial-Temporal-Preference-User-Dimensional-Graph-Attention-Network-for-Next-POI-Recommendation</link>
            <guid>https://velog.io/@seongyeon_park/%EB%85%BC%EB%AC%B8%EB%A6%AC%EB%B7%B0-STP-UDGAT-Spatial-Temporal-Preference-User-Dimensional-Graph-Attention-Network-for-Next-POI-Recommendation</guid>
            <pubDate>Mon, 10 Oct 2022 04:23:35 GMT</pubDate>
<description><![CDATA[<h2 id="세-줄-요약">Three-Line Summary</h2>
<ul>
<li>Performs new PoI exploration from a user&#39;s historical PoI data along three aspects: spatial, temporal, and preference</li>
<li>Exploits the user&#39;s historical data (local view) + explores new PoIs (global view)</li>
<li>Random walks via masked self-attention enable higher-order PoI exploration (optional)</li>
</ul>
<h2 id="problem">Problem</h2>
<ul>
<li>Existing (RNN-based) methods learn the PoI-PoI relationship only from users&#39; sequential visit data</li>
</ul>
<h2 id="solution">Solution</h2>
<ul>
<li>Incorporate additional factors (STP: spatial, temporal, preference) when learning the PoI-PoI relationship.</li>
</ul>
<h2 id="approach">Approach</h2>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/ccb0038b-d681-4dd1-a7c9-c743e79fe796/image.png" alt=""></p>
<h3 id="dgat">DGAT</h3>
<p>(Algorithm details omitted)</p>
<ul>
<li>Appears to be a variant of GAT; a DGAT layer is applied to every embedding</li>
</ul>
<h3 id="pp-dgatpersonalized-preference-dgat">PP-DGAT(Personalized Preference-DGAT)</h3>
<ul>
<li>Constructs a graph from the user&#39;s historical PoI visits</li>
<li>All PoIs in this graph are fully connected, however</li>
<li>This part corresponds to the exploitation of users&#39; historical PoIs (local view)</li>
</ul>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/e6d699af-affa-416a-a741-dd1964420726/image.png" alt=""></p>
<h3 id="stp-dgat">STP-DGAT</h3>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/b4736dfc-9048-41c1-917c-df58dd9db29e/image.png" alt=""></p>
<h4 id="spatial-graph">Spatial Graph</h4>
<ul>
<li>Each PoI is made adjacent to its top-k nearest PoIs</li>
<li>Edge weight is the inverse of the Euclidean distance</li>
</ul>
<h4 id="temporal-graph">Temporal Graph</h4>
<ul>
<li>Consecutively visited PoIs are made adjacent</li>
<li>The time intervals of consecutive-visit pairs are pooled across all users and averaged</li>
<li>Edge weight is the inverse of the averaged time interval</li>
</ul>
<h4 id="preference-graph">Preference Graph</h4>
<ul>
<li>Consecutively visited PoIs are made adjacent</li>
<li>Edge weight is the transition frequency (visit count) between the PoIs; see the sketch after this list</li>
</ul>
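<p>A minimal sketch of the spatial-graph construction (the names <code>coords</code> and <code>k</code> are illustrative, not from the paper; the temporal and preference graphs follow the same pattern, with the inverse averaged interval and the visit frequency as weights):</p>
<pre><code class="language-python">import numpy as np

def build_spatial_graph(coords, k):
    # Each PoI becomes adjacent to its k nearest PoIs;
    # edge weight = 1 / Euclidean distance.
    n = len(coords)
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)             # exclude self-loops
    weights = np.zeros((n, n))
    for i in range(n):
        nearest = np.argsort(dists[i])[:k]      # top-k closest PoIs
        weights[i, nearest] = 1.0 / dists[i, nearest]
    return weights
</code></pre>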
<h3 id="exploring-new-stp-neighbours">Exploring New STP Neighbours</h3>
<ul>
<li>Finds relevant new PoIs in the STP graphs constructed above</li>
<li>The nodes in the user&#39;s PP graph serve as the seed set</li>
<li><strong>Random Walk Masked Self-Attention</strong> enables higher-order exploration in the STP graphs as well, as sketched after this list</li>
</ul>
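<p>A rough sketch of the exploration idea, assuming <code>adj</code> is a weight matrix like the one above; the walk length and sampling rule are illustrative, not the paper&#39;s exact procedure:</p>
<pre><code class="language-python">import numpy as np

def random_walks(adj, seeds, walk_len, rng=np.random.default_rng()):
    # Walk from each seed node, choosing neighbours with
    # probability proportional to edge weight.
    walks = []
    for seed in seeds:
        walk = [seed]
        for _ in range(walk_len):
            w = adj[walk[-1]]
            if w.sum() == 0:                    # dead end: stop early
                break
            walk.append(rng.choice(len(w), p=w / w.sum()))
        walks.append(walk)
    return walks    # visited nodes form the candidate set of new PoIs
</code></pre>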
<h3 id="stp-udgatuser-dgat">STP-UDGAT(User DGAT)</h3>
<ul>
<li>Intended to draw on other, similar users</li>
<li>Two users are defined as adjacent if their Jaccard similarity coefficient is at least 0.2 (see the sketch below)</li>
</ul>
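<p>The adjacency criterion as a sketch (assuming <code>visits_u</code> and <code>visits_v</code> are the two users&#39; sets of visited PoIs):</p>
<pre><code class="language-python">def users_adjacent(visits_u, visits_v, threshold=0.2):
    # Adjacent if the Jaccard similarity of the visited-PoI sets
    # is at least the threshold (0.2 in the paper).
    inter = len(visits_u &amp; visits_v)
    union = len(visits_u | visits_v)
    return union &gt; 0 and inter / union &gt;= threshold
</code></pre>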
<h2 id="results">Results</h2>
<ul>
<li>A pity that the baselines include no GNN-family methods</li>
</ul>
]]></description>
        </item>
        <item>
            <title><![CDATA[cs224w | Lecture 4. Link Analysis: PageRank]]></title>
            <link>https://velog.io/@seongyeon_park/cs224w-Lecture-4.-Link-Analysis-PageRank</link>
            <guid>https://velog.io/@seongyeon_park/cs224w-Lecture-4.-Link-Analysis-PageRank</guid>
            <pubDate>Sun, 09 Oct 2022 03:26:46 GMT</pubDate>
            <description><![CDATA[<h2 id="link-analysis-algorithms">Link Analysis Algorithms</h2>
<h3 id="pagerank-the-flow-model">PageRank: The &quot;Flow&quot; Model</h3>
<ul>
<li>A page is important if it is pointed to by other important pages</li>
</ul>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/fb486d3e-5e70-4f9f-991c-7a81bafc4c1c/image.png" alt=""></p>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/69663582-c899-4f7c-93bf-35b81ff98425/image.png" alt=""></p>
<ul>
<li>Connection to random walk</li>
</ul>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/8baa652c-6e4e-477b-b508-86c7b640fcdb/image.png" alt=""></p>
<ul>
<li>What is the long-term distribution of the surfers?</li>
</ul>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/a8949235-25f5-4bf5-b15b-0a3a96998742/image.png" alt=""></p>
<ul>
<li><p>PageRank is the stationary distribution of the random walk</p>
</li>
<li><p>This reduces to finding the eigenvector of the matrix M with eigenvalue 1</p>
</li>
<li><p>Summary</p>
</li>
</ul>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/ddc6571e-e58f-4182-8767-b6716f9e4812/image.png" alt=""></p>
<h3 id="pagerank-how-to-solve">PageRank: How to solve?</h3>
<ul>
<li>Power Iteration</li>
</ul>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/bcb63ede-a874-4894-9ac9-e8c224e921bf/image.png" alt=""></p>
<ul>
<li>Problems of Spider Traps / Dead ends</li>
<li><blockquote>
<p>Use <strong>Teleports</strong></p>
</blockquote>
</li>
<li>PageRank uses the Google matrix: with damping factor $\beta$, the surfer follows a link with probability $\beta$ and teleports to a random node with probability $1-\beta$ (see the sketch after the figures)</li>
</ul>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/37c90bdd-b4a0-45ad-8fef-4803749bf44f/image.png" alt=""></p>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/c80c0582-1693-485e-94ac-2e2b2c3c0303/image.png" alt=""></p>
<h3 id="random-walk-with-restarts--personalized-pagerank">Random Walk with Restarts / Personalized PageRank</h3>
<ul>
<li>In Personalized PageRank, the teleport probability over nodes is not uniform</li>
</ul>
<h3 id="matrix-factorization-and-node-embeddings">Matrix Factorization and Node Embeddings</h3>
<ul>
<li>Two nodes being similar means they are connected to each other, i.e.,
$$
\bold{Z}^{T}\bold{Z} = A
$$</li>
<li>In practice, however, the number of nodes n is far larger than the embedding dimension d, so an exact factorization reproducing A is infeasible.</li>
<li>We can nevertheless approximate $\bold{Z}$ so that the difference between the two is small, as in the sketch after the figure.</li>
</ul>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/be18e655-7a39-4a8f-9beb-8192f6e3a5de/image.png" alt=""></p>
<ul>
<li><p>DeepWalk, Node2Vec</p>
</li>
<li><p>Limitations</p>
<ol>
<li>Cannot obtain embeddings for nodes not in the training set</li>
<li>Cannot capture structural similarity</li>
<li>Cannot utilize node, edge and graph features</li>
</ol>
</li>
</ul>
]]></description>
        </item>
        <item>
            <title><![CDATA[cs224w | Lecture 3. Node Embeddings]]></title>
            <link>https://velog.io/@seongyeon_park/cs224w-Lecture-3.-Node-Embeddings</link>
            <guid>https://velog.io/@seongyeon_park/cs224w-Lecture-3.-Node-Embeddings</guid>
            <pubDate>Sat, 08 Oct 2022 08:07:20 GMT</pubDate>
            <description><![CDATA[<h2 id="embedding-nodes">Embedding Nodes</h2>
<h3 id="encoder-and-decoder">Encoder and Decoder</h3>
<ul>
<li>Goal: embed nodes so that similar nodes lie close together in the embedding space
$$
similarity(u,v) \approx \bold{z}_{v}^{T} \bold{z}_{u}
$$</li>
</ul>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/67ddbdd7-807b-4650-8e2a-a70b64ab87c3/image.png" alt=""></p>
<h3 id="shallow-encoding">&quot;Shallow&quot; Encoding</h3>
<ul>
<li>Encoder is just an embedding-lookup</li>
<li>Each node is assigned a unique embedding vector</li>
</ul>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/33c45caa-6763-4ff6-9e91-4fb7f5c1974d/image.png" alt=""></p>
<p>The goal is to optimize the parameters $\bold{Z}$ that maximize this similarity</p>
<h2 id="random-walk-embeddings">Random-Walk Embeddings</h2>
<p>$\bold{z}_{u}^{T}\bold{z}_{v} \approx$ the probability that $u$ and $v$ co-occur on a random walk</p>
<ul>
<li>Train so that the predicted probability matches the actual random-walk probability</li>
<li>Nodes appearing in a walk&#39;s neighborhood set should receive high probability</li>
<li>Generally works well and efficiently</li>
<li>Expressivity: incorporate high-order, multi-hop info</li>
<li>Efficiency: only need to consider pairs that co-occur</li>
</ul>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/922b4e77-fc07-4993-bf3c-490cf38c8565/image.png" alt=""></p>
<ul>
<li>Normalizing over all nodes is expensive</li>
<li><blockquote>
<p>Solution: Negative sampling</p>
</blockquote>
<ul>
<li>Sample k negative nodes instead</li>
<li>Sample each with probability proportional to its degree</li>
<li>In practice, k = 5~20</li>
</ul>
</li>
<li>Optimize embeddings using Stochastic Gradient Descent (a per-pair loss sketch follows below)</li>
</ul>
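<p>A sketch of the per-pair loss with negative sampling (assuming <code>Z</code> is the embedding matrix and <code>degrees</code> the node degrees; this follows the standard objective, not any particular implementation):</p>
<pre><code class="language-python">import numpy as np

def neg_sampling_loss(Z, u, v, degrees, k=5, rng=np.random.default_rng()):
    # Loss for one co-occurring pair (u, v) with k degree-proportional negatives.
    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))
    negs = rng.choice(len(degrees), size=k, p=degrees / degrees.sum())
    loss = -np.log(sigmoid(Z[u] @ Z[v]))            # pull the pair together
    loss -= np.log(sigmoid(-Z[negs] @ Z[u])).sum()  # push negatives away
    return loss
</code></pre>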
<h4 id="node2vec">Node2Vec</h4>
<ul>
<li>So how should the random walks themselves be performed?</li>
<li>node2vec is a biased walk-sampling strategy for embedding a graph&#39;s nodes as vectors, interpolating between two extremes (a step sketch follows the figures below):</li>
<li>Breadth-First Search</li>
<li>Depth-First Search</li>
</ul>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/0ff93a1e-c7af-4f7d-8e49-a3e795b7551a/image.png" alt=""></p>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/cc392fed-0e3b-4e4c-b7db-3931a059c6e6/image.png" alt=""></p>
<ul>
<li>Parameters<ul>
<li>Return parameter p</li>
<li>In-out parameter q: ratio between BFS and DFS</li>
</ul>
</li>
</ul>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/1b45881c-ae4d-49e2-81b2-82584578365f/image.png" alt=""></p>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/e2984a8f-0828-46d8-a09d-8e9a0ae7493c/image.png" alt=""></p>
<h2 id="embedding-entire-graphs">Embedding Entire Graphs</h2>
<h3 id="approaches">Approaches</h3>
<ul>
<li>Sum the node (sub-graph) embeddings</li>
<li>Use the embedding of a virtual node</li>
<li>Anonymous Walks: a probability vector over the possible anonymous walks (see the sketch after this list)</li>
<li>Learn Walk Embeddings: predict the next anonymous walk</li>
</ul>
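<p>For intuition, a walk is anonymized by replacing each node with the index of its first appearance, so a walk A-B-C-B-C becomes 0-1-2-1-2 (a minimal sketch):</p>
<pre><code class="language-python">def anonymize(walk):
    # Map each node to the index of its first occurrence in the walk.
    first_seen = {}
    return [first_seen.setdefault(node, len(first_seen)) for node in walk]

anonymize([7, 3, 9, 3, 9])  # -&gt; [0, 1, 2, 1, 2]
</code></pre>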
]]></description>
        </item>
        <item>
            <title><![CDATA[cs224w | Lecture 2. Traditional Methods for ML on Graphs]]></title>
            <link>https://velog.io/@seongyeon_park/cs224w-Lecture-2.-Traditional-Methods-for-ML-on-Graphs</link>
            <guid>https://velog.io/@seongyeon_park/cs224w-Lecture-2.-Traditional-Methods-for-ML-on-Graphs</guid>
            <pubDate>Sat, 08 Oct 2022 07:14:47 GMT</pubDate>
            <description><![CDATA[<h2 id="graph-level-features">Graph-Level Features</h2>
<h3 id="kernel-methods">Kernel Methods</h3>
<ul>
<li>Graph Kernels: Measure similarity between graphs</li>
<li>Goal: Design graph feature vector $\phi (G)$</li>
<li>Key Idea: <strong>Bag of Words</strong></li>
</ul>
<h4 id="graphlet-kernels">Graphlet Kernels</h4>
<ul>
<li>Key Idea: Count the number of different graphlets in a graph</li>
<li>Don&#39;t have to be connected</li>
<li>Don&#39;t have to be rooted</li>
<li>Given two graphs, G and G&#39;, the graphlet kernel is computed as follows (sketched in code after the figures):
$$
K(G,G&#39;)=\mathit{\bold{f}}_{G}^{T} \mathit{\bold{f}}_{G&#39;}
$$</li>
</ul>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/955170e3-9cf8-46ad-a892-54882694b814/image.png" alt=""></p>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/e776d49c-1fe4-4598-9959-00ca1cf07d85/image.png" alt=""></p>
<ul>
<li>Limitations: Counting graphlets is expensive!</li>
</ul>
<h4 id="weisfeiler-lehman-kernel">Weisfeiler-Lehman Kernel</h4>
<ul>
<li>Color Refinement: iterate color aggregation and obtain color count vectors (sketched after the figures)</li>
<li>Closely related to Graph Neural Networks</li>
<li>Counting colors takes time linear in #(nodes)</li>
</ul>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/21e890e8-7e22-4990-9ea8-b0097b1ceef7/image.png" alt=""></p>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/7825ceda-c4be-4a9d-9f4a-45a0e79b1c36/image.png" alt=""></p>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/d9e85d18-4036-4170-a2c9-5a1cdfd522fa/image.png" alt=""></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[[논문리뷰] LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation (2020)]]></title>
            <link>https://velog.io/@seongyeon_park/LightGCN-Simplifying-and-Powering-Graph-Convolution-Network-for-Recommendation</link>
            <guid>https://velog.io/@seongyeon_park/LightGCN-Simplifying-and-Powering-Graph-Convolution-Network-for-Recommendation</guid>
            <pubDate>Tue, 04 Oct 2022 08:25:50 GMT</pubDate>
            <description><![CDATA[<h2 id="세-줄-요약">세 줄 요약</h2>
<ul>
<li>Aims to simplify the parts of the standard GCN design (feature transformation, nonlinear activation) that degrade collaborative-filtering performance</li>
<li>Proposes LightGCN, which keeps only the essential component, neighborhood aggregation</li>
<li>Shows improved results over SOTA GCN-based recommenders</li>
</ul>
<h2 id="problem-definition">Problem definition</h2>
<ul>
<li>GCN is a method used mainly for node classification on graphs with rich node features</li>
<li>In the user-item interaction graph for CF, however, each node carries only an ID embedding (no concrete semantics)</li>
<li>Hence the feature transformation and nonlinear activation of GCN contribute little in Neural Graph Collaborative Filtering (NGCF)</li>
</ul>
<h2 id="solution">Solution</h2>
<ul>
<li>LightGCN keeps only the most essential component of GCN: neighborhood aggregation</li>
</ul>
<h2 id="empirical-explorations-on-ngcf">Empirical Explorations on NGCF</h2>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/ba2d9aa6-68db-4fbb-8ab0-46ea3e5059b0/image.png" alt=""></p>
<ul>
<li>NGCF-f (removes feature transformation)</li>
<li>NGCF-n (removes the non-linear activation function)</li>
</ul>
<h3 id="finding">Finding</h3>
<p>(1) Feature transformation imposes a negative effect on NGCF
(2) Nonlinear activation has only a slight effect
(3) Removing both simultaneously yields a large improvement (9.57% relative improvement in recall)
<strong>-&gt; The deterioration of NGCF stems from training difficulty rather than from overfitting</strong></p>
<h2 id="lightgcn">LightGCN</h2>
<p><img src="https://velog.velcdn.com/images/seongyeon_park/post/9965b79f-a536-4720-a3de-e9180a980636/image.png" alt=""></p>
<p>In GCN, the basic update of an item or user embedding is
$$
\bold{e}_{u}^{(k+1)}=AGG(\bold{e}_{u}^{(k)},\left\{\bold{e}_{i}^{(k)}:i\in \mathcal{N}_{u}\right\}).
$$
LightGCN removes the self-connection, feature transformation, and nonlinear activation from this,
so its neighborhood aggregation becomes
$$
\bold{e}_{u}^{(k+1)}=\sum_{i \in \mathcal{N}_{u}} \frac {1} {\sqrt{\left\vert \mathcal{N}_{u} \right\vert} \sqrt{\left\vert \mathcal{N}_{i} \right\vert}} \bold{e}_{i}^{(k)} \\
\bold{e}_{i}^{(k+1)}=\sum_{u \in \mathcal{N}_{i}} \frac {1} {\sqrt{\left\vert \mathcal{N}_{i} \right\vert} \sqrt{\left\vert \mathcal{N}_{u} \right\vert}} \bold{e}_{u}^{(k)}
$$
Since the aggregation is a simple weighted sum, the only trainable model parameters are the 0-th layer embeddings $\bold{e}_{u}^{(0)}$ and $\bold{e}_{i}^{(0)}$. 
To obtain the final representations, a <strong>layer combination</strong> sums the embeddings of all $K+1$ layers: 
$$
\bold{e}_{u}=\sum^{K}_{k=0}\alpha_{k}\bold{e}_{u}^{(k)};\quad
\bold{e}_{i}=\sum^{K}_{k=0}\alpha_{k}\bold{e}_{i}^{(k)}
$$
Here $\alpha_{k}$, set as a hyperparameter, is the importance of the k-th layer embedding. 
The paper reports that setting it uniformly to $1/(K+1)$ for every layer performs well overall. 
Three reasons are given for the <strong>layer combination</strong>: 
(1) to mitigate the over-smoothing that worsens as the number of layers grows;
(2) because the embeddings at each layer capture different semantics;
(3) because layer combination via a weighted sum has the effect of a self-connection (despite the alpha being uniform...).</p>
<p>Finally, for model prediction, the inner product of the user and item embeddings gives the ranking score for recommendation:
$$
\hat{y}_{ui}=\bold{e}_{u}^{T}\bold{e}_{i}
$$
In matrix form, the process looks as follows. </p>
<p>For the user-item interaction matrix $\bold{R} \in \mathbb{R}^{M\times N}$, </p>
<p>$$
\bold{A}=\begin{pmatrix} \bold{0} &amp; \bold{R} \\ \bold{R}^{T} &amp; \bold{0} \end{pmatrix},
$$
for embedding size $T$,
$$
\bold{E}^{(0)} \in \mathbb{R}^{(M+N)\times T},
$$
and for the symmetrically normalized matrix $\tilde{A}=\bold{D}^{-\frac{1}{2}}\bold{A}\bold{D}^{-\frac{1}{2}}$,
$$
\bold{E}^{(k+1)}=\bold{D}^{-\frac{1}{2}}\bold{A}\bold{D}^{-\frac{1}{2}}\bold{E}^{(k)}. 
$$
Therefore,
$$
\begin{aligned} 
\bold{E} &amp; = \alpha_{0}\bold{E}^{(0)}+\alpha_{1}\bold{E}^{(1)}+\alpha_{2}\bold{E}^{(2)}+\cdots+\alpha_{K}\bold{E}^{(K)} \\
&amp; = \alpha_{0}\bold{E}^{(0)}+\alpha_{1}\tilde{A}\bold{E}^{(0)}+\alpha_{2}\tilde{A}^{2}\bold{E}^{(0)}+\cdots+\alpha_{K}\tilde{A}^{K}\bold{E}^{(0)} 
\end{aligned}
$$</p>
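<p>The whole matrix-form pipeline fits in a few lines (a sketch with illustrative names; <code>R</code> is the $M\times N$ interaction matrix, <code>E0</code> the 0-th layer embedding matrix, and $\alpha_{k}=1/(K+1)$ as in the paper):</p>
<pre><code class="language-python">import numpy as np

def lightgcn_embeddings(R, E0, K):
    # Propagate E(k+1) = A_tilde @ E(k), then combine layers uniformly.
    M, N = R.shape
    A = np.block([[np.zeros((M, M)), R],        # A = [[0, R], [R^T, 0]]
                  [R.T, np.zeros((N, N))]])
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))   # guard isolated nodes
    A_tilde = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    E, out = E0, E0 / (K + 1)                   # alpha_k = 1/(K+1)
    for _ in range(K):
        E = A_tilde @ E
        out = out + E / (K + 1)
    return out   # rows 0..M-1 are users, rows M..M+N-1 are items
</code></pre>
<p>Ranking scores for every user-item pair are then <code>out[:M] @ out[M:].T</code>.</p>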
]]></description>
        </item>
    </channel>
</rss>