<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>gugu_dragon.log</title>
        <link>https://velog.io/</link>
        <description>Lifetime Value Creator</description>
        <lastBuildDate>Wed, 23 Jun 2021 14:03:15 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <image>
            <title>gugu_dragon.log</title>
            <url>https://images.velog.io/images/gugu_dragon/profile/9f299ff6-0cc3-453d-ae71-970396a3e04e/social.png</url>
            <link>https://velog.io/</link>
        </image>
        <copyright>Copyright (C) 2019. gugu_dragon.log. All rights reserved.</copyright>
        <atom:link href="https://v2.velog.io/rss/gugu_dragon" rel="self" type="application/rss+xml"/>
        <item>
            <title><![CDATA[데.분.레 - 복습 17]]></title>
            <link>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-17</link>
            <guid>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-17</guid>
            <pubDate>Wed, 23 Jun 2021 14:03:15 GMT</pubDate>
            <description><![CDATA[<p><strong>지속률</strong> : 등록일 기준으로 이후 지정일 동안 사용자가 서비스를 얼마나 이용했는지 나타내는 지표</p>
<p><strong>정착률</strong> : 등록일 기준으로 지정한 7일 동안 사용자가 서비스를 사용했는지 나타내는 지표</p>
<pre><code>WITH action_log_with_mst_users AS (
  SELECT u.user_id
       , u.register_date
       , CAST(a.stamp AS date) AS action_date
       , MAX(CAST(a.stamp AS date)) OVER() AS latest_date
       , DATEADD(day, 1, u.register_date::date) AS next_day_1
    FROM mst_users AS u
  LEFT OUTER JOIN action_log AS a 
                  ON u.user_id = a.user_id
  )
SELECT *
  FROM action_log_with_mst_users
  ORDER BY register_date</code></pre><p><strong>CAST</strong> : 데이터 형식을 다른 데이터 형식으로 변환하는 역할
<strong>DATEADD</strong> : 날짜, 시간, 타임스탬프를 지정하는 값만큼 늘리는 역할
<strong>OUTER JOIN</strong> : A, B 테이블을 조인할 경우, 조건에 맞지 않는 데이터도 표시하고 싶을 때 사용한다. </p>
<pre><code>WITH action_log_with_mst_users AS (
  SELECT u.user_id
       , u.register_date
       , CAST(a.stamp AS date) AS action_date
       , MAX(CAST(a.stamp AS date)) OVER() AS latest_date
       , dateadd(day, 1, u.register_date::date) AS next_day_1
    FROM mst_users AS u
  LEFT OUTER JOIN action_log AS a 
                  ON u.user_id = a.user_id
  )
, user_action_flag AS (
  SELECT user_id
       , register_date
       , SIGN(SUM(CASE WHEN next_day_1 &lt;= latest_date THEN
                   CASE WHEN next_day_1 = action_date THEN 1 ELSE 0 END END))
         AS next_1_day_action
    FROM action_log_with_mst_users
    GROUP BY user_id, register_date
  )
SELECT *
  FROM user_action_flag
  ORDER BY register_date, user_id</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/dff88aa6-1e26-4127-b4e9-d068dd1895ae/image.png" alt="">
지정한 날의 다음날에 액션을 했는지 0과 1의 플래그로 표현했다. </p>
<pre><code>WITH action_log_with_mst_users AS (
  SELECT u.user_id
       , u.register_date
       , CAST(a.stamp AS date) AS action_date
       , MAX(CAST(a.stamp AS date)) OVER() AS latest_date
       , dateadd(day, 1, u.register_date::date) AS next_day_1
    FROM mst_users AS u
  LEFT OUTER JOIN action_log AS a 
                  ON u.user_id = a.user_id
  )
, user_action_flag AS (
  SELECT user_id
       , register_date
       , SIGN(SUM(CASE WHEN next_day_1 &lt;= latest_date THEN
                   CASE WHEN next_day_1 = action_date THEN 1 ELSE 0 END END))
         AS next_1_day_action
    FROM action_log_with_mst_users
    GROUP BY user_id, register_date
  )
SELECT register_date
     , AVG(100.0 * next_1_day_action) AS repeat_rate_1_day
  FROM user_action_flag
  GROUP BY register_date
  ORDER BY register_date</code></pre>]]></description>
        </item>
        <item>
            <title><![CDATA[데.분.레 - 복습 16]]></title>
            <link>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-16</link>
            <guid>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-16</guid>
            <pubDate>Sun, 20 Jun 2021 13:14:59 GMT</pubDate>
            <description><![CDATA[<h2 id="시계열의-변화에-따른-사용자-전체의-상태-변화-찾기">시계열의 변화에 따른 사용자 전체의 상태 변화 찾기</h2>
<p><img src="https://images.velog.io/images/gugu_dragon/post/1763a2f7-c809-4524-a3ca-949fdbf44ad0/image.png" alt="">
<img src="https://images.velog.io/images/gugu_dragon/post/65f2eb51-0372-4551-b0bd-c31b3290445f/image.png" alt=""></p>
<h3 id="등록-수의-추이와-경향-보기">등록 수의 추이와 경향 보기</h3>
<pre><code>SELECT register_date
     , COUNT(DISTINCT user_id) AS register_count
  FROM mst_users
  GROUP BY register_date
  ORDER BY register_date</code></pre><pre><code>WITH mst_users_with_year_month AS (
  SELECT *
       , SUBSTRING(register_date, 1, 7) AS year_month
    FROM mst_users
  )
  SELECT year_month
       , COUNT(DISTINCT user_id) AS register_count
       , LAG(COUNT(DISTINCT user_id)) OVER(ORDER BY year_month) AS last_month_count
       , 1.0 * COUNT(DISTINCT user_id) / LAG(COUNT(DISTINCT user_id)) OVER(ORDER BY year_month) - 1
         AS month_over_month_ratio
  FROM mst_users_with_year_month
  GROUP BY year_month</code></pre><p>월별 등록수와 전월비를 구하는 쿼리이다. 
<strong>LAG</strong> : 명시된 값을 기준으로 이전 로우의 값을 반환</p>
<pre><code>WITH mst_users_with_year_month AS (
  SELECT *
       , SUBSTRING(register_date, 1, 7) AS year_month
    FROM mst_users
  )
SELECT year_month
     , COUNT(DISTINCT user_id) AS register_count
     , COUNT(DISTINCT CASE WHEN register_device = &#39;pc&#39; THEN user_id END) AS register_pc
     , COUNT(DISTINCT CASE WHEN register_device = &#39;sp&#39; THEN user_id END) AS register_sp
     , COUNT(DISTINCT CASE WHEN register_device = &#39;app&#39; THEN user_id END) AS register_app
  FROM mst_users_with_year_month
  GROUP BY year_month</code></pre><p>등록 디바이스별 추이를 확인하는 쿼리이다.</p>
]]></description>
        </item>
        <item>
            <title><![CDATA[데.분.레 - 복습 15]]></title>
            <link>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-15</link>
            <guid>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-15</guid>
            <pubDate>Tue, 15 Jun 2021 12:57:05 GMT</pubDate>
            <description><![CDATA[<h3 id="decile-분석을-사용해-사용자를-10단계-그룹으로-나누기">Decile 분석을 사용해 사용자를 10단계 그룹으로 나누기</h3>
<pre><code>WITH user_purchase_amount AS (
  SELECT user_id
       , SUM(amount) AS purchase_amount
    FROM action_log
    WHERE action = &#39;purchase&#39;
    GROUP BY user_id
  )
, users_with_decile AS (
  SELECT user_id
       , purchase_amount
       , ntile(10) OVER(ORDER BY purchase_amount DESC) AS decile
    FROM user_purchase_amount
  )
, decile_with_purchase_amount AS (
  SELECT decile
       , SUM(purchase_amount) AS amount
       , AVG(purchase_amount) AS avg_amount
       , SUM(SUM(purchase_amount)) OVER(ORDER BY decile) AS cumulative_amount
       , SUM(SUM(purchase_amount)) OVER () AS total_amount
    FROM users_with_decile
    GROUP BY decile
  )

SELECT *
  FROM decile_with_purchase_amount</code></pre><p><strong>NTILE</strong> : 정해준 값을 ~~개로 나누어라</p>
<h3 id="rfm-분석으로-사용자를-3가지-관점의-그룹으로-나누기">RFM 분석으로 사용자를 3가지 관점의 그룹으로 나누기</h3>
<p>R : Recency 최근 구매일
F : Frequency 구매 횟수
M : Monetary 구매금액 합계</p>
<pre><code>WITH purchase_log AS (
  SELECT user_id
       , amount
       , SUBSTRING(stamp, 1, 10) AS dt
    FROM action_log
    WHERE action = &#39;purchase&#39;
  )
, user_rfm AS (
  SELECT user_id
       , MAX(dt) AS recent_date
       , CURRENT_DATE - MAX(dt::date) AS recency
       , COUNT(dt) AS frequency
       , SUM(amount) AS monetary
    FROM purchase_log
    GROUP BY user_id
  )
SELECT *
  FROM user_rfm</code></pre>]]></description>
        </item>
        <item>
            <title><![CDATA[데.분.레 - 복습 14]]></title>
            <link>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-14</link>
            <guid>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-14</guid>
            <pubDate>Mon, 14 Jun 2021 13:22:21 GMT</pubDate>
            <description><![CDATA[<h3 id="벤-다이어그램으로-사용자-액션-집계하기">벤 다이어그램으로 사용자 액션 집계하기</h3>
<pre><code>WITH user_action_flag AS (
  SELECT user_id
       , SIGN(SUM(CASE WHEN action = &#39;purchase&#39; THEN 1 ELSE 0 END)) AS has_purchase
       , SIGN(SUM(CASE WHEN action = &#39;review&#39; THEN 1 ELSE 0 END)) AS has_review
       , SIGN(SUM(CASE WHEN action = &#39;favorite&#39; THEN 1 ELSE 0 END)) AS has_favorite
     FROM action_log
     GROUP BY user_id
     )
SELECT *
  FROM user_action_flag</code></pre><p>사용자 단위로 purchase, review, favorite 이라는 3개의 액션을 행한 로그가 존재하는지를 0과 1의 플래그로 부여하였다.
<strong>SIGN</strong> : 함수 안의 인자가 음수면 -1, 0이면 0, 양수면 1을 반환하는 함수</p>
<pre><code>WITH user_action_flag AS (
  SELECT user_id
       , SIGN(SUM(CASE WHEN action = &#39;purchase&#39; THEN 1 ELSE 0 END)) AS has_purchase
       , SIGN(SUM(CASE WHEN action = &#39;review&#39; THEN 1 ELSE 0 END)) AS has_review
       , SIGN(SUM(CASE WHEN action = &#39;favorite&#39; THEN 1 ELSE 0 END)) AS has_favorite
     FROM action_log
     GROUP BY user_id
     )
, action_venn_diagram AS (
  SELECT has_purchase
       , has_review
       , has_favorite
       , COUNT(1) AS users
    FROM user_action_flag
    GROUP BY has_purchase, has_review, has_favorite

  UNION ALL
  SELECT NULL AS has_purchase
       , has_review
       , has_favorite
       , COUNT(1) AS users
    FROM user_action_flag
    GROUP BY has_review, has_favorite

  UNION ALL
  SELECT has_purchase
       , NULL AS has_review
       , has_favorite
       , COUNT(1) AS users
    FROM user_action_flag
    GROUP BY has_purchase, has_favorite

  UNION ALL
  SELECT has_purchase
       , has_review
       , NULL AS has_favorite
       , COUNT(1) AS users
    FROM user_action_flag
    GROUP BY has_purchase, has_review

  UNION ALL
  SELECT NULL AS has_purchase
       , NULL AS has_review
       , has_favorite
       , COUNT(1) AS users
    FROM user_action_flag
    GROUP BY has_favorite

  UNION ALL
  SELECT NULL AS has_purchase
       , has_review
       , NULL AS has_favorite
       , COUNT(1) AS users
    FROM user_action_flag
    GROUP BY has_review

  UNION ALL
  SELECT has_purchase
       , NULL AS has_review
       , NULL AS has_favorite
       , COUNT(1) AS users
    FROM user_action_flag
    GROUP BY has_purchase

  UNION ALL
  SELECT NULL AS has_purchase
       , NULL AS has_review
       , NULL AS has_favorite
       , COUNT(1) AS users
    FROM user_action_flag
)

SELECT *
  FROM action_venn_diagram
  ORDER BY has_purchase, has_review, has_favorite</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/c1c540dc-3451-41e5-9ba1-dd909364a0a9/image.png" alt=""></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[데.분.레 - 복습 13]]></title>
            <link>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-13</link>
            <guid>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-13</guid>
            <pubDate>Tue, 08 Jun 2021 12:08:46 GMT</pubDate>
            <description><![CDATA[<h3 id="사용자의-방문-빈도-집계하기">사용자의 방문 빈도 집계하기</h3>
<pre><code>WITH action_log_with_dt AS(
  SELECT *
    , SUBSTRING(stamp, 1, 10) AS dt
    FROM action_log
  )
, action_day_count_per_user AS (
  SELECT user_id
       , COUNT(DISTINCT dt) AS action_day_count
    FROM action_log_with_dt
    WHERE dt BETWEEN &#39;2016-11-01&#39; AND &#39;2016-11-07&#39;
    GROUP BY user_id
  )

SELECT action_day_count
     , COUNT(DISTINCT user_id) AS user_count
  FROM action_day_count_per_user
  GROUP BY action_day_count
  ORDER BY action_day_count</code></pre><p>구성비와 구성비 누계도 추가로 구할 수 있다.</p>
<pre><code>WITH action_log_with_dt AS(
  SELECT *
    , SUBSTRING(stamp, 1, 10) AS dt
    FROM action_log
  )
, action_day_count_per_user AS (
  SELECT user_id
       , COUNT(DISTINCT dt) AS action_day_count
    FROM action_log_with_dt
    WHERE dt BETWEEN &#39;2016-11-01&#39; AND &#39;2016-11-07&#39;
    GROUP BY user_id
  )
SELECT action_day_count
     , COUNT(DISTINCT user_id) AS user_count
     , 100.0
       * COUNT(DISTINCT user_id)/SUM(COUNT(DISTINCT user_id)) OVER()
       AS composition_ratio
     , 100.0
       * SUM(COUNT(DISTINCT user_id)) OVER(ORDER BY action_day_count
                                           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
       / SUM(COUNT(DISTINCT user_id)) OVER()
       AS cumulative_ratio
  FROM action_day_count_per_user
  GROUP BY action_day_count
  ORDER BY action_day_count</code></pre>]]></description>
        </item>
        <item>
            <title><![CDATA[데.분.레 - 복습 12]]></title>
            <link>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-12</link>
            <guid>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-12</guid>
            <pubDate>Thu, 03 Jun 2021 11:30:17 GMT</pubDate>
            <description><![CDATA[<h3 id="연령별-구분-집계하기">연령별 구분 집계하기</h3>
<p><img src="https://images.velog.io/images/gugu_dragon/post/b8887a11-bbf2-49a5-aca1-73d2c72ff4df/image.png" alt="">
mst_users 테이블이다.</p>
<pre><code>WITH mst_users_with_int_birth_date AS (
  SELECT *
       , 20170101 AS int_specific_date
       , CAST(REPLACE(SUBSTRING(birth_date, 1, 10), &#39;-&#39;, &#39;&#39;) AS integer) AS int_birth_date
    FROM mst_users
  )
, mst_users_with_age AS (
  SELECT *
       , FLOOR((int_specific_date-int_birth_date)/10000) AS age
    FROM mst_users_with_int_birth_date
    )
SELECT user_id
     , sex
     , birth_date
     , age
  FROM mst_users_with_age</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/b2e71d34-7ef0-4ed7-9c74-523d2aa16276/image.png" alt="">
<strong>REPLACE</strong> : 바꾸려는 문자열과 바꾸려는 문자를 입력하고, 어떤 문자로 바꿀지를 입력하면 된다. 
<strong>CAST</strong> : 데이터 형식을 다른 데이터 형식으로 바꿔주는 함수이다. 
이 두가지 함수를 사용하여 특정 날짜에 사용자의 나이를 구할 수 있다</p>
<p>이 <strong>WITH</strong> 구문을 사용하여 연령별 구분을 추가할 수 있다.</p>
<pre><code>WITH mst_users_with_int_birth_date AS (
  SELECT *
       , 20170101 AS int_specific_date
       , CAST(REPLACE(SUBSTRING(birth_date, 1, 10), &#39;-&#39;, &#39;&#39;) AS integer) AS int_birth_date
    FROM mst_users
  )
, mst_users_with_age AS (
  SELECT *
       , FLOOR((int_specific_date-int_birth_date)/10000) AS age
    FROM mst_users_with_int_birth_date
    )
, mst_users_with_category AS (
  SELECT user_id
       , sex
       , age
       , CONCAT(CASE WHEN 20&lt;=age THEN sex
                ELSE &#39;&#39; END
       , CASE WHEN age BETWEEN 4 AND 12 THEN &#39;C&#39;
              WHEN age BETWEEN 13 AND 19 THEN &#39;T&#39;
              WHEN age BETWEEN 20 AND 34 THEN &#39;1&#39;
              WHEN age BETWEEN 35 AND 49 THEN &#39;2&#39;
              WHEN age &gt;= 50 THEN &#39;3&#39;
              END) AS category
     FROM mst_users_with_age)

SELECT *
  FROM mst_users_with_category</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/0011ae81-fdbb-4392-9a1b-b51d5e1e00e3/image.png" alt="">
<strong>CONCAT</strong> : 문자열을 합쳐주는 역할을 한다.</p>
<h3 id="연령별-구분의-특징-추출하기">연령별 구분의 특징 추출하기</h3>
<p>이번에는 action_log 테이블과 <strong>JOIN</strong>하여 각각 구매한 상품의 카테고리를 집계해보자.
<img src="https://images.velog.io/images/gugu_dragon/post/4be89797-3816-4a5e-864a-7f2d6a1b6951/image.png" alt=""></p>
<pre><code>WITH mst_users_with_int_birth_date AS (
  SELECT *
       , 20170101 AS int_specific_date
       , CAST(REPLACE(SUBSTRING(birth_date, 1, 10), &#39;-&#39;, &#39;&#39;) AS integer) AS int_birth_date
    FROM mst_users
  )
, mst_users_with_age AS (
  SELECT *
       , FLOOR((int_specific_date-int_birth_date)/10000) AS age
    FROM mst_users_with_int_birth_date
    )
, mst_users_with_category AS (
  SELECT user_id
       , sex
       , age
       , CONCAT(CASE WHEN 20&lt;=age THEN sex
                ELSE &#39;&#39; END
       , CASE WHEN age BETWEEN 4 AND 12 THEN &#39;C&#39;
              WHEN age BETWEEN 13 AND 19 THEN &#39;T&#39;
              WHEN age BETWEEN 20 AND 34 THEN &#39;1&#39;
              WHEN age BETWEEN 35 AND 49 THEN &#39;2&#39;
              WHEN age &gt;= 50 THEN &#39;3&#39;
              END) AS category
     FROM mst_users_with_age)

SELECT p.category AS product_category
     , u.category AS user_category
     , COUNT(*) AS purchase_count
  FROM action_log AS p
  JOIN mst_users_with_category AS u
  ON p.user_id = u.user_id
  WHERE action = &#39;purchase&#39;
  GROUP BY p.category, u.category
  ORDER BY p.category, u.category</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/31f27965-4ada-48a7-97b6-2bb9c2631575/image.png" alt=""></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[데.분.레 - 복습 11]]></title>
            <link>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-11</link>
            <guid>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-11</guid>
            <pubDate>Tue, 01 Jun 2021 08:01:16 GMT</pubDate>
            <description><![CDATA[<h2 id="사용자-전체의-특징과-경향-찾기">사용자 전체의 특징과 경향 찾기</h2>
<p>사용자의 속성 또는 행동과 관련된 정보를 집계해서 사용자 행동을 조사하고, 서비스를 개선할 때 실마리가 될 수 있는 리포트를 만들어보자.</p>
<p>mst_users 테이블
<img src="https://images.velog.io/images/gugu_dragon/post/bccc7335-738c-4527-ab3c-3bf56fda34b8/image.png" alt="">
action_log 테이블
<img src="https://images.velog.io/images/gugu_dragon/post/8a492705-91b1-4180-92de-02e046e5445f/image.png" alt=""></p>
<h3 id="사용자의-액션-수-집계하기">사용자의 액션 수 집계하기</h3>
<pre><code>WITH stats AS (
  SELECT COUNT(DISTINCT session) AS total_uu
    FROM action_log
    )

SELECT l.action
     , COUNT(DISTINCT l.session) AS action_uu
     , COUNT(1) AS action_count
     , s.total_uu
     , 100.0 * COUNT(DISTINCT l.session)/s.total_uu AS usage_rate
     , 1.0 * COUNT(1) / COUNT(DISTINCT l.session) AS count_per_user
  FROM action_log AS l
  CROSS JOIN stats AS s
  GROUP BY l.action, s.total_uu</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/249fd4d4-4cc7-40e4-a706-0db8e029ea86/image.png" alt=""></p>
<h3 id="로그인-사용자와-비로그인-사용자를-구분해서-집계하기">로그인 사용자와 비로그인 사용자를 구분해서 집계하기</h3>
<pre><code>WITH action_log_with_status AS (
  SELECT session
       , user_id
       , action
       , CASE WHEN COALESCE(user_id, &#39;&#39;) &lt;&gt; &#39;&#39; THEN &#39;login&#39; ELSE &#39;guest&#39; END
         AS login_status
    FROM action_log
  )
SELECT *
  FROM action_log_with_status</code></pre><p><strong>COALESCE</strong> : 괄호 안의 값 중에서 null이 아닌 값을 먼저 반환한다. 위 쿼리에서는 user_id가 null 값인 경우에는 &#39;&#39;이 출력되고, user_id가 null 값이 아닌 경우에는 user_id를 그대로 출력하게 되는 것이다. </p>
<pre><code>WITH action_log_with_status AS (
  SELECT session
       , user_id
       , action
       , CASE WHEN COALESCE(MAX(user_id)
                            OVER(PARTITION BY session 
                                 ORDER BY stamp ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
                            , &#39;&#39;) &lt;&gt; &#39;&#39; THEN &#39;member&#39; ELSE &#39;none&#39;
          END AS member_status
    FROM action_log
  )

SELECT *
  FROM action_log_with_status</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/7803c829-fa19-4c96-9ae6-8c627f73930f/image.png" alt="">
단 한번이라도 로그인 이력이 있는 사람은 member로 처리하는 쿼리이다. </p>
]]></description>
        </item>
        <item>
            <title><![CDATA[데.분.레 - 복습 10]]></title>
            <link>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-10-k1kmw9jv</link>
            <guid>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-10-k1kmw9jv</guid>
            <pubDate>Mon, 31 May 2021 13:00:49 GMT</pubDate>
            <description><![CDATA[<pre><code>WITH stats AS (
SELECT MAX(price) AS max_price
     , MIN(price) AS min_price
     , MAX(price) - MIN(price) AS range_price
     , 10 AS bucket_num
  FROM purchase_detail_log
  )

SELECT *
  FROM stats</code></pre><p>최댓값과 최솟값 사이의 차이를 구한다. </p>
<pre><code>WITH stats AS (
SELECT MAX(price) AS max_price
     , MIN(price) AS min_price
     , MAX(price) - MIN(price) AS range_price
     , 10 AS bucket_num
  FROM purchase_detail_log
  )
, purchase_log_with_bucket AS (
  SELECT price
       , min_price
       , price - min_price AS diff
       , 1.0 * range_price / bucket_num AS bucket_range
       , FLOOR(1.0 * (price-min_price)/(1.0*range_price/bucket_num)) + 1 AS bucket
    FROM purchase_detail_log, stats
   )

SELECT *
  FROM purchase_log_with_bucket
  ORDER BY amount</code></pre>]]></description>
        </item>
        <item>
            <title><![CDATA[데.분.레 - 복습 9]]></title>
            <link>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-9</link>
            <guid>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-9</guid>
            <pubDate>Thu, 27 May 2021 13:46:59 GMT</pubDate>
            <description><![CDATA[<pre><code>WITH daily_purchase AS (
  SELECT dt
       , SUBSTRING(dt, 1, 4) AS year
       , SUBSTRING(dt, 6, 2) AS month
       , SUBSTRING(dt, 9, 2) AS date
       , SUM(purchase_amount) AS purchase_amount
       , COUNT(order_id) AS orders
    FROM purchase_log
    GROUP BY dt)

SELECT *
  FROM daily_purchase</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/53c136df-9fc1-4eb5-be51-f16db7b972e1/image.png" alt="">
위 쿼리를 이용하여 월별 매출과 작대비(作對比)를 구해보겠다.</p>
<pre><code>WITH daily_purchase AS (
  SELECT dt
       , SUBSTRING(dt, 1, 4) AS year
       , SUBSTRING(dt, 6, 2) AS month
       , SUBSTRING(dt, 9, 2) AS date
       , SUM(purchase_amount) AS purchase_amount
       , COUNT(order_id) AS orders
    FROM purchase_log
    GROUP BY dt)

SELECT month
     , SUM(CASE year WHEN &#39;2014&#39; THEN purchase_amount END) AS amount_2014
     , SUM(CASE year WHEN &#39;2015&#39; THEN purchase_amount END) AS amount_2015
     , 100.0 * SUM(CASE year WHEN &#39;2015&#39; THEN purchase_amount END)
             / SUM(CASE year WHEN &#39;2014&#39; THEN purchase_amount END)
             AS rate
  FROM daily_purchase
  GROUP BY month
  ORDER BY month</code></pre><h3 id="z차트로-업적의-추이-확인하기">Z차트로 업적의 추이 확인하기</h3>
<p>Z차트는 다음 세가지 요소로 구성된다.</p>
<p><strong>월차매출</strong> : 월별 매출 합계
<strong>매출누계</strong> : 매출을 집계한 시점부터 누적한 매출
<strong>이동년계</strong> : 해당 월의 매출에 과거 11개월의 매출을 합한 값</p>
<pre><code>WITH daily_purchase AS (
  SELECT dt
       , SUBSTRING(dt, 1, 4) AS year
       , SUBSTRING(dt, 6, 2) AS month
       , SUBSTRING(dt, 9, 2) AS date
       , SUM(purchase_amount) AS purchase_amount
       , COUNT(order_id) AS orders
    FROM purchase_log
    GROUP BY dt)
, monthly_amount AS (
   SELECT year
        , month
        , SUM(purchase_amount) AS amount
     FROM daily_purchase
     GROUP BY year, month)
, calc_index AS (
    SELECT year
         , month
         , amount
         , SUM(CASE WHEN year = &#39;2015&#39; THEN amount END)
               OVER(ORDER BY year, month ROWS UNBOUNDED PRECEDING)
               AS agg_amount
         , SUM(amount) OVER(ORDER BY year, month
                             ROWS BETWEEN 11 PRECEDING AND CURRENT ROW)
               AS year_avg_amount
      FROM monthly_amount
      ORDER BY year, month)

SELECT CONCAT(CONCAT(year, &#39;-&#39;), month) AS year_month
     , amount
     , agg_amount
     , year_avg_amount
  FROM calc_index
  WHERE year = &#39;2015&#39;
  ORDER BY year_month</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/ee74bcc2-a25f-409d-9cfe-4319190a7603/image.png" alt=""></p>
<h2 id="다면적인-축을-사용해-데이터-집약하기">다면적인 축을 사용해 데이터 집약하기</h2>
<p><img src="https://images.velog.io/images/gugu_dragon/post/79ca309a-1ecc-4d89-a8f3-bbe0f35a8be5/image.png" alt=""></p>
<pre><code>WITH sub_category_amount AS(
  SELECT category AS category
       , sub_category AS sub_category
       , SUM(price) AS amount
    FROM purchase_detail_log
    GROUP BY category, sub_category
  )
, category_amount AS (
  SELECT category
       , &#39;all&#39; AS sub_category
       , SUM(price) AS amount
    FROM purchase_detail_log
    GROUP BY category
  )
, total_amount AS (
  SELECT &#39;all&#39; AS category
       , &#39;all&#39; AS sub_category
       , SUM(price) AS amount
    FROM purchase_detail_log)

          SELECT category, sub_category, amount FROM sub_category_amount
UNION ALL SELECT category, sub_category, amount FROM category_amount
UNION ALL SELECT category, sub_category, amount FROM total_amount</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/c0b3726d-da95-44b5-828d-c8bd21ee6c05/image.png" alt=""></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[데.분.레 - 복습 8]]></title>
            <link>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-8</link>
            <guid>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-8</guid>
            <pubDate>Wed, 26 May 2021 12:22:56 GMT</pubDate>
            <description><![CDATA[<h2 id="유사-테이블-만들기">유사 테이블 만들기</h2>
<h3 id="임의의-레코드를-가진-유사-테이블-만들기">임의의 레코드를 가진 유사 테이블 만들기</h3>
<pre><code>WITH mst_devices AS(
          SELECT 1 AS device_id, &#39;PC&#39; AS device_name
UNION ALL SELECT 2 AS device_id, &#39;SP&#39; AS device_name
UNION ALL SELECT 3 AS device_id, &#39;애플리케이션&#39; AS device_name
)

SELECT *
  FROM mst_devices</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/ab68e2a1-f6a9-4906-bea8-d1d7cda85782/image.png" alt=""></p>
<pre><code>WITH mst_devices AS(
          SELECT 1 AS device_id, &#39;PC&#39; AS device_name
UNION ALL SELECT 2 AS device_id, &#39;SP&#39; AS device_name
UNION ALL SELECT 3 AS device_id, &#39;애플리케이션&#39; AS device_name
)

SELECT u.user_id
     , d.device_name
  FROM mst_users AS u
  LEFT JOIN mst_devices AS d
  ON u.register_device = d.device_id</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/a50c4adb-b5d5-41b9-adc9-d0923dacce48/image.png" alt="">
위 두 쿼리의 결과는 똑같다. <strong>UNION ALL</strong>은 처리가 비교적 무거우므로 레코드 수가 많아지면 성능 문제가 발생할 수 있다.</p>
<h1 id="매출을-파악하기-위한-데이터-추출">매출을 파악하기 위한 데이터 추출</h1>
<p><img src="https://images.velog.io/images/gugu_dragon/post/5ef9b85d-617f-464c-9671-992b70d23a96/image.png" alt=""></p>
<pre><code>SELECT dt
     , COUNT(*) AS purchase_count
     , SUM(purchase_amount) AS total_amount
     , AVG(purchase_amount) AS avg_amount
  FROM purchase_log
  GROUP BY dt
  ORDER BY dt</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/943ddd5b-8902-4b00-a10a-e47dcbf90d7d/image.png" alt=""></p>
<h3 id="이동평균을-사용한-날짜별-추이-보기">이동평균을 사용한 날짜별 추이 보기</h3>
<pre><code>SELECT dt
     , SUM(purchase_amount) AS total_amount
     , AVG(SUM(purchase_amount))
       OVER(ORDER BY dt ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) 
       AS seven_day_avg
     , CASE WHEN 7 = COUNT(*)
                     OVER(ORDER BY dt ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
            THEN AVG(SUM(purchase_amount))
                 OVER(ORDER BY dt ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
            END AS seven_day_avg_strict
  FROM purchase_log
  GROUP BY dt
  ORDER BY dt</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/444f14b2-93b4-4560-925e-591132cc009a/image.png" alt=""></p>
<h3 id="당월-매출-누계-구하기">당월 매출 누계 구하기</h3>
<pre><code>SELECT dt
     , SUBSTRING(dt, 1, 7) AS year_month
     , SUM(purchase_amount) AS total_amount
     , SUM(SUM(purchase_amount))
           OVER(PARTITION BY SUBSTRING(dt, 1, 7)
                ORDER BY dt ROWS UNBOUNDED PRECEDING)
            AS agg_amount
  FROM purchase_log
  GROUP BY dt
  ORDER BY dt</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/7326eeee-9111-400e-9e37-b3907a356d1a/image.png" alt="">
위 쿼리의 <strong>ROWS UNBOUNDED PRECEDING</strong>은 <strong>ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW</strong>로 바꾸어도 된다.
위 쿼리는 가독성 측면에서 수정할 수 있는 부분이 있다.</p>
<pre><code>WITH daily_purchase AS (
  SELECT dt
       , SUBSTRING(dt, 1, 4) AS year
       , SUBSTRING(dt, 6, 2) AS month
       , SUBSTRING(dt, 9, 2) AS date
       , SUM(purchase_amount) AS purchase_amount
       , COUNT(order_id) AS orders
    FROM purchase_log
    GROUP BY dt)

SELECT *
  FROM daily_purchase
  ORDER BY dt</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/669b0d1e-495a-4f0b-9a86-74e541e84022/image.png" alt="">
위 쿼리를 이용해서 당월 누계 매출을 구할 수 있다.</p>
<pre><code>WITH daily_purchase AS (
  SELECT dt
       , SUBSTRING(dt, 1, 4) AS year
       , SUBSTRING(dt, 6, 2) AS month
       , SUBSTRING(dt, 9, 2) AS date
       , SUM(purchase_amount) AS purchase_amount
       , COUNT(order_id) AS orders
    FROM purchase_log
    GROUP BY dt)

SELECT dt
     , year || &#39;-&#39; || month AS year_month
     , purchase_amount
     , SUM(purchase_amount)
           OVER(PARTITION BY year_month
                ORDER BY dt ROWS UNBOUNDED PRECEDING)
                AS agg_amount
  FROM daily_purchase
  ORDER BY dt</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/047e3b56-a6bc-41de-acc9-43710fa1d20a/image.png" alt=""></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[데.분.레 - 복습 7]]></title>
            <link>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-7</link>
            <guid>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-7</guid>
            <pubDate>Tue, 25 May 2021 12:39:13 GMT</pubDate>
            <description><![CDATA[<h3 id="조건-플래그를-0과-1로-표현하기">조건 플래그를 0과 1로 표현하기</h3>
<p><img src="https://images.velog.io/images/gugu_dragon/post/0d0f5e09-6677-4c85-bbcf-acf45abf3080/image.png" alt="">
<img src="https://images.velog.io/images/gugu_dragon/post/7af5bdda-e643-4106-b8e1-1cfae76c411a/image.png" alt=""></p>
<pre><code>SELECT m.user_id
     , m.card_number
     , COUNT(p.user_id) AS purchase_count
     , CASE WHEN m.card_number IS NOT NULL THEN 1
            ELSE 0 END AS has_card
     , SIGN(COUNT(p.user_id)) AS has_purchased
  FROM mst_users_with_card_number AS m
  LEFT JOIN purchase_log AS p
       ON m.user_id = p.user_id
  GROUP BY m.user_id, m.card_number</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/221905a0-2372-4ae6-9b8a-4b45dc872aec/image.png" alt="">
<strong>SIGN</strong> : 괄호 안의 값의 음수/양수/0 여부를 -1/+1/0 으로 보여준다. </p>
<h3 id="계산한-테이블에-이름-붙여-재사용하기">계산한 테이블에 이름 붙여 재사용하기</h3>
<p><strong>CTE</strong> : Common Table Expression 의 준말로, 공통 테이블 식을 뜻한다. 일시적인 테이블에 이름을 붙여 재사용할 수 있는 기능이다. 이를 활용하면 복잡한 쿼리의 가독성이 크게 높아진다. 
<img src="https://images.velog.io/images/gugu_dragon/post/1d2210a3-650e-4584-87b8-d30830f4138b/image.png" alt=""></p>
<pre><code>WITH product_sales_ranking AS (
SELECT category_name
     , product_id
     , sales
     , ROW_NUMBER() OVER (PARTITION BY category_name
                          ORDER BY sales DESC) AS rank
  FROM product_sales
  )
  SELECT *
    FROM product_sales_ranking</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/158021fd-814e-47c9-9fcb-4b3bd0516f79/image.png" alt="">
<strong>WITH</strong> : 이름을 가진 서브쿼리를 정의한 후 사용하는 구문이다. 쿼리의 전체적인 가독성을 높이고, 재사용할 수 있는 장점이 있다. 
<strong>WITH</strong> 별명 <strong>AS (SUB QUERY)</strong> 와 같은 형식으로 사용한다. </p>
<pre><code>WITH product_sales_ranking AS (
SELECT category_name
     , product_id
     , sales
     , ROW_NUMBER() OVER (PARTITION BY category_name
                          ORDER BY sales DESC) AS rank
  FROM product_sales
  )
, mst_rank AS (
  SELECT DISTINCT rank
    FROM product_sales_ranking
  )

SELECT *
  FROM mst_rank</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/a51fe581-18fc-4219-a362-147c5ddbdeea/image.png" alt="">
<strong>WITH</strong>절에 기입된 서브쿼리의 경우, 먼저 사용된 쿼리는 뒤의 서브쿼리에서 사용될 수 있다. <strong>WITH</strong> 구문을 사용해서 여러 테이블을 정의할 떄는 쉼표를 사용해 테이블을 나열한다. </p>
<pre><code>WITH product_sales_ranking AS (
SELECT category_name
     , product_id
     , sales
     , ROW_NUMBER() OVER (PARTITION BY category_name
                          ORDER BY sales DESC) AS rank
  FROM product_sales
  )
, mst_rank AS (
  SELECT DISTINCT rank
    FROM product_sales_ranking
  )
SELECT m.rank
     , r1.product_id AS dvd
     , r1.sales AS dvd_sales
     , r2.product_id AS cd
     , r2.sales AS cd_sales
     , r3.product_id AS book
     , r3.sales AS book_sales
  FROM mst_rank AS m
  LEFT JOIN
    product_sales_ranking AS r1
    ON m.rank = r1.rank
    AND r1.category_name = &#39;dvd&#39;
  LEFT JOIN
    product_sales_ranking AS r2
    ON m.rank = r2.rank
    AND r2.category_name = &#39;cd&#39;
  LEFT JOIN
    product_sale_ranking AS r3
    ON m.rank = r3.rank
    AND r3.category_name = &#39;book&#39;
  ORDER BY m.rank</code></pre><h3 id="유사-테이블-만들기">유사 테이블 만들기</h3>
<p>이건 다음 포스트에서!</p>
]]></description>
        </item>
        <item>
            <title><![CDATA[데.분.레 - 복습 6]]></title>
            <link>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-5-2wmv78lv</link>
            <guid>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-5-2wmv78lv</guid>
            <pubDate>Sun, 23 May 2021 05:02:07 GMT</pubDate>
            <description><![CDATA[<h3 id="세로-기반-데이터를-가로-기반으로-변환하기">세로 기반 데이터를 가로 기반으로 변환하기</h3>
<p>SQL은 행 기반으로 처리하는 것이 기본이다. 행 단위로 저장된 &#39;세로기반&#39;을 열 또는 쉼표로 구분된 문자열 등의 &#39;가로 기반&#39;으로 변환해보자.</p>
<p>다음과 같은 테이블이 있다.
<img src="https://images.velog.io/images/gugu_dragon/post/34d79c6f-466c-47d3-8028-023819c40cfb/image.png" alt=""></p>
<pre><code>SELECT dt
     , MAX(CASE WHEN indicator = &#39;impressions&#39; THEN val END) AS impressions
     , MAX(CASE WHEN indicator = &#39;sessions&#39; THEN val END) AS sessions
     , MAX(CASE WHEN indicator = &#39;users&#39; THEN val END) AS users
  FROM daily_kpi
  GROUP BY dt
  ORDER BY dt</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/f9f69408-7d47-4692-b259-93d4700cee79/image.png" alt=""></p>
<p>다음과 같은 테이블에서는 위의 방법이 유효하지 않다. 열의 수를 미리 정할 수 없기 때문이다. 따라서 이 경우에는 데이터를 쉼표 등으로 구분한 문자열로 변환하는 방법을 생각해 볼 수 있다.
<img src="https://images.velog.io/images/gugu_dragon/post/a0828c9b-41f2-4b76-9696-72afe598b072/image.png" alt=""></p>
<pre><code>SELECT purchase_id
     , listagg(product_id, &#39;, &#39;) AS product_ids
     , SUM(price) AS amount
  FROM purchase_detail_log
  GROUP BY purchase_id
  ORDER BY purchase_id</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/fc560aa8-a55f-4232-b451-8bb3e8140d04/image.png" alt="">
<strong>LISTAGG</strong> : 행을 문자열로 집약하는 함수. REDSHIFT에서 사용된다. PostgreSQL 에서는 <strong>STRING_AGG</strong> 함수를 사용한다. </p>
<h3 id="가로-기반-데이터를-세로-기반으로-변환하기">가로 기반 데이터를 세로 기반으로 변환하기</h3>
<p>가로 기반 데이터를 세로 기반으로 변환하는 것은 간단한 일이 아니다. 세로 기반 데이터를 가로 기반으로 변환해서 가공이 쉬운 데이터 형식으로 만드는 방법을 알아보자. 
<img src="https://images.velog.io/images/gugu_dragon/post/0ef8a420-e1c3-4d38-8462-85ba7a4a5e4c/image.png" alt="">
위 테이블은 q1부터 q4까지 행으로 전개할 데이터 수가 고정되었다. 그러한 데이터 수와 같은 수의 일련 번호를 가진 피벗 테이블을 만들고 <strong>CROSS JOIN</strong>하면 된다. </p>
<pre><code>SELECT q.year
     , CASE WHEN p.idx = 1 THEN &#39;q1&#39;
            WHEN p.idx = 2 THEN &#39;q2&#39;
            WHEN p.idx = 3 THEN &#39;q3&#39;
            WHEN p.idx = 4 THEN &#39;q4&#39;
            END AS quarter
     , CASE WHEN p.idx = 1 THEN q.q1
            WHEN p.idx = 2 THEN q.q2
            WHEN p.idx = 3 THEN q.q3
            WHEN p.idx = 4 THEN q.q4
            END AS sales
  FROM quarterly_sales AS q
  CROSS JOIN (          SELECT 1 AS idx
              UNION ALL SELECT 2 AS idx
              UNION ALL SELECT 3 AS idx
              UNION ALL SELECT 4 AS idx) AS p
  ORDER BY year, quarter</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/46e68667-b6a0-472d-af8c-0349a147feea/image.png" alt="">
자세한 설명도 없이 <strong>CROSS JOIN</strong> 과 <strong>UNION</strong> 구문이 나와서 이해가 안 된 부분이다. 뒤에 챕터에서 자세히 설명이 나온다고 하니... 일단 그런가보다 하고 패스.</p>
<h2 id="여러-개의-테이블-조작하기">여러 개의 테이블 조작하기</h2>
<h3 id="여러-개의-테이블을-세로로-결합하기">여러 개의 테이블을 세로로 결합하기</h3>
<p><img src="https://images.velog.io/images/gugu_dragon/post/47b69e98-9843-4b40-ab21-4ccf352ea7e6/image.png" alt="">
<img src="https://images.velog.io/images/gugu_dragon/post/739c4d1f-21c7-4674-9166-51ae8b088b72/image.png" alt="">
위와 같은 2개의 테이블이 있다. </p>
<pre><code>SELECT &#39;app1&#39; AS app_name
     , user_id
     , name
     , email
  FROM app1_mst_users
UNION ALL
SELECT &#39;app2&#39; AS app_name
     , user_id
     , name
     , NULL AS email
  FROM app2_mst_users</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/fbbca177-90c4-4236-9592-f10da452dd4f/image.png" alt="">
이처럼 비슷한 구조를 가지는 테이블의 데이터를 일괄 처리하고 싶은 경우, <strong>UNION ALL</strong> 구문을 사용해 여러 개의 테이블을 세로로 결합하면 된다. 결합할 때는 테이블의 컬럼이 완전히 일치해야 하므로, 한쪽 테이블에만 존재하는 컬럼은 <strong>SELECT</strong>구문에서 포함시키지 않거나, email 컬럼처럼 디폴트 값을 주면 된다. </p>
<blockquote>
<p><strong>UNION ALL</strong> 구문 대신 <strong>UNION</strong> 또는 <strong>UNION DISTINCT</strong> 구문을 사용하면 데이터의 중복을 제외한 결과를 얻을 수 있다. </p>
</blockquote>
<h3 id="여러-개의-테이블을-가로로-정렬하기">여러 개의 테이블을 가로로 정렬하기</h3>
<p>다음과 같은 3개의 테이블이 있다.
<img src="https://images.velog.io/images/gugu_dragon/post/ec4d87c1-1053-446c-b2db-3969435b960b/image.png" alt="">
<img src="https://images.velog.io/images/gugu_dragon/post/cce8d344-e36f-4d8e-857d-f3083edcd4a8/image.png" alt="">
<img src="https://images.velog.io/images/gugu_dragon/post/e31b4945-6980-49b2-804c-5a8c23c3c8e4/image.png" alt="">
맨 위의 테이블이 마스터 테이블이다. 다음 코드 예는 마스터 테이블의 행 수를 유지한 상태로 여러 개의 테이블을 가로롤 정렬하는 쿼리이다. </p>
<pre><code>SELECT m.category_id
     , m.name
     , s.sales
     , r.product_id AS top_sales_product
  FROM mst_categories AS m
  LEFT JOIN category_sales AS s
            ON m.category_id = s.category_id
  LEFT JOIN product_sale_ranking AS r
            ON m.category_id = r.category_id
            AND r.rank = 1</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/bdd7c3b3-1b16-4af5-87d0-1439344e9d9d/image.png" alt="">
<strong>JOIN</strong> : 서로 다른 테이블을 결합해준다. <strong>ON</strong>을 이용해 <strong>JOIN</strong> 구문의 조건을 줄 수 있다. </p>
<pre><code>SELECT m.category_id
     , m.name
     , (SELECT s.sales
          FROM category_sales AS s
          WHERE m.category_id = s.category_id) AS sales
     , (SELECT r.product_id
          FROM product_sale_ranking AS r
          WHERE m.category_id = r.category_id
          ORDER BY sales DESC
          LIMIT 1) AS top_sales_product
  FROM mst_categories AS m
  ORDER BY 1</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/ad0c4459-f961-462d-b5bf-29ceac6c8964/image.png" alt="">
위 예시는 서브 쿼리를 사용해서 이전의 코드와 같은 결과를 내는 코드를 만든 것이다. <strong>JOIN</strong>을 사용하지 않아 원래 마스터 테이블의 행 수가 변할 걱정 자체가 없으므로, 테이블의 누락과 중복을 회피할 수 있다. </p>
]]></description>
        </item>
        <item>
            <title><![CDATA[데.분.레 - 복습 5]]></title>
            <link>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-5</link>
            <guid>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-5</guid>
            <pubDate>Fri, 21 May 2021 14:18:45 GMT</pubDate>
            <description><![CDATA[<pre><code>SELECT product_id
     , score
     , ROW_NUMBER() OVER(ORDER BY score DESC) AS row
     , SUM(score) OVER(ORDER BY score DESC
                        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
                        AS cum_score
     , AVG(score) OVER(ORDER BY score DESC
                        ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
                        AS local_avg
     , FIRST_VALUE(product_id) OVER (ORDER BY score DESC
                                      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
                                      AS first_value
     , LAST_VALUE(product_id) OVER(ORDER BY score DESC
                                    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
                                    AS last_value
  FROM popular_products
  ORDER BY row</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/d5152de3-d04a-465b-ae98-0b7dfa7d9e10/image.png" alt="">
<strong>ORDER BY</strong> 구문과 <strong>SUM/AVG</strong> 등의 집약함수를 조합하면, 집약 함수의 적용 범위를 유연하게 지정할 수 있다.
<strong>FIRST_VALUE / LAST_VALUE</strong> : 각각 윈도 내부의 가장 첫번째 레코드와 가장 마지막 레코드를 추출해주는 함수.</p>
<blockquote>
<p>프레임 지정 구문에는 여러가지 종류가 있다.
가장 기본적인 것은 <strong>ROWS BETWEEN start AND end</strong> 이다.
start와 end에는 다음과 같은 키워드가 올 수 있다.
<strong>CURRENT ROW</strong> : 현재의 행
<strong>n PRECEDING</strong> : n행 앞
<strong>n FOLLOWING</strong> : n행 뒤
<strong>UNBOUNDED PRECEDING</strong> : 이전 행 전부
<strong>UNBOUNDED FOLLOWING</strong> : 이후 행 전부</p>
</blockquote>
<pre><code>SELECT category
     , product_id
     , score
     , ROW_NUMBER() OVER(PARTITION BY category 
                         ORDER BY score DESC) AS row
     , RANK() OVER(PARTITION BY category
                   ORDER BY score DESC) AS rank
     , DENSE_RANK() OVER(PARTITION BY category
                         ORDER BY score DESC) AS dense_rank
  FROM popular_products
  ORDER BY category, row</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/ca418410-cc29-4794-b69c-63914708bfe5/image.png" alt=""></p>
<pre><code>SELECT *
  FROM (SELECT category
             , product_id
             , score
             , ROW_NUMBER() OVER(PARTITION BY category
                                 ORDER BY score DESC) AS rank
          FROM popular_products) AS popular_products_with_rank
  WHERE rank &lt;= 2
  ORDER BY category, rank</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/12df84c8-59ab-4eae-910f-ec19b44a7439/image.png" alt="">
카테고리들의 순위 상위 2개까지의 상품을 추출하는 쿼리이다.</p>
<pre><code>SELECT DISTINCT category
     , FIRST_VALUE(product_id) OVER(PARTITION BY category
                                    ORDER BY score DESC
                                    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
                                    AS product_id
  FROM popular_products</code></pre><p>카테고리별 순위 순서에서 상위 1개의 상품 ID를 추출할 경우, 위처럼 <strong>FIRST_VALUE</strong> 윈도 함수를 사용하고, <strong>SELECT DISTINCT</strong> 구문으로 결과를 집약해도 된다. </p>
]]></description>
        </item>
        <item>
            <title><![CDATA[데.분.레 - 복습 4]]></title>
            <link>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-4</link>
            <guid>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-4</guid>
            <pubDate>Thu, 20 May 2021 12:29:25 GMT</pubDate>
            <description><![CDATA[<h3 id="그룹의-특징-잡기">그룹의 특징 잡기</h3>
<p><img src="https://images.velog.io/images/gugu_dragon/post/b9a8be86-3460-4fb6-8b1f-9392bc945b22/image.png" alt=""></p>
<pre><code>SELECT COUNT(*) AS total_count
     , COUNT(DISTINCT user_id) AS user_count
     , COUNT(DISTINCT product_id) AS product_count
     , SUM(score) AS sum
     , AVG(score) AS avg
     , MAX(score) AS max
     , MIN(score) AS min
  FROM review</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/b56cf386-2535-42d0-bf81-93fdaaef0db8/image.png" alt="">
이미 한번씩 공부했던 쿼리이다</p>
<pre><code>SELECT user_id
     , COUNT(*) AS total_count
     , COUNT(DISTINCT product_id) AS product_count
     , SUM(score) AS sum
     , AVG(score) AS avg
     , MAX(score) AS max
     , MIN(score) AS min
  FROM review
  GROUP BY user_id</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/17647703-c832-4701-98ed-1098d842d7b0/image.png" alt="">
<strong>GROUP BY</strong> 구문을 사용한 쿼리에서는, <strong>GROUP BY</strong> 구문에 지정한 컬럼 또는 집약 함수만 <strong>SELECT</strong> 구문의 컬럼으로 지정할 수 있다. <strong>GROUP BY</strong> 구문에 지정한 컬럼을 유니크 키로 새로운 테이블을 만들기 때문이다. </p>
<pre><code>SELECT user_id
     , product_id
     , score
     , AVG(score) OVER() AS avg_score
     , AVG(score) OVER(PARTITION BY user_id) AS user_avg_score
     , score - AVG(score) OVER(PARTITION BY user_id) AS user_avg_score_diff
  FROM review</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/fad384b9-e954-4d37-8c91-6498dffec9a8/image.png" alt="">
<strong>OVER</strong> 구문에 매개 변수를 지정하지 않으면 테이블 전체에 집약 함수를 적용한 값이 리턴된다. 매개변수에 <strong>PARTITION BY</strong> &lt;컬럼이름&gt;을 지정하면 해당 컬럼 값을 기반으로 그룹화하고 집약 함수를 적용한다. </p>
<h3 id="그룹-내부의-순서">그룹 내부의 순서</h3>
<p><img src="https://images.velog.io/images/gugu_dragon/post/882b57b3-c95c-495f-a799-5b621ff80193/image.png" alt=""></p>
<pre><code>SELECT product_id
     , score
     , ROW_NUMBER() OVER(ORDER BY score DESC) AS row
     , RANK() OVER(ORDER BY score DESC) AS rank
     , DENSE_RANK() OVER(ORDER BY score DESC) AS dense_rank
  FROM popular_products
  ORDER BY row</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/7b5e3481-a491-4e67-a4cf-807f39b40e89/image.png" alt=""></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[데.분.레 - 복습 3]]></title>
            <link>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-3</link>
            <guid>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-3</guid>
            <pubDate>Thu, 20 May 2021 11:54:17 GMT</pubDate>
            <description><![CDATA[<h3 id="두-값의-거리-계산하기">두 값의 거리 계산하기</h3>
<p><img src="https://images.velog.io/images/gugu_dragon/post/675f8899-febc-420a-a361-828b5bab2329/image.png" alt=""></p>
<pre><code>SELECT ABS(x1-x2) AS abs
     , SQRT(POWER(x1-x2,2)) AS rms
  FROM location_1d</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/280a6e69-fef5-40d3-9405-8078aaa3bfb7/image.png" alt="">
1차원에 있는 점 사이의 거리를 구하는 쿼리이다.
<strong>ABS</strong> : 절댓값을 계산하는 함수
<strong>POWER</strong> : 제곱을 하는 함수
<strong>SQRT</strong> : 제곱근을 구할 때 사용</p>
<p><img src="https://images.velog.io/images/gugu_dragon/post/9ab36de7-0b2d-498e-8abe-6af0a066379e/image.png" alt="">
2차원 평면위의 두점 사이의 거리를 구해보자</p>
<pre><code>SELECT SQRT(POWER(x1-x2, 2)+POWER(y1-y2,2)) AS dist
  FROM location_2d</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/8b755ea7-2ebf-4a9b-bfec-b7faa5e26b4c/image.png" alt="">
피타고라스 공식을 응용한 유클리드 거리를 계산하는 쿼리이다.</p>
<h3 id="날짜시간-계산하기">날짜/시간 계산하기</h3>
<p><img src="https://images.velog.io/images/gugu_dragon/post/7afa1957-14b8-4011-916d-6693f276d883/image.png" alt=""></p>
<pre><code>SELECT user_id
     , register_stamp::timestamp AS register_stamp
     , DATEADD(hour, 1, register_stamp::timestamp) AS after_1_hour
     , DATEADD(minute, -30, register_stamp::timestamp) AS before_30_minutes

     , register_stamp::date AS register_date
     , DATEADD(day, 1, register_stamp::date) AS after_1_day
     , DATEADD(month, -1, register_stamp::date) AS before_1_month
  FROM mst_users_with_dates</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/9fbd0f9c-6170-44e8-9e00-52d85da91a28/image.png" alt=""></p>
<pre><code>SELECT user_id
     , CURRENT_DATE AS today
     , register_stamp::date AS register_date
     , CURRENT_DATE - register_stamp::date AS diff_days
  FROM mst_users_with_dates</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/e571071c-d8d8-47ce-a338-8eee05c0adce/image.png" alt="">
회원 등록일과 현재 날짜의 차이, 회원 등록일과 생년월일과의 날짜 차이를 계산하는 쿼리이다. </p>
<p>IP주소에서 4개의 10진수 부분을 추출하는 쿼리</p>
<pre><code>SELECT ip
     , CAST(SPLIT_PART(ip, &#39;.&#39;, 1) AS integer) AS ip_part_1
     , CAST(SPLIT_PART(ip, &#39;.&#39;, 2) AS integer) AS ip_part_2
     , CAST(SPLIT_PART(ip, &#39;.&#39;, 3) AS integer) AS ip_part_3
     , CAST(SPLIT_PART(ip, &#39;.&#39;, 4) AS integer) AS ip_part_4
  FROM (SELECT &#39;192.168.0.1&#39; AS ip) AS t</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/9c0649cb-ae77-4d55-8aa5-44bc78a3cb79/image.png" alt=""></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[데.분.레 - 복습 2]]></title>
            <link>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-2</link>
            <guid>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%B3%B5%EC%8A%B5-2</guid>
            <pubDate>Wed, 19 May 2021 11:56:08 GMT</pubDate>
            <description><![CDATA[<p>Amazon Redshift의 무료 2개월 체험버전이 있어서 사용중이다.
원래는 PostgreSQL을 사용하려고 했지만, 알 수 없는 오류로 일단 보류하고 있다.</p>
<h3 id="문자열-연결하기">문자열 연결하기</h3>
<p><img src="https://images.velog.io/images/gugu_dragon/post/a05ef03f-5384-4ba7-9a94-d1a4b7565e50/image.png" alt="">
다음과 같은 테이블에서 문자열을 연결하고자 한다</p>
<pre><code>SELECT user_id
     , CONCAT(pref_name, city_name) AS pref_city
     , pref_name || city_name AS pref_city
  FROM mst_user_location</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/03c5ee88-7577-48d5-91cc-799bda430234/image.png" alt="">
두번째, 세번째 행이 모두 동일한 값을 도출한다.
<strong>CONCAT</strong> : 여러 문자열을 하나로 합쳐주는 역할을 한다. <strong>PostgreSQL</strong> 과 <strong>Redshift</strong>에서는 <strong>||</strong>를 이용해서도 동일한 값을 추출할 수 있다.</p>
<h3 id="여러-개의-값-비교하기">여러 개의 값 비교하기</h3>
<p><img src="https://images.velog.io/images/gugu_dragon/post/70c93779-3ad4-4231-9414-f755c401b03d/image.png" alt="">
하나의 레코드에 포함된 여러 개의 값을 비교하는 방법을 소개해보자. </p>
<pre><code>SELECT year
     , q1
     , q2
     , CASE WHEN q1 &lt; q2 THEN &#39;+&#39;
            WHEN q1 = q2 THEN &#39; &#39;
            ELSE &#39;-&#39;
            END AS judge_q1_q2
     , q2-q1 AS diff_q2_q1
     , SIGN(q2-q1) AS sign_q2_q1
  FROM quarterly_sales
  ORDER BY 1</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/157981ed-57fa-4ae8-a7d7-3e1ba506b13c/image.png" alt="">
<strong>CASE</strong> 함수를 이용해, 1분기/2분기의 대소비교를 하였다.
<strong>SIGN</strong> 함수를 이용하면 <strong>CASE</strong> 식보다 간단하게 값의 증감 판정을 할 수 있다. <strong>SIGN</strong> 함수는 매개변수가 양수라면 1, 0이라면 0, 음수라면 -1을 리턴한다. </p>
<pre><code>SELECT year
     , GREATEST(q1, q2, q3, q4) as greatest_sales
     , LEAST(q1, q2, q3, q4) as least_sales
  FROM quarterly_sales
  ORDER BY 1</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/4e4267a3-d04b-48b6-9f82-56097ea92e5a/image.png" alt="">
<strong>GREATEST, LEAST</strong> 함수는 표준 SQL에 포함되지 않지만, 대부분의 SQL 쿼리 엔진에서 구현하고 있다. </p>
<pre><code>SELECT year
     , (COALESCE(q1, 0) + COALESCE(q2, 0) + COALESCE(q3, 0) + COALESCE(q4, 0))
       / (SIGN(COALESCE(q1, 0)) + SIGN(COALESCE(q2, 0)) + SIGN(COALESCE(q3, 0)) + SIGN(COALESCE(q4, 0))) AS average
  FROM quarterly_sales
  ORDER BY year</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/e878c909-451b-4155-bad1-6ceb4f9f110f/image.png" alt="">
각 연도의 평균 분기매출을 구하는 쿼리이다. <strong>COALESCE</strong>는 Null 값을 어떤 값으로 리턴할지 지정하는 함수이다. </p>
<h3 id="2개의-값-비율-계산하기">2개의 값 비율 계산하기</h3>
<p><img src="https://images.velog.io/images/gugu_dragon/post/89ce7cec-2332-45d9-b765-a618087e9ea4/image.png" alt=""></p>
<pre><code>SELECT dt
     , ad_id
     , CAST(clicks AS double precision)/impressions AS ctr
     , 100.0 * clicks / impressions AS ctr_as_percent
  FROM advertising_stats
  WHERE dt = &#39;2017-04-01&#39;
  ORDER BY dt, ad_id</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/c94e6f55-511d-45fa-b89e-ccad3c202b3e/image.png" alt="">
CTR(Click Through Rate)를 구하는 쿼리이다.
<strong>CAST</strong> 함수를 이용해, 클릭수를 &#39;더블 프리시전(double precision)&#39; 자료형으로 변환하고 계산해야 결과도 소수점 둘째자리까지 나오게 된다. </p>
<pre><code>SELECT dt
     , ad_id
     , CASE WHEN impressions &gt;0 THEN 100.0 * clicks / impressions
            END AS ctr_as_percent_by_case
  FROM advertising_stats
  ORDER BY dt, ad_id</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/d72457ea-7d12-4f3c-9807-079a451ae0b5/image.png" alt="">
0으로 나누는 것을 방지하기 위해 <strong>CASE</strong> 함수를 사용하였다. </p>
]]></description>
        </item>
        <item>
            <title><![CDATA[데.분.레 - 데이터 가공 1]]></title>
            <link>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%8D%B0%EC%9D%B4%ED%84%B0-%EA%B0%80%EA%B3%B5-1</link>
            <guid>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%8D%B0%EC%9D%B4%ED%84%B0-%EA%B0%80%EA%B3%B5-1</guid>
            <pubDate>Sun, 16 May 2021 11:41:18 GMT</pubDate>
            <description><![CDATA[<p>로그 데이터 또는 업무 데이터로 저장된 코드 값을 그대로 집계에 사용하면 리포트의 가독성이 낮아진다. 따라서 리포트를 작성할 때 변환하는 등의 작접을 해야 한다. </p>
<blockquote>
<ol>
<li>코드 값을 레이블로 변경하는 방법 : <strong>CASE</strong>문 활용하기</li>
</ol>
</blockquote>
<pre><code></code></pre>]]></description>
        </item>
        <item>
            <title><![CDATA[데.분.레 - 도입 2]]></title>
            <link>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%8F%84%EC%9E%85-2</link>
            <guid>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%8F%84%EC%9E%85-2</guid>
            <pubDate>Sun, 16 May 2021 05:33:12 GMT</pubDate>
            <description><![CDATA[<h1 id="데이터에-대해서">데이터에 대해서</h1>
<h3 id="데이터의-종류">데이터의 종류</h3>
<p><strong>업무 데이터</strong> : 업무에 필요한 데이터
<strong>로그 데이터</strong> : 업무에 직접적으로 필요하지는 않지만 분석을 위해 추출해야 하는 데이터</p>
<blockquote>
<p><strong>업무 데이터</strong>
: 서비스와 시스템을 운용하기 위한 목적으로 구축된 데이터베이스에 존재하는 데이터. &#39;트랜젝션 데이터&#39;와 &#39;마스터 데이터&#39;로 분류할 수 있다.</p>
</blockquote>
<p><strong>업무 데이터의 특징</strong></p>
<p>1) 데이터의 정밀도가 높다
: 여러 데이터 처리를 하는 중에 문제가 발생하면, 트랜잭션과 롤백이라는 기능을 사용해 이러한 문제를 제거할 수 있다. 따라서 데이터의 정합성이 보증된다. 
2) 갱신형 데이터
: 매일 다양한 데이터 추가, 갱신, 제거 등이 실행된다. 따라서 데이터를 추출하는 시점에 따라 바뀔 수 있다는 점을 의식해야 한다.
3) 다뤄야 하는 테이블의 수가 많다</p>
<blockquote>
<blockquote>
<p><strong>트랜잭션 데이터</strong>
: 구매, 리뷰, 게임 플레이 데이터처럼 서비스와 시스템을 통해 사용자의 행동을 기록한 데이터. 
<strong>마스터 데이터</strong>
: 마테고리 마스터, 상품 마스터처럼 서비스와 시스템이 정의하고 있는 데이터. </p>
<blockquote>
<p>트랜잭션 데이터는 회원 ID와 상품 코드 등이 저장되어 있으므로, 트랜잭션 데이터만으로는 잘 팔리는 상품의 명칭 또는 카테고리 등을 알 수 없는 경우가 많다. 따라서 트랜잭션 데이터의 상품 ID와 마스터 데이터를 결합해서 상품이름, 상품 카테고리, 발매일 등을 명확하게 만들어야 리포트 업무의 폭을 넓힐 수 있다.</p>
</blockquote>
</blockquote>
</blockquote>
<blockquote>
<p><strong>로그 데이터</strong>
: 통계 또는 분석을 주 용도로 설계된 데이터, 특정 태그를 포함해서 전송된 데이터, 특정 행동을 서버 측에 출력한 데이터. 누적형 데이터 이므로 로그 출력 이후에 가격이 변경되거나 사용자 정보가 변경되더라도 기존의 데이터를 수정하지 않는다. </p>
<p><strong>로그 데이터의 특징</strong></p>
</blockquote>
<p>1) 시간, 사용자 엔드 포인트, IP, URL, 레퍼러, Cookie 등의 정보 저장하기
: 업무 데이터는 서비스와 시스템을 구축할 때 필요한 데이터이지만, 로그 데이터는 서비스의 처리에 영향이 거의 없는 정보를 저장한 것이다. 
2) 로그 데이터는 추출 방법에 따라 데이터의 정밀도가 달라진다
3) 계속 기록을 추가하는 것뿐이므로 과거의 데이터가 변경되지 않음</p>
]]></description>
        </item>
        <item>
            <title><![CDATA[데.분.레 - 도입 1]]></title>
            <link>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%8F%84%EC%9E%85-1</link>
            <guid>https://velog.io/@gugu_dragon/%EB%8D%B0.%EB%B6%84.%EB%A0%88-%EB%8F%84%EC%9E%85-1</guid>
            <pubDate>Sun, 16 May 2021 04:36:27 GMT</pubDate>
            <description><![CDATA[<p>[데이터분석을 위한 SQL 레시피]라는 새로운 책을 구입하였다.
기존에 공부하던 책은, 앞으로 이 책을 공부하면서 보조교재로 활용할 것이다.</p>
<p>회사의 선배가 추천해 준 책인데, 수준이 다소 높아보인다.
공부하는 과정에서 많이 막힐 것 같은데... 우선 해봐야 알지 않겠는가?</p>
<p>만약 책의 수준이 너무 높아, 지금 내가 하기에 버겁다면
중간에 다른 기본서를 한번 더 봐야하지 않을까싶다.</p>
]]></description>
        </item>
        <item>
            <title><![CDATA[SQL 응용 - 복습 2]]></title>
            <link>https://velog.io/@gugu_dragon/SQL-%EC%9D%91%EC%9A%A9-%EB%B3%B5%EC%8A%B5-2</link>
            <guid>https://velog.io/@gugu_dragon/SQL-%EC%9D%91%EC%9A%A9-%EB%B3%B5%EC%8A%B5-2</guid>
            <pubDate>Thu, 13 May 2021 12:15:37 GMT</pubDate>
            <description><![CDATA[<blockquote>
</blockquote>
<pre><code>SELECT *
  FROM ( SELECT ACC_LOC_NAME 사고장소
              , ACC_CNT 사고건수
              , RANK() OVER(ORDER BY ACC_CNT DESC NULLS LAST) as 순위
           FROM ACC_LOC_DATA
           WHERE ACC_YEAR = 2017)
  WHERE 순위 &lt;= 5</code></pre><p><strong>FROM</strong>절의 서브쿼리에서 교통사고 건수가 많은 순으로 순위를 부여하여 결과를 출력한다. 그리고 메인 쿼리에서 서브 쿼리의 결과 중 순위 5위까지만 제한을 걸어 출력한다. </p>
<blockquote>
</blockquote>
<pre><code>SELECT 년도 &quot;치킨집 폐업 연도&quot;
     , 치킨집 &quot;건수&quot;
  FROM (SELECT 년도
             , 치킨집
             , RANK() OVER (ORDER BY 치킨집 DESC NULLS LAST) 순위
          FROM CLOSING)
  WHERE 순위 = 1</code></pre><p><img src="https://images.velog.io/images/gugu_dragon/post/da1ef808-2585-46ea-beb6-2510ba6cca12/image.png" alt="">
<strong>FROM</strong>절의 서브쿼리문에서 치킨집 폐업 건수가 높은 순으로 순위를 출력한다. 그리고 메인 쿼리문의 <strong>WHERE</strong> 절에서 순위가 1위의 데이터만 출력합니다. </p>
]]></description>
        </item>
    </channel>
</rss>