<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>denver_almighty.log</title>
        <link>https://velog.io/</link>
        <description>For my future self, who will have forgotten</description>
        <lastBuildDate>Sun, 12 Feb 2023 06:09:36 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <image>
            <title>denver_almighty.log</title>
            <url>https://velog.velcdn.com/images/denver_almighty/profile/ecd10afb-1c9e-4096-a2c3-225faddcdf20/social_profile.jpeg</url>
            <link>https://velog.io/</link>
        </image>
        <copyright>Copyright (C) 2019. denver_almighty.log. All rights reserved.</copyright>
        <atom:link href="https://v2.velog.io/rss/denver_almighty" rel="self" type="application/rss+xml"/>
        <item>
            <title><![CDATA[[Kafka] Kafka Programming in Python]]></title>
            <link>https://velog.io/@denver_almighty/Kafka-Python-Kafka-%ED%94%84%EB%A1%9C%EA%B7%B8%EB%9E%98%EB%B0%8D</link>
            <guid>https://velog.io/@denver_almighty/Kafka-Python-Kafka-%ED%94%84%EB%A1%9C%EA%B7%B8%EB%9E%98%EB%B0%8D</guid>
            <pubDate>Sun, 12 Feb 2023 06:09:36 GMT</pubDate>
            <description><![CDATA[<h1 id="0-실행-환경">0. Environment</h1>
<blockquote>
<p>AWS EC2 t2.xlarge
OS : Red Hat 9.1
Kafka : 3.3.1
Scala : 2.13</p>
</blockquote>
<br>

<h1 id="1-python으로-kafka-producer-consumer-만들기">1. Creating a Kafka Producer and Consumer in Python</h1>
<h2 id="producerpy">producer.py</h2>
<pre><code class="language-python">
from kafka import KafkaProducer

# create kafka producer instance
producer = KafkaProducer(bootstrap_servers = [&#39;localhost:9092&#39;])

# set topic name
producer.send(&#39;first-topic&#39;, b&#39;hello world&#39;)
# reset buffer
producer.flush()
</code></pre>
<h2 id="comsumerpy">consumer.py</h2>
<pre><code class="language-python">
from kafka import KafkaConsumer

# create kafka consumer instance
consumer = KafkaConsumer(&#39;first-topic&#39;, bootstrap_servers=[&#39;localhost:9092&#39;])

# print message
for msg in consumer:
    print(msg)</code></pre>
<pre><code class="language-bash">python consumer.py
python producer.py</code></pre>
<p>Start consumer.py first, then run producer.py in another window.
<img src="https://velog.velcdn.com/images/denver_almighty/post/c13cc02c-c011-4c49-b541-76f294a9a20c/image.png" alt=""></p>
<p>The topic name, offset (message arrival order), timestamp, payload, and so on are delivered with each message.</p>
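<p>Those fields can be illustrated with a plain-Python stand-in for kafka-python's ConsumerRecord — a sketch only, with made-up values and no broker needed:</p>

```python
from collections import namedtuple

# Stand-in for kafka-python's ConsumerRecord; the real object comes from
# iterating a KafkaConsumer. The values below are illustrative only.
ConsumerRecord = namedtuple(
    "ConsumerRecord", ["topic", "partition", "offset", "timestamp", "value"]
)

def describe(msg):
    # Format the fields a consumer typically inspects.
    return (f"topic={msg.topic} partition={msg.partition} "
            f"offset={msg.offset} ts={msg.timestamp} value={msg.value.decode()}")

msg = ConsumerRecord("first-topic", 0, 0, 1676182176000, b"hello world")
print(describe(msg))
# topic=first-topic partition=0 offset=0 ts=1676182176000 value=hello world
```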
]]></description>
        </item>
        <item>
            <title><![CDATA[[Kafka] Creating a Topic]]></title>
            <link>https://velog.io/@denver_almighty/Kafka-Topic-%EB%A7%8C%EB%93%A4%EA%B8%B0</link>
            <guid>https://velog.io/@denver_almighty/Kafka-Topic-%EB%A7%8C%EB%93%A4%EA%B8%B0</guid>
            <pubDate>Sun, 12 Feb 2023 05:44:28 GMT</pubDate>
            <description><![CDATA[<h1 id="0-실행-환경">0. Environment</h1>
<blockquote>
<p>AWS EC2 t2.xlarge
OS : Red Hat 9.1
Kafka : 3.3.1
Scala : 2.13</p>
</blockquote>
<br>

<h1 id="1-topic-만들기">1. Creating a Topic</h1>
<h2 id="토픽-생성">Create a topic</h2>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/ac239401-374d-41bd-999b-7130da5f945d/image.png" alt=""></p>
<pre><code>bin/kafka-topics.sh --create --topic &lt;test-topic&gt; --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1</code></pre><h2 id="토픽-리스트-확인">List topics</h2>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/8d48e438-1fd5-4daa-aecb-54bdd1674f7b/image.png" alt=""></p>
<h2 id="producer-실행">Run the producer</h2>
<pre><code>./bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic test-topic</code></pre><p>Messages typed here are posted to the topic.
<img src="https://velog.velcdn.com/images/denver_almighty/post/2a65937a-7aee-4bf7-b04d-b666a0daabdc/image.png" alt=""></p>
<h2 id="consumer-실행">Run the consumer</h2>
<pre><code>./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test-topic</code></pre><p><img src="https://velog.velcdn.com/images/denver_almighty/post/3c78e41f-c45e-4425-ac82-80bfeec97523/image.png" alt=""></p>
<p>When a message is typed in the producer window on the left,
it appears in the consumer window on the right.</p>
<p><br><br></p>
<h1 id="2-컨슈머-그룹">2. Consumer Groups</h1>
<p>If no consumer group is specified, a unique consumer group is created automatically.</p>
<pre><code>./bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list</code></pre><p><img src="https://velog.velcdn.com/images/denver_almighty/post/7c9992d5-644b-46c7-afc4-eef3521c03ea/image.png" alt=""></p>
<h2 id="그룹-지정-컨슈머-생성">Start a consumer with a group</h2>
<pre><code>./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test-topic --group test-group</code></pre><p><img src="https://velog.velcdn.com/images/denver_almighty/post/03366987-221c-4a65-ad57-1bdc239bfa78/image.png" alt=""></p>
<h2 id="컨슈머-그룹-리스트">List consumer groups</h2>
<pre><code>./bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list</code></pre><p><img src="https://velog.velcdn.com/images/denver_almighty/post/5c17f093-b15b-4a27-8ce6-529eb6ea3fb1/image.png" alt=""></p>
<h2 id="컨슈머-그룹-상세">Describe a consumer group</h2>
<pre><code>./bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group test-group</code></pre><p><img src="https://velog.velcdn.com/images/denver_almighty/post/83453537-0e1e-4348-903a-dbad930a7fd4/image.png" alt=""></p>
<p><br><br></p>
<h1 id="3-consumer-와-partition">3. Consumers and Partitions</h1>
<h2 id="1-파티션-1개짜리-토픽">1) Topic with one partition</h2>
<h3 id="1-producer-2-consumer">1 producer, 2 consumers</h3>
<p>With no consumer group specified, a unique group is created for each consumer (see section 2),
so when the producer sends a message, every consumer receives the same message. </p>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/3755e493-4a93-4d06-a089-884fb427a188/image.png" alt="">
<br></p>
<h3 id="1-producer-1-consumer-group2-consumer">1 producer, 1 consumer group (2 consumers)</h3>
<p>When the consumers are bound into one group, only one consumer receives each message.
<img src="https://velog.velcdn.com/images/denver_almighty/post/018f863d-699e-4e5a-8022-be76b2419cd8/image.png" alt=""></p>
<br>

<h3 id="2-producer-1-consumer-group-2-consumer">2 producers, 1 consumer group (2 consumers)</h3>
<p>Even when a second producer sends messages, only one consumer receives them.
<img src="https://velog.velcdn.com/images/denver_almighty/post/1e872fdc-4d5c-4bf2-8c78-4d1f01f0f861/image.png" alt=""></p>
<p>-&gt; Because the first-topic topic has only one partition,
that single partition is mapped to exactly one consumer in the group.
-&gt; To use consumer resources efficiently, configure multiple partitions.</p>
<h2 id="partition-2개짜리-topic">2) Topic with two partitions</h2>
<h3 id="토픽-리스트-확인-1">List topics</h3>
<pre><code>./bin/kafka-topics.sh --list --bootstrap-server localhost:9092</code></pre><p><img src="https://velog.velcdn.com/images/denver_almighty/post/643ebd5f-6a71-4be1-8510-21ce59e788c0/image.png" alt=""></p>
<h3 id="partition-2개-짜리-토픽-생성">Create a topic with two partitions</h3>
<pre><code>./bin/kafka-topics.sh --create --topic topic-multi-partition --bootstrap-server localhost:9092 --partitions 2 --replication-factor 1</code></pre><p><img src="https://velog.velcdn.com/images/denver_almighty/post/971da9b8-06f2-4410-adb0-8f34476e73a2/image.png" alt=""></p>
<h3 id="2-producer-1-consumer-group2-consumer">2 producers, 1 consumer group (2 consumers)</h3>
<pre><code># producer
./bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic topic-multi-partition

# consumer
./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic topic-multi-partition --group my-newgroup</code></pre><p><img src="https://velog.velcdn.com/images/denver_almighty/post/21ac8e41-a9c3-468c-bec3-6d550193891a/image.png" alt=""></p>
<p>Messages are now distributed across the consumers.
In the screenshot, most messages from producer 1 go to consumer 1 and most from producer 2 go to consumer 2, but producer 2&#39;s messages can also reach consumer 1.</p>
]]></description>
        </item>
        <item>
            <title><![CDATA[[ERROR] Kafka Zookeeper 실행 오류 : Classpath is empty. Please build the project first e.g. by running './gradlew jar -PscalaVersion=2.13.8']]></title>
            <link>https://velog.io/@denver_almighty/ERROR-Kafka-Zookeeper-%EC%8B%A4%ED%96%89-%EC%98%A4%EB%A5%98-Classpath-is-empty.-Please-build-the-project-first-e.g.-by-running-.gradlew-jar-PscalaVersion2.13.8</link>
            <guid>https://velog.io/@denver_almighty/ERROR-Kafka-Zookeeper-%EC%8B%A4%ED%96%89-%EC%98%A4%EB%A5%98-Classpath-is-empty.-Please-build-the-project-first-e.g.-by-running-.gradlew-jar-PscalaVersion2.13.8</guid>
            <pubDate>Sun, 15 Jan 2023 06:44:53 GMT</pubDate>
            <description><![CDATA[<h1 id="0-실행-환경">0. Environment</h1>
<blockquote>
<p>AWS EC2 t2.xlarge
OS : Red Hat 9.1
Kafka : 3.3.1
Scala : 2.13</p>
</blockquote>
<h1 id="1-error">1. ERROR</h1>
<p>ZooKeeper fails to start.<img src="https://velog.velcdn.com/images/denver_almighty/post/8d7549c1-cf62-483d-8e61-ee6c377d976a/image.png" alt=""></p>
<pre><code class="language-bash">./bin/zookeeper-server-start.sh --daemon  ./config/zookeeper.properties
=&gt; 
Classpath is empty. Please build the project first e.g. by running &#39;./gradlew jar -PscalaVersion=2.13.8&#39;</code></pre>
<p>=&gt; Download the binary release, not the source archive.
<img src="https://velog.velcdn.com/images/denver_almighty/post/b2a670b7-25af-425d-ae52-8b3b6c818aab/image.png" alt=""></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[[Snowflake] Earned Badge 1]]></title>
            <link>https://velog.io/@denver_almighty/Snowflake-Badge-1-%ED%9A%8D%EB%93%9D-1ked1qc5</link>
            <guid>https://velog.io/@denver_almighty/Snowflake-Badge-1-%ED%9A%8D%EB%93%9D-1ked1qc5</guid>
            <pubDate>Sun, 08 Jan 2023 07:28:51 GMT</pubDate>
            <description><![CDATA[<p><img src="https://velog.velcdn.com/images/denver_almighty/post/f94bb001-85bb-4b89-81d9-3470650b624d/image.png" alt=""></p>
<p>I completed the Snowflake webinar and hands-on Lab 1 and received a badge.
Through Lab 1 (data warehousing), where the schema is defined up front and queries are written in Snowflake SQL, I kept wondering how this differs from an ordinary database.
Lab 2 covers building an application that uses Snowflake as the backend, which seems to be where the real part begins.</p>
]]></description>
        </item>
        <item>
            <title><![CDATA[[AWS] Auto-Stopping EC2 Instances on a Schedule]]></title>
            <link>https://velog.io/@denver_almighty/AWS-EC2-%EC%9D%B8%EC%8A%A4%ED%84%B4%EC%8A%A4-%EC%9E%90%EB%8F%99-%EC%A4%91%EC%A7%80-%EC%84%A4%EC%A0%95-ixzo5x8f</link>
            <guid>https://velog.io/@denver_almighty/AWS-EC2-%EC%9D%B8%EC%8A%A4%ED%84%B4%EC%8A%A4-%EC%9E%90%EB%8F%99-%EC%A4%91%EC%A7%80-%EC%84%A4%EC%A0%95-ixzo5x8f</guid>
            <pubDate>Sun, 08 Jan 2023 06:51:14 GMT</pubDate>
            <description><![CDATA[<p><img src="https://velog.velcdn.com/images/denver_almighty/post/2fedbdc2-7524-4ca1-9f6e-dafcc1a4a87c/image.png" alt=""> Image by 다락원</p>
<br>

<h1 id="0-배경">0. Background</h1>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/269914bc-2874-4190-880f-bac37420194e/image.png" alt=""></p>
<p>The AWS billing dashboard showed a larger charge than expected.
I must have forgotten to stop a test instance..
Paying for what I actually use is fine, but paying for idle instances I didn&#39;t use hurts ㅜㅜ
To avoid losing the next cow, let&#39;s mend the barn.
<br><br></p>
<h1 id="1-설정하기">1. Setup</h1>
<h2 id="1-policy-생성">1) Create a Policy</h2>
<pre><code class="language-json">{
    &quot;Version&quot;: &quot;2012-10-17&quot;,
    &quot;Statement&quot;: [
        {
            &quot;Sid&quot;: &quot;VisualEditor0&quot;,
            &quot;Effect&quot;: &quot;Allow&quot;,
            &quot;Action&quot;: [
                &quot;ec2:DescribeInstances&quot;,
                &quot;ec2:StartInstances&quot;,
                &quot;ec2:DescribeTags&quot;,
                &quot;logs:*&quot;,
                &quot;ec2:DescribeInstanceTypes&quot;,
                &quot;ec2:StopInstances&quot;,
                &quot;ec2:DescribeInstanceStatus&quot;
            ],
            &quot;Resource&quot;: &quot;*&quot;
        }
    ]
}</code></pre>
<h2 id="2-role-생성">2) Create a Role</h2>
<p>Create a Role for the Lambda service, attaching the policy created in 1).
<img src="https://velog.velcdn.com/images/denver_almighty/post/34a45e73-8af4-451c-ba6b-3ffdd52911a6/image.png" alt=""></p>
<h2 id="3-lambda-생성">3) Create the Lambda Function</h2>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/2f1d9ed0-791a-482b-8181-ae65c20dfdad/image.png" alt=""></p>
<p>Runtime : Python
Execution role : use an existing role -&gt; select the Role created in 2)</p>
<pre><code class="language-python">import boto3

region = &#39;ap-northeast-2&#39;
ec2 = boto3.resource(&#39;ec2&#39;, region_name=region)

def lambda_handler(event, context):
    # Get running instance list with tag AutoStop=True
    # instance-state-name : ( pending | running | shutting-down | terminated | stopping | stopped )
    instances = ec2.instances.filter(Filters=[
        {
            &#39;Name&#39;: &#39;instance-state-name&#39;, 
            &#39;Values&#39;: [&#39;running&#39;]
        }
        ,{
            &#39;Name&#39;: &#39;tag:AutoStop&#39;,
            &#39;Values&#39;:[&#39;True&#39;]
        }
    ])

    # Stop instance
    for instance in instances:
        id=instance.id
        # ec2.instances.filter(InstanceIds=[id]).start()
        ec2.instances.filter(InstanceIds=[id]).stop()
        print(&#39;Instance ID is stopped :- &#39;+instance.id)

    return &#39;success&#39;</code></pre>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/4e7a74c9-1c93-44ba-93c3-4e4015fc3191/image.png" alt=""></p>
<h2 id="4-eventbridge-설정">4) Configure EventBridge</h2>
<h3 id="규칙-세부-정보-정의">Define rule details</h3>
<p>Create an EventBridge rule -&gt; choose Schedule as the rule type.</p>
<h3 id="일정-정의">Define the schedule</h3>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/9b7181c1-70ff-4bf5-98b8-5914ac710635/image.png" alt=""></p>
<p>When defining the cron pattern, check whether the displayed time is UTC or local time.</p>
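<p>EventBridge cron expressions are evaluated in UTC, so a schedule meant for 01:00 KST has to be written as 16:00 UTC. A quick sketch of the conversion (KST is a fixed UTC+9 offset with no DST):</p>

```python
from datetime import datetime, timedelta, timezone

KST = timezone(timedelta(hours=9))  # Korea Standard Time, no DST

def kst_hour_to_utc_hour(hour_kst: int) -> int:
    # Any date works here, since the offset never changes.
    kst = datetime(2023, 1, 8, hour_kst, 0, tzinfo=KST)
    return kst.astimezone(timezone.utc).hour

# A daily 01:00 KST stop corresponds to cron(0 16 * * ? *) in EventBridge (UTC).
print(kst_hour_to_utc_hour(1))   # 16
```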
<h3 id="대상-선택">Select the target</h3>
<p>Select the Lambda function created in 3).
<img src="https://velog.velcdn.com/images/denver_almighty/post/5d2cb53e-73d8-445f-a9a8-3b64bcb91cda/image.png" alt=""></p>
<p>I tested by changing the EventBridge schedule.
The EC2 dashboard shows the instance has stopped,
and the CloudWatch log matches the test run&#39;s log.
<img src="https://velog.velcdn.com/images/denver_almighty/post/4890bd0e-5a2a-4802-a340-bf193187272b/image.png" alt=""> </p>
<p>Now that the test is done, set the schedule back to 01:00!
<img src="https://velog.velcdn.com/images/denver_almighty/post/557682b2-5392-4ac0-bfae-088c638fe9e3/image.png" alt=""></p>
<h3 id="외양간-고치기-끝-🐮🐮🐮">Barn mended 🐮🐮🐮</h3>
<p><br><br></p>
<h1 id="참고-자료">References</h1>
<p><a href="https://aws.amazon.com/ko/premiumsupport/knowledge-center/start-stop-lambda-eventbridge/">How do I stop and start Amazon EC2 instances at regular intervals using Lambda? (permissions there don&#39;t match)</a></p>
<p><a href="https://dheeraj3choudhary.com/aws-lambda-and-eventbridge-or-schedule-start-and-stop-of-ec2-instances-based-on-tags">Dheeraj Choudhary&#39;s Blog - AWS Lambda &amp; EventBridge | Schedule Start And Stop Of EC2 Instances Based On Tags...</a></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[[Python] Generating a Spotify API Token]]></title>
            <link>https://velog.io/@denver_almighty/Python-Spotify-API-token-%EC%83%9D%EC%84%B1</link>
            <guid>https://velog.io/@denver_almighty/Python-Spotify-API-token-%EC%83%9D%EC%84%B1</guid>
            <pubDate>Sun, 01 Jan 2023 10:09:43 GMT</pubDate>
            <description><![CDATA[<h1 id="0-실행-환경">0. Environment</h1>
<blockquote>
<p>Python : 3.9</p>
</blockquote>
<br>

<h1 id="1-토큰-생성">1. Generate a Token</h1>
<pre><code class="language-python">import requests
import base64
import json

client_id = &#39;&lt;Client ID&gt;&#39;
client_secret = &#39;&lt;Client Secret&gt;&#39;
endpoint = &#39;https://accounts.spotify.com/api/token&#39;


encoded = base64.b64encode(f&#39;{client_id}:{client_secret}&#39;.encode(&#39;utf-8&#39;)).decode(&#39;ascii&#39;)

headers = {&#39;Authorization&#39;: f&#39;Basic {encoded}&#39;}
payload = {&#39;grant_type&#39;: &#39;client_credentials&#39;}

response = requests.post(endpoint, data=payload, headers=headers)
# print(json.loads(response.text))
access_token = json.loads(response.text)[&#39;access_token&#39;]
print(access_token)
</code></pre>
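<p>The Basic auth header above is just base64 of <code>client_id:client_secret</code>. A self-contained check of that encoding, using dummy credentials and no network call:</p>

```python
import base64

# Dummy values for illustration; real credentials come from the Spotify dashboard.
client_id = "my-id"
client_secret = "my-secret"

encoded = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
print(encoded)

# Decoding round-trips to the original pair.
print(base64.b64decode(encoded).decode("utf-8"))  # my-id:my-secret
```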
]]></description>
        </item>
        <item>
            <title><![CDATA[[ERROR] (Not Solved) Airflow HttpSensor 400 Client Error: Bad Request for url]]></title>
            <link>https://velog.io/@denver_almighty/ERROR-Airflow-HttpSensor-400-Client-Error-Bad-Request-for-url</link>
            <guid>https://velog.io/@denver_almighty/ERROR-Airflow-HttpSensor-400-Client-Error-Bad-Request-for-url</guid>
            <pubDate>Sun, 01 Jan 2023 09:57:03 GMT</pubDate>
            <description><![CDATA[<h1 id="0-실행-환경">0. Environment</h1>
<blockquote>
<p>AWS EC2 t2.xlarge
OS : Red Hat 9.1
Python : 3.9
Airflow : 2.5.0</p>
</blockquote>
<h1 id="1-code">1. Code</h1>
<pre><code class="language-python">from datetime import datetime

from airflow import DAG
from airflow.providers.http.sensors.http import HttpSensor

with DAG(
    dag_id=&#39;nft-pipeline&#39;,
    start_date=datetime(2023, 1, 1),
) as dag:

    is_api_available = HttpSensor(
        task_id = &#39;is_api_available&#39;,
        http_conn_id = &#39;spotify_api&#39;,
        headers = {
            # &#39;Accept&#39;: &#39;application/json&#39;,
            # &#39;Content-Type&#39;: &#39;application/json&#39;,
            &#39;Authorization&#39;: 
                &#39;Bearer &lt;MYTOKEN&gt;&#39;,
        },

        request_params = {
            &#39;q&#39;: &#39;BTS&#39;,
            &#39;type&#39;: &#39;artist&#39;,
            &#39;limit&#39;: &#39;1&#39;,
        },
        method=&quot;GET&quot;,
        endpoint=&#39;v1/search&#39;
    )</code></pre>
<p>curl and requests return results, but running the task raises a 400 error.
Neither HttpSensor nor SimpleHttpOperator works.
Adding response_check=False does not help either.
providers/http/hooks/http.py and requests/models.py only check the response code, so the cause is unclear.
Removing the headers from the task definition produces the same error.</p>
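<p>The response body, &quot;Only valid bearer authentication supported&quot;, hints that Spotify considers the Authorization header itself malformed. A hypothetical sanity check on the header string — a debugging sketch, not a confirmed fix (note that client-credentials tokens also expire after an hour, though an expired token normally returns 401, not 400):</p>

```python
def check_bearer_header(value: str) -> list:
    """Flag the usual formatting problems in an Authorization header value."""
    problems = []
    if not value.startswith("Bearer "):
        problems.append("missing 'Bearer ' prefix")
        return problems
    token = value[len("Bearer "):]
    if any(ch.isspace() for ch in token):
        problems.append("token contains whitespace (check line breaks in the DAG)")
    if "<" in token or ">" in token:
        problems.append("placeholder brackets were not replaced")
    return problems

# The DAG above splits the header string across two source lines; this check
# would catch a stray newline or an unreplaced <MYTOKEN> placeholder.
print(check_bearer_header("Bearer <MYTOKEN>"))
# ['placeholder brackets were not replaced']
```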
<h1 id="2-connection">2. Connection</h1>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/39ccfab0-a7b6-4485-838b-5d4edf7a7610/image.png" alt=""></p>
<h1 id="3-api-test">3. API Test</h1>
<h2 id="1-curl-결과">1) curl result</h2>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/f41ed904-e806-49b1-a101-64ab1d30ecb9/image.png" alt=""></p>
<h2 id="2-requests">2) Requests</h2>
<pre><code class="language-python">import requests

headers = {
    &#39;Accept&#39;: &#39;application/json&#39;,
    &#39;Content-Type&#39;: &#39;application/json&#39;,
    &#39;Authorization&#39;: &#39;Bearer &lt;MYTOKEN&gt;&#39;,
}

params = {
    &#39;q&#39;: &#39;BTS&#39;,
    &#39;type&#39;: &#39;artist&#39;,
    &#39;limit&#39;: &#39;1&#39;,
}

response = requests.get(&#39;https://api.spotify.com/v1/search&#39;, params=params, headers=headers)
print(response)</code></pre><p><img src="https://velog.velcdn.com/images/denver_almighty/post/9ed5a1d6-6184-4519-8f07-08818be13037/image.png" alt=""></p>
<h1 id="4-error">4. Error</h1>
<pre><code>requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://api.spotify.com/v1/search?q=BTS&amp;type=artist&amp;limit=1</code></pre><p><img src="https://velog.velcdn.com/images/denver_almighty/post/24c0282f-3b51-4ba9-a214-346196706e38/image.png" alt=""></p>
<pre><code class="language-bash">[2023-01-01 08:52:44,836] {http.py:122} INFO - Poking: v1/search
[2023-01-01 08:52:44,839] {base.py:73} INFO - Using connection ID &#39;spotify_api&#39; for task execution.
[2023-01-01 08:52:44,840] {http.py:150} INFO - Sending &#39;GET&#39; to url: https://api.spotify.com/v1/search
[2023-01-01 08:52:44,988] {http.py:163} ERROR - HTTP error: Bad Request
[2023-01-01 08:52:44,989] {http.py:164} ERROR - {
  &quot;error&quot;: {
    &quot;status&quot;: 400,
    &quot;message&quot;: &quot;Only valid bearer authentication supported&quot;
  }
}
[2023-01-01 08:52:44,989] {taskinstance.py:1772} ERROR - Task failed with exception
Traceback (most recent call last):
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/providers/http/hooks/http.py&quot;, line 161, in check_response
    response.raise_for_status()
  File &quot;/opt/anaconda/anaconda3/lib/python3.9/site-packages/requests/models.py&quot;, line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://api.spotify.com/v1/search?q=BTS&amp;type=artist&amp;limit=1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/sensors/base.py&quot;, line 199, in execute
    poke_return = self.poke(context)
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/providers/http/sensors/http.py&quot;, line 137, in poke
    raise exc
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/providers/http/sensors/http.py&quot;, line 124, in poke
    response = hook.run(
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/providers/http/hooks/http.py&quot;, line 151, in run
    return self.run_and_check(session, prepped_request, extra_options)
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/providers/http/hooks/http.py&quot;, line 204, in run_and_check
    self.check_response(response)
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/providers/http/hooks/http.py&quot;, line 165, in check_response
    raise AirflowException(str(response.status_code) + &quot;:&quot; + response.reason)
airflow.exceptions.AirflowException: 400:Bad Request
[2023-01-01 08:52:44,990] {taskinstance.py:1322} INFO - Marking task as FAILED. dag_id=nft-pipeline, task_id=is_api_available, execution_date=20230101T085244, start_date=, end_date=20230101T085244
Traceback (most recent call last):
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/providers/http/hooks/http.py&quot;, line 161, in check_response
    response.raise_for_status()
  File &quot;/opt/anaconda/anaconda3/lib/python3.9/site-packages/requests/models.py&quot;, line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://api.spotify.com/v1/search?q=BTS&amp;type=artist&amp;limit=1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File &quot;/home/ec2-user/.local/bin/airflow&quot;, line 8, in &lt;module&gt;
    sys.exit(main())
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/__main__.py&quot;, line 39, in main
    args.func(args)
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/cli/cli_parser.py&quot;, line 52, in command
    return func(*args, **kwargs)
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/utils/cli.py&quot;, line 108, in wrapper
    return f(*args, **kwargs)
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/cli/commands/task_command.py&quot;, line 576, in task_test
    ti.run(ignore_task_deps=True, ignore_ti_state=True, test_mode=True)
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/utils/session.py&quot;, line 75, in wrapper
    return func(*args, session=session, **kwargs)
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py&quot;, line 1673, in run
    self._run_raw_task(
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/utils/session.py&quot;, line 72, in wrapper
    return func(*args, **kwargs)
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py&quot;, line 1378, in _run_raw_task
    self._execute_task_with_callbacks(context, test_mode)
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py&quot;, line 1524, in _execute_task_with_callbacks
    result = self._execute_task(context, task_orig)
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py&quot;, line 1585, in _execute_task
    result = execute_callable(context=context)
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/sensors/base.py&quot;, line 199, in execute
    poke_return = self.poke(context)
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/providers/http/sensors/http.py&quot;, line 137, in poke
    raise exc
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/providers/http/sensors/http.py&quot;, line 124, in poke
    response = hook.run(
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/providers/http/hooks/http.py&quot;, line 151, in run
    return self.run_and_check(session, prepped_request, extra_options)
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/providers/http/hooks/http.py&quot;, line 204, in run_and_check
    self.check_response(response)
  File &quot;/home/ec2-user/.local/lib/python3.9/site-packages/airflow/providers/http/hooks/http.py&quot;, line 165, in check_response
    raise AirflowException(str(response.status_code) + &quot;:&quot; + response.reason)
airflow.exceptions.AirflowException: 400:Bad Request</code></pre>
<h1 id="참고-자료">References</h1>
<p><a href="https://github.com/apache/airflow/blob/main/airflow/providers/http/hooks/http.py">Airflow : providers/http/hooks/http.py</a>
<a href="https://github.com/psf/requests/blob/main/requests/models.py">Python : requests/models.py</a>
<a href="https://airflow-apache.readthedocs.io/en/latest/_api/airflow/sensors/http_sensor/index.html">Airflow : HttpSensor</a>
<a href="https://curlconverter.com/python/">CurlConverter</a></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[[Airflow] Installing Airflow (pip)]]></title>
            <link>https://velog.io/@denver_almighty/Airflow-Airflow-%EC%84%A4%EC%B9%98%ED%95%98%EA%B8%B0pip</link>
            <guid>https://velog.io/@denver_almighty/Airflow-Airflow-%EC%84%A4%EC%B9%98%ED%95%98%EA%B8%B0pip</guid>
            <pubDate>Thu, 29 Dec 2022 15:08:45 GMT</pubDate>
            <description><![CDATA[<h1 id="0-실행-환경">0. Environment</h1>
<blockquote>
<p>AWS EC2 t2.xlarge
OS : Red Hat 9.1
Python : 3.9
Airflow : 2.5.0</p>
</blockquote>
<br>

<h1 id="1-설치하기">1. Installation</h1>
<pre><code class="language-bash"># Check Python is 3.6+ and pip resolves to the anaconda3 path
pip --version

# Install
pip install apache-airflow

# An airflow directory is created under home
cd /home/ec2-user/airflow

# Initialize the metadata DB
airflow db init

# Run the webserver on port 8080
airflow webserver -p 8080

# In a new session: ssh port forwarding
ssh -i &quot;&lt;my_key.pem&gt;&quot; -L 8080:localhost:8080 ec2-user@&lt;my.instance.ip&gt;

# Create an admin account
airflow users create --role Admin --username &lt;USERNAME&gt; \
--password &lt;PASSWORD&gt; --firstname &lt;FIRSTNAME&gt; \
--lastname &lt;LASTNAME&gt; --email &lt;MYEMAIL&gt;
</code></pre>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/f2e9e30c-87bd-4746-9af0-603b29c541ae/image.png" alt=""></p>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/bc57acb9-7e79-470e-8f4a-4a8413438bb0/image.png" alt="">
The web server runs on Flask.</p>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/b61cd5a7-fa32-42c2-919d-157969fb39af/image.png" alt=""></p>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/065edef4-ccfd-4367-bdd5-3d2ce950b7cb/image.png" alt=""></p>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/00e40fae-362a-4c1c-9a96-1602b4bf17e1/image.png" alt=""></p>
<p>Since this is installed on AWS rather than locally, <a href="https://almighty-denver.tistory.com/entry/Python-EC2%EC%97%90-Jupyter-Notebook-%EC%8B%A4%ED%96%89%ED%95%98%EA%B8%B0ssh-%ED%8F%AC%ED%8A%B8%ED%8F%AC%EC%9B%8C%EB%94%A9">ssh port forwarding</a> is required.</p>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/e387019d-bd42-4e5b-8110-e995e2484984/image.png" alt=""></p>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/e474aeae-d3d3-4102-aa97-7e51de9eae17/image.png" alt=""></p>
<br>

<hr>
<h1 id="참고-자료">References</h1>
<p><a href="https://velog.io/@denver_almighty/Python-EC2%EC%97%90-Jupyter-Notebook-%EC%8B%A4%ED%96%89%ED%95%98%EA%B8%B0ssh-%ED%8F%AC%ED%8A%B8%ED%8F%AC%EC%9B%8C%EB%94%A9">ssh port forwarding</a></p>
<hr>
]]></description>
        </item>
        <item>
            <title><![CDATA[[Spark] Spark Streaming]]></title>
            <link>https://velog.io/@denver_almighty/Spark-Spark-Streaming</link>
            <guid>https://velog.io/@denver_almighty/Spark-Spark-Streaming</guid>
            <pubDate>Sun, 18 Dec 2022 11:21:19 GMT</pubDate>
            <description><![CDATA[<p>The Spark Streaming example from the Spark docs:
counting words received on localhost:9999.</p>
<h1 id="0-실행-환경">0. Environment</h1>
<blockquote>
<p>AWS EC2 t2.xlarge
OS : Red Hat 9.1
Python : 3.9
Spark : 3.3.1
Scala : 2.12.15
Java : OpenJDK 64-Bit Server VM, 1.8.0_352</p>
</blockquote>
<h1 id="1-streaming-test">1. Streaming Test</h1>
<h2 id="1-1-streamingpy-생성">1-1. Create streaming.py</h2>
<pre><code>vi streaming.py</code></pre><pre><code class="language-python">from pyspark.sql import SparkSession
from pyspark.sql.functions import *

# Create SparkSession
spark = SparkSession \
    .builder \
    .appName(&quot;StructuredNetworkWordCount&quot;) \
    .getOrCreate()

# localhost:9999 streaming input -&gt; Create DataFrame
lines = spark \
    .readStream \
    .format(&quot;socket&quot;) \
    .option(&quot;host&quot;, &quot;localhost&quot;) \
    .option(&quot;port&quot;, 9999) \
    .load()


# Split input by &quot; &quot; as word
words = lines.select(
   explode(
       split(lines.value, &quot; &quot;)
   ).alias(&quot;word&quot;)
)

# Count words
wordCounts = words.groupBy(&quot;word&quot;).count()

# Print number of words
query = wordCounts \
    .writeStream \
    .outputMode(&quot;complete&quot;) \
    .format(&quot;console&quot;) \
    .start()

query.awaitTermination()



# Alternative: the same pipeline written with expr() (use in place of the block above)
words_df = lines.select(expr(&quot;explode(split(value, &#39; &#39;)) as word&quot;))
counts_df = words_df.groupBy(&quot;word&quot;).count()
word_count_query = counts_df.writeStream.format(&quot;console&quot;)\
                            .outputMode(&quot;complete&quot;)\
                            .option(&quot;checkpointLocation&quot;, &quot;.checkpoint&quot;)\
                            .start()
word_count_query.awaitTermination()
</code></pre>
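<p>The transformation inside the streaming query is easier to see without Spark. A plain-Python equivalent of explode(split(value, &quot; &quot;)) followed by groupBy(&quot;word&quot;).count() — an illustration only, not part of the Spark job:</p>

```python
from collections import Counter

def word_count(lines):
    # explode(split(value, " ")) -> one element per word;
    # Counter plays the role of groupBy("word").count().
    words = [w for line in lines for w in line.split(" ") if w]
    return dict(Counter(words))

print(word_count(["hello spark", "hello streaming"]))
# {'hello': 2, 'spark': 1, 'streaming': 1}
```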
<h2 id="1-2-streaming-실행">1-2. Run the streaming job</h2>
<pre><code class="language-bash">spark-submit streaming.py localhost 9999
</code></pre>
<h2 id="1-3-netcat실행">1-3. Run netcat</h2>
<pre><code class="language-bash"># In another session, run:
nc -lk 9999
# -&gt; then type some text</code></pre>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/a64532f4-0659-4537-acea-25c2897a1884/image.png" alt=""><img src="https://velog.velcdn.com/images/denver_almighty/post/c71352bd-c6c1-4aac-8b40-308a40fbded2/image.png" alt=""></p>
<h2 id="1-4-결과">1-4. Results</h2>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/9a2b7e84-6c62-42a3-b351-518e2c1778d2/image.png" alt=""><img src="https://velog.velcdn.com/images/denver_almighty/post/79dc1d25-1106-4e6e-a3f0-21f6bdc11b39/image.png" alt=""></p>
<br>

<h1 id="2-readsteam-options">2. readStream Options</h1>
<pre><code class="language-python"># socket source (for testing: reads UTF-8 text; fault tolerance is not guaranteed)
spark.readStream \
    .format(&quot;socket&quot;) \
    .option(&quot;host&quot;, &quot;localhost&quot;) \
    .option(&quot;port&quot;, 9999) \
    .load()

# rate source (for testing: generates a given number of rows per second)

# kafka source
spark.readStream \
    .format(&quot;kafka&quot;) \
    .option(&quot;kafka.bootstrap.servers&quot;, &quot;localhost:9092&quot;) \
    .option(&quot;subscribe&quot;, &quot;topic1&quot;) \
    .load()

# file source
# supported formats: text, csv, json, orc, parquet
userSchema = StructType().add(&quot;name&quot;, &quot;string&quot;).add(&quot;age&quot;, &quot;integer&quot;)
csvDF = spark \
    .readStream \
    .option(&quot;sep&quot;, &quot;;&quot;) \
    .schema(userSchema) \
    .csv(&quot;/path/to/directory&quot;) </code></pre>
<h1 id="q">Q.</h1>
<ol>
<li><p>readStream(&quot;socket&quot;).option(&quot;host&quot;, HOST):
can HOST be something other than localhost?</p>
</li>
<li><p>Test with the Kafka and file sources</p>
</li>
<li><p>Resolve the checkpointLocation error</p>
<pre><code class="language-python">word_count_query = df.writeStream.format(&quot;console&quot;)\
                         .outputMode(&quot;complete&quot;)\
                         .option(&quot;checkpointLocation&quot;, &quot;.checkpoint&quot;)\
                         .start()</code></pre>
</li>
</ol>
<h1 id="참고-자료">References</h1>
<p><a href="https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html">structured-streaming-programming-guide</a>
<a href="https://spark-korea.github.io/docs/structured-streaming-programming-guide.html">structured-streaming-programming-guide - KO</a></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[[Spark] Practicing SQL]]></title>
            <link>https://velog.io/@denver_almighty/Spark-SQL</link>
            <guid>https://velog.io/@denver_almighty/Spark-SQL</guid>
            <pubDate>Sun, 18 Dec 2022 07:44:16 GMT</pubDate>
            <description><![CDATA[<h1 id="0-실행-환경">0. 실행 환경</h1>
<blockquote>
<p>AWS EC2 t2.xlarge
OS : Red Hat 9.1
Python : 3.9
Spark : 3.3.1
Scala : 2.12.15
Java : OpenJDK 64-Bit Server VM, 1.8.0_352</p>
</blockquote>
<h1 id="1-sql-연습">1. SQL 연습</h1>
<pre><code class="language-python"># create data list
stockSchema = [&quot;name&quot;, &quot;ticker&quot;, &quot;country&quot;, &quot;price&quot;, &quot;currency&quot;]
stocks = [
    (&#39;Google&#39;, &#39;GOOGL&#39;, &#39;USA&#39;, 2984, &#39;USD&#39;), 
    (&#39;Netflix&#39;, &#39;NFLX&#39;, &#39;USA&#39;, 645, &#39;USD&#39;),
    (&#39;Amazon&#39;, &#39;AMZN&#39;, &#39;USA&#39;, 3518, &#39;USD&#39;),
    (&#39;Tesla&#39;, &#39;TSLA&#39;, &#39;USA&#39;, 1222, &#39;USD&#39;),
    (&#39;Tencent&#39;, &#39;0700&#39;, &#39;Hong Kong&#39;, 483, &#39;HKD&#39;),
    (&#39;Toyota&#39;, &#39;7203&#39;, &#39;Japan&#39;, 2006, &#39;JPY&#39;),
    (&#39;Samsung&#39;, &#39;005930&#39;, &#39;Korea&#39;, 70600, &#39;KRW&#39;),
    (&#39;Kakao&#39;, &#39;035720&#39;, &#39;Korea&#39;, 125000, &#39;KRW&#39;),
]

# create DataFrame (list to dataframe)
df = spark.createDataFrame(data=stocks, schema=stockSchema)

# create DataFrame (read csv file)
filename = &quot;/my/dir/filename.csv&quot;
# multiple files
filename = &quot;/my/dir/*.csv&quot;
df = spark.read.csv(f&quot;file:///{filename}&quot;, inferSchema=True, header=True)

# show data type
df.dtypes
&quot;&quot;&quot;
[(&#39;name&#39;, &#39;string&#39;),
 (&#39;ticker&#39;, &#39;string&#39;),
 (&#39;country&#39;, &#39;string&#39;),
 (&#39;price&#39;, &#39;bigint&#39;),
 (&#39;currency&#39;, &#39;string&#39;)]
&quot;&quot;&quot;

# describe() : print basic summary statistics
df.describe().show()
df.select(&quot;total_amount&quot;).describe().show()
&quot;&quot;&quot;
+-------+------------------+
|summary|      total_amount|
+-------+------------------+
|  count|           9344926|
|   mean|18.217332152376397|
| stddev|184.27259172356767|
|    min|            -647.8|
|    max|          398469.2|
+-------+------------------+
&quot;&quot;&quot;

# print DataFrame
df.show()
&quot;&quot;&quot;                                                        
+-------+------+---------+------+--------+
|   name|ticker|  country| price|currency|
+-------+------+---------+------+--------+
| Google| GOOGL|      USA|  2984|     USD|
|Netflix|  NFLX|      USA|   645|     USD|
| Amazon|  AMZN|      USA|  3518|     USD|
|  Tesla|  TSLA|      USA|  1222|     USD|
|Tencent|  0700|Hong Kong|   483|     HKD|
| Toyota|  7203|    Japan|  2006|     JPY|
|Samsung|005930|    Korea| 70600|     KRW|
|  Kakao|035720|    Korea|125000|     KRW|
+-------+------+---------+------+--------+
&quot;&quot;&quot;

# create a Spark Temporary View named &quot;stocks&quot;
df.createOrReplaceTempView(&quot;stocks&quot;)

# use SQL
spark.sql(&quot;select name from stocks&quot;)
&quot;&quot;&quot;
DataFrame[name: string]
&quot;&quot;&quot;
spark.sql(&quot;select price from stocks&quot;)
&quot;&quot;&quot;
DataFrame[price: bigint]
&quot;&quot;&quot;

# spark.sql(&quot;SQL&quot;).show() : show(n) prints n rows. default 20
spark.sql(&quot;select name from stocks&quot;).show()
&quot;&quot;&quot;
+-------+
|   name|
+-------+
| Google|
|Netflix|
| Amazon|
|  Tesla|
|Tencent|
| Toyota|
|Samsung|
|  Kakao|
+-------+
&quot;&quot;&quot;

spark.sql(&quot;select name, country from stocks where name like &#39;S%&#39;&quot;).show()
&quot;&quot;&quot;
+-------+-------+
|   name|country|
+-------+-------+
|Samsung|  Korea|
+-------+-------+
&quot;&quot;&quot;

# JOIN
spark.sql(&quot;select A.name, (A.price/B.eps) from A join B on A.name = B.name &quot;).show()

# explain(True)
spark.sql(&quot;select A.name, (A.price/B.eps) from A join B on A.name = B.name &quot;).explain()

# Datetime Format
# EEE : abbreviated day of week (e.g. Wed)
# EEEE : full day of week (e.g. Wednesday)
query = &quot;&quot;&quot;
SELECT 
    d.datetime,
    DATE_FORMAT(d.datetime, &#39;EEEE&#39;) AS day_of_week,
    COUNT(*) AS cnt
FROM
    df as d
GROUP BY
    d.datetime,
    day_of_week
&quot;&quot;&quot;

# DataFrame to pandas DataFrame
# pd_df can then be used like an ordinary pandas DataFrame with seaborn, matplotlib, etc.
pd_df = spark.sql(query).toPandas()

</code></pre>
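As a quick way to cross-check the EEE/EEEE day-of-week values outside Spark, Python's strftime offers the roughly equivalent %a/%A directives (a plain-Python illustration, not Spark's formatter):

```python
from datetime import datetime

d = datetime(2022, 12, 14)  # a Wednesday

# Spark's EEE roughly corresponds to Python's %a, EEEE to %A
print(d.strftime("%a"))  # Wed
print(d.strftime("%A"))  # Wednesday
```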
</br>
<h3 id="bigint">bigint</h3>
<p>Running df.dtypes shows that price has the type bigint.
bigint is the largest integer data type in SQL Server, stored in 8 bytes
(-9,223,372,036,854,775,808 ~ 9,223,372,036,854,775,807).</p>
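Those bounds are just the signed 64-bit integer range, which is easy to sanity-check in plain Python (nothing Spark- or SQL-specific here):

```python
# signed 64-bit (8-byte) integer range, matching the bigint bounds above
BIGINT_MIN = -(2 ** 63)
BIGINT_MAX = 2 ** 63 - 1

print(BIGINT_MIN)  # -9223372036854775808
print(BIGINT_MAX)  # 9223372036854775807
```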


<h3 id="sparksession">SparkSession</h3>
<p>pyspark.sql.SparkSession
The entry point for programming Spark with the Dataset and DataFrame API.
A SparkSession is used to create DataFrames, register DataFrames as tables, and read parquet files.
To create a SparkSession, use the builder pattern.</p>
<pre><code class="language-python">spark = SparkSession.builder \
    .master(&quot;local&quot;) \
    .appName(&quot;Word Count&quot;) \
    .config(&quot;spark.some.config.option&quot;, &quot;some-value&quot;) \
    .getOrCreate()</code></pre>
</br>

<h3 id="createorreplacetempview">createOrReplaceTempView</h3>
<p>DATAFRAME.createOrReplaceTempView(&quot;VIEW_NAME&quot;)
Creates or replaces a local temporary view (VIEW_NAME) from the DataFrame (DATAFRAME).
The lifetime of the temporary view is tied to the SparkSession used to create the DataFrame; when the session ends, the view is dropped.</p>
<h2 id="spark-function">Spark Function</h2>
<h3 id="date_truncdate-fmt">date_trunc(fmt, ts)</h3>
<p> : returns ts with everything below the fmt unit zeroed out.</p>
<blockquote>
<p> date_trunc(fmt, ts)
: Returns timestamp ts truncated to the unit specified by the format model fmt. fmt must be one of [&quot;YEAR&quot;, &quot;YYYY&quot;, &quot;YY&quot;, &quot;MON&quot;, &quot;MONTH&quot;, &quot;MM&quot;, &quot;DAY&quot;, &quot;DD&quot;, &quot;HOUR&quot;, &quot;MINUTE&quot;, &quot;SECOND&quot;, &quot;WEEK&quot;, &quot;QUARTER&quot;].</p>
</blockquote>
<pre><code class="language-sql">SELECT date_trunc(&#39;YEAR&#39;, &#39;2015-03-05T09:32:05.359&#39;);
-- -&gt; 2015-01-01T00:00:00
SELECT date_trunc(&#39;MM&#39;, &#39;2015-03-05T09:32:05.359&#39;);
-- -&gt; 2015-03-01T00:00:00
SELECT date_trunc(&#39;DD&#39;, &#39;2015-03-05T09:32:05.359&#39;);
-- -&gt; 2015-03-05T00:00:00
SELECT date_trunc(&#39;HOUR&#39;, &#39;2015-03-05T09:32:05.359&#39;);
-- -&gt; 2015-03-05T09:00:00</code></pre>
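Note that Spark's date_trunc takes the format unit first: date_trunc(fmt, ts). For cross-checking query results outside Spark, the same truncation semantics can be mimicked with datetime.replace; this is an illustrative plain-Python sketch, not Spark's implementation:

```python
from datetime import datetime

def date_trunc(fmt: str, ts: datetime) -> datetime:
    """Zero out everything below the unit given by fmt (a subset of Spark's units)."""
    if fmt in ("YEAR", "YYYY", "YY"):
        return ts.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    if fmt in ("MON", "MONTH", "MM"):
        return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if fmt in ("DAY", "DD"):
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if fmt == "HOUR":
        return ts.replace(minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported fmt: {fmt}")

ts = datetime.fromisoformat("2015-03-05T09:32:05.359")
print(date_trunc("YEAR", ts))  # 2015-01-01 00:00:00
print(date_trunc("HOUR", ts))  # 2015-03-05 09:00:00
```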
</br>

<h1 id="q">Q</h1>
<ul>
<li>If there are multiple files, create the DataFrame like this:<pre><code class="language-python">filename = &quot;*.csv&quot;
df = spark.read.csv(f&quot;file:///{filename}&quot;, inferSchema=True, header=True)</code></pre>
This is fine if the schemas are identical,
but if files with different schemas (call them A and B) are mixed together:<pre><code class="language-python">df.printSchema()
# -&gt; prints only A&#39;s schema

trips_df.select(&quot;B_COLUMN&quot;).show()
# -&gt; Column &#39;B_COLUMN&#39; does not exist.
</code></pre>
</li>
</ul>
</br>
<ul>
<li>spark.sql(&quot;QUERY&quot;) vs df.select(&quot;&quot;).describe().show()</li>
</ul>
</br></br>

<h1 id="참고-자료">참고 자료</h1>
<p><a href="https://www.dofactory.com/sql/bigint">bigint</a>
<a href="https://spark.apache.org/docs/3.1.3/api/python/reference/api/pyspark.sql.SparkSession.html#pyspark.sql.SparkSession">Spark Docs : pyspark.sql.SparkSession</a>
<a href="https://spark.apache.org/docs/3.1.3/api/python/reference/api/pyspark.sql.DataFrame.createOrReplaceTempView.html">Spark Docs : pyspark.sql.DataFrame.createOrReplaceTempView</a>
<a href="https://spark.apache.org/docs/2.3.0/api/sql/index.html">Spark Docs : Spark functions</a>
<a href="https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html">Spark Docs : Datetime Pattern</a>
<a href="https://dbmstutorials.com/pyspark/spark-dataframe-format-timestamp.html">PySpark Datetime Format</a></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[[Python] EC2에 Jupyter Notebook 실행하기(ssh 포트포워딩)]]></title>
            <link>https://velog.io/@denver_almighty/Python-EC2%EC%97%90-Jupyter-Notebook-%EC%8B%A4%ED%96%89%ED%95%98%EA%B8%B0ssh-%ED%8F%AC%ED%8A%B8%ED%8F%AC%EC%9B%8C%EB%94%A9</link>
            <guid>https://velog.io/@denver_almighty/Python-EC2%EC%97%90-Jupyter-Notebook-%EC%8B%A4%ED%96%89%ED%95%98%EA%B8%B0ssh-%ED%8F%AC%ED%8A%B8%ED%8F%AC%EC%9B%8C%EB%94%A9</guid>
            <pubDate>Sat, 10 Dec 2022 13:40:33 GMT</pubDate>
            <description><![CDATA[<h1 id="0-실행-환경">0. 실행 환경</h1>
<blockquote>
<p>AWS EC2 t2.xlarge
OS : Red Hat 9.1
Python :3.9
Jupyter Notebook 6.4.12</p>
</blockquote>
<h1 id="1-jupyter-notebook-설정">1. Jupyter Notebook 설정</h1>
<p>Jupyter Notebook 설치 후 비밀번호를 설정한다.
꼭 필요한 과정은 아니다.
다만 비밀번호를 생성하지 않으면 실행 시 마다 생성되는 token 값으로 접속해야한다.</p>
<h2 id="1-1-jupyter-notebook-비밀번호-생성">1-1. Jupyter Notebook 비밀번호 생성</h2>
<pre><code class="language-python">python
&gt;&gt;&gt; from notebook.auth import passwd
&gt;&gt;&gt; passwd()
Enter password: # enter the password
Verify password: # enter the password again
&#39;&lt;argon2/sha ... PASSWORD_HASH&gt;&#39;
# after entering the password twice, the password hash is printed. Be sure to copy it!
&gt;&gt;&gt; exit()
</code></pre>
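For reference, the hash newer notebook versions print starts with argon2, while older versions used a sha1 scheme of the form algorithm:salt:digest. The sketch below illustrates that legacy format in plain Python; it is an approximation to explain the output, not the exact notebook implementation:

```python
import hashlib
import secrets

def legacy_notebook_passwd(passphrase: str, algorithm: str = "sha1") -> str:
    """Sketch of the legacy 'algorithm:salt:digest' notebook password hash."""
    salt = secrets.token_hex(6)  # 12 hex characters
    h = hashlib.new(algorithm)
    h.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    return f"{algorithm}:{salt}:{h.hexdigest()}"

print(legacy_notebook_passwd("my-password"))  # e.g. 'sha1:<salt>:<digest>'
```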
<h2 id="1-2-비밀번호-설정">1-2. 비밀번호 설정</h2>
<pre><code class="language-bash"># generate the config file
jupyter notebook --generate-config
# edit the config file
vi /home/ec2-user/.jupyter/jupyter_notebook_config.py
# paste the password hash generated above
c.NotebookApp.password = u&#39;&lt;PASSWORD_HASH&gt;&#39;</code></pre>
</br>

<h1 id="2-jupyter-notebook-실행">2. Jupyter Notebook 실행</h1>
<pre><code>jupyter notebook</code></pre><p>1번을 생략했다면 접속 URL을 출력하는데 복사해둔다.
<img src="https://velog.velcdn.com/images/denver_almighty/post/09172c1d-3a7f-4956-9147-d6dfda87af5f/image.png" alt="">
</br></p>
<h1 id="3-ssh-포트포워딩">3. SSH 포트포워딩</h1>
<pre><code class="language-bash"># 서버 8888포트를 로컬 &lt;LOCAL_PORT&gt; 포트로 포트포워딩
# (jupyter notebook 기본 포트 8888)
ssh -i &quot;&lt;key.pem&gt;&quot; -L &lt;LOCAL_PORT&gt;:localhost:8888 &lt;username&gt;@&lt;public_ip&gt;</code></pre>
<pre><code class="language-bash"># 실행중인 프로세스 확인
ps</code></pre>
<p>  <img src="https://velog.velcdn.com/images/denver_almighty/post/27980798-2e80-4cff-98ce-e4e9455b6cc5/image.png" alt=""></p>
<pre><code class="language-bash"># LISTEN 포트 확인
sudo lsof -i -P -n | grep LISTEN
# 58971 프로세스에서 8888 포트 LISTEN</code></pre>
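The same LISTEN check can be scripted with Python's standard socket module when lsof is unavailable; this is a generic convenience sketch, not part of the Jupyter setup itself:

```python
import socket

def port_is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# once the SSH tunnel is up, this should report True:
# port_is_listening("localhost", 8888)
```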
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/a646be75-0c22-4e67-989a-f19f0eb6b24e/image.png" alt="">
</br></p>
<h1 id="4-jupyter-notebook-접속">4. Jupyter Notebook 접속</h1>
<pre><code>localhost:8888</code></pre><p>비밀번호 입력 창이 나오는데 1번에서 입력한 비밀번호를 입력한다.
<img src="https://velog.velcdn.com/images/denver_almighty/post/af12d7a6-d4c9-466a-a738-e476dca41e0f/image.png" alt=""></p>
<p>1번을 생략했다면 2번에서 복사한 URL로 접속한다.
<img src="https://velog.velcdn.com/images/denver_almighty/post/301ab4fc-3da4-4aea-8838-46da574ebecf/image.png" alt=""></p>
<p></br></br></br></p>
<h1 id="실패-기록">실패 기록</h1>
<p>Jupyter Notebook 설치, 실행 후 
http(s)://<PUBLIC_IP>:8888 로 접속하면 연결이 안됐다.
인바운드는 내 ip에서는 모두 허용이었다.
인증 키를 만들고,
비밀번호를 생성하고,
jupyter_notebook_config.py 에 설정하고,
접속하면 된다던데 안된다.</p>
<p>접속 시도하면 아래 로그가 발생한다.
https말고 http로 접속하라는데 그래도 안된다.</p>
<pre><code class="language-bash">handle: &lt;Handle BaseAsyncIOLoop._handle_events(8, 1)&gt;
Traceback (most recent call last):
  File &quot;/opt/anaconda/anaconda3/lib/python3.9/asyncio/events.py&quot;, line 80, in _run
    self._context.run(self._callback, *self._args)
  File &quot;/opt/anaconda/anaconda3/lib/python3.9/site-packages/tornado/platform/asyncio.py&quot;, line 189, in _handle_events
    handler_func(fileobj, events)
  File &quot;/opt/anaconda/anaconda3/lib/python3.9/site-packages/tornado/netutil.py&quot;, line 276, in accept_handler
    callback(connection, address)
  File &quot;/opt/anaconda/anaconda3/lib/python3.9/site-packages/tornado/tcpserver.py&quot;, line 288, in _handle_connection
    connection = ssl_wrap_socket(
  File &quot;/opt/anaconda/anaconda3/lib/python3.9/site-packages/tornado/netutil.py&quot;, line 608, in ssl_wrap_socket
    context = ssl_options_to_context(ssl_options)
  File &quot;/opt/anaconda/anaconda3/lib/python3.9/site-packages/tornado/netutil.py&quot;, line 576, in ssl_options_to_context
    context.load_cert_chain(
ssl.SSLError: [SSL] PEM lib (_ssl.c:4065)</code></pre>
</br>

<h1 id="q">Q.</h1>
<p>ssh는 원격접속할 때나 썼는데 이렇게 포트포워딩은 처음 해봤다.
SSH 포트포워딩 알아보자</p>
<h1 id="참고-자료">참고 자료</h1>
<p><a href="https://docs.jupyter.org/en/latest/">Jupyter Notebook DOCS</a></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[[Kafka] Docker Compose로 Kafka 멀티 브로커 구성]]></title>
            <link>https://velog.io/@denver_almighty/Kafka-Docker-Compose%EB%A1%9C-Kafka-%EB%A9%80%ED%8B%B0-%EB%B8%8C%EB%A1%9C%EC%BB%A4-%EA%B5%AC%EC%84%B1</link>
            <guid>https://velog.io/@denver_almighty/Kafka-Docker-Compose%EB%A1%9C-Kafka-%EB%A9%80%ED%8B%B0-%EB%B8%8C%EB%A1%9C%EC%BB%A4-%EA%B5%AC%EC%84%B1</guid>
            <pubDate>Sat, 26 Nov 2022 11:51:53 GMT</pubDate>
            <description><![CDATA[<h1 id="0-실행환경">0. 실행환경</h1>
<blockquote>
<p>AWS EC2 t2.xlarge
OS : Ubuntu 22.04
Kafka : 
Docker Compose : v2.7.0</p>
</blockquote>
</br>

<h1 id="1-실행">1. 실행</h1>
<h2 id="1-docker-composeyml-생성">1) docker-compose.yml 생성</h2>
<pre><code>vi docker-compose.yml</code></pre><pre><code class="language-yml">version: &#39;2&#39;
services:
  zookeeper-1:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_SERVER_ID: 1
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
      ZOOKEEPER_INIT_LIMIT: 5
      ZOOKEEPER_SYNC_LIMIT: 2
    ports:
      - &quot;22181:2181&quot;

  zookeeper-2:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_SERVER_ID: 2
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
      ZOOKEEPER_INIT_LIMIT: 5
      ZOOKEEPER_SYNC_LIMIT: 2
    ports:
      - &quot;32181:2181&quot;

  zookeeper-3:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_SERVER_ID: 3
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
      ZOOKEEPER_INIT_LIMIT: 5
      ZOOKEEPER_SYNC_LIMIT: 2
    ports:
      - &quot;42181:2181&quot;



  kafka-1:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper-1
      - zookeeper-2
      - zookeeper-3
    ports:
      - 29092:29092
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka-1:9092,PLAINTEXT_HOST://localhost:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1

  kafka-2:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper-1
      - zookeeper-2
      - zookeeper-3
    ports:
      - &quot;39092:39092&quot;
    environment:
      KAFKA_BROKER_ID: 2
      KAFKA_ZOOKEEPER_CONNECT: zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka-2:9092,PLAINTEXT_HOST://localhost:39092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1

  kafka-3:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper-1
      - zookeeper-2
      - zookeeper-3
    ports:
      - &quot;49092:49092&quot;
    environment:
      KAFKA_BROKER_ID: 3
      KAFKA_ZOOKEEPER_CONNECT: zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka-3:9092,PLAINTEXT_HOST://localhost:49092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
</code></pre>
<pre><code class="language-docker"># docker-compose configuration variables
depends_on : sets service startup order; this service starts only after the services listed in depends_on are up.
environment: sets environment variables

# Kafka configuration variables
KAFKA_BROKER_ID: must be unique.
KAFKA_ZOOKEEPER_CONNECT: specifies the zookeepers
KAFKA_ADVERTISED_LISTENERS: listeners advertised for external connections
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: maps listeners to security protocols. These values are matched key-value with KAFKA_ADVERTISED_LISTENERS.
KAFKA_INTER_BROKER_LISTENER_NAME: the listener name used between brokers inside Docker
KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: replication factor for the transaction state log
KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: minimum ISR (in-sync replicas) for the transaction state log</code></pre>
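KAFKA_LISTENER_SECURITY_PROTOCOL_MAP is literally a comma-separated list of NAME:PROTOCOL pairs. The hypothetical helper below (not part of Kafka or docker-compose) just makes that key-value pairing explicit:

```python
def parse_protocol_map(value: str) -> dict:
    """Parse a KAFKA_LISTENER_SECURITY_PROTOCOL_MAP-style string into a dict."""
    pairs = (item.split(":", 1) for item in value.split(","))
    return {name: proto for name, proto in pairs}

mapping = parse_protocol_map("PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT")
print(mapping)  # {'PLAINTEXT': 'PLAINTEXT', 'PLAINTEXT_HOST': 'PLAINTEXT'}
```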
<h2 id="2-docker-compose-실행">2) docker compose 실행</h2>
<pre><code>docker-compose -f docker-compose.yml up -d</code></pre><h2 id="3-토픽-생성">3) 토픽 생성</h2>
<pre><code>docker-compose exec kafka-1 kafka-topics --create --topic test-topic --bootstrap-server kafka-1:9092 --replication-factor 3 --partitions 2

=&gt; Created topic test-topic.
</code></pre><pre><code>--bootstrap-server &lt;service:port&gt; : 클라이언트가 접근하는 토픽 파티션의 메타데이터를 요청하기 위한 설정
--replication-factor : 토픽 복제 수
--partition: 토픽내에 파티션 수</code></pre><h2 id="4-토픽-확인">4) 토픽 확인</h2>
<pre><code>docker-compose exec kafka-1 kafka-topics --describe --topic test-topic --bootstrap-server kafka-1:9092 

=&gt;
Topic: test-topic    TopicId: zrU8TR3IQu2l24nQkYZ1jA    PartitionCount: 2    ReplicationFactor: 3    Configs:
    Topic: test-topic    Partition: 0    Leader: 3    Replicas: 3,1,2    Isr: 3,1,2
    Topic: test-topic    Partition: 1    Leader: 1    Replicas: 1,2,3    Isr: 1,2,3
</code></pre><pre><code>Leader : 파티션의 리더 브로커
Replicas : 데이터 복제
Isr : In sync replica (동기화된 복제본)</code></pre><h2 id="5-컨슈머-실행">5) 컨슈머 실행</h2>
<pre><code>docker-compose exec kafka-1 bash
[appuser@6e847b6b1748 ~]$ kafka-console-consumer --topic test-topic --bootstrap-server kafka-1:9092</code></pre><h2 id="6-producer-실행">6) producer 실행</h2>
<pre><code>$ docker-compose exec kafka-1 bash 
[appuser@6e847b6b1748 ~]$ kafka-console-producer --topic test-topic --broker-list kafka-1:9092</code></pre><p>producer
<img src="https://velog.velcdn.com/images/denver_almighty/post/9424545d-fb08-4623-95e4-5cdff20dba51/image.png" alt=""></p>
<p>consumer
<img src="https://velog.velcdn.com/images/denver_almighty/post/ec39d7b6-611d-4968-b627-aa57f86c5cc2/image.png" alt=""></p>
<h1 id="3-replication-수-변경">3. Replication 수 변경</h1>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/52e08879-5a84-4693-820c-e7ac8b4238f4/image.png" alt=""></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[[MongoDB] DB, Data CRUD 명령어 모음]]></title>
            <link>https://velog.io/@denver_almighty/MongoDB-DB-Data-CRUD-%EB%AA%85%EB%A0%B9%EC%96%B4-%EB%AA%A8%EC%9D%8C</link>
            <guid>https://velog.io/@denver_almighty/MongoDB-DB-Data-CRUD-%EB%AA%85%EB%A0%B9%EC%96%B4-%EB%AA%A8%EC%9D%8C</guid>
            <pubDate>Sat, 26 Nov 2022 08:18:42 GMT</pubDate>
            <description><![CDATA[<pre><code class="language-bash">mongosh
use admin
show dbs
&gt; 인증오류</code></pre>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/f860b258-5fc8-4901-bdbe-7a10c09b4673/image.png" alt=""></p>
<h1 id="db-생성">DB 생성</h1>
<pre><code>mongosh admin -u &quot;USERNAME&quot; -p &quot;PW&quot;
show dbs</code></pre><p><img src="https://velog.velcdn.com/images/denver_almighty/post/1a0c553e-0623-4cf8-8a93-c4ce4c4cc8e5/image.png" alt=""></p>
<pre><code>use test_db
show dbs
&gt; test_db가 안보인다
</code></pre><p><img src="https://velog.velcdn.com/images/denver_almighty/post/99075669-6e4e-445f-97c0-3b104a6dd8d7/image.png" alt="">
</br></p>
<h1 id="데이터-추가">데이터 추가</h1>
<pre><code>db.collection.insert()
db.collection.insertOne({})
db.collection.insertMany([{},{}.....])</code></pre><pre><code>db.collection.insert({&lt;DATA&gt;})
db 
&gt; DB 이름 출력
show dbs</code></pre><p><img src="https://velog.velcdn.com/images/denver_almighty/post/08c23351-78e0-43e9-a483-b36572665cfd/image.png" alt="">
<img src="https://velog.velcdn.com/images/denver_almighty/post/3003ce2f-f630-4516-854f-4d66817f1a01/image.png" alt="">
</br></p>
<h1 id="데이터-입력update">데이터 입력(Update)</h1>
<pre><code class="language-mongo">db.user.insert({&lt;DATA&gt;})</code></pre>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/2872b7fd-31e9-4813-adc3-1e6b82fc592b/image.png" alt="">
<img src="https://velog.velcdn.com/images/denver_almighty/post/48a415d8-f30e-494f-922b-8d53f151e3a7/image.png" alt=""></p>
<br>

<h1 id="데이터-읽기">데이터 읽기</h1>
<pre><code>db.collection.find()</code></pre></br>

<h1 id="데이터-삭제">데이터 삭제</h1>
<pre><code>db.collection.deleteOne()
db.collection.deleteMany()</code></pre>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/a46078a0-591a-4974-b356-18524c2faa00/image.png" alt=""></p>

<p><img src="https://velog.velcdn.com/images/denver_almighty/post/307adab6-8b19-4b16-bfd3-acff5397dd3a/image.png" alt=""></p>
</br>

<h1 id="db-삭제delete">DB 삭제(Delete)</h1>
<pre><code>db.dropDatabase()</code></pre><p><img src="https://velog.velcdn.com/images/denver_almighty/post/4b72fee9-dad3-4fa5-96fe-475b6d40240c/image.png" alt=""></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[[MongoDB] root(admin) 계정 생성하기]]></title>
            <link>https://velog.io/@denver_almighty/MongoDB-rootadmin-%EA%B3%84%EC%A0%95-%EC%83%9D%EC%84%B1%ED%95%98%EA%B8%B0</link>
            <guid>https://velog.io/@denver_almighty/MongoDB-rootadmin-%EA%B3%84%EC%A0%95-%EC%83%9D%EC%84%B1%ED%95%98%EA%B8%B0</guid>
            <pubDate>Sat, 26 Nov 2022 07:38:03 GMT</pubDate>
            <description><![CDATA[<h1 id="0-실행-환경">0. 실행 환경</h1>
<blockquote>
<p>AWS t2.xlarge
OS : Redhat 8.6
MongoDB Version : 6.0.3</p>
</blockquote>
</br> 

<h1 id="1-계정-생성">1. 계정 생성</h1>
<pre><code class="language-bash"># mongodb 실행
mongosh

# root권한가진 계정생성
db.createUser({user:&quot;USERNAME&quot;, pwd:&quot;PW&quot;, roles:[&quot;root&quot;]})

#로그인
mongosh admin -u USERNAME -p PW</code></pre>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/a4773bbc-d5d3-4b52-9cc1-ea5755b41a04/image.png" alt="">
<img src="https://velog.velcdn.com/images/denver_almighty/post/5c49cd28-01a0-44f6-af1e-f6000111a53a/image.png" alt=""></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[[MongoDB] Redhat8에 MongoDB 설치하기]]></title>
            <link>https://velog.io/@denver_almighty/MongoDB-Redhat8%EC%97%90-MongoDB-%EC%84%A4%EC%B9%98%ED%95%98%EA%B8%B0</link>
            <guid>https://velog.io/@denver_almighty/MongoDB-Redhat8%EC%97%90-MongoDB-%EC%84%A4%EC%B9%98%ED%95%98%EA%B8%B0</guid>
            <pubDate>Sat, 26 Nov 2022 07:29:30 GMT</pubDate>
            <description><![CDATA[<h1 id="0-실행-환경">0. 실행 환경</h1>
<blockquote>
<p>AWS t2.xlarge
OS : Redhat 8.6
MongoDB Version : 6.0.3</p>
</blockquote>
</br>

<h1 id="1-설치하기">1. 설치하기</h1>
<h2 id="1-패키지-관리-시스템-yum-설정">1) 패키지 관리 시스템 (yum) 설정</h2>
<pre><code class="language-bash">vi /etc/yum.repos.d/mongodb-org-6.0.repo

# 아래 내용 입력
[mongodb-org-6.0]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/6.0/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-6.0.asc
</code></pre>
<h2 id="2-mongodb-패키지-설치">2) MongoDB 패키지 설치</h2>
<pre><code>sudo yum install -y mongodb-org
sudo yum install -y mongodb-org-6.0.3 mongodb-org-database-6.0.3 mongodb-org-server-6.0.3 mongodb-mongosh-6.0.3 mongodb-org-mongos-6.0.3 mongodb-org-tools-6.0.3

# 의도치 않은 업그레이드 방지(yum 업그레이드 시 패키지 업그레이드 방지)
vi /etc/yum.conf
# 아래 내용 추가
exclude=mongodb-org,mongodb-org-database,mongodb-org-server,mongodb-mongosh,mongodb-org-mongos,mongodb-org-tools</code></pre><h2 id="3-설정">3) 설정</h2>
<h3 id="3-1-ulimit-설정">3-1) ulimit 설정</h3>
<pre><code>Starting with MongoDB 4.4, a startup error is generated if the ulimit open-files value is below 64000.
On Redhat 8 the ulimit command is sufficient to configure the maximum process value, so no separate nproc setting is needed.</code></pre><h3 id="3-2-디렉토리-설정">3-2) 디렉토리 설정</h3>
<p>기본 디렉토리
데이터 : /var/lib/mongo
로그 : /var/log/mongodb</p>
<pre><code># 새 디렉토리 생성 (/my/mongodb/dir/)
mkdir /my/mongodb/dir/
vi /etc/mongod.conf
storage.dbPath=/my/mongodb/dir
systemLog.path=/my/mongodb/dir/mongod.log
sudo chown -R mongod:mongod /my/mongodb/dir/</code></pre><h3 id="3-3-selinux-구성">3-3) SELinux 구성</h3>
<pre><code>sudo yum install git make checkpolicy policycoreutils selinux-policy-devel
git clone https://github.com/mongodb/mongodb-selinux
cd mongodb-selinux
make
sudo make install</code></pre><h3 id="3-4-mongodconf-수정">3-4) mongod.conf 수정</h3>
<pre><code>vi /etc/mongod.conf</code></pre><p>아랫부분 net에 bindIp, security에 authorization 수정</p>
<ul>
<li>bindIp: IPs allowed to connect</li>
<li>authorization : require account authentication on connection</li>
</ul>
<pre><code># mongod.conf
# for documentation of all options, see:
#   http://docs.mongodb.org/manual/reference/configuration-options/

# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log

# Where and how to store data.
storage:
  dbPath: /var/lib/mongo
  journal:
    enabled: true
#  engine:
#  wiredTiger:

# how the process runs
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongod.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo

# network interfaces
net:
  port: 27017
  bindIp : 0.0.0.0

security:
  authorization : enabled

#operationProfiling:

#replication:

#sharding:

## Enterprise-Only Options

#auditLog:

#snmp:</code></pre>
<h2 id="4-실행">4) 실행</h2>
<pre><code>sudo systemctl daemon-reload
sudo systemctl start mongod

mongosh</code></pre>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/5d8cb263-3f4d-4346-96b1-f009a25b0936/image.png" alt=""></p>
</br>

<h3 id="실행-안될-때">실행 안될 때</h3>
<p><a href="https://velog.io/@denver_almighty/MongoDB-%EC%84%A4%EC%B9%98-%ED%9B%84-%EC%8B%A4%ED%96%89-%EC%95%88%EB%90%A8status14-status100">실행 안될 때(status=14 / status=100)</a></p>
<br>

<h1 id="참고">참고</h1>
<p><a href="https://www.mongodb.com/docs/manual/tutorial/install-mongodb-on-red-hat/">Install MongoDB Community Edition on Red Hat or CentOS</a></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[[MongoDB] 설치 후 실행 안됨(status=14 , status=100)]]></title>
            <link>https://velog.io/@denver_almighty/MongoDB-%EC%84%A4%EC%B9%98-%ED%9B%84-%EC%8B%A4%ED%96%89-%EC%95%88%EB%90%A8status14-status100</link>
            <guid>https://velog.io/@denver_almighty/MongoDB-%EC%84%A4%EC%B9%98-%ED%9B%84-%EC%8B%A4%ED%96%89-%EC%95%88%EB%90%A8status14-status100</guid>
            <pubDate>Sat, 26 Nov 2022 07:24:12 GMT</pubDate>
            <description><![CDATA[<h1 id="0-실행-환경">0. 실행 환경</h1>
<blockquote>
<p>AWS t2.xlarge
OS : Redhat 8.6
MongoDB Version : 6.0.3</p>
</blockquote>
<h1 id="1-error-확인">1. Error 확인</h1>
<pre><code># check the service status
systemctl status mongod
# check the log
tail -50 /var/log/mongodb/mongod.log | grep error</code></pre><h2 id="1-status14">1) status=14</h2>
<blockquote>
<p>service status
(code=exited, status=14)</p>
</blockquote>
<blockquote>
<p>mongod.log
{&quot;t&quot;:{&quot;$date&quot;:&quot;2022-11-26T06:41:23.568+00:00&quot;},&quot;s&quot;:&quot;E&quot;,  &quot;c&quot;:&quot;NETWORK&quot;,  &quot;id&quot;:23024,   &quot;ctx&quot;:&quot;initandlisten&quot;,&quot;msg&quot;:&quot;Failed to unlink socket file&quot;,&quot;attr&quot;:{&quot;path&quot;:&quot;/tmp/mongodb-27017.sock&quot;,&quot;error&quot;:&quot;Operation not permitted&quot;}}</p>
</blockquote>
<h2 id="해결-방법">해결 방법</h2>
<pre><code class="language-bash">rm /tmp/mongodb-27017.sock
systemctl start mongod</code></pre>
<h2 id="2-status100">2) status=100</h2>
<blockquote>
<p>service status
ExecStart=/usr/bin/mongod $OPTIONS (code=exited, status=100)</p>
</blockquote>
<blockquote>
<p>mongod.log
{&quot;t&quot;:{&quot;$date&quot;:&quot;2022-11-26T06:51:14.023+00:00&quot;},&quot;s&quot;:&quot;E&quot;,  &quot;c&quot;:&quot;CONTROL&quot;,  &quot;id&quot;:20557,   &quot;ctx&quot;:&quot;initandlisten&quot;,&quot;msg&quot;:&quot;DBException in initAndListen, terminating&quot;,&quot;attr&quot;:{&quot;error&quot;:&quot;IllegalOperation: Attempted to create a lock file on a read-only directory: /var/lib/mongo&quot;}}</p>
</blockquote>
<h2 id="해결-방법-1">해결 방법</h2>
<pre><code class="language-bash">chown -R mongod:mongod /var/lib/mongo</code></pre>
]]></description>
        </item>
        <item>
            <title><![CDATA[[Kafka] Podman compose으로 Kafka 실행]]></title>
            <link>https://velog.io/@denver_almighty/Kafka-Podman-compose%EC%9C%BC%EB%A1%9C-Kafka-%EC%8B%A4%ED%96%89</link>
            <guid>https://velog.io/@denver_almighty/Kafka-Podman-compose%EC%9C%BC%EB%A1%9C-Kafka-%EC%8B%A4%ED%96%89</guid>
            <pubDate>Sun, 20 Nov 2022 12:22:13 GMT</pubDate>
            <description><![CDATA[<h1 id="0-실행-환경">0. 실행 환경</h1>
<blockquote>
<p>AWS t2.xlarge
OS : Redhat 8.6
Kafka : 3.3
Zookeeper : 3.8</p>
</blockquote>
</br>

<h1 id="1-실행하기">1. 실행하기</h1>
<pre><code class="language-bash"># kafka docker-compose.yml 다운로드
curl -sSL https://raw.githubusercontent.com/bitnami/containers/main/bitnami/kafka/docker-compose.yml &gt; docker-compose.yml
# rename for podman-compose
mv docker-compose.yml podman-compose.yml

# podman compose 실행
podman-compose up</code></pre>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/7bfcec2a-b859-4ed9-a935-9a1495ed2cee/image.png" alt=""></p>
</br>

<h1 id="참고">참고</h1>
<p><a href="https://hub.docker.com/r/bitnami/kafka">Docker Hub - bitnami/kafka</a></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[[Podman] RHEL8에 Podman 설치하기 ( + Podman compose)]]></title>
            <link>https://velog.io/@denver_almighty/Podman-RHEL8%EC%97%90-Podman-%EC%84%A4%EC%B9%98%ED%95%98%EA%B8%B0</link>
            <guid>https://velog.io/@denver_almighty/Podman-RHEL8%EC%97%90-Podman-%EC%84%A4%EC%B9%98%ED%95%98%EA%B8%B0</guid>
            <pubDate>Sun, 20 Nov 2022 12:02:58 GMT</pubDate>
            <description><![CDATA[<h1 id="0-실행-환경">0. 실행 환경</h1>
<blockquote>
<p>AWS t2.xlarge
OS : Redhat 8.6
Podman : 4.2.0</p>
</blockquote>
</br>

<h1 id="1-redhat8에서-docker">1. Redhat8에서 Docker</h1>
<p>Installing Docker on RedHat 8 produces the following error.</p>
<blockquote>
<p>Errors during downloading metadata for repository &#39;docker-ce-stable&#39;</p>
</blockquote>
<p>The RHEL installation page in the Docker Docs says
&quot;Docker is currently only supported on Red Hat on s390x (IBM Z).&quot;
The <a href="https://access.redhat.com/discussions/6249651">Redhat Customer Portal</a> suggests
either installing the CentOS packages, or removing Podman and Buildah (which can conflict) before installing; the CentOS packages do install successfully.
Since I have already used Docker, this time I will try Kafka and Airflow with Podman.</p>
</br>

<h1 id="2-podman-설치하기">2. Podman 설치하기</h1>
<pre><code class="language-bash"># Podman 설치 (RHEL8)
sudo yum module enable -y container-tools:rhel8
sudo yum module install -y container-tools:rhel8</code></pre>
<pre><code class="language-bash"># podman-compose 설치
# Python3 설치되어있어야함(pip3)
pip3 install podman-compose --user

# 설치 확인
podman-compose --version
</code></pre>
</br>

<h1 id="참고">참고</h1>
<p><a href="https://podman.io/getting-started/installation">Podman Installation Instructions</a></p>
<p><a href="https://github.com/containers/podman-compose">Github - Podman Compose</a></p>
<p><a href="https://docs.podman.io/en/latest/markdown/podman-rm.1.html">Podman 명령어</a></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[[OS] AWS EC2 root 비밀번호 생성]]></title>
            <link>https://velog.io/@denver_almighty/OS-AWS-EC2-root-%EB%B9%84%EB%B0%80%EB%B2%88%ED%98%B8-%EC%83%9D%EC%84%B1</link>
            <guid>https://velog.io/@denver_almighty/OS-AWS-EC2-root-%EB%B9%84%EB%B0%80%EB%B2%88%ED%98%B8-%EC%83%9D%EC%84%B1</guid>
            <pubDate>Sun, 20 Nov 2022 11:02:14 GMT</pubDate>
            <description><![CDATA[<h1 id="0-실행-환경">0. 실행 환경</h1>
<blockquote>
<p>AWS t2.xlarge
OS : Redhat 8.6</p>
</blockquote>
</br>

<h1 id="1-비밀번호-생성">1. 비밀번호 생성</h1>
<p>While running mysqld,
systemctl start/stop mysql asks for the root password.
<img src="https://velog.velcdn.com/images/denver_almighty/post/a24bad4a-00cf-45bf-a21f-c41a49e17db1/image.png" alt=""></p>
<p>I had only ever connected to the EC2 instance over SSH with a key, so I did not know the root password; as with any other account, it can be created with the command below.</p>
<pre><code class="language-bash">sudo passwd root </code></pre>
<p><img src="https://velog.velcdn.com/images/denver_almighty/post/256ad16f-9eab-4395-944f-8150bbe9448b/image.png" alt=""></p>
]]></description>
        </item>
        <item>
            <title><![CDATA[[MySQL] Redhat8에 MySQL 설치하기]]></title>
            <link>https://velog.io/@denver_almighty/MySQL-%EC%84%A4%EC%B9%98%ED%95%98%EA%B8%B0</link>
            <guid>https://velog.io/@denver_almighty/MySQL-%EC%84%A4%EC%B9%98%ED%95%98%EA%B8%B0</guid>
            <pubDate>Sun, 20 Nov 2022 10:29:48 GMT</pubDate>
            <description><![CDATA[<h1 id="0-실행-환경">0. 실행 환경</h1>
<blockquote>
<p>AWS t2.xlarge
OS : Redhat 8.6
MySQL Version : 8.0.31</p>
</blockquote>
</br>

<h1 id="1-설치하기">1. 설치하기</h1>
<h2 id="1-mysql-다운로드">1) MySQL 다운로드</h2>
<pre><code class="language-bash"># 다운로드 
wget https://repo.mysql.com//mysql80-community-release-el8-4.noarch.rpm

# add the MySQL repository to YUM
sudo yum install mysql80-community-release-el8-4.noarch.rpm

# MySQL 설치
sudo yum install mysql-community-server</code></pre>
<h2 id="mysql-실행">MySQL 실행</h2>
<pre><code># start MySQL (if you run the command as non-root without sudo, you must enter the OS root password)
sudo systemctl start mysqld

# find the temporary password
sudo grep &#39;temporary password&#39; /var/log/mysqld.log

# connect to MySQL
mysql -uroot -p
-&gt; enter the temporary password</code></pre><h2 id="root-비밀번호-변경">root 비밀번호 변경</h2>
<pre><code class="language-sql">-- the password must contain at least one uppercase letter, one lowercase letter, one digit, and one special character, and be at least 8 characters long in total
mysql&gt; ALTER USER &#39;root&#39;@&#39;localhost&#39; IDENTIFIED BY &#39;MyNewPass4!&#39;;
</code></pre>
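Assuming the default validate_password MEDIUM policy described in the comment above, a candidate password can be pre-checked with a small Python sketch; this approximates, and does not replace, MySQL's actual validator:

```python
import re

def satisfies_policy(pw: str) -> bool:
    """Approximate MySQL's MEDIUM password policy: length >= 8 with
    at least one lowercase, one uppercase, one digit, one special char."""
    return (
        len(pw) >= 8
        and re.search(r"[a-z]", pw) is not None
        and re.search(r"[A-Z]", pw) is not None
        and re.search(r"[0-9]", pw) is not None
        and re.search(r"[^A-Za-z0-9]", pw) is not None
    )

print(satisfies_policy("MyNewPass4!"))  # True
print(satisfies_policy("password"))     # False
```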
</br>

<h1 id="참고">참고</h1>
<p>MySQL Download
<a href="https://dev.mysql.com/downloads/repo/yum/">https://dev.mysql.com/downloads/repo/yum/</a></p>
<p>MySQL Docs - Installing MySQL on Linux Using the MySQL Yum Repository
<a href="https://dev.mysql.com/doc/refman/8.0/en/linux-installation-yum-repo.html">https://dev.mysql.com/doc/refman/8.0/en/linux-installation-yum-repo.html</a></p>
]]></description>
        </item>
    </channel>
</rss>