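[FastCampus Course Review] Data Analysis Online Course 100% Refund Challenge, Mission 48

14. ch 03. Text Mining - 03. Text Data Analysis Practice - 15. ch 04. Sentiment Classification - 01. Sentiment Analysis Using Text Mining

14. ch 03. Text Mining - 03. Text Data Analysis Practice

4) Text Mining

4-1) Word Frequency Analysis

Word cloud visualization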

!pip install pytagcloud pygame simplejson

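from collections import Counter
import operator  # needed for operator.itemgetter below

import random
import pytagcloud
import webbrowser

# Top 25 (word, count) pairs by frequency
ranked_tags = Counter(word_count_dict).most_common(25)
# Build tags from the 40 most frequent words; maxsize sets the largest font size
taglist = pytagcloud.make_tags(sorted(word_count_dict.items(), key=operator.itemgetter(1), reverse=True)[:40], maxsize=60)
pytagcloud.create_tag_image(taglist, 'wordcloud_example.jpg',
                            rectangular=False)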

from IPython.display import Image
Image(filename='wordcloud_example.jpg')

Printing the most frequent words

Counter(word_count_dict).most_common(25)
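
The cell above assumes that word_count_dict already exists. As a minimal sketch of how such a dictionary and its ranking fit together, the snippet below counts a hypothetical token list (sample_words is made up for illustration and is not the course data) and takes the two most frequent entries.

```python
from collections import Counter

# Hypothetical tokens; in the notes, word_count_dict is built from the course data.
sample_words = ["scene", "love", "scene", "war", "love", "scene"]

# Count each word, then keep a plain dict like the one used in the notes.
word_count_dict = dict(Counter(sample_words))

# most_common(n) returns (word, count) pairs sorted by count, highest first.
top_words = Counter(word_count_dict).most_common(2)
print(top_words)
```

most_common returns a list of tuples, which is also the shape that the sorted(...)[:40] expression in the word cloud cell produces.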


4-2) Visualizing Important Words per Scene

TF-IDF transformation

from sklearn.feature_extraction.text import TfidfTransformer

tfidf_vectorizer = TfidfTransformer()
tf_idf_vect = tfidf_vectorizer.fit_transform(bow_vect)

print(tf_idf_vect.shape)
print(tf_idf_vect[0])

print(tf_idf_vect[0].toarray().shape)
print(tf_idf_vect[0].toarray())
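
To see what the transformation does to a bag-of-words matrix, here is a pure-Python sketch of the weighting that scikit-learn documents for TfidfTransformer with its default settings (smoothed idf followed by L2 normalization of each row). The documents and the helper name tfidf_row are hypothetical and only serve the illustration.

```python
import math

# Hypothetical toy documents, already tokenized.
docs = [["war", "love", "war"], ["love", "peace"], ["peace", "peace", "love"]]
vocab = sorted({w for d in docs for w in d})
n_docs = len(docs)

# Document frequency: in how many documents each word appears.
df = {w: sum(1 for d in docs if w in d) for w in vocab}

# Smoothed idf: ln((1 + n) / (1 + df)) + 1
idf = {w: math.log((1 + n_docs) / (1 + df[w])) + 1 for w in vocab}

def tfidf_row(doc):
    """Return the L2-normalized tf-idf weights of one document, in vocab order."""
    raw = [doc.count(w) * idf[w] for w in vocab]
    norm = math.sqrt(sum(v * v for v in raw))
    return [v / norm for v in raw]

tfidf = [tfidf_row(d) for d in docs]
for row in tfidf:
    print([round(v, 3) for v in row])
```

A word that appears in every document ("love" here) gets the minimum idf of 1, while a word confined to one document ("war") is weighted up, which is why the largest values in a row point at the words most specific to that scene.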

Mapping vector indices to words

invert_index_vectorizer = {v: k for k, v in vect.vocabulary_.items()}
print(str(invert_index_vectorizer)[:100]+'..')

Extracting important words - Top 3 by TF-IDF

np.argsort(tf_idf_vect[0].toarray())[0][-3:]

np.argsort(tf_idf_vect.toarray())[:, -3:]

top_3_word = np.argsort(tf_idf_vect.toarray())[:, -3:]
df['important_word_indexes'] = pd.Series(top_3_word.tolist())
df.head()

def convert_to_word(x):
    word_list = []
    for word in x:
        word_list.append(invert_index_vectorizer[word])
    return word_list

df['important_words'] = df['important_word_indexes'].apply(lambda x: convert_to_word(x))
df.head()
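
One detail worth noting about the argsort slice: np.argsort sorts in ascending order, so the last columns hold the highest-scoring terms and the best term comes last. The sketch below uses a hypothetical 2 x 4 score matrix and a made-up index_to_word mapping (standing in for invert_index_vectorizer) to show the ordering, and reverses it so the best term comes first.

```python
import numpy as np

# Hypothetical TF-IDF scores: 2 documents x 4 vocabulary terms.
scores = np.array([[0.1, 0.7, 0.0, 0.5],
                   [0.6, 0.0, 0.3, 0.2]])
# Hypothetical index-to-word mapping, playing the role of invert_index_vectorizer.
index_to_word = {0: "love", 1: "war", 2: "peace", 3: "king"}

# argsort is ascending, so the last 2 columns are the 2 highest-scoring terms.
top_2 = np.argsort(scores)[:, -2:]
# Reverse each row so the highest-scoring term comes first.
top_2_desc = top_2[:, ::-1]

words = [[index_to_word[i] for i in row] for row in top_2_desc.tolist()]
print(words)
```

Whether to reverse is a presentation choice; the notes keep the ascending order, which means the last entry of important_words is the highest-weighted word.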

15. ch 04. Sentiment Classification - 01. Sentiment Analysis Using Text Mining

FastCampus data analysis course link
bit.ly/3imy2uN

