Getting "phrase input should be string, not <class 'float'>" error when tokenizing
I get the following error:
phrase input should be string, not <class 'float'>
I'm not sure what additional code I need to write to get past it.
I ran the code from the notebook below, feeding it data that I crawled myself: https://github.com/threegenie/sentiment_project/blob/main/kakaopage_playstore_review_analysis.ipynb
The head of the data looks like this.
1 Answer
>>> import pandas as pd
>>> from konlpy.tag import Okt
>>>
>>> df = pd.DataFrame({"Review":["맛있었어요", "너무 짜요.", "", "."]})
>>> okt = Okt()
>>> df["tok"] = df["Review"].apply(okt.morphs)
>>> df
   Review          tok
0   맛있었어요      [맛있었어요]
1  너무 짜요.  [너무, 짜요, .]
2                   []
3       .          [.]
>>> import numpy as np
>>> df["Review1"] = df["Review"].replace("", np.nan)
>>> df["tok1"] = df["Review1"].apply(okt.morphs)
Traceback (most recent call last):
  File "<pyshell#12>", line 1, in <module>
    df["tok1"] = df["Review1"].apply(okt.morphs)
  File "C:\PROGRAMS\Python3864\lib\site-packages\pandas\core\series.py", line 4357, in apply
    return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
  File "C:\PROGRAMS\Python3864\lib\site-packages\pandas\core\apply.py", line 1043, in apply
    return self.apply_standard()
  File "C:\PROGRAMS\Python3864\lib\site-packages\pandas\core\apply.py", line 1098, in apply_standard
    mapped = lib.map_infer(
  File "pandas\_libs\lib.pyx", line 2859, in pandas._libs.lib.map_infer
  File "C:\PROGRAMS\Python3864\lib\site-packages\konlpy\tag\_okt.py", line 89, in morphs
    return [s for s, t in self.pos(phrase, norm=norm, stem=stem)]
  File "C:\PROGRAMS\Python3864\lib\site-packages\konlpy\tag\_okt.py", line 69, in pos
    validate_phrase_inputs(phrase)
  File "C:\PROGRAMS\Python3864\lib\site-packages\konlpy\tag\_common.py", line 20, in validate_phrase_inputs
    assert isinstance(phrase, basestring), msg
AssertionError: phrase input should be string, not <class 'float'>
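So the replace step is what you should not do. The error is raised when okt.morphs is called on np.nan: np.nan is a float, so KoNLPy's input check rejects it with "not <class 'float'>". An empty string, by contrast, tokenizes without error and simply yields an empty list.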
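If the NaN values come from the crawled data itself (for example, empty reviews that pandas reads as missing), one way to avoid the error is to keep non-string values away from the tokenizer. The following is only a sketch under assumptions: `safe_tokenize` and `whitespace_tokenizer` are hypothetical helpers introduced here for illustration, and the whitespace split stands in for `okt.morphs` so the snippet runs without KoNLPy installed. The column name `Review` is taken from the example above and may differ in your data.

```python
import numpy as np
import pandas as pd


def safe_tokenize(value, tokenizer):
    """Call tokenizer(value) for strings; return [] for anything else (e.g. NaN)."""
    if not isinstance(value, str):
        return []
    return tokenizer(value)


def whitespace_tokenizer(text):
    # Hypothetical stand-in so the sketch runs without KoNLPy/Java.
    # With KoNLPy available, pass okt.morphs (okt = Okt()) instead.
    return text.split()


df = pd.DataFrame({"Review": ["맛있었어요", "너무 짜요.", np.nan, "."]})

# Diagnostic: list the rows whose value is not a string.
non_string_rows = df[~df["Review"].apply(lambda v: isinstance(v, str))]

# Option 1: guard each call so NaN never reaches the tokenizer.
df["tok"] = df["Review"].apply(lambda v: safe_tokenize(v, whitespace_tokenizer))

# Option 2: turn missing values into empty strings before tokenizing.
df["tok2"] = df["Review"].fillna("").apply(whitespace_tokenizer)
```

Option 2 is shorter, but it only handles missing values. If the column can also contain other non-string values such as numbers, the explicit guard in Option 1 is the safer choice. Dropping the affected rows with `df.dropna(subset=["Review"])` is another possibility when empty reviews are of no use for the analysis.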