My problem is that I have one DataFrame, and when I modify a second DataFrame, the first one changes too. Why could this be?
>untokenized_tweet_tp
text ... screenName
0 [month, open, #postdoc, position, chemical, ch... ... VRiffault
1 [hardworking, biofuel, producers, iowa, state,... ... LindaWa53201017
3 [today, time, imperative, resort, alternate, s... ... ROBRAIPUR
4 [special, gaetanos, beach, club, bell, choosin... ... buffbiodiesel
7 [stena, bulk, introduce, low, carbon, shipping... ... NPortuarias
... ... ...
130060 [reseter, elite, vegan, make, unacceptable, ea... ... Randy_Anglo
130171 [solar, wind, destroy, supply, limited, output... ... RealRichardBail
130331 [renewable, energy, defined, wood, wood, waste... ... PaulSchmehl
130375 [guess, aiding, wood, passion] ... GraceIrene21
130384 [homogenous, white, state, diversity, propagan... ... Randy_Anglo
[52411 rows x 3 columns]
for i in tweet_tp.index.values:
... tweet_tp.text[i] = TreebankWordDetokenizer().detokenize(tweet_tp.text[i])
...
>untokenized_tweet_tp
...
text ... screenName
0 month open #postdoc position chemical characte... ... VRiffault
1 hardworking biofuel producers iowa state worki... ... LindaWa53201017
3 today time imperative resort alternate sources... ... ROBRAIPUR
4 special gaetanos beach club bell choosing #rec... ... buffbiodiesel
7 stena bulk introduce low carbon shipping options ... NPortuarias
... ... ...
130060 reseter elite vegan make unacceptable eat meat... ... Randy_Anglo
130171 solar wind destroy supply limited output backe... ... RealRichardBail
130331 renewable energy defined wood wood waste munic... ... PaulSchmehl
130375 guess aiding wood passion ... GraceIrene21
130384 homogenous white state diversity propaganda wi... ... Randy_Anglo
[52411 rows x 3 columns]
Notice that I never mentioned untokenized_tweet_tp inside the for loop.
>type(tweet_tp)
<class 'pandas.core.frame.DataFrame'>
>type(untokenized_tweet_tp)
<class 'pandas.core.frame.DataFrame'>
untokenized_tweet_tp was first created like this: untokenized_tweet_tp = tweet_tp
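Here is a minimal sketch that seems to reproduce the behaviour with a tiny made-up DataFrame. The two sample rows, the names `alias` and `snapshot`, and the use of `" ".join` as a stand-in for `TreebankWordDetokenizer().detokenize` are assumptions for illustration only, not my real data or code. It suggests that plain assignment only gives the same DataFrame a second name, whereas `.copy()` produces an independent object:

```python
import pandas as pd

# Tiny made-up stand-in for tweet_tp (hypothetical rows, not the real data)
tweet_tp = pd.DataFrame({
    "text": [["month", "open"], ["guess", "aiding"]],
    "screenName": ["VRiffault", "GraceIrene21"],
})

alias = tweet_tp            # plain assignment: a second name for the same object
snapshot = tweet_tp.copy()  # .copy(): a separate DataFrame

# Join the tokens of each row; " ".join stands in for the detokenizer here
tweet_tp["text"] = tweet_tp["text"].apply(" ".join)

print(alias is tweet_tp)           # True
print(alias["text"].tolist())      # ['month open', 'guess aiding']
print(snapshot["text"].tolist())   # [['month', 'open'], ['guess', 'aiding']]
```

If that is what is going on, creating the second frame with `untokenized_tweet_tp = tweet_tp.copy()` should keep the token lists intact while `tweet_tp` is modified.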