Python Pandas: Replacement method for convert_objects()

DataFrame.convert_objects() in Pandas is a very useful function that tries to infer better data types for your imported data.

For example, if you have just imported hockey player stats, every column may come in as a generic object:
df.dtypes
Out[1]: 
PLAYER    object
TEAM      object
GP        object
G         object
A         object
PTS       object
+/-       object
dtype: object

Using convert_objects:
df.convert_objects(convert_numeric=True).dtypes
 __main__:1: FutureWarning: convert_objects is deprecated.  Use the data-type specific converters pd.to_datetime, pd.to_timedelta and pd.to_numeric.
Out[2]: 
PLAYER     object
TEAM       object
GP          int64
G           int64
A           int64
PTS         int64
+/-         int64
dtype: object


The warning says that convert_objects() is deprecated, but it isn't clear about a suitable replacement: convert_objects() tried to infer types for all columns in the data frame, whereas pandas.to_numeric() works on one column (a Series) at a time. The solution is to combine it with DataFrame.apply(), which calls it on each column in turn. With errors='ignore', a column that cannot be parsed as numbers is returned unchanged:

df.apply(pd.to_numeric, errors='ignore').dtypes
Out[3]: 
PLAYER     object
TEAM       object
GP          int64
G           int64
A           int64
PTS         int64
+/-         int64
dtype: object
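
One caveat: in more recent pandas releases (2.2 and later, as far as I know) errors='ignore' is itself deprecated, and the suggested approach is to catch the exception explicitly. Below is a minimal sketch of that idea. The helper name to_numeric_where_possible and the sample rows are made up for illustration; they are not part of pandas or of the original data.

```python
import pandas as pd


def to_numeric_where_possible(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with every fully numeric column converted.

    Hypothetical helper: mimics apply(pd.to_numeric, errors='ignore')
    by catching the parse error instead of suppressing it.
    """
    out = df.copy()
    for col in out.columns:
        try:
            out[col] = pd.to_numeric(out[col])
        except (ValueError, TypeError):
            # Column holds non-numeric values (e.g. names); leave it unchanged.
            pass
    return out


# Made-up sample rows, for illustration only.
sample = pd.DataFrame(
    {
        "PLAYER": ["Player A", "Player B"],
        "TEAM": ["AAA", "BBB"],
        "GP": ["82", "80"],
        "+/-": ["12", "-3"],
    }
)

converted = to_numeric_where_possible(sample)
print(converted.dtypes)
```

The helper returns a new DataFrame rather than modifying the input, so the result has to be assigned, just as with df.apply().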

Comments

  1. Thanks for your help. I found I needed:

    df = df.apply(pd.to_numeric, errors='ignore')
