Sunday, July 24, 2016

Python Pandas: Replacement method for convert_objects()

DataFrame.convert_objects() in pandas is a very useful method that tries to infer better data types for your imported data.

For example, if you have just imported hockey player stats and every column was read in as object, the dtypes look like:
df.dtypes
Out[1]: 
PLAYER    object
TEAM      object
GP        object
G         object
A         object
PTS       object
+/-       object
dtype: object

Using convert_objects:
df.convert_objects(convert_numeric=True).dtypes
 __main__:1: FutureWarning: convert_objects is deprecated.  Use the data-type specific converters pd.to_datetime, pd.to_timedelta and pd.to_numeric.
Out[2]: 
PLAYER     object
TEAM       object
GP          int64
G           int64
A           int64
PTS         int64
+/-         int64
dtype: object


The warning says that convert_objects() is deprecated, but it isn't clear on a suitable replacement: convert_objects() tried to infer the types of all columns in the data frame, whereas pandas.to_numeric() converts a single column at a time. The solution is to combine it with DataFrame.apply(), which calls it once per column; with errors='ignore', a column that cannot be parsed as numbers is returned unchanged:

df.apply(pd.to_numeric, errors='ignore').dtypes
Out[3]: 
PLAYER     object
TEAM       object
GP          int64
G           int64
A           int64
PTS         int64
+/-         int64
dtype: object
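
A note for newer pandas releases: as far as I can tell, errors='ignore' has itself been deprecated in pandas.to_numeric() (from pandas 2.2 onward), with the advice to catch the exception explicitly instead. Below is a minimal sketch of the same per-column idea under that assumption. The helper name to_numeric_where_possible and the sample rows are made up for illustration; they are not part of pandas or of the data above.

```python
import pandas as pd


def to_numeric_where_possible(column):
    """Hypothetical helper: convert a column to a numeric dtype,
    or return it unchanged if any value cannot be parsed."""
    try:
        return pd.to_numeric(column)
    except (ValueError, TypeError):
        # Non-numeric text such as player names ends up here.
        return column


# Made-up sample rows; every column starts out as strings.
df = pd.DataFrame({
    "PLAYER": ["Player A", "Player B"],
    "TEAM": ["AAA", "BBB"],
    "GP": ["82", "80"],
    "G": ["40", "35"],
    "+/-": ["12", "-3"],
})

# apply() passes each column to the helper in turn and returns a new
# DataFrame, so assign the result if you want to keep it.
converted = df.apply(to_numeric_where_possible)
print(converted.dtypes)
```

The numeric-looking columns (GP, G, +/-) should come back with an integer dtype, while PLAYER and TEAM keep their text values.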

1 comment:

  1. Thanks for your help. I found I needed:

    df = df.apply(pd.to_numeric, errors='ignore')