Pandas groupby and pick, and data types

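When you groupby in Pandas and then pick a row from each group, returning df.iloc[0] directly gives you a Series. Pandas then reassembles those Series into a new DataFrame. The problem is that Pandas guesses each column's data type from the first Series: a value like 123 is guessed to be a number, and if the same column later contains a value like abc, that value gets discarded.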

The fix is to have the pick function return a one-row DataFrame instead of the row itself (a Series).

def pick(x):
    # Use .iloc[0:1] so that we return a one-row DataFrame and keep the original dtypes.
    # With .iloc[0] we would return a Series instead, and pandas would build a new DataFrame
    # from a guessed dtype and discard later values that don't match its initial guess.
    x = x.sort_values(['...'], ascending=False).iloc[0:1]
    return x

df = df.groupby('candidate_id').apply(pick)
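To make the pattern concrete, here is a small self-contained sketch. The sample data and the column names `score` and `code` are made up for illustration; only `candidate_id` comes from the snippet above. It does not try to reproduce the type-guessing problem (which may depend on the Pandas version), it only shows that slicing with `.iloc[0:1]` keeps each pick a DataFrame and that a mixed column such as `code` comes through intact.

```python
import pandas as pd

# Hypothetical sample data: "code" mixes numbers and strings on purpose.
df = pd.DataFrame(
    {
        "candidate_id": [1, 1, 2, 2],
        "score": [10, 30, 5, 20],
        "code": [123, 456, 789, "abc"],
    }
)


def pick(x):
    # .iloc[0:1] is a slice, so this returns a one-row DataFrame, not a Series.
    return x.sort_values(["score"], ascending=False).iloc[0:1]


# Selecting the value columns explicitly is an assumption made here so that
# every group passed to pick() has the same columns across pandas versions.
picked = df.groupby("candidate_id")[["score", "code"]].apply(pick)

# The group key ends up as the first index level; move it back to a column.
picked = picked.reset_index(level=0)

print(type(df.iloc[0]).__name__)    # Series
print(type(df.iloc[0:1]).__name__)  # DataFrame
print(picked)
```

If you only need the top row per group, sorting first and then calling `groupby(...).head(1)` is another option that avoids `apply` altogether, but that is a different technique from the one described in this note.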