pandas.DataFrame.drop_duplicates

DataFrame.drop_duplicates(self, subset=None, keep='first', inplace=False)

Return DataFrame with duplicate rows removed, optionally only considering certain columns. Indexes, including time indexes, are ignored.
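As a minimal sketch of what "indexes are ignored" means in practice, the example below uses an invented frame (the column names, values, and index labels are assumptions for illustration only). Two rows hold the same values under different index labels, and the second one is still treated as a duplicate.

```python
import pandas as pd

# Hypothetical data: rows labelled 10 and 20 hold identical values.
df = pd.DataFrame(
    {"a": [1, 1, 2], "b": ["x", "x", "y"]},
    index=[10, 20, 30],
)

# Only the column values are compared, not the index labels,
# so the row labelled 20 is dropped as a duplicate of the row labelled 10.
deduped = df.drop_duplicates()
print(deduped)
```

The surviving rows keep their original index labels; the result is not renumbered.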

Parameters
subset : column label or sequence of labels, optional

Only consider certain columns for identifying duplicates. By default, all of the columns are used.

keep : {‘first’, ‘last’, False}, default ‘first’
  • first : Drop duplicates except for the first occurrence.

  • last : Drop duplicates except for the last occurrence.

  • False : Drop all duplicates.

inplace : boolean, default False

Whether to drop duplicates in place or to return a copy.
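The subset and keep parameters described above can be sketched as follows. The frame and its column names (brand, style, rating) are invented for illustration and are not part of the pandas API.

```python
import pandas as pd

# Hypothetical data: column names and values are made up for this sketch.
df = pd.DataFrame({
    "brand": ["A", "A", "B", "B"],
    "style": ["cup", "cup", "cup", "pack"],
    "rating": [4.0, 3.5, 5.0, 5.0],
})

# Default: all columns are compared. No two rows match on every column,
# so nothing is dropped.
all_cols = df.drop_duplicates()

# Compare only "brand"; keep the first row seen for each brand.
by_brand_first = df.drop_duplicates(subset="brand")

# Compare only "brand"; keep the last row seen for each brand.
by_brand_last = df.drop_duplicates(subset="brand", keep="last")

# Compare on two columns; keep=False drops every row that has a duplicate,
# so both ("A", "cup") rows are removed.
by_pair_none = df.drop_duplicates(subset=["brand", "style"], keep=False)
```

Note that keep=False is the boolean False, not the string 'False'.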

Returns
DataFrame
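A short sketch of how the return value relates to inplace, using an invented single-column frame: with the default inplace=False the original is left untouched and a new DataFrame is returned, while with inplace=True the original is modified and the call returns None.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"a": [1, 1, 2]})

# inplace=False (default): df is unchanged, a new DataFrame is returned.
copy = df.drop_duplicates()
rows_before = len(df)

# inplace=True: df itself is modified and the call returns None.
result = df.drop_duplicates(inplace=True)
rows_after = len(df)
```

Because the in-place call returns None, assigning its result back to df would discard the data.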