API Reference

This page gives an overview of all public pandas objects, functions and methods. All classes and functions exposed in pandas.* namespace are public.

Some subpackages are also public, including pandas.errors, pandas.plotting, and pandas.testing. Public functions in the pandas.io and pandas.tseries submodules are mentioned in the documentation. The pandas.api.types subpackage holds some public functions related to data types in pandas.
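As a rough illustration of staying within the public namespaces named above, the sketch below imports helpers from pandas.api.types and pandas.testing. The sample Series and its name are made up for the example.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype
from pandas.testing import assert_series_equal

# Hypothetical sample data, used only for illustration.
prices = pd.Series([1.5, 2.0, 3.25], name="price")

# pandas.api.types exposes public dtype checks.
prices_are_numeric = is_numeric_dtype(prices)

# pandas.testing exposes public assertion helpers;
# this call raises AssertionError if the two Series differ.
assert_series_equal(prices, prices.copy())
```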

Warning

The pandas.core, pandas.compat, and pandas.util top-level modules are PRIVATE. Stable functionality in such modules is not guaranteed.

Input/Output

Pickling

read_pickle(path[, compression]) Load pickled pandas object (or any object) from file.

Flat File

read_table(filepath_or_buffer[, sep, …]) (DEPRECATED) Read general delimited file into DataFrame.
read_csv(filepath_or_buffer[, sep, …]) Read a comma-separated values (csv) file into DataFrame.
read_fwf(filepath_or_buffer[, colspecs, …]) Read a table of fixed-width formatted lines into DataFrame.
read_msgpack(path_or_buf[, encoding, iterator]) Load msgpack pandas object from the specified file path
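A minimal sketch of read_csv, assuming the data arrives as an in-memory string rather than a file on disk. The column names and values are hypothetical; io.StringIO stands in for the filepath_or_buffer argument.

```python
import io

import pandas as pd

# Hypothetical CSV content held in memory instead of a file.
csv_text = "name,score\nalice,90\nbob,85\n"
frame = pd.read_csv(io.StringIO(csv_text))

# The sep parameter selects a different delimiter.
semicolon_frame = pd.read_csv(io.StringIO("name;score\nalice;90\n"), sep=";")
```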

Clipboard

read_clipboard([sep]) Read text from clipboard and pass to read_csv.

Excel

read_excel(io[, sheet_name, header, names, …]) Read an Excel table into a pandas DataFrame
ExcelFile.parse([sheet_name, header, names, …]) Parse specified sheet(s) into a DataFrame
ExcelWriter(path[, engine, date_format, …]) Class for writing DataFrame objects into Excel sheets; the default is to use xlwt for xls and openpyxl for xlsx.

JSON

read_json([path_or_buf, orient, typ, dtype, …]) Convert a JSON string to pandas object.
json_normalize(data[, record_path, meta, …]) Normalize semi-structured JSON data into a flat table.
build_table_schema(data[, index, …]) Create a Table schema from data.

HTML

read_html(io[, match, flavor, header, …]) Read HTML tables into a list of DataFrame objects.

HDFStore: PyTables (HDF5)

read_hdf(path_or_buf[, key, mode]) Read from the store, close it if we opened it.
HDFStore.put(key, value[, format, append]) Store object in HDFStore
HDFStore.append(key, value[, format, …]) Append to Table in file.
HDFStore.get(key) Retrieve pandas object stored in file
HDFStore.select(key[, where, start, stop, …]) Retrieve pandas object stored in file, optionally based on where criteria
HDFStore.info() Print detailed information on the store.
HDFStore.keys() Return a (potentially unordered) list of the keys corresponding to the objects stored in the HDFStore.
HDFStore.walk([where]) Walk the pytables group hierarchy for pandas objects

Feather

read_feather(path[, columns, use_threads]) Load a feather-format object from the file path

Parquet

read_parquet(path[, engine, columns]) Load a parquet object from the file path, returning a DataFrame.

SAS

read_sas(filepath_or_buffer[, format, …]) Read SAS files stored as either XPORT or SAS7BDAT format files.

SQL

read_sql_table(table_name, con[, schema, …]) Read SQL database table into a DataFrame.
read_sql_query(sql, con[, index_col, …]) Read SQL query into a DataFrame.
read_sql(sql, con[, index_col, …]) Read SQL query or database table into a DataFrame.

Google BigQuery

read_gbq(query[, project_id, index_col, …]) Load data from Google BigQuery.

STATA

read_stata(filepath_or_buffer[, …]) Read Stata file into DataFrame.
StataReader.data(**kwargs) (DEPRECATED) Reads observations from Stata file, converting them into a dataframe
StataReader.data_label() Returns data label of Stata file
StataReader.value_labels() Returns a dict associating each variable name with a dict that maps each value to its corresponding label
StataReader.variable_labels() Returns variable labels as a dict, associating each variable name with corresponding label
StataWriter.write_file()

General functions

Data manipulations

melt(frame[, id_vars, value_vars, var_name, …]) Unpivots a DataFrame from wide format to long format, optionally leaving identifier variables set.
pivot(data[, index, columns, values]) Return reshaped DataFrame organized by given index / column values.
pivot_table(data[, values, index, columns, …]) Create a spreadsheet-style pivot table as a DataFrame.
crosstab(index, columns[, values, rownames, …]) Compute a simple cross-tabulation of two (or more) factors.
cut(x, bins[, right, labels, retbins, …]) Bin values into discrete intervals.
qcut(x, q[, labels, retbins, precision, …]) Quantile-based discretization function.
merge(left, right[, how, on, left_on, …]) Merge DataFrame or named Series objects with a database-style join.
merge_ordered(left, right[, on, left_on, …]) Perform merge with optional filling/interpolation designed for ordered data like time series data.
merge_asof(left, right[, on, left_on, …]) Perform an asof merge.
concat(objs[, axis, join, join_axes, …]) Concatenate pandas objects along a particular axis with optional set logic along the other axes.
get_dummies(data[, prefix, prefix_sep, …]) Convert categorical variable into dummy/indicator variables
factorize(values[, sort, order, …]) Encode the object as an enumerated type or categorical variable.
unique(values) Hash table-based unique.
wide_to_long(df, stubnames, i, j[, sep, suffix]) Wide panel to long format.
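Two of the functions listed above, cut and merge, can be sketched as follows. The bin edges, labels, and frames are invented for illustration; cut is shown with its default right-closed intervals.

```python
import pandas as pd

# Bin ages into the intervals (0, 18], (18, 40], (40, 100].
ages = pd.Series([5, 17, 30, 65])
groups = pd.cut(ages, bins=[0, 18, 40, 100], labels=["minor", "adult", "senior"])

# Database-style inner join on a shared key column.
left = pd.DataFrame({"key": ["a", "b"], "x": [1, 2]})
right = pd.DataFrame({"key": ["b", "c"], "y": [10, 20]})
joined = pd.merge(left, right, on="key", how="inner")
```

Only the key "b" appears in both frames, so the inner join keeps a single row.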

Top-level missing data

isna(obj) Detect missing values for an array-like object.
isnull(obj) Detect missing values for an array-like object.
notna(obj) Detect non-missing values for an array-like object.
notnull(obj) Detect non-missing values for an array-like object.
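A short sketch of the top-level missing-data helpers, using a made-up Series. isnull and notnull are listed with the same descriptions as isna and notna, so the example only checks that they agree.

```python
import numpy as np
import pandas as pd

values = pd.Series([1.0, np.nan, 3.0])

missing_mask = pd.isna(values)    # True where a value is missing
present_mask = pd.notna(values)   # True where a value is present

# The helpers also accept scalars.
scalar_is_missing = pd.isna(None)

# isnull gives the same answer as isna on this input.
same_as_isnull = pd.isnull(values).equals(missing_mask)
```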

Top-level conversions

to_numeric(arg[, errors, downcast]) Convert argument to a numeric type.

Top-level dealing with datetimelike

to_datetime(arg[, errors, dayfirst, …]) Convert argument to datetime.
to_timedelta(arg[, unit, box, errors]) Convert argument to timedelta.
date_range([start, end, periods, freq, tz, …]) Return a fixed frequency DatetimeIndex.
bdate_range([start, end, periods, freq, tz, …]) Return a fixed frequency DatetimeIndex, with business day as the default frequency
period_range([start, end, periods, freq, name]) Return a fixed frequency PeriodIndex, with day (calendar) as the default frequency
timedelta_range([start, end, periods, freq, …]) Return a fixed frequency TimedeltaIndex, with day as the default frequency
infer_freq(index[, warn]) Infer the most likely frequency given the input index.
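The datetimelike helpers above can be combined as in this sketch. The dates are arbitrary sample values.

```python
import pandas as pd

# Parse a single date string into a Timestamp.
stamp = pd.to_datetime("2019-01-15")

# Parse a duration string into a Timedelta.
delta = pd.to_timedelta("1 days 12:00:00")

# Build a fixed-frequency DatetimeIndex of three calendar days.
days = pd.date_range(start="2019-01-01", periods=3, freq="D")

# Ask pandas to infer the frequency back from the index.
inferred = pd.infer_freq(days)
```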

Top-level dealing with intervals

interval_range([start, end, periods, freq, …]) Return a fixed frequency IntervalIndex

Top-level evaluation

eval(expr[, parser, engine, truediv, …]) Evaluate a Python expression as a string using various backends.

Hashing

util.hash_array(vals[, encoding, hash_key, …]) Given a 1d array, return an array of deterministic integers.
util.hash_pandas_object(obj[, index, …]) Return a data hash of the Index/Series/DataFrame

Testing

test([extra_args])

Series

Constructor

Series([data, index, dtype, name, copy, …]) One-dimensional ndarray with axis labels (including time series).

Attributes

Axes

Series.index The index (axis labels) of the Series.
Series.values Return Series as ndarray or ndarray-like depending on the dtype.
Series.dtype Return the dtype object of the underlying data.
Series.ftype Return if the data is sparse|dense.
Series.shape Return a tuple of the shape of the underlying data.
Series.nbytes Return the number of bytes in the underlying data.
Series.ndim Number of dimensions of the underlying data, by definition 1.
Series.size Return the number of elements in the underlying data.
Series.strides Return the strides of the underlying data.
Series.itemsize Return the size of the dtype of the item of the underlying data.
Series.base Return the base object if the memory of the underlying data is shared.
Series.T Return the transpose, which is by definition self.
Series.memory_usage([index, deep]) Return the memory usage of the Series.
Series.hasnans Return whether the Series contains any NaNs; enables various performance speedups.
Series.flags
Series.empty
Series.dtypes Return the dtype object of the underlying data.
Series.ftypes Return if the data is sparse|dense.
Series.data Return the data pointer of the underlying data.
Series.is_copy Return the copy.
Series.name Return name of the Series.
Series.put(*args, **kwargs) Applies the put method to its values attribute if it has one.

Conversion

Series.astype(dtype[, copy, errors]) Cast a pandas object to a specified dtype.
Series.infer_objects() Attempt to infer better dtypes for object columns.
Series.convert_objects([convert_dates, …]) (DEPRECATED) Attempt to infer better dtype for object columns.
Series.copy([deep]) Make a copy of this object’s indices and data.
Series.bool() Return the bool of a single element PandasObject.
Series.to_period([freq, copy]) Convert Series from DatetimeIndex to PeriodIndex with desired frequency (inferred from index if not passed).
Series.to_timestamp([freq, how, copy]) Cast to datetimeindex of timestamps, at beginning of period.
Series.to_list() Return a list of the values.
Series.get_values() Same as values (but handles sparseness conversions); is a view.

Indexing, iteration

Series.get(key[, default]) Get item from object for given key (DataFrame column, Panel slice, etc.).
Series.at Access a single value for a row/column label pair.
Series.iat Access a single value for a row/column pair by integer position.
Series.loc Access a group of rows and columns by label(s) or a boolean array.
Series.iloc Purely integer-location based indexing for selection by position.
Series.__iter__() Return an iterator of the values.
Series.iteritems() Lazily iterate over (index, value) tuples.
Series.items() Lazily iterate over (index, value) tuples.
Series.keys() Alias for index.
Series.pop(item) Return item and drop from frame.
Series.item() Return the first element of the underlying data as a python scalar.
Series.xs(key[, axis, level, drop_level]) Return cross-section from the Series/DataFrame.

For more information on .at, .iat, .loc, and .iloc, see the indexing documentation.
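The four accessors differ in whether they take labels or integer positions, and in whether they return one value or a subset. A small sketch with a hypothetical labelled Series:

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=["a", "b", "c"])

by_label = s.at["b"]             # single value, looked up by label
by_position = s.iat[0]           # single value, looked up by integer position
label_subset = s.loc[["a", "c"]] # group of rows selected by label
position_slice = s.iloc[1:]      # group of rows selected by position
```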

Binary operator functions

Series.add(other[, level, fill_value, axis]) Addition of series and other, element-wise (binary operator add).
Series.sub(other[, level, fill_value, axis]) Subtraction of series and other, element-wise (binary operator sub).
Series.mul(other[, level, fill_value, axis]) Multiplication of series and other, element-wise (binary operator mul).
Series.div(other[, level, fill_value, axis]) Floating division of series and other, element-wise (binary operator truediv).
Series.truediv(other[, level, fill_value, axis]) Floating division of series and other, element-wise (binary operator truediv).
Series.floordiv(other[, level, fill_value, axis]) Integer division of series and other, element-wise (binary operator floordiv).
Series.mod(other[, level, fill_value, axis]) Modulo of series and other, element-wise (binary operator mod).
Series.pow(other[, level, fill_value, axis]) Exponential power of series and other, element-wise (binary operator pow).
Series.radd(other[, level, fill_value, axis]) Addition of series and other, element-wise (binary operator radd).
Series.rsub(other[, level, fill_value, axis]) Subtraction of series and other, element-wise (binary operator rsub).
Series.rmul(other[, level, fill_value, axis]) Multiplication of series and other, element-wise (binary operator rmul).
Series.rdiv(other[, level, fill_value, axis]) Floating division of series and other, element-wise (binary operator rtruediv).
Series.rtruediv(other[, level, fill_value, axis]) Floating division of series and other, element-wise (binary operator rtruediv).
Series.rfloordiv(other[, level, fill_value, …]) Integer division of series and other, element-wise (binary operator rfloordiv).
Series.rmod(other[, level, fill_value, axis]) Modulo of series and other, element-wise (binary operator rmod).
Series.rpow(other[, level, fill_value, axis]) Exponential power of series and other, element-wise (binary operator rpow).
Series.combine(other, func[, fill_value]) Combine the Series with a Series or scalar according to func.
Series.combine_first(other) Combine Series values, choosing the calling Series’s values first.
Series.round([decimals]) Round each value in a Series to the given number of decimals.
Series.lt(other[, level, fill_value, axis]) Less than of series and other, element-wise (binary operator lt).
Series.gt(other[, level, fill_value, axis]) Greater than of series and other, element-wise (binary operator gt).
Series.le(other[, level, fill_value, axis]) Less than or equal to of series and other, element-wise (binary operator le).
Series.ge(other[, level, fill_value, axis]) Greater than or equal to of series and other, element-wise (binary operator ge).
Series.ne(other[, level, fill_value, axis]) Not equal to of series and other, element-wise (binary operator ne).
Series.eq(other[, level, fill_value, axis]) Equal to of series and other, element-wise (binary operator eq).
Series.product([axis, skipna, level, …]) Return the product of the values for the requested axis.
Series.dot(other) Compute the dot product between the Series and the columns of other.
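The method forms of the operators accept a fill_value that the plain operators do not. The sketch below, with invented data, contrasts the two when the indexes only partly overlap.

```python
import pandas as pd

a = pd.Series([1, 2], index=["x", "y"])
b = pd.Series([10, 20], index=["y", "z"])

# Plain operator: labels present in only one Series produce NaN.
plain = a + b

# Method form: a missing side is treated as 0 before adding.
filled = a.add(b, fill_value=0)
```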

Function application, GroupBy & Window

Series.apply(func[, convert_dtype, args]) Invoke function on values of Series.
Series.agg(func[, axis]) Aggregate using one or more operations over the specified axis.
Series.aggregate(func[, axis]) Aggregate using one or more operations over the specified axis.
Series.transform(func[, axis]) Call func on self producing a Series with transformed values and that has the same axis length as self.
Series.map(arg[, na_action]) Map values of Series according to input correspondence.
Series.groupby([by, axis, level, as_index, …]) Group DataFrame or Series using a mapper or by a Series of columns.
Series.rolling(window[, min_periods, …]) Provides rolling window calculations.
Series.expanding([min_periods, center, axis]) Provides expanding transformations.
Series.ewm([com, span, halflife, alpha, …]) Provides exponential weighted functions.
Series.pipe(func, *args, **kwargs) Apply func(self, *args, **kwargs).
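A brief sketch of three of the entries above (map, rolling, groupby) on a made-up Series. Grouping by the parity of each value is just an illustrative choice of key.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])

# Element-wise mapping through a function.
squared = s.map(lambda v: v * v)

# Mean over a sliding window of two elements; the first window is incomplete.
rolling_mean = s.rolling(window=2).mean()

# Group values by parity (0 = even, 1 = odd) and sum each group.
parity_sums = s.groupby(s % 2).sum()
```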

Computations / Descriptive Stats

Series.abs() Return a Series/DataFrame with absolute numeric value of each element.
Series.all([axis, bool_only, skipna, level]) Return whether all elements are True, potentially over an axis.
Series.any([axis, bool_only, skipna, level]) Return whether any element is True, potentially over an axis.
Series.autocorr([lag]) Compute the lag-N autocorrelation.
Series.between(left, right[, inclusive]) Return boolean Series equivalent to left <= series <= right.
Series.clip([lower, upper, axis, inplace]) Trim values at input threshold(s).
Series.clip_lower(threshold[, axis, inplace]) (DEPRECATED) Trim values below a given threshold.
Series.clip_upper(threshold[, axis, inplace]) (DEPRECATED) Trim values above a given threshold.
Series.corr(other[, method, min_periods]) Compute correlation with other Series, excluding missing values.
Series.count([level]) Return number of non-NA/null observations in the Series.
Series.cov(other[, min_periods]) Compute covariance with Series, excluding missing values.
Series.cummax([axis, skipna]) Return cumulative maximum over a DataFrame or Series axis.
Series.cummin([axis, skipna]) Return cumulative minimum over a DataFrame or Series axis.
Series.cumprod([axis, skipna]) Return cumulative product over a DataFrame or Series axis.
Series.cumsum([axis, skipna]) Return cumulative sum over a DataFrame or Series axis.
Series.describe([percentiles, include, exclude]) Generate descriptive statistics that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values.
Series.diff([periods]) First discrete difference of element.
Series.factorize([sort, na_sentinel]) Encode the object as an enumerated type or categorical variable.
Series.kurt([axis, skipna, level, numeric_only]) Return unbiased kurtosis over requested axis using Fisher’s definition of kurtosis (kurtosis of normal == 0.0).
Series.mad([axis, skipna, level]) Return the mean absolute deviation of the values for the requested axis.
Series.max([axis, skipna, level, numeric_only]) This method returns the maximum of the values in the object.
Series.mean([axis, skipna, level, numeric_only]) Return the mean of the values for the requested axis.
Series.median([axis, skipna, level, …]) Return the median of the values for the requested axis.
Series.min([axis, skipna, level, numeric_only]) This method returns the minimum of the values in the object.
Series.mode([dropna]) Return the mode(s) of the dataset.
Series.nlargest([n, keep]) Return the largest n elements.
Series.nsmallest([n, keep]) Return the smallest n elements.
Series.pct_change([periods, fill_method, …]) Percentage change between the current and a prior element.
Series.prod([axis, skipna, level, …]) Return the product of the values for the requested axis.
Series.quantile([q, interpolation]) Return value at the given quantile.
Series.rank([axis, method, numeric_only, …]) Compute numerical data ranks (1 through n) along axis.
Series.sem([axis, skipna, level, ddof, …]) Return unbiased standard error of the mean over requested axis.
Series.skew([axis, skipna, level, numeric_only]) Return unbiased skew over requested axis, normalized by N-1.
Series.std([axis, skipna, level, ddof, …]) Return sample standard deviation over requested axis.
Series.sum([axis, skipna, level, …]) Return the sum of the values for the requested axis.
Series.var([axis, skipna, level, ddof, …]) Return unbiased variance over requested axis.
Series.kurtosis([axis, skipna, level, …]) Return unbiased kurtosis over requested axis using Fisher’s definition of kurtosis (kurtosis of normal == 0.0).
Series.unique() Return unique values of Series object.
Series.nunique([dropna]) Return number of unique elements in the object.
Series.is_unique Return boolean if values in the object are unique.
Series.is_monotonic Return boolean if values in the object are monotonic_increasing.
Series.is_monotonic_increasing Return boolean if values in the object are monotonic_increasing.
Series.is_monotonic_decreasing Return boolean if values in the object are monotonic_decreasing.
Series.value_counts([normalize, sort, …]) Return a Series containing counts of unique values.
Series.compound([axis, skipna, level]) Return the compound percentage of the values for the requested axis.
Series.nonzero() Return the integer indices of the elements that are non-zero.

Reindexing / Selection / Label manipulation

Series.align(other[, join, axis, level, …]) Align two objects on their axes with the specified join method for each axis Index.
Series.drop([labels, axis, index, columns, …]) Return Series with specified index labels removed.
Series.droplevel(level[, axis]) Return DataFrame with requested index / column level(s) removed.
Series.drop_duplicates([keep, inplace]) Return Series with duplicate values removed.
Series.duplicated([keep]) Indicate duplicate Series values.
Series.equals(other) Test whether two objects contain the same elements.
Series.first(offset) Convenience method for subsetting initial periods of time series data based on a date offset.
Series.head([n]) Return the first n rows.
Series.idxmax([axis, skipna]) Return the row label of the maximum value.
Series.idxmin([axis, skipna]) Return the row label of the minimum value.
Series.isin(values) Check whether values are contained in Series.
Series.last(offset) Convenience method for subsetting final periods of time series data based on a date offset.
Series.reindex([index]) Conform Series to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index.
Series.reindex_like(other[, method, copy, …]) Return an object with matching indices as other object.
Series.rename([index]) Alter Series index labels or name.
Series.rename_axis([mapper, index, columns, …]) Set the name of the axis for the index or columns.
Series.reset_index([level, drop, name, inplace]) Generate a new DataFrame or Series with the index reset.
Series.sample([n, frac, replace, weights, …]) Return a random sample of items from an axis of object.
Series.select(crit[, axis]) (DEPRECATED) Return data corresponding to axis labels matching criteria.
Series.set_axis(labels[, axis, inplace]) Assign desired index to given axis.
Series.take(indices[, axis, convert, is_copy]) Return the elements in the given positional indices along an axis.
Series.tail([n]) Return the last n rows.
Series.truncate([before, after, axis, copy]) Truncate a Series or DataFrame before and after some index value.
Series.where(cond[, other, inplace, axis, …]) Replace values where the condition is False.
Series.mask(cond[, other, inplace, axis, …]) Replace values where the condition is True.
Series.add_prefix(prefix) Prefix labels with string prefix.
Series.add_suffix(suffix) Suffix labels with string suffix.
Series.filter([items, like, regex, axis]) Subset rows or columns of dataframe according to labels in the specified index.

Missing data handling

Series.isna() Detect missing values.
Series.notna() Detect existing (non-missing) values.
Series.dropna([axis, inplace]) Return a new Series with missing values removed.
Series.fillna([value, method, axis, …]) Fill NA/NaN values using the specified method.
Series.interpolate([method, axis, limit, …]) Interpolate values according to different methods.

Reshaping, sorting

Series.argsort([axis, kind, order]) Overrides ndarray.argsort.
Series.argmin([axis, skipna]) (DEPRECATED) Return the row label of the minimum value.
Series.argmax([axis, skipna]) (DEPRECATED) Return the row label of the maximum value.
Series.reorder_levels(order) Rearrange index levels using input order.
Series.sort_values([axis, ascending, …]) Sort by the values.
Series.sort_index([axis, level, ascending, …]) Sort Series by index labels.
Series.swaplevel([i, j, copy]) Swap levels i and j in a MultiIndex.
Series.unstack([level, fill_value]) Unstack, a.k.a. pivot, Series with MultiIndex to produce DataFrame.
Series.searchsorted(value[, side, sorter]) Find indices where elements should be inserted to maintain order.
Series.ravel([order]) Return the flattened underlying data as an ndarray.
Series.repeat(repeats, *args, **kwargs) Repeat elements of a Series.
Series.squeeze([axis]) Squeeze 1 dimensional axis objects into scalars.
Series.view([dtype]) Create a new view of the Series.

Combining / joining / merging

Series.append(to_append[, ignore_index, …]) Concatenate two or more Series.
Series.replace([to_replace, value, inplace, …]) Replace values given in to_replace with value.
Series.update(other) Modify Series in place using non-NA values from passed Series.

Datetimelike Properties

Series.dt can be used to access the values of the series as datetimelike and return several properties. These can be accessed like Series.dt.<property>.
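For instance, a Series of datetimes exposes its components through the accessor. This is a minimal sketch with two arbitrary dates:

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(["2019-01-31", "2020-02-29"]))

years = dates.dt.year                 # calendar year of each element
weekdays = dates.dt.dayofweek         # Monday=0, Sunday=6
month_ends = dates.dt.is_month_end    # last day of the month?
leap_years = dates.dt.is_leap_year    # date falls in a leap year?
```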

Datetime Properties

Series.dt.date Returns numpy array of python datetime.date objects (namely, the date part of Timestamps without timezone information).
Series.dt.time Returns numpy array of datetime.time.
Series.dt.timetz Returns numpy array of datetime.time also containing timezone information.
Series.dt.year The year of the datetime.
Series.dt.month The month as January=1, December=12.
Series.dt.day The days of the datetime.
Series.dt.hour The hours of the datetime.
Series.dt.minute The minutes of the datetime.
Series.dt.second The seconds of the datetime.
Series.dt.microsecond The microseconds of the datetime.
Series.dt.nanosecond The nanoseconds of the datetime.
Series.dt.week The week ordinal of the year.
Series.dt.weekofyear The week ordinal of the year.
Series.dt.dayofweek The day of the week with Monday=0, Sunday=6.
Series.dt.weekday The day of the week with Monday=0, Sunday=6.
Series.dt.dayofyear The ordinal day of the year.
Series.dt.quarter The quarter of the date.
Series.dt.is_month_start Indicates whether the date is the first day of the month.
Series.dt.is_month_end Indicates whether the date is the last day of the month.
Series.dt.is_quarter_start Indicator for whether the date is the first day of a quarter.
Series.dt.is_quarter_end Indicator for whether the date is the last day of a quarter.
Series.dt.is_year_start Indicate whether the date is the first day of a year.
Series.dt.is_year_end Indicate whether the date is the last day of the year.
Series.dt.is_leap_year Boolean indicator if the date belongs to a leap year.
Series.dt.daysinmonth The number of days in the month.
Series.dt.days_in_month The number of days in the month.
Series.dt.tz
Series.dt.freq

Datetime Methods

Series.dt.to_period(*args, **kwargs) Cast to PeriodArray/Index at a particular frequency.
Series.dt.to_pydatetime() Return the data as an array of native Python datetime objects.
Series.dt.tz_localize(*args, **kwargs) Localize tz-naive Datetime Array/Index to tz-aware Datetime Array/Index.
Series.dt.tz_convert(*args, **kwargs) Convert tz-aware Datetime Array/Index from one time zone to another.
Series.dt.normalize(*args, **kwargs) Convert times to midnight.
Series.dt.strftime(*args, **kwargs) Convert to Index using specified date_format.
Series.dt.round(*args, **kwargs) Perform round operation on the data to the specified freq.
Series.dt.floor(*args, **kwargs) Perform floor operation on the data to the specified freq.
Series.dt.ceil(*args, **kwargs) Perform ceil operation on the data to the specified freq.
Series.dt.month_name(*args, **kwargs) Return the month names of the DatetimeIndex with specified locale.
Series.dt.day_name(*args, **kwargs) Return the day names of the DatetimeIndex with specified locale.

Timedelta Properties

Series.dt.days Number of days for each element.
Series.dt.seconds Number of seconds (>= 0 and less than 1 day) for each element.
Series.dt.microseconds Number of microseconds (>= 0 and less than 1 second) for each element.
Series.dt.nanoseconds Number of nanoseconds (>= 0 and less than 1 microsecond) for each element.
Series.dt.components Return a DataFrame of the components of the Timedeltas.

Timedelta Methods

Series.dt.to_pytimedelta() Return an array of native datetime.timedelta objects.
Series.dt.total_seconds(*args, **kwargs) Return total duration of each element expressed in seconds.
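The seconds property and the total_seconds method are easy to confuse: the first is only the sub-day remainder, the second is the whole duration. A sketch with two made-up durations:

```python
import pandas as pd

durations = pd.Series(pd.to_timedelta(["1 days 06:00:00", "0 days 00:00:30"]))

whole_days = durations.dt.days            # day component only
leftover_seconds = durations.dt.seconds   # seconds within the last partial day
total = durations.dt.total_seconds()      # full duration expressed in seconds
```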

String handling

Series.str can be used to access the values of the series as strings and apply several methods to it. These can be accessed like Series.str.<function/property>.
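As an illustration, the sketch below chains a few string methods on a hypothetical Series containing padding and a missing value. Missing entries stay missing unless a method offers an na argument.

```python
import pandas as pd

raw = pd.Series(["  Apple", "banana  ", None])

# Methods can be chained; each call goes through .str again.
cleaned = raw.str.strip().str.upper()

# lstrip only touches the left side, rstrip only the right side.
left_only = raw.str.lstrip()
right_only = raw.str.rstrip()

# na=False reports missing entries as non-matching.
has_an = raw.str.contains("an", na=False)
```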

Series.str.capitalize() Convert strings in the Series/Index to be capitalized.
Series.str.cat([others, sep, na_rep, join]) Concatenate strings in the Series/Index with given separator.
Series.str.center(width[, fillchar]) Fill the left and right sides of strings in the Series/Index with an additional character.
Series.str.contains(pat[, case, flags, na, …]) Test if pattern or regex is contained within a string of a Series or Index.
Series.str.count(pat[, flags]) Count occurrences of pattern in each string of the Series/Index.
Series.str.decode(encoding[, errors]) Decode character string in the Series/Index using indicated encoding.
Series.str.encode(encoding[, errors]) Encode character string in the Series/Index using indicated encoding.
Series.str.endswith(pat[, na]) Test if the end of each string element matches a pattern.
Series.str.extract(pat[, flags, expand]) Extract capture groups in the regex pat as columns in a DataFrame.
Series.str.extractall(pat[, flags]) For each subject string in the Series, extract groups from all matches of regular expression pat.
Series.str.find(sub[, start, end]) Return the lowest index in each string in the Series/Index where the substring is fully contained between [start:end].
Series.str.findall(pat[, flags]) Find all occurrences of pattern or regular expression in the Series/Index.
Series.str.get(i) Extract element from each component at specified position.
Series.str.index(sub[, start, end]) Return the lowest index in each string where the substring is fully contained between [start:end].
Series.str.join(sep) Join lists contained as elements in the Series/Index with passed delimiter.
Series.str.len() Computes the length of each element in the Series/Index.
Series.str.ljust(width[, fillchar]) Fill the right side of strings in the Series/Index with an additional character.
Series.str.lower() Convert strings in the Series/Index to lowercase.
Series.str.lstrip([to_strip]) Remove leading characters.
Series.str.match(pat[, case, flags, na]) Determine if each string matches a regular expression.
Series.str.normalize(form) Return the Unicode normal form for the strings in the Series/Index.
Series.str.pad(width[, side, fillchar]) Pad strings in the Series/Index up to width.
Series.str.partition([sep, expand]) Split the string at the first occurrence of sep.
Series.str.repeat(repeats) Duplicate each string in the Series or Index.
Series.str.replace(pat, repl[, n, case, …]) Replace occurrences of pattern/regex in the Series/Index with some other string.
Series.str.rfind(sub[, start, end]) Return the highest index in each string in the Series/Index where the substring is fully contained between [start:end].
Series.str.rindex(sub[, start, end]) Return the highest index in each string where the substring is fully contained between [start:end].
Series.str.rjust(width[, fillchar]) Fill the left side of strings in the Series/Index with an additional character.
Series.str.rpartition([sep, expand]) Split the string at the last occurrence of sep.
Series.str.rstrip([to_strip]) Remove trailing characters.
Series.str.slice([start, stop, step]) Slice substrings from each element in the Series or Index.
Series.str.slice_replace([start, stop, repl]) Replace a positional slice of a string with another value.
Series.str.split([pat, n, expand]) Split strings around given separator/delimiter.
Series.str.rsplit([pat, n, expand]) Split strings around given separator/delimiter, starting from the right.
Series.str.startswith(pat[, na]) Test if the start of each string element matches a pattern.
Series.str.strip([to_strip]) Remove leading and trailing characters.
Series.str.swapcase() Convert strings in the Series/Index to be swapcased.
Series.str.title() Convert strings in the Series/Index to titlecase.
Series.str.translate(table[, deletechars]) Map all characters in the string through the given mapping table.
Series.str.upper() Convert strings in the Series/Index to uppercase.
Series.str.wrap(width, **kwargs) Wrap long strings in the Series/Index to be formatted in paragraphs with length less than a given width.
Series.str.zfill(width) Pad strings in the Series/Index by prepending ‘0’ characters.
Series.str.isalnum() Check whether all characters in each string are alphanumeric.
Series.str.isalpha() Check whether all characters in each string are alphabetic.
Series.str.isdigit() Check whether all characters in each string are digits.
Series.str.isspace() Check whether all characters in each string are whitespace.
Series.str.islower() Check whether all characters in each string are lowercase.
Series.str.isupper() Check whether all characters in each string are uppercase.
Series.str.istitle() Check whether all characters in each string are titlecase.
Series.str.isnumeric() Check whether all characters in each string are numeric.
Series.str.isdecimal() Check whether all characters in each string are decimal.
Series.str.get_dummies([sep]) Split each string in the Series by sep and return a frame of dummy/indicator variables.

Categorical

Pandas defines a custom data type for representing data that can take only a limited, fixed set of values. The dtype of a Categorical can be described by a pandas.api.types.CategoricalDtype.

api.types.CategoricalDtype([categories, ordered]) Type for categorical data with the categories and orderedness
api.types.CategoricalDtype.categories An Index containing the unique categories allowed.
api.types.CategoricalDtype.ordered Whether the categories have an ordered relationship.

Categorical data can be stored in a pandas.Categorical:

Categorical(values[, categories, ordered, …]) Represents a categorical variable in classic R / S-plus fashion

The alternative Categorical.from_codes() constructor can be used when you have the categories and integer codes already:

Categorical.from_codes(codes, categories[, …]) Make a Categorical type from codes and categories arrays.
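
As a minimal sketch of the from_codes() constructor, assuming the codes and categories are already known (the values here are illustrative only):

```python
import pandas as pd

# Each code is a position in `categories`.
codes = [0, 1, 0, 2]
categories = ["low", "mid", "high"]

cat = pd.Categorical.from_codes(codes, categories=categories, ordered=True)
values = list(cat)
```

This avoids the factorization step that the regular constructor performs when it has to discover the categories from the values.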

The dtype information is available on the Categorical.

Categorical.dtype The CategoricalDtype for this instance
Categorical.categories The categories of this categorical.
Categorical.ordered Whether the categories have an ordered relationship.
Categorical.codes The category codes of this categorical.

np.asarray(categorical) works by implementing the array interface. Be aware that this converts the Categorical back to a NumPy array, so the categories and ordering information are not preserved!

Categorical.__array__([dtype]) The numpy array interface.
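
The loss of category information can be seen in a short sketch (illustrative data; the names are assumptions):

```python
import numpy as np
import pandas as pd

cat = pd.Categorical(["b", "a", "b"], categories=["a", "b", "c"], ordered=True)

# np.asarray goes through Categorical.__array__ and yields a plain ndarray.
arr = np.asarray(cat)

# The values survive, but the unused category "c" and the ordering do not.
arr_values = arr.tolist()
has_categories = hasattr(arr, "categories")
```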

A Categorical can be stored in a Series or DataFrame. To create a Series of dtype category, use cat = s.astype(dtype) or Series(..., dtype=dtype), where dtype is either the string 'category' or an instance of CategoricalDtype.

If the Series is of dtype CategoricalDtype, Series.cat can be used to change the categorical data. This accessor is similar to the Series.dt and Series.str accessors and has the following usable methods and properties:

Series.cat.categories The categories of this categorical.
Series.cat.ordered Whether the categories have an ordered relationship.
Series.cat.codes Return Series of codes as well as the index.
Series.cat.rename_categories(*args, **kwargs) Renames categories.
Series.cat.reorder_categories(*args, **kwargs) Reorders categories as specified in new_categories.
Series.cat.add_categories(*args, **kwargs) Add new categories.
Series.cat.remove_categories(*args, **kwargs) Removes the specified categories.
Series.cat.remove_unused_categories(*args, …) Removes categories which are not used.
Series.cat.set_categories(*args, **kwargs) Sets the categories to the specified new_categories.
Series.cat.as_ordered(*args, **kwargs) Set the Categorical to be ordered.
Series.cat.as_unordered(*args, **kwargs) Set the Categorical to be unordered.
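
A minimal sketch of building a categorical Series and using the Series.cat accessor, with illustrative sizes as the data (names are assumptions):

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

dtype = CategoricalDtype(categories=["S", "M", "L"], ordered=True)
s = pd.Series(["M", "S", "L", "M"]).astype(dtype)

# Integer position of each value within the categories.
codes = s.cat.codes

# The accessor methods return a new Series; `s` itself is unchanged.
extended = s.cat.add_categories(["XL"])
renamed = s.cat.rename_categories({"S": "small", "M": "medium", "L": "large"})
trimmed = extended.cat.remove_unused_categories()
```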

Plotting

Series.plot is both a callable method and a namespace attribute for specific plotting methods of the form Series.plot.<kind>.

Series.plot([kind, ax, figsize, ….]) Series plotting accessor and method
Series.plot.area(**kwds) Area plot.
Series.plot.bar(**kwds) Vertical bar plot.
Series.plot.barh(**kwds) Horizontal bar plot.
Series.plot.box(**kwds) Boxplot.
Series.plot.density([bw_method, ind]) Generate Kernel Density Estimate plot using Gaussian kernels.
Series.plot.hist([bins]) Histogram.
Series.plot.kde([bw_method, ind]) Generate Kernel Density Estimate plot using Gaussian kernels.
Series.plot.line(**kwds) Line plot.
Series.plot.pie(**kwds) Pie chart.
Series.hist([by, ax, grid, xlabelsize, …]) Draw histogram of the input series using matplotlib.

Serialization / IO / Conversion

Series.to_pickle(path[, compression, protocol]) Pickle (serialize) object to file.
Series.to_csv(*args, **kwargs) Write object to a comma-separated values (csv) file.
Series.to_dict([into]) Convert Series to {label -> value} dict or dict-like object.
Series.to_excel(excel_writer[, sheet_name, …]) Write object to an Excel sheet.
Series.to_frame([name]) Convert Series to DataFrame.
Series.to_xarray() Return an xarray object from the pandas object.
Series.to_hdf(path_or_buf, key, **kwargs) Write the contained data to an HDF5 file using HDFStore.
Series.to_sql(name, con[, schema, …]) Write records stored in a DataFrame to a SQL database.
Series.to_msgpack([path_or_buf, encoding]) Serialize object to input file path using msgpack format.
Series.to_json([path_or_buf, orient, …]) Convert the object to a JSON string.
Series.to_sparse([kind, fill_value]) Convert Series to SparseSeries.
Series.to_dense() Return dense representation of NDFrame (as opposed to sparse).
Series.to_string([buf, na_rep, …]) Render a string representation of the Series.
Series.to_clipboard([excel, sep]) Copy object to the system clipboard.
Series.to_latex([buf, columns, col_space, …]) Render an object to a LaTeX tabular environment table.

Sparse

SparseSeries.to_coo([row_levels, …]) Create a scipy.sparse.coo_matrix from a SparseSeries with MultiIndex.
SparseSeries.from_coo(A[, dense_index]) Create a SparseSeries from a scipy.sparse.coo_matrix.
Series.sparse.npoints The number of non-fill_value points.
Series.sparse.density The percent of non-fill_value points, as decimal.
Series.sparse.fill_value Elements in data that are fill_value are not stored.
Series.sparse.sp_values An ndarray containing the non-fill_value values.
Series.sparse.from_coo(A[, dense_index]) Create a SparseSeries from a scipy.sparse.coo_matrix.
Series.sparse.to_coo([row_levels, …]) Create a scipy.sparse.coo_matrix from a SparseSeries with MultiIndex.

DataFrame

Constructor

DataFrame([data, index, columns, dtype, copy]) Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns).

Attributes and underlying data

Axes

DataFrame.index The index (row labels) of the DataFrame.
DataFrame.columns The column labels of the DataFrame.
DataFrame.dtypes Return the dtypes in the DataFrame.
DataFrame.ftypes Return the ftypes (indication of sparse/dense and dtype) in DataFrame.
DataFrame.get_dtype_counts() Return counts of unique dtypes in this object.
DataFrame.get_ftype_counts() (DEPRECATED) Return counts of unique ftypes in this object.
DataFrame.select_dtypes([include, exclude]) Return a subset of the DataFrame’s columns based on the column dtypes.
DataFrame.values Return a NumPy representation of the DataFrame.
DataFrame.get_values() Return an ndarray after converting sparse values to dense.
DataFrame.axes Return a list representing the axes of the DataFrame.
DataFrame.ndim Return an int representing the number of axes / array dimensions.
DataFrame.size Return an int representing the number of elements in this object.
DataFrame.shape Return a tuple representing the dimensionality of the DataFrame.
DataFrame.memory_usage([index, deep]) Return the memory usage of each column in bytes.
DataFrame.empty Indicator whether DataFrame is empty.
DataFrame.is_copy Return the copy.

Conversion

DataFrame.astype(dtype[, copy, errors]) Cast a pandas object to a specified dtype.
DataFrame.convert_objects([convert_dates, …]) (DEPRECATED) Attempt to infer better dtype for object columns.
DataFrame.infer_objects() Attempt to infer better dtypes for object columns.
DataFrame.copy([deep]) Make a copy of this object’s indices and data.
DataFrame.isna() Detect missing values.
DataFrame.notna() Detect existing (non-missing) values.
DataFrame.bool() Return the bool of a single element PandasObject.
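
The conversion methods above can be sketched on a small frame. The data is illustrative and the variable names are assumptions.

```python
import pandas as pd

df = pd.DataFrame({"a": ["1", "2", "3"], "b": [1.0, None, 3.0]})

# astype accepts a single dtype or a {column: dtype} mapping.
converted = df.astype({"a": "int64"})

# isna / notna return boolean frames of the same shape.
missing = df.isna()

# infer_objects tries to find a better dtype for object columns.
raw = pd.DataFrame({"x": [1, 2, 3]}, dtype=object)
inferred = raw.infer_objects()
```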

Indexing, iteration

DataFrame.head([n]) Return the first n rows.
DataFrame.at Access a single value for a row/column label pair.
DataFrame.iat Access a single value for a row/column pair by integer position.
DataFrame.loc Access a group of rows and columns by label(s) or a boolean array.
DataFrame.iloc Purely integer-location based indexing for selection by position.
DataFrame.insert(loc, column, value[, …]) Insert column into DataFrame at specified location.
DataFrame.__iter__() Iterate over info axis.
DataFrame.items() Iterator over (column name, Series) pairs.
DataFrame.keys() Get the ‘info axis’ (see Indexing for more)
DataFrame.iteritems() Iterator over (column name, Series) pairs.
DataFrame.iterrows() Iterate over DataFrame rows as (index, Series) pairs.
DataFrame.itertuples([index, name]) Iterate over DataFrame rows as namedtuples.
DataFrame.lookup(row_labels, col_labels) Label-based “fancy indexing” function for DataFrame.
DataFrame.pop(item) Return item and drop from frame.
DataFrame.tail([n]) Return the last n rows.
DataFrame.xs(key[, axis, level, drop_level]) Return cross-section from the Series/DataFrame.
DataFrame.get(key[, default]) Get item from object for given key (DataFrame column, Panel slice, etc.).
DataFrame.isin(values) Whether each element in the DataFrame is contained in values.
DataFrame.where(cond[, other, inplace, …]) Replace values where the condition is False.
DataFrame.mask(cond[, other, inplace, axis, …]) Replace values where the condition is True.
DataFrame.query(expr[, inplace]) Query the columns of a DataFrame with a boolean expression.

For more information on .at, .iat, .loc, and .iloc, see the indexing documentation.
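
As a rough sketch of how the four indexers differ, along with query and where (illustrative data; names are assumptions):

```python
import pandas as pd

df = pd.DataFrame(
    {"x": [10, 20, 30], "y": [1.5, 2.5, 3.5]},
    index=["a", "b", "c"],
)

by_label = df.at["b", "x"]       # single value, label-based
by_position = df.iat[2, 1]       # single value, position-based
rows = df.loc[["a", "c"], "x"]   # group of rows/columns by label
block = df.iloc[0:2, :]          # group of rows/columns by position

# query filters rows with a boolean expression over column names.
filtered = df.query("x > 10")

# where keeps values where the condition is True and inserts NaN elsewhere.
kept = df.where(df > 2)
```

.at and .iat are meant for scalar access, while .loc and .iloc also accept lists, slices and boolean arrays.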

Binary operator functions

DataFrame.add(other[, axis, level, fill_value]) Addition of dataframe and other, element-wise (binary operator add).
DataFrame.sub(other[, axis, level, fill_value]) Subtraction of dataframe and other, element-wise (binary operator sub).
DataFrame.mul(other[, axis, level, fill_value]) Multiplication of dataframe and other, element-wise (binary operator mul).
DataFrame.div(other[, axis, level, fill_value]) Floating division of dataframe and other, element-wise (binary operator truediv).
DataFrame.truediv(other[, axis, level, …]) Floating division of dataframe and other, element-wise (binary operator truediv).
DataFrame.floordiv(other[, axis, level, …]) Integer division of dataframe and other, element-wise (binary operator floordiv).
DataFrame.mod(other[, axis, level, fill_value]) Modulo of dataframe and other, element-wise (binary operator mod).
DataFrame.pow(other[, axis, level, fill_value]) Exponential power of dataframe and other, element-wise (binary operator pow).
DataFrame.dot(other) Matrix multiplication with DataFrame or Series objects.
DataFrame.radd(other[, axis, level, fill_value]) Addition of dataframe and other, element-wise (binary operator radd).
DataFrame.rsub(other[, axis, level, fill_value]) Subtraction of dataframe and other, element-wise (binary operator rsub).
DataFrame.rmul(other[, axis, level, fill_value]) Multiplication of dataframe and other, element-wise (binary operator rmul).
DataFrame.rdiv(other[, axis, level, fill_value]) Floating division of dataframe and other, element-wise (binary operator rtruediv).
DataFrame.rtruediv(other[, axis, level, …]) Floating division of dataframe and other, element-wise (binary operator rtruediv).
DataFrame.rfloordiv(other[, axis, level, …]) Integer division of dataframe and other, element-wise (binary operator rfloordiv).
DataFrame.rmod(other[, axis, level, fill_value]) Modulo of dataframe and other, element-wise (binary operator rmod).
DataFrame.rpow(other[, axis, level, fill_value]) Exponential power of dataframe and other, element-wise (binary operator rpow).
DataFrame.lt(other[, axis, level]) Less than of dataframe and other, element-wise (binary operator lt).
DataFrame.gt(other[, axis, level]) Greater than of dataframe and other, element-wise (binary operator gt).
DataFrame.le(other[, axis, level]) Less than or equal to of dataframe and other, element-wise (binary operator le).
DataFrame.ge(other[, axis, level]) Greater than or equal to of dataframe and other, element-wise (binary operator ge).
DataFrame.ne(other[, axis, level]) Not equal to of dataframe and other, element-wise (binary operator ne).
DataFrame.eq(other[, axis, level]) Equal to of dataframe and other, element-wise (binary operator eq).
DataFrame.combine(other, func[, fill_value, …]) Perform column-wise combine with another DataFrame based on a passed function.
DataFrame.combine_first(other) Update null elements with value in the same location in other.
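
The method forms of the operators matter mostly because of their extra arguments. A minimal sketch with made-up frames, showing fill_value, a reflected operator and combine_first:

```python
import pandas as pd

left = pd.DataFrame({"a": [1.0, 2.0]}, index=[0, 1])
right = pd.DataFrame({"a": [10.0, 20.0]}, index=[1, 2])

# The + operator aligns on labels and yields NaN where only one side has a value.
plain = left + right

# add() with fill_value substitutes 0 for the missing side before adding.
filled = left.add(right, fill_value=0)

# Reflected variants swap the operands: rsub(1) computes 1 - left.
one_minus = left.rsub(1)

# combine_first fills nulls in the caller from the same location in other.
patched = pd.DataFrame({"a": [None, 2.0]}).combine_first(
    pd.DataFrame({"a": [5.0, 9.0]})
)
```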

Function application, GroupBy & Window

DataFrame.apply(func[, axis, broadcast, …]) Apply a function along an axis of the DataFrame.
DataFrame.applymap(func) Apply a function to a DataFrame elementwise.
DataFrame.pipe(func, *args, **kwargs) Apply func(self, *args, **kwargs).
DataFrame.agg(func[, axis]) Aggregate using one or more operations over the specified axis.
DataFrame.aggregate(func[, axis]) Aggregate using one or more operations over the specified axis.
DataFrame.transform(func[, axis]) Call func on self producing a DataFrame with transformed values and that has the same axis length as self.
DataFrame.groupby([by, axis, level, …]) Group DataFrame or Series using a mapper or by a Series of columns.
DataFrame.rolling(window[, min_periods, …]) Provides rolling window calculations.
DataFrame.expanding([min_periods, center, axis]) Provides expanding transformations.
DataFrame.ewm([com, span, halflife, alpha, …]) Provides exponential weighted functions.
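
A short sketch of apply, agg, groupby and the window methods on illustrative data (names are assumptions):

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "b", "a", "b"], "val": [1, 2, 3, 4]})
numeric = df[["val"]]

# apply calls the function once per column (axis=0 by default).
col_range = numeric.apply(lambda col: col.max() - col.min())

# agg with a list of names returns one row per aggregation.
summary = numeric.agg(["sum", "mean"])

# groupby splits on the key, then aggregates each group.
group_sum = df.groupby("key")["val"].sum()

# rolling uses a fixed-size window; expanding grows from the first row.
rolling_mean = numeric.rolling(window=2).mean()
expanding_sum = numeric.expanding().sum()
```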

Computations / Descriptive Stats

DataFrame.abs() Return a Series/DataFrame with absolute numeric value of each element.
DataFrame.all([axis, bool_only, skipna, level]) Return whether all elements are True, potentially over an axis.
DataFrame.any([axis, bool_only, skipna, level]) Return whether any element is True, potentially over an axis.
DataFrame.clip([lower, upper, axis, inplace]) Trim values at input threshold(s).
DataFrame.clip_lower(threshold[, axis, inplace]) (DEPRECATED) Trim values below a given threshold.
DataFrame.clip_upper(threshold[, axis, inplace]) (DEPRECATED) Trim values above a given threshold.
DataFrame.compound([axis, skipna, level]) Return the compound percentage of the values for the requested axis.
DataFrame.corr([method, min_periods]) Compute pairwise correlation of columns, excluding NA/null values.
DataFrame.corrwith(other[, axis, drop]) Compute pairwise correlation between rows or columns of two DataFrame objects.
DataFrame.count([axis, level, numeric_only]) Count non-NA cells for each column or row.
DataFrame.cov([min_periods]) Compute pairwise covariance of columns, excluding NA/null values.
DataFrame.cummax([axis, skipna]) Return cumulative maximum over a DataFrame or Series axis.
DataFrame.cummin([axis, skipna]) Return cumulative minimum over a DataFrame or Series axis.
DataFrame.cumprod([axis, skipna]) Return cumulative product over a DataFrame or Series axis.
DataFrame.cumsum([axis, skipna]) Return cumulative sum over a DataFrame or Series axis.
DataFrame.describe([percentiles, include, …]) Generate descriptive statistics that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values.
DataFrame.diff([periods, axis]) First discrete difference of element.
DataFrame.eval(expr[, inplace]) Evaluate a string describing operations on DataFrame columns.
DataFrame.kurt([axis, skipna, level, …]) Return unbiased kurtosis over requested axis using Fisher’s definition of kurtosis (kurtosis of normal == 0.0).
DataFrame.kurtosis([axis, skipna, level, …]) Return unbiased kurtosis over requested axis using Fisher’s definition of kurtosis (kurtosis of normal == 0.0).
DataFrame.mad([axis, skipna, level]) Return the mean absolute deviation of the values for the requested axis.
DataFrame.max([axis, skipna, level, …]) This method returns the maximum of the values in the object.
DataFrame.mean([axis, skipna, level, …]) Return the mean of the values for the requested axis.
DataFrame.median([axis, skipna, level, …]) Return the median of the values for the requested axis.
DataFrame.min([axis, skipna, level, …]) This method returns the minimum of the values in the object.
DataFrame.mode([axis, numeric_only, dropna]) Get the mode(s) of each element along the selected axis.
DataFrame.pct_change([periods, fill_method, …]) Percentage change between the current and a prior element.
DataFrame.prod([axis, skipna, level, …]) Return the product of the values for the requested axis.
DataFrame.product([axis, skipna, level, …]) Return the product of the values for the requested axis.
DataFrame.quantile([q, axis, numeric_only, …]) Return values at the given quantile over requested axis.
DataFrame.rank([axis, method, numeric_only, …]) Compute numerical data ranks (1 through n) along axis.
DataFrame.round([decimals]) Round a DataFrame to a variable number of decimal places.
DataFrame.sem([axis, skipna, level, ddof, …]) Return unbiased standard error of the mean over requested axis.
DataFrame.skew([axis, skipna, level, …]) Return unbiased skew over requested axis, normalized by N-1.
DataFrame.sum([axis, skipna, level, …]) Return the sum of the values for the requested axis.
DataFrame.std([axis, skipna, level, ddof, …]) Return sample standard deviation over requested axis.
DataFrame.var([axis, skipna, level, ddof, …]) Return unbiased variance over requested axis.
DataFrame.nunique([axis, dropna]) Count distinct observations over requested axis.
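
A minimal sketch of a few of the statistics above, including the effect of axis and ddof (illustrative data; names are assumptions):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [4.0, 3.0, 2.0, 1.0]})

col_means = df.mean()          # one value per column (axis=0)
row_sums = df.sum(axis=1)      # one value per row
clipped = df.clip(lower=2, upper=3)
running = df.cumsum()

# var defaults to ddof=1; ddof=0 gives the population variance.
sample_var = df.var()
population_var = df.var(ddof=0)

# corr computes pairwise correlation between columns.
corr = df.corr()
```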

Reindexing / Selection / Label manipulation

DataFrame.add_prefix(prefix) Prefix labels with string prefix.
DataFrame.add_suffix(suffix) Suffix labels with string suffix.
DataFrame.align(other[, join, axis, level, …]) Align two objects on their axes with the specified join method for each axis Index.
DataFrame.at_time(time[, asof, axis]) Select values at particular time of day (e.g., 9:30 AM).
DataFrame.between_time(start_time, end_time) Select values between particular times of the day (e.g., 9:00-9:30 AM).
DataFrame.drop([labels, axis, index, …]) Drop specified labels from rows or columns.
DataFrame.drop_duplicates([subset, keep, …]) Return DataFrame with duplicate rows removed, optionally only considering certain columns.
DataFrame.duplicated([subset, keep]) Return boolean Series denoting duplicate rows, optionally only considering certain columns.
DataFrame.equals(other) Test whether two objects contain the same elements.
DataFrame.filter([items, like, regex, axis]) Subset rows or columns of dataframe according to labels in the specified index.
DataFrame.first(offset) Convenience method for subsetting initial periods of time series data based on a date offset.
DataFrame.head([n]) Return the first n rows.
DataFrame.idxmax([axis, skipna]) Return index of first occurrence of maximum over requested axis.
DataFrame.idxmin([axis, skipna]) Return index of first occurrence of minimum over requested axis.
DataFrame.last(offset) Convenience method for subsetting final periods of time series data based on a date offset.
DataFrame.reindex([labels, index, columns, …]) Conform DataFrame to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index.
DataFrame.reindex_axis(labels[, axis, …]) (DEPRECATED) Conform input object to new index.
DataFrame.reindex_like(other[, method, …]) Return an object with matching indices as other object.
DataFrame.rename([mapper, index, columns, …]) Alter axes labels.
DataFrame.rename_axis([mapper, index, …]) Set the name of the axis for the index or columns.
DataFrame.reset_index([level, drop, …]) Reset the index, or a level of it.
DataFrame.sample([n, frac, replace, …]) Return a random sample of items from an axis of object.
DataFrame.select(crit[, axis]) (DEPRECATED) Return data corresponding to axis labels matching criteria.
DataFrame.set_axis(labels[, axis, inplace]) Assign desired index to given axis.
DataFrame.set_index(keys[, drop, append, …]) Set the DataFrame index using existing columns.
DataFrame.tail([n]) Return the last n rows.
DataFrame.take(indices[, axis, convert, is_copy]) Return the elements in the given positional indices along an axis.
DataFrame.truncate([before, after, axis, copy]) Truncate a Series or DataFrame before and after some index value.

Missing data handling

DataFrame.dropna([axis, how, thresh, …]) Remove missing values.
DataFrame.fillna([value, method, axis, …]) Fill NA/NaN values using the specified method.
DataFrame.replace([to_replace, value, …]) Replace values given in to_replace with value.
DataFrame.interpolate([method, axis, limit, …]) Interpolate values according to different methods.
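
The four missing-data methods can be compared side by side in a small sketch (illustrative data; names are assumptions):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, 6.0]})

# dropna removes every row that contains at least one missing value.
dropped = df.dropna()

# fillna replaces missing values with a constant.
filled = df.fillna(0)

# interpolate (linear by default) fills gaps between known values;
# a leading NaN has no earlier neighbour and is left as-is.
interpolated = df.interpolate()

# replace substitutes one value for another.
replaced = df.replace(3.0, 30.0)
```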

Reshaping, sorting, transposing

DataFrame.droplevel(level[, axis]) Return DataFrame with requested index / column level(s) removed.
DataFrame.pivot([index, columns, values]) Return reshaped DataFrame organized by given index / column values.
DataFrame.pivot_table([values, index, …]) Create a spreadsheet-style pivot table as a DataFrame.
DataFrame.reorder_levels(order[, axis]) Rearrange index levels using input order.
DataFrame.sort_values(by[, axis, ascending, …]) Sort by the values along either axis.
DataFrame.sort_index([axis, level, …]) Sort object by labels (along an axis).
DataFrame.nlargest(n, columns[, keep]) Return the first n rows ordered by columns in descending order.
DataFrame.nsmallest(n, columns[, keep]) Return the first n rows ordered by columns in ascending order.
DataFrame.swaplevel([i, j, axis]) Swap levels i and j in a MultiIndex on a particular axis.
DataFrame.stack([level, dropna]) Stack the prescribed level(s) from columns to index.
DataFrame.unstack([level, fill_value]) Pivot a level of the (necessarily hierarchical) index labels, returning a DataFrame having a new level of column labels whose inner-most level consists of the pivoted index labels.
DataFrame.swapaxes(axis1, axis2[, copy]) Interchange axes and swap values axes appropriately.
DataFrame.melt([id_vars, value_vars, …]) Unpivots a DataFrame from wide format to long format, optionally leaving identifier variables set.
DataFrame.squeeze([axis]) Squeeze 1 dimensional axis objects into scalars.
DataFrame.to_panel() (DEPRECATED) Transform long (stacked) format (DataFrame) into wide (3D, Panel) format.
DataFrame.to_xarray() Return an xarray object from the pandas object.
DataFrame.T Transpose index and columns.
DataFrame.transpose(*args, **kwargs) Transpose index and columns.
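
A rough sketch of moving between long and wide layouts with pivot, melt and unstack, plus nlargest (illustrative data; the column names are assumptions):

```python
import pandas as pd

long = pd.DataFrame(
    {
        "date": ["d1", "d1", "d2", "d2"],
        "city": ["X", "Y", "X", "Y"],
        "temp": [10, 20, 11, 21],
    }
)

# pivot: one row per date, one column per city.
wide = long.pivot(index="date", columns="city", values="temp")

# melt: back to one row per (date, city) pair.
back = wide.reset_index().melt(id_vars="date", var_name="city", value_name="temp")

# unstack: move an index level into the columns.
indexed = long.set_index(["date", "city"])["temp"]
unstacked = indexed.unstack("city")

# nlargest: the first n rows ordered by a column, descending.
top = long.nlargest(2, "temp")
```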

Combining / joining / merging

DataFrame.append(other[, ignore_index, …]) Append rows of other to the end of caller, returning a new object.
DataFrame.assign(**kwargs) Assign new columns to a DataFrame.
DataFrame.join(other[, on, how, lsuffix, …]) Join columns of another DataFrame.
DataFrame.merge(right[, how, on, left_on, …]) Merge DataFrame or named Series objects with a database-style join.
DataFrame.update(other[, join, overwrite, …]) Modify in place using non-NA values from another DataFrame.
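
As a minimal sketch of merge, join, assign and update on two small made-up frames (names are assumptions):

```python
import pandas as pd

left = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
right = pd.DataFrame({"id": [2, 3, 4], "score": [0.2, 0.3, 0.4]})

# merge performs a database-style join on a column.
inner = left.merge(right, on="id", how="inner")
outer = left.merge(right, on="id", how="outer")

# join aligns on the index instead.
joined = left.set_index("id").join(right.set_index("id"), how="left")

# assign returns a new frame with the extra column.
flagged = left.assign(is_b=lambda d: d["name"] == "b")

# update modifies the caller in place, using only non-NA values from other.
base = pd.DataFrame({"v": [1.0, 2.0]})
base.update(pd.DataFrame({"v": [float("nan"), 9.0]}))
```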

Time series-related

DataFrame.asfreq(freq[, method, how, …]) Convert TimeSeries to specified frequency.
DataFrame.asof(where[, subset]) Return the last row(s) without any NaNs before where.
DataFrame.shift([periods, freq, axis]) Shift index by desired number of periods with an optional time freq.
DataFrame.slice_shift([periods, axis]) Equivalent to shift without copying data.
DataFrame.tshift([periods, freq, axis]) Shift the time index, using the index’s frequency if available.
DataFrame.first_valid_index() Return index for first non-NA/null value.
DataFrame.last_valid_index() Return index for last non-NA/null value.
DataFrame.resample(rule[, how, axis, …]) Resample time-series data.
DataFrame.to_period([freq, axis, copy]) Convert DataFrame from DatetimeIndex to PeriodIndex with desired frequency (inferred from index if not passed).
DataFrame.to_timestamp([freq, how, axis, copy]) Cast to DatetimeIndex of timestamps, at beginning of period.
DataFrame.tz_convert(tz[, axis, level, copy]) Convert tz-aware axis to target time zone.
DataFrame.tz_localize(tz[, axis, level, …]) Localize tz-naive TimeSeries to target time zone.
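
A short sketch of shift, resample and the time zone methods on a daily index. The dates and the target time zone are illustrative assumptions.

```python
import pandas as pd

idx = pd.date_range("2019-01-01", periods=4, freq="D")
df = pd.DataFrame({"v": [1, 2, 3, 4]}, index=idx)

# Without freq, shift moves the values and leaves the index alone.
lagged = df.shift(1)

# With freq, shift moves the index and leaves the values alone.
moved = df.shift(1, freq="D")

# resample groups rows into time bins before aggregating.
two_day = df.resample("2D").sum()

# tz_localize attaches a zone to a naive index; tz_convert changes the zone.
utc = df.tz_localize("UTC")
new_york = utc.tz_convert("America/New_York")
```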

Plotting

DataFrame.plot is both a callable method and a namespace attribute for specific plotting methods of the form DataFrame.plot.<kind>.

DataFrame.plot([x, y, kind, ax, ….]) DataFrame plotting accessor and method
DataFrame.plot.area([x, y]) Draw a stacked area plot.
DataFrame.plot.bar([x, y]) Vertical bar plot.
DataFrame.plot.barh([x, y]) Make a horizontal bar plot.
DataFrame.plot.box([by]) Make a box plot of the DataFrame columns.
DataFrame.plot.density([bw_method, ind]) Generate Kernel Density Estimate plot using Gaussian kernels.
DataFrame.plot.hexbin(x, y[, C, …]) Generate a hexagonal binning plot.
DataFrame.plot.hist([by, bins]) Draw one histogram of the DataFrame’s columns.
DataFrame.plot.kde([bw_method, ind]) Generate Kernel Density Estimate plot using Gaussian kernels.
DataFrame.plot.line([x, y]) Plot DataFrame columns as lines.
DataFrame.plot.pie([y]) Generate a pie plot.
DataFrame.plot.scatter(x, y[, s, c]) Create a scatter plot with varying marker point size and color.
DataFrame.boxplot([column, by, ax, …]) Make a box plot from DataFrame columns.
DataFrame.hist([column, by, grid, …]) Make a histogram of the DataFrame’s columns.

Serialization / IO / Conversion

DataFrame.from_csv(path[, header, sep, …]) (DEPRECATED) Read CSV file.
DataFrame.from_dict(data[, orient, dtype, …]) Construct DataFrame from dict of array-like or dicts.
DataFrame.from_items(items[, columns, orient]) (DEPRECATED) Construct a DataFrame from a list of tuples.
DataFrame.from_records(data[, index, …]) Convert structured or record ndarray to DataFrame.
DataFrame.info([verbose, buf, max_cols, …]) Print a concise summary of a DataFrame.
DataFrame.to_parquet(fname[, engine, …]) Write a DataFrame to the binary parquet format.
DataFrame.to_pickle(path[, compression, …]) Pickle (serialize) object to file.
DataFrame.to_csv([path_or_buf, sep, na_rep, …]) Write object to a comma-separated values (csv) file.
DataFrame.to_hdf(path_or_buf, key, **kwargs) Write the contained data to an HDF5 file using HDFStore.
DataFrame.to_sql(name, con[, schema, …]) Write records stored in a DataFrame to a SQL database.
DataFrame.to_dict([orient, into]) Convert the DataFrame to a dictionary.
DataFrame.to_excel(excel_writer[, …]) Write object to an Excel sheet.
DataFrame.to_json([path_or_buf, orient, …]) Convert the object to a JSON string.
DataFrame.to_html([buf, columns, col_space, …]) Render a DataFrame as an HTML table.
DataFrame.to_feather(fname) Write out the binary feather-format for DataFrames.
DataFrame.to_latex([buf, columns, …]) Render an object to a LaTeX tabular environment table.
DataFrame.to_stata(fname[, convert_dates, …]) Export DataFrame object to Stata dta format.
DataFrame.to_msgpack([path_or_buf, encoding]) Serialize object to input file path using msgpack format.
DataFrame.to_gbq(destination_table[, …]) Write a DataFrame to a Google BigQuery table.
DataFrame.to_records([index, convert_datetime64]) Convert DataFrame to a NumPy record array.
DataFrame.to_sparse([fill_value, kind]) Convert to SparseDataFrame.
DataFrame.to_dense() Return dense representation of NDFrame (as opposed to sparse).
DataFrame.to_string([buf, columns, …]) Render a DataFrame to a console-friendly tabular output.
DataFrame.to_clipboard([excel, sep]) Copy object to the system clipboard.
DataFrame.style Property returning a Styler object containing methods for building a styled HTML representation for the DataFrame.
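
A minimal sketch of round-tripping a frame through a few in-memory formats, using only methods that need no optional dependencies (illustrative data; names are assumptions):

```python
import io
import json

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

# to_dict with orient="records" gives one dict per row.
records = df.to_dict(orient="records")

# to_csv returns a string when no path or buffer is given.
csv_text = df.to_csv(index=False)
round_trip = pd.read_csv(io.StringIO(csv_text))

# to_json likewise returns a string when no path is given.
json_rows = json.loads(df.to_json(orient="records"))

# from_dict with orient="index" treats the outer keys as row labels.
from_dict = pd.DataFrame.from_dict({"r1": {"a": 1}, "r2": {"a": 2}}, orient="index")
```

Writers such as to_excel, to_parquet, to_hdf and to_feather follow the same calling pattern but rely on optional third-party packages.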

Sparse

SparseDataFrame.to_coo() Return the contents of the frame as a sparse SciPy COO matrix.

Panel

Constructor

Panel([data, items, major_axis, minor_axis, …]) (DEPRECATED) Represents wide format panel data, stored as 3-dimensional array.

Attributes and underlying data

Axes

  • items: axis 0; each item corresponds to a DataFrame contained inside
  • major_axis: axis 1; the index (rows) of each of the DataFrames
  • minor_axis: axis 2; the columns of each of the DataFrames

Panel.values Return a NumPy representation of the DataFrame.
Panel.axes Return index label(s) of the internal NDFrame
Panel.ndim Return an int representing the number of axes / array dimensions.
Panel.size Return an int representing the number of elements in this object.
Panel.shape Return a tuple of axis dimensions
Panel.dtypes Return the dtypes in the DataFrame.
Panel.ftypes Return the ftypes (indication of sparse/dense and dtype) in DataFrame.
Panel.get_dtype_counts() Return counts of unique dtypes in this object.
Panel.get_ftype_counts() (DEPRECATED) Return counts of unique ftypes in this object.

Conversion

Panel.astype(dtype[, copy, errors]) Cast a pandas object to a specified dtype.
Panel.copy([deep]) Make a copy of this object’s indices and data.
Panel.isna() Detect missing values.
Panel.notna() Detect existing (non-missing) values.

Getting and setting

Panel.get_value(*args, **kwargs) (DEPRECATED) Quickly retrieve single value at (item, major, minor) location.
Panel.set_value(*args, **kwargs) (DEPRECATED) Quickly set single value at (item, major, minor) location.

Indexing, iteration, slicing

Panel.at Access a single value for a row/column label pair.
Panel.iat Access a single value for a row/column pair by integer position.
Panel.loc Access a group of rows and columns by label(s) or a boolean array.
Panel.iloc Purely integer-location based indexing for selection by position.
Panel.__iter__() Iterate over info axis.
Panel.iteritems() Iterate over (label, values) on info axis
Panel.pop(item) Return item and drop from frame.
Panel.xs(key[, axis]) Return slice of panel along selected axis.
Panel.major_xs(key) Return slice of panel along major axis.
Panel.minor_xs(key) Return slice of panel along minor axis.

For more information on .at, .iat, .loc, and .iloc, see the indexing documentation.

Binary operator functions

Panel.add(other[, axis]) Addition of series and other, element-wise (binary operator add).
Panel.sub(other[, axis]) Subtraction of series and other, element-wise (binary operator sub).
Panel.mul(other[, axis]) Multiplication of series and other, element-wise (binary operator mul).
Panel.div(other[, axis]) Floating division of series and other, element-wise (binary operator truediv).
Panel.truediv(other[, axis]) Floating division of series and other, element-wise (binary operator truediv).
Panel.floordiv(other[, axis]) Integer division of series and other, element-wise (binary operator floordiv).
Panel.mod(other[, axis]) Modulo of series and other, element-wise (binary operator mod).
Panel.pow(other[, axis]) Exponential power of series and other, element-wise (binary operator pow).
Panel.radd(other[, axis]) Addition of series and other, element-wise (binary operator radd).
Panel.rsub(other[, axis]) Subtraction of series and other, element-wise (binary operator rsub).
Panel.rmul(other[, axis]) Multiplication of series and other, element-wise (binary operator rmul).
Panel.rdiv(other[, axis]) Floating division of series and other, element-wise (binary operator rtruediv).
Panel.rtruediv(other[, axis]) Floating division of series and other, element-wise (binary operator rtruediv).
Panel.rfloordiv(other[, axis]) Integer division of series and other, element-wise (binary operator rfloordiv).
Panel.rmod(other[, axis]) Modulo of series and other, element-wise (binary operator rmod).
Panel.rpow(other[, axis]) Exponential power of series and other, element-wise (binary operator rpow).
Panel.lt(other[, axis]) Wrapper for comparison method lt
Panel.gt(other[, axis]) Wrapper for comparison method gt
Panel.le(other[, axis]) Wrapper for comparison method le
Panel.ge(other[, axis]) Wrapper for comparison method ge
Panel.ne(other[, axis]) Wrapper for comparison method ne
Panel.eq(other[, axis]) Wrapper for comparison method eq

Function application, GroupBy

Panel.apply(func[, axis]) Applies function along axis (or axes) of the Panel.
Panel.groupby(function[, axis]) Group data on given axis, returning GroupBy object.

Computations / Descriptive Stats

Panel.abs() Return a Series/DataFrame with absolute numeric value of each element.
Panel.clip([lower, upper, axis, inplace]) Trim values at input threshold(s).
Panel.clip_lower(threshold[, axis, inplace]) (DEPRECATED) Trim values below a given threshold.
Panel.clip_upper(threshold[, axis, inplace]) (DEPRECATED) Trim values above a given threshold.
Panel.count([axis]) Return number of observations over requested axis.
Panel.cummax([axis, skipna]) Return cumulative maximum over a DataFrame or Series axis.
Panel.cummin([axis, skipna]) Return cumulative minimum over a DataFrame or Series axis.
Panel.cumprod([axis, skipna]) Return cumulative product over a DataFrame or Series axis.
Panel.cumsum([axis, skipna]) Return cumulative sum over a DataFrame or Series axis.
Panel.max([axis, skipna, level, numeric_only]) This method returns the maximum of the values in the object.
Panel.mean([axis, skipna, level, numeric_only]) Return the mean of the values for the requested axis.
Panel.median([axis, skipna, level, numeric_only]) Return the median of the values for the requested axis.
Panel.min([axis, skipna, level, numeric_only]) This method returns the minimum of the values in the object.
Panel.pct_change([periods, fill_method, …]) Percentage change between the current and a prior element.
Panel.prod([axis, skipna, level, …]) Return the product of the values for the requested axis.
Panel.sem([axis, skipna, level, ddof, …]) Return unbiased standard error of the mean over requested axis.
Panel.skew([axis, skipna, level, numeric_only]) Return unbiased skew over requested axis Normalized by N-1.
Panel.sum([axis, skipna, level, …]) Return the sum of the values for the requested axis.
Panel.std([axis, skipna, level, ddof, …]) Return sample standard deviation over requested axis.
Panel.var([axis, skipna, level, ddof, …]) Return unbiased variance over requested axis.

Reindexing / Selection / Label manipulation

Panel.add_prefix(prefix) Prefix labels with string prefix.
Panel.add_suffix(suffix) Suffix labels with string suffix.
Panel.drop([labels, axis, index, columns, …])
Panel.equals(other) Test whether two objects contain the same elements.
Panel.filter([items, like, regex, axis]) Subset rows or columns of dataframe according to labels in the specified index.
Panel.first(offset) Convenience method for subsetting initial periods of time series data based on a date offset.
Panel.last(offset) Convenience method for subsetting final periods of time series data based on a date offset.
Panel.reindex(*args, **kwargs) Conform Panel to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index.
Panel.reindex_axis(labels[, axis, method, …]) (DEPRECATED) Conform input object to new index.
Panel.reindex_like(other[, method, copy, …]) Return an object with matching indices as other object.
Panel.rename([items, major_axis, minor_axis]) Alter axes input function or functions.
Panel.sample([n, frac, replace, weights, …]) Return a random sample of items from an axis of object.
Panel.select(crit[, axis]) (DEPRECATED) Return data corresponding to axis labels matching criteria.
Panel.take(indices[, axis, convert, is_copy]) Return the elements in the given positional indices along an axis.
Panel.truncate([before, after, axis, copy]) Truncate a Series or DataFrame before and after some index value.

Missing data handling

Panel.dropna([axis, how, inplace]) Drop 2D from panel, holding passed axis constant.

Reshaping, sorting, transposing

Panel.sort_index([axis, level, ascending, …]) Sort object by labels (along an axis)
Panel.swaplevel([i, j, axis]) Swap levels i and j in a MultiIndex on a particular axis
Panel.transpose(*args, **kwargs) Permute the dimensions of the Panel
Panel.swapaxes(axis1, axis2[, copy]) Interchange axes and swap values axes appropriately.
Panel.conform(frame[, axis]) Conform input DataFrame to align with chosen axis pair.

Combining / joining / merging

Panel.join(other[, how, lsuffix, rsuffix]) Join items with other Panel on the major and minor axes.
Panel.update(other[, join, overwrite, …]) Modify Panel in place using non-NA values from other Panel.

Time series-related

Panel.asfreq(freq[, method, how, normalize, …]) Convert TimeSeries to specified frequency.
Panel.shift([periods, freq, axis]) Shift index by desired number of periods with an optional time freq.
Panel.resample(rule[, how, axis, …]) Resample time-series data.
Panel.tz_convert(tz[, axis, level, copy]) Convert tz-aware axis to target time zone.
Panel.tz_localize(tz[, axis, level, copy, …]) Localize tz-naive TimeSeries to target time zone.

Serialization / IO / Conversion

Panel.from_dict(data[, intersect, orient, dtype]) Construct Panel from dict of DataFrame objects.
Panel.to_pickle(path[, compression, protocol]) Pickle (serialize) object to file.
Panel.to_excel(path[, na_rep, engine]) Write each DataFrame in Panel to a separate excel sheet.
Panel.to_hdf(path_or_buf, key, **kwargs) Write the contained data to an HDF5 file using HDFStore.
Panel.to_sparse(*args, **kwargs) NOT IMPLEMENTED: do not call this method, as sparsifying is not supported for Panel objects and will raise an error.
Panel.to_frame([filter_observations]) Transform wide format into long (stacked) format as DataFrame whose columns are the Panel’s items and whose index is a MultiIndex formed of the Panel’s major and minor axes.
Panel.to_clipboard([excel, sep]) Copy object to the system clipboard.

Index

Many of these methods or variants thereof are available on the objects that contain an index (Series/DataFrame) and those should most likely be used before calling these methods directly.

Index Immutable ndarray implementing an ordered, sliceable set.

Attributes

Index.values Return an array representing the data in the Index.
Index.is_monotonic Alias for is_monotonic_increasing.
Index.is_monotonic_increasing Return whether the index values are monotonically increasing (each value is equal to or greater than the previous one).
Index.is_monotonic_decreasing Return whether the index values are monotonically decreasing (each value is equal to or less than the previous one).
Index.is_unique Return whether the index has unique values.
Index.has_duplicates
Index.hasnans Return whether the index contains any NaN values; enables various performance speedups.
Index.dtype Return the dtype object of the underlying data.
Index.dtype_str Return the dtype str of the underlying data.
Index.inferred_type Return a string of the type inferred from the values.
Index.is_all_dates
Index.shape Return a tuple of the shape of the underlying data.
Index.name
Index.names
Index.nbytes Return the number of bytes in the underlying data.
Index.ndim Number of dimensions of the underlying data, by definition 1.
Index.size Return the number of elements in the underlying data.
Index.empty
Index.strides Return the strides of the underlying data.
Index.itemsize Return the size of the dtype of the item of the underlying data.
Index.base Return the base object if the memory of the underlying data is shared.
Index.T Return the transpose, which is by definition self.
Index.memory_usage([deep]) Memory usage of the values
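
The attributes above can be inspected directly on any Index. The following is a minimal sketch, assuming a reasonably recent pandas; the sample values and the name "id" are illustrative only.

```python
import pandas as pd

# Illustrative values only; the duplicate 2 is included on purpose.
idx = pd.Index([1, 2, 2, 5], name="id")

print(idx.is_monotonic_increasing)        # equal neighbours still count as increasing
print(idx.is_monotonic_decreasing)
print(idx.is_unique, idx.has_duplicates)  # the two flags are complements
print(idx.hasnans)
print(idx.shape, idx.size, idx.ndim)
print(idx.dtype, idx.inferred_type)
```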

Modifying and Computations

Index.all(*args, **kwargs) Return whether all elements are True.
Index.any(*args, **kwargs) Return whether any element is True.
Index.argmin([axis]) Return a ndarray of the minimum argument indexer.
Index.argmax([axis]) Return a ndarray of the maximum argument indexer.
Index.copy([name, deep, dtype]) Make a copy of this object.
Index.delete(loc) Make new Index with passed location(-s) deleted.
Index.drop(labels[, errors]) Make new Index with passed list of labels deleted.
Index.drop_duplicates([keep]) Return Index with duplicate values removed.
Index.duplicated([keep]) Indicate duplicate index values.
Index.equals(other) Determines if two Index objects contain the same elements.
Index.factorize([sort, na_sentinel]) Encode the object as an enumerated type or categorical variable.
Index.identical(other) Similar to equals, but check that other comparable attributes are also equal.
Index.insert(loc, item) Make new Index inserting new item at location.
Index.is_(other) More flexible, faster check like is but that works through views.
Index.is_boolean()
Index.is_categorical() Check if the Index holds categorical data.
Index.is_floating()
Index.is_integer()
Index.is_interval()
Index.is_mixed()
Index.is_numeric()
Index.is_object()
Index.min() Return the minimum value of the Index.
Index.max() Return the maximum value of the Index.
Index.reindex(target[, method, level, …]) Create index with target’s values (move/add/delete values as necessary).
Index.rename(name[, inplace]) Alter Index or MultiIndex name.
Index.repeat(repeats, *args, **kwargs) Repeat elements of an Index.
Index.where(cond[, other]) Return an Index of same shape as self and whose corresponding entries are from self where cond is True and otherwise are from other.
Index.take(indices[, axis, allow_fill, …]) Return a new Index of the values selected by the indices.
Index.putmask(mask, value) Return a new Index of the values set with the mask.
Index.unique([level]) Return unique values in the index.
Index.nunique([dropna]) Return number of unique elements in the object.
Index.value_counts([normalize, sort, …]) Return a Series containing counts of unique values.
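
Because an Index is immutable, the modifying methods listed above return new objects rather than changing the original. A short sketch with made-up labels:

```python
import pandas as pd

idx = pd.Index(["a", "b", "c", "b"])

dropped = idx.drop(["a"])        # remove by label
deleted = idx.delete(0)          # remove by position
inserted = idx.insert(1, "z")    # new Index with "z" at position 1
deduped = idx.drop_duplicates()  # keeps the first occurrence by default
dup_mask = idx.duplicated()      # boolean array marking repeated values

print(list(inserted))
print(list(deduped), idx.nunique())
print(list(idx))                 # the original Index is unchanged
```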

Compatibility with MultiIndex

Index.set_names(names[, level, inplace]) Set Index or MultiIndex name.
Index.is_lexsorted_for_tuple(tup)
Index.droplevel([level]) Return index with requested level(s) removed.

Missing Values

Index.fillna([value, downcast]) Fill NA/NaN values with the specified value
Index.dropna([how]) Return Index without NA/NaN values
Index.isna() Detect missing values.
Index.notna() Detect existing (non-missing) values.
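
A minimal sketch of the missing-value helpers on a float Index; the values are illustrative only.

```python
import numpy as np
import pandas as pd

idx = pd.Index([1.0, np.nan, 3.0])

missing = idx.isna()      # boolean array, True where the value is NaN
present = idx.notna()     # the complement of isna()
filled = idx.fillna(0.0)  # new Index with NaN replaced by 0.0
dropped = idx.dropna()    # new Index without the NaN entry

print(list(missing), list(filled), list(dropped))
```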

Conversion

Index.astype(dtype[, copy]) Create an Index with values cast to dtypes.
Index.item() Return the first element of the underlying data as a python scalar.
Index.map(mapper[, na_action]) Map values using input correspondence (a dict, Series, or function).
Index.ravel([order]) Return an ndarray of the flattened values of the underlying data.
Index.to_list() Return a list of the values.
Index.to_native_types([slicer]) Format specified values of self and return them.
Index.to_series([index, name]) Create a Series with both index and values equal to the index keys; useful with map for returning an indexer based on an index.
Index.to_frame([index, name]) Create a DataFrame with a column containing the Index.
Index.view([cls])

Sorting

Index.argsort(*args, **kwargs) Return the integer indices that would sort the index.
Index.searchsorted(value[, side, sorter]) Find indices where elements should be inserted to maintain order.
Index.sort_values([return_indexer, ascending]) Return a sorted copy of the index.

Time-specific operations

Index.shift([periods, freq]) Shift index by desired number of time frequency increments.

Combining / joining / set operations

Index.append(other) Append a collection of Index objects together.
Index.join(other[, how, level, …]) Compute join_index and indexers to conform data structures to the new index.
Index.intersection(other) Form the intersection of two Index objects.
Index.union(other) Form the union of two Index objects and sorts if possible.
Index.difference(other[, sort]) Return a new Index with elements from the index that are not in other.
Index.symmetric_difference(other[, result_name]) Compute the symmetric difference of two Index objects.
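
The set operations above behave like their mathematical counterparts, while append simply concatenates. A small sketch with two overlapping integer indexes (values chosen for illustration):

```python
import pandas as pd

left = pd.Index([1, 2, 3, 4])
right = pd.Index([3, 4, 5, 6])

both = left.intersection(right)           # labels present in both
either = left.union(right)                # labels present in at least one
only_left = left.difference(right)        # labels in left but not in right
one_side = left.symmetric_difference(right)  # labels in exactly one of the two
stacked = left.append(right)              # plain concatenation, duplicates kept

print(list(both), list(either), list(only_left), list(one_side))
print(list(stacked))
```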

Selecting

Index.asof(label) Return the label from the index, or, if not present, the previous one.
Index.asof_locs(where, mask) Finds the locations (indices) of the labels from the index for every entry in the where argument.
Index.contains(key) Return a boolean indicating whether the provided key is in the index.
Index.get_duplicates() (DEPRECATED) Extract duplicated index elements.
Index.get_indexer(target[, method, limit, …]) Compute indexer and mask for new index given the current index.
Index.get_indexer_for(target, **kwargs) Guaranteed return of an indexer even when non-unique.
Index.get_indexer_non_unique(target) Compute indexer and mask for new index given the current index.
Index.get_level_values(level) Return an Index of values for requested level.
Index.get_loc(key[, method, tolerance]) Get integer location, slice or boolean mask for requested label.
Index.get_slice_bound(label, side, kind) Calculate slice bound that corresponds to given label.
Index.get_value(series, key) Fast lookup of value from 1-dimensional ndarray.
Index.get_values() Return Index data as an numpy.ndarray.
Index.set_value(arr, key, value) Fast assignment of a value in a 1-dimensional ndarray.
Index.isin(values[, level]) Return a boolean array where the index values are in values.
Index.slice_indexer([start, end, step, kind]) For an ordered or unique index, compute the slice indexer for input labels and step.
Index.slice_locs([start, end, step, kind]) Compute slice locations for input labels.
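
A minimal sketch of label-to-position lookups on a unique, sorted Index. It uses only get_loc, get_indexer, isin and slice_locs; the labels are made up.

```python
import pandas as pd

idx = pd.Index(["a", "b", "c", "d"])

pos = idx.get_loc("c")                         # integer position of one label
positions = idx.get_indexer(["b", "x", "d"])   # -1 marks labels that are absent
member = idx.isin(["a", "d"])                  # boolean membership array
start, stop = idx.slice_locs("b", "c")         # positional bounds for a label slice

print(pos, list(positions), list(member))
print(list(idx[start:stop]))                   # both end labels are included
```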

Numeric Index

RangeIndex Immutable Index implementing a monotonic integer range.
Int64Index Immutable ndarray implementing an ordered, sliceable set.
UInt64Index Immutable ndarray implementing an ordered, sliceable set.
Float64Index Immutable ndarray implementing an ordered, sliceable set.
RangeIndex.from_range(data[, name, dtype]) Create RangeIndex from a range (py3), or xrange (py2) object.

CategoricalIndex

CategoricalIndex Immutable Index implementing an ordered, sliceable set.

Categorical Components

CategoricalIndex.codes
CategoricalIndex.categories
CategoricalIndex.ordered
CategoricalIndex.rename_categories(*args, …) Renames categories.
CategoricalIndex.reorder_categories(*args, …) Reorders categories as specified in new_categories.
CategoricalIndex.add_categories(*args, **kwargs) Add new categories.
CategoricalIndex.remove_categories(*args, …) Removes the specified categories.
CategoricalIndex.remove_unused_categories(…) Removes categories which are not used.
CategoricalIndex.set_categories(*args, **kwargs) Sets the categories to the specified new_categories.
CategoricalIndex.as_ordered(*args, **kwargs) Set the Categorical to be ordered.
CategoricalIndex.as_unordered(*args, **kwargs) Set the Categorical to be unordered.
CategoricalIndex.map(mapper) Map values using input correspondence (a dict, Series, or function).
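
The categorical components can be sketched as follows. The category names are illustrative; every method shown returns a new CategoricalIndex.

```python
import pandas as pd

ci = pd.CategoricalIndex(
    ["low", "high", "low"], categories=["low", "mid", "high"]
)

print(list(ci.codes))       # positions of each value within ci.categories
print(list(ci.categories), ci.ordered)

trimmed = ci.remove_unused_categories()      # "mid" never occurs, so it is dropped
ordered = ci.as_ordered()                    # same values, ordered flag set
renamed = ci.rename_categories({"low": "L"}) # dict maps old names to new names

print(list(trimmed.categories), ordered.ordered, list(renamed))
```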

IntervalIndex

IntervalIndex Immutable index of intervals that are closed on the same side.

IntervalIndex Components

IntervalIndex.from_arrays(left, right[, …]) Construct from two arrays defining the left and right bounds.
IntervalIndex.from_tuples(data[, closed, …]) Construct an IntervalIndex from an array-like of tuples
IntervalIndex.from_breaks(breaks[, closed, …]) Construct an IntervalIndex from an array of splits.
IntervalIndex.contains(key) Return a boolean indicating if the key is IN the index
IntervalIndex.left Return the left endpoints of each Interval in the IntervalIndex as an Index
IntervalIndex.right Return the right endpoints of each Interval in the IntervalIndex as an Index
IntervalIndex.mid Return the midpoint of each Interval in the IntervalIndex as an Index
IntervalIndex.closed Whether the intervals are closed on the left-side, right-side, both or neither
IntervalIndex.length Return an Index with entries denoting the length of each Interval in the IntervalIndex
IntervalIndex.values Return the IntervalIndex’s data as an IntervalArray.
IntervalIndex.is_non_overlapping_monotonic
IntervalIndex.is_overlapping Return True if the IntervalIndex has overlapping intervals, else False.
IntervalIndex.get_loc(key[, method]) Get integer location, slice or boolean mask for requested label.
IntervalIndex.get_indexer(target[, method, …]) Compute indexer and mask for new index given the current index.
IntervalIndex.set_closed(closed) Return an IntervalIndex identical to the current one, but closed on the specified side
IntervalIndex.overlaps(other) Check elementwise if an Interval overlaps the values in the IntervalIndex.
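
A short sketch of the IntervalIndex components, assuming the default closed="right" so that from_breaks([0, 1, 2, 3]) yields (0, 1], (1, 2], (2, 3]. The break points are illustrative.

```python
import pandas as pd

ii = pd.IntervalIndex.from_breaks([0, 1, 2, 3])

print(ii.closed)
print(list(ii.left), list(ii.right))
print(list(ii.mid), list(ii.length))
print(ii.is_overlapping)

loc = ii.get_loc(1.5)                        # position of the interval containing 1.5
hits = ii.overlaps(pd.Interval(0.5, 1.5))    # elementwise overlap with (0.5, 1.5]
print(loc, list(hits))
```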

MultiIndex

MultiIndex A multi-level, or hierarchical, index object for pandas objects.
IndexSlice Create an object to more easily perform multi-index slicing

MultiIndex Constructors

MultiIndex.from_arrays(arrays[, sortorder, …]) Convert arrays to MultiIndex.
MultiIndex.from_tuples(tuples[, sortorder, …]) Convert list of tuples to MultiIndex.
MultiIndex.from_product(iterables[, …]) Make a MultiIndex from the cartesian product of multiple iterables.
MultiIndex.from_frame(df[, sortorder, names]) Make a MultiIndex from a DataFrame.
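
The four constructors above can build the same MultiIndex from differently shaped inputs. A minimal sketch, with the level names "letter" and "number" chosen for illustration:

```python
import pandas as pd

names = ["letter", "number"]

mi_arrays = pd.MultiIndex.from_arrays(
    [["a", "a", "b", "b"], [1, 2, 1, 2]], names=names
)
mi_tuples = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1), ("b", 2)], names=names
)
mi_product = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=names)

# from_frame takes the level names from the column names.
frame = pd.DataFrame({"letter": ["a", "a", "b", "b"], "number": [1, 2, 1, 2]})
mi_frame = pd.MultiIndex.from_frame(frame)

print(mi_arrays.equals(mi_tuples), mi_arrays.equals(mi_product))
print(list(mi_frame.names))
```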

MultiIndex Attributes

MultiIndex.names Names of levels in MultiIndex
MultiIndex.levels
MultiIndex.codes
MultiIndex.nlevels Integer number of levels in this MultiIndex.
MultiIndex.levshape A tuple with the length of each level.

MultiIndex Components

MultiIndex.set_levels(levels[, level, …]) Set new levels on MultiIndex.
MultiIndex.set_codes(codes[, level, …]) Set new codes on MultiIndex.
MultiIndex.to_hierarchical(n_repeat[, n_shuffle]) (DEPRECATED) Return a MultiIndex reshaped to conform to the shapes given by n_repeat and n_shuffle.
MultiIndex.to_flat_index() Convert a MultiIndex to an Index of Tuples containing the level values.
MultiIndex.to_frame([index, name]) Create a DataFrame with the levels of the MultiIndex as columns.
MultiIndex.is_lexsorted() Return True if the codes are lexicographically sorted
MultiIndex.sortlevel([level, ascending, …]) Sort MultiIndex at the requested level.
MultiIndex.droplevel([level]) Return index with requested level(s) removed.
MultiIndex.swaplevel([i, j]) Swap level i with level j.
MultiIndex.reorder_levels(order) Rearrange levels using input order.
MultiIndex.remove_unused_levels() Create a new MultiIndex from the current that removes unused levels, meaning that they are not expressed in the labels.
MultiIndex.unique([level]) Return unique values in the index.

MultiIndex Selecting

MultiIndex.get_loc(key[, method]) Get location for a label or a tuple of labels as an integer, slice or boolean mask.
MultiIndex.get_indexer(target[, method, …]) Compute indexer and mask for new index given the current index.
MultiIndex.get_level_values(level) Return vector of label values for requested level, equal to the length of the index.
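
A small sketch of selecting from a lexsorted MultiIndex. A full key returns a single position, while a partial key on the first level returns a slice. Names and values are illustrative.

```python
import pandas as pd

mi = pd.MultiIndex.from_product(
    [["a", "b"], [1, 2]], names=["letter", "number"]
)

full = mi.get_loc(("b", 1))                # one position for a complete key
partial = mi.get_loc("a")                  # slice covering every ("a", *) entry
numbers = mi.get_level_values("number")    # one label per row for that level

print(full, partial)
print(list(numbers), mi.nlevels, mi.levshape)
```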

DatetimeIndex

DatetimeIndex Immutable ndarray of datetime64 data, represented internally as int64, and which can be boxed to Timestamp objects that are subclasses of datetime and carry metadata such as frequency information.

Time/Date Components

DatetimeIndex.year The year of the datetime.
DatetimeIndex.month The month as January=1, December=12.
DatetimeIndex.day The days of the datetime.
DatetimeIndex.hour The hours of the datetime.
DatetimeIndex.minute The minutes of the datetime.
DatetimeIndex.second The seconds of the datetime.
DatetimeIndex.microsecond The microseconds of the datetime.
DatetimeIndex.nanosecond The nanoseconds of the datetime.
DatetimeIndex.date Returns numpy array of python datetime.date objects (namely, the date part of Timestamps without timezone information).
DatetimeIndex.time Returns numpy array of datetime.time.
DatetimeIndex.timetz Returns numpy array of datetime.time also containing timezone information.
DatetimeIndex.dayofyear The ordinal day of the year.
DatetimeIndex.weekofyear The week ordinal of the year.
DatetimeIndex.week The week ordinal of the year.
DatetimeIndex.dayofweek The day of the week with Monday=0, Sunday=6.
DatetimeIndex.weekday The day of the week with Monday=0, Sunday=6.
DatetimeIndex.quarter The quarter of the date.
DatetimeIndex.tz Return timezone.
DatetimeIndex.freq Return the frequency object if it is set, otherwise None.
DatetimeIndex.freqstr Return the frequency object as a string if it is set, otherwise None.
DatetimeIndex.is_month_start Indicates whether the date is the first day of the month.
DatetimeIndex.is_month_end Indicates whether the date is the last day of the month.
DatetimeIndex.is_quarter_start Indicator for whether the date is the first day of a quarter.
DatetimeIndex.is_quarter_end Indicator for whether the date is the last day of a quarter.
DatetimeIndex.is_year_start Indicate whether the date is the first day of a year.
DatetimeIndex.is_year_end Indicate whether the date is the last day of the year.
DatetimeIndex.is_leap_year Boolean indicator if the date belongs to a leap year.
DatetimeIndex.inferred_freq Tries to return a string representing a frequency guess, generated by infer_freq.
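
The component accessors return one value per element. A minimal sketch over three consecutive days that straddle a month boundary (dates chosen for illustration):

```python
import pandas as pd

# 2019-01-30 (Wednesday), 2019-01-31 (Thursday), 2019-02-01 (Friday)
dti = pd.date_range("2019-01-30", periods=3, freq="D")

print(list(dti.year), list(dti.month), list(dti.day))
print(list(dti.dayofweek))        # Monday=0, Sunday=6
print(list(dti.quarter))
print(list(dti.is_month_start), list(dti.is_month_end))
print(dti.freqstr)
```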

Selecting

DatetimeIndex.indexer_at_time(time[, asof]) Return index locations of index values at a particular time of day.
DatetimeIndex.indexer_between_time(…[, …]) Return index locations of values between particular times of day (e.g., 9:00-9:30AM).

Time-specific operations

DatetimeIndex.normalize(*args, **kwargs) Convert times to midnight.
DatetimeIndex.strftime(date_format) Convert to Index using specified date_format.
DatetimeIndex.snap([freq]) Snap time stamps to nearest occurring frequency
DatetimeIndex.tz_convert(*args, **kwargs) Convert tz-aware Datetime Array/Index from one time zone to another.
DatetimeIndex.tz_localize(*args, **kwargs) Localize tz-naive Datetime Array/Index to tz-aware Datetime Array/Index.
DatetimeIndex.round(freq[, ambiguous, …]) Perform round operation on the data to the specified freq.
DatetimeIndex.floor(freq[, ambiguous, …]) Perform floor operation on the data to the specified freq.
DatetimeIndex.ceil(freq[, ambiguous, …]) Perform ceil operation on the data to the specified freq.
DatetimeIndex.month_name(*args, **kwargs) Return the month names of the DatetimeIndex with specified locale.
DatetimeIndex.day_name(*args, **kwargs) Return the day names of the DatetimeIndex with specified locale.
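
A short sketch of the time-specific operations. The timestamps, the "15min" resolution and the "Asia/Tokyo" zone are illustrative choices; the zone lookup assumes a time zone database is installed.

```python
import pandas as pd

dti = pd.DatetimeIndex(["2019-03-01 09:47:00", "2019-03-02 17:05:30"])

midnight = dti.normalize()     # time component set to 00:00
floored = dti.floor("15min")   # round down to a 15-minute boundary
ceiled = dti.ceil("15min")     # round up to a 15-minute boundary

# Attach UTC to the naive values, then express them in another zone.
tokyo = dti.tz_localize("UTC").tz_convert("Asia/Tokyo")

print(list(floored.strftime("%H:%M")), list(ceiled.strftime("%H:%M")))
print(list(tokyo.strftime("%Y-%m-%d %H:%M")))
print(list(dti.day_name()), list(dti.month_name()))
```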

Conversion

DatetimeIndex.to_period(*args, **kwargs) Cast to PeriodArray/Index at a particular frequency.
DatetimeIndex.to_perioddelta(*args, **kwargs) Calculate TimedeltaArray of difference between index values and index converted to PeriodArray at specified freq.
DatetimeIndex.to_pydatetime() Return Datetime Array/Index as object ndarray of datetime.datetime objects
DatetimeIndex.to_series([keep_tz, index, name]) Create a Series with both index and values equal to the index keys; useful with map for returning an indexer based on an index.
DatetimeIndex.to_frame([index, name]) Create a DataFrame with a column containing the Index.

TimedeltaIndex

TimedeltaIndex Immutable ndarray of timedelta64 data, represented internally as int64, and which can be boxed to timedelta objects

Components

TimedeltaIndex.days Number of days for each element.
TimedeltaIndex.seconds Number of seconds (>= 0 and less than 1 day) for each element.
TimedeltaIndex.microseconds Number of microseconds (>= 0 and less than 1 second) for each element.
TimedeltaIndex.nanoseconds Number of nanoseconds (>= 0 and less than 1 microsecond) for each element.
TimedeltaIndex.components Return a dataframe of the components (days, hours, minutes, seconds, milliseconds, microseconds, nanoseconds) of the Timedeltas.
TimedeltaIndex.inferred_freq Tries to return a string representing a frequency guess, generated by infer_freq.
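
The days, seconds and microseconds attributes follow the datetime.timedelta convention, whereas components splits each value into finer fields. A minimal sketch with two illustrative durations:

```python
import pandas as pd

tdi = pd.to_timedelta(["1 days 02:03:04", "0 days 00:00:01.5"])

print(list(tdi.days))
print(list(tdi.seconds))        # seconds within the day: 2*3600 + 3*60 + 4 for the first
print(list(tdi.microseconds))   # sub-second part expressed in microseconds

parts = tdi.components          # DataFrame with one column per unit
print(parts.loc[0, "hours"], parts.loc[1, "milliseconds"])
```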

Conversion

TimedeltaIndex.to_pytimedelta() Return Timedelta Array/Index as object ndarray of datetime.timedelta objects.
TimedeltaIndex.to_series([index, name]) Create a Series with both index and values equal to the index keys; useful with map for returning an indexer based on an index.
TimedeltaIndex.round(freq[, ambiguous, …]) Perform round operation on the data to the specified freq.
TimedeltaIndex.floor(freq[, ambiguous, …]) Perform floor operation on the data to the specified freq.
TimedeltaIndex.ceil(freq[, ambiguous, …]) Perform ceil operation on the data to the specified freq.
TimedeltaIndex.to_frame([index, name]) Create a DataFrame with a column containing the Index.

PeriodIndex

PeriodIndex Immutable ndarray holding ordinal values indicating regular periods in time such as particular years, quarters, months, etc.

Attributes

PeriodIndex.day The days of the period
PeriodIndex.dayofweek The day of the week with Monday=0, Sunday=6
PeriodIndex.dayofyear The ordinal day of the year
PeriodIndex.days_in_month The number of days in the month
PeriodIndex.daysinmonth The number of days in the month
PeriodIndex.end_time
PeriodIndex.freq Return the frequency object if it is set, otherwise None.
PeriodIndex.freqstr Return the frequency object as a string if it is set, otherwise None.
PeriodIndex.hour The hour of the period
PeriodIndex.is_leap_year Logical indicating if the date belongs to a leap year
PeriodIndex.minute The minute of the period
PeriodIndex.month The month as January=1, December=12
PeriodIndex.quarter The quarter of the date
PeriodIndex.qyear
PeriodIndex.second The second of the period
PeriodIndex.start_time
PeriodIndex.week The week ordinal of the year
PeriodIndex.weekday The day of the week with Monday=0, Sunday=6
PeriodIndex.weekofyear The week ordinal of the year
PeriodIndex.year The year of the period

Methods

PeriodIndex.asfreq(*args, **kwargs) Convert the Period Array/Index to the specified frequency freq.
PeriodIndex.strftime(date_format) Convert to Index using specified date_format.
PeriodIndex.to_timestamp(*args, **kwargs) Cast to DatetimeArray/Index.
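
A minimal sketch of the PeriodIndex methods on three monthly periods (the dates are illustrative). asfreq with how="end" picks the last day of each month, and to_timestamp defaults to the start of each period.

```python
import pandas as pd

pi = pd.period_range("2019-01", periods=3, freq="M")

labels = pi.strftime("%Y-%m")          # Index of formatted strings
last_days = pi.asfreq("D", how="end")  # daily periods at the end of each month
stamps = pi.to_timestamp()             # DatetimeIndex at the start of each period

print(list(labels))
print(list(last_days.strftime("%Y-%m-%d")))
print(list(stamps.strftime("%Y-%m-%d")))
print(list(pi.days_in_month))
```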

Scalars

Period

Period Represents a period of time

Attributes

Period.day Get day of the month that a Period falls on.
Period.dayofweek Day of the week the period lies in, with Monday=0 and Sunday=6.
Period.dayofyear Return the day of the year.
Period.days_in_month Get the total number of days in the month that this period falls on.
Period.daysinmonth Get the total number of days of the month that the Period falls in.
Period.end_time
Period.freq
Period.freqstr
Period.hour Get the hour of the day component of the Period.
Period.is_leap_year
Period.minute Get minute of the hour component of the Period.
Period.month
Period.ordinal
Period.quarter
Period.qyear Fiscal year the Period lies in according to its starting-quarter.
Period.second Get the second component of the Period.
Period.start_time Get the Timestamp for the start of the period.
Period.week Get the week of the year on the given Period.
Period.weekday Day of the week the period lies in, with Monday=0 and Sunday=6.
Period.weekofyear
Period.year

Methods

Period.asfreq Convert Period to desired frequency, either at the start or end of the interval
Period.now
Period.strftime Returns the string representation of the Period, depending on the selected fmt.
Period.to_timestamp Return the Timestamp representation of the Period at the target frequency at the specified end (how) of the Period
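
The scalar Period mirrors the PeriodIndex API. A short sketch for a single illustrative month:

```python
import pandas as pd

p = pd.Period("2019-02", freq="M")

print(p.start_time, p.end_time)   # Timestamps bounding the period
print(p.days_in_month, p.is_leap_year)

first_day = p.asfreq("D", how="start")  # daily Period at the start of the month
stamp = p.to_timestamp()                # Timestamp at the start of the period
print(first_day, stamp, p.strftime("%Y-%m"))
```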

Timestamp

Timestamp Pandas replacement for datetime.datetime

Methods

Timestamp.astimezone Convert tz-aware Timestamp to another time zone.
Timestamp.ceil Return a new Timestamp ceiled to this resolution.
Timestamp.combine(date, time) Combine a date and a time into a datetime with the same date and time fields.
Timestamp.ctime Return ctime() style string.
Timestamp.date Return date object with same year, month and day.
Timestamp.day_name Return the day name of the Timestamp with specified locale.
Timestamp.dst Return self.tzinfo.dst(self).
Timestamp.floor Return a new Timestamp floored to this resolution.
Timestamp.freq
Timestamp.freqstr
Timestamp.fromordinal(ordinal[, freq, tz]) Convert a passed ordinal to a Timestamp. Note: by definition there cannot be any tz info on the ordinal itself.
Timestamp.fromtimestamp(ts) Return the local time in tz corresponding to a POSIX timestamp.
Timestamp.isocalendar Return a 3-tuple containing ISO year, week number, and weekday.
Timestamp.isoformat
Timestamp.isoweekday Return the day of the week represented by the date.
Timestamp.month_name Return the month name of the Timestamp with specified locale.
Timestamp.normalize Normalize Timestamp to midnight, preserving tz information.
Timestamp.now([tz]) Returns new Timestamp object representing current time local to tz.
Timestamp.replace Implements datetime.replace and also handles nanoseconds.
Timestamp.round Round the Timestamp to the specified resolution
Timestamp.strftime Return a strftime()-style string for the given format.
Timestamp.strptime Return a new datetime parsed from a string according to a format (like time.strptime()).
Timestamp.time Return time object with same time but with tzinfo=None.
Timestamp.timestamp Return POSIX timestamp as float.
Timestamp.timetuple Return time tuple, compatible with time.localtime().
Timestamp.timetz Return time object with same time and tzinfo.
Timestamp.to_datetime64 Returns a numpy.datetime64 object with ‘ns’ precision
Timestamp.to_julian_date Convert Timestamp to a Julian Date.
Timestamp.to_period Return a period of which this timestamp is an observation.
Timestamp.to_pydatetime Convert a Timestamp object to a native Python datetime object.
Timestamp.today(cls[, tz]) Return the current time in the local timezone.
Timestamp.toordinal Return proleptic Gregorian ordinal.
Timestamp.tz_convert Convert tz-aware Timestamp to another time zone.
Timestamp.tz_localize Convert naive Timestamp to local time zone, or remove timezone from tz-aware Timestamp.
Timestamp.tzname Return self.tzinfo.tzname(self).
Timestamp.utcfromtimestamp(ts) Construct a naive UTC datetime from a POSIX timestamp.
Timestamp.utcnow() Return a new Timestamp representing UTC day and time.
Timestamp.utcoffset Return self.tzinfo.utcoffset(self).
Timestamp.utctimetuple Return UTC time tuple, compatible with time.localtime().
Timestamp.weekday Return the day of the week represented by the date.
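
A minimal sketch of common Timestamp methods. The timestamp, the "15min" resolution and the "Asia/Tokyo" zone are illustrative; the zone lookup assumes a time zone database is installed.

```python
import datetime

import pandas as pd

ts = pd.Timestamp("2019-03-01 09:47:30")   # a Friday

print(ts.floor("15min"), ts.ceil("15min"), ts.round("15min"))
print(ts.normalize())                      # midnight of the same day
print(ts.day_name(), ts.month_name())
print(ts.weekday(), ts.isoweekday())       # Monday=0 versus Monday=1

aware = ts.tz_localize("UTC").tz_convert("Asia/Tokyo")
print(aware)

native = ts.to_pydatetime()                # plain datetime.datetime
month = ts.to_period("M")                  # the month this timestamp falls in
print(native, month)
```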

Interval

Interval Immutable object implementing an Interval, a bounded slice-like interval.

Properties

Interval.closed Whether the interval is closed on the left-side, right-side, both or neither
Interval.closed_left Check if the interval is closed on the left side.
Interval.closed_right Check if the interval is closed on the right side.
Interval.left Left bound for the interval
Interval.length Return the length of the Interval
Interval.mid Return the midpoint of the Interval
Interval.open_left Check if the interval is open on the left side.
Interval.open_right Check if the interval is open on the right side.
Interval.overlaps Check whether two Interval objects overlap.
Interval.right Right bound for the interval
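
The closed and open properties determine endpoint membership and whether intervals that only touch count as overlapping. A short sketch with an illustrative left-closed interval [0, 5):

```python
import pandas as pd

iv = pd.Interval(0, 5, closed="left")   # contains 0, excludes 5

print(iv.closed, iv.closed_left, iv.open_right)
print(iv.left, iv.right, iv.length, iv.mid)
print(0 in iv, 5 in iv)

# (4, 6] shares points with [0, 5); [5, 6) only touches its open end.
print(iv.overlaps(pd.Interval(4, 6)))
print(iv.overlaps(pd.Interval(5, 6, closed="left")))
```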

Timedelta

Timedelta Represents a duration, the difference between two dates or times.

Properties

Timedelta.asm8 Return a numpy timedelta64 array scalar view.
Timedelta.components Return a Components NamedTuple-like
Timedelta.days Number of days.
Timedelta.delta Return the timedelta in nanoseconds (ns), for internal compatibility.
Timedelta.freq
Timedelta.is_populated
Timedelta.max
Timedelta.microseconds Number of microseconds (>= 0 and less than 1 second).
Timedelta.min
Timedelta.nanoseconds Return the number of nanoseconds (n), where 0 <= n < 1 microsecond.
Timedelta.resolution Return a string representing the lowest timedelta resolution.
Timedelta.seconds Number of seconds (>= 0 and less than 1 day).
Timedelta.value
Timedelta.view array view compat

Methods

Timedelta.ceil Return a new Timedelta ceiled to this resolution.
Timedelta.floor Return a new Timedelta floored to this resolution.
Timedelta.isoformat Format Timedelta as ISO 8601 Duration like P[n]Y[n]M[n]DT[n]H[n]M[n]S, where the [n]s are replaced by the values.
Timedelta.round Round the Timedelta to the specified resolution
Timedelta.to_pytimedelta Return a datetime.timedelta object. Note: nanosecond resolution, if any, is lost.
Timedelta.to_timedelta64 Returns a numpy.timedelta64 object with ‘ns’ precision
Timedelta.total_seconds Total duration of timedelta in seconds (to ns precision)
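
A minimal sketch of the Timedelta properties and methods, using an illustrative duration of 1 day, 2 hours, 3 minutes and 4 seconds:

```python
import datetime

import pandas as pd

td = pd.Timedelta(days=1, hours=2, minutes=3, seconds=4)

print(td.days, td.seconds)        # seconds counts only the part within the day
print(td.total_seconds())         # whole duration in seconds
print(td.components.hours, td.components.minutes)
print(td.isoformat())             # ISO 8601 duration string

print(td.floor("15min"), td.ceil("15min"))
native = td.to_pytimedelta()      # plain datetime.timedelta
print(native)
```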

Date Offsets

DateOffset([n, normalize]) Standard kind of date increment used for a date range.
BusinessDay([n, normalize, offset]) DateOffset subclass representing possibly n business days.
BusinessHour([n, normalize, start, end, offset]) DateOffset subclass representing possibly n business hours.
CustomBusinessDay([n, normalize, weekmask, …]) DateOffset subclass representing possibly n custom business days, excluding holidays.
CustomBusinessHour([n, normalize, weekmask, …]) DateOffset subclass representing possibly n custom business hours.
MonthOffset
MonthEnd DateOffset of one month end.
MonthBegin DateOffset of one month at beginning.
BusinessMonthEnd DateOffset increments between business EOM dates.
BusinessMonthBegin DateOffset of one business month at beginning.
CustomBusinessMonthEnd([n, normalize, …]) DateOffset subclass representing one custom business month, incrementing between end of month dates.
CustomBusinessMonthBegin([n, normalize, …]) DateOffset subclass representing one custom business month, incrementing between beginning of month dates.
SemiMonthOffset([n, normalize, day_of_month])
SemiMonthEnd([n, normalize, day_of_month]) Two DateOffset’s per month repeating on the last day of the month and day_of_month.
SemiMonthBegin([n, normalize, day_of_month]) Two DateOffset’s per month repeating on the first day of the month and day_of_month.
Week([n, normalize, weekday]) Weekly offset.
WeekOfMonth([n, normalize, week, weekday]) Describes monthly dates like “the Tuesday of the 2nd week of each month”.
LastWeekOfMonth([n, normalize, weekday]) Describes monthly dates in last week of month like “the last Tuesday of each month”.
QuarterOffset([n, normalize, startingMonth]) Quarter representation - doesn’t call super.
BQuarterEnd([n, normalize, startingMonth]) DateOffset increments between business Quarter dates.
BQuarterBegin([n, normalize, startingMonth])
QuarterEnd([n, normalize, startingMonth]) DateOffset increments between quarter end dates.
QuarterBegin([n, normalize, startingMonth])
YearOffset([n, normalize, month]) DateOffset that just needs a month.
BYearEnd([n, normalize, month]) DateOffset increments between business year end dates.
BYearBegin([n, normalize, month]) DateOffset increments between business year begin dates.
YearEnd([n, normalize, month]) DateOffset increments between calendar year ends.
YearBegin([n, normalize, month]) DateOffset increments between calendar year begin dates.
FY5253([n, normalize, weekday, …]) Describes 52-53 week fiscal year.
FY5253Quarter([n, normalize, weekday, …]) DateOffset increments between business quarter dates for 52-53 week fiscal year (also known as a 4-4-5 calendar).
Easter DateOffset for the Easter holiday using logic defined in dateutil.
Tick([n, normalize])
Day([n, normalize])
Hour([n, normalize])
Minute([n, normalize])
Second([n, normalize])
Milli([n, normalize])
Micro([n, normalize])
Nano([n, normalize])
BDay alias of pandas.tseries.offsets.BusinessDay
BMonthEnd alias of pandas.tseries.offsets.BusinessMonthEnd
BMonthBegin alias of pandas.tseries.offsets.BusinessMonthBegin
CBMonthEnd alias of pandas.tseries.offsets.CustomBusinessMonthEnd
CBMonthBegin alias of pandas.tseries.offsets.CustomBusinessMonthBegin
CDay alias of pandas.tseries.offsets.CustomBusinessDay
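
The offsets above are typically added to a Timestamp. The following is a small sketch of that usage; the chosen date and variable names are illustrative, not taken from the reference.

```python
import pandas as pd
from pandas.tseries.offsets import BusinessDay, MonthEnd, Week

# 2018-01-05 is a Friday.
friday = pd.Timestamp("2018-01-05")

# Adding one business day skips the weekend and lands on Monday.
next_bday = friday + BusinessDay(1)

# MonthEnd(1) rolls forward to the last calendar day of the month.
month_end = friday + MonthEnd(1)

# Week(weekday=0) moves forward to the next Monday (0 = Monday).
next_monday = friday + Week(weekday=0)

print(next_bday, month_end, next_monday)
```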

Frequencies

to_offset(freq) Return DateOffset object from string or tuple representation or datetime.timedelta object
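
A brief sketch of to_offset with a frequency string and with a datetime.timedelta, assuming the import path pandas.tseries.frequencies. The tuple form mentioned above is not shown.

```python
import datetime

from pandas.tseries.frequencies import to_offset
from pandas.tseries.offsets import BusinessDay, Day, Hour

# A multiple plus an alias is parsed into the matching offset object.
from_string = to_offset("5D")

# "B" is the business-day alias.
from_alias = to_offset("B")

# A datetime.timedelta is converted to an equivalent fixed-duration offset.
from_timedelta = to_offset(datetime.timedelta(hours=2))

print(from_string, from_alias, from_timedelta)
```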

Window

Rolling objects are returned by .rolling calls: pandas.DataFrame.rolling(), pandas.Series.rolling(), etc. Expanding objects are returned by .expanding calls: pandas.DataFrame.expanding(), pandas.Series.expanding(), etc. EWM objects are returned by .ewm calls: pandas.DataFrame.ewm(), pandas.Series.ewm(), etc.

Standard moving window functions

Rolling.count() The rolling count of any non-NaN observations inside the window.
Rolling.sum(*args, **kwargs) Calculate rolling sum of given DataFrame or Series.
Rolling.mean(*args, **kwargs) Calculate the rolling mean of the values.
Rolling.median(**kwargs) Calculate the rolling median.
Rolling.var([ddof]) Calculate unbiased rolling variance.
Rolling.std([ddof]) Calculate rolling standard deviation.
Rolling.min(*args, **kwargs) Calculate the rolling minimum.
Rolling.max(*args, **kwargs) Calculate the rolling maximum.
Rolling.corr([other, pairwise]) Calculate rolling correlation.
Rolling.cov([other, pairwise, ddof]) Calculate the rolling sample covariance.
Rolling.skew(**kwargs) Unbiased rolling skewness.
Rolling.kurt(**kwargs) Calculate unbiased rolling kurtosis.
Rolling.apply(func[, raw, args, kwargs]) Apply a function to each rolling window.
Rolling.aggregate(arg, *args, **kwargs) Aggregate using one or more operations over the specified axis.
Rolling.quantile(quantile[, interpolation]) Calculate the rolling quantile.
Window.mean(*args, **kwargs) Calculate the window mean of the values.
Window.sum(*args, **kwargs) Calculate window sum of given DataFrame or Series.
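
As an illustration of the rolling methods listed above, the sketch below uses a window of three observations on a small Series. The data and the range function passed to apply are made up for the example.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# .rolling() returns a Rolling object; nothing is computed yet.
roll = s.rolling(window=3)

# The first two results are NaN because the window is not yet full.
rolling_mean = roll.mean()
rolling_sum = roll.sum()

# apply() runs an arbitrary function on each window; raw=True passes
# a NumPy array instead of a Series.
rolling_range = roll.apply(lambda w: w.max() - w.min(), raw=True)

print(rolling_mean.tolist())
print(rolling_sum.tolist())
print(rolling_range.tolist())
```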

Standard expanding window functions

Expanding.count(**kwargs) The expanding count of any non-NaN observations inside the window.
Expanding.sum(*args, **kwargs) Calculate expanding sum of given DataFrame or Series.
Expanding.mean(*args, **kwargs) Calculate the expanding mean of the values.
Expanding.median(**kwargs) Calculate the expanding median.
Expanding.var([ddof]) Calculate unbiased expanding variance.
Expanding.std([ddof]) Calculate expanding standard deviation.
Expanding.min(*args, **kwargs) Calculate the expanding minimum.
Expanding.max(*args, **kwargs) Calculate the expanding maximum.
Expanding.corr([other, pairwise]) Calculate expanding correlation.
Expanding.cov([other, pairwise, ddof]) Calculate the expanding sample covariance.
Expanding.skew(**kwargs) Unbiased expanding skewness.
Expanding.kurt(**kwargs) Calculate unbiased expanding kurtosis.
Expanding.apply(func[, raw, args, kwargs]) Apply a function to each expanding window.
Expanding.aggregate(arg, *args, **kwargs) Aggregate using one or more operations over the specified axis.
Expanding.quantile(quantile[, interpolation]) Calculate the expanding quantile.
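
An expanding window grows from the first observation up to the current one. A minimal sketch with illustrative data:

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0])

# Each result uses all observations seen so far (a running total here).
expanding_sum = s.expanding().sum()

# min_periods=2 leaves the first result as NaN, since only one
# observation is available at that point.
expanding_mean = s.expanding(min_periods=2).mean()

print(expanding_sum.tolist())
print(expanding_mean.tolist())
```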

Exponentially-weighted moving window functions

EWM.mean(*args, **kwargs) Exponential weighted moving average.
EWM.std([bias]) Exponential weighted moving stddev.
EWM.var([bias]) Exponential weighted moving variance.
EWM.corr([other, pairwise]) Exponential weighted sample correlation.
EWM.cov([other, pairwise, bias]) Exponential weighted sample covariance.
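
The sketch below shows an exponentially weighted mean. It assumes adjust=False so that the result follows the simple recursion y[t] = (1 - alpha) * y[t-1] + alpha * x[t]; the default adjust=True uses a different weighting.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0])

# With alpha=0.5 and adjust=False:
#   y0 = 1.0
#   y1 = 0.5 * 1.0 + 0.5 * 2.0
#   y2 = 0.5 * y1  + 0.5 * 3.0
ewm_mean = s.ewm(alpha=0.5, adjust=False).mean()

print(ewm_mean.tolist())
```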

GroupBy

GroupBy objects are returned by groupby calls: pandas.DataFrame.groupby(), pandas.Series.groupby(), etc.

Indexing, iteration

GroupBy.__iter__() Groupby iterator.
GroupBy.groups Dict {group name -> group labels}.
GroupBy.indices Dict {group name -> group indices}.
GroupBy.get_group(name[, obj]) Constructs NDFrame from group with provided name.
Grouper([key, level, freq, axis, sort]) A Grouper allows the user to specify a groupby instruction for a target object
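
A short sketch of iterating over a GroupBy object and looking up group membership. The column names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "b", "a"], "val": [10, 20, 30]})
grouped = df.groupby("key")

# Iterating yields (group name, sub-frame) pairs.
names = [name for name, _ in grouped]

# .indices maps each group name to the integer positions of its rows.
positions_a = grouped.indices["a"].tolist()

# get_group() returns the rows belonging to a single group.
group_a = grouped.get_group("a")

print(names, positions_a)
print(group_a)
```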

Function application

GroupBy.apply(func, *args, **kwargs) Apply function func group-wise and combine the results together.
GroupBy.aggregate(func, *args, **kwargs)
GroupBy.transform(func, *args, **kwargs)
GroupBy.pipe(func, *args, **kwargs) Apply a function func with arguments to this GroupBy object and return the function’s result.
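
The four methods above differ mainly in the shape of what they return. The sketch below contrasts them on a small, made-up frame.

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "a", "b", "b"], "val": [1, 2, 3, 4]})
grouped = df.groupby("key")["val"]

# aggregate() reduces each group to a single value.
totals = grouped.aggregate("sum")

# transform() returns a result aligned with the original rows.
demeaned = grouped.transform(lambda x: x - x.mean())

# apply() calls the function once per group and combines the results.
spread = grouped.apply(lambda x: x.max() - x.min())

# pipe() passes the GroupBy object itself to the function.
piped = grouped.pipe(lambda g: g.max() - g.min())

print(totals.to_dict(), demeaned.tolist(), spread.to_dict(), piped.to_dict())
```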

Computations / Descriptive Stats

GroupBy.all([skipna]) Returns True if all values in the group are truthy, else False.
GroupBy.any([skipna]) Returns True if any value in the group is truthy, else False.
GroupBy.bfill([limit]) Backward fill the values.
GroupBy.count() Compute count of group, excluding missing values.
GroupBy.cumcount([ascending]) Number each item in each group from 0 to the length of that group - 1.
GroupBy.ffill([limit]) Forward fill the values.
GroupBy.first(**kwargs) Compute first of group values.
GroupBy.head([n]) Returns first n rows of each group.
GroupBy.last(**kwargs) Compute last of group values.
GroupBy.max(**kwargs) Compute max of group values.
GroupBy.mean(*args, **kwargs) Compute mean of groups, excluding missing values.
GroupBy.median(**kwargs) Compute median of groups, excluding missing values.
GroupBy.min(**kwargs) Compute min of group values.
GroupBy.ngroup([ascending]) Number each group from 0 to the number of groups - 1.
GroupBy.nth(n[, dropna]) Take the nth row from each group if n is an int, or a subset of rows if n is a list of ints.
GroupBy.ohlc() Compute open, high, low and close values of each group, excluding missing values.
GroupBy.prod(**kwargs) Compute prod of group values.
GroupBy.rank([method, ascending, na_option, …]) Provides the rank of values within each group.
GroupBy.pct_change([periods, fill_method, …]) Calculate pct_change of each value to previous entry in group.
GroupBy.size() Compute group sizes.
GroupBy.sem([ddof]) Compute standard error of the mean of groups, excluding missing values.
GroupBy.std([ddof]) Compute standard deviation of groups, excluding missing values.
GroupBy.sum(**kwargs) Compute sum of group values.
GroupBy.var([ddof]) Compute variance of groups, excluding missing values.
GroupBy.tail([n]) Returns last n rows of each group.
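
A few of the computations above, sketched on illustrative data. Note that cumcount and ngroup return one value per row, while size and first return one value per group.

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "a", "b", "a"], "val": [1, 2, 3, 4]})
g = df.groupby("key")

# Position of each row within its own group, starting at 0.
within_group = g.cumcount().tolist()

# Number of the group each row belongs to, starting at 0.
group_number = g.ngroup().tolist()

# Number of rows per group.
sizes = g.size().to_dict()

# First value of each group.
firsts = g["val"].first().to_dict()

print(within_group, group_number, sizes, firsts)
```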

The following methods are available in both SeriesGroupBy and DataFrameGroupBy objects, but may differ slightly: the DataFrameGroupBy version usually permits the specification of an axis argument, and often an argument indicating whether to restrict application to columns of a specific data type.

DataFrameGroupBy.agg(arg, *args, **kwargs) Aggregate using one or more operations over the specified axis.
DataFrameGroupBy.all([skipna]) Returns True if all values in the group are truthy, else False.
DataFrameGroupBy.any([skipna]) Returns True if any value in the group is truthy, else False.
DataFrameGroupBy.bfill([limit]) Backward fill the values.
DataFrameGroupBy.corr Compute pairwise correlation of columns, excluding NA/null values.
DataFrameGroupBy.count() Compute count of group, excluding missing values
DataFrameGroupBy.cov Compute pairwise covariance of columns, excluding NA/null values.
DataFrameGroupBy.cummax([axis]) Cumulative max for each group.
DataFrameGroupBy.cummin([axis]) Cumulative min for each group.
DataFrameGroupBy.cumprod([axis]) Cumulative product for each group.
DataFrameGroupBy.cumsum([axis]) Cumulative sum for each group.
DataFrameGroupBy.describe(**kwargs) Generate descriptive statistics that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values.
DataFrameGroupBy.diff First discrete difference of element.
DataFrameGroupBy.ffill([limit]) Forward fill the values.
DataFrameGroupBy.fillna Fill NA/NaN values using the specified method.
DataFrameGroupBy.filter(func[, dropna]) Return a copy of a DataFrame excluding elements from groups that do not satisfy the boolean criterion specified by func.
DataFrameGroupBy.hist Make a histogram of the DataFrame’s columns.
DataFrameGroupBy.idxmax Return index of first occurrence of maximum over requested axis.
DataFrameGroupBy.idxmin Return index of first occurrence of minimum over requested axis.
DataFrameGroupBy.mad Return the mean absolute deviation of the values for the requested axis.
DataFrameGroupBy.pct_change([periods, …]) Calculate pct_change of each value to previous entry in group.
DataFrameGroupBy.plot Class implementing the .plot attribute for groupby objects.
DataFrameGroupBy.quantile Return values at the given quantile over requested axis.
DataFrameGroupBy.rank([method, ascending, …]) Provides the rank of values within each group.
DataFrameGroupBy.resample(rule, *args, **kwargs) Provide resampling when using a TimeGrouper.
DataFrameGroupBy.shift([periods, freq, axis]) Shift each group by periods observations.
DataFrameGroupBy.size() Compute group sizes.
DataFrameGroupBy.skew Return unbiased skew over requested axis, normalized by N-1.
DataFrameGroupBy.take Return the elements in the given positional indices along an axis.
DataFrameGroupBy.tshift Shift the time index, using the index’s frequency if available.

The following methods are available only for SeriesGroupBy objects.

SeriesGroupBy.nlargest Return the largest n elements.
SeriesGroupBy.nsmallest Return the smallest n elements.
SeriesGroupBy.nunique([dropna]) Returns number of unique elements in the group
SeriesGroupBy.unique Return unique values of Series object.
SeriesGroupBy.value_counts([normalize, …])
SeriesGroupBy.is_monotonic_increasing Return boolean if values in the object are monotonic_increasing.
SeriesGroupBy.is_monotonic_decreasing Return boolean if values in the object are monotonic_decreasing.
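
A minimal sketch of two SeriesGroupBy-only methods, grouping a Series by a list of hypothetical keys:

```python
import pandas as pd

s = pd.Series([1, 5, 5, 3])
keys = ["a", "a", "a", "b"]
g = s.groupby(keys)

# Number of distinct values in each group.
distinct = g.nunique().to_dict()

# Largest value per group; the result is indexed by (key, original index).
largest = g.nlargest(1).tolist()

print(distinct, largest)
```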

The following methods are available only for DataFrameGroupBy objects.

DataFrameGroupBy.corrwith Compute pairwise correlation between rows or columns of two DataFrame objects.
DataFrameGroupBy.boxplot([subplots, column, …]) Make box plots from DataFrameGroupBy data.

Resampling

Resampler objects are returned by resample calls: pandas.DataFrame.resample(), pandas.Series.resample().

Indexing, iteration

Resampler.__iter__() Resampler iterator.
Resampler.groups Dict {group name -> group labels}.
Resampler.indices Dict {group name -> group indices}.
Resampler.get_group(name[, obj]) Constructs NDFrame from group with provided name.

Function application

Resampler.apply(func, *args, **kwargs) Aggregate using one or more operations over the specified axis.
Resampler.aggregate(func, *args, **kwargs) Aggregate using one or more operations over the specified axis.
Resampler.transform(arg, *args, **kwargs) Call function producing a like-indexed Series on each group and return a Series with the transformed values.
Resampler.pipe(func, *args, **kwargs) Apply a function func with arguments to this Resampler object and return the function’s result.

Upsampling

Resampler.ffill([limit]) Forward fill the values.
Resampler.backfill([limit]) Backward fill the new missing values in the resampled data.
Resampler.bfill([limit]) Backward fill the new missing values in the resampled data.
Resampler.pad([limit]) Forward fill the values.
Resampler.nearest([limit]) Resample by using the nearest value.
Resampler.fillna(method[, limit]) Fill missing values introduced by upsampling.
Resampler.asfreq([fill_value]) Return the values at the new freq, essentially a reindex.
Resampler.interpolate([method, axis, limit, …]) Interpolate values according to different methods.
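
Upsampling introduces new timestamps that have no data, and the methods above differ in how they fill them. The sketch below resamples a series observed every two days to daily frequency; the data is illustrative.

```python
import pandas as pd

idx = pd.date_range("2018-01-01", periods=3, freq="2D")
s = pd.Series([1.0, 2.0, 3.0], index=idx)

up = s.resample("D")

# asfreq() only reindexes: the new days are NaN.
raw = up.asfreq()

# ffill() carries the last observation forward.
filled = up.ffill()

# interpolate() fills linearly between observations by default.
interpolated = up.interpolate()

print(raw.tolist())
print(filled.tolist())
print(interpolated.tolist())
```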

Computations / Descriptive Stats

Resampler.count([_method]) Compute count of group, excluding missing values.
Resampler.nunique([_method]) Returns number of unique elements in the group
Resampler.first([_method]) Compute first of group values.
Resampler.last([_method]) Compute last of group values.
Resampler.max([_method]) Compute max of group values.
Resampler.mean([_method]) Compute mean of groups, excluding missing values.
Resampler.median([_method]) Compute median of groups, excluding missing values.
Resampler.min([_method]) Compute min of group values.
Resampler.ohlc([_method]) Compute open, high, low and close values of each group, excluding missing values.
Resampler.prod([_method, min_count]) Compute prod of group values.
Resampler.size() Compute group sizes.
Resampler.sem([_method]) Compute standard error of the mean of groups, excluding missing values.
Resampler.std([ddof]) Compute standard deviation of groups, excluding missing values.
Resampler.sum([_method, min_count]) Compute sum of group values.
Resampler.var([ddof]) Compute variance of groups, excluding missing values.
Resampler.quantile([q]) Return value at the given quantile.
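
Downsampling reduces each time bin to a summary value. A small sketch that bins six daily observations into two-day groups, using made-up data:

```python
import pandas as pd

idx = pd.date_range("2018-01-01", periods=6, freq="D")
s = pd.Series([1, 2, 3, 4, 5, 6], index=idx)

r = s.resample("2D")

# One value per two-day bin.
bin_sum = r.sum()
bin_mean = r.mean()

# ohlc() returns a DataFrame with open, high, low and close columns.
bars = r.ohlc()

print(bin_sum.tolist())
print(bin_mean.tolist())
print(bars)
```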

Style

Styler objects are returned by pandas.DataFrame.style.

Styler Constructor

Styler(data[, precision, table_styles, …]) Helps style a DataFrame or Series according to the data with HTML and CSS.
Styler.from_custom_template(searchpath, name) Factory function for creating a subclass of Styler with a custom template and Jinja environment.

Style Application

Styler.apply(func[, axis, subset]) Apply a function column-wise, row-wise, or table-wise, updating the HTML representation with the result.
Styler.applymap(func[, subset]) Apply a function elementwise, updating the HTML representation with the result.
Styler.where(cond, value[, other, subset]) Apply a function elementwise, updating the HTML representation with a style which is selected in accordance with the return value of a function.
Styler.format(formatter[, subset]) Format the text display value of cells.
Styler.set_precision(precision) Set the precision used to render.
Styler.set_table_styles(table_styles) Set the table styles on a Styler.
Styler.set_table_attributes(attributes) Set the table attributes.
Styler.set_caption(caption) Set the caption on a Styler
Styler.set_properties([subset]) Convenience method for setting one or more non-data dependent properties for each cell.
Styler.set_uuid(uuid) Set the uuid for a Styler.
Styler.clear() Reset the styler, removing any previously applied styles.
Styler.pipe(func, *args, **kwargs) Apply func(self, *args, **kwargs), and return the result.

Builtin Styles

Styler.highlight_max([subset, color, axis]) Highlight the maximum by shading the background.
Styler.highlight_min([subset, color, axis]) Highlight the minimum by shading the background.
Styler.highlight_null([null_color]) Shade the background null_color for missing values.
Styler.background_gradient([cmap, low, …]) Color the background in a gradient according to the data in each column (optionally row).
Styler.bar([subset, axis, color, width, …]) Draw bar chart in the cell backgrounds.

Style Export and Import

Styler.render(**kwargs) Render the built up styles to HTML.
Styler.export() Export the styles applied to the current Styler.
Styler.use(styles) Set the styles on the current Styler, possibly using styles from Styler.export.
Styler.to_excel(excel_writer[, sheet_name, …]) Write Styler to an Excel sheet.

Plotting

The following functions are contained in the pandas.plotting module.

andrews_curves(frame, class_column[, ax, …]) Generates a matplotlib plot of Andrews curves, for visualising clusters of multivariate data.
bootstrap_plot(series[, fig, size, samples]) Bootstrap plot on mean, median and mid-range statistics.
deregister_matplotlib_converters() Remove pandas’ formatters and converters
lag_plot(series[, lag, ax]) Lag plot for time series.
parallel_coordinates(frame, class_column[, …]) Parallel coordinates plotting.
radviz(frame, class_column[, ax, color, …]) Plot a multidimensional dataset in 2D.
register_matplotlib_converters([explicit]) Register Pandas Formatters and Converters with matplotlib
scatter_matrix(frame[, alpha, figsize, ax, …]) Draw a matrix of scatter plots.

General utility functions

Working with options

describe_option(pat[, _print_desc]) Prints the description for one or more registered options.
reset_option(pat) Reset one or more options to their default value.
get_option(pat) Retrieves the value of the specified option.
set_option(pat, value) Sets the value of the specified option.
option_context(*args) Context manager to temporarily set options in the with statement context.
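
The option functions above can be sketched as follows. The example assumes the display.max_rows and display.precision options; option_context restores the previous value when the with block exits.

```python
import pandas as pd

default_rows = pd.get_option("display.max_rows")

# Temporarily override an option inside the with block only.
with pd.option_context("display.max_rows", 5):
    inside = pd.get_option("display.max_rows")

after = pd.get_option("display.max_rows")

# set_option changes the value until it is reset.
pd.set_option("display.precision", 3)
changed = pd.get_option("display.precision")

# reset_option restores the default.
pd.reset_option("display.precision")
restored = pd.get_option("display.precision")

print(inside, after == default_rows, changed, restored)
```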

Testing functions

testing.assert_frame_equal(left, right[, …]) Check that left and right DataFrame are equal.
testing.assert_series_equal(left, right[, …]) Check that left and right Series are equal.
testing.assert_index_equal(left, right[, …]) Check that left and right Index are equal.
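
A minimal sketch of assert_frame_equal in a test: by default the comparison is strict about dtypes, which the check_dtype argument can relax. The frames are illustrative.

```python
import pandas as pd
from pandas import testing as tm

left = pd.DataFrame({"a": [1, 2]})        # integer column
right = pd.DataFrame({"a": [1.0, 2.0]})   # float column

# Same values, different dtypes: the default comparison fails.
try:
    tm.assert_frame_equal(left, right)
    dtype_mismatch_detected = False
except AssertionError:
    dtype_mismatch_detected = True

# Relaxing the dtype check makes the comparison pass (no exception).
tm.assert_frame_equal(left, right, check_dtype=False)

print(dtype_mismatch_detected)
```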

Exceptions and warnings

errors.DtypeWarning Warning raised when reading different dtypes in a column from a file.
errors.EmptyDataError Exception that is thrown in pd.read_csv (by both the C and Python engines) when empty data or header is encountered.
errors.OutOfBoundsDatetime
errors.ParserError Exception that is raised by an error encountered in pd.read_csv.
errors.ParserWarning Warning raised when reading a file that doesn’t use the default ‘c’ parser.
errors.PerformanceWarning Warning raised when there is a possible performance impact.
errors.UnsortedIndexError Error raised when attempting to get a slice of a MultiIndex, and the index has not been lexsorted.
errors.UnsupportedFunctionCall Exception raised when attempting to call a numpy function on a pandas object, but that function is not supported by the object.
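
These exceptions can be caught by importing them from pandas.errors. The sketch below handles EmptyDataError from read_csv; the helper name read_csv_or_empty is hypothetical.

```python
import io

import pandas as pd
from pandas.errors import EmptyDataError


def read_csv_or_empty(buffer):
    """Return the parsed frame, or an empty DataFrame if there is no data."""
    try:
        return pd.read_csv(buffer)
    except EmptyDataError:
        return pd.DataFrame()


# An empty buffer has no header or data, so read_csv raises EmptyDataError.
result = read_csv_or_empty(io.StringIO(""))
parsed = read_csv_or_empty(io.StringIO("a,b\n1,2\n"))

print(result.empty, parsed.shape)
```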

Extensions

These are primarily intended for library authors looking to extend pandas objects.

api.extensions.register_extension_dtype(cls) Class decorator to register an ExtensionType with pandas.
api.extensions.register_dataframe_accessor(name) Register a custom accessor on DataFrame objects.
api.extensions.register_series_accessor(name) Register a custom accessor on Series objects.
api.extensions.register_index_accessor(name) Register a custom accessor on Index objects.
api.extensions.ExtensionDtype A custom data type, to be paired with an ExtensionArray.
api.extensions.ExtensionArray Abstract base class for custom 1-D array types.
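
As a sketch of the accessor registration decorators, the example below registers a DataFrame accessor. The accessor name geo, the class GeoAccessor, and the lat/lon columns are all hypothetical and chosen only for illustration.

```python
import pandas as pd


@pd.api.extensions.register_dataframe_accessor("geo")
class GeoAccessor:
    """Hypothetical accessor exposing helpers under df.geo."""

    def __init__(self, pandas_obj):
        # pandas passes the DataFrame the accessor is attached to.
        self._obj = pandas_obj

    @property
    def center(self):
        # Mean latitude and longitude of all rows.
        lat = float(self._obj["lat"].mean())
        lon = float(self._obj["lon"].mean())
        return (lat, lon)


df = pd.DataFrame({"lat": [0.0, 10.0], "lon": [20.0, 40.0]})
center = df.geo.center
print(center)
```

The accessor is constructed on first access of df.geo, so any validation of required columns would naturally go in __init__.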