API Reference

This page gives an overview of all public pandas objects, functions and methods. In general, all classes and functions exposed in the top-level pandas.* namespace are regarded as public.

In addition, some of the subpackages are public, including pandas.errors, pandas.plotting, and pandas.testing. Certain functions in the pandas.io and pandas.tseries submodules are public as well (those mentioned in the documentation). Finally, the pandas.api.types subpackage holds some public functions related to data types in pandas.

Warning

The pandas.core, pandas.compat, and pandas.util top-level modules are considered to be PRIVATE. Stability of functionality in those modules is not guaranteed.

Input/Output

Pickling

read_pickle(path[, compression]) Load pickled pandas object (or any other pickled object) from the specified

Flat File

read_table(filepath_or_buffer[, sep, ...]) Read general delimited file into DataFrame
read_csv(filepath_or_buffer[, sep, ...]) Read CSV (comma-separated) file into DataFrame
read_fwf(filepath_or_buffer[, colspecs, widths]) Read a table of fixed-width formatted lines into DataFrame
read_msgpack(path_or_buf[, encoding, iterator]) Load msgpack pandas object from the specified
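
As a minimal sketch of the flat-file readers, the example below parses in-memory text instead of files on disk. The sample data and column names are made up for illustration; `io.StringIO` stands in for a file path or buffer.

```python
import io

import pandas as pd

# Hypothetical comma-separated text; read_csv accepts any file-like buffer.
csv_text = "name,score\nann,90\nbob,70\n"
df = pd.read_csv(io.StringIO(csv_text))

# Hypothetical fixed-width text: two fields, each three characters wide.
fwf_text = "ann 90\nbob 70\n"
fixed = pd.read_fwf(
    io.StringIO(fwf_text),
    widths=[3, 3],
    header=None,
    names=["name", "score"],
)

print(df)
print(fixed)
```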

Clipboard

read_clipboard([sep]) Read text from clipboard and pass to read_table.

Excel

read_excel(io[, sheet_name, header, ...]) Read an Excel table into a pandas DataFrame
ExcelFile.parse([sheet_name, header, ...]) Parse specified sheet(s) into a DataFrame

JSON

read_json([path_or_buf, orient, typ, dtype, ...]) Convert a JSON string to pandas object
json_normalize(data[, record_path, meta, ...]) “Normalize” semi-structured JSON data into a flat table
build_table_schema(data[, index, ...]) Create a Table schema from data.

HTML

read_html(io[, match, flavor, header, ...]) Read HTML tables into a list of DataFrame objects.

HDFStore: PyTables (HDF5)

read_hdf(path_or_buf[, key, mode]) Read from the store; close it if we opened it
HDFStore.put(key, value[, format, append]) Store object in HDFStore
HDFStore.append(key, value[, format, ...]) Append to Table in file.
HDFStore.get(key) Retrieve pandas object stored in file
HDFStore.select(key[, where, start, stop, ...]) Retrieve pandas object stored in file, optionally based on where
HDFStore.info() Print detailed information on the store

Feather

read_feather(path[, nthreads]) Load a feather-format object from the file path

Parquet

read_parquet(path[, engine, columns]) Load a parquet object from the file path, returning a DataFrame.

SAS

read_sas(filepath_or_buffer[, format, ...]) Read SAS files stored as either XPORT or SAS7BDAT format files.

SQL

read_sql_table(table_name, con[, schema, ...]) Read SQL database table into a DataFrame.
read_sql_query(sql, con[, index_col, ...]) Read SQL query into a DataFrame.
read_sql(sql, con[, index_col, ...]) Read SQL query or database table into a DataFrame.
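
A small sketch of reading a query result into a DataFrame, assuming an in-memory SQLite database from the standard library. The table name `t` and its contents are hypothetical. Note that read_sql_table generally needs an SQLAlchemy connectable, so only read_sql_query is shown here.

```python
import sqlite3

import pandas as pd

# Build a throwaway in-memory database with a hypothetical table "t".
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INTEGER, name TEXT)")
con.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b")])
con.commit()

# Run a query and collect the rows into a DataFrame.
df = pd.read_sql_query("SELECT id, name FROM t ORDER BY id", con)
con.close()

print(df)
```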

Google BigQuery

read_gbq(query[, project_id, index_col, ...]) Load data from Google BigQuery.

STATA

read_stata(filepath_or_buffer[, ...]) Read Stata file into DataFrame
StataReader.data(**kwargs) Reads observations from Stata file, converting them into a dataframe
StataReader.data_label() Returns data label of Stata file
StataReader.value_labels() Returns a dict, associating each variable name a dict, associating
StataReader.variable_labels() Returns variable labels as a dict, associating each variable name
StataWriter.write_file()

General functions

Data manipulations

melt(frame[, id_vars, value_vars, var_name, ...]) “Unpivots” a DataFrame from wide format to long format, optionally
pivot(index, columns, values) Produce ‘pivot’ table based on 3 columns of this DataFrame.
pivot_table(data[, values, index, columns, ...]) Create a spreadsheet-style pivot table as a DataFrame.
crosstab(index, columns[, values, rownames, ...]) Compute a simple cross-tabulation of two (or more) factors.
cut(x, bins[, right, labels, retbins, ...]) Return indices of half-open bins to which each value of x belongs.
qcut(x, q[, labels, retbins, precision, ...]) Quantile-based discretization function.
merge(left, right[, how, on, left_on, ...]) Merge DataFrame objects by performing a database-style join operation by columns or indexes.
merge_ordered(left, right[, on, left_on, ...]) Perform merge with optional filling/interpolation designed for ordered data like time series data.
merge_asof(left, right[, on, left_on, ...]) Perform an asof merge.
concat(objs[, axis, join, join_axes, ...]) Concatenate pandas objects along a particular axis with optional set logic along the other axes.
get_dummies(data[, prefix, prefix_sep, ...]) Convert categorical variable into dummy/indicator variables
factorize(values[, sort, order, ...]) Encode input values as an enumerated type or categorical variable
unique(values) Hash table-based unique.
wide_to_long(df, stubnames, i, j[, sep, suffix]) Wide panel to long format.
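
The functions above can be combined in many ways; the following is only an illustrative sketch using made-up data (the frame and column names are assumptions, not part of the reference).

```python
import pandas as pd

# Hypothetical wide-format data: one row per student, one column per subject.
wide = pd.DataFrame({
    "student": ["ann", "bob"],
    "math": [90, 70],
    "art": [85, 95],
})

# melt: unpivot the subject columns into (subject, score) pairs.
long = pd.melt(wide, id_vars=["student"], var_name="subject", value_name="score")

# merge: database-style left join on the shared "student" column.
ages = pd.DataFrame({"student": ["ann", "bob"], "age": [20, 22]})
joined = pd.merge(long, ages, on="student", how="left")

# concat: stack two frames along the row axis with a fresh integer index.
stacked = pd.concat([wide, wide], axis=0, ignore_index=True)

# cut: bin scores into labelled intervals (right edges inclusive by default).
grades = pd.cut(long["score"], bins=[0, 80, 100], labels=["low", "high"])

print(joined)
print(grades.tolist())
```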

Top-level missing data

isna(obj) Detect missing values (NaN in numeric arrays, None/NaN in object arrays)
isnull(obj) Detect missing values (NaN in numeric arrays, None/NaN in object arrays)
notna(obj) Replacement for numpy.isfinite / -numpy.isnan which is suitable for use on object arrays.
notnull(obj) Replacement for numpy.isfinite / -numpy.isnan which is suitable for use on object arrays.
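
A brief sketch of the top-level missing-data checks on a made-up Series; isnull and notnull behave the same as isna and notna.

```python
import numpy as np
import pandas as pd

values = pd.Series([1.0, np.nan, 3.0])

# Element-wise masks: True where a value is missing / present.
missing = pd.isna(values)
present = pd.notna(values)

# The same functions also accept scalars.
none_is_missing = pd.isna(None)

print(missing.tolist(), present.tolist(), none_is_missing)
```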

Top-level conversions

to_numeric(arg[, errors, downcast]) Convert argument to a numeric type.
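
As an illustration of the errors and downcast parameters, the sketch below converts made-up string data. With errors="coerce", values that cannot be parsed become NaN instead of raising.

```python
import pandas as pd

# "x" cannot be parsed, so it is coerced to NaN.
coerced = pd.to_numeric(pd.Series(["1", "2.5", "x"]), errors="coerce")

# downcast="integer" asks for the smallest integer dtype that fits the data.
small = pd.to_numeric(pd.Series(["1", "2"]), downcast="integer")

print(coerced)
print(small.dtype)
```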

Top-level dealing with datetimelike

to_datetime(arg[, errors, dayfirst, ...]) Convert argument to datetime.
to_timedelta(arg[, unit, box, errors]) Convert argument to timedelta
date_range([start, end, periods, freq, tz, ...]) Return a fixed frequency DatetimeIndex, with day (calendar) as the default
bdate_range([start, end, periods, freq, tz, ...]) Return a fixed frequency DatetimeIndex, with business day as the default
period_range([start, end, periods, freq, name]) Return a fixed frequency PeriodIndex, with day (calendar) as the default
timedelta_range([start, end, periods, freq, ...]) Return a fixed frequency TimedeltaIndex, with day as the default
infer_freq(index[, warn]) Infer the most likely frequency given the input index.
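
The sketch below exercises a few of these datetime helpers with arbitrary example dates; the variable names are illustrative only.

```python
import pandas as pd

# Parse strings into a DatetimeIndex.
stamps = pd.to_datetime(["2018-01-01", "2018-01-03"])

# Five consecutive calendar days starting on Monday 2018-01-01.
days = pd.date_range(start="2018-01-01", periods=5, freq="D")

# Three business days starting on Friday 2018-01-05 (the weekend is skipped).
bdays = pd.bdate_range(start="2018-01-05", periods=3)

# Parse a duration string into a Timedelta.
delta = pd.to_timedelta("1 days 06:00:00")

# Recover the frequency string from a regular index.
freq = pd.infer_freq(days)

print(stamps, days, bdays, delta, freq, sep="\n")
```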

Top-level dealing with intervals

interval_range([start, end, periods, freq, ...]) Return a fixed frequency IntervalIndex

Top-level evaluation

eval(expr[, parser, engine, truediv, ...]) Evaluate a Python expression as a string using various backends.

Testing

test([extra_args])

Series

Constructor

Series([data, index, dtype, name, copy, ...]) One-dimensional ndarray with axis labels (including time series).

Attributes

Axes
  • index: axis labels
Series.values Return Series as ndarray or ndarray-like
Series.dtype return the dtype object of the underlying data
Series.ftype return if the data is sparse|dense
Series.shape return a tuple of the shape of the underlying data
Series.nbytes return the number of bytes in the underlying data
Series.ndim return the number of dimensions of the underlying data,
Series.size return the number of elements in the underlying data
Series.strides return the strides of the underlying data
Series.itemsize return the size of the dtype of the item of the underlying data
Series.base return the base object if the memory of the underlying data is
Series.T return the transpose, which is by definition self
Series.memory_usage([index, deep]) Memory usage of the Series
Series.hasnans
Series.flags
Series.empty
Series.dtypes return the dtype object of the underlying data
Series.ftypes return if the data is sparse|dense
Series.data return the data pointer of the underlying data
Series.is_copy
Series.name
Series.put(*args, **kwargs) Applies the put method to its values attribute if it has one.

Conversion

Series.astype(dtype[, copy, errors]) Cast a pandas object to the specified dtype.
Series.infer_objects() Attempt to infer better dtypes for object columns.
Series.convert_objects([convert_dates, ...]) Deprecated.
Series.copy([deep]) Make a copy of this object's data.
Series.bool() Return the bool of a single element PandasObject.
Series.to_period([freq, copy]) Convert Series from DatetimeIndex to PeriodIndex with desired
Series.to_timestamp([freq, how, copy]) Cast to datetimeindex of timestamps, at beginning of period
Series.tolist() Return a list of the values.
Series.get_values() same as values (but handles sparseness conversions); is a view

Indexing, iteration

Series.get(key[, default]) Get item from object for given key (DataFrame column, Panel slice, etc.).
Series.at Fast label-based scalar accessor
Series.iat Fast integer location scalar accessor.
Series.loc Purely label-location based indexer for selection by label.
Series.iloc Purely integer-location based indexing for selection by position.
Series.__iter__() Return an iterator of the values.
Series.iteritems() Lazily iterate over (index, value) tuples
Series.items() Lazily iterate over (index, value) tuples
Series.keys() Alias for index
Series.pop(item) Return item and drop from frame.
Series.item() return the first element of the underlying data as a python
Series.xs(key[, axis, level, drop_level]) Returns a cross-section (row(s) or column(s)) from the Series/DataFrame.

For more information on .at, .iat, .loc, and .iloc, see the indexing documentation.
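
To make the difference between the label-based and position-based accessors concrete, here is a short sketch on a made-up Series.

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=["a", "b", "c"])

by_label = s.loc["b"]        # label-based lookup
by_position = s.iloc[0]      # integer-position lookup
scalar_label = s.at["c"]     # fast scalar access by label
scalar_position = s.iat[1]   # fast scalar access by position

# Label slices include both endpoints; positional slices exclude the stop.
label_slice = s.loc["a":"b"]
position_slice = s.iloc[0:1]

# get returns the default instead of raising when the key is absent.
fallback = s.get("z", default=-1)

print(by_label, by_position, scalar_label, scalar_position, fallback)
```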

Binary operator functions

Series.add(other[, level, fill_value, axis]) Addition of series and other, element-wise (binary operator add).
Series.sub(other[, level, fill_value, axis]) Subtraction of series and other, element-wise (binary operator sub).
Series.mul(other[, level, fill_value, axis]) Multiplication of series and other, element-wise (binary operator mul).
Series.div(other[, level, fill_value, axis]) Floating division of series and other, element-wise (binary operator truediv).
Series.truediv(other[, level, fill_value, axis]) Floating division of series and other, element-wise (binary operator truediv).
Series.floordiv(other[, level, fill_value, axis]) Integer division of series and other, element-wise (binary operator floordiv).
Series.mod(other[, level, fill_value, axis]) Modulo of series and other, element-wise (binary operator mod).
Series.pow(other[, level, fill_value, axis]) Exponential power of series and other, element-wise (binary operator pow).
Series.radd(other[, level, fill_value, axis]) Addition of series and other, element-wise (binary operator radd).
Series.rsub(other[, level, fill_value, axis]) Subtraction of series and other, element-wise (binary operator rsub).
Series.rmul(other[, level, fill_value, axis]) Multiplication of series and other, element-wise (binary operator rmul).
Series.rdiv(other[, level, fill_value, axis]) Floating division of series and other, element-wise (binary operator rtruediv).
Series.rtruediv(other[, level, fill_value, axis]) Floating division of series and other, element-wise (binary operator rtruediv).
Series.rfloordiv(other[, level, fill_value, ...]) Integer division of series and other, element-wise (binary operator rfloordiv).
Series.rmod(other[, level, fill_value, axis]) Modulo of series and other, element-wise (binary operator rmod).
Series.rpow(other[, level, fill_value, axis]) Exponential power of series and other, element-wise (binary operator rpow).
Series.combine(other, func[, fill_value]) Perform elementwise binary operation on two Series using given function
Series.combine_first(other) Combine Series values, choosing the calling Series’s values first.
Series.round([decimals]) Round each value in a Series to the given number of decimals.
Series.lt(other[, level, fill_value, axis]) Less than of series and other, element-wise (binary operator lt).
Series.gt(other[, level, fill_value, axis]) Greater than of series and other, element-wise (binary operator gt).
Series.le(other[, level, fill_value, axis]) Less than or equal to of series and other, element-wise (binary operator le).
Series.ge(other[, level, fill_value, axis]) Greater than or equal to of series and other, element-wise (binary operator ge).
Series.ne(other[, level, fill_value, axis]) Not equal to of series and other, element-wise (binary operator ne).
Series.eq(other[, level, fill_value, axis]) Equal to of series and other, element-wise (binary operator eq).
Series.product([axis, skipna, level, ...]) Return the product of the values for the requested axis
Series.dot(other) Matrix multiplication with DataFrame or inner-product with Series
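
The method forms of the operators differ from the plain operators mainly through the fill_value parameter. A minimal sketch with two made-up, partially overlapping Series:

```python
import pandas as pd

a = pd.Series([1.0, 2.0], index=["x", "y"])
b = pd.Series([10.0, 20.0], index=["y", "z"])

# The plain operator aligns on the index; non-overlapping labels give NaN.
plain = a + b

# fill_value substitutes for a label missing on one side before adding.
filled = a.add(b, fill_value=0)

# Reflected variants swap the operands: a.rsub(1) computes 1 - a.
reflected = a.rsub(1)

# Comparison methods return a boolean Series.
greater = a.gt(1.5)

print(plain, filled, reflected, greater, sep="\n")
```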

Function application, GroupBy & Window

Series.apply(func[, convert_dtype, args]) Invoke function on values of Series.
Series.agg(func[, axis]) Aggregate using callable, string, dict, or list of string/callables
Series.aggregate(func[, axis]) Aggregate using callable, string, dict, or list of string/callables
Series.transform(func, *args, **kwargs) Call function producing a like-indexed NDFrame
Series.map(arg[, na_action]) Map values of Series using input correspondence (which can be
Series.groupby([by, axis, level, as_index, ...]) Group series using mapper (dict or key function, apply given function to group, return result as series) or by a series of columns.
Series.rolling(window[, min_periods, freq, ...]) Provides rolling window calculations.
Series.expanding([min_periods, freq, ...]) Provides expanding transformations.
Series.ewm([com, span, halflife, alpha, ...]) Provides exponential weighted functions
Series.pipe(func, *args, **kwargs) Apply func(self, *args, **kwargs)
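
As a rough sketch of how these methods differ, the example below applies several of them to the same made-up Series; the lambdas and the mapping dict are illustrative assumptions.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4], index=["a", "b", "a", "b"])

# apply: call a function on every value.
squared = s.apply(lambda v: v ** 2)

# map: look each value up in a dict; values without an entry become NaN.
labels = s.map({1: "one", 2: "two"})

# groupby: group by the index labels, then aggregate each group.
sums = s.groupby(level=0).sum()

# rolling: sum over a sliding window of two observations.
rolled = s.rolling(window=2).sum()

# agg: compute several reductions at once.
summary = s.agg(["min", "max"])

# pipe: pass the Series as the first argument of an arbitrary function.
scaled = s.pipe(lambda x, k: x * k, 10)

print(sums, rolled, summary, sep="\n")
```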

Computations / Descriptive Stats

Series.abs() Return an object with absolute value taken; only applicable to objects that are all numeric.
Series.all([axis, bool_only, skipna, level]) Return whether all elements are True over requested axis
Series.any([axis, bool_only, skipna, level]) Return whether any element is True over requested axis
Series.autocorr([lag]) Lag-N autocorrelation
Series.between(left, right[, inclusive]) Return boolean Series equivalent to left <= series <= right.
Series.clip([lower, upper, axis, inplace]) Trim values at input threshold(s).
Series.clip_lower(threshold[, axis, inplace]) Return copy of the input with values below given value(s) truncated.
Series.clip_upper(threshold[, axis, inplace]) Return copy of input with values above given value(s) truncated.
Series.corr(other[, method, min_periods]) Compute correlation with other Series, excluding missing values
Series.count([level]) Return number of non-NA/null observations in the Series
Series.cov(other[, min_periods]) Compute covariance with Series, excluding missing values
Series.cummax([axis, skipna]) Return cumulative max over requested axis.
Series.cummin([axis, skipna]) Return cumulative minimum over requested axis.
Series.cumprod([axis, skipna]) Return cumulative product over requested axis.
Series.cumsum([axis, skipna]) Return cumulative sum over requested axis.
Series.describe([percentiles, include, exclude]) Generates descriptive statistics that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values.
Series.diff([periods]) 1st discrete difference of object
Series.factorize([sort, na_sentinel]) Encode the object as an enumerated type or categorical variable
Series.kurt([axis, skipna, level, numeric_only]) Return unbiased kurtosis over requested axis using Fisher’s definition of kurtosis (kurtosis of normal == 0.0).
Series.mad([axis, skipna, level]) Return the mean absolute deviation of the values for the requested axis
Series.max([axis, skipna, level, numeric_only]) This method returns the maximum of the values in the object.
Series.mean([axis, skipna, level, numeric_only]) Return the mean of the values for the requested axis
Series.median([axis, skipna, level, ...]) Return the median of the values for the requested axis
Series.min([axis, skipna, level, numeric_only]) This method returns the minimum of the values in the object.
Series.mode() Return the mode(s) of the dataset.
Series.nlargest([n, keep]) Return the largest n elements.
Series.nsmallest([n, keep]) Return the smallest n elements.
Series.pct_change([periods, fill_method, ...]) Percent change over given number of periods.
Series.prod([axis, skipna, level, numeric_only]) Return the product of the values for the requested axis
Series.quantile([q, interpolation]) Return value at the given quantile, a la numpy.percentile.
Series.rank([axis, method, numeric_only, ...]) Compute numerical data ranks (1 through n) along axis.
Series.sem([axis, skipna, level, ddof, ...]) Return unbiased standard error of the mean over requested axis.
Series.skew([axis, skipna, level, numeric_only]) Return unbiased skew over requested axis
Series.std([axis, skipna, level, ddof, ...]) Return sample standard deviation over requested axis.
Series.sum([axis, skipna, level, numeric_only]) Return the sum of the values for the requested axis
Series.var([axis, skipna, level, ddof, ...]) Return unbiased variance over requested axis.
Series.kurtosis([axis, skipna, level, ...]) Return unbiased kurtosis over requested axis using Fisher’s definition of kurtosis (kurtosis of normal == 0.0).
Series.unique() Return unique values in the object.
Series.nunique([dropna]) Return number of unique elements in the object.
Series.is_unique Return boolean if values in the object are unique
Series.is_monotonic Return boolean if values in the object are
Series.is_monotonic_increasing Return boolean if values in the object are
Series.is_monotonic_decreasing Return boolean if values in the object are
Series.value_counts([normalize, sort, ...]) Returns object containing counts of unique values.
Series.compound([axis, skipna, level]) Return the compound percentage of the values for the requested axis
Series.nonzero() Return the indices of the elements that are non-zero
Series.ptp([axis, skipna, level, numeric_only]) Returns the difference between the maximum value and the minimum value in the object.
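
A handful of the descriptive methods applied to a small made-up Series, as a sketch only:

```python
import pandas as pd

s = pd.Series([3, 1, 2, 3])

mean = s.mean()
median = s.median()
running_total = s.cumsum()

# Frequency of each distinct value, and the number of distinct values.
counts = s.value_counts()
n_unique = s.nunique()

# The two largest values, and ranks where ties share the lowest rank.
top_two = s.nlargest(2)
ranks = s.rank(method="min")

# Boolean mask for 2 <= value <= 3.
in_range = s.between(2, 3)

print(mean, median, running_total.tolist(), n_unique)
```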

Reindexing / Selection / Label manipulation

Series.align(other[, join, axis, level, ...]) Align two objects on their axes with the
Series.drop([labels, axis, index, columns, ...]) Return new object with labels in requested axis removed.
Series.drop_duplicates([keep, inplace]) Return Series with duplicate values removed
Series.duplicated([keep]) Return boolean Series denoting duplicate values
Series.equals(other) Determines if two NDFrame objects contain the same elements.
Series.first(offset) Convenience method for subsetting initial periods of time series data based on a date offset.
Series.head([n]) Return the first n rows.
Series.idxmax([axis, skipna]) Index label of the first occurrence of maximum of values.
Series.idxmin([axis, skipna]) Index label of the first occurrence of minimum of values.
Series.isin(values) Return a boolean Series showing whether each element in the Series is exactly contained in the passed sequence of values.
Series.last(offset) Convenience method for subsetting final periods of time series data based on a date offset.
Series.reindex([index]) Conform Series to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index.
Series.reindex_like(other[, method, copy, ...]) Return an object with matching indices to myself.
Series.rename([index]) Alter Series index labels or name
Series.rename_axis(mapper[, axis, copy, inplace]) Alter the name of the index or columns.
Series.reset_index([level, drop, name, inplace]) Analogous to the pandas.DataFrame.reset_index() function, see docstring there.
Series.sample([n, frac, replace, weights, ...]) Returns a random sample of items from an axis of object.
Series.select(crit[, axis]) Return data corresponding to axis labels matching criteria
Series.set_axis(labels[, axis, inplace]) Assign desired index to given axis
Series.take(indices[, axis, convert, is_copy]) Return the elements in the given positional indices along an axis.
Series.tail([n]) Return the last n rows.
Series.truncate([before, after, axis, copy]) Truncates a sorted DataFrame/Series before and/or after some particular index value.
Series.where(cond[, other, inplace, axis, ...]) Return an object of same shape as self and whose corresponding entries are from self where cond is True and otherwise are from other.
Series.mask(cond[, other, inplace, axis, ...]) Return an object of same shape as self and whose corresponding entries are from self where cond is False and otherwise are from other.
Series.add_prefix(prefix) Concatenate prefix string with panel items names.
Series.add_suffix(suffix) Concatenate suffix string with panel items names.
Series.filter([items, like, regex, axis]) Subset rows or columns of dataframe according to labels in the specified index.
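
The following sketch contrasts reindex, where, and mask on a made-up Series; where keeps values for which the condition holds, while mask replaces them.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["a", "b", "c"])

# reindex: conform to new labels; "d" has no previous value, so it is NaN.
reindexed = s.reindex(["c", "a", "d"])

# where: keep entries where the condition is True, NaN elsewhere.
kept = s.where(s > 1)

# mask: replace entries where the condition is True with another value.
masked = s.mask(s > 1, other=0)

# isin / idxmax / drop: membership test, label of the maximum, label removal.
members = s.isin([1, 3])
max_label = s.idxmax()
without_a = s.drop("a")

print(reindexed, kept, masked, sep="\n")
```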

Missing data handling

Series.isna() Return a boolean same-sized object indicating if the values are NA.
Series.notna() Return a boolean same-sized object indicating if the values are not NA.
Series.dropna([axis, inplace]) Return Series without null values
Series.fillna([value, method, axis, ...]) Fill NA/NaN values using the specified method
Series.interpolate([method, axis, limit, ...]) Interpolate values according to different methods.
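
A minimal sketch of the three usual ways to handle a gap (drop it, fill it with a constant, or interpolate it), using a made-up Series and the default linear interpolation:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

flags = s.isna()                 # where the gaps are
dropped = s.dropna()             # remove missing entries
filled = s.fillna(0)             # replace them with a constant
interpolated = s.interpolate()   # linear interpolation between neighbours

print(dropped.tolist(), filled.tolist(), interpolated.tolist())
```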

Reshaping, sorting

Series.argsort([axis, kind, order]) Overrides ndarray.argsort.
Series.argmin([axis, skipna]) ‘argmin’ is deprecated, use ‘idxmin’ instead. The behavior of ‘argmin’
Series.argmax([axis, skipna]) ‘argmax’ is deprecated, use ‘idxmax’ instead. The behavior of ‘argmax’
Series.reorder_levels(order) Rearrange index levels using input order.
Series.sort_values([axis, ascending, ...]) Sort by the values along either axis
Series.sort_index([axis, level, ascending, ...]) Sort object by labels (along an axis)
Series.swaplevel([i, j, copy]) Swap levels i and j in a MultiIndex
Series.unstack([level, fill_value]) Unstack, a.k.a. pivot, Series with MultiIndex to produce DataFrame.
Series.searchsorted(value[, side, sorter]) Find indices where elements should be inserted to maintain order.
Series.ravel([order]) Return the flattened underlying data as an ndarray
Series.repeat(repeats, *args, **kwargs) Repeat elements of a Series.
Series.squeeze([axis]) Squeeze length 1 dimensions.
Series.view([dtype])
Series.sortlevel([level, ascending, ...]) DEPRECATED: use Series.sort_index()
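
As an illustrative sketch of sorting and reshaping, the example below sorts a made-up Series by value and by label, then unstacks a Series with a two-level MultiIndex into a DataFrame.

```python
import pandas as pd

s = pd.Series([3, 1, 2], index=["b", "c", "a"])

by_value = s.sort_values()
by_label = s.sort_index()

# A Series indexed by every (outer, inner) pair of two small label sets.
index = pd.MultiIndex.from_product([["x", "y"], ["p", "q"]])
stacked = pd.Series([1, 2, 3, 4], index=index)

# unstack moves the inner index level into the columns.
wide = stacked.unstack()

print(by_value, by_label, wide, sep="\n")
```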

Combining / joining / merging

Series.append(to_append[, ignore_index, ...]) Concatenate two or more Series.
Series.replace([to_replace, value, inplace, ...]) Replace values given in ‘to_replace’ with ‘value’.
Series.update(other) Modify Series in place using non-NA values from passed Series.

Datetimelike Properties

Series.dt can be used to access the values of the series as datetimelike and return several properties. These can be accessed like Series.dt.<property>.

Datetime Properties

Series.dt.date Returns numpy array of python datetime.date objects (namely, the date part of Timestamps without timezone information).
Series.dt.time Returns numpy array of datetime.time.
Series.dt.year The year of the datetime
Series.dt.month The month as January=1, December=12
Series.dt.day The days of the datetime
Series.dt.hour The hours of the datetime
Series.dt.minute The minutes of the datetime
Series.dt.second The seconds of the datetime
Series.dt.microsecond The microseconds of the datetime
Series.dt.nanosecond The nanoseconds of the datetime
Series.dt.week The week ordinal of the year
Series.dt.weekofyear The week ordinal of the year
Series.dt.dayofweek The day of the week with Monday=0, Sunday=6
Series.dt.weekday The day of the week with Monday=0, Sunday=6
Series.dt.weekday_name The name of day in a week (ex: Friday)
Series.dt.dayofyear The ordinal day of the year
Series.dt.quarter The quarter of the date
Series.dt.is_month_start Logical indicating if first day of month (defined by frequency)
Series.dt.is_month_end Logical indicating if last day of month (defined by frequency)
Series.dt.is_quarter_start Logical indicating if first day of quarter (defined by frequency)
Series.dt.is_quarter_end Logical indicating if last day of quarter (defined by frequency)
Series.dt.is_year_start Logical indicating if first day of year (defined by frequency)
Series.dt.is_year_end Logical indicating if last day of year (defined by frequency)
Series.dt.is_leap_year Logical indicating if the date belongs to a leap year
Series.dt.daysinmonth The number of days in the month
Series.dt.days_in_month The number of days in the month
Series.dt.tz
Series.dt.freq

Datetime Methods

Series.dt.to_period(*args, **kwargs) Cast to PeriodIndex at a particular frequency
Series.dt.to_pydatetime()
Series.dt.tz_localize(*args, **kwargs) Localize tz-naive DatetimeIndex to given time zone (using
Series.dt.tz_convert(*args, **kwargs) Convert tz-aware DatetimeIndex from one time zone to another (using
Series.dt.normalize(*args, **kwargs) Return DatetimeIndex with times to midnight.
Series.dt.strftime(*args, **kwargs) Return an array of formatted strings specified by date_format, which supports the same string format as the python standard library.
Series.dt.round(*args, **kwargs) round the index to the specified freq
Series.dt.floor(*args, **kwargs) floor the index to the specified freq
Series.dt.ceil(*args, **kwargs) ceil the index to the specified freq
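
A short sketch of the .dt accessor on a made-up datetime Series, touching a few of the properties and methods listed above:

```python
import pandas as pd

s = pd.Series(pd.to_datetime(["2018-01-01 09:30", "2018-02-15 18:45"]))

years = s.dt.year
months = s.dt.month
weekdays = s.dt.dayofweek            # Monday=0, Sunday=6
month_starts = s.dt.is_month_start

# Methods return a new Series; the original is left unchanged.
midnights = s.dt.normalize()
formatted = s.dt.strftime("%Y-%m-%d")
floored = s.dt.floor("D")
localized = s.dt.tz_localize("UTC")

print(weekdays.tolist(), formatted.tolist())
```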

Timedelta Properties

Series.dt.days Number of days for each element.
Series.dt.seconds Number of seconds (>= 0 and less than 1 day) for each element.
Series.dt.microseconds Number of microseconds (>= 0 and less than 1 second) for each element.
Series.dt.nanoseconds Number of nanoseconds (>= 0 and less than 1 microsecond) for each element.
Series.dt.components Return a dataframe of the components (days, hours, minutes, seconds, milliseconds, microseconds, nanoseconds) of the Timedeltas.

Timedelta Methods

Series.dt.to_pytimedelta()
Series.dt.total_seconds(*args, **kwargs) Total duration of each element expressed in seconds.
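
For timedelta data, the component properties and total_seconds answer different questions; a sketch with made-up durations:

```python
import pandas as pd

td = pd.Series(pd.to_timedelta(["1 days 02:00:00", "0 days 00:00:30"]))

days = td.dt.days                 # whole days in each element
seconds = td.dt.seconds           # leftover seconds within the last day
total = td.dt.total_seconds()     # entire duration expressed in seconds
hours = td.dt.components["hours"]  # one column of the components frame

print(days.tolist(), seconds.tolist(), total.tolist(), hours.tolist())
```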

String handling

Series.str can be used to access the values of the series as strings and apply several methods to it. These can be accessed like Series.str.<function/property>.

Series.str.capitalize() Convert strings in the Series/Index to be capitalized.
Series.str.cat([others, sep, na_rep]) Concatenate strings in the Series/Index with given separator.
Series.str.center(width[, fillchar]) Filling left and right side of strings in the Series/Index with an additional character.
Series.str.contains(pat[, case, flags, na, ...]) Return boolean Series/array whether given pattern/regex is contained in each string in the Series/Index.
Series.str.count(pat[, flags]) Count occurrences of pattern in each string of the Series/Index.
Series.str.decode(encoding[, errors]) Decode character string in the Series/Index using indicated encoding.
Series.str.encode(encoding[, errors]) Encode character string in the Series/Index using indicated encoding.
Series.str.endswith(pat[, na]) Return boolean Series indicating whether each string in the Series/Index ends with passed pattern.
Series.str.extract(pat[, flags, expand]) For each subject string in the Series, extract groups from the first match of regular expression pat.
Series.str.extractall(pat[, flags]) For each subject string in the Series, extract groups from all matches of regular expression pat.
Series.str.find(sub[, start, end]) Return lowest indexes in each string in the Series/Index where the substring is fully contained between [start:end].
Series.str.findall(pat[, flags]) Find all occurrences of pattern or regular expression in the Series/Index.
Series.str.get(i) Extract element from lists, tuples, or strings in each element in the Series/Index.
Series.str.index(sub[, start, end]) Return lowest indexes in each string where the substring is fully contained between [start:end].
Series.str.join(sep) Join lists contained as elements in the Series/Index with passed delimiter.
Series.str.len() Compute length of each string in the Series/Index.
Series.str.ljust(width[, fillchar]) Filling right side of strings in the Series/Index with an additional character.
Series.str.lower() Convert strings in the Series/Index to lowercase.
Series.str.lstrip([to_strip]) Strip whitespace (including newlines) from each string in the Series/Index from left side.
Series.str.match(pat[, case, flags, na, ...]) Determine if each string matches a regular expression.
Series.str.normalize(form) Return the Unicode normal form for the strings in the Series/Index.
Series.str.pad(width[, side, fillchar]) Pad strings in the Series/Index with an additional character to specified side.
Series.str.partition([pat, expand]) Split the string at the first occurrence of sep, and return 3 elements containing the part before the separator, the separator itself, and the part after the separator.
Series.str.repeat(repeats) Duplicate each string in the Series/Index by indicated number of times.
Series.str.replace(pat, repl[, n, case, flags]) Replace occurrences of pattern/regex in the Series/Index with some other string.
Series.str.rfind(sub[, start, end]) Return highest indexes in each string in the Series/Index where the substring is fully contained between [start:end].
Series.str.rindex(sub[, start, end]) Return highest indexes in each string where the substring is fully contained between [start:end].
Series.str.rjust(width[, fillchar]) Filling left side of strings in the Series/Index with an additional character.
Series.str.rpartition([pat, expand]) Split the string at the last occurrence of sep, and return 3 elements containing the part before the separator, the separator itself, and the part after the separator.
Series.str.rstrip([to_strip]) Strip whitespace (including newlines) from each string in the Series/Index from right side.
Series.str.slice([start, stop, step]) Slice substrings from each element in the Series/Index
Series.str.slice_replace([start, stop, repl]) Replace a slice of each string in the Series/Index with another string.
Series.str.split([pat, n, expand]) Split each string (a la re.split) in the Series/Index by given pattern, propagating NA values.
Series.str.rsplit([pat, n, expand]) Split each string in the Series/Index by the given delimiter string, starting at the end of the string and working to the front.
Series.str.startswith(pat[, na]) Return boolean Series/array indicating whether each string in the Series/Index starts with passed pattern.
Series.str.strip([to_strip]) Strip whitespace (including newlines) from each string in the Series/Index from left and right sides.
Series.str.swapcase() Convert strings in the Series/Index to be swapcased.
Series.str.title() Convert strings in the Series/Index to titlecase.
Series.str.translate(table[, deletechars]) Map all characters in the string through the given mapping table.
Series.str.upper() Convert strings in the Series/Index to uppercase.
Series.str.wrap(width, **kwargs) Wrap long strings in the Series/Index to be formatted in paragraphs with length less than a given width.
Series.str.zfill(width) Filling left side of strings in the Series/Index with 0.
Series.str.isalnum() Check whether all characters in each string in the Series/Index are alphanumeric.
Series.str.isalpha() Check whether all characters in each string in the Series/Index are alphabetic.
Series.str.isdigit() Check whether all characters in each string in the Series/Index are digits.
Series.str.isspace() Check whether all characters in each string in the Series/Index are whitespace.
Series.str.islower() Check whether all characters in each string in the Series/Index are lowercase.
Series.str.isupper() Check whether all characters in each string in the Series/Index are uppercase.
Series.str.istitle() Check whether all characters in each string in the Series/Index are titlecase.
Series.str.isnumeric() Check whether all characters in each string in the Series/Index are numeric.
Series.str.isdecimal() Check whether all characters in each string in the Series/Index are decimal.
Series.str.get_dummies([sep]) Split each string in the Series by sep and return a frame of dummy/indicator variables.
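
A few of the .str methods on made-up strings, as a sketch; each call is vectorized over the Series and returns a new object.

```python
import pandas as pd

s = pd.Series(["apple pie", "Banana"])

upper = s.str.upper()
has_an = s.str.contains("an")
lengths = s.str.len()

# expand=True spreads the split pieces across DataFrame columns.
parts = s.str.split(" ", expand=True)

# get_dummies splits on the separator and builds indicator columns.
tags = pd.Series(["a|b", "b"])
dummies = tags.str.get_dummies(sep="|")

print(upper.tolist(), has_an.tolist(), lengths.tolist())
print(dummies)
```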

Categorical

Pandas defines a custom data type for representing data that can take only a limited, fixed set of values. The dtype of a Categorical can be described by a pandas.api.types.CategoricalDtype.

api.types.CategoricalDtype([categories, ordered]) Type for categorical data with the categories and orderedness
api.types.CategoricalDtype.categories An Index containing the unique categories allowed.
api.types.CategoricalDtype.ordered Whether the categories have an ordered relationship
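
A minimal sketch of constructing a CategoricalDtype and reading its two attributes. The category labels are hypothetical.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Hypothetical category labels, for illustration only.
sizes = CategoricalDtype(categories=["small", "medium", "large"], ordered=True)
unordered = CategoricalDtype(categories=["small", "medium", "large"])

allowed = list(sizes.categories)  # the unique categories, held in an Index
is_ordered = sizes.ordered        # whether the categories have an order
```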

Categorical data can be stored in a pandas.Categorical:

Categorical(values[, categories, ordered, ...]) Represents a categorical variable in classic R / S-plus fashion

The alternative Categorical.from_codes() constructor can be used when you have the categories and integer codes already:

Categorical.from_codes(codes, categories[, ...]) Make a Categorical type from codes and categories arrays.
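
A short sketch of the from_codes constructor, assuming the integer codes and the categories are already at hand. The values are hypothetical; a code of -1 marks a missing value.

```python
import pandas as pd

# Hypothetical inputs: positions into `categories`; -1 means missing.
codes = [0, 1, 0, 2, -1]
categories = ["a", "b", "c"]

cat = pd.Categorical.from_codes(codes, categories)

first_value = cat[0]      # the category at code 0
stored_codes = cat.codes  # the integer codes backing the Categorical
```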

The dtype information is available on the Categorical:

Categorical.dtype The CategoricalDtype for this instance
Categorical.categories The categories of this categorical.
Categorical.ordered Whether the categories have an ordered relationship
Categorical.codes The category codes of this categorical.

np.asarray(categorical) works by implementing the array interface. Be aware that this converts the Categorical back to a NumPy array, so the categories and order information are not preserved!

Categorical.__array__([dtype]) The numpy array interface.
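
The loss of category information on conversion can be seen in a small sketch like the following. The values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical ordered categorical with an unused category "c".
cat = pd.Categorical(["b", "a", "b"], categories=["a", "b", "c"], ordered=True)

arr = np.asarray(cat)  # plain NumPy array of the values

# The array holds only the values; categories and ordering are gone.
values = arr.tolist()
```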

A Categorical can be stored in a Series or DataFrame. To create a Series of dtype category, use cat = s.astype(dtype) or Series(..., dtype=dtype) where dtype is either the string 'category' or an instance of CategoricalDtype.

If the Series is of dtype CategoricalDtype, Series.cat can be used to change the categorical data. This accessor is similar to Series.dt and Series.str and has the following usable methods and properties:

Series.cat.categories The categories of this categorical.
Series.cat.ordered Whether the categories have an ordered relationship
Series.cat.codes
Series.cat.rename_categories(*args, **kwargs) Renames categories.
Series.cat.reorder_categories(*args, **kwargs) Reorders categories as specified in new_categories.
Series.cat.add_categories(*args, **kwargs) Add new categories.
Series.cat.remove_categories(*args, **kwargs) Removes the specified categories.
Series.cat.remove_unused_categories(*args, ...) Removes categories which are not used.
Series.cat.set_categories(*args, **kwargs) Sets the categories to the specified new_categories.
Series.cat.as_ordered(*args, **kwargs) Sets the Categorical to be ordered
Series.cat.as_unordered(*args, **kwargs) Sets the Categorical to be unordered
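
A brief sketch of the Series.cat accessor on a Series of dtype category. The data are hypothetical; each method shown returns a new Series rather than modifying the original.

```python
import pandas as pd

# Hypothetical categorical Series.
s = pd.Series(["a", "b", "a"], dtype="category")

codes = s.cat.codes                          # integer code per element
added = s.cat.add_categories(["c"])          # "c" is allowed but unused
renamed = added.cat.rename_categories({"a": "A"})
trimmed = renamed.cat.remove_unused_categories()
ordered = trimmed.cat.as_ordered()
```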

Plotting

Series.plot is both a callable method and a namespace attribute for specific plotting methods of the form Series.plot.<kind>.

Series.plot([kind, ax, figsize, ....]) Series plotting accessor and method
Series.plot.area(**kwds) Area plot
Series.plot.bar(**kwds) Vertical bar plot
Series.plot.barh(**kwds) Horizontal bar plot
Series.plot.box(**kwds) Boxplot
Series.plot.density(**kwds) Kernel Density Estimate plot
Series.plot.hist([bins]) Histogram
Series.plot.kde(**kwds) Kernel Density Estimate plot
Series.plot.line(**kwds) Line plot
Series.plot.pie(**kwds) Pie chart
Series.hist([by, ax, grid, xlabelsize, ...]) Draw histogram of the input series using matplotlib

Serialization / IO / Conversion

Series.to_pickle(path[, compression, protocol]) Pickle (serialize) object to input file path.
Series.to_csv([path, index, sep, na_rep, ...]) Write Series to a comma-separated values (csv) file
Series.to_dict([into]) Convert Series to {label -> value} dict or dict-like object.
Series.to_excel(excel_writer[, sheet_name, ...]) Write Series to an excel sheet
Series.to_frame([name]) Convert Series to DataFrame
Series.to_xarray() Return an xarray object from the pandas object.
Series.to_hdf(path_or_buf, key, **kwargs) Write the contained data to an HDF5 file using HDFStore.
Series.to_sql(name, con[, flavor, schema, ...]) Write records stored in a DataFrame to a SQL database.
Series.to_msgpack([path_or_buf, encoding]) msgpack (serialize) object to input file path
Series.to_json([path_or_buf, orient, ...]) Convert the object to a JSON string.
Series.to_sparse([kind, fill_value]) Convert Series to SparseSeries
Series.to_dense() Return dense representation of NDFrame (as opposed to sparse)
Series.to_string([buf, na_rep, ...]) Render a string representation of the Series
Series.to_clipboard([excel, sep]) Attempt to write text representation of object to the system clipboard. This can be pasted into Excel, for example.
Series.to_latex([buf, columns, col_space, ...]) Render an object to a tabular environment table.
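
A minimal sketch of a few in-memory conversions from the table above, using a hypothetical Series. Methods that write to files or external systems are left out.

```python
import json

import pandas as pd

# Hypothetical labelled Series.
s = pd.Series([1, 2, 3], index=["a", "b", "c"], name="x")

as_dict = s.to_dict()          # {label -> value}
frame = s.to_frame()           # single-column DataFrame named after the Series
as_json = s.to_json()          # JSON string when no path is given
decoded = json.loads(as_json)  # parse it back for inspection
text = s.to_string()           # plain-text rendering
```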

Sparse

SparseSeries.to_coo([row_levels, ...]) Create a scipy.sparse.coo_matrix from a SparseSeries with MultiIndex.
SparseSeries.from_coo(A[, dense_index]) Create a SparseSeries from a scipy.sparse.coo_matrix.

DataFrame

Constructor

DataFrame([data, index, columns, dtype, copy]) Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns).

Attributes and underlying data

Axes

  • index: row labels
  • columns: column labels

DataFrame.as_matrix([columns]) Convert the frame to its Numpy-array representation.
DataFrame.dtypes Return the dtypes in this object.
DataFrame.ftypes Return the ftypes (indication of sparse/dense and dtype) in this object.
DataFrame.get_dtype_counts() Return the counts of dtypes in this object.
DataFrame.get_ftype_counts() Return the counts of ftypes in this object.
DataFrame.select_dtypes([include, exclude]) Return a subset of a DataFrame including/excluding columns based on their dtype.
DataFrame.values Numpy representation of NDFrame
DataFrame.get_values() same as values (but handles sparseness conversions)
DataFrame.axes Return a list with the row axis labels and column axis labels as the only members.
DataFrame.ndim Number of axes / array dimensions
DataFrame.size number of elements in the NDFrame
DataFrame.shape Return a tuple representing the dimensionality of the DataFrame.
DataFrame.memory_usage([index, deep]) Memory usage of DataFrame columns.
DataFrame.empty True if NDFrame is entirely empty [no items], meaning any of the axes are of length 0.
DataFrame.is_copy
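
The following sketch reads several of the attributes above from a small hypothetical DataFrame. It sticks to attributes that are widely available across pandas versions.

```python
import pandas as pd

# Hypothetical frame with one integer and one float column.
df = pd.DataFrame(
    {"a": [1, 2, 3], "b": [1.5, 2.5, 3.5]},
    index=["r1", "r2", "r3"],
)

shape = df.shape                                 # (rows, columns)
n_dims = df.ndim                                 # number of axes
n_cells = df.size                                # rows * columns
row_labels, col_labels = df.axes                 # [index, columns]
floats = df.select_dtypes(include=["float64"])   # keep float columns only
usage = df.memory_usage()                        # bytes per column, plus the index
```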

Conversion

DataFrame.astype(dtype[, copy, errors]) Cast a pandas object to a specified dtype.
DataFrame.convert_objects([convert_dates, ...]) Deprecated.
DataFrame.infer_objects() Attempt to infer better dtypes for object columns.
DataFrame.copy([deep]) Make a copy of this object's data.
DataFrame.isna() Return a boolean same-sized object indicating if the values are NA.
DataFrame.notna() Return a boolean same-sized object indicating if the values are not NA.
DataFrame.bool() Return the bool of a single element PandasObject.
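
A small sketch of the conversion methods, assuming a hypothetical frame with a string column and a float column containing a missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical data: numeric strings and a float column with one NaN.
df = pd.DataFrame({"a": ["1", "2", "3"], "b": [1.0, np.nan, 3.0]})

converted = df.astype({"a": "int64"})  # cast only column "a"
missing = df.isna()                    # True where a value is NA
present = df.notna()                   # the inverse mask

copied = df.copy()                     # deep copy by default
copied.loc[0, "b"] = 99.0              # does not affect the original
```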

Indexing, iteration

DataFrame.head([n]) Return the first n rows.
DataFrame.at Fast label-based scalar accessor
DataFrame.iat Fast integer location scalar accessor.
DataFrame.loc Purely label-location based indexer for selection by label.
DataFrame.iloc Purely integer-location based indexing for selection by position.
DataFrame.insert(loc, column, value[, ...]) Insert column into DataFrame at specified location.
DataFrame.__iter__() Iterate over info axis
DataFrame.items() Iterator over (column name, Series) pairs.
DataFrame.keys() Get the ‘info axis’ (see Indexing for more)
DataFrame.iteritems() Iterator over (column name, Series) pairs.
DataFrame.iterrows() Iterate over DataFrame rows as (index, Series) pairs.
DataFrame.itertuples([index, name]) Iterate over DataFrame rows as namedtuples, with index value as first element of the tuple.
DataFrame.lookup(row_labels, col_labels) Label-based “fancy indexing” function for DataFrame.
DataFrame.pop(item) Return item and drop from frame.
DataFrame.tail([n]) Return the last n rows.
DataFrame.xs(key[, axis, level, drop_level]) Returns a cross-section (row(s) or column(s)) from the Series/DataFrame.
DataFrame.get(key[, default]) Get item from object for given key (DataFrame column, Panel slice, etc.).
DataFrame.isin(values) Return boolean DataFrame showing whether each element in the DataFrame is contained in values.
DataFrame.where(cond[, other, inplace, ...]) Return an object of same shape as self and whose corresponding entries are from self where cond is True and otherwise are from other.
DataFrame.mask(cond[, other, inplace, axis, ...]) Return an object of same shape as self and whose corresponding entries are from self where cond is False and otherwise are from other.
DataFrame.query(expr[, inplace]) Query the columns of a frame with a boolean expression.

For more information on .at, .iat, .loc, and .iloc, see the indexing documentation.
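
As a rough sketch of how the accessors differ, the example below selects from a small hypothetical frame by label and by position, then filters with where, mask and query.

```python
import pandas as pd

# Hypothetical frame with string row labels.
df = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]}, index=["a", "b", "c"])

by_label = df.loc["b", "y"]   # label-based selection
by_position = df.iloc[0, 1]   # integer-position selection
fast_label = df.at["c", "x"]  # scalar access by label
fast_pos = df.iat[2, 0]       # scalar access by position

kept = df.where(df > 2)       # keep entries where cond is True, else NaN
hidden = df.mask(df > 2)      # keep entries where cond is False, else NaN
matched = df.query("x > 1 and y < 30")
```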

Binary operator functions

DataFrame.add(other[, axis, level, fill_value]) Addition of dataframe and other, element-wise (binary operator add).
DataFrame.sub(other[, axis, level, fill_value]) Subtraction of dataframe and other, element-wise (binary operator sub).
DataFrame.mul(other[, axis, level, fill_value]) Multiplication of dataframe and other, element-wise (binary operator mul).
DataFrame.div(other[, axis, level, fill_value]) Floating division of dataframe and other, element-wise (binary operator truediv).
DataFrame.truediv(other[, axis, level, ...]) Floating division of dataframe and other, element-wise (binary operator truediv).
DataFrame.floordiv(other[, axis, level, ...]) Integer division of dataframe and other, element-wise (binary operator floordiv).
DataFrame.mod(other[, axis, level, fill_value]) Modulo of dataframe and other, element-wise (binary operator mod).
DataFrame.pow(other[, axis, level, fill_value]) Exponential power of dataframe and other, element-wise (binary operator pow).
DataFrame.dot(other) Matrix multiplication with DataFrame or Series objects
DataFrame.radd(other[, axis, level, fill_value]) Addition of dataframe and other, element-wise (binary operator radd).
DataFrame.rsub(other[, axis, level, fill_value]) Subtraction of dataframe and other, element-wise (binary operator rsub).
DataFrame.rmul(other[, axis, level, fill_value]) Multiplication of dataframe and other, element-wise (binary operator rmul).
DataFrame.rdiv(other[, axis, level, fill_value]) Floating division of dataframe and other, element-wise (binary operator rtruediv).
DataFrame.rtruediv(other[, axis, level, ...]) Floating division of dataframe and other, element-wise (binary operator rtruediv).
DataFrame.rfloordiv(other[, axis, level, ...]) Integer division of dataframe and other, element-wise (binary operator rfloordiv).
DataFrame.rmod(other[, axis, level, fill_value]) Modulo of dataframe and other, element-wise (binary operator rmod).
DataFrame.rpow(other[, axis, level, fill_value]) Exponential power of dataframe and other, element-wise (binary operator rpow).
DataFrame.lt(other[, axis, level]) Wrapper for flexible comparison methods lt
DataFrame.gt(other[, axis, level]) Wrapper for flexible comparison methods gt
DataFrame.le(other[, axis, level]) Wrapper for flexible comparison methods le
DataFrame.ge(other[, axis, level]) Wrapper for flexible comparison methods ge
DataFrame.ne(other[, axis, level]) Wrapper for flexible comparison methods ne
DataFrame.eq(other[, axis, level]) Wrapper for flexible comparison methods eq
DataFrame.combine(other, func[, fill_value, ...]) Add two DataFrame objects and do not propagate NaN values, so if for a
DataFrame.combine_first(other) Combine two DataFrame objects and default to non-null values in frame calling the method.
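
The method forms of the operators accept extra arguments such as fill_value, which the plain operators do not. A minimal sketch with hypothetical data:

```python
import numpy as np
import pandas as pd

# Hypothetical frames; `a` has one missing value.
a = pd.DataFrame({"x": [1.0, np.nan]})
b = pd.DataFrame({"x": [10.0, 20.0]})

plain = a + b                    # NaN propagates
filled = a.add(b, fill_value=0)  # a missing value on one side is treated as 0
reflected = a.rsub(1)            # computes 1 - a
smaller = a.lt(b)                # element-wise a < b
patched = a.combine_first(b)     # take a, fall back to b where a is NA
```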

Function application, GroupBy & Window

DataFrame.apply(func[, axis, broadcast, ...]) Applies function along input axis of DataFrame.
DataFrame.applymap(func) Apply a function to a DataFrame that is intended to operate elementwise, i.e.
DataFrame.pipe(func, *args, **kwargs) Apply func(self, *args, **kwargs)
DataFrame.agg(func[, axis]) Aggregate using callable, string, dict, or list of string/callables
DataFrame.aggregate(func[, axis]) Aggregate using callable, string, dict, or list of string/callables
DataFrame.transform(func, *args, **kwargs) Call function producing a like-indexed NDFrame
DataFrame.groupby([by, axis, level, ...]) Group series using mapper (dict or key function, apply given function to group, return result as series) or by a series of columns.
DataFrame.rolling(window[, min_periods, ...]) Provides rolling window calculations.
DataFrame.expanding([min_periods, freq, ...]) Provides expanding transformations.
DataFrame.ewm([com, span, halflife, alpha, ...]) Provides exponential weighted functions
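
A short sketch of function application, grouping and a rolling window on a hypothetical frame. The lambda functions are illustrative helpers, not part of pandas.

```python
import pandas as pd

# Hypothetical data with a grouping key.
df = pd.DataFrame({"key": ["a", "b", "a", "b"], "val": [1, 2, 3, 4]})

# apply runs the function once per column (axis=0 by default).
spread = df[["val"]].apply(lambda col: col.max() - col.min())

stats = df[["val"]].agg(["min", "max"])        # several aggregations at once
totals = df.groupby("key")["val"].sum()        # one sum per group
rolled = df[["val"]].rolling(window=2).mean()  # mean over 2-row windows

# pipe passes the frame as the first argument of the function.
first_two = df.pipe(lambda frame, n: frame.head(n), 2)
```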

Computations / Descriptive Stats

DataFrame.abs() Return an object with absolute value taken; only applicable to objects that are all numeric.
DataFrame.all([axis, bool_only, skipna, level]) Return whether all elements are True over requested axis
DataFrame.any([axis, bool_only, skipna, level]) Return whether any element is True over requested axis
DataFrame.clip([lower, upper, axis, inplace]) Trim values at input threshold(s).
DataFrame.clip_lower(threshold[, axis, inplace]) Return copy of the input with values below given value(s) truncated.
DataFrame.clip_upper(threshold[, axis, inplace]) Return copy of input with values above given value(s) truncated.
DataFrame.compound([axis, skipna, level]) Return the compound percentage of the values for the requested axis
DataFrame.corr([method, min_periods]) Compute pairwise correlation of columns, excluding NA/null values
DataFrame.corrwith(other[, axis, drop]) Compute pairwise correlation between rows or columns of two DataFrame objects.
DataFrame.count([axis, level, numeric_only]) Return Series with number of non-NA/null observations over requested axis.
DataFrame.cov([min_periods]) Compute pairwise covariance of columns, excluding NA/null values
DataFrame.cummax([axis, skipna]) Return cumulative max over requested axis.
DataFrame.cummin([axis, skipna]) Return cumulative minimum over requested axis.
DataFrame.cumprod([axis, skipna]) Return cumulative product over requested axis.
DataFrame.cumsum([axis, skipna]) Return cumulative sum over requested axis.
DataFrame.describe([percentiles, include, ...]) Generates descriptive statistics that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values.
DataFrame.diff([periods, axis]) 1st discrete difference of object
DataFrame.eval(expr[, inplace]) Evaluate an expression in the context of the calling DataFrame instance.
DataFrame.kurt([axis, skipna, level, ...]) Return unbiased kurtosis over requested axis using Fisher’s definition of kurtosis (kurtosis of normal == 0.0).
DataFrame.kurtosis([axis, skipna, level, ...]) Return unbiased kurtosis over requested axis using Fisher’s definition of kurtosis (kurtosis of normal == 0.0).
DataFrame.mad([axis, skipna, level]) Return the mean absolute deviation of the values for the requested axis
DataFrame.max([axis, skipna, level, ...]) This method returns the maximum of the values in the object.
DataFrame.mean([axis, skipna, level, ...]) Return the mean of the values for the requested axis
DataFrame.median([axis, skipna, level, ...]) Return the median of the values for the requested axis
DataFrame.min([axis, skipna, level, ...]) This method returns the minimum of the values in the object.
DataFrame.mode([axis, numeric_only]) Gets the mode(s) of each element along the axis selected.
DataFrame.pct_change([periods, fill_method, ...]) Percent change over given number of periods.
DataFrame.prod([axis, skipna, level, ...]) Return the product of the values for the requested axis
DataFrame.product([axis, skipna, level, ...]) Return the product of the values for the requested axis
DataFrame.quantile([q, axis, numeric_only, ...]) Return values at the given quantile over requested axis, a la numpy.percentile.
DataFrame.rank([axis, method, numeric_only, ...]) Compute numerical data ranks (1 through n) along axis.
DataFrame.round([decimals]) Round a DataFrame to a variable number of decimal places.
DataFrame.sem([axis, skipna, level, ddof, ...]) Return unbiased standard error of the mean over requested axis.
DataFrame.skew([axis, skipna, level, ...]) Return unbiased skew over requested axis
DataFrame.sum([axis, skipna, level, ...]) Return the sum of the values for the requested axis
DataFrame.std([axis, skipna, level, ddof, ...]) Return sample standard deviation over requested axis.
DataFrame.var([axis, skipna, level, ddof, ...]) Return unbiased variance over requested axis.
DataFrame.nunique([axis, dropna]) Return Series with number of distinct observations over requested axis.
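
A minimal sketch of a few of the statistics above, computed column-wise on hypothetical data in which column b is exactly twice column a.

```python
import pandas as pd

# Hypothetical data: b == 2 * a, so the two columns correlate perfectly.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [2.0, 4.0, 6.0, 8.0]})

means = df.mean()          # one value per column
corr = df.corr()           # pairwise correlation matrix
running = df.cumsum()      # cumulative sum down each column
median = df.quantile(0.5)  # 50th percentile per column
distinct = df.nunique()    # number of distinct values per column
summary = df.describe()    # count, mean, std, min, quartiles, max
```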

Reindexing / Selection / Label manipulation

DataFrame.add_prefix(prefix) Concatenate prefix string with panel items names.
DataFrame.add_suffix(suffix) Concatenate suffix string with panel items names.
DataFrame.align(other[, join, axis, level, ...]) Align two objects on their axes with the
DataFrame.at_time(time[, asof]) Select values at particular time of day (e.g.
DataFrame.between_time(start_time, end_time) Select values between particular times of the day (e.g., 9:00-9:30 AM).
DataFrame.drop([labels, axis, index, ...]) Return new object with labels in requested axis removed.
DataFrame.drop_duplicates([subset, keep, ...]) Return DataFrame with duplicate rows removed, optionally only
DataFrame.duplicated([subset, keep]) Return boolean Series denoting duplicate rows, optionally only
DataFrame.equals(other) Determines if two NDFrame objects contain the same elements.
DataFrame.filter([items, like, regex, axis]) Subset rows or columns of dataframe according to labels in the specified index.
DataFrame.first(offset) Convenience method for subsetting initial periods of time series data based on a date offset.
DataFrame.head([n]) Return the first n rows.
DataFrame.idxmax([axis, skipna]) Return index of first occurrence of maximum over requested axis.
DataFrame.idxmin([axis, skipna]) Return index of first occurrence of minimum over requested axis.
DataFrame.last(offset) Convenience method for subsetting final periods of time series data based on a date offset.
DataFrame.reindex([labels, index, columns, ...]) Conform DataFrame to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index.
DataFrame.reindex_axis(labels[, axis, ...]) Conform input object to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index.
DataFrame.reindex_like(other[, method, ...]) Return an object with matching indices to myself.
DataFrame.rename([mapper, index, columns, ...]) Alter axes labels.
DataFrame.rename_axis(mapper[, axis, copy, ...]) Alter the name of the index or columns.
DataFrame.reset_index([level, drop, ...]) For DataFrame with multi-level index, return new DataFrame with labeling information in the columns under the index names, defaulting to ‘level_0’, ‘level_1’, etc.
DataFrame.sample([n, frac, replace, ...]) Returns a random sample of items from an axis of object.
DataFrame.select(crit[, axis]) Return data corresponding to axis labels matching criteria
DataFrame.set_axis(labels[, axis, inplace]) Assign desired index to given axis
DataFrame.set_index(keys[, drop, append, ...]) Set the DataFrame index (row labels) using one or more existing columns.
DataFrame.tail([n]) Return the last n rows.
DataFrame.take(indices[, axis, convert, is_copy]) Return the elements in the given positional indices along an axis.
DataFrame.truncate([before, after, axis, copy]) Truncates a sorted DataFrame/Series before and/or after some particular index value.
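
A rough sketch of common label manipulations on a hypothetical frame. Each call returns a new object and leaves the original unchanged.

```python
import pandas as pd

# Hypothetical frame; column "k" will become the index.
df = pd.DataFrame({"k": ["x", "y", "z"], "v": [1, 2, 3]})

indexed = df.set_index("k")                    # use column "k" as row labels
conformed = indexed.reindex(["z", "x", "w"])   # "w" is new, so it gets NaN
renamed = indexed.rename(columns={"v": "value"})
dropped = indexed.drop(["y"])                  # remove the row labelled "y"
restored = indexed.reset_index()               # move the index back to a column
unique_rows = pd.DataFrame({"v": [1, 1, 2]}).drop_duplicates()
```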

Missing data handling

DataFrame.dropna([axis, how, thresh, ...]) Return object with labels on given axis omitted where alternately any
DataFrame.fillna([value, method, axis, ...]) Fill NA/NaN values using the specified method
DataFrame.replace([to_replace, value, ...]) Replace values given in ‘to_replace’ with ‘value’.
DataFrame.interpolate([method, axis, limit, ...]) Interpolate values according to different methods.
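
The four methods above can be compared on a small hypothetical frame with missing values:

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values in both columns.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, np.nan, 6.0]})

complete = df.dropna()         # drop rows that contain any NA
zeroed = df.fillna(0)          # replace NA with a constant
replaced = df.replace(3.0, 30.0)
interpolated = df[["a"]].interpolate()  # linear interpolation by default
```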

Reshaping, sorting, transposing

DataFrame.pivot([index, columns, values]) Reshape data (produce a “pivot” table) based on column values.
DataFrame.pivot_table([values, index, ...]) Create a spreadsheet-style pivot table as a DataFrame.
DataFrame.reorder_levels(order[, axis]) Rearrange index levels using input order.
DataFrame.sort_values(by[, axis, ascending, ...]) Sort by the values along either axis
DataFrame.sort_index([axis, level, ...]) Sort object by labels (along an axis)
DataFrame.nlargest(n, columns[, keep]) Get the rows of a DataFrame sorted by the n largest values of columns.
DataFrame.nsmallest(n, columns[, keep]) Get the rows of a DataFrame sorted by the n smallest values of columns.
DataFrame.swaplevel([i, j, axis]) Swap levels i and j in a MultiIndex on a particular axis
DataFrame.stack([level, dropna]) Pivot a level of the (possibly hierarchical) column labels, returning a DataFrame (or Series in the case of an object with a single level of column labels) having a hierarchical index with a new inner-most level of row labels.
DataFrame.unstack([level, fill_value]) Pivot a level of the (necessarily hierarchical) index labels, returning a DataFrame having a new level of column labels whose inner-most level consists of the pivoted index labels.
DataFrame.swapaxes(axis1, axis2[, copy]) Interchange axes and swap values axes appropriately
DataFrame.melt([id_vars, value_vars, ...]) “Unpivots” a DataFrame from wide format to long format, optionally
DataFrame.squeeze([axis]) Squeeze length 1 dimensions.
DataFrame.to_panel() Transform long (stacked) format (DataFrame) into wide (3D, Panel) format.
DataFrame.to_xarray() Return an xarray object from the pandas object.
DataFrame.T Transpose index and columns
DataFrame.transpose(*args, **kwargs) Transpose index and columns
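
A minimal sketch of moving between long and wide layouts, assuming a hypothetical long-format frame with one value per (date, var) pair.

```python
import pandas as pd

# Hypothetical long-format data.
long = pd.DataFrame({
    "date": ["d1", "d1", "d2", "d2"],
    "var": ["x", "y", "x", "y"],
    "val": [1, 2, 3, 4],
})

wide = long.pivot(index="date", columns="var", values="val")
stacked = wide.stack()       # column labels become an inner index level
unstacked = stacked.unstack()  # and back to columns
melted = wide.reset_index().melt(id_vars="date", var_name="var", value_name="val")

descending = long.sort_values("val", ascending=False)
largest = long.nlargest(1, "val")
flipped = wide.T             # transpose index and columns
```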

Combining / joining / merging

DataFrame.append(other[, ignore_index, ...]) Append rows of other to the end of this frame, returning a new object.
DataFrame.assign(**kwargs) Assign new columns to a DataFrame, returning a new object (a copy) with all the original columns in addition to the new ones.
DataFrame.join(other[, on, how, lsuffix, ...]) Join columns with other DataFrame either on index or on a key column.
DataFrame.merge(right[, how, on, left_on, ...]) Merge DataFrame objects by performing a database-style join operation by columns or indexes.
DataFrame.update(other[, join, overwrite, ...]) Modify DataFrame in place using non-NA values from passed DataFrame.
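
A brief sketch contrasting merge, join, assign and update on hypothetical frames. Note that update modifies the calling frame in place, while the others return new objects.

```python
import numpy as np
import pandas as pd

# Hypothetical frames sharing a "key" column.
left = pd.DataFrame({"key": ["a", "b", "c"], "l": [1, 2, 3]})
right = pd.DataFrame({"key": ["a", "b", "d"], "r": [10, 20, 40]})

inner = left.merge(right, on="key")               # inner join by default
outer = left.merge(right, on="key", how="outer")  # keep keys from both sides
joined = left.set_index("key").join(right.set_index("key"))  # join on index
assigned = left.assign(double=lambda frame: frame["l"] * 2)

target = pd.DataFrame({"v": [1.0, 2.0]})
target.update(pd.DataFrame({"v": [np.nan, 5.0]}))  # only non-NA values are applied
```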

Time series-related

DataFrame.asfreq(freq[, method, how, ...]) Convert TimeSeries to specified frequency.
DataFrame.asof(where[, subset]) The last row without any NaN is taken (or the last row without
DataFrame.shift([periods, freq, axis]) Shift index by desired number of periods with an optional time freq
DataFrame.slice_shift([periods, axis]) Equivalent to shift without copying data.
DataFrame.tshift([periods, freq, axis]) Shift the time index, using the index’s frequency if available.
DataFrame.first_valid_index() Return index for first non-NA/null value.
DataFrame.last_valid_index() Return index for last non-NA/null value.
DataFrame.resample(rule[, how, axis, ...]) Convenience method for frequency conversion and resampling of time series.
DataFrame.to_period([freq, axis, copy]) Convert DataFrame from DatetimeIndex to PeriodIndex with desired
DataFrame.to_timestamp([freq, how, axis, copy]) Cast to DatetimeIndex of timestamps, at beginning of period
DataFrame.tz_convert(tz[, axis, level, copy]) Convert tz-aware axis to target time zone.
DataFrame.tz_localize(tz[, axis, level, ...]) Localize tz-naive TimeSeries to target time zone.
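
A small sketch of several time series methods on a hypothetical daily frame. The dates and values are illustrative only.

```python
import pandas as pd

# Hypothetical daily data on a tz-naive DatetimeIndex.
idx = pd.date_range("2018-01-01", periods=4, freq="D")
df = pd.DataFrame({"v": [1.0, 2.0, 3.0, 4.0]}, index=idx)

shifted = df.shift(1)              # values move down one row; index unchanged
two_day = df.resample("2D").sum()  # aggregate into 2-day bins
localized = df.tz_localize("UTC")  # attach a time zone to the naive index

first_valid = shifted.first_valid_index()  # first row label with a non-NA value
last_valid = shifted.last_valid_index()    # last row label with a non-NA value
```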

Plotting

DataFrame.plot is both a callable method and a namespace attribute for specific plotting methods of the form DataFrame.plot.<kind>.

DataFrame.plot([x, y, kind, ax, ....]) DataFrame plotting accessor and method
DataFrame.plot.area([x, y]) Area plot
DataFrame.plot.bar([x, y]) Vertical bar plot
DataFrame.plot.barh([x, y]) Horizontal bar plot
DataFrame.plot.box([by]) Boxplot
DataFrame.plot.density(**kwds) Kernel Density Estimate plot
DataFrame.plot.hexbin(x, y[, C, ...]) Hexbin plot
DataFrame.plot.hist([by, bins]) Histogram
DataFrame.plot.kde(**kwds) Kernel Density Estimate plot
DataFrame.plot.line([x, y]) Line plot
DataFrame.plot.pie([y]) Pie chart
DataFrame.plot.scatter(x, y[, s, c]) Scatter plot
DataFrame.boxplot([column, by, ax, ...]) Make a box plot from DataFrame column optionally grouped by some columns or
DataFrame.hist(data[, column, by, grid, ...]) Draw histogram of the DataFrame’s series using matplotlib / pylab.

Serialization / IO / Conversion

DataFrame.from_csv(path[, header, sep, ...]) Read CSV file (DEPRECATED, please use pandas.read_csv() instead).
DataFrame.from_dict(data[, orient, dtype]) Construct DataFrame from dict of array-like or dicts
DataFrame.from_items(items[, columns, orient]) Convert (key, value) pairs to DataFrame.
DataFrame.from_records(data[, index, ...]) Convert structured or record ndarray to DataFrame
DataFrame.info([verbose, buf, max_cols, ...]) Concise summary of a DataFrame.
DataFrame.to_parquet(fname[, engine, ...]) Write a DataFrame to the binary parquet format.
DataFrame.to_pickle(path[, compression, ...]) Pickle (serialize) object to input file path.
DataFrame.to_csv([path_or_buf, sep, na_rep, ...]) Write DataFrame to a comma-separated values (csv) file
DataFrame.to_hdf(path_or_buf, key, **kwargs) Write the contained data to an HDF5 file using HDFStore.
DataFrame.to_sql(name, con[, flavor, ...]) Write records stored in a DataFrame to a SQL database.
DataFrame.to_dict([orient, into]) Convert DataFrame to dictionary.
DataFrame.to_excel(excel_writer[, ...]) Write DataFrame to an excel sheet
DataFrame.to_json([path_or_buf, orient, ...]) Convert the object to a JSON string.
DataFrame.to_html([buf, columns, col_space, ...]) Render a DataFrame as an HTML table.
DataFrame.to_feather(fname) write out the binary feather-format for DataFrames
DataFrame.to_latex([buf, columns, ...]) Render an object to a tabular environment table.
DataFrame.to_stata(fname[, convert_dates, ...]) Write the DataFrame to a Stata binary dta file.
DataFrame.to_msgpack([path_or_buf, encoding]) msgpack (serialize) object to input file path
DataFrame.to_gbq(destination_table, project_id) Write a DataFrame to a Google BigQuery table.
DataFrame.to_records([index, convert_datetime64]) Convert DataFrame to record array.
DataFrame.to_sparse([fill_value, kind]) Convert to SparseDataFrame
DataFrame.to_dense() Return dense representation of NDFrame (as opposed to sparse)
DataFrame.to_string([buf, columns, ...]) Render a DataFrame to a console-friendly tabular output.
DataFrame.to_clipboard([excel, sep]) Attempt to write text representation of object to the system clipboard. This can be pasted into Excel, for example.
DataFrame.style Property returning a Styler object containing methods for building a styled HTML representation of the DataFrame.
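
A minimal sketch of a few in-memory conversions, using a hypothetical frame. Writers that need optional dependencies or external services (Excel, HDF5, SQL, parquet and so on) are omitted.

```python
import io
import json

import pandas as pd

# Hypothetical two-column frame.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

csv_text = df.to_csv(index=False)                 # returns a string when no path is given
round_trip = pd.read_csv(io.StringIO(csv_text))   # read the text back

records = df.to_dict(orient="records")            # list of {column -> value} dicts
json_text = df.to_json(orient="records")
decoded = json.loads(json_text)

rebuilt = pd.DataFrame.from_dict({"a": [1, 2], "b": ["x", "y"]})
```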

Sparse

SparseDataFrame.to_coo() Return the contents of the frame as a sparse SciPy COO matrix.

Panel

Constructor

Panel([data, items, major_axis, minor_axis, ...]) Represents wide format panel data, stored as 3-dimensional array

Attributes and underlying data

Axes

  • items: axis 0; each item corresponds to a DataFrame contained inside
  • major_axis: axis 1; the index (rows) of each of the DataFrames
  • minor_axis: axis 2; the columns of each of the DataFrames

Panel.values Numpy representation of NDFrame
Panel.axes Return index label(s) of the internal NDFrame
Panel.ndim Number of axes / array dimensions
Panel.size number of elements in the NDFrame
Panel.shape Return a tuple of axis dimensions
Panel.dtypes Return the dtypes in this object.
Panel.ftypes Return the ftypes (indication of sparse/dense and dtype) in this object.
Panel.get_dtype_counts() Return the counts of dtypes in this object.
Panel.get_ftype_counts() Return the counts of ftypes in this object.

Conversion

Panel.astype(dtype[, copy, errors]) Cast a pandas object to a specified dtype.
Panel.copy([deep]) Make a copy of this object's data.
Panel.isna() Return a boolean same-sized object indicating if the values are NA.
Panel.notna() Return a boolean same-sized object indicating if the values are not NA.

Getting and setting

Panel.get_value(*args, **kwargs) Quickly retrieve single value at (item, major, minor) location
Panel.set_value(*args, **kwargs) Quickly set single value at (item, major, minor) location

Indexing, iteration, slicing

Panel.at Fast label-based scalar accessor
Panel.iat Fast integer location scalar accessor.
Panel.loc Purely label-location based indexer for selection by label.
Panel.iloc Purely integer-location based indexing for selection by position.
Panel.__iter__() Iterate over info axis
Panel.iteritems() Iterate over (label, values) on info axis
Panel.pop(item) Return item and drop from frame.
Panel.xs(key[, axis]) Return slice of panel along selected axis
Panel.major_xs(key) Return slice of panel along major axis
Panel.minor_xs(key) Return slice of panel along minor axis

For more information on .at, .iat, .loc, and .iloc, see the indexing documentation.

Binary operator functions

Panel.add(other[, axis]) Addition of series and other, element-wise (binary operator add).
Panel.sub(other[, axis]) Subtraction of series and other, element-wise (binary operator sub).
Panel.mul(other[, axis]) Multiplication of series and other, element-wise (binary operator mul).
Panel.div(other[, axis]) Floating division of series and other, element-wise (binary operator truediv).
Panel.truediv(other[, axis]) Floating division of series and other, element-wise (binary operator truediv).
Panel.floordiv(other[, axis]) Integer division of series and other, element-wise (binary operator floordiv).
Panel.mod(other[, axis]) Modulo of series and other, element-wise (binary operator mod).
Panel.pow(other[, axis]) Exponential power of series and other, element-wise (binary operator pow).
Panel.radd(other[, axis]) Addition of series and other, element-wise (binary operator radd).
Panel.rsub(other[, axis]) Subtraction of series and other, element-wise (binary operator rsub).
Panel.rmul(other[, axis]) Multiplication of series and other, element-wise (binary operator rmul).
Panel.rdiv(other[, axis]) Floating division of series and other, element-wise (binary operator rtruediv).
Panel.rtruediv(other[, axis]) Floating division of series and other, element-wise (binary operator rtruediv).
Panel.rfloordiv(other[, axis]) Integer division of series and other, element-wise (binary operator rfloordiv).
Panel.rmod(other[, axis]) Modulo of series and other, element-wise (binary operator rmod).
Panel.rpow(other[, axis]) Exponential power of series and other, element-wise (binary operator rpow).
Panel.lt(other[, axis]) Wrapper for comparison method lt
Panel.gt(other[, axis]) Wrapper for comparison method gt
Panel.le(other[, axis]) Wrapper for comparison method le
Panel.ge(other[, axis]) Wrapper for comparison method ge
Panel.ne(other[, axis]) Wrapper for comparison method ne
Panel.eq(other[, axis]) Wrapper for comparison method eq

Function application, GroupBy

Panel.apply(func[, axis]) Applies function along axis (or axes) of the Panel
Panel.groupby(function[, axis]) Group data on given axis, returning GroupBy object

Computations / Descriptive Stats

Panel.abs() Return an object with absolute value taken; only applicable to objects that are all numeric.
Panel.clip([lower, upper, axis, inplace]) Trim values at input threshold(s).
Panel.clip_lower(threshold[, axis, inplace]) Return copy of the input with values below given value(s) truncated.
Panel.clip_upper(threshold[, axis, inplace]) Return copy of input with values above given value(s) truncated.
Panel.count([axis]) Return number of observations over requested axis.
Panel.cummax([axis, skipna]) Return cumulative max over requested axis.
Panel.cummin([axis, skipna]) Return cumulative minimum over requested axis.
Panel.cumprod([axis, skipna]) Return cumulative product over requested axis.
Panel.cumsum([axis, skipna]) Return cumulative sum over requested axis.
Panel.max([axis, skipna, level, numeric_only]) This method returns the maximum of the values in the object.
Panel.mean([axis, skipna, level, numeric_only]) Return the mean of the values for the requested axis
Panel.median([axis, skipna, level, numeric_only]) Return the median of the values for the requested axis
Panel.min([axis, skipna, level, numeric_only]) This method returns the minimum of the values in the object.
Panel.pct_change([periods, fill_method, ...]) Percent change over given number of periods.
Panel.prod([axis, skipna, level, numeric_only]) Return the product of the values for the requested axis
Panel.sem([axis, skipna, level, ddof, ...]) Return unbiased standard error of the mean over requested axis.
Panel.skew([axis, skipna, level, numeric_only]) Return unbiased skew over requested axis
Panel.sum([axis, skipna, level, numeric_only]) Return the sum of the values for the requested axis
Panel.std([axis, skipna, level, ddof, ...]) Return sample standard deviation over requested axis.
Panel.var([axis, skipna, level, ddof, ...]) Return unbiased variance over requested axis.

Reindexing / Selection / Label manipulation

Panel.add_prefix(prefix) Concatenate prefix string with panel items names.
Panel.add_suffix(suffix) Concatenate suffix string with panel items names.
Panel.drop([labels, axis, index, columns, ...]) Return new object with labels in requested axis removed.
Panel.equals(other) Determines if two NDFrame objects contain the same elements.
Panel.filter([items, like, regex, axis]) Subset rows or columns of dataframe according to labels in the specified index.
Panel.first(offset) Convenience method for subsetting initial periods of time series data based on a date offset.
Panel.last(offset) Convenience method for subsetting final periods of time series data based on a date offset.
Panel.reindex(*args, **kwargs) Conform Panel to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index.
Panel.reindex_axis(labels[, axis, method, ...]) Conform input object to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index.
Panel.reindex_like(other[, method, copy, ...]) Return an object with matching indices to myself.
Panel.rename([items, major_axis, minor_axis]) Alter axes input function or functions.
Panel.sample([n, frac, replace, weights, ...]) Returns a random sample of items from an axis of object.
Panel.select(crit[, axis]) Return data corresponding to axis labels matching criteria
Panel.take(indices[, axis, convert, is_copy]) Return the elements in the given positional indices along an axis.
Panel.truncate([before, after, axis, copy]) Truncates a sorted DataFrame/Series before and/or after some particular index value.

Missing data handling

Panel.dropna([axis, how, inplace]) Drop 2D from panel, holding passed axis constant

Reshaping, sorting, transposing

Panel.sort_index([axis, level, ascending, ...]) Sort object by labels (along an axis)
Panel.swaplevel([i, j, axis]) Swap levels i and j in a MultiIndex on a particular axis
Panel.transpose(*args, **kwargs) Permute the dimensions of the Panel
Panel.swapaxes(axis1, axis2[, copy]) Interchange axes and swap values axes appropriately
Panel.conform(frame[, axis]) Conform input DataFrame to align with chosen axis pair.

Combining / joining / merging

Panel.join(other[, how, lsuffix, rsuffix]) Join items with other Panel either on major and minor axes column
Panel.update(other[, join, overwrite, ...]) Modify Panel in place using non-NA values from passed Panel, or object coercible to Panel.

Time series-related

Panel.asfreq(freq[, method, how, normalize, ...]) Convert TimeSeries to specified frequency.
Panel.shift([periods, freq, axis]) Shift index by desired number of periods with an optional time freq.
Panel.resample(rule[, how, axis, ...]) Convenience method for frequency conversion and resampling of time series.
Panel.tz_convert(tz[, axis, level, copy]) Convert tz-aware axis to target time zone.
Panel.tz_localize(tz[, axis, level, copy, ...]) Localize tz-naive TimeSeries to target time zone.

Serialization / IO / Conversion

Panel.from_dict(data[, intersect, orient, dtype]) Construct Panel from dict of DataFrame objects
Panel.to_pickle(path[, compression, protocol]) Pickle (serialize) object to input file path.
Panel.to_excel(path[, na_rep, engine]) Write each DataFrame in Panel to a separate excel sheet
Panel.to_hdf(path_or_buf, key, **kwargs) Write the contained data to an HDF5 file using HDFStore.
Panel.to_sparse(*args, **kwargs) NOT IMPLEMENTED: do not call this method, as sparsifying is not supported for Panel objects and will raise an error.
Panel.to_frame([filter_observations]) Transform wide format into long (stacked) format as DataFrame whose columns are the Panel’s items and whose index is a MultiIndex formed of the Panel’s major and minor axes.
Panel.to_clipboard([excel, sep]) Attempt to write text representation of object to the system clipboard. This can be pasted into Excel, for example.

Index

Many of these methods or variants thereof are available on the objects that contain an index (Series/DataFrame) and those should most likely be used before calling these methods directly.

Index Immutable ndarray implementing an ordered, sliceable set.

Attributes

Index.values return the underlying data as an ndarray
Index.is_monotonic alias for is_monotonic_increasing (deprecated)
Index.is_monotonic_increasing return if the index is monotonic increasing (only equal or
Index.is_monotonic_decreasing return if the index is monotonic decreasing (only equal or
Index.is_unique
Index.has_duplicates
Index.hasnans
Index.dtype
Index.dtype_str
Index.inferred_type
Index.is_all_dates
Index.shape return a tuple of the shape of the underlying data
Index.name
Index.names
Index.nbytes return the number of bytes in the underlying data
Index.ndim return the number of dimensions of the underlying data,
Index.size return the number of elements in the underlying data
Index.empty
Index.strides return the strides of the underlying data
Index.itemsize return the size of the dtype of the item of the underlying data
Index.base return the base object if the memory of the underlying data is
Index.T return the transpose, which is by definition self
Index.memory_usage([deep]) Memory usage of my values

Modifying and Computations

Index.all(*args, **kwargs) Return whether all elements are True
Index.any(*args, **kwargs) Return whether any element is True
Index.argmin([axis]) return a ndarray of the minimum argument indexer
Index.argmax([axis]) return a ndarray of the maximum argument indexer
Index.copy([name, deep, dtype]) Make a copy of this object.
Index.delete(loc) Make new Index with passed location(-s) deleted
Index.drop(labels[, errors]) Make new Index with passed list of labels deleted
Index.drop_duplicates([keep]) Return Index with duplicate values removed
Index.duplicated([keep]) Return boolean np.ndarray denoting duplicate values
Index.equals(other) Determines if two Index objects contain the same elements.
Index.factorize([sort, na_sentinel]) Encode the object as an enumerated type or categorical variable
Index.identical(other) Similar to equals, but check that other comparable attributes are
Index.insert(loc, item) Make new Index inserting new item at location.
Index.is_(other) More flexible, faster check like is but that works through views
Index.is_boolean()
Index.is_categorical()
Index.is_floating()
Index.is_integer()
Index.is_interval()
Index.is_lexsorted_for_tuple(tup)
Index.is_mixed()
Index.is_numeric()
Index.is_object()
Index.min() The minimum value of the object
Index.max() The maximum value of the object
Index.reindex(target[, method, level, ...]) Create index with target’s values (move/add/delete values as necessary)
Index.rename(name[, inplace]) Set new names on index.
Index.repeat(repeats, *args, **kwargs) Repeat elements of an Index.
Index.where(cond[, other]) New in version 0.19.0.

Index.take(indices[, axis, allow_fill, ...]) return a new Index of the values selected by the indices
Index.putmask(mask, value) return a new Index of the values set with the mask
Index.set_names(names[, level, inplace]) Set new names on index.
Index.unique() Return unique values in the object.
Index.nunique([dropna]) Return number of unique elements in the object.
Index.value_counts([normalize, sort, ...]) Returns object containing counts of unique values.

Missing Values

Index.fillna([value, downcast]) Fill NA/NaN values with the specified value
Index.dropna([how]) Return Index without NA/NaN values
Index.isna() Detect missing values
Index.notna() Inverse of isna

Conversion

Index.astype(dtype[, copy]) Create an Index with values cast to dtypes.
Index.item() return the first element of the underlying data as a python
Index.map(mapper) Apply mapper function to an index.
Index.ravel([order]) return an ndarray of the flattened values of the underlying data
Index.tolist() Return a list of the values.
Index.to_datetime([dayfirst]) DEPRECATED: use pandas.to_datetime() instead.
Index.to_native_types([slicer]) Format specified values of self and return them.
Index.to_series(**kwargs) Create a Series with both index and values equal to the index keys
Index.to_frame([index]) Create a DataFrame with a column containing the Index.
Index.view([cls])

Sorting

Index.argsort(*args, **kwargs) Returns the indices that would sort the index and its underlying data.
Index.searchsorted(value[, side, sorter]) Find indices where elements should be inserted to maintain order.
Index.sort_values([return_indexer, ascending]) Return sorted copy of Index

Time-specific operations

Index.shift([periods, freq]) Shift Index containing datetime objects by input number of periods and

Combining / joining / set operations

Index.append(other) Append a collection of Index objects together
Index.join(other[, how, level, ...]) this is an internal non-public method
Index.intersection(other) Form the intersection of two Index objects.
Index.union(other) Form the union of two Index objects and sorts if possible.
Index.difference(other) Return a new Index with elements from the index that are not in other.
Index.symmetric_difference(other[, result_name]) Compute the symmetric difference of two Index objects.
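
As a minimal sketch of the set operations listed above (the variable names and sample values are illustrative assumptions, not part of the reference), two small integer indexes can be combined like this:

```python
import pandas as pd

# Two small, overlapping indexes (illustrative values only).
left = pd.Index([1, 2, 3, 4])
right = pd.Index([3, 4, 5, 6])

# Labels present in both indexes.
common = left.intersection(right)

# Labels present in either index.
combined = left.union(right)

# Labels in `left` that are not in `right`.
only_left = left.difference(right)

# Labels present in exactly one of the two indexes.
either = left.symmetric_difference(right)

print(list(common), list(combined), list(only_left), list(either))
```

Each call returns a new Index; neither input is modified.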

Selecting

Index.asof(label) For a sorted index, return the most recent label up to and including the passed label.
Index.asof_locs(where, mask) where : array of timestamps
Index.contains(key) return a boolean if this key is IN the index
Index.get_duplicates()
Index.get_indexer(target[, method, limit, ...]) Compute indexer and mask for new index given the current index.
Index.get_indexer_for(target, **kwargs) guaranteed return of an indexer even when non-unique
Index.get_indexer_non_unique(target) Compute indexer and mask for new index given the current index.
Index.get_level_values(level) Return an Index of values for requested level, equal to the length of the index.
Index.get_loc(key[, method, tolerance]) Get integer location, slice or boolean mask for requested label.
Index.get_slice_bound(label, side, kind) Calculate slice bound that corresponds to given label.
Index.get_value(series, key) Fast lookup of value from 1-dimensional ndarray.
Index.get_values() return the underlying data as an ndarray
Index.set_value(arr, key, value) Fast assignment of value in a 1-dimensional ndarray.
Index.isin(values[, level]) Compute boolean array of whether each index value is found in the passed set of values.
Index.slice_indexer([start, end, step, kind]) For an ordered or unique index, compute the slice indexer for input labels and step.
Index.slice_locs([start, end, step, kind]) Compute slice locations for input labels.
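
The selection methods above translate labels into integer positions. The following is a small sketch on a sorted, unique string index; the labels are made up for illustration:

```python
import pandas as pd

idx = pd.Index(["a", "b", "c", "d"])

# Integer position of a single label.
pos = idx.get_loc("c")

# Positions for several labels at once; labels that are missing map to -1.
indexer = idx.get_indexer(["b", "z", "d"])

# Boolean mask marking which index values appear in the given collection.
mask = idx.isin(["a", "d"])

# Start and stop positions for a label-based slice (the end label is included).
start, stop = idx.slice_locs("b", "c")

print(pos, indexer, mask, (start, stop))
```

Here `idx[start:stop]` selects the labels "b" and "c".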

Numeric Index

RangeIndex Immutable Index implementing a monotonic integer range.
Int64Index Immutable ndarray implementing an ordered, sliceable set.
UInt64Index Immutable ndarray implementing an ordered, sliceable set.
Float64Index Immutable ndarray implementing an ordered, sliceable set.
RangeIndex.from_range(data[, name, dtype]) create RangeIndex from a range (py3), or xrange (py2) object

CategoricalIndex

CategoricalIndex Immutable Index implementing an ordered, sliceable set.

Categorical Components

CategoricalIndex.codes
CategoricalIndex.categories
CategoricalIndex.ordered
CategoricalIndex.rename_categories(*args, ...) Renames categories.
CategoricalIndex.reorder_categories(*args, ...) Reorders categories as specified in new_categories.
CategoricalIndex.add_categories(*args, **kwargs) Add new categories.
CategoricalIndex.remove_categories(*args, ...) Removes the specified categories.
CategoricalIndex.remove_unused_categories(...) Removes categories which are not used.
CategoricalIndex.set_categories(*args, **kwargs) Sets the categories to the specified new_categories.
CategoricalIndex.as_ordered(*args, **kwargs) Sets the Categorical to be ordered
CategoricalIndex.as_unordered(*args, **kwargs) Sets the Categorical to be unordered
CategoricalIndex.map(mapper) Apply mapper function to its categories (not codes).

IntervalIndex

IntervalIndex Immutable Index implementing an ordered, sliceable set.

IntervalIndex Components

IntervalIndex.from_arrays(left, right[, ...]) Construct an IntervalIndex from a left and right array
IntervalIndex.from_tuples(data[, closed, ...]) Construct an IntervalIndex from a list/array of tuples
IntervalIndex.from_breaks(breaks[, closed, ...]) Construct an IntervalIndex from an array of splits
IntervalIndex.from_intervals(data[, name, copy]) Construct an IntervalIndex from a 1d array of Interval objects
IntervalIndex.contains(key) return a boolean if this key is IN the index
IntervalIndex.left
IntervalIndex.right
IntervalIndex.mid
IntervalIndex.closed
IntervalIndex.values
IntervalIndex.is_non_overlapping_monotonic
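
A brief sketch of three of the constructors above, assuming integer break points chosen purely for illustration. All three calls are expected to describe the same right-closed intervals:

```python
import pandas as pd

# Three adjacent intervals built from their break points: (0, 1], (1, 2], (2, 3].
from_breaks = pd.IntervalIndex.from_breaks([0, 1, 2, 3])

# The same intervals from separate left and right bound arrays.
from_arrays = pd.IntervalIndex.from_arrays([0, 1, 2], [1, 2, 3])

# The same intervals from (left, right) tuples.
from_tuples = pd.IntervalIndex.from_tuples([(0, 1), (1, 2), (2, 3)])

print(from_breaks.left.tolist())   # left bounds
print(from_breaks.right.tolist())  # right bounds
print(from_breaks.mid.tolist())    # midpoints
print(from_breaks.closed)          # which side is closed
```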

MultiIndex

MultiIndex A multi-level, or hierarchical, index object for pandas objects
IndexSlice Create an object to more easily perform multi-index slicing

MultiIndex Constructors

MultiIndex.from_arrays(arrays[, sortorder, ...]) Convert arrays to MultiIndex
MultiIndex.from_tuples(tuples[, sortorder, ...]) Convert list of tuples to MultiIndex
MultiIndex.from_product(iterables[, ...]) Make a MultiIndex from the cartesian product of multiple iterables
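
The three constructors above can build the same index from differently shaped inputs. A minimal sketch, with level names and values invented for the example:

```python
import pandas as pd

names = ["letter", "number"]

# One array per level, all of the same length.
mi_arrays = pd.MultiIndex.from_arrays(
    [["a", "a", "b", "b"], [1, 2, 1, 2]], names=names
)

# One tuple per row of the index.
mi_tuples = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1), ("b", 2)], names=names
)

# Cartesian product of the unique values of each level.
mi_product = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=names)

print(mi_product.nlevels, mi_product.levshape)
```

`from_product` is usually the shortest spelling when every combination of level values is wanted.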

MultiIndex Attributes

MultiIndex.names Names of levels in MultiIndex
MultiIndex.levels
MultiIndex.labels
MultiIndex.nlevels Integer number of levels in this MultiIndex.
MultiIndex.levshape A tuple with the length of each level.

MultiIndex Components

MultiIndex.set_levels(levels[, level, ...]) Set new levels on MultiIndex.
MultiIndex.set_labels(labels[, level, ...]) Set new labels on MultiIndex.
MultiIndex.to_hierarchical(n_repeat[, n_shuffle]) Return a MultiIndex reshaped to conform to the shapes given by n_repeat and n_shuffle.
MultiIndex.to_frame([index]) Create a DataFrame with the levels of the MultiIndex as columns.
MultiIndex.is_lexsorted() Return True if the labels are lexicographically sorted
MultiIndex.sortlevel([level, ascending, ...]) Sort MultiIndex at the requested level.
MultiIndex.droplevel([level]) Return Index with requested level removed.
MultiIndex.swaplevel([i, j]) Swap level i with level j.
MultiIndex.reorder_levels(order) Rearrange levels using input order.
MultiIndex.remove_unused_levels() create a new MultiIndex from the current that removing

DatetimeIndex

DatetimeIndex Immutable ndarray of datetime64 data, represented internally as int64, and which can be boxed to Timestamp objects that are subclasses of datetime and carry metadata such as frequency information.

Time/Date Components

DatetimeIndex.year The year of the datetime
DatetimeIndex.month The month as January=1, December=12
DatetimeIndex.day The days of the datetime
DatetimeIndex.hour The hours of the datetime
DatetimeIndex.minute The minutes of the datetime
DatetimeIndex.second The seconds of the datetime
DatetimeIndex.microsecond The microseconds of the datetime
DatetimeIndex.nanosecond The nanoseconds of the datetime
DatetimeIndex.date Returns numpy array of python datetime.date objects (namely, the date part of Timestamps without timezone information).
DatetimeIndex.time Returns numpy array of datetime.time.
DatetimeIndex.dayofyear The ordinal day of the year
DatetimeIndex.weekofyear The week ordinal of the year
DatetimeIndex.week The week ordinal of the year
DatetimeIndex.dayofweek The day of the week with Monday=0, Sunday=6
DatetimeIndex.weekday The day of the week with Monday=0, Sunday=6
DatetimeIndex.weekday_name The name of day in a week (ex: Friday)
DatetimeIndex.quarter The quarter of the date
DatetimeIndex.tz
DatetimeIndex.freq get/set the frequency of the Index
DatetimeIndex.freqstr Return the frequency object as a string if it is set, otherwise None
DatetimeIndex.is_month_start Logical indicating if first day of month (defined by frequency)
DatetimeIndex.is_month_end Logical indicating if last day of month (defined by frequency)
DatetimeIndex.is_quarter_start Logical indicating if first day of quarter (defined by frequency)
DatetimeIndex.is_quarter_end Logical indicating if last day of quarter (defined by frequency)
DatetimeIndex.is_year_start Logical indicating if first day of year (defined by frequency)
DatetimeIndex.is_year_end Logical indicating if last day of year (defined by frequency)
DatetimeIndex.is_leap_year Logical indicating if the date belongs to a leap year
DatetimeIndex.inferred_freq

Selecting

DatetimeIndex.indexer_at_time(time[, asof]) Select values at particular time of day (e.g.
DatetimeIndex.indexer_between_time(...[, ...]) Select values between particular times of day (e.g., 9:00-9:30AM).

Time-specific operations

DatetimeIndex.normalize() Return DatetimeIndex with times set to midnight.
DatetimeIndex.strftime(date_format) Return an array of formatted strings specified by date_format, which supports the same string format as the python standard library.
DatetimeIndex.snap([freq]) Snap time stamps to nearest occurring frequency
DatetimeIndex.tz_convert(tz) Convert tz-aware DatetimeIndex from one time zone to another (using
DatetimeIndex.tz_localize(tz[, ambiguous, ...]) Localize tz-naive DatetimeIndex to given time zone (using
DatetimeIndex.round(freq, *args, **kwargs) round the index to the specified freq
DatetimeIndex.floor(freq) floor the index to the specified freq
DatetimeIndex.ceil(freq) ceil the index to the specified freq

Conversion

DatetimeIndex.to_datetime([dayfirst])
DatetimeIndex.to_period([freq]) Cast to PeriodIndex at a particular frequency
DatetimeIndex.to_perioddelta(freq) Calculates TimedeltaIndex of difference between index values and index converted to PeriodIndex at specified freq.
DatetimeIndex.to_pydatetime() Return DatetimeIndex as object ndarray of datetime.datetime objects
DatetimeIndex.to_series([keep_tz]) Create a Series with both index and values equal to the index keys
DatetimeIndex.to_frame([index]) Create a DataFrame with a column containing the Index.

TimedeltaIndex

TimedeltaIndex Immutable ndarray of timedelta64 data, represented internally as int64, and

Components

TimedeltaIndex.days Number of days for each element.
TimedeltaIndex.seconds Number of seconds (>= 0 and less than 1 day) for each element.
TimedeltaIndex.microseconds Number of microseconds (>= 0 and less than 1 second) for each element.
TimedeltaIndex.nanoseconds Number of nanoseconds (>= 0 and less than 1 microsecond) for each element.
TimedeltaIndex.components Return a dataframe of the components (days, hours, minutes, seconds, milliseconds, microseconds, nanoseconds) of the Timedeltas.
TimedeltaIndex.inferred_freq

Conversion

TimedeltaIndex.to_pytimedelta() Return TimedeltaIndex as object ndarray of datetime.timedelta objects
TimedeltaIndex.to_series(**kwargs) Create a Series with both index and values equal to the index keys
TimedeltaIndex.round(freq, *args, **kwargs) round the index to the specified freq
TimedeltaIndex.floor(freq) floor the index to the specified freq
TimedeltaIndex.ceil(freq) ceil the index to the specified freq
TimedeltaIndex.to_frame([index]) Create a DataFrame with a column containing the Index.

PeriodIndex

PeriodIndex Immutable ndarray holding ordinal values indicating regular periods in time such as particular years, quarters, months, etc.

Attributes

PeriodIndex.day The days of the period
PeriodIndex.dayofweek The day of the week with Monday=0, Sunday=6
PeriodIndex.dayofyear The ordinal day of the year
PeriodIndex.days_in_month The number of days in the month
PeriodIndex.daysinmonth The number of days in the month
PeriodIndex.end_time
PeriodIndex.freq
PeriodIndex.freqstr Return the frequency object as a string if it is set, otherwise None
PeriodIndex.hour The hour of the period
PeriodIndex.is_leap_year Logical indicating if the date belongs to a leap year
PeriodIndex.minute The minute of the period
PeriodIndex.month The month as January=1, December=12
PeriodIndex.quarter The quarter of the date
PeriodIndex.qyear
PeriodIndex.second The second of the period
PeriodIndex.start_time
PeriodIndex.week The week ordinal of the year
PeriodIndex.weekday The day of the week with Monday=0, Sunday=6
PeriodIndex.weekofyear The week ordinal of the year
PeriodIndex.year The year of the period

Methods

PeriodIndex.asfreq([freq, how]) Convert the PeriodIndex to the specified frequency freq.
PeriodIndex.strftime(date_format) Return an array of formatted strings specified by date_format, which supports the same string format as the python standard library.
PeriodIndex.to_timestamp([freq, how]) Cast to DatetimeIndex
PeriodIndex.tz_convert(tz) Convert tz-aware DatetimeIndex from one time zone to another (using
PeriodIndex.tz_localize(tz[, infer_dst]) Localize tz-naive DatetimeIndex to given time zone (using

Scalars

Period

Period Represents a period of time

Methods

Period.asfreq Convert Period to desired frequency, either at the start or end of the
Period.now
Period.strftime Returns the string representation of the Period, depending on the selected fmt.
Period.to_timestamp Return the Timestamp representation of the Period at the target
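
As an illustrative sketch of the Period methods above (the month used is an arbitrary assumption), a monthly period can be converted to daily periods and to a Timestamp:

```python
import pandas as pd

# A single monthly period.
month = pd.Period("2018-03", freq="M")

# Convert to daily frequency, anchored at the start or end of the month.
first_day = month.asfreq("D", how="start")
last_day = month.asfreq("D", how="end")

# Timestamp at the start of the period (the default anchor).
start_ts = month.to_timestamp()

# String representation with an explicit format.
label = month.strftime("%Y/%m")

print(first_day, last_day, start_ts, label)
```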

Timestamp

Timestamp Pandas replacement for datetime.datetime

Methods

Timestamp.astimezone Convert tz-aware Timestamp to another time zone.
Timestamp.ceil return a new Timestamp ceiled to this resolution
Timestamp.combine(date, time) date, time -> datetime with same date and time fields
Timestamp.ctime Return ctime() style string.
Timestamp.date Return date object with same year, month and day.
Timestamp.dst Return self.tzinfo.dst(self).
Timestamp.floor return a new Timestamp floored to this resolution
Timestamp.freq
Timestamp.freqstr
Timestamp.fromordinal(ordinal[, freq, tz, ...]) passed an ordinal, translate and convert to a ts
Timestamp.fromtimestamp(ts) timestamp[, tz] -> tz’s local time from POSIX timestamp.
Timestamp.isocalendar Return a 3-tuple containing ISO year, week number, and weekday.
Timestamp.isoformat
Timestamp.isoweekday Return the day of the week represented by the date.
Timestamp.normalize Normalize Timestamp to midnight, preserving tz information.
Timestamp.now([tz]) Returns new Timestamp object representing current time local to tz.
Timestamp.replace implements datetime.replace, handles nanoseconds
Timestamp.round Round the Timestamp to the specified resolution
Timestamp.strftime format -> strftime() style string.
Timestamp.strptime string, format -> new datetime parsed from a string (like time.strptime()).
Timestamp.time Return time object with same time but with tzinfo=None.
Timestamp.timestamp Return POSIX timestamp as float.
Timestamp.timetuple Return time tuple, compatible with time.localtime().
Timestamp.timetz Return time object with same time and tzinfo.
Timestamp.to_datetime64 Returns a numpy.datetime64 object with ‘ns’ precision
Timestamp.to_julian_date Convert Timestamp to a Julian Date.
Timestamp.to_period Return a period of which this timestamp is an observation.
Timestamp.to_pydatetime Convert a Timestamp object to a native Python datetime object.
Timestamp.today(cls[, tz]) Return the current time in the local timezone.
Timestamp.toordinal Return proleptic Gregorian ordinal.
Timestamp.tz_convert Convert tz-aware Timestamp to another time zone.
Timestamp.tz_localize Convert naive Timestamp to local time zone, or remove timezone from tz-aware Timestamp.
Timestamp.tzname Return self.tzinfo.tzname(self).
Timestamp.utcfromtimestamp(ts) Construct a naive UTC datetime from a POSIX timestamp.
Timestamp.utcnow() Return a new Timestamp representing UTC day and time.
Timestamp.utcoffset Return self.tzinfo.utcoffset(self).
Timestamp.utctimetuple Return UTC time tuple, compatible with time.localtime().
Timestamp.weekday Return the day of the week represented by the date.
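
A short sketch of some of the Timestamp methods above. The date and the "Asia/Tokyo" zone are arbitrary choices for illustration, and the example assumes a time zone database is available to pandas:

```python
import datetime

import pandas as pd

ts = pd.Timestamp("2018-03-15 14:45:30")

# Truncate to the minute, and to midnight.
floored = ts.floor("min")
midnight = ts.normalize()

# Attach a time zone to a naive timestamp, then convert to another zone.
utc = ts.tz_localize("UTC")
tokyo = utc.tz_convert("Asia/Tokyo")

# Day of the week: weekday() counts from Monday=0, isoweekday() from Monday=1.
day_index = ts.weekday()
iso_day_index = ts.isoweekday()

# Plain datetime.datetime equivalent.
native = ts.to_pydatetime()

print(floored, midnight, tokyo, day_index, iso_day_index, native)
```

`tz_convert` changes the wall-clock representation only; `utc` and `tokyo` refer to the same instant.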

Interval

Interval Immutable object implementing an Interval, a bounded slice-like interval.

Timedelta

Timedelta Represents a duration, the difference between two dates or times.

Properties

Timedelta.asm8 return a numpy timedelta64 array view of myself
Timedelta.components Return a Components NamedTuple-like
Timedelta.days Number of Days
Timedelta.delta return our delta in ns (for internal compat)
Timedelta.freq
Timedelta.is_populated
Timedelta.max
Timedelta.microseconds Number of microseconds (>= 0 and less than 1 second).
Timedelta.min
Timedelta.nanoseconds Number of nanoseconds (>= 0 and less than 1 microsecond).
Timedelta.resolution return a string representing the lowest resolution that we have
Timedelta.seconds Number of seconds (>= 0 and less than 1 day).
Timedelta.value
Timedelta.view array view compat

Methods

Timedelta.ceil return a new Timedelta ceiled to this resolution
Timedelta.floor return a new Timedelta floored to this resolution
Timedelta.isoformat Format Timedelta as ISO 8601 Duration like P[n]Y[n]M[n]DT[n]H[n]M[n]S, where the [n]s are replaced by the values.
Timedelta.round Round the Timedelta to the specified resolution
Timedelta.to_pytimedelta return an actual datetime.timedelta object
Timedelta.to_timedelta64 Returns a numpy.timedelta64 object with ‘ns’ precision
Timedelta.total_seconds Total duration of timedelta in seconds (to ns precision)
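
The properties and methods above can be tried on a single duration. This is a sketch with an arbitrary example value:

```python
import datetime

import pandas as pd

td = pd.Timedelta(days=1, hours=2, minutes=30, seconds=45)

# Component properties: whole days, then the leftover seconds within the day.
days = td.days
seconds = td.seconds

# Whole duration expressed in seconds.
total = td.total_seconds()

# Round down or up to a whole minute.
floored = td.floor("min")
ceiled = td.ceil("min")

# Standard-library equivalent.
native = td.to_pytimedelta()

print(days, seconds, total, floored, ceiled, native)
```

Note that `seconds` is the remainder after whole days are removed, while `total_seconds()` covers the full duration.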

Frequencies

to_offset(freq) Return DateOffset object from string or tuple representation

Window

Rolling objects are returned by .rolling calls: pandas.DataFrame.rolling(), pandas.Series.rolling(), etc. Expanding objects are returned by .expanding calls: pandas.DataFrame.expanding(), pandas.Series.expanding(), etc. EWM objects are returned by .ewm calls: pandas.DataFrame.ewm(), pandas.Series.ewm(), etc.
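
As a minimal sketch of the three kinds of window object described above (the sample values, window size and span are illustrative assumptions):

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Rolling: a fixed-size window of 3 observations.
# The first two results are NaN because the window is not yet full.
rolling_mean = s.rolling(window=3).mean()

# Expanding: the window grows to include every observation seen so far.
expanding_sum = s.expanding().sum()

# EWM: exponentially weighted, so recent observations count more.
ewm_mean = s.ewm(span=3).mean()

print(rolling_mean.tolist())
print(expanding_sum.tolist())
print(ewm_mean.tolist())
```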

Standard moving window functions

Rolling.count() rolling count of number of non-NaN
Rolling.sum(*args, **kwargs) rolling sum
Rolling.mean(*args, **kwargs) rolling mean
Rolling.median(**kwargs) rolling median
Rolling.var([ddof]) rolling variance
Rolling.std([ddof]) rolling standard deviation
Rolling.min(*args, **kwargs) rolling minimum
Rolling.max(*args, **kwargs) rolling maximum
Rolling.corr([other, pairwise]) rolling sample correlation
Rolling.cov([other, pairwise, ddof]) rolling sample covariance
Rolling.skew(**kwargs) Unbiased rolling skewness
Rolling.kurt(**kwargs) Unbiased rolling kurtosis
Rolling.apply(func[, args, kwargs]) rolling function apply
Rolling.quantile(quantile, **kwargs) rolling quantile
Window.mean(*args, **kwargs) window mean
Window.sum(*args, **kwargs) window sum

Standard expanding window functions

Expanding.count(**kwargs) expanding count of number of non-NaN
Expanding.sum(*args, **kwargs) expanding sum
Expanding.mean(*args, **kwargs) expanding mean
Expanding.median(**kwargs) expanding median
Expanding.var([ddof]) expanding variance
Expanding.std([ddof]) expanding standard deviation
Expanding.min(*args, **kwargs) expanding minimum
Expanding.max(*args, **kwargs) expanding maximum
Expanding.corr([other, pairwise]) expanding sample correlation
Expanding.cov([other, pairwise, ddof]) expanding sample covariance
Expanding.skew(**kwargs) Unbiased expanding skewness
Expanding.kurt(**kwargs) Unbiased expanding kurtosis
Expanding.apply(func[, args, kwargs]) expanding function apply
Expanding.quantile(quantile, **kwargs) expanding quantile

Exponentially-weighted moving window functions

EWM.mean(*args, **kwargs) exponential weighted moving average
EWM.std([bias]) exponential weighted moving stddev
EWM.var([bias]) exponential weighted moving variance
EWM.corr([other, pairwise]) exponential weighted sample correlation
EWM.cov([other, pairwise, bias]) exponential weighted sample covariance

GroupBy

GroupBy objects are returned by groupby calls: pandas.DataFrame.groupby(), pandas.Series.groupby(), etc.
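
A small sketch of obtaining a GroupBy object and inspecting it, using a made-up two-column DataFrame (the column names "key" and "value" are assumptions for the example):

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "b", "a", "b"], "value": [1, 2, 3, 4]})

# Calling groupby returns a GroupBy object; nothing is computed yet.
grouped = df.groupby("key")

# Mapping of group name to the row labels that belong to the group.
group_labels = {name: list(labels) for name, labels in grouped.groups.items()}

# The rows of a single group.
group_a = grouped.get_group("a")

# Iterating yields (group name, sub-DataFrame) pairs.
sums = {name: int(frame["value"].sum()) for name, frame in grouped}

print(group_labels, sums)
```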

Indexing, iteration

GroupBy.__iter__() Groupby iterator
GroupBy.groups dict {group name -> group labels}
GroupBy.indices dict {group name -> group indices}
GroupBy.get_group(name[, obj]) Constructs NDFrame from group with provided name
Grouper([key, level, freq, axis, sort]) A Grouper allows the user to specify a groupby instruction for a target

Function application

GroupBy.apply(func, *args, **kwargs) Apply function and combine results together in an intelligent way.
GroupBy.aggregate(func, *args, **kwargs)
GroupBy.transform(func, *args, **kwargs)
GroupBy.pipe(func, *args, **kwargs) Apply a function with arguments to this GroupBy object,
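
The four function-application methods above differ mainly in the shape of their result. The following sketch, on an invented DataFrame, groups a single column and applies each in turn:

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "b", "a", "b"], "value": [1.0, 2.0, 3.0, 4.0]})
grouped = df.groupby("key")["value"]

# aggregate: one result per group.
totals = grouped.aggregate("sum")

# transform: one result per original row, aligned with the input index.
centered = grouped.transform(lambda x: x - x.mean())

# apply: call a function on each group and combine the results.
ranges = grouped.apply(lambda x: x.max() - x.min())

# pipe: pass the GroupBy object itself to a function.
piped_ranges = grouped.pipe(lambda g: g.max() - g.min())

print(totals.to_dict(), centered.tolist(), ranges.to_dict())
```

Here `apply` and `pipe` produce the same per-group ranges; `pipe` is mostly useful for chaining functions that expect the whole GroupBy object.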

Computations / Descriptive Stats

GroupBy.count() Compute count of group, excluding missing values
GroupBy.cumcount([ascending]) Number each item in each group from 0 to the length of that group - 1.
GroupBy.first(**kwargs) Compute first of group values
GroupBy.head([n]) Returns first n rows of each group.
GroupBy.last(**kwargs) Compute last of group values
GroupBy.max(**kwargs) Compute max of group values
GroupBy.mean(*args, **kwargs) Compute mean of groups, excluding missing values
GroupBy.median(**kwargs) Compute median of groups, excluding missing values
GroupBy.min(**kwargs) Compute min of group values
GroupBy.ngroup([ascending]) Number each group from 0 to the number of groups - 1.
GroupBy.nth(n[, dropna]) Take the nth row from each group if n is an int, or a subset of rows if n is a list of ints.
GroupBy.ohlc() Compute open, high, low and close values of a group, excluding missing values
GroupBy.prod(**kwargs) Compute prod of group values
GroupBy.size() Compute group sizes
GroupBy.sem([ddof]) Compute standard error of the mean of groups, excluding missing values
GroupBy.std([ddof]) Compute standard deviation of groups, excluding missing values
GroupBy.sum(**kwargs) Compute sum of group values
GroupBy.var([ddof]) Compute variance of groups, excluding missing values
GroupBy.tail([n]) Returns last n rows of each group

The following methods are available in both SeriesGroupBy and DataFrameGroupBy objects, but may differ slightly: the DataFrameGroupBy version usually permits the specification of an axis argument, and often an argument indicating whether to restrict application to columns of a specific data type.

DataFrameGroupBy.agg(arg, *args, **kwargs) Aggregate using callable, string, dict, or list of string/callables
DataFrameGroupBy.all Return whether all elements are True over requested axis
DataFrameGroupBy.any Return whether any element is True over requested axis
DataFrameGroupBy.bfill([limit]) Backward fill the values
DataFrameGroupBy.corr Compute pairwise correlation of columns, excluding NA/null values
DataFrameGroupBy.count() Compute count of group, excluding missing values
DataFrameGroupBy.cov Compute pairwise covariance of columns, excluding NA/null values
DataFrameGroupBy.cummax([axis]) Cumulative max for each group
DataFrameGroupBy.cummin([axis]) Cumulative min for each group
DataFrameGroupBy.cumprod([axis]) Cumulative product for each group
DataFrameGroupBy.cumsum([axis]) Cumulative sum for each group
DataFrameGroupBy.describe(**kwargs) Generates descriptive statistics that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values.
DataFrameGroupBy.diff 1st discrete difference of object
DataFrameGroupBy.ffill([limit]) Forward fill the values
DataFrameGroupBy.fillna Fill NA/NaN values using the specified method
DataFrameGroupBy.filter(func[, dropna]) Return a copy of a DataFrame excluding elements from groups that do not satisfy the boolean criterion specified by func.
DataFrameGroupBy.hist Draw histogram of the DataFrame’s series using matplotlib / pylab.
DataFrameGroupBy.idxmax Return index of first occurrence of maximum over requested axis.
DataFrameGroupBy.idxmin Return index of first occurrence of minimum over requested axis.
DataFrameGroupBy.mad Return the mean absolute deviation of the values for the requested axis
DataFrameGroupBy.pct_change Percent change over given number of periods.
DataFrameGroupBy.plot Class implementing the .plot attribute for groupby objects
DataFrameGroupBy.quantile Return values at the given quantile over requested axis, a la numpy.percentile.
DataFrameGroupBy.rank Compute numerical data ranks (1 through n) along axis.
DataFrameGroupBy.resample(rule, *args, **kwargs) Provide resampling when using a TimeGrouper
DataFrameGroupBy.shift([periods, freq, axis]) Shift each group by periods observations
DataFrameGroupBy.size() Compute group sizes
DataFrameGroupBy.skew Return unbiased skew over requested axis
DataFrameGroupBy.take Return the elements in the given positional indices along an axis.
DataFrameGroupBy.tshift Shift the time index, using the index’s frequency if available.
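
As a minimal sketch of how the shared methods behave on the two object types, the example below calls agg and cumsum on a DataFrameGroupBy and on a SeriesGroupBy. The frame, its column names (key, x, y) and its values are invented for illustration.

```python
import pandas as pd

# Hypothetical data: one grouping column and two numeric columns.
df = pd.DataFrame({
    "key": ["a", "a", "b", "b"],
    "x": [1, 2, 3, 4],
    "y": [10.0, 20.0, 30.0, 40.0],
})

frame_groups = df.groupby("key")        # DataFrameGroupBy
series_groups = df.groupby("key")["x"]  # SeriesGroupBy

# The same method name is available on both; the result type follows the object.
frame_totals = frame_groups.agg("sum")    # DataFrame with columns x and y
series_totals = series_groups.agg("sum")  # Series named x

# Cumulative methods keep the original row index instead of collapsing groups.
running = frame_groups.cumsum()
```

Here frame_totals has one row per group, while running has one row per original row.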

The following methods are available only for SeriesGroupBy objects.

SeriesGroupBy.nlargest Return the largest n elements.
SeriesGroupBy.nsmallest Return the smallest n elements.
SeriesGroupBy.nunique([dropna]) Returns number of unique elements in the group
SeriesGroupBy.unique Return unique values in the object.
SeriesGroupBy.value_counts([normalize, ...])
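
The SeriesGroupBy-only methods can be sketched as follows, assuming a small made-up Series grouped by a list of labels (the labels u and v are hypothetical).

```python
import pandas as pd

# Hypothetical Series; the list passed to groupby assigns each row to a group.
s = pd.Series([1, 1, 2, 5, 3, 4], index=list("abcdef"))
groups = s.groupby(["u", "u", "u", "v", "v", "v"])

# Number of distinct values within each group.
distinct = groups.nunique()

# Largest element of each group; the result is indexed by
# (group label, original index label).
top = groups.nlargest(1)
```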

The following methods are available only for DataFrameGroupBy objects.

DataFrameGroupBy.corrwith Compute pairwise correlation between rows or columns of two DataFrame objects.
DataFrameGroupBy.boxplot(grouped[, ...]) Make box plots from DataFrameGroupBy data.

Resampling

Resampler objects are returned by resample calls: pandas.DataFrame.resample(), pandas.Series.resample().

Indexing, iteration

Resampler.__iter__() Groupby iterator
Resampler.groups dict {group name -> group labels}
Resampler.indices dict {group name -> group indices}
Resampler.get_group(name[, obj]) Constructs NDFrame from group with provided name

Function application

Resampler.apply(arg, *args, **kwargs) Aggregate using callable, string, dict, or list of string/callables
Resampler.aggregate(arg, *args, **kwargs) Aggregate using callable, string, dict, or list of string/callables
Resampler.transform(arg, *args, **kwargs) Call function producing a like-indexed Series on each group and return

Upsampling

Resampler.ffill([limit]) Forward fill the values
Resampler.backfill([limit]) Backward fill the values
Resampler.bfill([limit]) Backward fill the values
Resampler.pad([limit]) Forward fill the values
Resampler.nearest([limit]) Fill values with nearest neighbor starting from center
Resampler.fillna(method[, limit]) Fill missing values
Resampler.asfreq([fill_value]) return the values at the new freq,
Resampler.interpolate([method, axis, limit, ...]) Interpolate values according to different methods.
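
A minimal sketch of the upsampling methods, assuming a made-up Series sampled every two days that is resampled to daily frequency. The new, empty slots are either left missing, forward filled, or linearly interpolated.

```python
import pandas as pd

# Hypothetical series observed every second day: Jan 1, Jan 3, Jan 5.
idx = pd.date_range("2018-01-01", periods=3, freq="2D")
s = pd.Series([1.0, 3.0, 5.0], index=idx)

daily = s.resample("D")  # Resampler at the finer (daily) frequency

as_is = daily.asfreq()        # new slots are NaN
filled = daily.ffill()        # new slots repeat the previous observation
interp = daily.interpolate()  # new slots are linearly interpolated
```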

Computations / Descriptive Stats

Resampler.count([_method]) Compute count of group, excluding missing values
Resampler.nunique([_method]) Returns number of unique elements in the group
Resampler.first([_method]) Compute first of group values
Resampler.last([_method]) Compute last of group values
Resampler.max([_method]) Compute max of group values
Resampler.mean([_method]) Compute mean of groups, excluding missing values
Resampler.median([_method]) Compute median of groups, excluding missing values
Resampler.min([_method]) Compute min of group values
Resampler.ohlc([_method]) Compute open, high, low and close values of a group, excluding missing values
Resampler.prod([_method]) Compute prod of group values
Resampler.size() Compute group sizes
Resampler.sem([_method]) Compute standard error of the mean of groups, excluding missing values
Resampler.std([ddof]) Compute standard deviation of groups, excluding missing values
Resampler.sum([_method]) Compute sum of group values
Resampler.var([ddof]) Compute variance of groups, excluding missing values
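
The descriptive statistics above can be sketched on a downsampled series. The example assumes six made-up daily observations grouped into three-day bins.

```python
import pandas as pd

# Hypothetical daily data covering Jan 1 through Jan 6.
idx = pd.date_range("2018-01-01", periods=6, freq="D")
s = pd.Series([1, 2, 3, 4, 5, 6], index=idx)

r = s.resample("3D")  # two bins: Jan 1-3 and Jan 4-6

totals = r.sum()
means = r.mean()
bars = r.ohlc()  # DataFrame with columns open, high, low, close
```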

Style

Styler objects are returned by pandas.DataFrame.style.

Styler Constructor

Styler(data[, precision, table_styles, ...]) Helps style a DataFrame or Series according to the data with HTML and CSS.
Styler.from_custom_template(searchpath, name) Factory function for creating a subclass of Styler with a custom template and Jinja environment.

Style Application

Styler.apply(func[, axis, subset]) Apply a function column-wise, row-wise, or table-wise, updating the HTML representation with the result.
Styler.applymap(func[, subset]) Apply a function elementwise, updating the HTML representation with the result.
Styler.where(cond, value[, other, subset]) Apply a function elementwise, updating the HTML representation with a style which is selected in accordance with the return value of a function.
Styler.format(formatter[, subset]) Format the text display value of cells.
Styler.set_precision(precision) Set the precision used to render.
Styler.set_table_styles(table_styles) Set the table styles on a Styler.
Styler.set_table_attributes(attributes) Set the table attributes.
Styler.set_caption(caption) Set the caption on a Styler
Styler.set_properties([subset]) Convenience method for setting one or more non-data dependent properties for each cell.
Styler.set_uuid(uuid) Set the uuid for a Styler.
Styler.clear() “Reset” the styler, removing any previously applied styles.

Builtin Styles

Styler.highlight_max([subset, color, axis]) Highlight the maximum by shading the background
Styler.highlight_min([subset, color, axis]) Highlight the minimum by shading the background
Styler.highlight_null([null_color]) Shade the background null_color for missing values.
Styler.background_gradient([cmap, low, ...]) Color the background in a gradient according to the data in each column (optionally row).
Styler.bar([subset, axis, color, width, align]) Color the background color proportional to the values in each column.

Style Export and Import

Styler.render(**kwargs) Render the built up styles to HTML
Styler.export() Export the styles applied to the current Styler.
Styler.use(styles) Set the styles on the current Styler, possibly using styles from Styler.export.
Styler.to_excel(excel_writer[, sheet_name, ...]) Write Styler to an excel sheet

General utility functions

Working with options

describe_option(pat[, _print_desc]) Prints the description for one or more registered options.
reset_option(pat) Reset one or more options to their default value.
get_option(pat) Retrieves the value of the specified option.
set_option(pat, value) Sets the value of the specified option.
option_context(*args) Context manager to temporarily set options in the with statement context.
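
A minimal sketch of the option functions, using the display.max_rows option as an example and assuming it still has its default value when the snippet starts.

```python
import pandas as pd

default_rows = pd.get_option("display.max_rows")

# option_context changes the option only inside the with block.
with pd.option_context("display.max_rows", 5):
    inside = pd.get_option("display.max_rows")
after = pd.get_option("display.max_rows")

# set_option changes the option until it is reset.
pd.set_option("display.max_rows", 20)
changed = pd.get_option("display.max_rows")
pd.reset_option("display.max_rows")
restored = pd.get_option("display.max_rows")
```

option_context is usually the safer choice in library code, since the previous value is restored even if the block raises.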

Testing functions

testing.assert_frame_equal(left, right[, ...]) Check that left and right DataFrame are equal.
testing.assert_series_equal(left, right[, ...]) Check that left and right Series are equal.
testing.assert_index_equal(left, right[, ...]) Check that left and right Index are equal.
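
The testing functions return nothing when the objects match and raise AssertionError otherwise. A minimal sketch with made-up frames:

```python
import pandas as pd

left = pd.DataFrame({"a": [1, 2], "b": [0.5, 1.5]})
right = left.copy()

# Equal frames: the call returns silently.
pd.testing.assert_frame_equal(left, right)

# A frame that differs in one value: the call raises AssertionError.
different = right.assign(a=[1, 3])
try:
    pd.testing.assert_frame_equal(left, different)
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True
```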

Exceptions and warnings

errors.DtypeWarning Warning that is raised for a dtype incompatibility.
errors.EmptyDataError Exception that is thrown in pd.read_csv (by both the C and Python engines) when empty data or header is encountered.
errors.OutOfBoundsDatetime
errors.ParserError Exception that is raised by an error encountered in pd.read_csv.
errors.ParserWarning Warning that is raised in pd.read_csv whenever it is necessary to change parsers (generally from ‘c’ to ‘python’) contrary to the one specified by the user due to lack of support or functionality for parsing particular attributes of a CSV file with the requested engine.
errors.PerformanceWarning Warning raised when there is a possible performance impact.
errors.UnsortedIndexError Error raised when attempting to get a slice of a MultiIndex, and the index has not been lexsorted.
errors.UnsupportedFunctionCall Exception raised when attempting to call a numpy function on a pandas object, but that function is not supported by the object e.g.
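
As a minimal sketch of how these exceptions are used, the example below catches errors.EmptyDataError when pd.read_csv is given an empty in-memory buffer. The buffer stands in for an empty file and is an assumption made for illustration.

```python
import io

import pandas as pd

# An empty buffer has neither a header nor data, so read_csv cannot parse it.
try:
    pd.read_csv(io.StringIO(""))
    caught = None
except pd.errors.EmptyDataError as exc:
    caught = exc
```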