API proposal: shouldn't the "index" property (and parameter) of a df be renamed "rows"? #12891

Closed
sylvaticus opened this issue Apr 13, 2016 · 3 comments

Comments

@sylvaticus

Sorry to open an issue for such a minor thing.
In a DataFrame object, both the columns and index properties are actually Index objects.
Wouldn't it be cleaner to rename index to rows?
Since you are working with n-dimensional panels, rows and columns could then just be convenient aliases for something like axis0 and axis1 respectively (in a df), while higher-dimensional containers could use axis2, axis3 and so on (the current axis names labels, items, major_axis and minor_axis do not suggest any particular order).

@jreback
Contributor

jreback commented Apr 13, 2016

Well, .rows is not meaningful for a 1-d (Series) object, so I'm not sure of the point. You can index by number if you want, e.g. 0 {index}, 1 {columns}.
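As a rough illustration of referring to the axes by number or by name (a minimal sketch assuming a reasonably recent pandas; the frame and its labels are invented for this example):

```python
import numpy as np
import pandas as pd

# Small example frame; the labels are made up for illustration.
df = pd.DataFrame(np.array([[0, 1], [2, 3]]),
                  index=["r1", "r2"], columns=["c1", "c2"])

# axis=0 and axis="index" name the same axis (the row labels),
# so both calls reduce down the rows and return one value per column.
sum_by_number = df.sum(axis=0)
sum_by_name = df.sum(axis="index")

# axis=1 and axis="columns" name the column labels,
# so both calls reduce across the columns and return one value per row.
row_sum_by_number = df.sum(axis=1)
row_sum_by_name = df.sum(axis="columns")

# A Series is 1-d: it has a single axis (axis 0) and no column axis.
s = df["c1"]
series_axes_count = len(s.axes)
series_total = s.sum(axis=0)
```

The numeric form works on both Series and DataFrame for axis 0, which is the point being made about a `rows` name not carrying over to 1-d objects.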

@sylvaticus
Author

Thank you for your opinion and the quick answer.
I am replying just to make sure my point is clear, but of course this is only a proposal from a new user who doesn't know the architectural design of the pandas library. I am aware of it ;-)

The "problem" is that you can't use axis=0 everywhere, e.g. in the constructor:

class pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)

import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[0, 1], [2, 3]]),
                  columns=['c1', 'c2'],
                  index=['r1', 'r2'])

You need to use the index and columns parameters explicitly.
In my proposal, rows and columns would be aliases for axis0 and axis1 in the specific DataFrame context, where by "alias" I mean that the API guarantees them to be equivalent (that is, these names are provided just for the convenience of users, who can mentally map 2D data in a table-like fashion), while for higher dimensionality you can refer to axisN.
For API backward compatibility you could keep the current axis names as additional (deprecated?) aliases.
I do not see anything against rows in a Series context, but another alias name may be more appropriate; in any case, something other than a generic name (Index, the type of the object) reused as a "private" name (index, the name of the axis).
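To make the idea concrete, here is a minimal user-side sketch. The helper `make_frame` and its `rows` parameter are hypothetical names invented for this example; they are not part of the pandas API, and the sketch only forwards `rows` to the existing `index` parameter:

```python
import numpy as np
import pandas as pd


def make_frame(data, rows=None, columns=None, **kwargs):
    """Hypothetical wrapper (not pandas API): treat 'rows' as an alias
    for the 'index' parameter of the DataFrame constructor."""
    return pd.DataFrame(data, index=rows, columns=columns, **kwargs)


frame = make_frame(np.array([[0, 1], [2, 3]]),
                   rows=["r1", "r2"],
                   columns=["c1", "c2"])

# The same frame built with the real constructor parameters.
reference = pd.DataFrame(np.array([[0, 1], [2, 3]]),
                         index=["r1", "r2"],
                         columns=["c1", "c2"])
```

Both calls should build the same frame; the wrapper changes only the spelling of the keyword, not the underlying Index objects.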

@jorisvandenbossche
Member

@sylvaticus In recent years, pandas has actually moved in the opposite direction. Some functions had a rows keyword, and these were deprecated/removed in favor of index (see e.g. #5505). So I don't think we are going to change that again.
Further, the >2D objects (Panel etc.) are probably going to be deprecated in a future release, so I wouldn't worry about those names too much.

But I agree that the fact that both the index and the columns are Index objects is a potentially confusing point for newcomers. However, I think it is something that people will have to learn, as IMO it is too late to change that.
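For newcomers, a short check may help show what "both are Index objects" means in practice (a sketch with an invented example frame, assuming a reasonably recent pandas):

```python
import pandas as pd

# Example frame; the labels are made up for illustration.
df = pd.DataFrame([[0, 1], [2, 3]],
                  index=["r1", "r2"], columns=["c1", "c2"])

# Both the row labels and the column labels are instances of pd.Index.
index_is_index = isinstance(df.index, pd.Index)
columns_is_index = isinstance(df.columns, pd.Index)

# Transposing swaps the two label sets: the old columns become the
# new index and the old index becomes the new columns.
transposed = df.T
```

So `index` names one of the two axes of a DataFrame, while `Index` is the type used for the labels of either axis.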
