Understanding Rbind in Python

One of the most important elements of Python is its expandability. Various third-party libraries can add a wide range of specialized functionality to the language. And this is perhaps most readily apparent in areas related to math and data science. NumPy and Pandas are particularly important for anyone who needs to work with large amounts of data.

But these libraries don’t work so well simply because they offer specialized data types. Pandas’ lexicon offers data-focused functions that are designed to work with complex data types in an impressively optimized way. However, it’s important to remember that Python’s versatility is what makes Pandas possible. These functions can be used as building blocks to create additional functionality. So if there’s a function that you wish was in Pandas’ lexicon, you can always implement it yourself using the tools it provides. For example, what if you wanted to use something like R’s Rbind with Pandas’ data frames?

A Quick Look at Rbind

R’s Rbind, short for row bind, lets you easily combine two data frames by their rows. As the name suggests, it appends the rows of one data frame beneath the rows of another. It’s an extremely useful function. But we don’t find it as a default option for Pandas’ data frames. That doesn’t mean we can’t implement the function’s capabilities by ourselves though.

Binding and Concatenating Data

Pandas might not give us Rbind’s specific function to use. But it does give us a concat function that can do something similar. The main difference is that concat is capable of a lot more than just a simple row bind. We can emulate Rbind fairly easily using a subset of concat’s functionality though. For example, take a look at the following code.

import pandas as pd

# Two small frames that share the same column names
df1 = pd.DataFrame({'Classification': ['Alpha', 'Beta', 'Gamma'], 'Value': [99, 98, 97]})
df2 = pd.DataFrame({'Classification': ['Delta', 'Epsilon', 'Zeta'], 'Value': [96, 95, 94]})
# Stack the rows of df2 beneath the rows of df1
df = pd.concat([df1, df2])
print(df)

The code sample starts out with an import to use Pandas’ functionality as pd. We then define and populate df1 and df2 as standard Pandas data frames. Next, the code uses Pandas’ concat to combine df1 with df2 and assigns the result to df. Finally, we print out the results on our screen. At this point, we have something very close to Rbind’s functionality. But we can take things a little further to make this as simple as Rbind’s syntax in R.
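Before moving on, it can help to look at what concat does with the row labels and with its axis parameter. The following is a minimal sketch that reuses the same sample frames. The variable names stacked, index_labels, rows_labeled_zero, and side_by_side are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'Classification': ['Alpha', 'Beta', 'Gamma'], 'Value': [99, 98, 97]})
df2 = pd.DataFrame({'Classification': ['Delta', 'Epsilon', 'Zeta'], 'Value': [96, 95, 94]})

# axis=0 (the default) stacks rows, which is the row-bind behavior.
stacked = pd.concat([df1, df2])

# Each frame keeps its own row labels, so 0, 1, and 2 each appear twice.
index_labels = stacked.index.tolist()
print(index_labels)  # [0, 1, 2, 0, 1, 2]

# A lookup by label therefore matches one row from each original frame.
rows_labeled_zero = stacked.loc[0]
print(rows_labeled_zero)

# axis=1 places the frames side by side instead of stacking them.
side_by_side = pd.concat([df1, df2], axis=1)
print(side_by_side.shape)  # (3, 4)
```

The repeated labels are harmless for printing, but they make label-based lookups ambiguous, which is one reason to tidy the index after a row bind.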

A Custom Rbind

Take a look at the following code to see how we can expand the functionality of the prior example while also tidying it up.

import pandas as pd

def Rbind(frame1, frame2):
    # Stack the rows, then renumber them 0..n-1 and discard the old labels
    joinedFrame = pd.concat([frame1, frame2]).reset_index(drop=True)
    return joinedFrame

df1 = pd.DataFrame({'Classification': ['Alpha', 'Beta', 'Gamma'], 'Value': [99, 98, 97]})
df2 = pd.DataFrame({'Classification': ['Delta', 'Epsilon', 'Zeta'], 'Value': [96, 95, 94]})
df = Rbind(df1, df2)
print(df)

We begin by once again importing Pandas as pd. But this time around we immediately define a new function called Rbind. This function accepts two data frames, frame1 and frame2. It then performs a concatenation similar to the one in the first example. The main point of difference is that we also reset the index by calling the data frame method reset_index on concat’s result, passing drop=True so the old labels are discarded rather than kept as a new column. If you recall, the earlier example repeated the row labels 0, 1, and 2 in the leftmost column of its output. Resetting the index renumbers the rows from 0 to 5, so every row is listed in order with a unique label.
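As an alternative, concat can renumber the rows itself through its ignore_index parameter. The sketch below compares the two approaches on the same sample frames. The variable names with_reset, with_ignore, and same_result are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'Classification': ['Alpha', 'Beta', 'Gamma'], 'Value': [99, 98, 97]})
df2 = pd.DataFrame({'Classification': ['Delta', 'Epsilon', 'Zeta'], 'Value': [96, 95, 94]})

# Approach used in the article: concatenate, then reset the index.
with_reset = pd.concat([df1, df2]).reset_index(drop=True)

# Alternative: ask concat to ignore the incoming row labels.
with_ignore = pd.concat([df1, df2], ignore_index=True)

# Both frames hold the same values under the same labels.
same_result = with_reset.equals(with_ignore)
print(same_result)  # True
print(with_ignore.index.tolist())  # [0, 1, 2, 3, 4, 5]
```

Either form works inside a custom function. The ignore_index version simply skips the separate reset_index call.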

The code following this new function is nearly the same as in the first example. We define df1 and df2 in the same way and with the same content. But we follow that up by defining df with the output of our new Rbind function. Finally, we print the result to the screen. Note that the columns of df1 and df2 are matched by name and joined in df. And the row numbering runs continuously from one frame into the next to create a seamless integration between the two data frames.
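One difference is worth keeping in mind. When the column names don’t match, concat keeps every column and fills the gaps with missing values, whereas R’s row bind on data frames rejects mismatched column names. The sketch below shows a stricter variant that also accepts any number of frames. The function rbind_strict and the frame df3 are hypothetical names made up for this illustration, not part of Pandas.

```python
import pandas as pd

def rbind_strict(*frames):
    """Hypothetical helper: stack frames by row, requiring identical column names."""
    if not frames:
        raise ValueError("rbind_strict needs at least one data frame")
    expected = set(frames[0].columns)
    for position, frame in enumerate(frames[1:], start=2):
        if set(frame.columns) != expected:
            raise ValueError(
                f"frame {position} has columns {list(frame.columns)}, "
                f"expected {list(frames[0].columns)}"
            )
    # ignore_index=True renumbers the rows from 0 in the combined frame.
    return pd.concat(list(frames), ignore_index=True)

df1 = pd.DataFrame({'Classification': ['Alpha', 'Beta', 'Gamma'], 'Value': [99, 98, 97]})
df2 = pd.DataFrame({'Classification': ['Delta', 'Epsilon', 'Zeta'], 'Value': [96, 95, 94]})
# df3 deliberately uses 'Score' instead of 'Value'.
df3 = pd.DataFrame({'Classification': ['Eta'], 'Score': [93]})

combined = rbind_strict(df1, df2)
print(len(combined))  # 6

# Plain concat accepts the mismatch and fills the gaps with NaN.
loose = pd.concat([df1, df3], ignore_index=True)
print(loose.columns.tolist())  # ['Classification', 'Value', 'Score']

# The strict helper refuses the same input.
try:
    rbind_strict(df1, df3)
    strict_rejected = False
except ValueError as error:
    strict_rejected = True
    print(error)
```

Whether the strict or the forgiving behavior is preferable depends on the data. The check simply turns a silent column mismatch into an explicit error.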
