How to Fix the Python Error: nameerror name pd is not defined

Pandas is one of the most well-known Python libraries for data manipulation and analysis. The library’s combination of fame, power, and versatility makes it an easy choice for Python coders who are interested in pushing the language to the limits. At the same time, using a complex new library often brings up some unexpected errors. A “nameerror name pd is not defined” error is one of the most common problems people face when using Pandas for the first time. Fortunately, it’s also usually quite easy to fix.

What Does the Error Mean?

The “nameerror name pd is not defined” error, which Python prints as NameError: name 'pd' is not defined, is saying that your code references the name pd without ever defining it. In practice, this almost always means the code expects Pandas to have been imported under an alias of pd, and that import is missing.
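As a minimal sketch of what the interpreter is complaining about, the snippet below references pd without importing anything and captures the resulting message. The function name build_frame is hypothetical and only there for illustration. Pandas doesn’t need to be installed for this to run, since Python fails on the name lookup before it ever reaches Pandas.

```python
def build_frame():
    # pd is never imported or assigned anywhere in this file,
    # so Python fails as soon as it tries to look the name up.
    return pd.DataFrame({'first': [1, 2, 3]})


try:
    build_frame()
except NameError as exc:
    message = str(exc)

print(message)  # name 'pd' is not defined
```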

Looking Deeper Into Some of the Error’s Hints

The error message would normally be fairly ambiguous on its own. A NameError can be raised by anything from a missing import to a typo in a variable name. However, there’s one important clue to be found in this error message’s specific format. It references a variable called pd.
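To show that the same exception covers simple typos as well, here is a small sketch with a misspelled integer variable. The names item_count and item_cuont are made up for this example.

```python
item_count = 5

try:
    # Typo: item_cuont was never assigned, item_count was.
    total = item_cuont + 1
except NameError as exc:
    typo_message = str(exc)

# The message names the misspelled variable, just as it names pd in our case.
print(typo_message)
```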

The characters pd could technically refer to anything. It’s just two characters that could be assigned anything from a string to a NumPy float. But when you see pd it’s almost always in the context of Pandas. It’s something of a tradition to import the Pandas module under an alias of pd. In fact, even the official Pandas documentation usually uses a pd alias.

There’s nothing forcing people to import Pandas as pd. Pandas can even be used without an alias at all. But the pd naming convention is so common that it’s almost a certainty that code with those characters is using Pandas. And likewise, if code produces a “nameerror name pd is not defined” error then it probably means that it’s missing a proper import and assignment of Pandas to pd.

This is an especially common issue when people are just starting out with Pandas and haven’t become accustomed to standard coding practices within it. For example, some tutorials might have a code block that doesn’t assign Pandas to pd simply because the author assumed everyone would do so by default. Authors are less likely to skip the import this way for libraries that lack a standard naming convention.

How To Fix the Error

Fixing the error is generally just a matter of making the proper alias assignment of Pandas to pd. We can begin implementing a fix by creating Python code that raises the “nameerror name pd is not defined” error.

import pandas
df = pd.DataFrame({'first': [1, 2, 3, 4, 5], 'second': [6, 7, 8, 9, 10]})
print(df)

We begin by importing Pandas on the first line. Note that there’s no alias assignment. Next, we try to create a Pandas dataframe and assign it to df. Finally, we try to print df to screen. But the code fails on the second line, as the interpreter notes that pd hasn’t been defined. To fix it, we simply need to assign pd as an alias of Pandas when importing the module.

import pandas as pd
df = pd.DataFrame({'first': [1, 2, 3, 4, 5], 'second': [6, 7, 8, 9, 10]})
print(df)

The code should now run smoothly with no errors. However, this isn’t the only way to fix the nameerror. You can also simply use Pandas without an alias. The following code will produce the same clean results with a slightly different approach.

import pandas
df = pandas.DataFrame({'first': [1, 2, 3, 4, 5], 'second': [6, 7, 8, 9, 10]})
print(df)

This Python code will also run without any errors. And it will produce the same dataframe as in the previous fixed example where we assigned Pandas to an alias of pd. The main difference in this approach can be seen on the first line. Pandas is directly imported, without the use of an alias. The next line removes the reference to pd and instead calls DataFrame directly from pandas.

Both of these solutions will produce the same result. So which is the best approach? On a technical level there’s no real difference between the two. However, it’s generally best to go along with stylistic rules that have been almost uniformly adopted by the larger community, especially in this case, where Pandas itself usually uses a pd alias in its official documentation.
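The claim that there’s no technical difference can be checked with a short sketch. Note the assumption here: the standard library’s statistics module stands in for Pandas, and st is a made-up alias, purely so the example runs without Pandas installed. The same import rules apply to import pandas and import pandas as pd.

```python
import statistics        # plain import, no alias
import statistics as st  # the same module, bound to an alias

# Both names point at one and the same module object.
same_module = st is statistics

# Calling through either name therefore gives identical results.
plain_result = statistics.mean([1, 2, 3, 4, 5])
alias_result = st.mean([1, 2, 3, 4, 5])

print(same_module, plain_result == alias_result)
```

An alias only adds a second name for the module. It doesn’t create a copy or change how the module behaves, which is why the choice between the two fixes comes down to style.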

Using Pandas with an alias of pd also means that your code will have a higher level of overall compatibility. If you share portions of your code with others it’s almost a given that they’ll want to use a pd alias as well. Likewise, code shared with you will probably use that same pd alias.
