📖 🐼 Pandas DataFrames: The Swiss Army Knife of Data!

📦 What is a DataFrame?

Think of a pandas DataFrame as a spreadsheet on steroids. It’s a two-dimensional data structure where:

Rows 🏠 represent individual data entries.
Columns 📊 hold different types of information about those entries.

It’s like an Excel sheet, but faster, smarter, and Pythonic!
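Here is a minimal sketch of that row-and-column structure. The two-customer sample data is made up for illustration. `shape` and `dtypes` are standard DataFrame attributes that report the table's dimensions and each column's type.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
df = pd.DataFrame({
    "Customer": ["Alice", "Bob"],
    "Price ($)": [4.5, 3.0],
})

n_rows, n_cols = df.shape   # (number of rows, number of columns)
column_types = df.dtypes    # one dtype per column

print(n_rows, n_cols)
print(column_types)
```

Each column carries its own type, which is why text and numbers can sit side by side in the same table.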

🏗️ Creating a DataFrame

Imagine you’re running a coffee shop ☕ and tracking orders. You can create a DataFrame like this:

import pandas as pd

data = {
    "Customer": ["Alice", "Bob", "Charlie"],
    "Order": ["Latte", "Espresso", "Mocha"],
    "Price ($)": [4.5, 3.0, 5.0],
}

df = pd.DataFrame(data)
print(df)

  Customer     Order  Price ($)
0    Alice     Latte        4.5
1      Bob  Espresso        3.0
2  Charlie     Mocha        5.0

Boom! 🎉 We’ve got a DataFrame!
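If your orders arrive one at a time rather than column by column, a list of dictionaries is another common way to build the same table. The sketch below assumes each record uses the same keys. The variable names `records`, `df_from_records`, and `df_from_columns` are just illustrative.

```python
import pandas as pd

# One dictionary per order (row-oriented input).
records = [
    {"Customer": "Alice", "Order": "Latte", "Price ($)": 4.5},
    {"Customer": "Bob", "Order": "Espresso", "Price ($)": 3.0},
    {"Customer": "Charlie", "Order": "Mocha", "Price ($)": 5.0},
]
df_from_records = pd.DataFrame(records)

# The column-oriented dictionary from the example above.
columns = {
    "Customer": ["Alice", "Bob", "Charlie"],
    "Order": ["Latte", "Espresso", "Mocha"],
    "Price ($)": [4.5, 3.0, 5.0],
}
df_from_columns = pd.DataFrame(columns)

# Both routes should produce the same table.
print(df_from_records.equals(df_from_columns))
```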

🔎 Accessing Data

Want to peek at the first few rows? Use .head():

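Note: .head() is an instance method, so you call it on a DataFrame object such as df. By default it returns the first five rows; our table only has three, so we see all of them.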

df.head()

	Customer	Order	Price ($)
0	Alice	Latte	4.5
1	Bob	Espresso	3.0
2	Charlie	Mocha	5.0

Need just one column? Use square brackets:

df["Order"]

0       Latte
1    Espresso
2       Mocha
Name: Order, dtype: object

Wanna grab a single order? Use .loc or .iloc:

df.loc[1]  # Fetches row with index 1 (Bob's order)
df.iloc[2]  # Fetches row at position 2 (Charlie's order); a notebook cell displays only this last result

Customer     Charlie
Order          Mocha
Price ($)        5.0
Name: 2, dtype: object

📌 Remember: .loc[] is for labels, .iloc[] is for positions!
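With the default 0, 1, 2 index, labels and positions happen to coincide, so the difference is easy to miss. The sketch below uses a made-up index of order numbers (101 to 103, purely hypothetical) to make the distinction visible.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Customer": ["Alice", "Bob", "Charlie"],
        "Order": ["Latte", "Espresso", "Mocha"],
    },
    index=[101, 102, 103],  # hypothetical order numbers used as row labels
)

by_label = df.loc[102]     # the row whose index label is 102 (Bob)
by_position = df.iloc[0]   # the first row, whatever its label (Alice)

print(by_label["Customer"])
print(by_position["Customer"])
```

Here `df.loc[0]` would raise a `KeyError`, because no row carries the label 0.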

🎛️ Filtering Data

Let’s say we only want orders above $4:

expensive_orders = df[df["Price ($)"] > 4]
print(expensive_orders)

  Customer  Order  Price ($)
0    Alice  Latte        4.5
2  Charlie  Mocha        5.0

Pandas filters like a supercharged search engine! 🔥
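Filters can also be combined. A small sketch, assuming we want orders above $4 that are not Mochas: wrap each condition in parentheses and join them with `&` (and) or `|` (or). The variable name `pricey_not_mocha` is just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Customer": ["Alice", "Bob", "Charlie"],
    "Order": ["Latte", "Espresso", "Mocha"],
    "Price ($)": [4.5, 3.0, 5.0],
})

# Each comparison yields a boolean Series; & combines them element-wise.
pricey_not_mocha = df[(df["Price ($)"] > 4) & (df["Order"] != "Mocha")]
print(pricey_not_mocha)
```

The parentheses matter: `&` binds more tightly than `>` in Python, so leaving them out changes the meaning of the expression.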

🛠️ Modifying Data

Oops! We forgot the Happy Hour deal. Let’s apply a 10% discount to every price:

df["Price ($)"] = df["Price ($)"] * 0.9

Pandas lets you modify data like a boss 😎.
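Sometimes only some rows should change. As a sketch, suppose the discount applied to Lattes only (an assumption made for this example): `.loc` with a boolean mask and a column name updates just the matching cells.

```python
import pandas as pd

df = pd.DataFrame({
    "Order": ["Latte", "Espresso", "Mocha"],
    "Price ($)": [4.5, 3.0, 5.0],
})

# Boolean mask selecting the rows to change.
is_latte = df["Order"] == "Latte"

# Update only the masked rows in the price column, rounded to cents.
df.loc[is_latte, "Price ($)"] = (df.loc[is_latte, "Price ($)"] * 0.9).round(2)
print(df)
```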

📊 Summarizing Data

Need a quick summary? Try .describe():

df.describe()

	Price ($)
count	3.00000
mean	3.75000
std	0.93675
min	2.70000
25%	3.37500
50%	4.05000
75%	4.27500
max	4.50000

Want to know how many times each drink was ordered?

df["Order"].value_counts()

Order
Latte       1
Espresso    1
Mocha       1
Name: count, dtype: int64

Pandas gives instant insights 📈.
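With one order per drink the counts are not very exciting. Here is a sketch using a made-up, busier order list to show that `value_counts()` sorts from most to least frequent.

```python
import pandas as pd

# Hypothetical busier day: some drinks are ordered more than once.
orders = pd.Series(
    ["Latte", "Mocha", "Latte", "Espresso", "Latte"],
    name="Order",
)

counts = orders.value_counts()   # most frequent drink comes first
top_drink = counts.index[0]

print(counts)
print(top_drink)
```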

🏎️ Speed Boost: Vectorized Operations

Instead of looping through rows (which is slow 🐌), use pandas’ fast operations:

❌ Slow way:

df["Price with Tax"] = [price * 1.07 for price in df["Price ($)"]]

✅ Fast way (vectorized 🚀):

df["Price with Tax"] = df["Price ($)"] * 1.07

Vectorized operations run in optimized compiled code instead of a Python-level loop, which is why they are so much faster on large columns! 🚀
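To see the difference yourself, you can time both approaches. This is a rough sketch on a synthetic column of 200,000 prices (the size and values are arbitrary assumptions); the exact timings depend on your machine, so none are quoted here.

```python
import time

import numpy as np
import pandas as pd

# Synthetic price column, large enough for the gap to show up.
prices = pd.DataFrame({"Price ($)": np.linspace(1.0, 10.0, 200_000)})

# Python-level loop via a list comprehension.
start = time.perf_counter()
looped = pd.Series([p * 1.07 for p in prices["Price ($)"]])
loop_seconds = time.perf_counter() - start

# Vectorized multiplication on the whole column at once.
start = time.perf_counter()
vectorized = prices["Price ($)"] * 1.07
vectorized_seconds = time.perf_counter() - start

print(f"loop:       {loop_seconds:.4f} s")
print(f"vectorized: {vectorized_seconds:.4f} s")
```

Both versions compute the same numbers; only the time taken differs.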

🎭 Final Act: Exporting Data

Want to save your hard work? Pandas supports:

📂 CSV: df.to_csv("orders.csv", index=False)

📊 Excel: df.to_excel("orders.xlsx", index=False) (needs an Excel writer package such as openpyxl installed)

📡 JSON: df.to_json("orders.json")

Boom! Your data is ready to travel! ✈️
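A quick way to check an export is to read it straight back. The sketch below writes the CSV to an in-memory buffer (`io.StringIO`) instead of a real file, which is an assumption made so the example leaves nothing on disk. With a filename, the calls are the same.

```python
import io

import pandas as pd

df = pd.DataFrame({
    "Customer": ["Alice", "Bob", "Charlie"],
    "Price ($)": [4.5, 3.0, 5.0],
})

# Write CSV text into an in-memory buffer rather than a file.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Rewind the buffer and load the data back in.
buffer.seek(0)
restored = pd.read_csv(buffer)
print(restored)
```

Passing `index=False` keeps the row index out of the file, so the restored table has the same columns as the original.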

🎉 Final Thoughts

Pandas is like a data superhero 🦸‍♂️—it can:

✅ Read and write data 📂

✅ Slice and dice information 🔪

✅ Analyze and visualize 📈

✅ Handle large in-memory datasets quickly ⚡

So, whether you’re a data scientist, analyst, or just curious—pandas is your best friend! 🐼🔥

Want to go deeper? Explore:

📖 Official Docs: https://pandas.pydata.org/