Dictionaries: Key-Value Pairs
Using labeled data structures to build efficient and context-rich datasets.
Imagine you have a list of prices: [1200, 450, 300]. Which price belongs to the "Luxury Bag"? Which one is the "Keychain"? In a simple list, you have to remember the exact position (index) of every item. This is risky and slow.
In the world of Data, we need Context. Instead of just storing a value, we store a Key-Value Pair. This is exactly how a dictionary works: you look up a word (the Key) to find its definition (the Value). In Python, Dictionaries are a conceptual stepping stone to the "Data Table" structure you will use later with tools like Pandas.
Dictionaries are not just lists with labels. Behind the scenes, Python uses a Hash Table to store keys. This means that looking up a key takes, on average, about the same amount of time whether your dictionary has 10 items or 10 million.

    # Create a dictionary
    product = {
        "name": "Luxury Leather Bag",
        "price": 1200,
        "stock": 15
    }

    # Look up a value (the "fast" way)
    print(product["name"])  # Luxury Leather Bag

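One rough way to see the hash table at work is to time a search through a list against a key lookup in a dictionary. This is only an illustrative sketch: the collection size and the names `items_list`, `items_dict`, and `target` are invented for the demo, and the actual timings will vary from machine to machine.

```python
import timeit

# Hypothetical demo data: 100,000 integer IDs (size chosen arbitrarily)
n = 100_000
items_list = list(range(n))
items_dict = {i: True for i in items_list}

# Worst case for the list: the value we want is the very last element
target = n - 1

# Run each check 100 times and measure the total elapsed time
list_time = timeit.timeit(lambda: target in items_list, number=100)
dict_time = timeit.timeit(lambda: target in items_dict, number=100)

print(f"list search: {list_time:.6f} s")
print(f"dict lookup: {dict_time:.6f} s")
```

On most machines the list search should come out far slower, because Python has to walk through the list item by item, while the dictionary jumps almost directly to the key.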
Keys MUST be unique and hashable, which in practice means immutable (strings, numbers, and tuples of immutable values are good keys; lists are NOT). Values can be anything, even other dictionaries or lists!
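The key rules above can be checked with a short experiment. The coordinates and labels below are made-up sample data, not part of the lesson's dataset.

```python
# Tuples of immutable values are hashable, so they work as keys
# (hypothetical coordinates and label, for illustration only)
locations = {(30.04, 31.24): "Cairo store"}
print(locations[(30.04, 31.24)])  # Cairo store

# Lists are mutable, so Python refuses to use them as keys
try:
    bad = {["Bag", "Shoes"]: 2}
except TypeError as err:
    error_message = str(err)
    print(f"TypeError: {error_message}")

# Keys are unique: assigning to an existing key overwrites its value
product = {"name": "Luxury Leather Bag", "price": 1200}
product["price"] = 1100
print(len(product))      # 2
print(product["price"])  # 1100
```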
One of the most common errors in data scripts is the KeyError. This happens when your code asks for a key that doesn't exist in the current record. In messy datasets, some records might be missing certain fields (e.g., a customer forgot to provide their phone number).
| Approach | Code | Result if Key Missing |
|---|---|---|
| Square Brackets | product["id"] | KeyError (Crash!) |
| .get() Method | product.get("id") | None (Safe) |
| .get() Default | product.get("id", 0) | 0 (Very Safe) |
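The three rows of the table can be tried side by side. This sketch reuses the `product` dictionary from earlier in the lesson, which has no "id" key; the `product_id` variable is introduced here only for the demonstration.

```python
product = {"name": "Luxury Leather Bag", "price": 1200, "stock": 15}

# 1. Square brackets raise a KeyError when the key is missing
try:
    product_id = product["id"]
except KeyError as err:
    product_id = None
    print(f"KeyError: {err}")

# 2. .get() returns None instead of raising
print(product.get("id"))  # None

# 3. .get() with a default returns the fallback you choose
print(product.get("id", 0))  # 0

# When the key exists, .get() ignores the default
print(product.get("price", 0))  # 1200
```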

    # A list of dictionaries (a common data format)
    users = [
        {"name": "Ahmed", "age": 25},
        {"name": "Mona"}  # Missing age!
    ]

    for u in users:
        # Use .get() to avoid crashing
        age = u.get("age", "Unknown")
        print(f"{u['name']} is {age}")

In professional data analysis, we rarely use a single dictionary. Instead, we use a List of Dictionaries. Think of it this way:
- Every Dictionary is a "Row" (one record).
- Every Key is a "Column" (one category).
- The List is the "Table" (the entire dataset).

    sales_data = [
        {"date": "Oct-1", "product": "Bag", "amount": 1200},
        {"date": "Oct-1", "product": "Shoes", "amount": 850},
        {"date": "Oct-2", "product": "Bag", "amount": 1200}
    ]

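To make the row and column analogy concrete, here is a minimal sketch that reads one "column" out of the table and totals sales per product. The names `amounts` and `totals` are introduced here for illustration.

```python
sales_data = [
    {"date": "Oct-1", "product": "Bag", "amount": 1200},
    {"date": "Oct-1", "product": "Shoes", "amount": 850},
    {"date": "Oct-2", "product": "Bag", "amount": 1200},
]

# Reading a "column": collect the same key from every row
amounts = [row["amount"] for row in sales_data]
print(sum(amounts))  # 3250

# Grouping: add up the amount for each product
totals = {}
for row in sales_data:
    name = row["product"]
    # .get() supplies 0 the first time a product is seen
    totals[name] = totals.get(name, 0) + row["amount"]

print(totals)  # {'Bag': 2400, 'Shoes': 850}
```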
Why would you choose .get("key", 0) over ["key"] when reading a price from a dictionary?
Let's build a mini-script that processes a raw data record and adds missing defaults.

    # Raw record from a messy API
    raw_record = {
        "item": "Tablet",
        "price": "3500",
        "is_discounted": False
    }

    # 1. Accessing keys safely
    tax = raw_record.get("tax_rate", 0.05)  # Default 5% if missing

    # 2. Adding new key-value pairs
    raw_record["category"] = "Electronics"

    # 3. Inspecting the "Table View"
    # We wrap .keys() in list() to make them easy to print and read
    print(f"Columns: {list(raw_record.keys())}")
    print(f"Values: {list(raw_record.values())}")

Let's break down what Python is telling us line by line:
Columns: Notice that the list shows four keys (item, price, is_discounted, and category) even though the original dictionary only had three. This is because step 2 of our script added a brand-new key ("category": "Electronics") directly into the dictionary. Dictionaries are mutable, so the new pair was added in place.
Values: Notice that price appears as '3500' (with quotes), not as the number 3500. This is a very common issue in real data: the price was stored as a string in the original record, not as an integer. A real analyst would need to convert it with int(raw_record["price"]) before doing any calculations. Also notice that tax_rate does not appear in the output, because .get() only returns a default value without adding it to the dictionary.
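Both observations can be followed up in code. The sketch below converts the price and stores the tax rate explicitly. The `total` calculation assumes the tax is simply added on top of the price, which the lesson does not specify, so treat it as a placeholder formula.

```python
raw_record = {"item": "Tablet", "price": "3500", "is_discounted": False}

# .get() returns the default but leaves the dictionary unchanged
tax = raw_record.get("tax_rate", 0.05)
print("tax_rate" in raw_record)  # False

# Convert the string price to an integer before doing any math
price = int(raw_record["price"])
raw_record["price"] = price

# To keep the default, assign it explicitly
raw_record["tax_rate"] = tax

# Assumed formula: tax added on top of the price
total = price * (1 + tax)
print(round(total, 2))  # 3675.0
```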
When you call .keys() or .values(), Python provides a "View Object", which is a live link to the dictionary. If you print a View Object directly, the output is wrapped in dict_keys(...) or dict_values(...), which looks a bit messy. By wrapping it in list(), we "capture" the data into a standard list format that is clean and easy to read.
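The "live link" behaviour is easiest to see by changing the dictionary after taking both a view and a list copy. The names `record`, `keys_view`, and `keys_snapshot` are used here only for illustration.

```python
record = {"item": "Tablet", "price": "3500"}

keys_view = record.keys()            # live view of the keys
keys_snapshot = list(record.keys())  # one-time copy as a plain list

# Change the dictionary after both were created
record["category"] = "Electronics"

print(keys_view)      # dict_keys(['item', 'price', 'category'])
print(keys_snapshot)  # ['item', 'price']
```

The view picks up the new key automatically, while the list stays frozen at the moment it was created.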
- Dictionaries use Key: Value pairs for fast, labeled data storage.
- .get() is essential for handling missing data without crashing.
- Lists of Dictionaries represent tabular data (Rows and Columns).
- Keys must be immutable (usually strings); Values can be any data type.
- Dictionaries in Python (Real Python): https://realpython.com/python-dicts/
- Python Dictionary Methods (W3Schools): https://www.w3schools.com/python/python_ref_dictionary.asp