Advanced Data Types from the collections library in Python

If you have read an introductory text or attended a programming class, you will probably have been introduced only to the following Python data types:

  1. integer

  2. string

  3. boolean

  4. list

  5. tuple

  6. set

  7. dictionary

Every data type has its own characteristics and features.

Python also has more advanced data types than those above, and its data types are extensible.

Some of these special types become available by importing a built-in package, others by importing an external package.

One such built-in package is the collections package. Let's look at it in detail.

The following are some of the advanced data types in the collections library, as listed in the official Python documentation:

  1. namedtuple

  2. deque

  3. ChainMap

  4. Counter

  5. OrderedDict

  6. defaultdict

  7. UserString

  8. UserList

  9. UserDict

I find it useful to think of every object in Python as having create, read, update, and delete operations. The details differ from type to type, but looking at a type from this perspective makes it easier to understand and to remember. Let's go through the types one by one.
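As a small illustration of that create, read, update, delete perspective, here is a sketch using a regular dictionary. The prices dictionary and its values are made up for the example.

```python
# Create
prices = {'apple': 25, 'orange': 30}

# Read
apple_price = prices['apple']

# Update: change an existing key and add a new one
prices['orange'] = 35
prices['grapes'] = 15

# Delete
del prices['apple']

print(apple_price, prices)   # 25 {'orange': 35, 'grapes': 15}
```

The same four questions (how do I create it, read from it, change it, and remove from it?) can be asked of each collections type that follows.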

Advanced Data Type - namedtuple

A namedtuple is used to create a data structure that stores a group of related values together, e.g. a row that is read from or written to a database table, or the state of a UI.

This is like creating a class that has only attributes and no methods of its own. We provide a name for our namedtuple class and its attribute names in a list, and every attribute can then be accessed by name on an object of the created class. Let's see the required steps below.

from collections import namedtuple   #(1)

#(2)
Farm = namedtuple('Farm', ['Crop', 'Fertilizer','Soil','StartDate','EndDate'])

#(3)
objFarm = Farm("Rice","Potassium","Clayey","1-2","4-5")

print(objFarm)
print(objFarm.Crop)
print(objFarm.Fertilizer)
print(objFarm.Soil)
print(objFarm.StartDate)
print(objFarm.EndDate)
print(type(objFarm))
  1. Import the namedtuple factory function from the collections library.

  2. Call namedtuple with 2 arguments: the 1st is the class name and the 2nd is the list of attribute names for the class. The call returns a new class, which we assign to an identifier. The identifier can be the same as the class name supplied as the 1st argument or a different one. We have used the same name to avoid confusion.

  3. Using the class created in the step above, instantiate an object by passing the values. We have supplied all 5 values, as the class definition expects.

    • Now use the object to access the individual attributes and print them.

    • Also check the type of the created object. It will be the Farm class. A namedtuple object is immutable, like a regular tuple, but helper methods such as _asdict() and _replace() let you convert it or build a modified copy. You can explore more.
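The helper methods mentioned in the last point can be sketched as follows. This reuses the Farm class from the steps above; the replacement value "Loamy" is made up for the example.

```python
from collections import namedtuple

Farm = namedtuple('Farm', ['Crop', 'Fertilizer', 'Soil', 'StartDate', 'EndDate'])
objFarm = Farm("Rice", "Potassium", "Clayey", "1-2", "4-5")

# Read: by attribute name, by index, or by unpacking like any tuple.
first_field = objFarm[0]
crop, fertilizer, *rest = objFarm

# "Update": the object is immutable, so _replace builds a new object instead.
updated = objFarm._replace(Soil="Loamy")

# Convert to a dictionary, e.g. before writing a row to a database.
as_dict = objFarm._asdict()

print(type(objFarm).__name__)        # Farm
print(Farm._fields)                  # the attribute names, in order
print(updated.Soil, objFarm.Soil)    # Loamy Clayey
print(as_dict['Crop'])               # Rice
```

Because the original object is never modified, it is safe to share it between different parts of a program.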

Advanced Data Type - deque

The deque class implements a double-ended queue. It supports memory-efficient, thread-safe appends and pops from either end, which makes it useful wherever items are handed from one part of a program to another, e.g. between threads.

Appending to or popping from either end of a deque takes O(1) time. In comparison, inserting at or popping from the front of a list takes O(n), because every remaining element has to be shifted.

from collections import deque

q = deque()   #(1)
q.append('carrot')   #(2)
q.append('brinjal')
q.appendleft('chilli')   #(3)
print(q)
q = deque(q, maxlen=3)   #(4)
q.rotate(1)   #(5)
print(q)
  1. Create an empty queue using the deque class.

  2. Append elements to the q object.

  3. Elements can be appended at either end; the default is the right. To append on the left, use the appendleft method.

  4. A queue can also be given a fixed length by passing the maxlen argument when creating it. Once it is full, appending at one end discards an element from the other end.

  5. The rotate operation shifts the elements of q. With a positive value n, the elements move n positions to the right, i.e. in this case 1 element is popped from the right and inserted on the left.
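The fixed-length behaviour from step 4 is easiest to see with a small sketch. The readings below are made-up values; the idea is to keep only the 3 most recent items.

```python
from collections import deque

recent = deque(maxlen=3)
for reading in [10, 20, 30, 40, 50]:
    recent.append(reading)   # once full, the oldest item falls off the left

print(recent)                # deque([30, 40, 50], maxlen=3)

oldest = recent.popleft()    # O(1) removal from the left end
newest = recent.pop()        # O(1) removal from the right end
print(oldest, newest, list(recent))   # 30 50 [40]
```

With a plain list the same effect would need an explicit length check and a pop(0), which is O(n).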

Advanced Data Type - ChainMap

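A ChainMap can be used to combine multiple dictionaries into a single view. This is often faster than creating a new dictionary and running update operations, because the underlying dictionaries are referenced rather than copied.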

from collections import ChainMap   #(1)

#(2)
baseline = {'music': 'bach', 'art': 'rembrandt'}
adjustments = {'art': 'van gogh', 'opera': 'carmen'}
combined = ChainMap(adjustments, baseline)   #(3)
print(combined)
list(combined)
combined['art']   #(4)
combined['music']
combined['opera']
combined['music'] = 'rock'

combined.maps[0]   #(5)
combined.maps[1]
  1. Import the ChainMap class.

  2. Create 2 dictionaries, baseline and adjustments.

  3. Use the ChainMap class to combine the created dictionaries and produce combined, a ChainMap object.

    • Print the combined object to observe its structure. Each dictionary is stored at a separate index position.

  4. Access the value of a key present in any of the dictionaries contained in the ChainMap. The dictionaries are searched in order, so for a key such as 'art' that appears in both, the value from the first dictionary wins. If the key is not present in any of them, a KeyError is raised.

    • Assigning a value to a key always writes to the first dictionary in the ChainMap object. If the key is not present there, it is created there, even when a later dictionary already contains it.

  5. Access the individual dictionaries by index position through the maps attribute.
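The lookup and write rules are worth checking by hand. The sketch below reuses the two dictionaries from the steps above; the 'dance' entry is a made-up addition to show that the ChainMap is a live view.

```python
from collections import ChainMap

baseline = {'music': 'bach', 'art': 'rembrandt'}
adjustments = {'art': 'van gogh', 'opera': 'carmen'}
combined = ChainMap(adjustments, baseline)

# Lookups search the maps left to right, so adjustments wins for 'art'.
print(combined['art'])                            # van gogh

# Writes only touch the first map; baseline is left unchanged.
combined['music'] = 'rock'
print(adjustments['music'], baseline['music'])    # rock bach

# Later changes to a source dictionary show through the ChainMap.
baseline['dance'] = 'waltz'
print(combined['dance'])                          # waltz

# Deleting also acts on the first map only, revealing the fallback value.
del combined['art']
print(combined['art'])                            # rembrandt
```

This layering is why a ChainMap suits "defaults plus overrides" settings: the defaults are never modified.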

Advanced Data Type - Counter

A Counter is a dictionary subclass that stores elements as keys and their counts, as integers, as values. A Counter object can be created by passing an iterable, a mapping, or keyword arguments to the Counter class.

With the created Counter objects, certain math operations similar to those on a set (more precisely, a multiset) can be performed.

from collections import Counter   #(1)

store = Counter(['apples','apples','banana','apples'])   #(2)

store1 = Counter(apples=3, oranges=1)   #(3)

store2 = Counter(apples=1, oranges=2)

# Add
store1 + store2   #(4)

# Retains positive value
store1 - store2   #(5)

# intersection
store1 & store2   #(6)

# union - max value
store1 | store2   #(7)
  1. Import the Counter class.

  2. Pass a list object as input to Counter; each element becomes a key and its number of occurrences becomes the value.

  3. Create 2 more Counter objects by passing keyword arguments. Print the Counter objects and observe the created dictionary values.

  4. The addition operator + adds the counts for each key.

  5. The subtraction operator - subtracts the counts for each key and keeps only the positive results.

  6. Intersection & keeps the minimum count for each key.

  7. Union | keeps the maximum count for each key.
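To make the four operators concrete, this sketch repeats the objects from the steps above and stores each result under a made-up variable name, with the expected counts in the comments. It also shows most_common and the zero returned for a missing key.

```python
from collections import Counter

store = Counter(['apples', 'apples', 'banana', 'apples'])
store1 = Counter(apples=3, oranges=1)
store2 = Counter(apples=1, oranges=2)

print(store.most_common(1))   # [('apples', 3)]
print(store['grapes'])        # 0, a missing key counts as zero (no KeyError)

added = store1 + store2       # apples: 4, oranges: 3
diff = store1 - store2        # apples: 2 (oranges would be -1, so it is dropped)
common = store1 & store2      # apples: 1, oranges: 1
largest = store1 | store2     # apples: 3, oranges: 2
print(added, diff, common, largest, sep='\n')
```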

Advanced Data Type - OrderedDict

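An OrderedDict remembers the order in which key-value pairs are added. Since Python 3.7 a normal dictionary also preserves insertion order, but OrderedDict still offers extras such as the move_to_end method and order-sensitive equality between two OrderedDict objects.

An OrderedDict can still be compared with a normal dictionary. In that case the order is ignored, and the two are equal if their key-value pairs are the same.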

from collections import OrderedDict   #(1)

farm = {'Crop':'Rice', 'Fertilizer':'Potassium','Soil':'Clayey','StartDate':'1-2','EndDate':'4-5'}   #(2)

oFarm = OrderedDict(farm)   #(3)
oFarm.move_to_end('Crop')   #(4)
  1. Import the OrderedDict class.

  2. Create a normal dictionary, farm.

  3. Pass the created dictionary as the argument to the OrderedDict class.

  4. On the OrderedDict object, a key can be moved to the last position using the move_to_end method.

Other dictionary methods such as get, keys, values, pop, and items can be used on the OrderedDict object as well.
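A short sketch of the reordering and comparison behaviour follows. It uses a shortened, 3-key version of the farm dictionary to keep the output readable.

```python
from collections import OrderedDict

farm = {'Crop': 'Rice', 'Fertilizer': 'Potassium', 'Soil': 'Clayey'}
oFarm = OrderedDict(farm)

oFarm.move_to_end('Crop')                # 'Crop' goes to the last position
print(list(oFarm))                       # ['Fertilizer', 'Soil', 'Crop']

oFarm.move_to_end('Soil', last=False)    # last=False moves a key to the front
print(list(oFarm))                       # ['Soil', 'Fertilizer', 'Crop']

# Against a normal dictionary the order is ignored ...
print(oFarm == farm)                     # True
# ... but two OrderedDict objects must also agree on the order.
print(oFarm == OrderedDict(farm))        # False

last_key, last_value = oFarm.popitem()   # removes and returns the last pair
print(last_key, last_value)              # Crop Rice
```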

Advanced Data Type - defaultdict

A regular dictionary stores key:value pairs, and reading a key that has not been added yet raises a KeyError. The defaultdict class changes this behavior: it accepts an argument that defines what value a missing key should start with, and it creates that value automatically the first time the key is accessed.

The argument is the default factory, a callable that is invoked with no arguments to produce the default value. Common choices are list, int, and set; custom functions are also possible.

from collections import defaultdict   #(1)

# default_factory as list   #(2)
s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
d = defaultdict(list)

for k, v in s:   #(3)
    d[k].append(v)

sorted(d.items())
  1. Import the defaultdict class.

  2. Create a list of tuples, each containing a pair of values. We will use this list to build a dictionary in which all values for the same key are collected together. The next steps show how this is achieved.

    • Create a defaultdict object using the class and pass list as the argument.

  3. Iterate over the input list of tuples in s. Every iteration yields the 2 values contained in a tuple; use the first value (k) as the key and append the second value (v), as shown in d[k].append(v). The first time a key is seen, an empty list is created for it; for every later occurrence of the same key, the value is appended to that list instead of overwriting it as in a regular dictionary.

Access the dictionary d and its methods outside the loop.
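The sketch below shows the result of the grouping loop and then tries two other default factories. The letters and status dictionaries, and their inputs, are made up for the example.

```python
from collections import defaultdict

# default_factory as list: group values by key.
s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
d = defaultdict(list)
for k, v in s:
    d[k].append(v)
print(sorted(d.items()))   # [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]

# default_factory as int: missing keys start at 0, which suits counting.
letters = defaultdict(int)
for ch in 'mississippi':
    letters[ch] += 1
print(dict(letters))       # {'m': 1, 'i': 4, 's': 4, 'p': 2}

# A custom factory is any callable that takes no arguments.
status = defaultdict(lambda: 'unknown')
print(status['wheat'])     # unknown
print('wheat' in status)   # True, reading the key also created it
```

Note the last line: merely reading a missing key inserts it, so use `in` or get() when you only want to test for a key.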

Advanced Data Type - UserString

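UserString is a wrapper class used to create string-like objects. It is an alternative to using str directly, and is mainly useful as a base class for your own string-like classes.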

from collections import UserString   #(1)

us = UserString('mississippi')   #(2)
us.data   #(3)
us.upper()
  1. As the first step, import the UserString class from the collections library.

  2. As the second step, create an object by passing our string value as input.

  3. Access the data attribute of the object to read the underlying value.

    • Other string methods can be called on the created object.

    • The need for the UserString class has been partially superseded by the ability to subclass str, the default string class, directly.
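To see how the wrapper differs from a real str, here is a minimal sketch. The Slug subclass and its slugify method are hypothetical names introduced only for this example.

```python
from collections import UserString

us = UserString('mississippi')

print(type(us.data))         # <class 'str'>
print(isinstance(us, str))   # False: it wraps a str, it is not one

shout = us.upper()           # string methods return a new UserString
print(type(shout).__name__, shout)   # UserString MISSISSIPPI


class Slug(UserString):
    """Hypothetical string-like class with one extra method."""

    def slugify(self):
        # self.data is the underlying str
        return self.data.lower().replace(' ', '-')


print(Slug('Advanced Data Types').slugify())   # advanced-data-types
```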

Advanced Data Type - UserList

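UserList is a wrapper class used to create list-like objects. It is an alternative to using list directly, and is mainly useful as a base class for your own list-like classes.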

from collections import UserList   #(1)

ul = UserList(['apple','orange','grapes'])   #(2)
ul.data   #(3)
ul.append('guava')
  1. As the first step, import the UserList class from the collections library.

  2. As the second step, create an object by passing our list object as input.

  3. Access the data attribute of the object to read the underlying list.

    • Other list methods can be called on the created object.

    • The need for the UserList class has been partially superseded by the ability to subclass list, the default list class, directly.
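One way such a wrapper might be used is to customise a single list method. In the sketch below, FruitBasket is a hypothetical class that refuses duplicates on append; only append is guarded, so other methods such as extend or insert would still accept duplicates.

```python
from collections import UserList


class FruitBasket(UserList):
    """Hypothetical list-like class that refuses duplicate items on append."""

    def append(self, item):
        if item in self.data:
            raise ValueError(f'{item!r} is already in the basket')
        super().append(item)


basket = FruitBasket(['apple', 'orange', 'grapes'])
basket.append('guava')
print(basket.data)              # ['apple', 'orange', 'grapes', 'guava']
print(len(basket), basket[0])   # 4 apple

try:
    basket.append('apple')
except ValueError as err:
    print(err)                  # 'apple' is already in the basket
```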

Advanced Data Type - UserDict

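UserDict is a wrapper class used to create dict-like objects. It is an alternative to using dict directly, and is mainly useful as a base class for your own dict-like classes.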

from collections import UserDict   #(1)

ud = UserDict({'apple':25,'orange':30,'grapes':15})   #(2)
ud.data   #(3)
ud.update({'guava':40})
  1. As the first step, import the UserDict class from the collections library.

  2. As the second step, create an object by passing our dict object as input.

  3. Access the data attribute of the object to read the underlying dictionary.

    • Other methods of a regular dict can be called on the created object as well.

    • The need for the UserDict class has been partially superseded by the ability to subclass dict, the default dictionary class, directly.
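UserDict is still convenient when you want to change how items are stored. The sketch below assumes a hypothetical LowerKeyDict class that lower-cases every key. Because UserDict routes the constructor and update() through __setitem__, overriding that one method is enough here.

```python
from collections import UserDict


class LowerKeyDict(UserDict):
    """Hypothetical dict-like class that stores every key in lower case."""

    def __setitem__(self, key, value):
        self.data[key.lower()] = value


ud = LowerKeyDict({'Apple': 25, 'ORANGE': 30})   # constructor uses __setitem__
ud['Grapes'] = 15
ud.update({'GUAVA': 40})                         # so does update()

print(ud.data)   # {'apple': 25, 'orange': 30, 'grapes': 15, 'guava': 40}
```

A direct subclass of dict would need to override the constructor and update() separately to get the same result.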