This notebook contains an excerpt from Python Programming and Numerical Methods - A Guide for Engineers and Scientists. The content is also available at Berkeley Python Numerical Methods.

The copyright of the book belongs to Elsevier. We also have this interactive book online for a better learning experience. The code is released under the MIT license. If you find this content useful, please consider supporting the work on Elsevier or Amazon!


Pickle Files

In this section, we introduce another way to store data on disk: pickle. We have already talked about saving data to text and CSV files. But in certain cases, we want to store dictionaries, tuples, lists, or most other Python objects on disk so that we can use them later or send them to colleagues. This is where pickle comes in: it serializes objects so that they can be saved to a file and loaded again later.

Pickle serializes Python object structures. Serialization is the process of converting an object in memory to a byte stream that can be stored as a binary file on disk. When we load this binary file back into a Python program, it is de-serialized back into a Python object.
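As a minimal sketch of this round trip, the snippet below stays entirely in memory and uses pickle.dumps and pickle.loads, which work on a bytes object instead of a file. The data and variable names are illustrative only.

```python
import pickle

# An example object (illustrative data only)
data = {'A': 0, 'B': [1, 2, 3], 'C': (4.5, 'text')}

# Serialize: convert the in-memory object to a byte stream
byte_stream = pickle.dumps(data)
print(type(byte_stream))  # <class 'bytes'>

# De-serialize: rebuild an equal object from the byte stream
restored = pickle.loads(byte_stream)
print(restored == data)   # True
```

The restored object is a new object that compares equal to the original, not the original object itself.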

Write a pickle file

TRY IT! Create a dictionary, and save it to a pickle file on disk. To use pickle, we need to import the module first.

import pickle
dict_a = {'A':0, 'B':1, 'C':2}
# the with statement closes the file once the dump is done
with open('test.pkl', 'wb') as f:
    pickle.dump(dict_a, f)

To serialize an object with pickle, we use the pickle.dump function and pass it two arguments: the first is the object, and the second is a file object returned by the open function. Note that the mode of the open function here is ‘wb’, which means write in binary mode.
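To see why the binary mode matters, here is a small sketch that tries both modes. It assumes a throwaway temporary directory so that no existing file is touched; the names path, text_mode_ok and n_bytes are made up for this illustration.

```python
import os
import pickle
import tempfile

dict_a = {'A': 0, 'B': 1, 'C': 2}

# Work in a temporary directory so no existing file is overwritten
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'test.pkl')

    # Text mode: pickle writes bytes, which a text-mode file rejects
    try:
        with open(path, 'w') as f:
            pickle.dump(dict_a, f)
        text_mode_ok = True
    except TypeError:
        text_mode_ok = False

    # Binary mode: the byte stream is written as-is
    with open(path, 'wb') as f:
        pickle.dump(dict_a, f)
    n_bytes = os.path.getsize(path)

print(text_mode_ok)  # False
print(n_bytes > 0)   # True
```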

Read a pickle file

Now let’s load the pickle file we just saved on the disk back into Python using the pickle.load function.

with open('./test.pkl', 'rb') as f:
    my_dict = pickle.load(f)
print(my_dict)
{'A': 0, 'B': 1, 'C': 2}

We can see that loading a pickle file is very similar to saving one, but here the mode of the open function is ‘rb’, which means read in binary mode. The pickle.load function de-serializes the binary file back into the original object, which is a dictionary in our case.
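The same save-and-load pattern works for other built-in types, not only for a flat dictionary. The sketch below is one way to check this, assuming a temporary directory and made-up sample data; it writes several objects in one container and compares what comes back.

```python
import os
import pickle
import tempfile

# Objects of several built-in types (illustrative data only)
objects = {
    'a_list': [1, 2, 3],
    'a_tuple': (1.5, 'two', None),
    'a_set': {1, 2, 3},
    'nested': {'x': [1, {'y': (2, 3)}]},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'objects.pkl')

    # Save with 'wb', then load with 'rb'
    with open(path, 'wb') as f:
        pickle.dump(objects, f)
    with open(path, 'rb') as f:
        loaded = pickle.load(f)

print(loaded == objects)          # True
print(type(loaded['a_tuple']))    # <class 'tuple'>
```

Note that the tuple and the set come back as a tuple and a set. A text format such as CSV would not preserve these types.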

Read in Python 2 pickle file

Sometimes, you may need to open a pickle file from a colleague who generated it using Python 2 instead of Python 3. You could either unpickle it using Python 2, or use Python 3 and pass encoding='latin1' to the pickle.load function.

# filename is the path to the pickle file written by Python 2
with open(filename, 'rb') as infile:
    new_dict = pickle.load(infile, encoding='latin1')
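The effect of the encoding argument can be sketched without a real Python 2 file. The byte string below is hand-written for this illustration and is only assumed to resemble what Python 2 would write for an 8-bit string containing a non-ASCII byte; a real file from a colleague may look different.

```python
import pickle

# Hand-written protocol-2 byte stream (an assumption for illustration):
# PROTO 2, SHORT_BINSTRING of length 2 holding bytes 0xe9 and 'a', BINPUT 0, STOP
py2_bytes = b'\x80\x02U\x02\xe9aq\x00.'

# The default encoding is ASCII, which cannot decode the byte 0xe9
try:
    pickle.loads(py2_bytes)
    default_ok = True
except UnicodeDecodeError:
    default_ok = False

# latin1 maps every possible byte to a character, so decoding succeeds
text = pickle.loads(py2_bytes, encoding='latin1')

print(default_ok)
print(text == '\xe9a')
```

Because latin1 accepts any byte, loading never fails on decoding, but the resulting text is only correct if the original Python 2 strings were in fact latin1 or plain ASCII.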

WARNING! One drawback of a pickle file is that it is not a universal file format, which means that it is not easy for other programming languages to use it. TXT and CSV files can easily be shared with colleagues who do not use Python, and they can open them using R, Matlab, Java and so on. A pickle file, however, is designed specifically for Python, so its data are not easy to use from other languages.
