Recipe 4.1. Copying an Object
Credit: Anna Martelli Ravenscroft, Peter Cogolo
Problem
You want to copy an object. However, when you assign an object, pass it
as an argument, or return it as a result, Python uses a reference to
the original object, without making a copy.
Solution
Module copy in the standard Python library offers two
functions to create copies. The one you should generally use is the
function named copy, which returns a new object
containing exactly the same items and attributes as the object
you're copying:
import copy
new_list = copy.copy(existing_list)
On the rare occasions when you also want every item and attribute in
the object to be separately copied, recursively, use
deepcopy:
import copy
new_list_of_dicts = copy.deepcopy(existing_list_of_dicts)
Discussion
When you assign an object (or pass it as an argument, or return it as
a result), Python (like Java) uses a reference to the original
object, not a copy. Some other programming languages make copies
every time you assign something. Python never makes copies
"implicitly" just because you're assigning: to get a copy, you must
specifically request a copy.
Python's behavior is simple, fast, and uniform.
However, if you do need a copy and do not ask for one, you may have
problems. For example:
>>> a = [1, 2, 3]
>>> b = a
>>> b.append(5)
>>> print a, b
[1, 2, 3, 5] [1, 2, 3, 5]
Here, the names a and
b both refer to the same object (a list),
so once we alter the object through one of these names, we later see
the altered object no matter which name we use for it. No original,
unaltered copy is left lying about anywhere.
To become an effective Python programmer, it is crucial that you
learn to draw the distinction between altering an
object and assigning to a name, which
previously happened to refer to the object. These two kinds of
operations have nothing to do with each other. A statement such as
a=[ ] rebinds name a
but performs no alteration at all on the object that was previously
bound to name a. Therefore, the issue of
references versus copies just doesn't arise in this
case: the issue is meaningful only when you
alter some object.
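The distinction can be sketched as follows. This is only an illustration: the names a, b, c, and d are arbitrary, and the snippet is written so that it should run under Python 3 as well as Python 2.

```python
# Altering an object: every name bound to it sees the change.
a = [1, 2, 3]
b = a              # b is just another name for the same list
b.append(5)        # alters the one shared list object
assert a is b
assert a == [1, 2, 3, 5]

# Rebinding a name: the object previously bound to it is untouched.
c = [1, 2, 3]
d = c
d = []             # d now names a brand-new, empty list
assert d is not c
assert c == [1, 2, 3]   # the original list is unaltered
```

Only the first half involves the references-versus-copies issue, because only there is an object actually altered.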
If you are about to alter an object, but you want to keep the
original object unaltered, you must make a copy. As this
recipe's solution explains, the module
copy from the Python Standard Library offers two
functions to make copies. Normally, you use
copy.copy, which makes a shallow copy: it copies an object, but
for each attribute or item of the object, it continues to share
references, which is faster and saves memory.
Shallow copying, alas, isn't sufficient to entirely
"decouple" a copied object from the
original one, if you propose to alter the items or attributes of
either object, not just the object itself:
>>> import copy
>>> list_of_lists = [ ['a'], [1, 2], ['z', 23] ]
>>> copy_lol = copy.copy(list_of_lists)
>>> copy_lol[1].append('boo')
>>> print list_of_lists, copy_lol
[['a'], [1, 2, 'boo'], ['z', 23]] [['a'], [1, 2, 'boo'], ['z', 23]]
Here, the names list_of_lists and
copy_lol refer to distinct objects (two lists), so
we could alter either of them without affecting the other. However,
each item of list_of_lists is
the same object as the corresponding item of
copy_lol, so once we alter an item reached by
indexing either of these names, we later see the altered item no
matter which object we're indexing to reach it.
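One way to see exactly what a shallow copy shares is to compare object identities with the is operator. The following is a minimal sketch, assuming the same sample data as above; the helper names outer_is_shared, items_are_shared, and original_length are introduced here only for illustration.

```python
import copy

list_of_lists = [['a'], [1, 2], ['z', 23]]
copy_lol = copy.copy(list_of_lists)

# The two outer lists are distinct objects...
outer_is_shared = copy_lol is list_of_lists

# ...but each inner list is the very same object in both of them.
items_are_shared = all(x is y for x, y in zip(list_of_lists, copy_lol))

# Altering the outer copy itself leaves the original alone.
copy_lol.append('extra')
original_length = len(list_of_lists)

assert not outer_is_shared
assert items_are_shared
assert original_length == 3
```

Appending to copy_lol changes only the copy, while appending to copy_lol[1] would show up through both names, since both outer lists hold a reference to that same inner list.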
If you do need to copy some container object and
also recursively copy all objects it refers to (meaning all items,
all attributes, and also items of items, items of attributes, etc.),
use copy.deepcopy. Such deep copying may cost
you substantial amounts of time and memory, but if you gotta, you
gotta. For deep copies, copy.deepcopy is the only
way to go.
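For comparison with the shallow-copy session above, here is a small sketch of the same alteration performed on a deep copy. The name deep_lol is chosen here for illustration and does not appear elsewhere in this recipe.

```python
import copy

list_of_lists = [['a'], [1, 2], ['z', 23]]
deep_lol = copy.deepcopy(list_of_lists)

# Each inner list was copied too, so the corresponding items are
# equal in value but are separate objects.
assert deep_lol == list_of_lists
assert deep_lol[1] is not list_of_lists[1]

# Altering an item of the deep copy does not touch the original.
deep_lol[1].append('boo')
assert list_of_lists[1] == [1, 2]
assert deep_lol[1] == [1, 2, 'boo']
```

This time the original's items stay as they were, at the price of copying every nested object.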
For normal shallow copies, you may have good alternatives to
copy.copy, if you know the type of the object you
want to copy. To copy a list L, call
list(L);
to copy a dict d, call dict(d);
to copy a set s (in Python 2.4, which
introduces the built-in type set), call
set(s). (Since list,
dict, and, in 2.4, set, are
built-in names, you do not need to perform any
"preparation" before you use any of
them.) You get the general pattern: to copy a copyable object
o, which belongs to some built-in Python
type t, you may generally just call
t(o). dicts also offer a
dedicated method to perform a shallow copy: d.copy(