Taking JSON Parsing in Python to the Next Level
JSON is a preferred format for exchanging information between systems because it is simple to parse into a language's native data structures.
In this discussion, we'll explore some lesser-known features and useful hacks within the json module that can enhance your experience when working with APIs.
Serializing and Deserializing JSON
Serializing: Serializing means converting JSON-compatible Python data structures into a JSON string.
import json
data = {
    "name": "John",
    "age": 30,
    "city": "New York"
}
json_string = json.dumps(data)
print(f"{json_string!r}")
# '{"name": "John", "age": 30, "city": "New York"}'
Deserializing: Converting a JSON string into native Python data structures is known as deserializing.
import json
data = '{"name": "John", "age": 30, "city": "New York", "created_at": "2021-01-01"}'
data_dict = json.loads(data)
print(data_dict)
# {'name': 'John', 'age': 30, 'city': 'New York', 'created_at': '2021-01-01'}
Here json.loads returns a dict instance, because the top-level JSON value is an object.
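One detail worth keeping in mind is that a round trip through JSON does not always give back the same Python types. The snippet below is a small illustration (the data and variable names are made up for this example): tuples are written as JSON arrays, and non-string dictionary keys are written as strings.

```python
import json

# Illustrative data: a tuple value and an integer dictionary key.
original = {"point": (1, 2), 1: "one"}

round_tripped = json.loads(json.dumps(original))

# Tuples come back as lists, and the integer key comes back as a string.
print(round_tripped)
# {'point': [1, 2], '1': 'one'}
print(round_tripped == original)
# False
```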
Serializing datetime objects to JSON
The standard json module fails to convert a datetime instance to JSON.
import json
import datetime
data = {
    "name": "John",
    "created_at": datetime.datetime(2021, 1, 1, 12, 0, 0)
}
json_string = json.dumps(data)
# TypeError: Object of type datetime is not JSON serializable
To fix this, let's pass default=str so that any value that is not JSON serializable is converted to a string.
import json
import datetime
data = {
    "name": "John",
    "created_at": datetime.datetime(2021, 1, 1, 12, 0, 0),
    "object": object()
}
json_string = json.dumps(data, default=str)
print(json_string)
# {"name": "John", "created_at": "2021-01-01 12:00:00", "object": "<object object at 0x7f517cc78f90>"}
# (the memory address in the last value differs from run to run)
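As the output above shows, default=str happily stringifies anything, including values that probably should not end up in the payload. A stricter variation is to pass your own function as default. The following is only a sketch: encode_extra_types is a hypothetical helper name, and the choice of ISO 8601 via isoformat() is an assumption about what the consumer of the JSON expects.

```python
import datetime
import json


def encode_extra_types(obj):
    """Hypothetical helper: convert only the types we expect."""
    if isinstance(obj, (datetime.datetime, datetime.date)):
        return obj.isoformat()
    # Anything else is still reported as an error instead of being stringified.
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


data = {"name": "John", "created_at": datetime.datetime(2021, 1, 1, 12, 0, 0)}
json_string = json.dumps(data, default=encode_extra_types)
print(json_string)
# {"name": "John", "created_at": "2021-01-01T12:00:00"}
```

With this version, passing an unexpected value such as object() raises TypeError again, which makes accidental leaks of internal objects easier to spot.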
Serializing Decimal types
Decimal is not JSON serializable either. With default=str it is converted to a string, but ideally it should be written as a JSON number.
Let's see:
from decimal import Decimal
import json
data = {
    "name": "John",
    "amount": Decimal("10.5")
}
json_string = json.dumps(data, default=str)
# '{"name": "John", "amount": "10.5"}'
To fix this, we can use simplejson instead of json. Install it first:
pip install simplejson
from decimal import Decimal
import simplejson as json
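data = {
    "name": "John",
    "amount": Decimal("10.5")
}
# simplejson serializes Decimal natively (use_decimal defaults to True),
# so default=str is never called for the amount.
json_string = json.dumps(data, default=str)
# '{"name": "John", "amount": 10.5}'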
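The reverse direction can be handled with the standard library alone. This is a small sketch, assuming you want floats in incoming JSON to become Decimal values rather than Python floats; the payload string is made up for illustration. json.loads accepts a parse_float argument that is called with the raw text of each JSON float.

```python
import json
from decimal import Decimal

# Illustrative payload containing a JSON float.
payload = '{"name": "John", "amount": 10.5}'

# parse_float receives the string "10.5", so no precision is lost on the way in.
data = json.loads(payload, parse_float=Decimal)
print(data)
# {'name': 'John', 'amount': Decimal('10.5')}
```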
Dumping to and loading from a file
- Dumping to a file
import json
data = {"name": "John", "age": 30, "city": "New York"}
with open("file.json", "w") as fp:
    # json.dump writes to the file and returns None, so its result is not assigned.
    json.dump(data, fp)
# cat file.json
# {"name": "John", "age": 30, "city": "New York"}
- Loading from a file
import json
with open("file.json", "r") as fp:
    data = json.load(fp)
print(data)
# {'name': 'John', 'age': 30, 'city': 'New York'}
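When the data contains non-ASCII text, two more options are worth knowing about. The sketch below is illustrative only: the sample names and the temporary file are assumptions made so the example does not touch real files. By default json escapes non-ASCII characters; passing ensure_ascii=False writes them as-is, in which case it is safest to open the file with an explicit encoding.

```python
import json
import os
import tempfile

data = {"name": "Zoë", "city": "Köln"}

# Default behaviour: non-ASCII characters are escaped.
print(json.dumps(data))
# {"name": "Zo\u00eb", "city": "K\u00f6ln"}

# A temporary directory keeps the example from touching real files.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "file.json")

    # ensure_ascii=False keeps the characters readable in the file itself.
    with open(path, "w", encoding="utf-8") as fp:
        json.dump(data, fp, ensure_ascii=False)

    with open(path, "r", encoding="utf-8") as fp:
        raw_text = fp.read()

    with open(path, "r", encoding="utf-8") as fp:
        loaded = json.load(fp)

print(loaded == data)
# True
```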
Dumping JSON with indentation
The indent argument pretty-prints the JSON string so that it is easier to read and understand.
import json
data = {
    "name": "John",
    "details": {
        "age": 30,
        "city": "New York"
    }
}
print(json.dumps(data, indent=4))
# {
#     "name": "John",
#     "details": {
#         "age": 30,
#         "city": "New York"
#     }
# }
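The opposite of pretty-printing is also possible. As a rough sketch, when payload size matters more than readability, the separators argument of json.dumps can drop the spaces that are added after commas and colons by default.

```python
import json

data = {"name": "John", "details": {"age": 30, "city": "New York"}}

# (item_separator, key_separator) without the default trailing spaces.
compact = json.dumps(data, separators=(",", ":"))
print(compact)
# {"name":"John","details":{"age":30,"city":"New York"}}
```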
Sorting keys while converting to JSON
import json
data = {
    "name": "John",
    "details": {
        "age": 30,
        "city": "New York"
    }
}
print(json.dumps(data, indent=4, sort_keys=True))
# {
#     "details": {
#         "age": 30,
#         "city": "New York"
#     },
#     "name": "John"
# }
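One practical use of sort_keys is getting the same string for equal dictionaries regardless of insertion order, for example before comparing or hashing payloads. A minimal illustration, with made-up data:

```python
import json

# Two equal dictionaries built in a different key order.
first = {"name": "John", "age": 30}
second = {"age": 30, "name": "John"}

print(json.dumps(first) == json.dumps(second))
# False
print(json.dumps(first, sort_keys=True) == json.dumps(second, sort_keys=True))
# True
```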
Passing hooks to convert loaded JSON into complex Python objects
By default, json.loads converts a JSON string into native Python data structures such as list and dict. Now let's see how to convert these into different objects.
- Converting to an object where each key in the JSON is an attribute.
import json
data = '''
{
    "details": {
        "age": 30,
        "city": "New York"
    },
    "name": "John"
}
'''

class JSONObject:
    def __init__(self, data) -> None:
        # Expose every key of the decoded JSON object as an attribute.
        self.__dict__ = data

json_object = json.loads(data, object_hook=JSONObject)
print("Name:", json_object.name)
print("Age:", json_object.details.age)
print("City:", json_object.details.city)
# Name: John
# Age: 30
# City: New York
I found this super cool and helpful.
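If you would rather not define a class for this, the standard library's types.SimpleNamespace can play the same role. This is just an alternative sketch, not a replacement for the approach above; note that keys which are not valid Python identifiers are stored but cannot be reached with dot access.

```python
import json
from types import SimpleNamespace

data = '{"details": {"age": 30, "city": "New York"}, "name": "John"}'

# Every decoded JSON object is turned into a SimpleNamespace.
person = json.loads(data, object_hook=lambda d: SimpleNamespace(**d))
print(person.name)
# John
print(person.details.city)
# New York
```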
- Converting to OrderedDict
import json
from collections import OrderedDict
data = '''
{
    "details": {
        "age": 30,
        "city": "New York"
    },
    "name": "John"
}
'''
json_object = json.loads(data, object_pairs_hook=OrderedDict)
print(json_object)
# OrderedDict([('details', OrderedDict([('age', 30), ('city', 'New York')])), ('name', 'John')])
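object_pairs_hook receives each JSON object as a list of (key, value) pairs, which makes it useful for more than OrderedDict. One example, sketched here with a hypothetical helper named reject_duplicate_keys, is detecting duplicate keys, which json.loads otherwise resolves silently by keeping the last value.

```python
import json


def reject_duplicate_keys(pairs):
    """Hypothetical hook: build a dict but fail on repeated keys."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"Duplicate key in JSON object: {key!r}")
        result[key] = value
    return result


# Default behaviour: the last value wins without any warning.
print(json.loads('{"a": 1, "a": 2}'))
# {'a': 2}

try:
    json.loads('{"a": 1, "a": 2}', object_pairs_hook=reject_duplicate_keys)
except ValueError as exc:
    print(exc)
# Duplicate key in JSON object: 'a'
```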
Serializing and Deserializing Complex Objects
Let's convert an instance of the following class into JSON.
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f'{self.__class__.__name__}(x={self.x}, y={self.y})'
import json
p = Point(1, 2)

def serialize_instance(obj):
    # Record the class name alongside the instance attributes.
    d = {'__classname__': type(obj).__name__}
    d.update(vars(obj))
    return d

p = json.dumps(p, default=serialize_instance)
# '{"__classname__": "Point", "x": 1, "y": 2}'
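The same idea can also be packaged as a json.JSONEncoder subclass and passed through the cls argument, which is convenient when the encoder is reused in many places. The class name InstanceEncoder below is hypothetical, and the Point class is repeated in a shortened form only so the sketch runs on its own.

```python
import json


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class InstanceEncoder(json.JSONEncoder):
    """Hypothetical encoder that mirrors serialize_instance."""

    def default(self, obj):
        if isinstance(obj, Point):
            return {"__classname__": type(obj).__name__, **vars(obj)}
        # Let the base class raise TypeError for unsupported types.
        return super().default(obj)


encoded = json.dumps([Point(1, 2), Point(3, 4)], cls=InstanceEncoder)
print(encoded)
# [{"__classname__": "Point", "x": 1, "y": 2}, {"__classname__": "Point", "x": 3, "y": 4}]
```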
Now comes a slightly more challenging task: converting the JSON back into an object.
classes = {
    'Point': Point
}

def unserialize_object(d):
    classname = d.pop('__classname__', None)
    if classname:
        cls = classes[classname]
        # Create the instance without calling __init__, then restore its attributes.
        obj = cls.__new__(cls)
        for key, value in d.items():
            setattr(obj, key, value)
        return obj
    else:
        return d

p = json.loads(p, object_hook=unserialize_object)
print(p)
# Point(x=1, y=2)
As you can see, the JSON has been converted back into a Point instance.
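The approach also extends to nested objects, because default is called again for any non-serializable value inside the dictionary it returns, and object_hook is applied to the innermost JSON objects first. The sketch below assumes a second, hypothetical Line class and uses classes.get so that dictionaries with a missing or unknown class name are returned unchanged.

```python
import json


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class Line:
    """Hypothetical class holding two Point instances."""

    def __init__(self, start, end):
        self.start = start
        self.end = end


classes = {"Point": Point, "Line": Line}


def serialize_instance(obj):
    return {"__classname__": type(obj).__name__, **vars(obj)}


def unserialize_object(d):
    cls = classes.get(d.get("__classname__"))
    if cls is None:
        # Plain dictionaries and unknown class names are left untouched.
        return d
    obj = cls.__new__(cls)
    for key, value in d.items():
        if key != "__classname__":
            setattr(obj, key, value)
    return obj


text = json.dumps(Line(Point(0, 0), Point(3, 4)), default=serialize_instance)
restored = json.loads(text, object_hook=unserialize_object)

print(type(restored).__name__, type(restored.start).__name__)
# Line Point
print(restored.end.x, restored.end.y)
# 3 4
```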
Thanks for your time; I hope you've discovered some new and useful tips here that prove as beneficial to you as they have to me.
Leave a 👏 and follow Rahul Beniwal for more Python and Software Development-related blogs.